5. Character Classes
● Group of characters
[abcdefgh0123]
● Range of characters
[a-h0-3]
● Inverse range of characters
[^i-z4-9]
e.g. /[gG]uide[lL]ine/ -> guideline, guideLine, Guideline or GuideLine
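A minimal sketch of the three class forms, written with Python's `re` module as a stand-in (the slides do not name a host language at this point; `sample` and `variants` are made-up inputs):

```python
import re

# [abcdefgh0123] and [a-h0-3] describe the same set of characters.
explicit = re.compile(r"[abcdefgh0123]")
ranged = re.compile(r"[a-h0-3]")
# The inverse range matches any single character NOT in i-z or 4-9.
negated = re.compile(r"[^i-z4-9]")

sample = "abcxyz0123789"  # hypothetical input
explicit_hits = explicit.findall(sample)
ranged_hits = ranged.findall(sample)

# Each class picks exactly one character, so two classes give four spellings.
guideline = re.compile(r"[gG]uide[lL]ine")
variants = ["guideline", "guideLine", "Guideline", "GuideLine", "GUIDELINE"]
matched = [v for v in variants if guideline.fullmatch(v)]

print(explicit_hits == ranged_hits)
print(matched)
```

Note that the inverse class also accepts characters such as `!` or `A`: it excludes only what is listed.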
6. Shorthand Char. Classes
\d -> [0-9]
\w -> [a-zA-Z0-9_]
\s -> whitespace: [ \t\r\n]
negative:
\D -> [^0-9]
\W -> [^a-zA-Z0-9_]
\S -> [^\s]
'.' -> any character except a line break (by default)
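A quick sketch of the shorthands in Python's `re` module (an assumption, other flavors behave similarly). The `re.ASCII` flag keeps `\d`, `\w` and `\s` to their ASCII definitions; `text` is a made-up input:

```python
import re

text = "id_42 = 7\tok\n"  # hypothetical input

# \d matches single digits; \w also covers digits and the underscore.
digits = re.findall(r"\d", text, re.ASCII)
word_runs = re.findall(r"\w+", text, re.ASCII)

# \s matches the spaces, the tab and the line feed.
spaces = re.findall(r"\s", text, re.ASCII)

# \W is the negation of \w: everything that is not a word character.
non_word = re.findall(r"\W", text, re.ASCII)

# '.' skips the line break unless a "dot matches all" modifier is set.
dot_default = re.findall(r".", "a\nb")
dot_all = re.findall(r".", "a\nb", re.DOTALL)

print(digits, word_runs, spaces, non_word, dot_default, dot_all)
```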
7. Non-printable Characters
\t -> tab character (ASCII 0x09)
\r -> carriage return (0x0D)
\n -> line feed (0x0A)
(\a (bell, 0x07), \e (escape, 0x1B), \f (form feed, 0x0C), \v (vertical tab, 0x0B))
\xFF -> hexadecimal index in the char. set
e.g. \xA9 -> copyright symbol in Latin-1
\uFFFF -> Unicode character
e.g. \u20AC -> the euro currency sign
^ -> beginning of the string
$ -> end of the string
\b -> word boundary, \B -> not a word boundary
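A small sketch of these escapes and anchors, again assuming Python's `re` module. Escape support differs per flavor: Python, for instance, has no `\e`, so it is left out here. All input strings are made up:

```python
import re

# \t in a raw-string pattern is interpreted by the regex engine itself.
fields = re.split(r"\t", "name\tage\tcity")

# \xA9 is the copyright sign, \u20AC the euro sign.
has_copyright = re.search(r"\xA9", "\u00a9 2010") is not None
has_euro = re.search(r"\u20AC", "price: 5 \u20ac") is not None

# ^ and $ anchor the match to the start and the end of the string.
starts_with_cat = re.search(r"^cat", "catalog") is not None
ends_with_cat = re.search(r"cat$", "bobcat") is not None
not_at_start = re.search(r"^cat", "bobcat") is None

# \b requires a word boundary, \B forbids one.
whole_words = re.findall(r"\bcat\b", "cat catalog bobcat cat.")
inside_words = re.findall(r"\Bcat\B", "cat concatenate")

print(fields, whole_words, inside_words)
```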
8. Quantifiers
REGEXES ARE GREEDY
{min,max} / {min,} / {,max} / {exact}
({,max} is not supported by every flavor: older PCRE reads it as literal text, use {0,max})
? -> {0,1}
+ -> {1,}
* -> {0,}
lazy quantifiers (match as little as possible):
+?
*?
careful when using /.*/ : greedy, it takes as much as it can
| : not a quantifier, simple 'OR'
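Greedy versus lazy is easiest to see side by side. A sketch in Python's `re` module (an assumption; the `html` string and the other inputs are made up):

```python
import re

html = "<b>bold</b> and <i>italic</i>"  # hypothetical input

# Greedy: .+ runs to the end, then backtracks to the LAST '>'.
greedy = re.findall(r"<.+>", html)

# Lazy: .+? stops at the FIRST '>' it can reach.
lazy = re.findall(r"<.+?>", html)

# {min,max}: standalone numbers of two to four digits.
bounded = re.findall(r"\b\d{2,4}\b", "7 42 2010 123456")

# ? is {0,1}: the 'u' may appear once or not at all.
optional = re.findall(r"colou?r", "color colour colouur")

# | is alternation, not a quantifier.
either = re.findall(r"cat|dog", "cat dog bird")

print(greedy)
print(lazy)
```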
9. Modifiers
//i : case insensitive
//m : multiline (^ and $ also match at line breaks)
//x : ignore whitespace in the pattern
Internal Option Set:
(? .. )
(?i)
e.g. /ab(?i)c/ -> "abc" and "abC"
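A sketch of the same modifiers in Python's `re` module (an assumption, the slides use /pattern/flags notation). Python passes modifiers as flags, and since version 3.11 it rejects a bare `(?i)` in the middle of a pattern, so the scoped form `(?i:...)` stands in for the slide's `/ab(?i)c/`:

```python
import re

# i: case insensitive
ci = re.search(r"guideline", "GuideLine", re.IGNORECASE) is not None

# m: ^ matches after every line break, not only at the start of the string
lines_multi = re.findall(r"^\w+", "one\ntwo\nthree", re.MULTILINE)
lines_single = re.findall(r"^\w+", "one\ntwo\nthree")

# x: whitespace and comments inside the pattern are ignored
date = re.compile(r"""
    \d{4}   # year
    -       # separator
    \d{2}   # month
""", re.VERBOSE)
date_ok = date.fullmatch("2010-05") is not None

# Internal option: only the 'c' is case insensitive.
scoped = re.compile(r"ab(?i:c)")
scoped_hits = [s for s in ["abc", "abC", "ABC"] if scoped.fullmatch(s)]

print(lines_multi, lines_single, scoped_hits)
```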
10. Subpatterns
Pattern in a pattern in .....
Can be nested!!
e.g. /((red|white) (king|queen))/
red king
white king
red queen
white queen
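The nested groups can be inspected one by one. A sketch in Python's `re` module (an assumption; `pieces` and the search string are made up). Groups are numbered by the position of their opening parenthesis:

```python
import re

pattern = re.compile(r"((red|white) (king|queen))")

m = pattern.search("the white queen moves")  # hypothetical input
whole = m.group(1)   # outer group: colour and piece together
colour = m.group(2)  # first inner group
piece = m.group(3)   # second inner group

pieces = ["red king", "white king", "red queen", "white queen", "black king"]
accepted = [p for p in pieces if pattern.fullmatch(p)]

print(whole, colour, piece)
print(accepted)
```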
11. PHP & Regex
preg_.... : PCRE
strpos() or strstr() are faster for plain substring searches
ereg_.... : POSIX
deprecated in PHP 5.3.0 (removed in PHP 7.0.0)
preg_ is often faster
mb_ereg_...: "multibyte"