5. Simple Regular Expressions
● Traditional regular expressions.
● Not a standard.
● Supported by some applications for backwards compatibility.
● Deprecated.
6. POSIX Basic Regular Expressions
● Created to provide a common standard for Unix
tools.
● Designed to be backwards compatible with
traditional regular expressions.
● Adopted as the default syntax of many Unix
tools.
● Some metacharacters require escaping: ( ) { } are literal unless written as \( \) \{ \}.
7. POSIX Extended Regular Expressions
● Adds some new metacharacters (+, ? and |).
● Metacharacters do not require escaping.
● Dropped support for backreferences (\n).
● Many Unix tools provide support with a
command line argument (usually -E).
8. Perl Regular Expressions
● Adds lazy quantification, named capture groups
and recursive patterns.
● Adopted by many programming languages due
to its power.
● Requires non-alphanumeric delimiters around
expression.
● Other languages only implement a subset, so
implementations vary.
10. Basic Metacharacters
. Matches any single character (except a newline, by default).
^ Matches the beginning of the string.
$ Matches the end of the string.
| Matches the expression before or after it (think ||).
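The four metacharacters above can be tried out with a short sketch. Python's re module is used here only as one convenient Perl-style flavor; the sample strings and variable names are made up for illustration.

```python
import re

# "." matches any single character except a newline (by default).
dot_matches = re.findall(r"h.t", "hat hot h\nt hit")

# "^" and "$" anchor the match to the start and end of the string.
starts_with_hello = re.search(r"^Hello", "Hello World") is not None
ends_with_hello = re.search(r"Hello$", "Hello World") is not None

# "|" matches either the expression before it or the one after it.
either = re.findall(r"cat|dog", "cat, bird, dog")

print(dot_matches)        # ['hat', 'hot', 'hit']
print(starts_with_hello)  # True
print(ends_with_hello)    # False
print(either)             # ['cat', 'dog']
```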
11. Character Classes
[ ] Matches any one character within the set.
[^ ] Matches any one character NOT within the set.
[n-m] Matches any one character in the range n through m.
Examples:
[A-Za-z0-9]
[^G-Zg-z _]
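As a rough illustration, the two example classes can be run in Python's re module. The sample strings are invented; a + is added to the second class so that runs of characters are returned together.

```python
import re

# [A-Za-z0-9] matches one ASCII letter or digit at a time.
alnum = re.findall(r"[A-Za-z0-9]", "a-1 B_2")

# [^G-Zg-z _] matches anything that is NOT a letter G-Z or g-z,
# a space or an underscore. On this sample only hex-like runs survive.
not_g_to_z = re.findall(r"[^G-Zg-z _]+", "dead_beef 2A xyz")

print(alnum)       # ['a', '1', 'B', '2']
print(not_g_to_z)  # ['dead', 'beef', '2A']
```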
12. Shorthand Character Classes
\s Any whitespace character such as space, tab and newline.
Roughly the same as [\n\r\t ]
\w Any word character.
Same as [A-Za-z0-9_]
\d Any digit character.
Same as [0-9]
\S, \W, \D Negated versions of the above. Can be used inside character
classes, but the result can be confusing.
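A small sketch of the shorthand classes in Python's re module, with an invented sample string. One caveat specific to Python 3: for str patterns, \w and \d also match non-ASCII letters and digits unless the re.ASCII flag is given, so the [A-Za-z0-9_] equivalence only holds in ASCII mode.

```python
import re

sample = "Room 42,\tfloor 3\n"

whitespace = re.findall(r"\s", sample)   # every whitespace character
words = re.findall(r"\w+", sample)       # runs of word characters
digits = re.findall(r"\d+", sample)      # runs of digits
digits_only = re.sub(r"\D", "", sample)  # drop every non-digit

# Python-specific: \w is Unicode-aware by default, ASCII-only with re.ASCII.
unicode_words = re.findall(r"\w+", "na\u00efve")
ascii_words = re.findall(r"\w+", "na\u00efve", flags=re.ASCII)

print(whitespace)   # [' ', '\t', ' ', '\n']
print(words)        # ['Room', '42', 'floor', '3']
print(digits)       # ['42', '3']
print(digits_only)  # 423
print(ascii_words)  # ['na', 've']
```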
13. Quantifiers
* Match the preceding expression 0 or more times.
+ Match the preceding expression 1 or more times.
? Match the preceding expression 0 or 1 time.
{m,n} Match the preceding expression at least m times but no more than n times.
{m,} Match the preceding expression at least m times with no maximum.
{,n} Match the preceding expression no more than n times with no minimum. Not supported by every flavor; {0,n} is the portable form.
{n} Match the preceding expression exactly n times.
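The quantifiers can be compared side by side with a sketch like the following. It uses Python's re module (which accepts the {,n} form); the word list and the matching() helper are hypothetical and exist only for this illustration.

```python
import re

words = ["ct", "cat", "caat", "caaat", "caaaat"]

def matching(pattern):
    """Return the words that the pattern matches in full."""
    return [w for w in words if re.fullmatch(pattern, w)]

star = matching(r"ca*t")         # 0 or more "a"
plus = matching(r"ca+t")         # 1 or more "a"
optional = matching(r"ca?t")     # 0 or 1 "a"
between = matching(r"ca{2,3}t")  # 2 to 3 "a"
at_least = matching(r"ca{3,}t")  # 3 or more "a"
at_most = matching(r"ca{,2}t")   # up to 2 "a"
exactly = matching(r"ca{2}t")    # exactly 2 "a"

print(star)      # ['ct', 'cat', 'caat', 'caaat', 'caaaat']
print(plus)      # ['cat', 'caat', 'caaat', 'caaaat']
print(optional)  # ['ct', 'cat']
print(between)   # ['caat', 'caaat']
print(at_least)  # ['caaat', 'caaaat']
print(at_most)   # ['ct', 'cat', 'caat']
print(exactly)   # ['caat']
```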
14. Lazy Quantifiers
Standard quantifiers are greedy: they match as much text as possible.
Example (the text is a single line):
Many programming courses start with a "Hello World" example. That would be "Hallo Welt" in German.
"Hello .*"
Matches: "Hello World" example. That would be "Hallo Welt"
15. Lazy Quantifiers
Append ? to a quantifier to make it lazy: it then matches as little text as possible.
Example (the text is a single line):
Many programming courses start with a "Hello World" example. That would be "Hallo Welt" in German.
"Hello .*?"
Matches: "Hello World"
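The difference between the greedy and the lazy pattern can be checked with a short sketch in Python's re module. The sample sentence is the one from the slides, treated as a single line; the variable names are arbitrary.

```python
import re

text = ('Many programming courses start with a "Hello World" example. '
        'That would be "Hallo Welt" in German.')

# Greedy: .* runs to the end of the line, then backs off to the LAST quote.
greedy = re.search(r'"Hello .*"', text).group()

# Lazy: .*? stops at the FIRST quote that lets the pattern succeed.
lazy = re.search(r'"Hello .*?"', text).group()

print(greedy)  # "Hello World" example. That would be "Hallo Welt"
print(lazy)    # "Hello World"
```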
16. Grouping
() Group the expression and capture the text.
(?: ) Group the expression but DO NOT capture the text.
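A minimal sketch of the difference, using Python's re module and invented sample strings: both kinds of group affect how the pattern is parsed, but only capturing groups show up in the result.

```python
import re

# Both groups capture: the result holds two pieces of text.
capturing = re.search(r"(Hello) (World|Earth)", "Hello World").groups()

# The first group is non-capturing: only the second one is stored.
non_capturing = re.search(r"(?:Hello) (World|Earth)", "Hello Earth").groups()

print(capturing)      # ('Hello', 'World')
print(non_capturing)  # ('Earth',)
```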
17. Backreferences
\1 through \9 reference previously captured text.
Example:
Many programming courses start with a "Hello World"
example. 'Hello World' examples are extremely simple,
especially when they just output "Hello World'.
('|")Hello World(\1)
Matches "Hello World" and 'Hello World', but not "Hello World' (the opening and closing quotes differ).
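The backreference example can be reproduced with a sketch like this one in Python's re module. The text is the sample from the slide; the variable names are arbitrary.

```python
import re

text = (
    'Many programming courses start with a "Hello World" example. '
    "'Hello World' examples are extremely simple, "
    "especially when they just output \"Hello World'."
)

# Group 1 captures the opening quote; \1 must then match the same character.
pattern = r"""('|")Hello World(\1)"""
matches = [m.group(0) for m in re.finditer(pattern, text)]

# The last occurrence opens with " but closes with ', so it is skipped.
print(matches)  # ['"Hello World"', "'Hello World'"]
```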
18. Word Boundaries
\b matches the position between a word character
(\w) and a non-word character (\W), or between a word character and the start or end of the string.
Example:
Hello World
o\b
Hello| World
19. Word Boundaries
\B matches any position that is not a word boundary,
such as the position between two word characters (\w\w).
Example:
Hello World
o\B
Hello Wo|rld
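Both boundary examples can be checked with a small sketch in Python's re module. The mark() helper is hypothetical; it only inserts a | at the position where the match ends, mirroring the notation used on the slides.

```python
import re

text = "Hello World"

def mark(pattern):
    """Insert '|' at the position where the first match ends."""
    end = re.search(pattern, text).end()
    return text[:end] + "|" + text[end:]

# o\b: an "o" followed by a word boundary (the end of "Hello").
at_boundary = mark(r"o\b")

# o\B: an "o" followed by a non-boundary (inside "World").
inside_word = mark(r"o\B")

print(at_boundary)  # Hello| World
print(inside_word)  # Hello Wo|rld
```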
20. Lookaheads
(?= ) matches a position that is directly followed by the
expression. The expression itself is not part of the match.
Example:
Hello World sounds better than "Hello Earth".
Hello(?= World)
Matches the first Hello only (the one followed by " World").
21. Lookbehinds
(?<= ) matches a position that is directly preceded by the
expression. The expression itself is not part of the match.
Example:
Hello World sounds better than "Hello Earth".
(?<=")Hello
Matches the second Hello only (the one preceded by a double quote).
22. Lookaheads
(?! ) matches a position that is NOT directly followed by the
expression.
Example:
Hello World sounds better than "Hello Earth".
Hello(?! World)
Matches the second Hello only (the one followed by " Earth").
23. Lookbehinds
(?<! ) matches a position that is NOT directly preceded by the
expression.
Example:
Hello World sounds better than "Hello Earth".
(?<!")Hello
Matches the first Hello only (the one not preceded by a double quote).
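The four lookaround examples can be verified with a sketch in Python's re module. The starts() helper is hypothetical; it reports the index at which each match begins, so 0 is the first Hello and 32 is the quoted one.

```python
import re

text = 'Hello World sounds better than "Hello Earth".'

def starts(pattern):
    """Return the start index of every match of the pattern in text."""
    return [m.start() for m in re.finditer(pattern, text)]

positive_lookahead = starts(r"Hello(?= World)")   # followed by " World"
positive_lookbehind = starts(r'(?<=")Hello')      # preceded by a quote
negative_lookahead = starts(r"Hello(?! World)")   # not followed by " World"
negative_lookbehind = starts(r'(?<!")Hello')      # not preceded by a quote

# Lookarounds assert a position; they do not add text to the match.
matched_text = re.search(r"Hello(?= World)", text).group()

print(positive_lookahead)   # [0]
print(positive_lookbehind)  # [32]
print(negative_lookahead)   # [32]
print(negative_lookbehind)  # [0]
print(matched_text)         # Hello
```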
24. Conditionals
(?(condition)then|else)
● condition is typically a lookahead or a lookbehind (Perl and PCRE also accept a capture group number or name).
● If condition is matched, then must match for the
expression to pass.
● If condition is not matched, else must match for
the expression to pass.
25. Conditionals
Example:
Hello World sounds better than "Hello Earth".
Hello (?(?=World)World|Earth)
Matches both Hello World and Hello Earth.
Hello (?(?=People)People|Earth)
Matches Hello Earth only.
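Not every flavor accepts a lookaround as the condition. Python's re module, for example, only accepts a capture group number or name there. The sketch below therefore shows two substitutes rather than the Perl syntax itself: a conditional with a lookahead condition rewritten as the alternation (?:(?=A)B|(?!A)C), and a Python-style conditional that tests whether group 1 took part in the match. The patterns and variable names are illustrative assumptions.

```python
import re

text = 'Hello World sounds better than "Hello Earth".'

# Emulating (?(?=World)World|Earth): take the "then" branch when the
# lookahead succeeds, the "else" branch when it fails.
world_or_earth = re.findall(r"Hello (?:(?=World)World|(?!World)Earth)", text)

# Same idea with a condition that never succeeds on this text.
people_or_earth = re.findall(r"Hello (?:(?=People)People|(?!People)Earth)", text)

# Python's own conditional: (?(1)yes|no) tests whether group 1 matched.
# If an opening quote was captured, require Earth plus a closing quote.
quoted = [m.group(0)
          for m in re.finditer(r'(")?Hello (?(1)Earth"|World)', text)]

print(world_or_earth)   # ['Hello World', 'Hello Earth']
print(people_or_earth)  # ['Hello Earth']
print(quoted)           # ['Hello World', '"Hello Earth"']
```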
26. Modifiers
i Case insensitive matching.
s . matches newline characters.
m ^ and $ match after and before newlines (respectively).
x Whitespace within the expression is ignored unless escaped.
g Match globally: find every match instead of stopping at the first.
27. Modifiers
● (?a) turns modifier a on.
● (?-a) turns modifier a off.
● (?a: ) applies modifier a to the enclosed group only.
Examples:
(?i)WORLD(?-i)
(?i-s)WORLD.(?s-i)
(?i:WORLD)
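How modifiers are spelled depends on the host language. As a sketch, Python's re module exposes i, s, m and x as flags (re.IGNORECASE, re.DOTALL, re.MULTILINE, re.VERBOSE), has no g modifier because findall and finditer already return every match, and supports the scoped (?i: ) form. Toggling a modifier in the middle of a pattern, as in (?i)WORLD(?-i), is not accepted by recent Python versions. The sample text is invented.

```python
import re

text = "Hello World\nhello world"

# i: case-insensitive matching.
case_insensitive = re.findall(r"world", text, flags=re.IGNORECASE)

# s: "." also matches the newline between the two lines.
dot_default = re.search(r"World.hello", text)
dot_all = re.search(r"World.hello", text, flags=re.DOTALL)

# m: "^" matches at the start of every line, not just the string.
line_starts = re.findall(r"^\w+", text, flags=re.MULTILINE)

# x: whitespace and comments in the pattern are ignored unless escaped.
verbose = re.search(r"""
    Hello     # unescaped whitespace is ignored
    \ World   # an escaped space is matched literally
""", text, flags=re.VERBOSE)

# Scoped modifier: only WORLD is matched case-insensitively.
scoped = re.findall(r"hello (?i:WORLD)", text)

print(case_insensitive)  # ['World', 'world']
print(dot_default)       # None
print(dot_all.group() == "World\nhello")  # True
print(line_starts)       # ['Hello', 'hello']
print(verbose.group())   # Hello World
print(scoped)            # ['hello world']
```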