Regular expressions are patterns used to match sequences of symbols in text. They specify the rules for what constitutes a match using special characters such as brackets, pipes, periods, and asterisks. Regular expressions can be used in programming languages and tools such as Python, Perl, Java, and Excel for tasks like validation and search/replace. The examples demonstrate matching strings using expressions such as a(ab)*a, abc|xyz, ab+c, and a.[bc]+.
2. What are Regular Expressions?
• They are a way of specifying patterns
• Specifically patterns of symbols
– A set of rules
– A mini language
– A tiny, highly specialised programming language
• Number one use is for pattern matching, i.e. finding things
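As a minimal sketch of "finding things", the Python snippet below uses the standard `re` module to pull every run of digits out of a sentence. The sample text and variable names are invented for illustration.

```python
import re

# Hypothetical sample text, invented for this illustration.
text = "Order 66 shipped on 2024-05-17 to warehouse 9."

# [0-9]+ means "one or more characters in the range 0 to 9".
first = re.search(r"[0-9]+", text)     # first match only, or None
numbers = re.findall(r"[0-9]+", text)  # every non-overlapping match

print(first.group())
print(numbers)
```

`re.search` stops at the first hit, while `re.findall` collects all of them, which is usually what a "find" task wants.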
3. Where could you use this awesome skill?
• Python
• Perl, C, Java, VB.NET, JavaScript
• Yahoo Widgets
• Excel, Word, …
4. Basic RegEx Syntax
[abc] = either a, b or c (class)
a|b = a or b
[a-z] = any one character from a to z
[^5] = any one character except 5
. = any one character (except a new line)
* = zero or more repetitions of the preceding item
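(ab)* = round brackets group chars together, so * repeats the whole group
Metacharacters: . ^ $ * + ? { } [ ] \ | ( )
The backslash is a metacharacter too: it escapes the character after it, so \. matches a literal period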
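The rules on this slide can be tried out directly. The sketch below is one possible way to do that in Python, assuming "match" means the whole string must fit the pattern (hence `re.fullmatch`). The helper name `matches` and the test strings are made up for illustration.

```python
import re

def matches(pattern, text):
    """Return True only if the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None

# (pattern, text, expected result) for each rule on the slide.
checks = [
    (r"[abc]", "b", True),       # class: a, b or c
    (r"[abc]", "d", False),
    (r"a|b", "b", True),         # a or b
    (r"[a-z]", "q", True),       # range a to z
    (r"[a-z]", "Q", False),      # ranges are case sensitive
    (r"[^5]", "7", True),        # anything except 5
    (r"[^5]", "5", False),
    (r".", "x", True),           # any character
    (r".", "\n", False),         # but not a newline, by default
    (r"(ab)*", "ababab", True),  # the group "ab" repeated
    (r"(ab)*", "", True),        # zero repetitions is allowed
    (r"a\.b", "a.b", True),      # backslash makes the dot literal
    (r"a\.b", "axb", False),
]

results = [matches(p, t) == expected for p, t, expected in checks]
print(all(results))  # prints True
```

Raw strings (`r"..."`) are used so that Python itself leaves the backslash alone and passes it through to the regex engine.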
5. Exercise 1
Which of the following matches regexp a(ab)*a
• abababa
• aaba
• aabbaa
• aba
• aabababa
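One way to check your answers is to let Python do the matching. The slide does not say whether the pattern has to cover the whole string or may match anywhere inside it, so this sketch tries both readings. The variable names are illustrative only.

```python
import re

pattern = r"a(ab)*a"
candidates = ["abababa", "aaba", "aabbaa", "aba", "aabababa"]

# Whole-string reading: the pattern must account for every character.
whole = [s for s in candidates if re.fullmatch(pattern, s)]

# Substring reading: the pattern may match anywhere inside the string.
anywhere = [s for s in candidates if re.search(pattern, s)]

print("whole string:", whole)
print("anywhere:    ", anywhere)
```

The two lists differ, which is a reminder to decide which kind of match you mean before reading the result.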
6. Exercise 2
• Which of the following matches regexp abc|xyz
• abc
• xyz
• abc|xyz
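The third candidate is the interesting one, because the pipe is a metacharacter. The sketch below, assuming a whole-string match, compares the pattern as written with a version where the pipe is escaped. `re.escape` is the standard-library helper that adds the backslashes for you.

```python
import re

candidates = ["abc", "xyz", "abc|xyz"]

# Unescaped, | means "or": the pattern is "abc" or "xyz".
alternation = [s for s in candidates if re.fullmatch(r"abc|xyz", s)]

# Escaped, \| is just an ordinary pipe character.
literal = [s for s in candidates if re.fullmatch(r"abc\|xyz", s)]

# re.escape builds the escaped pattern from the plain text.
escaped = [s for s in candidates if re.fullmatch(re.escape("abc|xyz"), s)]

print(alternation)
print(literal)
print(escaped)
```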
7. Repetitions
• a* = zero or more repetitions of a
• a+ = one or more repetitions of a
• a? = zero or one repetitions of a
– (in other words, might not be there)
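A small sketch of the three repetition operators, assuming whole-string matching with `re.fullmatch`. Each pattern is tried against an empty string, a single a, and three a's.

```python
import re

patterns = ["a*", "a+", "a?"]
samples = ["", "a", "aaa"]

# For each pattern, record whether it matches each sample in full.
table = {
    p: [re.fullmatch(p, s) is not None for s in samples]
    for p in patterns
}

for p, row in table.items():
    print(p, row)
# a* [True, True, True]
# a+ [False, True, True]
# a? [True, True, False]
```

Only `a+` rejects the empty string, and only `a?` rejects more than one a.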
8. Exercise 3
• Which of the following matches regexp ab+c?
• abc
• ac
• abbb
• bbc
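The question mark at the end of the exercise can be read two ways: as punctuation (the regexp is ab+c) or as part of the regexp (ab+c?, where the c is optional). The sketch below checks the candidates under both readings, assuming a whole-string match, so you can compare them with your own answer.

```python
import re

candidates = ["abc", "ac", "abbb", "bbc"]

# Reading 1: the ? is punctuation, the regexp is ab+c.
without_q = [s for s in candidates if re.fullmatch(r"ab+c", s)]

# Reading 2: the ? belongs to the regexp, making the final c optional.
with_q = [s for s in candidates if re.fullmatch(r"ab+c?", s)]

print("ab+c :", without_q)
print("ab+c?:", with_q)
```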
9. Exercise 4
• Which of the following matches regexp a.[bc]+
• abc
• abbbbbbbb
• azc
• abcbcbcbc
• ac
• asccbbbbcbcccc
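To see why a candidate does or does not match, it can help to look at which part of the pattern consumed which characters. This sketch adds capture groups around the `.` and the `[bc]+` (an addition for inspection only, not part of the exercise) and assumes a whole-string match.

```python
import re

# Same pattern as a.[bc]+, with capture groups added for inspection.
pattern = re.compile(r"a(.)([bc]+)")
candidates = ["abc", "abbbbbbbb", "azc", "abcbcbcbc", "ac", "asccbbbbcbcccc"]

breakdown = {}
for s in candidates:
    m = pattern.fullmatch(s)
    # groups() is (text matched by ".", text matched by "[bc]+"), or None if no match.
    breakdown[s] = m.groups() if m else None

for s, parts in breakdown.items():
    print(s, parts)
```

A candidate that prints `None` ran out of characters or hit a character the pattern does not allow at that position.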