Regular Expressions and You

An introduction to regular expressions.




James I. Armes
Web Developer, AllPlayers.com
@jamesiarmes
Email Validation Examples




^[\w.%+-]+@[\w.-]+\.[A-Za-z]{2,4}$
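A minimal sketch of the simple pattern in use, written for Python's re module (the deck itself does not use Python, and the helper name looks_like_email is made up for illustration). It also shows one limit of the pattern: {2,4} rejects top-level domains longer than four letters.

```python
import re

# The simple pattern from the slide, with its backslashes written out.
SIMPLE_EMAIL = re.compile(r"^[\w.%+-]+@[\w.-]+\.[A-Za-z]{2,4}$")


def looks_like_email(address: str) -> bool:
    """Hypothetical helper: True if the address fits the simple pattern."""
    return SIMPLE_EMAIL.match(address) is not None


print(looks_like_email("james@example.com"))       # True
print(looks_like_email("not an address"))          # False
print(looks_like_email("someone@example.museum"))  # False: TLD longer than 4 letters
```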
Email Validation Examples
(?:(?:rn)?[ t])*(?:(?:(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|"(?:[^"r]|.|(?:(?:rn)?[ t]))*"(?:(?:rn)?[ t])*)(?:.(?:(?:rn)?[ t])*(?:
[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|"(?:[^"r]|.|(?:(?:rn)?[ t]))*"(?:(?:rn)?[ t])*))*@(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 
000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*)(?:.(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[
["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*))*|(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|"(?:[^"r]|.|(?:(?:rn)?[ t]))*"(?:
(?:rn)?[ t])*)*<(?:(?:rn)?[ t])*(?:@(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*)(?:.(?:(?:rn)?[
t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*))*(?:,@(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+
(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*)(?:.(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[
["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*))*)*:(?:(?:rn)?[ t])*)?(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|"(?:[^"r]|.|(?:
(?:rn)?[ t]))*"(?:(?:rn)?[ t])*)(?:.(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|"(?:[^"r]|.|(?:(?:rn)?[ t]))*"(?:
(?:rn)?[ t])*))*@(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*)(?:.(?:(?:rn)?[ t])*(?:
[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*))*>(?:(?:rn)?[ t])*)|(?:[^()<>@,;:".[] 000-031]+(?:(?:
(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|"(?:[^"r]|.|(?:(?:rn)?[ t]))*"(?:(?:rn)?[ t])*)*:(?:(?:rn)?[ t])*(?:(?:(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?
=[["()<>@,;:".[]]))|"(?:[^"r]|.|(?:(?:rn)?[ t]))*"(?:(?:rn)?[ t])*)(?:.(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|"(?:
[^"r]|.|(?:(?:rn)?[ t]))*"(?:(?:rn)?[ t])*))*@(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?
[ t])*)(?:.(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*))*|(?:[^()<>@,;:".[] 000-031]+
(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|"(?:[^"r]|.|(?:(?:rn)?[ t]))*"(?:(?:rn)?[ t])*)*<(?:(?:rn)?[ t])*(?:@(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])
+|Z|(?=[["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*)(?:.(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|[([^
[]r]|.)*](?:(?:rn)?[ t])*))*(?:,@(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*)(?:.(?:
(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*))*)*:(?:(?:rn)?[ t])*)?(?:[^()<>@,;:".[] 
000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|"(?:[^"r]|.|(?:(?:rn)?[ t]))*"(?:(?:rn)?[ t])*)(?:.(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:
(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|"(?:[^"r]|.|(?:(?:rn)?[ t]))*"(?:(?:rn)?[ t])*))*@(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[
["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*)(?:.(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|[([^[]r]|.)*](?:
(?:rn)?[ t])*))*>(?:(?:rn)?[ t])*)(?:,s*(?:(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|"(?:[^"r]|.|(?:(?:rn)?[ t]))*"(?:(?:rn)?[ t])*)
(?:.(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|"(?:[^"r]|.|(?:(?:rn)?[ t]))*"(?:(?:rn)?[ t])*))*@(?:(?:rn)?[ t])*(?:
[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*)(?:.(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:
(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*))*|(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|"(?:[^"r]|.|(?:
(?:rn)?[ t]))*"(?:(?:rn)?[ t])*)*<(?:(?:rn)?[ t])*(?:@(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*)(?:.
(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*))*(?:,@(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 
000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*)(?:.(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[
["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*))*)*:(?:(?:rn)?[ t])*)?(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|"(?:[^"r]|.|(?:
(?:rn)?[ t]))*"(?:(?:rn)?[ t])*)(?:.(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|"(?:[^"r]|.|(?:(?:rn)?[ t]))*"(?:
(?:rn)?[ t])*))*@(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*)(?:.(?:(?:rn)?[ t])*(?:
[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*))*>(?:(?:rn)?[ t])*))*)?;s*)
Types of Regular Expressions

●   Simple Regular Expressions
●   POSIX Basic Regular Expressions
●   POSIX Extended Regular Expressions
●   Perl Regular Expressions
Simple Regular Expressions
●   Traditional regular expressions.
●   Not a standard.
●   Supported by some applications for backwards
    compatibility.
●   Deprecated.
POSIX Basic Regular Expressions
●   Created to provide a common standard for Unix
    tools.
●   Designed to be backwards compatible with
    traditional regular expressions.
●   Adopted as the default syntax of many Unix
    tools.
●   Some metacharacters require escaping.
POSIX Extended Regular Expressions
●   Adds some new metacharacters.
●   Metacharacters do not require escaping.
●   Dropped support for back references (\n).
●   Many Unix tools provide support with a
    command line argument (usually -E).
Perl Regular Expressions

●   Adds lazy quantification, named capture groups
    and recursive patterns.
●   Adopted by many programming languages due
    to its power.
●   Requires non-alphanumeric delimiters around
    expression.
●   Other languages only implement a subset, so
    implementations vary.
Syntax
Basic Metacharacters

.   Match any single character (newlines only with the s modifier).

^   Matches beginning of a string.

$   Matches end of a string.

|   Matches the expression before or after (think ||).
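A short sketch of the four basic metacharacters, using Python's re module as an assumed host language. The sample strings and variable names are made up for illustration.

```python
import re

# "." matches any single character (except a newline by default).
any_char = re.findall(r"W.rld", "World Wxrld Wrld")
print(any_char)  # ['World', 'Wxrld']

# "^" and "$" anchor the match to the start and end of the string.
starts_with_hello = bool(re.search(r"^Hello", "Hello World"))
ends_with_hello = bool(re.search(r"Hello$", "Hello World"))
print(starts_with_hello, ends_with_hello)  # True False

# "|" matches the expression on either side of it.
either = re.findall(r"World|Earth", "Hello World, Hello Earth")
print(either)  # ['World', 'Earth']
```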
Character Classes

[]      Match any one character listed inside the brackets.
[^ ]    Match any one character NOT listed inside the brackets.
[n-m]   Match any one character in the range n through m.




Examples:
[A-Za-z0-9]
[^G-Zg-z _]
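The two example classes can be tried out as follows. This is a sketch in Python's re module with a made-up sample string. A + quantifier is added after each class so that runs of matching characters are returned together.

```python
import re

text = "cafe_42 ZOO!"

# [A-Za-z0-9] matches one ASCII letter or digit at a time.
alnum_runs = re.findall(r"[A-Za-z0-9]+", text)
print(alnum_runs)  # ['cafe', '42', 'ZOO']

# [^G-Zg-z _] matches one character that is NOT G-Z, g-z, a space or "_".
negated_runs = re.findall(r"[^G-Zg-z _]+", text)
print(negated_runs)  # ['cafe', '42', '!']
```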
Shorthand Character Classes

\s           Any whitespace character, such as a space, tab or newline.
             Same as [\n\r\t ]
\w           Any word character.
             Same as [A-Za-z0-9_]
\d           Any digit character.
             Same as [0-9]
\S, \W, \D   Negated versions of the above. Can be used inside character
             classes but could be confusing.
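A quick sketch of the shorthand classes in Python's re module, with a made-up sample string. One caveat: for Python 3 text strings, \w, \d and \s are Unicode-aware by default, so the bracketed equivalents above hold exactly only when the re.ASCII flag is used.

```python
import re

text = "Order 66\twas issued\nat 19:00"

digits = re.findall(r"\d+", text)
print(digits)  # ['66', '19', '00']

words = re.findall(r"\w+", text)
print(words)   # ['Order', '66', 'was', 'issued', 'at', '19', '00']

# Splitting on \s+ and collecting \S+ give the same tokens here.
tokens = re.split(r"\s+", text)
non_space = re.findall(r"\S+", text)
print(tokens)  # ['Order', '66', 'was', 'issued', 'at', '19:00']
```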
Quantifiers

*       Match the preceding expression 0 or more times.
+       Match the preceding expression 1 or more times.
?       Match the preceding expression 0 or 1 time.
{m,n}   Match the preceding expression at least m times but no more than n times.
{m,}    Match the preceding expression at least m times with no maximum.
{,n}    Match the preceding expression no more than n times with no minimum. Not every flavor accepts this form; {0,n} works everywhere.
{n}     Match the preceding expression exactly n times.
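The quantifiers can be checked one by one with a small sketch like the one below. It assumes Python's re module, and the helper name full is made up for illustration.

```python
import re


def full(pattern: str, text: str) -> bool:
    """Hypothetical helper: True if the whole text matches the pattern."""
    return re.fullmatch(pattern, text) is not None


print(full(r"ab*c", "ac"))        # True:  * allows zero "b"
print(full(r"ab+c", "ac"))        # False: + needs at least one "b"
print(full(r"colou?r", "color"))  # True:  ? makes the "u" optional
print(full(r"\d{2,4}", "12345"))  # False: no more than four digits
print(full(r"\d{2,}", "12345"))   # True:  two or more digits
print(full(r"\d{3}", "123"))      # True:  exactly three digits
```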
Lazy Quantifiers

Standard quantifiers are greedy: they match as much text as possible.
Example:
Many programming courses start with a "Hello World" example.
That would be "Hallo Welt" in German.
"Hello .*"
Matches (treating the text as a single line): "Hello World" example. That would be "Hallo Welt"
Lazy Quantifiers

Use ? after a quantifier to make it lazy, so that it matches as little text as possible.
Example:
Many programming courses start with a "Hello World" example.
That would be "Hallo Welt" in German.
"Hello .*?"
Matches: "Hello World"
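The difference between greedy and lazy matching can be reproduced with a short sketch. It assumes Python's re module and treats the sample text as one line, because . does not cross a newline by default.

```python
import re

text = ('Many programming courses start with a "Hello World" example. '
        'That would be "Hallo Welt" in German.')

# Greedy: .* runs to the LAST double quote it can reach.
greedy = re.search(r'"Hello .*"', text).group(0)
print(greedy)  # "Hello World" example. That would be "Hallo Welt"

# Lazy: .*? stops at the FIRST double quote that lets the match succeed.
lazy = re.search(r'"Hello .*?"', text).group(0)
print(lazy)    # "Hello World"
```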
Grouping

()      Group the expression and capture the text.
(?: )   Group the expression but DO NOT capture the text.
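A small sketch of the difference, assuming Python's re module: the first group is captured and can be read back, while the (?: ) group only groups the alternatives.

```python
import re

m = re.search(r"(Hello) (?:World|Earth)", "Hello Earth")

whole_match = m.group(0)
captured = m.groups()

print(whole_match)  # Hello Earth
print(captured)     # ('Hello',)  only the first group was captured
```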
Backreferences

\1 through \9 reference previously captured text.
Example:
Many programming courses start with a "Hello World"
example. 'Hello World' examples are extremely simple,
especially when they just output "Hello World'.
('|")Hello World(\1)
Matches: "Hello World" and 'Hello World', but not the mismatched "Hello World'.
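The backreference example can be run as a sketch in Python's re module (an assumed host language). The closing quote has to be the same character that group 1 captured, so the mismatched pair is skipped.

```python
import re

text = ('Many programming courses start with a "Hello World" example. '
        "'Hello World' examples are extremely simple, "
        'especially when they just output "Hello World\'.')

# \1 must repeat whatever quote character group 1 captured.
pattern = re.compile(r"""('|")Hello World(\1)""")
matches = [m.group(0) for m in pattern.finditer(text)]

print(matches)  # ['"Hello World"', "'Hello World'"]
```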
Word Boundaries

\b matches the position between a word character
(\w) and a non-word character (\W).
Example:
Hello World
o\b
Hello| World
Word Boundaries

\B matches any position that is not a word boundary,
such as the position between two word characters (\w\w).
Example:
Hello World
o\B
Hello Wo|rld
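Both boundary examples can be checked with a sketch in Python's re module (an assumed host language). The positions printed are the indexes of each matched "o".

```python
import re

text = "Hello World"

# o\b: an "o" followed by a word boundary (the end of "Hello").
at_boundary = [m.start() for m in re.finditer(r"o\b", text)]
print(at_boundary)   # [4]

# o\B: an "o" NOT followed by a word boundary (the "o" inside "World").
not_boundary = [m.start() for m in re.finditer(r"o\B", text)]
print(not_boundary)  # [7]
```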
Lookaheads

(?= ) matches a position that is directly followed by
the expression, without consuming it.
Example:
Hello World sounds better than "Hello Earth".
Hello(?= World)
Matches: only the first Hello, the one followed by " World".
Lookbehinds

(?<= ) matches a position that is directly preceded by
the expression, without consuming it.
Example:
Hello World sounds better than "Hello Earth".
(?<=")Hello
Matches: only the second Hello, the one preceded by a double quote.
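The positive lookahead and lookbehind examples can be compared in a sketch using Python's re module (an assumed host language). The spans show which Hello is matched and that the surrounding text is not part of the match.

```python
import re

text = 'Hello World sounds better than "Hello Earth".'

# Lookahead: "Hello" only where " World" follows.
ahead = re.search(r"Hello(?= World)", text)
print(ahead.span(), ahead.group(0))    # (0, 5) Hello

# Lookbehind: "Hello" only where a double quote comes directly before it.
behind = re.search(r'(?<=")Hello', text)
print(behind.span(), behind.group(0))  # (32, 37) Hello
```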
Lookaheads

(?! ) matches a position that is NOT directly followed
by the expression.
Example:
Hello World sounds better than "Hello Earth".
Hello(?! World)
Matches: only the second Hello, the one not followed by " World".
Lookbehinds

(?<! ) matches a position that is NOT directly preceded
by the expression.
Example:
Hello World sounds better than "Hello Earth".
(?<!")Hello
Matches: only the first Hello, the one not preceded by a double quote.
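The negative forms pick the opposite occurrences. A sketch in Python's re module (an assumed host language):

```python
import re

text = 'Hello World sounds better than "Hello Earth".'

# Negative lookahead: "Hello" where " World" does NOT follow.
not_ahead = re.search(r"Hello(?! World)", text)
print(not_ahead.span())   # (32, 37)

# Negative lookbehind: "Hello" where no double quote comes directly before it.
not_behind = re.search(r'(?<!")Hello', text)
print(not_behind.span())  # (0, 5)
```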
Conditionals

(?(condition)then|else)
●   condition is typically a lookahead or a lookbehind; Perl and PCRE also accept a capture group number or name.
●   If condition is matched, then must match for the
    expression to pass.
●   If condition is not matched, else must match for
    the expression to pass.
Conditionals

Example:
Hello World sounds better than "Hello Earth".
Hello (?(?<=World)World|Earth)
Hello World sounds better than "Hello Earth".
Hello (?(?<=People)People|Earth)
Hello World sounds better than "Hello Earth".
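Python's re module supports the (?(condition)then|else) form only with a capture group number or name as the condition, not with a lookahead or lookbehind. The sketch below therefore swaps in a group-based condition. It is a different pattern from the ones on the slide, made up to show the same then/else mechanism.

```python
import re

text = 'Hello World sounds better than "Hello Earth".'

# Group 1 optionally captures an opening double quote.
# If group 1 took part in the match, "Earth" must follow "Hello ";
# otherwise "World" must follow.
pattern = re.compile(r'(")?Hello (?(1)Earth|World)')
found = [m.group(0) for m in pattern.finditer(text)]

print(found)  # ['Hello World', '"Hello Earth']
```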
Modifiers

i   Case insensitive matching.
s   . matches newline characters.
m   ^ and $ match after and before newlines (respectively).
x   Whitespace within the expression is ignored unless escaped.
g   Match globally.
Modifiers

●   (?a) to turn modifiers on, where a is a modifier letter.
●   (?-a) to turn modifiers off.
Examples:
(?i)WORLD(?-i)
(?i-s)WORLD.(?s-i)
(?i:WORLD)
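A sketch of the modifiers as they appear in Python's re module, which is only one possible host language and differs from Perl in places: flags are passed as re.IGNORECASE, re.DOTALL, re.MULTILINE and re.VERBOSE, there is no g modifier (re.findall and re.finditer return every match), and a bare (?-i) in the middle of a pattern is not accepted. The scoped form (?i:...) works in Python 3.6 and later.

```python
import re

text = "hello\nWORLD"

# i: case-insensitive matching, as a flag or as an inline modifier.
by_flag = bool(re.search(r"world", text, re.IGNORECASE))
by_inline = bool(re.search(r"(?i)world", text))
print(by_flag, by_inline)              # True True

# (?i:...) limits the modifier to one group.
scoped_hit = bool(re.search(r"(?i:h)ello", "Hello"))
scoped_miss = bool(re.search(r"(?i:h)ello", "HELLO"))
print(scoped_hit, scoped_miss)         # True False

# s: "." also matches newline characters.
without_s = bool(re.search(r"hello.WORLD", text))
with_s = bool(re.search(r"hello.WORLD", text, re.DOTALL))
print(without_s, with_s)               # False True

# m: "^" and "$" also match after and before newlines.
lines = re.findall(r"^\w+$", text, re.MULTILINE)
print(lines)                           # ['hello', 'WORLD']
```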
Language Implementations
JavaScript

●   RegExp object.
        –   var expression = new RegExp('World', 'g');
        –   var expression = /World/g;
●   String.match()
●   String.replace()
●   String.split()
Perl

●   if ($string =~ /regex/)
●   $string =~ s/regex/replacement/
●   Regexp::Common
        –   http://search.cpan.org/dist/Regexp-Common/
        –   Provides common expressions.
        –   Examples:
                ●   IP Address
                ●   Credit Card Number
                ●   Profanity
PHP

●   ereg vs. preg
       –   preg uses Perl syntax.
       –   ereg uses POSIX Extended syntax.
       –   preg is much faster.
       –   ereg has been deprecated as of PHP 5.3 and was removed in PHP 7.0.
PHP

●   preg_match()
●   preg_match_all()
●   preg_replace()
●   preg_split()
●   preg_quote()
●   http://www.php.net/manual/en/book.pcre.php
●   http://php.net/manual/reference.pcre.pattern.modifiers.php
Tools and Resources

●   txt2regex - http://aurelio.net/txt2regex/
●   Reggy (mac) - http://reggyapp.com/
●   Patterns (mac) - http://krillapps.com/patterns/
●   Web based - http://regex.larsolavtorvik.com/
●   Regular-Expressions.info (reference) -
    http://www.regular-expressions.info/
Thanks!




http://xkcd.com/208/

Regular Expressions and You

  • 1. Regular Expressions and You An introduction to regular expressions. James I. Armes Web Developer, AllPlayers.com @jamesiarmes
  • 2. Email Validation Examples ^[w.%+-]+@[w.-]+.[A-Za-z]{2,4}$
  • 3. Email Validation Examples (?:(?:rn)?[ t])*(?:(?:(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|"(?:[^"r]|.|(?:(?:rn)?[ t]))*"(?:(?:rn)?[ t])*)(?:.(?:(?:rn)?[ t])*(?: [^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|"(?:[^"r]|.|(?:(?:rn)?[ t]))*"(?:(?:rn)?[ t])*))*@(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*)(?:.(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[ ["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*))*|(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|"(?:[^"r]|.|(?:(?:rn)?[ t]))*"(?: (?:rn)?[ t])*)*<(?:(?:rn)?[ t])*(?:@(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*)(?:.(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*))*(?:,@(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+ (?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*)(?:.(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[ ["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*))*)*:(?:(?:rn)?[ t])*)?(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|"(?:[^"r]|.|(?: (?:rn)?[ t]))*"(?:(?:rn)?[ t])*)(?:.(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|"(?:[^"r]|.|(?:(?:rn)?[ t]))*"(?: (?:rn)?[ t])*))*@(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*)(?:.(?:(?:rn)?[ t])*(?: [^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*))*>(?:(?:rn)?[ t])*)|(?:[^()<>@,;:".[] 000-031]+(?:(?: (?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|"(?:[^"r]|.|(?:(?:rn)?[ t]))*"(?:(?:rn)?[ t])*)*:(?:(?:rn)?[ t])*(?:(?:(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(? 
=[["()<>@,;:".[]]))|"(?:[^"r]|.|(?:(?:rn)?[ t]))*"(?:(?:rn)?[ t])*)(?:.(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|"(?: [^"r]|.|(?:(?:rn)?[ t]))*"(?:(?:rn)?[ t])*))*@(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)? [ t])*)(?:.(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*))*|(?:[^()<>@,;:".[] 000-031]+ (?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|"(?:[^"r]|.|(?:(?:rn)?[ t]))*"(?:(?:rn)?[ t])*)*<(?:(?:rn)?[ t])*(?:@(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t]) +|Z|(?=[["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*)(?:.(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|[([^ []r]|.)*](?:(?:rn)?[ t])*))*(?:,@(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*)(?:.(?: (?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*))*)*:(?:(?:rn)?[ t])*)?(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|"(?:[^"r]|.|(?:(?:rn)?[ t]))*"(?:(?:rn)?[ t])*)(?:.(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?: (?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|"(?:[^"r]|.|(?:(?:rn)?[ t]))*"(?:(?:rn)?[ t])*))*@(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[ ["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*)(?:.(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|[([^[]r]|.)*](?: (?:rn)?[ t])*))*>(?:(?:rn)?[ t])*)(?:,s*(?:(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|"(?:[^"r]|.|(?:(?:rn)?[ t]))*"(?:(?:rn)?[ t])*) (?:.(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|"(?:[^"r]|.|(?:(?:rn)?[ t]))*"(?:(?:rn)?[ t])*))*@(?:(?:rn)?[ t])*(?: [^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ 
t])*)(?:.(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?: (?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*))*|(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|"(?:[^"r]|.|(?: (?:rn)?[ t]))*"(?:(?:rn)?[ t])*)*<(?:(?:rn)?[ t])*(?:@(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*)(?:. (?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*))*(?:,@(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*)(?:.(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[ ["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*))*)*:(?:(?:rn)?[ t])*)?(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|"(?:[^"r]|.|(?: (?:rn)?[ t]))*"(?:(?:rn)?[ t])*)(?:.(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|"(?:[^"r]|.|(?:(?:rn)?[ t]))*"(?: (?:rn)?[ t])*))*@(?:(?:rn)?[ t])*(?:[^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*)(?:.(?:(?:rn)?[ t])*(?: [^()<>@,;:".[] 000-031]+(?:(?:(?:rn)?[ t])+|Z|(?=[["()<>@,;:".[]]))|[([^[]r]|.)*](?:(?:rn)?[ t])*))*>(?:(?:rn)?[ t])*))*)?;s*)
  • 4. Types of Regular Expressions ● Simple Regular Expressions ● POSIX Basic Regular Expressions ● POSIX Extended Regular Expressions ● Perl Regular Expressions
  • 5. Simple Regular Expressions ● Traditional regular expressions. ● Not a standard. ● Support by some applications for backwards compatibility. ● Deprecated.
  • 6. POSIX Basic Regular Expressions ● Created to provide a common standard for Unix tools. ● Designed to be backwards compatible with traditional regular expressions. ● Adopted as the default syntax of many Unix tools. ● Some metacharacters require escaping.
  • 7. POSIX Extended Regular Expressions ● Adds some new metacharacters. ● Metacharacters do not require escaping. ● Dropped support for back references (n). ● Many Unix tools provide support with a command line argument (usually -E).
  • 8. Perl Regular Expressions ● Adds lazy quantification, named capture groups and recursive patterns. ● Adopted by many programming languages due to its power. ● Requires non-alphanumeric delimiters around expression. ● Other languages only implement a subset, so implementations vary.
  • 10. Basic Metacharacters . Match any single character. ^ Matches beginning of a string. $ Matches end of a string. | Matches the expression before or after (think ||).
  • 11. Character Classes [] Match any characters within the group. [^ ] Match any characters NOT within the group. [n-m] Match a range of characters. Examples: [A-Za-z0-9] [^G-Zg-z _]
  • 12. Shorthand Character Classes s Any whitespace character such as space, tab and newlines. Same as [nrt ] w Any word character. Same as [A-Za-z0-9_] d Any digit character. Same as [0-9] S, W, D Negated version of the above. Can be used inside character classes but could be confusing.
  • 13. Quantifiers * Match the preceding expression 0 or more times. + Match the preceding expression 1 or more times. ? Match the preceding expression 0 or 1 time. {m,n} Match the preceding expression at least m times but no more than n times. {m,} Match the preceding expression at least m times with no maximum. {,n} Match the preceding expression no more than n times with no minimum. {n} Match the preceding expression exactly n times.
  • 14. Lazy Quantifiers Standard quantifiers are greedy: they match as much text as possible. Example: Many programming courses start with a "Hello World" example. That would be "Hallo Welt" in German. Pattern: "Hello .*" Matches: "Hello World" example. That would be "Hallo Welt"
  • 15. Lazy Quantifiers Append ? to a quantifier to make it lazy, so that it matches as little text as possible. Example: Many programming courses start with a "Hello World" example. That would be "Hallo Welt" in German. Pattern: "Hello .*?" Matches: "Hello World"
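The greedy and lazy examples from the two slides above can be reproduced with a short sketch. Python's `re` module is an assumption; the text and both patterns are taken from the slides.

```python
import re

text = ('Many programming courses start with a "Hello World" example. '
        'That would be "Hallo Welt" in German.')

# Greedy: ".*" runs to the LAST double quote on the line.
greedy = re.search(r'"Hello .*"', text).group()

# Lazy: ".*?" stops at the FIRST double quote that lets the pattern match.
lazy = re.search(r'"Hello .*?"', text).group()

print(greedy)
print(lazy)
```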
  • 16. Grouping () Group the expression and capture the text. (?: ) Group the expression but DO NOT capture the text.
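The difference between the two kinds of group shows up in what the match object reports. The sketch below assumes Python's `re` module and a made-up date string.

```python
import re

# The year and day use capturing groups; the month uses a
# non-capturing group, so it is matched but not stored.
match = re.search(r"(\d{4})-(?:\d{2})-(\d{2})", "Released 2011-03-15")

whole_match = match.group(0)  # the full matched text
captured = match.groups()     # only the capturing groups

print(whole_match, captured)
```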
  • 17. Backreferences \1 through \9 reference previously captured text. Example: Many programming courses start with a "Hello World" example. 'Hello World' examples are extremely simple, especially when they just output "Hello World'. Pattern: ('|")Hello World(\1) Matches: "Hello World" and 'Hello World', but not "Hello World' (its opening and closing quotes differ).
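A sketch of the backreference example, assuming Python's `re` module. The text comes from the slide; the pattern drops the extra group around `\1`, which does not change what is matched.

```python
import re

text = ('Many programming courses start with a "Hello World" example. '
        "'Hello World' examples are extremely simple, especially when "
        "they just output \"Hello World'.")

# Group 1 captures the opening quote; \1 requires the same quote to close.
pattern = re.compile(r"""('|")Hello World\1""")

matches = [m.group(0) for m in pattern.finditer(text)]
print(matches)
```

The last occurrence opens with a double quote and closes with a single quote, so the backreference rejects it.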
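  • 18. Word Boundaries \b matches the position between a word character (\w) and a non-word character (\W), or between a word character and the start or end of the string. Example: Hello World Pattern: o\b Match position: Hello| World
  • 19. Word Boundaries \B matches any position that is not a word boundary, such as the position between two word characters (\w\w). Example: Hello World Pattern: o\B Match position: Hello Wo|rld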
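The two word-boundary examples can be checked by looking at where each match ends. Python's `re` module is an assumption; the whole-word search at the end uses an invented sample string.

```python
import re

text = "Hello World"

# o\b: an "o" followed by a word boundary (the "o" that ends "Hello").
boundary_ends = [m.end() for m in re.finditer(r"o\b", text)]

# o\B: an "o" NOT followed by a word boundary (the "o" inside "World").
non_boundary_ends = [m.end() for m in re.finditer(r"o\B", text)]

# A common use: match a whole word only.
whole_words = re.findall(r"\bcat\b", "cat concat cats")

print(text[:boundary_ends[0]] + "|" + text[boundary_ends[0]:])
print(text[:non_boundary_ends[0]] + "|" + text[non_boundary_ends[0]:])
print(whole_words)
```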
  • 20. Lookaheads (?= ) matches a position that is directly followed by the expression, without consuming it. Example: Hello World sounds better than "Hello Earth". Pattern: Hello(?= World) Matches: the first Hello (the one followed by World).
  • 21. Lookbehinds (?<= ) matches a position that is directly preceded by the expression, without consuming it. Example: Hello World sounds better than "Hello Earth". Pattern: (?<=")Hello Matches: the second Hello (the one preceded by a double quote).
  • 22. Lookaheads (?! ) matches a position that is NOT directly followed by the expression. Example: Hello World sounds better than "Hello Earth". Pattern: Hello(?! World) Matches: the second Hello (the one followed by Earth).
  • 23. Lookbehinds (?<! ) matches a position that is NOT directly preceded by the expression. Example: Hello World sounds better than "Hello Earth". Pattern: (?<!")Hello Matches: the first Hello (the one not preceded by a double quote).
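All four lookaround examples can be run against the slide's sentence to see which "Hello" each one selects. This is a sketch assuming Python's `re` module; the `starts` helper is hypothetical.

```python
import re

text = 'Hello World sounds better than "Hello Earth".'

first = 0                            # start of the first "Hello"
second = text.index('"Hello') + 1    # start of the quoted "Hello"


def starts(pattern):
    """Return the start offsets of every match of pattern in text."""
    return [m.start() for m in re.finditer(pattern, text)]


positive_lookahead = starts(r"Hello(?= World)")   # followed by " World"
positive_lookbehind = starts(r'(?<=")Hello')      # preceded by a quote
negative_lookahead = starts(r"Hello(?! World)")   # not followed by " World"
negative_lookbehind = starts(r'(?<!")Hello')      # not preceded by a quote

# Lookarounds assert without consuming: the matched text is only "Hello".
matched_text = re.search(r"Hello(?= World)", text).group()

print(positive_lookahead, positive_lookbehind,
      negative_lookahead, negative_lookbehind, matched_text)
```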
  • 24. Conditionals (?(condition)then|else) ● condition is a lookahead or a lookbehind (Perl and PCRE also accept a capture group number or name). ● If condition is matched, then must match for the expression to pass. ● If condition is not matched, else must match for the expression to pass.
  • 25. Conditionals Example: Hello World sounds better than "Hello Earth". Pattern: Hello (?(?=World)World|Earth) Matches: both Hello World and Hello Earth. Pattern: Hello (?(?=People)People|Earth) Matches: only Hello Earth, because the condition never holds and the else branch is always used.
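Support for conditionals varies by flavor. Python's `re` module (an assumption here, not something the slides use) does not accept a lookaround as the condition; it only supports the form `(?(group)then|else)`, which tests whether a capture group took part in the match. The sketch below shows that variant, using a pattern adapted from the Python documentation rather than the slide's example.

```python
import re

# If group 1 (an opening "<") matched, require a closing ">";
# otherwise require the end of the string.
pattern = re.compile(r"(<)?(\w+@\w+(?:\.\w+)+)(?(1)>|$)")

# The addresses below are made-up sample inputs.
bracketed = pattern.match("<user@host.com>") is not None
bare = pattern.match("user@host.com") is not None
missing_close = pattern.match("<user@host.com") is not None
stray_close = pattern.match("user@host.com>") is not None

print(bracketed, bare, missing_close, stray_close)
```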
  • 26. Modifiers ● i Case-insensitive matching. ● s . also matches newline characters. ● m ^ and $ also match after and before newlines (respectively). ● x Whitespace within the expression is ignored unless escaped. ● g Match globally (find every match instead of stopping at the first).
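How modifiers are written depends on the language. As a sketch, assuming Python's `re` module, the i, s, m and x modifiers map to the flags `re.IGNORECASE`, `re.DOTALL`, `re.MULTILINE` and `re.VERBOSE`. Python has no g flag; `re.findall` and `re.finditer` return every match, while `re.search` stops at the first. The sample strings are invented.

```python
import re

# i: case-insensitive matching.
case_insensitive = re.findall(r"world", "Hello WORLD", flags=re.IGNORECASE)

# s: "." also matches a newline.
dot_default = re.search(r"Hello.World", "Hello\nWorld")
dot_all = re.search(r"Hello.World", "Hello\nWorld", flags=re.DOTALL)

# m: "^" also matches right after each newline.
line_starts = re.findall(r"^\w+", "one two\nthree four", flags=re.MULTILINE)

# x: whitespace and comments inside the pattern are ignored.
verbose = re.compile(r"""
    \d{3}   # three digits
    -       # a literal hyphen
    \d{4}   # four digits
""", re.VERBOSE)
verbose_match = verbose.search("call 555-1234").group()

print(case_insensitive, dot_default, dot_all is not None,
      line_starts, verbose_match)
```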
  • 27. Modifiers ● (?a) turns modifier a on. ● (?-a) turns modifier a off. Here a stands for any modifier letter. Examples: (?i)WORLD(?-i) (?i-s)WORLD.(?s-i) (?i:WORLD)
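Inline modifier support also differs between flavors. In Python's `re` module (an assumption; the slide's examples follow Perl/PCRE syntax), the scoped form `(?i:...)` works in Python 3.6 and later, and a bare `(?i)` must appear at the very start of the pattern, where it applies to the whole expression. Mid-pattern toggles such as `(?-i)` are not available. A minimal sketch with invented sample strings:

```python
import re

# Scoped modifier: only WORLD is matched case-insensitively.
scoped = re.compile(r"Hello (?i:WORLD)")
scoped_hit = scoped.search("Hello world") is not None
scoped_miss = scoped.search("hello world") is not None  # "hello" is case-sensitive

# Leading modifier: applies to the whole pattern.
whole = re.compile(r"(?i)hello world")
whole_hit = whole.search("HELLO World") is not None

print(scoped_hit, scoped_miss, whole_hit)
```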
  • 29. JavaScript ● RegExp object. – var expression = new RegExp('World', 'g'); – var expression = /World/g; ● String.match() ● String.replace() ● String.split()
  • 30. Perl ● if ($string =~ /regex/) ● $string =~ s/regex/replacement/ ● Regexp::Common – http://search.cpan.org/dist/Regexp-Common/ – Provides common expressions. – Examples: ● IP Address ● Credit Card Number ● Profanity
  • 31. PHP ● ereg vs. preg – preg uses Perl syntax. – ereg uses POSIX Extended syntax. – preg is generally faster. – ereg has been deprecated as of PHP 5.3 and was removed in PHP 7.0.
  • 32. PHP ● preg_match() ● preg_match_all() ● preg_replace() ● preg_split() ● preg_quote() ● http://www.php.net/manual/en/book.pcre.php ● http://php.net/manual/reference.pcre.pattern.modifiers.php
  • 33. Tools and Resources ● txt2regex - http://aurelio.net/txt2regex/ ● Reggy (mac) - http://reggyapp.com/ ● Patterns (mac) - http://krillapps.com/patterns/ ● Web based - http://regex.larsolavtorvik.com/ ● Regular-Expressions.info (reference) - http://www.regular-expressions.info/