Regular Expression for Everyone
Sanjeev Kumar Jaiswal (Jassi)
Perl Programmer and Security Enthusiast
#NullHyd
Agenda to make you a superhero!
• Introduction to RegEx
• Match, substitute, transliterate
• Quantifiers, character classes, modifiers
• RegEx examples in Perl
• Books, sites, blogs to learn RegEx
http://www.aliencoders.org/
Regex needs an intro?
• Stephen Cole Kleene -> regular sets (1950s)
• Ken Thompson -> ed (1971)
• grep: Global Regular Expression Print (1973)
• sed (1974) and awk (1977) (text-processing tools)
• Search engines, word processors, text editors
• Henry Spencer -> software library for regular expressions (regex)
• Regular expression (regex or regexp)
Where regex is used!
• g/re/p (grep -P, egrep)
• sed and awk
• Perl, PCRE, and Perl-style engines (Python, Ruby, PHP, JavaScript)
• Text editors: Notepad++, vim, emacs, nano
• Search engines
• Think of a pattern, think of regex
Regex Fundamentals
Regex Operators
Match operator
• Written as m// (the leading m is optional when / is the delimiter)
• Matches a PAN card number: m/^[a-z]{5}\d{4}[a-z]$/i
Substitute operator
• Written as s/match_it/substitute_with/
• Remove a blank line: s/^\s*$//
Transliteration operator
• tr/// (or its sed-style synonym y///)
• Upper-case a string (like the uc function): tr/a-z/A-Z/
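The three operators can be tried side by side. This is a minimal sketch; the variable names and sample strings are made up for illustration.

```perl
use strict;
use warnings;

# m//: test whether a string has the PAN card shape (5 letters, 4 digits, 1 letter).
my $pan    = 'ABCDE1234F';
my $is_pan = $pan =~ m/^[a-z]{5}\d{4}[a-z]$/i ? 1 : 0;

# s///: empty a line that holds nothing but whitespace.
my $blank = " \t ";
$blank =~ s/^\s*$//;

# tr///: map each lowercase ASCII letter to its uppercase counterpart.
(my $upper = 'regex for everyone') =~ tr/a-z/A-Z/;

print "PAN ok: $is_pan\n";
print "blank line is now ", length($blank), " characters long\n";
print "$upper\n";
```

Unlike m// and s///, tr/// does not use a regex at all: it maps characters one to one.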
Metacharacters
Having special meaning?
A metacharacter is any character that has a special meaning inside a regex.
Ex: anchors, quantifiers, character classes.
The full list of metacharacters (12) is \, |, ^, $, *, +, ?, ., (, ), [, {
Some references (POSIX?) list 14, adding } and ].
To match one of these characters literally, put a backslash before it.
. doesn't mean full stop in regex
* doesn't mean multiplication in regex
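The difference between a metacharacter and its escaped, literal form shows up in a few one-line matches. This is a small sketch with made-up sample strings; quotemeta's pattern form \Q...\E is standard Perl and is shown as one way to escape a whole string at once.

```perl
use strict;
use warnings;

# Unescaped, . matches any character and + is a quantifier.
my $dot_any     = '3x50' =~ /^3.50$/  ? 1 : 0;   # . stands in for the x
my $dot_literal = '3x50' =~ /^3\.50$/ ? 1 : 0;   # \. needs a real dot
my $plus_quant  = 'a+b'  =~ /^a+b$/   ? 1 : 0;   # means "one or more a, then b"
my $plus_lit    = 'a+b'  =~ /^a\+b$/  ? 1 : 0;   # \+ is a literal plus sign

# \Q...\E escapes every metacharacter in the interpolated string.
my $needle = '1+1=2 (really?)';
my $found  = 'proof: 1+1=2 (really?)' =~ /\Q$needle\E/ ? 1 : 0;

print "$dot_any $dot_literal $plus_quant $plus_lit $found\n";
```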
Position Metacharacters (Anchors)
\ -> escapes (overrides) the next metacharacter.
^ -> matches the position at the beginning of the input string.
. -> matches any single character except newline.
$ -> matches the position at the end of the input string (or before a final newline).
| -> specifies the "or" condition (alternation).
() -> groups a pattern and creates a capture buffer for the match.
[] -> bracketed character group/class.
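Anchors, alternation and capture groups usually work together. The sketch below parses a few invented log lines; the line format is an assumption made only for this example.

```perl
use strict;
use warnings;

my @lines = ('ERROR: disk full', 'WARN: low memory', 'note: ERROR: ignored');
my @parsed;
for my $line (@lines) {
    # ^ and $ pin the match to the whole line, | offers two alternatives,
    # and each () pair saves what it matched in $1 and $2.
    if ($line =~ /^(ERROR|WARN): (.+)$/) {
        push @parsed, "$1=$2";
    }
}
print "$_\n" for @parsed;
```

The third line is skipped because ^ requires ERROR or WARN at the very start of the string.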
Character Classes
A character class denotes a set of characters, exactly one of which is matched.
• The dot
• The backslash sequences
  • Word characters -> \w or [a-zA-Z0-9_]
  • Whitespace -> \s or [\t\n\f\r ]
• Bracketed character classes, represented by [character_group]
  • [aeiou], [a-f-z], [a-z], [-f] -> positive character class/group
  • [^\d], [^^] -> negated character class/group ([x^] is positive: a caret that is not first is literal)
• POSIX character classes, [:class:], valid only inside a bracketed class
  • [[:alpha:]], [01[:alpha:]%], [[:alnum:]]
• Read more: http://perldoc.perl.org/perlrecharclass.html
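The bracketed and POSIX classes listed above can be checked one character at a time. In this sketch the sample characters and the table layout are illustrative choices, not part of the slides.

```perl
use strict;
use warnings;

my @checks = (
    [ 'e', qr/[aeiou]/        ],  # listed in the class
    [ 'x', qr/[aeiou]/        ],  # not listed
    [ '-', qr/[a-f-z]/        ],  # a to f, a hyphen, or z
    [ '^', qr/[^^]/           ],  # anything except a caret
    [ '^', qr/[x^]/           ],  # caret is literal when it is not first
    [ '%', qr/[01[:alpha:]%]/ ],  # POSIX class mixed with literals
    [ '7', qr/[[:alpha:]]/    ],  # digits are not alphabetic
);

my @results;
for my $check (@checks) {
    my ($char, $class) = @$check;
    my $got = $char =~ $class ? 1 : 0;
    push @results, $got;
    printf "%-3s against %-24s => %d\n", $char, $class, $got;
}
```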
Shorthand Character Classes
• \d for [0-9]
• \D for [^\d]
• \s for [\t\r\f\n ]
• \S for [^\s]
• \w for [A-Za-z0-9_]
• \W for [^\w]
Zero-width assertions
• \b Match a word boundary
• \B Match anywhere except at a word boundary
• \A Match only at beginning of string (think of ^)
• \Z Match only at end of string, or before newline at the end (think of $)
• \z Match only at end of string
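A short sketch of the shorthand classes and zero-width assertions in action; the sample text is invented for the example.

```perl
use strict;
use warnings;

my $text = "cat concatenate cat\n";

my @numbers   = 'Order 66, room 101' =~ /\d+/g;   # runs of digits
my $any_cat   = () = $text =~ /cat/g;             # count every "cat"
my $whole_cat = () = $text =~ /\bcat\b/g;         # count "cat" only as a whole word

# \Z tolerates one trailing newline, \z does not.
my $ends_Z = $text =~ /cat\Z/ ? 1 : 0;
my $ends_z = $text =~ /cat\z/ ? 1 : 0;

print "@numbers | $any_cat $whole_cat | $ends_Z $ends_z\n";
```

The word-boundary version skips the "cat" hidden inside "concatenate", which is the usual reason to reach for \b.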
Quantifies the pattern
Quantifiers
• . -> matches any character except newline (not a quantifier itself, but often paired with one)
• * -> matches 0 or more times
• ? -> matches 0 or 1 time
• + -> matches 1 or more times
• {n} -> matches exactly n times
• {min,max} -> matches min to max times
• {,max} -> matches 0 to max times (a quantifier only since Perl 5.34; write {0,max} for older versions)
• {min,} -> matches min or more times
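Each quantifier can be checked against a small input. The test strings below are made up; every pattern is anchored with ^ and $ so the quantifier has to account for the whole string.

```perl
use strict;
use warnings;

my @cases = (
    [ 'color', qr/^colou?r$/ ],  # ?         : the u is optional
    [ 'gal',   qr/^go*al$/   ],  # *         : zero o's are fine
    [ 'gal',   qr/^go+al$/   ],  # +         : needs at least one o
    [ '2024',  qr/^\d{4}$/   ],  # {n}       : exactly four digits
    [ '12345', qr/^\d{2,4}$/ ],  # {min,max} : five digits is too many
    [ '12345', qr/^\d{2,}$/  ],  # {min,}    : two or more digits
);

my @results = map { $_->[0] =~ $_->[1] ? 1 : 0 } @cases;
print "@results\n";
```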
Modifiers
Minimal Modifiers
• Modifiers change the behavior of the regular expression operators.
They appear at the end of:
1. the match (m//),
2. the substitution (s///), and
3. the qr// operators
• i -> case-insensitive pattern matching
• g -> match the pattern globally, i.e. repeatedly in the string
• m -> treat string as multiple lines (^ and $ match at each line)
• s -> treat string as a single line (. also matches newline)
• x -> permit whitespace and comments in the pattern
• e -> evaluate the right-hand side as an expression (s/// only)
Want to know more: http://perldoc.perl.org/perlre.html#Modifiers
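The six modifiers can be seen in one short script. This is a sketch with invented sample strings, meant only to show what each flag changes.

```perl
use strict;
use warnings;

my $text = "Perl\nperl\nPERL";

my $count_i   = () = $text =~ /perl/gi;      # /i ignores case, /g finds every match
my @lines     = $text =~ /^(\w+)$/mg;        # /m lets ^ and $ match at each line
my $dot_plain = "a\nb" =~ /a.b/  ? 1 : 0;    # . stops at the newline
my $dot_s     = "a\nb" =~ /a.b/s ? 1 : 0;    # /s lets . match the newline

# /x allows layout and comments inside the pattern.
my ($year) = 'released 2015-08-01' =~ /
    (\d{4})     # year
    - \d{2}     # month
    - \d{2}     # day
/x;

# /e runs the replacement as Perl code.
(my $doubled = '3 apples and 5 pears') =~ s/(\d+)/$1 * 2/ge;

print "$count_i | @lines | $dot_plain $dot_s | $year | $doubled\n";
```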
Regex Examples
Trim extra spaces from a string
Requirements:
• Trim any space(s) found before the string
• Trim any space(s) found after the string
• Keep space(s) that occur inside the string
• "  c and cpp  ", "  c and cpp", or "c and cpp  " should all be saved as "c and cpp"
$word =~ s/^\s+//;
$word =~ s/\s+$//;
Or
$word =~ s/^\s+|\s+$//g;
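The one-line form of the trim can be wrapped in a small function. The name trim is a hypothetical helper for this sketch, not a Perl built-in.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a copy with leading and trailing whitespace removed.
sub trim {
    my ($word) = @_;
    $word =~ s/^\s+|\s+$//g;   # /g is needed so both ends are handled
    return $word;
}

my @trimmed = map { trim($_) } ('  c and cpp  ', '  c and cpp', 'c and cpp  ');
print "[$_]\n" for @trimmed;
```

Working on a copy leaves the caller's string untouched, which is usually what a helper like this should do.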
Only Yes or No
Requirements:
Accept the input only if the user types yes or no in any of the following ways:
• yes Yes YEs YES YeS yES yeS
• No no NO nO
• y Y
• n N
$answer =~ m/^(y(?:es)?|no?)$/i;
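A quick way to see which inputs the yes/no pattern accepts is to loop over a few candidates. The function name is_yes_or_no and the sample answers are assumptions of this sketch.

```perl
use strict;
use warnings;

# Hypothetical helper: 1 for y, yes, n or no in any letter case, otherwise 0.
sub is_yes_or_no {
    my ($answer) = @_;
    return $answer =~ m/^(?:y(?:es)?|no?)$/i ? 1 : 0;
}

for my $answer (qw(yes YeS Y no nO n yeah nope ye)) {
    printf "%-5s %s\n", $answer, is_yes_or_no($answer) ? 'accepted' : 'rejected';
}
```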
Simulate wc command
Requirements:
• We need to count
  • Number of words
  • Number of lines
  • Number of characters, with or without whitespace
open(my $in, '<', $filename) || die "Can't open $filename: $!\n";
while (my $line = <$in>) {
    $chars_with_space += length($line);
    $space_count++ while $line =~ /\s/g;   # every whitespace character
    $words++       while $line =~ /\S+/g;  # every run of non-whitespace
    $lines += $line =~ /\n$/ ? 1 : 0;
}
close($in);
# Characters without whitespace: $chars_with_space - $space_count
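The counting loop can be tried without a file by opening an in-memory filehandle on a string. The function name wc and the sample text are assumptions made for this sketch.

```perl
use strict;
use warnings;

# Hypothetical helper: counts lines, words and characters of a string,
# reading it line by line through an in-memory filehandle.
sub wc {
    my ($text) = @_;
    my %count = (lines => 0, words => 0, chars => 0, whitespace => 0);
    open(my $in, '<', \$text) or die "Can't open in-memory file: $!\n";
    while (my $line = <$in>) {
        $count{chars} += length($line);
        $count{whitespace}++ while $line =~ /\s/g;    # each whitespace character
        $count{words}++      while $line =~ /\S+/g;   # each run of non-whitespace
        $count{lines}++      if    $line =~ /\n$/;    # only newline-terminated lines
    }
    close($in);
    return %count;
}

my %count = wc("one two\nthree  four five\n");
printf "%d lines, %d words, %d chars (%d without whitespace)\n",
    @count{qw(lines words chars)}, $count{chars} - $count{whitespace};
```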
Nth Match in a string
Requirements:
• String is "1212211322312223459812222098abcXYZ"
• Look for the pattern 12+, i.e. matches 12, 122, 1222 and so on
• Find the 4th occurrence of the pattern
my $n = 4; # nth occurrence
my ($match) = $_ =~ /(?:.*?(12+)){$n}/;
print "#$n match is $match\n";
Or
my @matches = ($_ =~ /(12+)/g);
print "#$n match is $matches[$n - 1]\n";
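Both approaches can be run against the sample string to confirm they agree. The helper name nth_match is hypothetical, and it assumes the pattern passed in has no capture groups of its own.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the $n-th match of $pattern in $string, or undef.
# Assumes $pattern contains no capture groups of its own.
sub nth_match {
    my ($string, $pattern, $n) = @_;
    my @matches = $string =~ /($pattern)/g;
    return $matches[$n - 1];
}

my $string = '1212211322312223459812222098abcXYZ';
my $fourth = nth_match($string, qr/12+/, 4);

# Single-regex form: the group repeats 4 times and $1 keeps the last repetition.
my ($same) = $string =~ /(?:.*?(12+)){4}/;

print "4th match: $fourth (single-regex form: $same)\n";
```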
Email Validation
Requirements:
An email address could be almost anything, but it should respect some boundaries:
• It must contain @ and .
• The username can mix letters, digits and _ (usually) at any length, but it must start with a letter, i.e. a-z or A-Z (I restricted the length to 8-15)
• The domain name should be 2 to 63 characters long
• The part after the last . must not contain @ and must be 2 to 4 letters long
$pattern = qr/^([a-zA-Z]\w{7,14})\@([a-zA-Z0-9.-]{2,63})\.([a-zA-Z]{2,4})$/;
* We can write another email pattern for newly available domains like .website, .shikha etc.
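The requirements can be exercised with a precompiled pattern. This sketch assumes the username limit means 8 to 15 characters in total and the domain limit means 2 to 63 characters; the sample addresses are invented.

```perl
use strict;
use warnings;

# One letter plus 7 to 14 word characters gives a username of 8 to 15 characters.
my $email_re = qr/^([a-zA-Z]\w{7,14})\@([a-zA-Z0-9.-]{2,63})\.([a-zA-Z]{2,4})$/;

for my $address ('jassics01@gmail.com', 'abc@x.io', '1username@example.com',
                 'username_1@example.website') {
    if (my ($user, $domain, $tld) = $address =~ $email_re) {
        print "$address: user=$user domain=$domain tld=$tld\n";
    } else {
        print "$address: rejected\n";
    }
}
```

Only the first address passes: the others fail on username length, a leading digit, and a top-level domain longer than four letters.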
IP Address Validation
Requirements:
• Match an IPv4 address
• It should be between 0.0.0.0 and 255.255.255.255
Simplest one (checks only the shape, so it also accepts 999.999.999.999)
$ip =~ /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/;
Advanced one (also limits each octet to 0-255)
$ip =~ /^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$/;
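Comparing the two patterns on the same inputs shows what the longer one buys. In this sketch the octet part is factored into its own qr// object, which is a layout choice of the example rather than something from the slides.

```perl
use strict;
use warnings;

my $octet   = qr/25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?/;
my $ipv4_re = qr/^(?:(?:$octet)\.){3}(?:$octet)$/;                 # range-checked
my $shape   = qr/^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/;    # shape only

for my $ip (qw(192.168.1.1 255.255.255.255 256.1.1.1 999.999.999.999 1.2.3)) {
    printf "%-16s simple=%d advanced=%d\n", $ip,
        ($ip =~ $shape ? 1 : 0), ($ip =~ $ipv4_re ? 1 : 0);
}
```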
Find WordPress Version
Requirements:
• Find a site which runs on WordPress
• Try to figure out the WP version
  • Either from the response headers: check for the X-Meta-Generator keyword
  • Or from the readme.html page
  • Or from the wp-login.php page
($version) = $content =~ m/X-Meta-Generator:\s+(.*)/img;
($version) = $content =~ /version\s+(\d+\.\d+(?:\.\d+)?)/img;
($version) = $content =~ m#wp-(?:admin|content|includes)/(?!plugins|js).*?ver=(\d+\.\d+(?:\.\d+)?(?:[-\w.]+)?)#img;
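The three patterns can be tried on canned text before pointing them at a live site. The header, readme and login snippets below are made-up samples; real responses will differ.

```perl
use strict;
use warnings;

# Made-up samples of the three sources.
my $headers = "HTTP/1.1 200 OK\nX-Meta-Generator: WordPress 4.1\n";
my $readme  = "<h1>WordPress</h1>\n<br /> Version 4.2.2\n";
my $login   = "<link rel='stylesheet' href='/wp-admin/css/login.min.css?ver=4.2.2' />\n";

my ($from_header) = $headers =~ m/X-Meta-Generator:\s+(.*)/i;
my ($from_readme) = $readme  =~ /version\s+(\d+\.\d+(?:\.\d+)?)/i;
my ($from_login)  = $login   =~
    m#wp-(?:admin|content|includes)/(?!plugins|js).*?ver=(\d+\.\d+(?:\.\d+)?(?:[-\w.]+)?)#i;

print "$from_header | $from_readme | $from_login\n";
```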
Extract css/js files used in a site
Requirements:
• Request the URL and get the content
• Parse through each js and css link
• Based on the js and css files, guess the technology used
• Regex to find all possible CSS links
my (@css) = $content =~ m#<link\s+(?:.*?)href=['"]([^'"]*)#mgi;
foreach (@css) {
    ($css) = $_ =~ m#(?:.*)/(.*)#g;          # last path segment
    ($css) = $_ =~ m#([^/]*\.css)#gi;        # file name ending in .css
    ($css) = $css =~ m#(.*)\.css# if ($css); # strip the .css extension
}
• Regex to find all possible JavaScript links
my (@js) = $content =~ m#<script\s+(?:.*?)src=['"]([^'"]*)#mig;
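The link and script patterns can be checked against a small page fragment. The HTML below and the file names in it are invented for this sketch.

```perl
use strict;
use warnings;

my $content = <<'HTML';
<link rel="stylesheet" href="/assets/css/bootstrap.min.css">
<link rel='stylesheet' href='https://example.com/wp-content/themes/x/style.css?ver=4.2'>
<script type="text/javascript" src="/js/jquery.min.js"></script>
HTML

# Every href of a <link> tag and every src of a <script> tag.
my @css = $content =~ m#<link\s+(?:.*?)href=['"]([^'"]*)#gi;
my @js  = $content =~ m#<script\s+(?:.*?)src=['"]([^'"]*)#gi;

# Keep only the stylesheet's base name, without path or .css extension.
my @css_names = map { m#([^/]*)\.css#i ? $1 : () } @css;

print "css: @css_names\n";
print "js:  @js\n";
```

Names such as bootstrap.min or jquery.min are the kind of hint the slide uses to guess the technology behind a site.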
Common credit card vendor RegEx
Amex Card: ^3[47][0-9]{13}$
Diners Club Card: ^3(?:0[0-5]|[68][0-9])[0-9]{11}$
Discover Card: ^6(?:011|5[0-9]{2})[0-9]{12}$
JCB Card: ^(?:2131|1800|35\d{3})\d{11}$
Maestro Card: ^(5018|5020|5038|6304|6759|6761|6763)[0-9]{8,15}$
Master Card: ^5[1-5][0-9]{14}$
Visa Card: ^4[0-9]{12}(?:[0-9]{3})?$
Snort rule to alert on a Visa card number:
alert tcp any any <> any any (pcre:"/4\d{3}(\s|-)?\d{4}(\s|-)?\d{4}(\s|-)?\d{4}/"; msg:"VISA card number detected in clear text"; content:"visa"; nocase; sid:9000000; rev:1;)
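A few of the vendor patterns can be combined into a lookup. The helper name card_vendor is hypothetical, and the inputs are widely published test numbers rather than real cards.

```perl
use strict;
use warnings;

# Vendor patterns from the list above, checked in this order.
my @vendors = (
    [ 'Amex'        => qr/^3[47][0-9]{13}$/ ],
    [ 'Diners Club' => qr/^3(?:0[0-5]|[68][0-9])[0-9]{11}$/ ],
    [ 'Discover'    => qr/^6(?:011|5[0-9]{2})[0-9]{12}$/ ],
    [ 'Master Card' => qr/^5[1-5][0-9]{14}$/ ],
    [ 'Visa'        => qr/^4[0-9]{12}(?:[0-9]{3})?$/ ],
);

# Hypothetical helper: strips spaces and hyphens, then names the first matching vendor.
sub card_vendor {
    my ($number) = @_;
    (my $digits = $number) =~ tr/ -//d;
    for my $vendor (@vendors) {
        return $vendor->[0] if $digits =~ $vendor->[1];
    }
    return 'unknown';
}

print card_vendor($_), "\n"
    for ('4111 1111 1111 1111', '3782-822463-10005', '5555555555554444', '1234');
```

These patterns only check the prefix and length; they do not verify the checksum digit.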
Preventing XSS
• Just an example; many more such regexes can be added for better prevention
• Regex for a simple XSS attack
s/((%3C)|<)((%2F)|\/)*[a-z0-9%]+((%3E)|>)//gix;
• Blind regex for XSS attacks
s/((%3C)|<)[^<%3C]+((%3E)|>?)//sig;
• Regex check for an "<img src" XSS attack
s/((%3C)|<)((%69)|i|(%49))((%6D)|m|(%4D))((%67)|g|(%47))[^\n]+((%3E)|>)//gi;
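The simple-tag substitution can be tried on raw and URL-encoded input. The helper name strip_simple_tags is hypothetical, and like the slide's regexes this is only an illustration of the pattern, not a complete filter.

```perl
use strict;
use warnings;

# Hypothetical helper: deletes anything that looks like a simple tag,
# whether the angle brackets are raw or URL-encoded.
sub strip_simple_tags {
    my ($input) = @_;
    $input =~ s/((%3C)|<)((%2F)|\/)*[a-z0-9%]+((%3E)|>)//gix;
    return $input;
}

my $raw     = strip_simple_tags('Hello <script>alert(1)</script>world');
my $encoded = strip_simple_tags('x%3Cscript%3Ey');

print "$raw\n$encoded\n";
```

Note that only the tags are removed; the text between them (here alert(1)) stays in place.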
Find Duplicate words in a file
Requirements:
• Output lines that contain duplicate words
• Find doubled words that span lines
• Ignore capitalization differences
• Ignore HTML tags
$/ = ".\n";
while (<>) {
    next if !s/\b([a-z]+)((?:\s|<[^>]+>)+)(\1\b)/\e[7m$1\e[m$2\e[7m$3\e[m/ig;
    s/^(?:[^\e]*\n)+//mg; # Remove any unmarked lines.
    s/^/$ARGV: /mg; # Ensure lines begin with filename.
    print;
}
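The core of the doubled-word pattern can be tried on a string instead of a file. This is a simplified sketch that only collects the repeated words and skips the terminal highlighting; the sample text is invented.

```perl
use strict;
use warnings;

my $text = "This is is a test.\nParis in the\nthe spring. Perl <b>rocks</b> rocks.\n";

my @dups;
# A word, then whitespace and/or tags, then the same word again (\1).
while ($text =~ /\b([a-z]+)(?:\s|<[^>]+>)+\1\b/ig) {
    push @dups, lc $1;
}

print "doubled: @dups\n";
```

Because \s also matches a newline, the "the" that is split across two lines is found as well.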
More Regex
• # Famous one-line Perl regex for prime numbers by Abigail
perl -le 'print "Prime Number" if (1 x shift) !~ /^1?$|^(11+?)\1+$/' [number]
• # Decimal number
/^[+-]?\d+\.?\d*$/
• # Extract HTTP User-Agent
/^User-Agent: (.+)$/
• # Replace <b> with <strong>
$html =~ s|<(/?)b>|<$1strong>|g
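The prime-number trick is easier to follow when wrapped in a function and run over a range. The name is_prime and the sample inputs are assumptions of this sketch.

```perl
use strict;
use warnings;

# Hypothetical helper: Abigail's test on a string of $n ones.
# ^1?$ rules out 0 and 1; ^(11+?)\1+$ matches when the ones
# split into two or more equal groups of at least two.
sub is_prime {
    my ($n) = @_;
    return (1 x $n) !~ /^1?$|^(11+?)\1+$/ ? 1 : 0;
}
my @primes = grep { is_prime($_) } 0 .. 20;

# The optional slash is captured so closing tags are rewritten too.
(my $html = '<b>bold</b> text') =~ s|<(/?)b>|<$1strong>|g;

my ($agent) = 'User-Agent: Mozilla/5.0' =~ /^User-Agent: (.+)$/;

print "@primes\n$html\n$agent\n";
```

The regex tries every group size by backtracking, so it is a curiosity rather than a practical primality test.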
Regex Tools
• My favorite: http://regexr.com/
• Expresso for Windows: http://www.ultrapico.com/Expresso.htm
• Firefox add-on: https://addons.mozilla.org/en-US/firefox/addon/rext/
• Ruby-based regex tool: http://www.rubular.com/
• Python-based regex tool: http://www.pythonregex.com/
• JavaScript-based regex tool: http://regexpal.com/
Final Note
• Don't make simple things complex by using regex
• Don't try to do everything in one regex; use several smaller ones
• Test a regex in a regex tool before using it
• Use post-processing whenever possible
• Alternations? Think of grouping the regex
• Capture only when it is necessary
• Use /x for whitespace and comments, if possible
• Use regex wisely when dealing with lazy or greedy quantifiers
• Regex is not a natural language parser
* The next move is to learn the backtracking and lookaround concepts in regex
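Two of these tips, greedy versus lazy quantifiers and /x layout, are easiest to see in code. The sample strings are made up for this sketch.

```perl
use strict;
use warnings;

my $html = '<b>one</b> and <b>two</b>';

my ($greedy) = $html =~ m{<b>(.*)</b>};    # .* runs on to the last </b>
my ($lazy)   = $html =~ m{<b>(.*?)</b>};   # .*? stops at the first </b>

# /x spreads the pattern out; the alternatives share one group,
# which is also the only capture.
my ($ext) = 'photo.JPEG' =~ m{
    \.                       # literal dot
    ( jpe?g | png | gif )    # file extension
    $
}xi;

print "$greedy\n$lazy\n$ext\n";
```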
Just a Note!
Resources
Books to read
1. Compilers: Principles, Techniques, and Tools by Aho, Sethi, and Ullman
2. Mastering Regular Expressions by Jeffrey Friedl
3. Regular Expression Recipes by Nathan A. Good
4. Mastering Python Regular Expressions (Packt)
5. Learn Regex the Hard Way (online)
6. Wikibook on Regular Expressions
Resources/References
1. http://perldoc.perl.org/perlre.html
2. http://www.regular-expressions.info/
3. https://www.owasp.org/index.php/OWASP_Validation_Regex_Repository
4. http://rick.measham.id.au/paste/explain.pl
5. http://www.catonmat.net/series/perl-one-liners-explained
6. YouTube Video Series
7. Lecture on Regex Video
Sharing is Caring!
1. FB: https://www.facebook.com/jassics
2. Twitter: https://www.twitter.com/aliencoders
3. G+: https://plus.google.com/+Aliencoders
4. Website: http://www.aliencoders.org/
5. Email: jassics [at] gmail [dot] com
6. Slideshare: https://www.slideshare.net/jassics


  • 1. Regular Expression For Everyone Sanjeev Kumar Jaiswal (Jassi) Perl Programmer and Security Enthusiast #NullHyd
  • 2. Agenda to make you a superhero!
      • Introduction to RegEx
      • Match, substitute, transliterate
      • Quantifiers, Character Class, Modifiers
      • RegEx Examples in Perl
      • Books, sites, blogs to learn RegEx
      http://www.aliencoders.org/
  • 3.
  • 4. Regex needs Intro?
      • Stephen Cole Kleene -> Regular Sets (1950s)
      • Ken Thompson -> ed (1971)
      • grep, Global Regular Expression Print (1973)
      • sed (1974) & awk (1977) (text processing tools)
      • Search engines, word processors, text editors
      • Henry Spencer -> software library for regular expressions (regex)
      • Regular Expression (regex or regexp)
  • 5. Where Regex is used!
      • g/re/p (grep -P, egrep)
      • sed and awk
      • Perl and PCRE (Python, Ruby, PHP, JavaScript)
      • Text editors: Notepad++, vim, emacs, nano
      • Search engines
      • Think of pattern, think of regex
  • 7. Regex Operators
      • Match operator, represented by m at the beginning. Matches a PAN card number: m/^[a-z]{5}\d{4}[a-z]$/i
      • Substitute operator, represented by s/match_it/substitute_with/. Remove a blank line: s/^\s*$//
      • Transliteration operator, tr/// or the earlier y///. Change the case of a string (like the uc function): tr/a-z/A-Z/
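The three operators above can be tried out with a short script. This is a minimal sketch rather than code from the slides: the helper names (looks_like_pan, blank_whitespace_lines, upcase_ascii) and the sample values are made up for illustration, and the blank-line substitution uses [ \t]+ with /m instead of the slide's \s* so that it works on a multi-line string without eating newlines.

```perl
use strict;
use warnings;

# m// : test whether a string has the PAN-card shape from the slide
# (5 letters, 4 digits, 1 letter). Returns 1 or 0.
sub looks_like_pan {
    my ($text) = @_;
    return $text =~ m/^[a-z]{5}\d{4}[a-z]$/i ? 1 : 0;
}

# s/// : empty every line that holds only spaces or tabs.
# /m (an addition to the slide's pattern) lets ^ and $ match at each line.
sub blank_whitespace_lines {
    my ($text) = @_;
    $text =~ s/^[ \t]+$//mg;
    return $text;
}

# tr/// : map lowercase ASCII letters to uppercase, like uc for ASCII text.
sub upcase_ascii {
    my ($text) = @_;
    (my $copy = $text) =~ tr/a-z/A-Z/;
    return $copy;
}

print looks_like_pan('ABCDE1234F'), "\n";   # sample value, not a real PAN
print upcase_ascii('regex'), "\n";
```

Note that tr/// works character by character and takes no regex, which is why it sits beside m// and s/// as an operator rather than as a pattern feature.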
  • 9. Having special meaning?
      • A metacharacter is any character that has special meaning while using regex. Ex: anchors, quantifiers, character classes.
      • The full list of metacharacters (12) is \, |, ^, $, *, +, ?, ., (, ), [, {
      • Somewhere (in POSIX?), you will see 14 (including } and ])
      • To use one with its literal meaning, put a backslash before it.
      • . doesn't mean full stop in regex
      • * doesn't mean multiplication in regex
  • 10. Position Metacharacters (Anchors)
      • \ -> escapes (overrides) the next metacharacter
      • ^ -> matches the position at the beginning of the input string
      • . -> matches any single character except newline
      • $ -> matches the position at the end of the input string
      • | -> specifies the or condition
      • () -> matches a pattern and creates a capture buffer for the match
      • [] -> bracketed character group/class
  • 11. Character Classes
      A character class denotes a set of characters such that exactly one character of the set is matched.
      • The dot
      • The backslash sequences
          • Word characters -> \w or [a-zA-Z0-9_]
          • Whitespace -> \s or [\t\n\f\r ]
      • Bracketed character classes, represented by [character_group]
          • [aeiou], [a-f-z], [a-z], [-f] -> positive character class/group
          • [^\d], [^^] -> negative character class/group ([x^] is not negated: a caret that is not the first character is literal)
      • POSIX character classes, [:class:], valid only inside a bracketed class
          • [[:alpha:]], [01[:alpha:]%], [[:alnum:]]
      • Read more: http://perldoc.perl.org/perlrecharclass.html
  • 12. Shorthand Character Classes
      • \d for [0-9]
      • \D for [^\d]
      • \s for [\t\r\f\n ]
      • \S for [^\s]
      • \w for [A-Za-z0-9_]
      • \W for [^\w]
      Zero-width Assertions
      • \b Match a word boundary
      • \B Match except at a word boundary
      • \A Match only at beginning of string (think of ^)
      • \Z Match only at end of string, or before newline at the end (think of $)
      • \z Match only at end of string
  • 13. Quantifiers: Quantifies the pattern
      • . -> matches any character except newline (a metacharacter rather than a quantifier, but usually paired with one)
      • * -> matches 0 or more
      • ? -> matches 0 or 1
      • + -> matches 1 or more
      • {n} -> matches exactly n times
      • {min,max} -> matches min to max times
      • {,max} -> matches up to max times (accepted by Perl 5.34 and later; write {0,max} on older versions)
      • {min,} -> matches min or more times, with no upper limit
  • 15. Minimal Modifiers
      Modifiers change the behavior of the regular expression operators. They appear at the end of the match, substitution, and qr// operators.
      • i -> case insensitive pattern matching
      • g -> globally match the pattern repeatedly in the string
      • m -> treat string as multiple lines (^ and $ match at every line)
      • s -> treat string as single line (. also matches newline)
      • x -> permit whitespace and comments in the pattern
      • e -> evaluate the right-hand side of a substitution as an expression
      Want to know more: http://perldoc.perl.org/perlre.html#Modifiers
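A few of these modifiers are easier to see in action. The sketch below is illustrative only: the sub names, the date pattern, and the sample strings are assumptions made for this example and do not come from the slides.

```perl
use strict;
use warnings;

# /x : whitespace and comments inside the pattern are ignored.
my $date_re = qr/
    (\d{4})   # year
    -
    (\d{2})   # month
    -
    (\d{2})   # day
/x;

# In list context a match returns its captures (empty list on failure).
sub parse_date {
    my ($text) = @_;
    return $text =~ $date_re;
}

# /g : in list context, return every match instead of only the first.
sub all_numbers {
    my ($text) = @_;
    my @numbers = $text =~ /(\d+)/g;
    return @numbers;
}

# /e : the replacement is evaluated as Perl code; here each number is doubled.
sub double_numbers {
    my ($text) = @_;
    $text =~ s/(\d+)/$1 * 2/ge;
    return $text;
}

# /s : the dot also matches newline, so the pattern can span lines.
sub spans_lines {
    my ($text) = @_;
    return $text =~ /start.*end/s ? 1 : 0;
}

print join('/', parse_date('released on 2024-05-17')), "\n";
print double_numbers('3 apples and 10 pears'), "\n";
```

Modifiers can be combined freely, as in /gix, and their order does not matter.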
  • 17. Trim extra spaces from a string
      Requirements:
      • Trim if any space(s) found before the string
      • Trim if any space(s) found after the string
      • Ignore space(s) that exist in between the string
      • " c and cpp " or " c and cpp" or "c and cpp " should be saved as "c and cpp"
      $word =~ s/^\s+//; $word =~ s/\s+$//;
      Or
      $word =~ s/^\s+|\s+$//g;
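Wrapped in a sub, the one-line version of the trim looks roughly like this. The sub name trim is an assumption for illustration; the substitution itself is the one shown on the slide.

```perl
use strict;
use warnings;

# Remove leading and trailing whitespace; inner whitespace is left alone.
# /g is required so that both alternatives (start and end) get applied.
sub trim {
    my ($word) = @_;
    $word =~ s/^\s+|\s+$//g;
    return $word;
}

print '[', trim('  c and cpp  '), "]\n";   # prints "[c and cpp]"
```

Without /g only the first alternative that matches would be replaced, so a string with spaces on both sides would keep its trailing spaces.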
  • 18. Only Yes or No
      Requirements: accept the input only if the user types yes or no in any of the forms below:
      • yes Yes YEs YES YeS yES yeS
      • No no NO nO
      • y Y
      • n N
      $answer =~ m/^(y(?:es)?|no?)$/i;
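One way to use this pattern is to normalise the input to 'yes' or 'no'. The following is a small sketch; the sub name parse_answer and the choice to return undef for anything else are assumptions, not part of the slide.

```perl
use strict;
use warnings;

# Return 'yes' or 'no' for an accepted answer, undef otherwise.
sub parse_answer {
    my ($answer) = @_;
    return unless $answer =~ m/^(y(?:es)?|no?)$/i;
    # $1 holds the matched text; its first letter decides the result.
    return lc(substr($1, 0, 1)) eq 'y' ? 'yes' : 'no';
}

for my $input (qw(YeS n maybe)) {
    my $result = parse_answer($input);
    print "$input -> ", (defined $result ? $result : 'rejected'), "\n";
}
```

The (?:es)? group is non-capturing, so only the outer parentheses fill $1.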
  • 19. Simulate wc command
      Requirements: we need to count
      • Number of words
      • Number of lines
      • Number of characters, with or without spaces
      open(my $in, '<', $filename) or die "Can't open $filename: $!\n";
      foreach my $line (<$in>) {
          $chars_with_space += length($line);
          $space_count++ while $line =~ /\s+/g;   # counts runs of whitespace
          $words++ while $line =~ /\S+/g;
          $lines += $line =~ /\n$/;
      }
      close($in);
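The counting loop can be exercised without a file by reading from an in-memory string. This is a sketch under assumptions: the sub name wc_counts and the hash keys are invented for the example, and the non-space character count is done with a separate /\S/g match rather than by subtracting whitespace runs.

```perl
use strict;
use warnings;

# Count lines, words and characters in a string, roughly like wc.
sub wc_counts {
    my ($text) = @_;
    my %count = (lines => 0, words => 0, chars => 0, chars_no_space => 0);

    # Open a read handle on the string itself (in-memory file).
    open(my $in, '<', \$text) or die "Can't read in-memory text: $!\n";
    while (my $line = <$in>) {
        $count{chars} += length $line;
        $count{lines}++ if $line =~ /\n$/;
        $count{words}++ while $line =~ /\S+/g;
        # List assignment in scalar context yields the number of matches.
        my $non_space = () = $line =~ /\S/g;
        $count{chars_no_space} += $non_space;
    }
    close $in;
    return %count;
}

my %c = wc_counts("hello world\nfoo\n");
print "$c{lines} lines, $c{words} words, $c{chars} chars\n";
# prints "2 lines, 3 words, 16 chars"
```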
  • 20. Nth Match in a string
      Requirements:
      • String is "1212211322312223459812222098abcXYZ"
      • Look for the pattern 12+, i.e. it matches 12, 122, 1222 and so on
      • Find the 4th occurrence of the pattern
      my $n = 4; # nth occurrence
      my ($match) = $_ =~ /(?:.*?(12+)){$n}/;
      print "#$n match is $match\n";
      Or
      my @matches = ($_ =~ /(12+)/g);
      print "#$n match is $matches[$n - 1]\n";
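Both approaches can be compared side by side on the slide's sample string. A minimal sketch follows; the sub names nth_match_list and nth_match_regex are assumptions for illustration.

```perl
use strict;
use warnings;

# Collect every match with /g, then index into the list.
sub nth_match_list {
    my ($text, $n) = @_;
    return if $n < 1;
    my @matches = $text =~ /(12+)/g;
    return $matches[$n - 1];
}

# Repeat "skip lazily, then capture" $n times in a single regex.
# The capture group keeps the value from its last repetition.
sub nth_match_regex {
    my ($text, $n) = @_;
    return if $n < 1;
    my ($match) = $text =~ /(?:.*?(12+)){$n}/;
    return $match;
}

my $string = '1212211322312223459812222098abcXYZ';
print "#4 match is ", nth_match_list($string, 4), "\n";    # prints "#4 match is 12222"
print "#4 match is ", nth_match_regex($string, 4), "\n";   # prints "#4 match is 12222"
```

The list version is easier to read; the single-regex version avoids building the whole list when only one occurrence is needed.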
  • 21. Email Validation
      Requirements: an email address could be anything, but it should have some boundaries:
      • It will have @ and .
      • The username can be a mixture of letters, digits and _ (usually) of any length, but it should start with a letter, i.e. a-z or A-Z (the pattern below requires one letter followed by 8 to 15 more word characters)
      • The domain name should be between 2 and 63 characters
      • After the last . there should not be an @, and that part should be 2-4 letters only
      $pattern = qr/^([a-zA-Z][\w_]{8,15})\@([a-zA-Z0-9.-]+)\.([a-zA-Z]{2,4})$/;
      * We can write another email pattern for newly available domains like .website, .shikha etc.
  • 22. IP Address Validation
      Requirements:
      • Match an IPv4 address
      • It should be between 0.0.0.0 and 255.255.255.255
      Simplest one (checks only the shape, so it also accepts 999.999.999.999):
      $ip =~ /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/;
      Advanced one:
      $ip =~ /^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$/
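The two patterns suggest two ways to validate: encode the 0-255 range in the regex, or capture the octets and check the range in code. The sketch below shows both; the sub names are assumptions, and \A and \z are used instead of ^ and $ so that a trailing newline is not accepted.

```perl
use strict;
use warnings;

# One octet, 0 to 255, as in the slide's advanced pattern.
my $octet   = qr/(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)/;
my $ipv4_re = qr/\A(?:$octet\.){3}$octet\z/;

# Range encoded in the regex itself.
sub is_ipv4 {
    my ($ip) = @_;
    return $ip =~ $ipv4_re ? 1 : 0;
}

# Simple shape check first, numeric range check afterwards.
sub is_ipv4_by_range {
    my ($ip) = @_;
    my @octets = $ip =~ /\A(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})\z/
        or return 0;
    my $too_big = grep { $_ > 255 } @octets;
    return $too_big ? 0 : 1;
}

for my $ip ('192.168.1.1', '256.1.1.1') {
    print "$ip: ", (is_ipv4($ip) ? 'valid' : 'invalid'), "\n";
}
```

Building the pattern from a named $octet piece with qr// keeps the long alternation in one place instead of repeating it.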
  • 23. Find WordPress Version
      Requirements:
      • Find a site which runs on WordPress
      • Try to figure out the WP version
          • Either by response header: check for the X-Meta-Generator keyword
          • Or by the readme.html page
          • Or by the wp-login.php page
      ($version) = $content =~ m/X-Meta-Generator:\s+(.*)/img;
      ($version) = $content =~ /version\s+(\d+\.\d+(?:\.\d+)?)/img;
      ($version) = $content =~ m#wp-(?:admin|content|includes)/(?!plugins|js).*?ver=(\d+\.\d+(?:\.\d+)?(?:[-\w.]+)?)#img;
  • 24. Extract css/js files used in a site
      Requirements:
      • Request the url and get the content
      • Parse through each js and css link
      • Based on the js and css files, guess the technology used
      Regex to find all possible CSS links:
      my (@css) = $content =~ m#<link\s+(?:.*?)href=['"]([^'"]*)#mgi;
      foreach (@css) {
          ($css) = $_ =~ m#(?:.*)/(.*)#g;
          ($css) = $_ =~ m#([^/]*\.css)#gi;
          ($css) = $css =~ m#(.*)\.css# if ($css);
      }
      Regex to find all possible JavaScript links:
      my (@js) = $content =~ m#<script\s+(?:.*?)src=['"]([^'"]*)#mig;
  • 25. Common credit card vendor RegEx
      • Amex Card: ^3[47][0-9]{13}$
      • Diners Club Card: ^3(?:0[0-5]|[68][0-9])[0-9]{11}$
      • Discover Card: ^6(?:011|5[0-9]{2})[0-9]{12}$
      • JCB Card: ^(?:2131|1800|35\d{3})\d{11}$
      • Maestro Card: ^(5018|5020|5038|6304|6759|6761|6763)[0-9]{8,15}$
      • Master Card: ^5[1-5][0-9]{14}$
      • Visa Card: ^4[0-9]{12}(?:[0-9]{3})?$
      Snort rule to detect a Visa card number:
      alert tcp any any <> any any (pcre:"/4\d{3}(\s|-)?\d{4}(\s|-)?\d{4}(\s|-)?\d{4}/"; msg:"VISA card number detected in clear text"; content:"visa"; nocase; sid:9000000; rev:1;)
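These vendor patterns can be put in a lookup table and tried in order. The following is only a sketch: the sub name card_vendor, the table layout, and the step that strips spaces and hyphens are assumptions, and the sample numbers are widely published test numbers, not real cards. Matching a pattern says nothing about whether a number is valid; a real check would also verify the checksum.

```perl
use strict;
use warnings;

# Vendor name paired with the corresponding pattern from the slide.
my @card_patterns = (
    [ 'Amex'        => qr/^3[47][0-9]{13}$/ ],
    [ 'Diners Club' => qr/^3(?:0[0-5]|[68][0-9])[0-9]{11}$/ ],
    [ 'Discover'    => qr/^6(?:011|5[0-9]{2})[0-9]{12}$/ ],
    [ 'JCB'         => qr/^(?:2131|1800|35\d{3})\d{11}$/ ],
    [ 'Maestro'     => qr/^(?:5018|5020|5038|6304|6759|6761|6763)[0-9]{8,15}$/ ],
    [ 'MasterCard'  => qr/^5[1-5][0-9]{14}$/ ],
    [ 'Visa'        => qr/^4[0-9]{12}(?:[0-9]{3})?$/ ],
);

# Return the first vendor whose pattern matches, or undef if none does.
sub card_vendor {
    my ($number) = @_;
    (my $digits = $number) =~ s/[\s-]//g;   # drop spaces and hyphens
    for my $entry (@card_patterns) {
        my ($vendor, $pattern) = @$entry;
        return $vendor if $digits =~ $pattern;
    }
    return;
}

print card_vendor('4111 1111 1111 1111'), "\n";   # prints "Visa"
```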
  • 26. Preventing XSS
      • Just an example; many more such regexes can be added for better prevention
      • Regex for a simple XSS attack:
        s/((%3C)|<)((%2F)|\/)*[a-z0-9%]+((%3E)|>)//gix;
      • Blind regex for XSS attacks:
        s/((%3C)|<)[^<%3C]+((%3E)|>?)//sig;
      • Regex check for an "<img src" XSS attack:
        s/((%3C)|<)((%69)|i|(%49))((%6D)|m|(%4D))((%67)|g|(%47))[^\n]+((%3E)|>)//gi;
  • 27. Find Duplicate words in a file
      Requirements:
      • Output lines that contain duplicate words
      • Find doubled words that span lines
      • Ignore capitalization differences
      • Ignore HTML tags
      $/ = ".\n";
      while (<>) {
          next if !s/\b([a-z]+)((?:\s|<[^>]+>)+)(\1\b)/\e[7m$1\e[m$2\e[7m$3\e[m/ig;
          s/^(?:[^\e]*\n)+//mg; # Remove any unmarked lines.
          s/^/$ARGV: /mg;       # Ensure lines begin with filename.
          print;
      }
  • 28. More Regex
      • # Famous one-line Perl regex for prime numbers by Abigail
        perl -le 'print "Prime Number" if (1 x shift) !~ /^1?$|^(11+?)\1+$/' [number]
      • # Decimal number
        /^[+-]?\d+\.?\d*$/
      • # Extract HTTP User-Agent
        /^User-Agent: (.+)$/
      • # Replace <b> with <strong>
        $html =~ s|<(/?)b>|<$1strong>|g
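The prime-number one-liner works on a unary string: the number n becomes n ones, and the regex matches when that string is empty, a single 1, or some group of two or more ones repeated at least twice (that is, when n is 0, 1, or composite). Below is a small sketch of that idea alongside the decimal and tag patterns; the sub names are assumptions for illustration.

```perl
use strict;
use warnings;

# Unary primality test: n is prime when the regex does NOT match.
sub is_prime {
    my ($n) = @_;
    return 0 if $n < 2;
    my $unary = '1' x $n;
    return $unary !~ /^1?$|^(11+?)\1+$/ ? 1 : 0;
}

# Optional sign, digits, optional dot, optional fraction digits.
sub is_decimal {
    my ($text) = @_;
    return $text =~ /^[+-]?\d+\.?\d*$/ ? 1 : 0;
}

# Turn <b> and </b> into <strong> and </strong>.
sub bold_to_strong {
    my ($html) = @_;
    $html =~ s|<(/?)b>|<$1strong>|g;
    return $html;
}

print join(' ', grep { is_prime($_) } 1 .. 20), "\n";
# prints "2 3 5 7 11 13 17 19"
print bold_to_strong('<b>hi</b>'), "\n";   # prints "<strong>hi</strong>"
```

The prime test is a curiosity rather than a practical method: the backtracking cost grows quickly with n.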
  • 29. Regex Tools
      • My favorite: http://regexr.com/
      • Expresso for Windows: http://www.ultrapico.com/Expresso.htm
      • Firefox add-on: https://addons.mozilla.org/en-US/firefox/addon/rext/
      • Ruby based regex tool: http://www.rubular.com/
      • Python based regex tool: http://www.pythonregex.com/
      • JavaScript based regex tool: http://regexpal.com/
  • 30. Final Note
      • Don't make simple things complex by using regex
      • Don't try to do everything in one regex; use several simpler ones instead
      • Test a regex in a regex tool before using it
      • Use post-processing whenever possible
      • Alternations? Think of grouping the regex
      • Capture only when it's necessary
      • Use /x for proper whitespace and comments, if possible
      • Use regex wisely when dealing with lazy or greedy quantifiers
      • Regex is not a natural language parser
      * The next move is to learn the backtracking and lookaround concepts in regex
  • 33. Books to read
      1. Compilers: Principles, Techniques, and Tools by Aho, Sethi, and Ullman
      2. Mastering Regular Expressions by Jeffrey Friedl
      3. Regular Expression Recipes by Nathan A. Good
      4. Mastering Python Regular Expressions (Packt)
      5. Learn Regex the Hard Way (online)
      6. Wiki Book on Regular Expressions
  • 34. Resources/References
      1. http://perldoc.perl.org/perlre.html
      2. http://www.regular-expressions.info/
      3. https://www.owasp.org/index.php/OWASP_Validation_Regex_Repository
      4. http://rick.measham.id.au/paste/explain.pl
      5. http://www.catonmat.net/series/perl-one-liners-explained
      6. YouTube Video Series
      7. Lecture on Regex Video
  • 35. Sharing is Caring!
      1. FB: https://www.facebook.com/jassics
      2. Twitter: https://www.twitter.com/aliencoders
      3. G+: https://plus.google.com/+Aliencoders
      4. Website: http://www.aliencoders.org/
      5. Mail id: jassics [at] gmail [dot] com
      6. Slideshare: https://www.slideshare.net/jassics