Regular Expressions
Satyanarayana D <satyavvd@yahoo-inc.com>
What are Regular Expressions?
Why do we need them?
History: Stephen Kleene, a mathematician, introduced 'regular sets'.
History: Ken Thompson, 1968: Regular Expression Search Algorithm. QED -> ed -> g/re/p
History: Henry Spencer, 1986: wrote a regex library in C.
Regex Flavors
Grammar of Regex
RE = one or more non-empty 'branches' separated by '|'
Branch = one or more 'pieces'
Piece = atom, optionally followed by a quantifier
Quantifier = '*', '+', '?' or a 'bound'
Bound = atom{n}, atom{n,}, atom{m,n}
Atom = (RE) or () or '^', '$', '.' or '\' followed by one of ^.[$()|*+?{\ or any-char or a 'bracket expression'
Bracket expression = a list of characters enclosed in '[ ]'
Meta Chars?
In a normal expression like 2 + 4, the '+' has a special meaning.
Meta Chars
\    Quote the next metacharacter
^    Match the beginning of the line
.    Match any character (except newline) = [^\n]
$    Match the end of the line (or before newline at the end)
|    Alternation
( )  Grouping
[ ]  Character class
{ }  Match m to n times
*    Match 0 or more times
+    Match 1 or more times
?    Match 1 or 0 times
Non-printable Chars
\t         tab (HT, TAB)
\n         newline (LF, NL)
\r         return (CR)
\f         form feed (FF)
\a         alarm (bell) (BEL)
\e         escape (think troff) (ESC)
\033       octal char (example: ESC)
\x1B       hex char (example: ESC)
\x{263a}   long hex char (example: Unicode SMILEY)
\cK        control char (example: VT)
\N{name}   named Unicode character
Character Classes – [ ]
[0-9]          Matches any one of 0,1,2,3,4,5,6,7,8,9.
[aeiou]        Matches one English vowel char.
[^aeiou]       Matches any non-vowel char.
[a-z-]         Matches a to z and '-'.
[a-z0-9]       Union: matches a to z and 0 to 9.
[a-z&&[m-z]]   Intersection: matches m to z (flavor-specific).
[a-z-[m-z]]    Subtraction: matches a to l (flavor-specific).
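The bracket expressions above can be tried out quickly. The following is a small sketch in Python's `re` module (chosen here only for illustration; the deck itself is flavor-neutral), and the sample strings are assumptions. The intersection and subtraction forms are left out because the standard `re` module does not support them.

```python
import re

text = "regex 2010"  # hypothetical sample input

# [0-9] matches exactly one digit per match
digits = re.findall(r"[0-9]", text)

# [aeiou] matches one vowel, [^aeiou] one character that is not a vowel
vowels = re.findall(r"[aeiou]", "regex")
non_vowels = re.findall(r"[^aeiou]", "regex")

# A '-' placed last in the class is a literal hyphen, not a range operator
with_hyphen = re.findall(r"[a-z-]", "a-B")

# Ranges listed side by side form a union
union = re.findall(r"[a-z0-9]", "a1_B")
```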
POSIX Character Classes – [: … :]
[^[:digit:]] = \D = [^0-9]
Shorthand Chars
\w   word character         [A-Za-z0-9_]
\d   decimal digit          [0-9]
\s   whitespace             [ \t\r\n\f]
\W   not a word character   [^A-Za-z0-9_]
\D   not a decimal digit    [^0-9]
\S   not whitespace         [^ \t\r\n\f]
Anchors/Assertions
^        Match the beginning of the line
$        Match the end of the line (or before newline at the end)
\A       Matches only at the very beginning
\z       Matches only at the very end
\Z       Matches like $ used in single-line mode
\b       Matches when the current position is a word boundary
\<, \>   Match when the current position is a word boundary
\B       Matches when the current position is not a word boundary
^Anchors
^  Match the beginning of the line.
An anchor matches a certain position in the subject string and does not consume any characters.
/^a/  String begins with 'a'.
Anchors$
$  Match the end of the line (or before newline at the end).
An anchor matches a certain position in the subject string and does not consume any characters.
/s$/  String ends with 's'.
\A Anchors
\A  Matches only at the very beginning.
An anchor matches a certain position in the subject string and does not consume any characters.
^ vs \A
\Z, \z Anchors
\z  Matches only at the very end.
\Z  Matches like $ used in single-line mode.
An anchor matches a certain position in the subject string and does not consume any characters.
$ vs \Z, \z
\b, \B Anchors
\b = \<|\> = Matches a word boundary.
\B  Matches when the current position is not a word boundary.
/\b2\b/ and /\B2\B/ against: $ xl2twiki file 2 > /dev/null
/\b2\b/ matches the standalone 2; /\B2\B/ matches the 2 inside xl2twiki.
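To make the zero-width nature of anchors concrete, here is a hedged sketch in Python. It assumes the slide's garbled patterns were `\b2\b` and `\B2\B` and that the sample command line read `xl2twiki file 2 > /dev/null`; the two-line string used for `^` versus `\A` is a made-up example. Note that Python spells Perl's `\z` as `\Z`.

```python
import re

line = "xl2twiki file 2 > /dev/null"  # assumed reading of the slide's sample

# \b2\b: a '2' with a word boundary on both sides (the standalone 2)
standalone = [m.start() for m in re.finditer(r"\b2\b", line)]

# \B2\B: a '2' with word characters on both sides (the 2 inside xl2twiki)
embedded = [m.start() for m in re.finditer(r"\B2\B", line)]

# Anchors consume nothing: the match for ^a is just the 'a' itself
starts_with_a = re.search(r"^a", "apple").group()
ends_with_s = re.search(r"s$", "regexes").group()

# In multiline mode ^ matches at every line start, \A only at the very start
two_lines = "one\ntwo"
caret_hits = re.findall(r"^\w+", two_lines, re.MULTILINE)
abs_start_hits = re.findall(r"\A\w+", two_lines, re.MULTILINE)
```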
Quantifiers
Quantifiers (repetition):
{m,n}        Matches a minimum of m and a maximum of n occurrences.
*  = {0,}    Matches zero or more occurrences (any amount).
+  = {1,}    Matches one or more occurrences.
?  = {0,1}   Matches zero or one occurrence (i.e. optional).
Quantifiers
Quantifiers are greedy: they match as much as possible.
/\d{2,4}/   2010  (matches 2010)
/<.+>/      My first <strong> regex </strong> test.  (matches <strong> regex </strong>)
/\w+sion/   Expression
If the entire match fails because the quantifiers consumed too much, they are forced to give up as much as needed to make the rest of the regex succeed.
Non Greedy Quantifiers
{m,n}?   *?   +?   ??
To make a quantifier non-greedy, append '?'.
/<.+?>/     My first <strong> regex </strong> test.  (matches <strong>)
Alternatively, use negated classes:
/<[^>]+>/   My first <strong> regex </strong> test.  (matches <strong>)
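The difference between greedy, lazy and negated-class matching can be checked directly. A minimal sketch in Python, reusing the sample sentence from the slides:

```python
import re

html = "My first <strong> regex </strong> test."

# Greedy: .+ runs to the end of the line, then backs off to the LAST '>'
greedy = re.search(r"<.+>", html).group()

# Lazy: .+? extends one character at a time and stops at the FIRST '>'
lazy = re.search(r"<.+?>", html).group()

# Negated class: [^>]+ can never run past a '>', so no backtracking is needed
negated = re.search(r"<[^>]+>", html).group()
```

The lazy and negated-class versions return the same text here; the negated class is usually the cheaper of the two because the engine does not have to retry the rest of the pattern after every character.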
Grouping – ( )
\d{2}-\d{2}-\d{2}(\d{2})?   Will match 01-01-10 and also 01-01-2010.
Alternation - |
/(get|set)Value/   Matches either getValue or setValue.
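Grouping matters for alternation because `|` has the lowest precedence. The sketch below (Python, with the ungrouped pattern added as an assumption for contrast) shows what the parentheses buy:

```python
import re

# Grouped: the alternation is limited to 'get' or 'set', then 'Value' follows
grouped = re.compile(r"(get|set)Value")

# Ungrouped (hypothetical contrast): means 'get' OR 'setValue'
ungrouped = re.compile(r"get|setValue")

grouped_get = grouped.fullmatch("getValue") is not None
grouped_set = grouped.fullmatch("setValue") is not None
ungrouped_get = ungrouped.fullmatch("getValue") is not None
ungrouped_set = ungrouped.fullmatch("setValue") is not None
```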
Capturing – ( )
/((\d{2})-(\d{2})-(\d{2}(\d{2})?))/
Today is '18-08-2010'.
\1 -> date -> 18-08-2010
\2 -> day -> 18
\3 -> month -> 08
\4 -> year -> 2010
\5 -> year -> last two digits -> 10
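Capture groups are numbered by the position of their opening parenthesis, which is why the nested year group comes last. A short Python sketch of the date example above (the second input, `01-01-10`, is taken from the grouping slide):

```python
import re

DATE = re.compile(r"((\d{2})-(\d{2})-(\d{2}(\d{2})?))")

# Groups are numbered left to right by opening parenthesis
full = DATE.search("Today is '18-08-2010'.").groups()

# With a two-digit year the optional inner group does not participate
short = DATE.search("01-01-10").groups()
```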
Non-Capturing sub patterns – (?: )
\d{2}-\d{2}-\d{2}(?:\d{2})?   Will match 01-01-10 and also 01-01-2010.
Named Capture – (?<> )
(?P<name>pattern)                     -> Python style, Perl 5.12
(?P=name)                             -> Back reference
(?<name>pattern) or (?'name'pattern)  -> Perl 5.10
\k<name> or \k'name' or \k{name}      -> Back reference
\g{name}, \g{-1}, \g{-2}              -> Relative back reference
(?<vowel>[ai]).\k<vowel>.   abracadabra!!
/(\w+)\s+\g{-1}/   "Thus joyful Troy Troy maintained the the watch of night..."
$date = "18-08-2010";
$date =~ s/(?<day>\d{2})-(?<month>\d{2})-(?<year>\d{4})/$+{year}-$+{month}-$+{day}/;
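For comparison with the Perl substitution, here is a hedged Python sketch of the same two ideas: reordering a date through named groups and finding doubled words with a named back reference. Python uses the `(?P<name>...)` and `(?P=name)` spellings, and `\g<name>` in the replacement string; the `\b` anchors in the doubled-word pattern are an addition not present on the slide.

```python
import re

date = "18-08-2010"

# Named groups make the replacement readable: day-month-year -> year-month-day
iso = re.sub(
    r"(?P<day>\d{2})-(?P<month>\d{2})-(?P<year>\d{4})",
    r"\g<year>-\g<month>-\g<day>",
    date,
)

line = "Thus joyful Troy Troy maintained the the watch of night..."

# (?P=word) must match exactly the text captured by the group named 'word'
doubled = re.findall(r"\b(?P<word>\w+)\s+(?P=word)\b", line)
```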
Before Evaluating Regex
Matching a float number
Basic principle: split your task into sub-tasks.
Float number = integerpart.fractionalpart
Integer part    = \d+  -> will match one or more digits
Literal dot     = \.
Fractional part = \d+  -> will match one or more digits
Combine all of them = \d+\.\d+
Matching a float number
/\d+\.\d+/  -> is the basic form. It won't match -123.45 or +123.45.
/[+-]?\d+\.\d+/  -> will match those. But it won't match - 123.45 or + 123.45 (space after the sign).
/[+-]? *\d+\.\d+/  -> will match those. But it won't match 123. or .45
/[+-]? *(?:\d+\.\d+|\d+\.|\.\d+)/  -> will match those. But it won't match 10e2 or 101E5.
/[+-]? *(?:\d+\.\d+|\d+\.|\.\d+)(?:[eE]\d+)?/  -> adds an optional exponent. But it won't match 10e-2.
/^[+-]? *(?:\d+\.\d+|\d+\.|\.\d+)(?:[eE][+-]?\d+)?$/  -> adds a signed exponent and anchors the whole string.
Matching 10e2 or 101E5 also needs a plain-integer alternative (\d+) for the mantissa, which the final pattern includes.
Match a float number
/^
  [+-]?\ *            # first, match an optional sign
  (?:                 # then match integers or f.p. mantissas:
      \d+\.\d+        # mantissa of the form a.b
     |\d+\.           # mantissa of the form a.
     |\.\d+           # mantissa of the form .b
     |\d+             # integer of the form a
  )
  (?:[eE][+-]?\d+)?   # finally, optionally match an exponent
$/x;
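The step-by-step float pattern can be exercised against the inputs the slides mention. This is a sketch in Python's verbose mode, assuming the garbled tokens were `\d+` and `\.`; the name `FLOAT` and the helper `is_float` are hypothetical.

```python
import re

# Verbose mode ignores unescaped whitespace, so the literal space is written '\ '
FLOAT = re.compile(
    r"""
    ^[+-]?\ *             # optional sign, then optional spaces
    (?:                   # mantissa:
        \d+\.\d+          #   a.b
      | \d+\.             #   a.
      | \.\d+             #   .b
      | \d+               #   plain integer a
    )
    (?:[eE][+-]?\d+)?     # optional exponent with optional sign
    $
    """,
    re.VERBOSE,
)


def is_float(text: str) -> bool:
    """Return True when the whole string looks like a float literal."""
    return FLOAT.search(text) is not None


accepted = ["123.45", "-123.45", "+ 123.45", "123.", ".45", "10e2", "101E5", "10e-2"]
rejected = ["", ".", "abc", "1e", "1.2.3"]
```

Ordering the alternatives from most to least specific mirrors the slide; with the `^` and `$` anchors in place the order no longer changes the result, only how much backtracking happens.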
Atomic Grouping – (?> )
Pattern \d+99 against the string 19999:
\d+     1 9999    -> add 1 to match -> 1
\d+     19 999    -> add 9 to match -> 19
\d+     199 99    -> add 9 to match -> 199
\d+     1999 9    -> add 9 to match -> 1999
\d+     19999     -> add 9 to match -> 19999
\d+     19999     -> still need to match 99
\d+99   1999 9    -> give up a 9
\d+99   199 99    -> give up one more 9
\d+99   19999     -> success
Atomic Grouping – (?> )
Pattern \d+xx against the string 199Rs:
\d+     1 99Rs    -> add 1 to match -> 1
\d+     19 9Rs    -> add 9 to match -> 19
\d+     199 Rs    -> add 9 to match -> 199
\d+x    199 Rs    -> x not matched with R
\d+x    19 9 Rs   -> give up a 9, still cannot match x
\d+x    1 99 Rs   -> give up another 9, still cannot match x
\d+x    1 99 Rs   -> cannot give up the 1, because + needs at least one digit
\d+xx   199Rs     -> failure
Atomic Grouping – (?> )
Atomic Grouping:
Possessive Quantifiers:
Look Around
            Ahead      Behind
Positive    (?=...)    (?<=...)
Negative    (?!...)    (?<!...)
(?=...)    Zero-width positive lookahead assertion
(?!...)    Zero-width negative lookahead assertion
(?<=...)   Zero-width positive lookbehind assertion
(?<!...)   Zero-width negative lookbehind assertion
Note: Assertions can be nested. Example: /(?<=, (?! (?<=\d,)(?=\d) ) )/x
Look Around
"I catch the housecat 'Tom-cat' with catnip"
Note: look-behind expressions cannot be of variable length. This means you cannot use quantifiers (?, *, +, or {1,5}) or alternation of different-length items inside them.
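The slide's own patterns for this sentence did not survive extraction, so the three patterns below are assumptions chosen to illustrate look-ahead and look-behind on it. The sketch is in Python, whose standard `re` module also enforces the fixed-width look-behind rule from the note.

```python
import re

x = "I catch the housecat 'Tom-cat' with catnip"

# Look-ahead: 'cat' only where whitespace follows; the space is not consumed
ahead = re.search(r"cat(?=\s)", x)
housecat = x[ahead.start() - 5:ahead.end()]

# Look-behind: words starting with 'cat' that are preceded by whitespace
catwords = re.findall(r"(?<=\s)cat\w+", x)

# Both at once: an isolated 'cat' surrounded by whitespace does not occur
isolated = re.search(r"(?<=\s)cat(?=\s)", x)

# A quantifier inside a look-behind makes its width variable and is rejected
try:
    re.compile(r"(?<=\s+)cat")
    lookbehind_rejected = False
except re.error:
    lookbehind_rejected = True
```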
Conditional expressions
Match an optionally quoted string -> /^("|')?[^"']*(?(1)\1)$/
Matches 'blah blah'
Matches "blah blah"
Matches blah blah
Won't match 'blah blah"
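The quoted-string conditional can be tested as written, assuming the stripped token inside `(?(1)...)` was the back reference `\1`. Python's `re` module supports the same `(?(1)yes)` syntax; the name `QUOTED` and the helper `ok` are hypothetical.

```python
import re

# If group 1 (the opening quote) matched, the same quote must close the string
QUOTED = re.compile(r"""^("|')?[^"']*(?(1)\1)$""")


def ok(text: str) -> bool:
    """Return True when text is unquoted or wrapped in one matching pair of quotes."""
    return QUOTED.search(text) is not None


single = ok("'blah blah'")
double = ok('"blah blah"')
bare = ok("blah blah")
mismatched = ok("'blah blah\"")
```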
Conditional expression
/(.)\1(?(?<=AA)G|C)$/
ATGAAG   TAGBBC   GATGGC
/usr/share/dict/words -> /^(.+)(.+)?(?(2)\2\1|\1)$/
aa   baba   beriberi   maam   vetitive
Recursive Patterns – (?)
Code Evaluation – (?{ })
$x = "aaaa";
$x =~ /(a(?{print "Yow\n";}))*aa/;
produces
Yow
Yow
Yow
Yow
Pattern Code Expression – (??{ })
$length = 5;
$char = 'a';
$str = 'aaaaabb';
$str =~ /(??{$char x $length})/x;  # matches, there are 5 of 'a'
Inline modifiers & Comments
Matching can be modified inline by placing modifiers inside the pattern.
(?i)  enables case-insensitive mode
(?m)  enables multiline matching for ^ and $
(?s)  makes the dot metacharacter match newline as well
(?x)  ignores literal whitespace
(?U)  makes quantifiers ungreedy (lazy) by default
$answers =~ /(?i)y(?-i)(?:es)?/  -> Will match 'y', 'Y', 'yes', 'Yes' but not 'YES'.
Comments can be inserted inline using the (?#) construct.
/^(?#begin)\d+(?#match integer part)\.(?#match dot)\d+(?#match fractional part)$/
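Inline modifier support differs between flavors. As a hedged illustration in Python: the standard `re` module has no mid-pattern `(?-i)` switch, so the sketch below uses the scoped form `(?i:...)` to get the same effect as the slide's example, and it keeps the `(?#...)` comments, which Python does support.

```python
import re

# Only the 'y' is case-insensitive; the optional 'es' must be lowercase
ANSWER = re.compile(r"(?i:y)(?:es)?")

matches = [s for s in ["y", "Y", "yes", "Yes", "YES"] if ANSWER.fullmatch(s)]

# (?#...) comments are skipped by the engine and do not affect matching
COMMENTED = re.compile(
    r"^(?#begin)\d+(?#match integer part)\.(?#match dot)\d+(?#match fractional part)$"
)

commented_ok = COMMENTED.search("123.45") is not None
commented_bad = COMMENTED.search("123") is not None
```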
Regex Testers Tools
Editors: Vim, TextMate, Edit Pad Pro, NoteTab, UltraEdit
RegexBuddy
Reggy - http://reggyapp.com
http://rubular.com (Ruby)
RegexPal (JavaScript) - http://www.regexpal.com
http://www.gskinner.com/RegExr/
http://www.spaweditor.com/scripts/regex/index.php
http://regex.larsolavtorvik.com/ (PHP, JavaScript)
http://www.nregex.com/ (.NET)
http://www.myregexp.com/ (Java)
http://osteele.com/tools/reanimator (NFA graphic representation)
Expresso - http://www.ultrapico.com/Expresso.htm (.NET)
Regulator - http://sourceforge.net/projects/regulator (.NET)
RegexRenamer - http://regexrenamer.sourceforge.net/ (.NET)
PowerGREP - http://www.powergrep.com/
Windows Grep - http://www.wingrep.com/
Regex Resources
$ perldoc perlre perlretut perlreref
$ man re_format
"Mastering Regular Expressions" by Jeffrey Friedl - http://oreilly.com/catalog/9780596528126/
"Regular Expressions Cookbook" by Jan Goyvaerts & Steven Levithan - http://oreilly.com/catalog/9780596520694
Questions?
Thank Y!ou
Java Regex
PHP Regex
JavaScript Regex
.NET Regex
Python Regex
Ruby Regex
Unicode Properties
Pattern Code Expression – (??{ })
Find incremental numbers:
$str = "abc 123 hai cde 34567 efg 1245 a132 123456789 10adf";
print "$1\n" while ($str =~ / ( (\d) (?{$x=$2}) ( (??{++$x%10}) )* ) /gx);
Commify a number
$no = 123456789;
substr($no, 0, length($no)-1) =~ s/(?=(?<=\d)(?:\d\d)+$)/,/g;
print $no;
Produces 12,34,56,789
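The same substitution can be sketched in Python, assuming the stripped tokens in the Perl pattern were `\d`. The function name `commify_indian` is hypothetical; the grouping it produces (last three digits, then pairs) is the one shown on the slide. Replacing zero-width matches this way relies on Python 3.7 or later.

```python
import re


def commify_indian(number: int) -> str:
    """Insert commas as on the slide: last three digits, then groups of two."""
    digits = str(number)
    head, last = digits[:-1], digits[-1]
    # Zero-width match: a position preceded by a digit and followed by an
    # even, non-zero number of digits up to the end of 'head'
    head = re.sub(r"(?=(?<=\d)(?:\d\d)+$)", ",", head)
    return head + last


result = commify_indian(123456789)
small = commify_indian(1234)
tiny = commify_indian(5)
```

Setting the final digit aside before substituting is what turns plain pairs into "three at the end, pairs before that", exactly as the `substr` call does in the Perl version.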
A non-capturing group inside a capturing group does not exclude its text: whatever it matches is still part of the outer capture.
perl -e '$x="cat cat cat"; $x=~/(cat(?:\s+))/; print ":$1:";'
prints ":cat :"

Mais conteúdo relacionado

Mais procurados

Regular expression
Regular expressionRegular expression
Regular expressionRajon
 
Regular expressions in Python
Regular expressions in PythonRegular expressions in Python
Regular expressions in PythonSujith Kumar
 
Regular expressions
Regular expressionsRegular expressions
Regular expressionsEran Zimbler
 
Regular Expression
Regular ExpressionRegular Expression
Regular Expressionvaluebound
 
Introduction to Regular Expressions
Introduction to Regular ExpressionsIntroduction to Regular Expressions
Introduction to Regular ExpressionsMatt Casto
 
Introduction to regular expressions
Introduction to regular expressionsIntroduction to regular expressions
Introduction to regular expressionsBen Brumfield
 
Regular Languages
Regular LanguagesRegular Languages
Regular Languagesparmeet834
 
Finaal application on regular expression
Finaal application on regular expressionFinaal application on regular expression
Finaal application on regular expressionGagan019
 
Regular expression
Regular expressionRegular expression
Regular expressionLarry Nung
 
Regular Expression
Regular ExpressionRegular Expression
Regular ExpressionLambert Lum
 
Top down and botttom up Parsing
Top down     and botttom up ParsingTop down     and botttom up Parsing
Top down and botttom up ParsingGerwin Ocsena
 
Regular Expressions in PHP
Regular Expressions in PHPRegular Expressions in PHP
Regular Expressions in PHPAndrew Kandels
 
Data Structures in Python
Data Structures in PythonData Structures in Python
Data Structures in PythonDevashish Kumar
 
Regular language and Regular expression
Regular language and Regular expressionRegular language and Regular expression
Regular language and Regular expressionAnimesh Chaturvedi
 
1.9. minimization of dfa
1.9. minimization of dfa1.9. minimization of dfa
1.9. minimization of dfaSampath Kumar S
 

Mais procurados (20)

Regular expression
Regular expressionRegular expression
Regular expression
 
Regular expressions
Regular expressionsRegular expressions
Regular expressions
 
Regular expressions in Python
Regular expressions in PythonRegular expressions in Python
Regular expressions in Python
 
Regular expressions
Regular expressionsRegular expressions
Regular expressions
 
Regular Expression
Regular ExpressionRegular Expression
Regular Expression
 
Introduction to Regular Expressions
Introduction to Regular ExpressionsIntroduction to Regular Expressions
Introduction to Regular Expressions
 
Introduction to regular expressions
Introduction to regular expressionsIntroduction to regular expressions
Introduction to regular expressions
 
Regular Languages
Regular LanguagesRegular Languages
Regular Languages
 
Adv. python regular expression by Rj
Adv. python regular expression by RjAdv. python regular expression by Rj
Adv. python regular expression by Rj
 
Finaal application on regular expression
Finaal application on regular expressionFinaal application on regular expression
Finaal application on regular expression
 
Regular expression
Regular expressionRegular expression
Regular expression
 
Regular Expression
Regular ExpressionRegular Expression
Regular Expression
 
Top down and botttom up Parsing
Top down     and botttom up ParsingTop down     and botttom up Parsing
Top down and botttom up Parsing
 
Regular Expressions in PHP
Regular Expressions in PHPRegular Expressions in PHP
Regular Expressions in PHP
 
Data Structures in Python
Data Structures in PythonData Structures in Python
Data Structures in Python
 
Regular language and Regular expression
Regular language and Regular expressionRegular language and Regular expression
Regular language and Regular expression
 
Python programming : Strings
Python programming : StringsPython programming : Strings
Python programming : Strings
 
Php string function
Php string function Php string function
Php string function
 
1.9. minimization of dfa
1.9. minimization of dfa1.9. minimization of dfa
1.9. minimization of dfa
 
MYSQL join
MYSQL joinMYSQL join
MYSQL join
 

Semelhante a Regular Expressions

Bioinformatica 06-10-2011-p2 introduction
Bioinformatica 06-10-2011-p2 introductionBioinformatica 06-10-2011-p2 introduction
Bioinformatica 06-10-2011-p2 introductionProf. Wim Van Criekinge
 
Eloquent Ruby chapter 4 - Find The Right String with Regular Expression
Eloquent Ruby chapter 4 - Find The Right String with Regular ExpressionEloquent Ruby chapter 4 - Find The Right String with Regular Expression
Eloquent Ruby chapter 4 - Find The Right String with Regular ExpressionKuyseng Chhoeun
 
Python (regular expression)
Python (regular expression)Python (regular expression)
Python (regular expression)Chirag Shetty
 
Regular expressions quick reference
Regular expressions quick referenceRegular expressions quick reference
Regular expressions quick referencejvinhit
 
3.2 javascript regex
3.2 javascript regex3.2 javascript regex
3.2 javascript regexJalpesh Vasa
 
Introduction to Regular Expressions RootsTech 2013
Introduction to Regular Expressions RootsTech 2013Introduction to Regular Expressions RootsTech 2013
Introduction to Regular Expressions RootsTech 2013Ben Brumfield
 
An Introduction to Regular expressions
An Introduction to Regular expressionsAn Introduction to Regular expressions
An Introduction to Regular expressionsYamagata Europe
 
Regular expressions in oracle
Regular expressions in oracleRegular expressions in oracle
Regular expressions in oracleLogan Palanisamy
 
Javascript正则表达式
Javascript正则表达式Javascript正则表达式
Javascript正则表达式ji guang
 
Regular expressions
Regular expressionsRegular expressions
Regular expressionsJames Gray
 
Regular Expression Cheat Sheet
Regular Expression Cheat SheetRegular Expression Cheat Sheet
Regular Expression Cheat SheetSydneyJohnson57
 
Looking for Patterns
Looking for PatternsLooking for Patterns
Looking for PatternsKeith Wright
 
Basta mastering regex power
Basta mastering regex powerBasta mastering regex power
Basta mastering regex powerMax Kleiner
 
Regex startup
Regex startupRegex startup
Regex startupPayPal
 
Php String And Regular Expressions
Php String  And Regular ExpressionsPhp String  And Regular Expressions
Php String And Regular Expressionsmussawir20
 

Semelhante a Regular Expressions (20)

Bioinformatica 06-10-2011-p2 introduction
Bioinformatica 06-10-2011-p2 introductionBioinformatica 06-10-2011-p2 introduction
Bioinformatica 06-10-2011-p2 introduction
 
Eloquent Ruby chapter 4 - Find The Right String with Regular Expression
Eloquent Ruby chapter 4 - Find The Right String with Regular ExpressionEloquent Ruby chapter 4 - Find The Right String with Regular Expression
Eloquent Ruby chapter 4 - Find The Right String with Regular Expression
 
Python (regular expression)
Python (regular expression)Python (regular expression)
Python (regular expression)
 
Regular expressions quick reference
Regular expressions quick referenceRegular expressions quick reference
Regular expressions quick reference
 
3.2 javascript regex
3.2 javascript regex3.2 javascript regex
3.2 javascript regex
 
Ruby RegEx
Ruby RegExRuby RegEx
Ruby RegEx
 
Introduction to Regular Expressions RootsTech 2013
Introduction to Regular Expressions RootsTech 2013Introduction to Regular Expressions RootsTech 2013
Introduction to Regular Expressions RootsTech 2013
 
Regular expressions
Regular expressionsRegular expressions
Regular expressions
 
An Introduction to Regular expressions
An Introduction to Regular expressionsAn Introduction to Regular expressions
An Introduction to Regular expressions
 
Regular expressions in oracle
Regular expressions in oracleRegular expressions in oracle
Regular expressions in oracle
 
Javascript正则表达式
Javascript正则表达式Javascript正则表达式
Javascript正则表达式
 
Regex Basics
Regex BasicsRegex Basics
Regex Basics
 
Regular expressions
Regular expressionsRegular expressions
Regular expressions
 
Regular Expression Cheat Sheet
Regular Expression Cheat SheetRegular Expression Cheat Sheet
Regular Expression Cheat Sheet
 
Les08
Les08Les08
Les08
 
Looking for Patterns
Looking for PatternsLooking for Patterns
Looking for Patterns
 
Basta mastering regex power
Basta mastering regex powerBasta mastering regex power
Basta mastering regex power
 
2013 - Andrei Zmievski: Clínica Regex
2013 - Andrei Zmievski: Clínica Regex2013 - Andrei Zmievski: Clínica Regex
2013 - Andrei Zmievski: Clínica Regex
 
Regex startup
Regex startupRegex startup
Regex startup
 
Php String And Regular Expressions
Php String  And Regular ExpressionsPhp String  And Regular Expressions
Php String And Regular Expressions
 

Último

Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfhans926745
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesBoston Institute of Analytics
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 

Último (20)

Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 

Regular Expressions

  • 1. Regular Expressions. Satyanarayana D <satyavvd@yahoo-inc.com>
  • 5. History: Stephen Kleene, a mathematician, discovered ‘regular sets’.
  • 6. History: Ken Thompson, 1968: Regular Expression Search Algorithm. QED -> ed -> g/re/p
  • 7. History: Henry Spencer, 1986: wrote a regex library in C.
  • 9. Grammar of Regex
       RE = one or more non-empty ‘branches’ separated by ‘|’
       Branch = one or more ‘pieces’
       Piece = atom, optionally followed by a quantifier
       Quantifier = ‘*’, ‘+’, ‘?’ or a ‘bound’
       Bound = atom{n}, atom{n,}, atom{m,n}
       Atom = (RE), or (), or one of ‘^’, ‘$’, ‘.’, or ‘\’ followed by one of ^.[$()|*+?{\, or any other single character, or a ‘bracket expression’
       Bracket expression = a list of characters enclosed in ‘[ ]’
  • 10. Meta Chars? In a normal expression like 2 + 4, the ‘+’ has a special meaning.
  • 11. Meta Chars
       \    Quote the next metacharacter
       ^    Match the beginning of the line
       .    Match any character (except newline), i.e. [^\n]
       $    Match the end of the line (or before newline at the end)
       |    Alternation
       ( )  Grouping
       [ ]  Character class
       { }  Match m to n times
       *    Match 0 or more times
       +    Match 1 or more times
       ?    Match 1 or 0 times
  • 12. Non-printable Chars
       \t         tab (HT, TAB)
       \n         newline (LF, NL)
       \r         return (CR)
       \f         form feed (FF)
       \a         alarm (bell) (BEL)
       \e         escape (think troff) (ESC)
       \033       octal char (example: ESC)
       \x1B       hex char (example: ESC)
       \x{263a}   long hex char (example: Unicode SMILEY)
       \cK        control char (example: VT)
       \N{name}   named Unicode character
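The escapes on slide 12 are Perl spellings. As a rough illustration, the sketch below uses Python's `re` module instead (an assumption, chosen only so the snippet is easy to run); Python accepts `\t`, `\033` and `\x1b`, but spells the long hex form as `\u263a` and does not accept `\e` or `\cK`. The sample string is made up for the example.

```python
import re

# Hypothetical sample: a tab, an ANSI escape sequence and a Unicode smiley.
text = "col1\tcol2\x1b[0m\u263a"

# Tab: the regex engine itself understands \t inside a raw string.
has_tab = re.search(r"\t", text) is not None

# ESC written as hex and as octal; both name character 0x1B.
esc_hex = re.search(r"\x1b", text)
esc_oct = re.search(r"\033", text)

# Unicode SMILEY by code point (Python's spelling of Perl's \x{263a}).
smiley = re.search(r"\u263a", text)

print(has_tab, esc_hex.start(), esc_oct.start(), smiley.start())
```

Both ESC spellings find the same character, which is the point of the octal/hex rows in the table.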
  • 14. POSIX Character Classes: [: … :]. Example: [^[:digit:]] == [^0-9]
  • 15. Shorthand Chars
       \w  word character [A-Za-z0-9_]
       \d  decimal digit [0-9]
       \s  whitespace [ \t\n\r\f]
       \W  not a word character [^A-Za-z0-9_]
       \D  not a decimal digit [^0-9]
       \S  not whitespace [^ \t\n\r\f]
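One way to check the shorthand table is to compare each shorthand with its explicit class on the same text. This is a minimal sketch in Python's `re` (an assumption; the slides use Perl). `re.ASCII` is passed so the shorthands stay ASCII-only, and Python's `\s` also covers vertical tab, so `\v` appears in the explicit class. `sample`, `PAIRS` and `same_matches` are names invented for the example.

```python
import re

# Hypothetical sample text containing words, digits and mixed whitespace.
sample = "user_42 logged in\tat 10:05\n"

# Each shorthand paired with the explicit class it abbreviates (ASCII mode).
PAIRS = {
    r"\w": r"[A-Za-z0-9_]",
    r"\d": r"[0-9]",
    r"\s": r"[ \t\n\r\f\v]",
    r"\W": r"[^A-Za-z0-9_]",
    r"\D": r"[^0-9]",
    r"\S": r"[^ \t\n\r\f\v]",
}

def same_matches(shorthand, explicit, text=sample):
    """True when the shorthand and the explicit class find the same runs."""
    short_runs = re.findall(shorthand + "+", text, re.ASCII)
    explicit_runs = re.findall(explicit + "+", text)
    return short_runs == explicit_runs

for shorthand, explicit in PAIRS.items():
    print(shorthand, explicit, same_matches(shorthand, explicit))
```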
  • 24. Non Greedy Quantifiers: {m,n}? *? +? ??
       To make a quantifier non-greedy, append ‘?’.
       <.+?> applied to "My first <strong> regex </strong> test." matches <strong>
       Or use a negated class: <[^>]+> applied to the same text also matches <strong>
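The difference between the greedy, non-greedy and negated-class forms on slide 24 can be seen by running all three against the slide's test string. The sketch uses Python's `re` (an assumption; the quantifier syntax is the same as in Perl), and the variable names are illustrative.

```python
import re

text = "My first <strong> regex </strong> test."

# Greedy: .+ runs to the last '>' it can reach.
greedy = re.search(r"<.+>", text).group()

# Non-greedy: .+? stops at the first '>' that lets the match succeed.
lazy = re.search(r"<.+?>", text).group()

# Negated class: [^>]+ cannot cross a '>' at all, so no backtracking is needed.
negated = re.search(r"<[^>]+>", text).group()

print(greedy)   # <strong> regex </strong>
print(lazy)     # <strong>
print(negated)  # <strong>
```

The negated class gives the same result as the lazy quantifier here, and it states the intent ("anything but a closing bracket") directly.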
  • 31. Matching a float number. Float number = integerpart.fractionalpart. Basic principle: split your task into subtasks.
  • 32. Matching a float number. Integer part = \d+ -> will match one or more digits
  • 33. Matching a float number. Integer part = \d+ -> will match one or more digits. Literal dot = \.
  • 34. Matching a float number. Integer part = \d+ -> will match one or more digits. Literal dot = \. Fractional part = \d+ -> will match one or more digits
  • 35. Matching a float number. Integer part = \d+, literal dot = \., fractional part = \d+. Combine all of them = \d+\.\d+
  • 36. Matching a float number. /\d+\.\d+/ is the generic pattern, but it won’t match -123.45 or +123.45
  • 37. Matching a float number. /\d+\.\d+/ won’t match -123.45 or +123.45; /[+-]?\d+\.\d+/ will.
  • 38. Matching a float number. /[+-]?\d+\.\d+/ won’t match "- 123.45" or "+ 123.45" (space after the sign); /[+-]? *\d+\.\d+/ will. But it won’t match 123. or .45
  • 39. Matching a float number. /[+-]? *\d+\.\d+/ won’t match 123. or .45; /[+-]? *(?:\d+\.\d+|\d+\.|\.\d+)/ will. Spaced out: / [+-]?\ * (?: \d+\.\d+ | \d+\. | \.\d+ ) /x
  • 40. Matching a float number. The previous pattern has no exponent, so it won’t match 10e2 or 101E5; add (?:[eE]\d+)?: /[+-]? *(?:\d+\.\d+|\d+\.|\.\d+)(?:[eE]\d+)?/ Spaced out: / [+-]?\ * (?: \d+\.\d+ | \d+\. | \.\d+ ) (?: [eE]\d+ )? /x (A mantissa without a dot, as in 10e2, also needs the plain-integer alternative \d+ that appears in the final pattern.)
  • 41. Matching a float number. The previous pattern won’t match a signed exponent such as 10e-2; allow [+-]? in the exponent and anchor the pattern: /^[+-]? *(?:\d+\.\d+|\d+\.|\.\d+)(?:[eE][+-]?\d+)?$/ Spaced out: / ^[+-]?\ * (?: \d+\.\d+ | \d+\. | \.\d+ ) (?: [eE][+-]?\d+ )? $/x
  • 42. Match a float number
       /^
          [+-]?\ *           # first, match an optional sign
          (?:                # then match integers or f.p. mantissas:
              \d+\.\d+       # mantissa of the form a.b
             |\d+\.          # mantissa of the form a.
             |\.\d+          # mantissa of the form .b
             |\d+            # integer of the form a
          )
          (?:[eE][+-]?\d+)?  # finally, optionally match an exponent
       $/x;
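The final pattern from slide 42 can be tried out as follows. This is a sketch in Python's `re` with `re.VERBOSE` standing in for Perl's `/x` (an assumption, since the slides use Perl); `[ ]*` replaces Perl's `\ *` because verbose mode would otherwise discard the space. `FLOAT` and `is_float` are names invented for the example.

```python
import re

# Slide 42's pattern, written for Python's verbose mode.
FLOAT = re.compile(r"""
    ^
    [+-]?[ ]*            # optional sign, then optional spaces
    (?:                  # mantissa or integer:
        \d+\.\d+         #   a.b
      | \d+\.            #   a.
      | \.\d+            #   .b
      | \d+              #   a
    )
    (?:[eE][+-]?\d+)?    # optional exponent
    $
""", re.VERBOSE)

def is_float(s):
    """True when the whole string matches the slide-42 float pattern."""
    return FLOAT.match(s) is not None

for candidate in ["123.45", "-123.45", "+ 123.45", "123.", ".45",
                  "10e2", "101E5", "10e-2", "abc", "1.2.3", "."]:
    print(repr(candidate), is_float(candidate))
```

Each input from slides 36 to 41 is accepted, while strings such as "1.2.3" or a lone "." are rejected because the pattern is anchored at both ends.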
  • 46. Look Around
                   Positive     Negative
       Ahead       (?=...)      (?!...)
       Behind      (?<=...)     (?<!...)
       (?=...)   Zero-width positive lookahead assertion
       (?!...)   Zero-width negative lookahead assertion
       (?<=...)  Zero-width positive lookbehind assertion
       (?<!...)  Zero-width negative lookbehind assertion
       Note: assertions can be nested. Example: /(?<=, (?! (?<=\d,)(?=\d) ) )/x
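A small sketch of the four lookaround forms, plus one nested use, again in Python's `re` (an assumption; the assertion syntax matches Perl's, though Python requires fixed-width lookbehind). The sample strings are invented, and the nested pattern is a simplified variant of the slide's idea (a comma that is not between two digits), not the slide's exact expression.

```python
import re

prices = "cost $40, was 55, now $30"

# Positive lookahead: a word only when a colon follows it.
keys = re.findall(r"\w+(?=:)", "host: example port: 80")

# Negative lookahead: words starting with 'foo' unless 'bar' follows.
not_foobar = re.findall(r"\bfoo(?!bar)\w*", "foobar foobaz food")

# Positive lookbehind: digits only when a dollar sign precedes them.
dollars = re.findall(r"(?<=\$)\d+", prices)

# Negative lookbehind: whole numbers not preceded by a dollar sign.
plain = re.findall(r"(?<!\$)\b\d+", prices)

# Nested assertions: split on commas, except a comma sitting between digits.
fields = re.split(r",(?!(?<=\d,)(?=\d))\s*", "1,234, apples,pears")

print(keys, not_foobar, dollars, plain, fields)
```

None of the assertions consume text, which is why the thousands separator in "1,234" survives the split.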
  • 53. Inline modifiers & Comments
       Matching can be modified inline by placing modifiers in the pattern:
       (?i) enables case-insensitive mode
       (?m) enables multiline matching for ^ and $
       (?s) makes the dot metacharacter match newline as well
       (?x) ignores literal whitespace in the pattern
       (?U) makes quantifiers ungreedy (lazy) by default (PCRE)
       $answers =~ /(?i)y(?-i)(?:es)?/ -> when anchored with ^ and $, will match ‘y’, ‘Y’, ‘yes’, ‘Yes’ but not ‘YES’.
       Comments can be inserted inline using the (?#...) construct:
       /^(?#begin)\d+(?#match integer part)\.(?#match dot)\d+(?#match fractional part)$/
  • 54. Regex Testers and Tools
       Editors: Vim, TextMate, EditPad Pro, NoteTab, UltraEdit
       RegexBuddy
       Reggy - http://reggyapp.com
       Rubular (Ruby) - http://rubular.com
       RegexPal (JavaScript) - http://www.regexpal.com
       http://www.gskinner.com/RegExr/
       http://www.spaweditor.com/scripts/regex/index.php
       http://regex.larsolavtorvik.com/ (PHP, JavaScript)
       http://www.nregex.com/ (.NET)
       http://www.myregexp.com/ (Java)
       http://osteele.com/tools/reanimator (NFA graphic representation)
       Expresso - http://www.ultrapico.com/Expresso.htm (.NET)
       Regulator - http://sourceforge.net/projects/regulator (.NET)
       RegexRenamer - http://regexrenamer.sourceforge.net/ (.NET)
       PowerGREP - http://www.powergrep.com/
       Windows Grep - http://www.wingrep.com/
  • 55. Regex Resources
       $ perldoc perlre (also perlretut, perlreref)
       $ man re_format
       “Mastering Regular Expressions” by Jeffrey Friedl - http://oreilly.com/catalog/9780596528126/
       “Regular Expressions Cookbook” by Jan Goyvaerts & Steven Levithan - http://oreilly.com/catalog/9780596520694
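The inline modifiers on slide 53 can be sketched as below in Python's `re` (an assumption; the slide's example is Perl). Python has no mid-pattern `(?-i)` switch and no `(?U)`, so the case-insensitive part is written as the scoped group `(?i:y)`, which plays the same role as `(?i)y(?-i)`. The variable names are illustrative.

```python
import re

# Case-insensitive 'y' only; the optional 'es' stays case-sensitive.
# fullmatch anchors the pattern at both ends.
yes = re.compile(r"(?i:y)(?:es)?")
accepted = [s for s in ["y", "Y", "yes", "Yes", "YES"] if yes.fullmatch(s)]

# (?m): ^ matches at the start of every line, not just the string.
line_starts = re.findall(r"(?m)^\w+", "one\ntwo")

# (?s): the dot also matches a newline.
dot_all = re.search(r"(?s)a.b", "a\nb") is not None

# (?x): whitespace in the pattern is ignored and '#' starts a comment.
verbose = re.fullmatch(r"(?x) \d+ \. \d+  # integer part, dot, fraction", "3.14")

# (?#...) comments work without verbose mode.
commented = re.fullmatch(r"\d+(?#integer part)\.(?#dot)\d+(?#fraction)", "3.14")

print(accepted, line_starts, dot_all, bool(verbose), bool(commented))
```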
  • 56. Questions? * { } ^ ] + $ [ ( ? . ) - : #
  • 57. Thank Y!ou * { } ^ ] + $ [ ( ? . ) - : #
  • 66. Commify a number
       $no=123456789; substr($no,0,length($no)-1)=~s/(?=(?<=\d)(?:\d\d)+$)/,/g; print $no;
       Produces 12,34,56,789
  • 67. Find incremental numbers
       $str="abc 123 hai cde 34567 efg 1245 a132 123456789 10adf";
       print "$1" while($str=~/ ( (\d) (?{$x=$2}) ( (??{++$x%10}) )* ) /gx);
       A non-capturing group inside a capturing group won’t exclude its text from the enclosing capture:
       perl -e '$x="cat cat cat";$x=~/(cat(?:\s+))/;print ":$1:";'
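The commify trick on slide 66 sets the last digit aside and then inserts a comma wherever an even number of digits remains, which yields the Indian 2-2-3 grouping. A minimal sketch of the same idea in Python's `re` (an assumption; the slide uses Perl's `substr` and `s///`) follows; `commify_indian` and `commify_western` are hypothetical helper names, and the western variant is added only for comparison.

```python
import re

def commify_indian(no):
    """Group as on the slide: last three digits, then pairs (12,34,56,789)."""
    head, last = no[:-1], no[-1:]
    # Comma at every position preceded by a digit and followed by an
    # even, non-zero number of digits up to the end of `head`.
    return re.sub(r"(?<=\d)(?=(?:\d\d)+$)", ",", head) + last

def commify_western(no):
    """Group in threes (123,456,789) with the same lookaround technique."""
    return re.sub(r"(?<=\d)(?=(?:\d{3})+$)", ",", no)

print(commify_indian("123456789"))   # 12,34,56,789
print(commify_western("123456789"))  # 123,456,789
```

Both substitutions match only zero-width positions, so no digits are consumed or rewritten.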
