Memorable uses for a Regular Expression library
Learning the syntax by examples
Alex Perry
SRE, Google, Los Angeles
April 2014
Outline
● Simple Regular Expressions
● import re
○ http://docs.python.org/2/library/re.html
● Parsing
● import sre
● Formatting
● import sre_yield
● Arithmetic
● Performance uncertainty
● import re2
Basic Regular Expressions
abc “abc”
[abc] “a” “b” “c”
abc? “ab” “abc”
abc* “ab” “abc” “abcc” ...
abc+ “abc” “abcc” “abccc” ...
abc{3,4} “abccc” “abcccc”
ab|c+ “ab” “c” “cc” ...
ab. “ab.” “ab1” … “ab\n” (newline only with DOTALL)
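The table above can be checked mechanically. The following is a minimal sketch, written for Python 3 (the slides use Python 2), that runs each pattern through `re.fullmatch`. The `EXAMPLES` dictionary and the `check_table` helper are illustrative names, not part of the talk.

```python
import re

# Pattern -> (strings that should match in full, strings that should not).
EXAMPLES = {
    r"abc":      (["abc"], ["ab", "abcc"]),
    r"[abc]":    (["a", "b", "c"], ["ab", "d"]),
    r"abc?":     (["ab", "abc"], ["abcc"]),
    r"abc*":     (["ab", "abc", "abcc"], ["a"]),
    r"abc+":     (["abc", "abcc", "abccc"], ["ab"]),
    r"abc{3,4}": (["abccc", "abcccc"], ["abcc", "abccccc"]),
    r"ab|c+":    (["ab", "c", "cc"], ["abc", "c+"]),
    r"ab.":      (["ab.", "ab1"], ["ab", "ab\n"]),
}


def check_table():
    """Return a list of (pattern, string) pairs that behave unexpectedly."""
    failures = []
    for pattern, (accepted, rejected) in EXAMPLES.items():
        for s in accepted:
            if re.fullmatch(pattern, s) is None:
                failures.append((pattern, s))
        for s in rejected:
            if re.fullmatch(pattern, s) is not None:
                failures.append((pattern, s))
    return failures


print(check_table())
# With DOTALL the dot also accepts a newline.
print(bool(re.fullmatch(r"ab.", "ab\n", re.DOTALL)))
```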
The standard library - compiling
>>> import re
>>> o = re.compile("abc?")
>>> [bool(o.match(s)) for s in
["a", "ab", "abc", "abcc", "aabcc"]]
[False, True, True, True, False]
>>> [bool(o.search(s)) for s in
["a", "ab", "abc", "abcc", "aabcc"]]
[False, True, True, True, True]
The standard library - endings
>>> o = re.compile("^abc?$")
>>> [bool(o.search(s)) for s in
["a", "ab", "abc", "abcc", "aabcc"]]
[False, True, True, False, False]
>>> s = re.compile("i*") # yes, that s matches “”
>>> s.split("oiooiioooiii") # split ignores that silliness
['o', 'oo', 'ooo', '']
>>> s.sub("x", "oiooiioooiii") # but sub does not
'xoxoxoxoxoxox'
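How `split` and `sub` treat a pattern that can match the empty string has varied between Python versions (`re.split` changed in 3.7), so the outputs above are specific to the Python 2 interpreter used in the talk. A minimal sketch of one way to sidestep the question, assuming the intent is simply to break on runs of "i", is to require at least one character:

```python
import re

# "i+" cannot match the empty string, so split and sub agree on where
# the separators are.
one_or_more = re.compile("i+")

parts = one_or_more.split("oiooiioooiii")
replaced = one_or_more.sub("x", "oiooiioooiii")

print(parts)
print(replaced)
```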
Parsing strings easily
>>> import re
>>> cell = re.compile(r"(?P<row>[$]?[a-z]+)"
r"(?P<col>[$]?[0-9]+)")
>>> m = cell.search("Spreadsheet cell aa$15")
>>> m
<_sre.SRE_Match object at 0x7f220a8e9360>
>>> m.groupdict()
{'col': '$15', 'row': 'aa'}
Formatting after parsing using a regular expression
>>> rc = m.groupdict()
>>> rc
{'col': '$15', 'row': 'aa'}
>>> 'It was row %(row)s and column %(col)s' % rc
'It was row aa and column $15'
>>> txt = "from a1 2 b$22 as well as 4 $c4"
>>> f = r"<%(col)s,%(row)s>"
>>> ";".join(f % m.groupdict() for m in cell.finditer(txt))
'<1,a>;<$22,b>;<4,$c>'
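The parse-then-format pattern from the last two slides can be wrapped in a pair of small functions. This is a sketch for Python 3 that uses `str.format` instead of the `%` operator. The names `parse_cells` and `format_cells` are made up for illustration. The regular expression and the group names are the ones from the slides.

```python
import re

# Same pattern as on the slides: letters land in "row", digits in "col".
CELL = re.compile(r"(?P<row>[$]?[a-z]+)(?P<col>[$]?[0-9]+)")


def parse_cells(text):
    """Return one dictionary per cell reference found in text."""
    return [m.groupdict() for m in CELL.finditer(text)]


def format_cells(text, template="<{col},{row}>"):
    """Render every cell reference in text with the given template."""
    return ";".join(template.format(**cell) for cell in parse_cells(text))


txt = "from a1 2 b$22 as well as 4 $c4"
print(parse_cells(txt))
print(format_cells(txt))
```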
Secret (labs) RE engine - internals
● Originally separate from module “re”
○ As of version 2.0 onwards they’re equivalent
○ Call it “sre” in any backward compatible code
>>> import sre_parse
>>> sre_parse.parse("ab|c")
[('branch', (None, [
[('literal', 97), ('literal', 98)],
[('literal', 99)]
])
)]
Secret Regular Expression Yield
● New module called sre_yield
○ https://github.com/google/sre_yield
● def Values(regex, flags=0, charset=CHARSET)
○ Examines output from sre_parse.parse()
○ Returns a convenient sequence-like object
● Sequence has an efficient membership test
○ We were given a regex describing its content
● Some features (lookahead, etc.) still missing
○ Easy to add if sequence can contain None
Iterating over all matching strings
>>> import sre_yield
>>> sre_yield.Values(r'1(?P<x>234?|49?)')[:]
['123', '1234', '14', '149']
>>> len(sre_yield.Values('.'))
256
>>> sre_yield.Values('a*')[5:10]
['aaaaa', 'aaaaaa', 'aaaaaaa', 'aaaaaaaa', 'aaaaaaaaa']
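To give a feel for how strings can be generated from the structure of an expression, here is a toy sketch. It is not how sre_yield is implemented: it builds the expression by hand from four hypothetical helpers (`lit`, `alt`, `opt`, `cat`) instead of reading `sre_parse` output, and it expands everything eagerly into lists, whereas sre_yield returns a lazy sequence.

```python
import itertools


def lit(s):
    """A literal string: exactly one value."""
    return [s]


def alt(*branches):
    """Alternation (a|b): the values of each branch, in order."""
    return [value for branch in branches for value in branch]


def opt(branch):
    """Optional part (x?): the empty string, then the branch values."""
    return [""] + list(branch)


def cat(*parts):
    """Concatenation: every combination of one value from each part."""
    return ["".join(combo) for combo in itertools.product(*parts)]


# Hand-built equivalent of r'1(234?|49?)'
values = cat(
    lit("1"),
    alt(
        cat(lit("23"), opt(lit("4"))),
        cat(lit("4"), opt(lit("9"))),
    ),
)
print(values)
```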
What do we do about infinite repetitions
>>> len(sre_yield.Values('0*'))
65536
# Yes, really. sre library can only specify 65534 max
>>> a77k = 'a' * 77000
>>> len(re.compile(r'.{,65534}').match(a77k).group(0))
65534
>>> len(re.compile(r'.{,65535}').match(a77k).group(0))
77000
>>> len(re.compile(r'.{60000}.{,6000}|.{,60000}')
.match(a77k).group(0))
66000
How many matching strings
>>> import sre_yield
>>> bits = sre_yield.Values('[01]*') # All binary nums
>>> len(bits) # how many are there?
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
OverflowError: long int too large to convert to int
>>> bits.__len__() == 2**65536 - 1 # check the answer
True
>>> len(str(bits.__len__())) # Is the number that big?
19729
>>> "001001" in bits, "002001" in bits
(True, False)
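The count above follows from adding up the number of binary strings of each length. A small brute-force sketch, assuming Python 3 and using made-up helper names, confirms the pattern at a scale that can actually be enumerated, and reproduces the membership test with `re.fullmatch`:

```python
import itertools
import re


def count_binary_strings(max_len):
    """Count the strings over {0, 1} of length 0..max_len by enumerating them."""
    return sum(
        1
        for length in range(max_len + 1)
        for _ in itertools.product("01", repeat=length)
    )


BITS = re.compile(r"[01]*")


def is_member(s):
    """True if the whole string consists of binary digits."""
    return BITS.fullmatch(s) is not None


for n in (0, 3, 10):
    # The enumerated count equals 2**(n + 1) - 1.
    print(n, count_binary_strings(n), 2 ** (n + 1) - 1)

print(is_member("001001"), is_member("002001"))
```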
Python does understand working with large numbers
>>> import sre_yield
>>> anything = sre_yield.Values('.*')
>>> a = 1
>>> for _ in xrange(65535): a = a * 256 + 1
>>> anything.__len__() == a
True
>>> str_a = str(a) # This does take a while
>>> len(str_a)
157825
>>> str_a[:9], str_a[-9:]
('101818453', '945826561')
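The loop above is summing a geometric series, so the same numbers have a closed form. The sketch below is written for Python 3, with illustrative function names. It estimates the digit count with logarithms rather than `str()`, partly because the conversion is slow and partly because recent Python 3 releases limit integer-to-string conversion length by default.

```python
import math


def count_strings(alphabet_size, max_repeat):
    """Number of strings of length 0..max_repeat over the alphabet.

    Closed form of 1 + c + c**2 + ... + c**max_repeat.
    """
    return (alphabet_size ** (max_repeat + 1) - 1) // (alphabet_size - 1)


def decimal_digits(alphabet_size, max_repeat):
    """Approximate digit count of count_strings() via logarithms.

    Ignores the "- 1" in the numerator, which is negligible at this size.
    """
    log10_value = (max_repeat + 1) * math.log10(alphabet_size) - math.log10(
        alphabet_size - 1
    )
    return math.floor(log10_value) + 1


print(count_strings(2, 65535) == 2 ** 65536 - 1)
print(decimal_digits(2, 65535))
print(decimal_digits(256, 65535))
```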
But why bother yielding from a regex
● It can be more compact than a literal list, for example:
ap-northeast-1|ap-southeast-1|ap-southeast-2|eu-west-1|
sa-east-1|us-east-1|us-west-1|us-west-2
● That doesn’t get much shorter when rewritten:
(ap-(nor|sou)th|sa-|us-)east-1|(eu|us)-west-1|(us-we|ap-southea)st-2
● On the other hand, others are more convenient:
www-(?P<replica>[1-8])[.]((?P<fleet>canary|beta)[.])
widget[.](?P<domain>com|co[.]uk|ch|de)
● Some things would better be machine generated:
192\.168(?:\.(?:[1-9]?\d|1\d{2}|2[0-4]\d|25[0-5])){2}
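Hand-compacted expressions are easy to get wrong, so it is worth checking them against the literal list they replace. The sketch below, for Python 3, tests the rewritten region expression against the eight names and tests the address expression (written as a raw string with its escapes) against every value from 0 to 255 in the last two positions. The constant and function names are illustrative.

```python
import re

REGIONS = [
    "ap-northeast-1", "ap-southeast-1", "ap-southeast-2", "eu-west-1",
    "sa-east-1", "us-east-1", "us-west-1", "us-west-2",
]

COMPACT = re.compile(
    r"(ap-(nor|sou)th|sa-|us-)east-1|(eu|us)-west-1|(us-we|ap-southea)st-2"
)

PRIVATE = re.compile(r"192\.168(?:\.(?:[1-9]?\d|1\d{2}|2[0-4]\d|25[0-5])){2}")


def all_regions_match():
    """True if the compact expression accepts every listed region."""
    return all(COMPACT.fullmatch(region) for region in REGIONS)


def accepts(address):
    """True if the address is a well-formed 192.168.x.y string."""
    return PRIVATE.fullmatch(address) is not None


def all_octets_accepted():
    """True if every x, y in 0..255 is accepted in 192.168.x.y."""
    return all(
        accepts("192.168.%d.%d" % (x, y))
        for x in range(256)
        for y in range(256)
    )


print(all_regions_match())
print(all_octets_accepted())
print(accepts("192.168.256.1"))
```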
How fast is the “re” library
● Implementation uses backtracking, as PCRE does
○ So it is fast providing it never guesses wrong
○ Trivial to write an expression that is … slow
import re
import timeit

def test(n):
    # n optional "a"s followed by n required "a"s, matched against n "a"s
    t = "a" * n
    r = "a?" * n + t
    return bool(re.match(r, t))

timeit.timeit(
    stmt="test(6)", setup="from __main__ import test")
The RE2 library
● https://code.google.com/p/re2
● https://github.com/axiak/pyre2
● RE2 tries all possible code paths in parallel
○ never backtracks, so omits features that need it
● drops support for backreferences
○ and generalized zero-width assertions
● Predictable worst case performance for any input
○ Safe to accept untrusted regular expressions
test(10) takes 4 milliseconds instead of one minute
Summary
● Regular expressions are built into Python
○ re_obj = re.compile(pattern)
○ print re_obj.pattern
● They can parse strings into a dictionary
○ Or iteratively many dictionaries
● They can compactly represent large lists
○ Without expanding the whole iterator out
● For reliable performance, use RE2
○ Especially if users are supplying patterns
Questions?
● mail -s us.pycon.org/2014
○ Alex.Perry@Google.com
● Nothing to do with me, but pretty good:
○ http://qntm.org/files/re/re.html

Regular expressions, Alex Perry, Google, PyCon2014

  • 1. Memorable uses for a Regular Expression library Learning the syntax by examples Alex Perry SRE, Google, Los Angeles April 2014
  • 2. Outline ● Simple Regular Expressions ● import re ○ http://docs.python.org/2/library/re.html ● Parsing ● import sre ● Formatting ● import sre_yield ● Arithmetic ● Performance uncertainty ● import re2
  • 3. Basic Regular Expressions abc “abc” [abc] “a” “b” “c” abc? “ab” “abc” abc* “ab” “abc” “abcc” ... abc+ “abc” “abcc” “abccc” ... abc{3,4} “abccc” “abcccc” ab|c+ “ab” “c” “cc” ... ab. “ab.” “ab1” … “ab\n” (with DOTALL)
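The table on slide 3 can be checked mechanically. The following is a minimal sketch, assuming Python 3 (the slides themselves use Python 2); it pairs each pattern with strings it should accept in full, using `re.fullmatch` so that partial matches do not count. The names `EXAMPLES` and `accepts` are illustrative and not part of the talk.

```python
import re

# Pattern -> strings that the whole pattern should match (illustrative data).
EXAMPLES = {
    r"abc": ["abc"],
    r"[abc]": ["a", "b", "c"],
    r"abc?": ["ab", "abc"],
    r"abc*": ["ab", "abc", "abcc"],
    r"abc+": ["abc", "abcc", "abccc"],
    r"abc{3,4}": ["abccc", "abcccc"],
    r"ab|c+": ["ab", "c", "cc"],
    r"ab.": ["ab.", "ab1"],
}


def accepts(pattern, text, flags=0):
    """Return True when the entire text matches the pattern."""
    return re.fullmatch(pattern, text, flags) is not None


all_ok = all(accepts(p, s) for p, strings in EXAMPLES.items() for s in strings)

# "." does not match a newline unless re.DOTALL is set.
newline_plain = accepts(r"ab.", "ab\n")
newline_dotall = accepts(r"ab.", "ab\n", re.DOTALL)
```

Here `all_ok` reports whether every listed example is accepted, and the last two values show the effect of the DOTALL flag on the final row.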
  • 4. The standard library - compiling >>> import re >>> o = re.compile("abc?") >>> [bool(o.match(s)) for s in ["a", "ab", "abc", "abcc", "aabcc"]] [False, True, True, True, False] >>> [bool(o.search(s)) for s in ["a", "ab", "abc", "abcc", "aabcc"]] [False, True, True, True, True]
  • 5. The standard library - endings >>> o = re.compile("^abc?$") >>> [bool(o.search(s)) for s in ["a", "ab", "abc", "abcc", "aabcc"]] [False, True, True, False, False] >>> s = re.compile("i*") # yes, that s matches “” >>> s.split("oiooiioooiii") # split ignores that silliness ['o', 'oo', 'ooo', ''] >>> s.sub("x", "oiooiioooiii") # but sub does not 'xoxoxoxoxoxox'
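As a complement to the `^...$` anchoring on slide 5, Python 3 (an assumption here, since the slides target Python 2) also offers `fullmatch`, which requires the pattern to cover the entire string. The sketch below compares the two on the slide's sample strings; the variable names are illustrative.

```python
import re

anchored = re.compile(r"^abc?$")
bare = re.compile(r"abc?")

samples = ["a", "ab", "abc", "abcc", "aabcc"]

# Both approaches agree on the sample strings from the slide.
anchored_hits = [bool(anchored.search(s)) for s in samples]
fullmatch_hits = [bool(bare.fullmatch(s)) for s in samples]

# "$" also matches just before a trailing newline; fullmatch does not.
trailing_anchored = bool(anchored.search("abc\n"))
trailing_fullmatch = bool(bare.fullmatch("abc\n"))
```

The difference only shows up for input that ends in a newline, which matters when validating lines read from a file.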
  • 6. Parsing strings easily >>> import re >>> cell = re.compile(r"(?P<row>[$]?[a-z]+)" r"(?P<col>[$]?[0-9]+)") >>> m = cell.search("Spreadsheet cell aa$15") >>> m <_sre.SRE_Match object at 0x7f220a8e9360> >>> m.groupdict() {'col': '$15', 'row': 'aa'}
  • 7. Formatting after parsing using a regular expression >>> rc = m.groupdict() >>> rc {'col': '$15', 'row': 'aa'} >>> 'It was row %(row)s and column %(col)s' % rc 'It was row aa and column $15' >>> txt = "from a1 2 b$22 as well as 4 $c4" >>> f = r"<%(col)s,%(row)s>" >>> ";".join(f % m.groupdict() for m in cell.finditer(txt)) '<1,a>;<$22,b>;<4,$c>'
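The parse-then-format idea from slides 6 and 7 can be sketched in Python 3 (an assumption; the slides use Python 2 and `%` formatting). The pattern and input text are taken from the slides; `str.format` with keyword unpacking stands in for the `%(name)s` style, and the variable names `parsed` and `formatted` are illustrative.

```python
import re

# Same pattern as on the slide: letters go to "row", digits go to "col".
cell = re.compile(r"(?P<row>[$]?[a-z]+)(?P<col>[$]?[0-9]+)")

txt = "from a1 2 b$22 as well as 4 $c4"

# One dictionary per cell reference found in the text.
parsed = [m.groupdict() for m in cell.finditer(txt)]

# Feed each dictionary into a template keyed by the group names.
formatted = ";".join("<{col},{row}>".format(**d) for d in parsed)
```

Because the template only refers to group names, the pattern and the output format can be changed independently of each other.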
  • 8. Secret (labs) RE engine - internals ● Originally separate from module “re” ○ As of version 2.0 onwards they’re equivalent ○ Call it “sre” in any backward compatible code >>> import sre_parse >>> sre_parse.parse("ab|c") [('branch', (None, [ [('literal', 97), ('literal', 98)], [('literal', 99)] ]) )]
  • 9. Secret Regular Expression Yield ● New module called sre_yield ○ https://github.com/google/sre_yield ● def Values(regex, flags=0, charset=CHARSET) ○ Examines output from sre_parse.parse() ○ Returns a convenient sequence like object ● Sequence has an efficient membership test ○ We were given a regex describing its content ● Some features (lookahead, etc) still missing ○ Easy to add if sequence can contain None
  • 10. Iterating over all matching strings >>> import sre_yield >>> sre_yield.Values(r'1(?P<x>234?|49?)')[:] ['123', '1234', '14', '149'] >>> len(sre_yield.Values('.')) 256 >>> sre_yield.Values('a*')[5:10] ['aaaaa', 'aaaaaa', 'aaaaaaa', 'aaaaaaaa', 'aaaaaaaaa']
  • 11. What do we do about infinite repetitions >>> len(sre_yield.Values('0*')) 65536 # Yes, really. sre library can only specify 65534 max >>> a77k = 'a' * 77000 >>> len(re.compile(r'.{,65534}').match(a77k).group(0)) 65534 >>> len(re.compile(r'.{,65535}').match(a77k).group(0)) 77000 >>> len(re.compile(r'.{60000}.{,6000}|.{,60000}') .match(a77k).group(0)) 66000
  • 12. How many matching strings >>> import sre_yield >>> bits = sre_yield.Values('[01]*') # All binary nums >>> len(bits) # how many are there? Traceback (most recent call last): File "<stdin>", line 1, in <module> OverflowError: long int too large to convert to int >>> bits.__len__() == 2**65536 - 1 # check the answer True >>> len(str(bits.__len__())) # Is the number that big? 19729 >>> "001001" in bits, "002001" in bits (True, False)
  • 13. Python does understand working with large numbers >>> import sre_yield >>> anything = sre_yield.Values('.*') >>> a = 1 >>> for _ in xrange(65535): a = a * 256 + 1 >>> anything.__len__() == a True >>> str_a = str(a) # This does take a while >>> len(str_a) 157825 >>> str_a[:9], str_a[-9:] ('101818453', '945826561')
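The counts on slides 12 and 13 are geometric sums: with an alphabet of k symbols and repetitions of 0 to n, there are (k^(n+1) - 1) / (k - 1) distinct strings. The sketch below, assuming Python 3, computes that closed form and cross-checks it against the slide's multiply-and-add loop. The function names and the constant `MAX_REPEAT` are illustrative. Recent Python 3 releases limit the length of int-to-str conversions by default, so the slide's `str(a)` step may need `sys.set_int_max_str_digits` first; the sketch avoids string conversion for that reason.

```python
def count_closed_form(alphabet_size, max_len):
    """Number of strings of length 0..max_len over the alphabet (geometric sum)."""
    return (alphabet_size ** (max_len + 1) - 1) // (alphabet_size - 1)


def count_horner(alphabet_size, max_len):
    """Same count, built up the way the slide's loop does it."""
    total = 1
    for _ in range(max_len):
        total = total * alphabet_size + 1
    return total


MAX_REPEAT = 65535  # longest repetition implied by the counts on the slides

binary_count = count_closed_form(2, MAX_REPEAT)   # strings matching [01]*
byte_count = count_closed_form(256, MAX_REPEAT)   # strings matching .*
```

Comparing `int.bit_length()` values is a cheap way to judge the size of such numbers without printing them.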
  • 14. But why bother yielding from a regex ● It can be more compact than a literal list, for example: ap-northeast-1|ap-southeast-1|ap-southeast-2|eu-west-1|sa-east-1|us-east-1|us-west-1|us-west-2 ● That doesn’t get much shorter when rewritten: (ap-(nor|sou)th|sa-|us-)east-1|(eu|us)-west-1|(us-we|ap-southea)st-2 ● On the other hand, others are more convenient: www-(?P<replica>[1-8])[.]((?P<fleet>canary|beta)[.])widget[.](?P<domain>com|co[.]uk|ch|de) ● Some things would better be machine generated: 192\.168(?:\.(?:[1-9]?\d|1\d{2}|2[0-4]\d|25[0-5])){2}
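One way to gain confidence in a hand-compressed pattern like the region example on slide 14 is to compare it against the literal alternation on a set of candidate strings. This is a minimal sketch, assuming Python 3 and the standard `re` module only; the near-miss strings in `candidates` are made up for illustration.

```python
import re

REGIONS = [
    "ap-northeast-1", "ap-southeast-1", "ap-southeast-2", "eu-west-1",
    "sa-east-1", "us-east-1", "us-west-1", "us-west-2",
]

# The literal list as an alternation, and the hand-compressed rewrite.
literal = re.compile("|".join(re.escape(r) for r in REGIONS))
compact = re.compile(
    r"(ap-(nor|sou)th|sa-|us-)east-1|(eu|us)-west-1|(us-we|ap-southea)st-2"
)

# Candidate strings: the real list plus a few near misses.
candidates = REGIONS + ["eu-east-1", "us-east-2", "ap-northeast-2", "sa-west-1"]

agree = all(
    bool(literal.fullmatch(c)) == bool(compact.fullmatch(c)) for c in candidates
)
accepted = [c for c in candidates if compact.fullmatch(c)]
```

A check like this only covers the candidates supplied; enumerating every string a pattern accepts is the job the talk assigns to `sre_yield`.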
  • 15. How fast is the “re” library ● Implementation uses backtracking, like PCRE ○ So it is fast providing it never guesses wrong ○ Trivial to write an expression that is … slow >>> import re, timeit >>> def test(n): t = "a" * n; r = "a?" * n + t; return bool(re.match(r, t)) >>> timeit.timeit(stmt="test(6)", setup="from __main__ import test")
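The slow case on slide 15 can be timed directly. The sketch below, assuming Python 3, rebuilds the slide's `test(n)` under the illustrative name `pathological_match` and times single runs with `time.perf_counter`. With a backtracking engine the running time typically grows roughly exponentially in n, so the sizes in the loop are kept small; actual timings depend on the machine and interpreter.

```python
import re
import time


def pathological_match(n):
    """Match "a" * n against n optional a's followed by n required a's."""
    text = "a" * n
    pattern = "a?" * n + text
    return bool(re.match(pattern, text))


def time_once(n):
    """Return (matched, seconds) for a single run at size n."""
    start = time.perf_counter()
    matched = pathological_match(n)
    return matched, time.perf_counter() - start


if __name__ == "__main__":
    for n in (8, 12, 16, 18):
        matched, seconds = time_once(n)
        print(n, matched, f"{seconds:.4f}s")
```

The match always succeeds, because every `a?` can match the empty string; the cost comes from the engine first trying to consume an "a" at each optional position and then backing out.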
  • 16. The RE2 library ● https://code.google.com/p/re2 ● https://github.com/axiak/pyre2 ● RE2 tries all possible code paths in parallel ○ never backtracks, so omits features that need it ● drops support for backreferences ○ and generalized zero-width assertions ● Predictable worst case performance for any input ○ Safe to accept untrusted regular expressions ● test(10) takes 4 milliseconds instead of one minute
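A common way to adopt RE2 without making it a hard dependency is to import it when available and fall back to the standard library otherwise. The sketch below assumes the installed binding exposes a module named `re2` with an `re`-compatible `compile` and `search`; that is an assumption to verify against whichever binding is in use. The helper `safe_search` is hypothetical and not part of either library.

```python
try:
    import re2 as regex_engine  # third-party binding; assumed re-compatible
except ImportError:
    import re as regex_engine  # fallback: standard backtracking engine

ENGINE_NAME = regex_engine.__name__


def safe_search(user_pattern, text):
    """Search text with a user-supplied pattern; None if it fails to compile."""
    try:
        compiled = regex_engine.compile(user_pattern)
    except Exception:  # exception types differ between engines
        return None
    found = compiled.search(text)
    return found.group(0) if found else None
```

When the fallback is active, the worst-case guarantee from the slide no longer holds, so `ENGINE_NAME` is worth logging wherever untrusted patterns are accepted.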
  • 17. Summary ● Regular expressions are built into Python ○ re_obj = re.compile(pattern) ○ print re_obj.pattern ● They can parse strings into a dictionary ○ Or iteratively many dictionaries ● They can compactly represent large lists ○ Without expanding the whole iterator out ● For reliable performance, use RE2 ○ Especially if users are supplying patterns
  • 18. Questions? ● mail -s us.pycon.org/2014 ○ Alex.Perry@Google.com ● Nothing to do with me, but pretty good: ○ http://qntm.org/files/re/re.html