Colloquium - grep
v1.0
A. Magee
April 6, 2010
Colloquium - grep, v1.0
A. Magee
Outline
1 Introduction
What does grep offer?
When should I use grep?
2 Understanding Regular Expressions
Class Basics
Quantifiers & Grouping
Online Tools
Examples
3 Using Regular Expressions With grep
Introduction What?
What does grep offer?
grep matches regular expressions.
Your first question should be “What is a regular expression?”
A regular expression (RE) is a pattern that describes a set of strings.
grep and REs allow us to find complex things in text.
Complex is relative and can vary from a single character to an IP
address.
Single character complex: [ajk+0-]
IP complex: ((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
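To see the IP pattern at work, here is a minimal sketch that runs it through Python's re module instead of grep (an assumption made purely for illustration). The helper name find_ipv4 and the sample text are hypothetical, and the \b anchors are an addition of this sketch, not part of the slide, so that an address is not picked out of the middle of a longer number.

```python
import re

# One octet, 0-255, exactly as on the slide (made non-capturing for findall).
OCTET = r"(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"

# Three "octet." groups followed by a final octet. The \b anchors are an
# assumption added by this sketch.
IPV4 = re.compile(r"\b(?:" + OCTET + r"\.){3}" + OCTET + r"\b")

def find_ipv4(text):
    """Return every dotted-quad IPv4 address found in text."""
    return IPV4.findall(text)

print(find_ipv4("gateway 192.168.1.254, bad 300.1.2.3, dns 8.8.8.8"))
```

The out-of-range 300.1.2.3 is skipped because no alternative of the octet pattern can consume "300" in full.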
Introduction When?
When should I use grep?
Always!
Unless you find some better tool.
P.S. - grep stands for g/re/p, an ed command that means
global / regular expression / print.
Regular Expressions Class Basics
Class Basics
A character class is a symbol or collection of symbols that describes a
group of characters.
. (period): This matches any single character.
[...]: This matches any one character in the set.
[aeiou] matches one of the vowels.
[a-z] matches one lowercase letter.
[0-5] matches one digit from 0 through 5.
You will not remember all of these until you use them often, but
there are many special classes that can save you some typing.
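A small sketch of these basic classes, written with Python's re module as a stand-in for grep (an assumption of this example; the word list and variable names are made up for illustration):

```python
import re

words = ["cat", "cxt", "c3t", "c9t", "ct"]

# '.' matches any single character between 'c' and 't'.
any_char = [w for w in words if re.fullmatch(r"c.t", w)]

# '[aeiou]' matches exactly one vowel in that position.
vowel = [w for w in words if re.fullmatch(r"c[aeiou]t", w)]

# '[a-z]' matches exactly one lowercase letter.
lower = [w for w in words if re.fullmatch(r"c[a-z]t", w)]

# '[0-5]' matches one digit from 0 through 5, so "c9t" is rejected.
low_digit = [w for w in words if re.fullmatch(r"c[0-5]t", w)]

print(any_char, vowel, lower, low_digit)
```

Note that "ct" never matches: every class here stands for exactly one character.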
Common Classes
Special Class   Meaning                  Simple RE
\d              Digit characters         [0-9]
\D              Non-digit characters     [^0-9]
\w              Word characters          [a-zA-Z_0-9]
\W              Non-word characters      [^a-zA-Z_0-9]
\s              Whitespace characters    [ \f\n\r\t]
\S              Non-space characters     [^ \f\n\r\t]
\b              Word boundary
The word boundary class is very special as it is zero length and matches
the transition between a word character (\w) and a non-word character (\W),
such as whitespace, and vice versa.
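The shorthand classes can be tried out with a short sketch in Python's re module (used here in place of grep as an assumption of the example; the sample strings are invented). Note that \d is a Perl-style shorthand: with grep it generally needs the -P option, or [0-9] instead.

```python
import re

line = "Order 66 shipped to dock_7 at 10:45"

digits = re.findall(r"\d+", line)   # runs of digit characters
words = re.findall(r"\w+", line)    # runs of word characters (letters, digits, _)
fields = re.split(r"\s+", line)     # split on runs of whitespace

# \b is zero-width: it consumes nothing, it only asserts a word/non-word edge.
sample = "cat concat cat_nap"
loose = re.findall(r"cat", sample)        # finds "cat" inside other words too
strict = re.findall(r"\bcat\b", sample)   # only the free-standing word

print(digits, words, fields, loose, strict)
```

The underscore counts as a word character, which is why "cat_nap" offers no boundary after "cat".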
More Common Classes
Special Class   Meaning                          Simple RE
[:alpha:]       All alphabetic characters        [a-zA-Z]
[:alnum:]       All alphabetic and numeric       [a-zA-Z0-9]
[:blank:]       Tab and space                    [ \t]
[:cntrl:]       Control characters               [\x00-\x1F\x7F]
[:digit:]       A numeric digit                  [0-9]
[:graph:]       Any visible character            [\x21-\x7E]
[:lower:]       Lowercase characters             [a-z]
[:print:]       Printables (i.e. no controls)    [\x20-\x7E]
[:punct:]       Punctuation & symbols            [!"#$%&'()*+,\-./:;<=>?@[\\\]^_`{|}~]
[:space:]       Space, tab, newline, etc.        [ \t\r\n\v\f]
[:upper:]       Uppercase characters             [A-Z]
[:word:]        Word characters                  [a-zA-Z0-9_]
[:xdigit:]      Hex digits                       [A-Fa-f0-9]
[:word:] is a Perl extension rather than a POSIX class.
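Python's re module does not understand the [: ... :] classes, so the sketch below translates a few of them into the ASCII ranges from the table before compiling. The table POSIX_CLASSES and the helper expand_posix are hypothetical names made up for this example; they are not part of grep or of Python.

```python
import re

# ASCII equivalents, written as the *inside* of a set (no outer brackets).
POSIX_CLASSES = {
    "alpha": "a-zA-Z",
    "alnum": "a-zA-Z0-9",
    "blank": r" \t",
    "digit": "0-9",
    "lower": "a-z",
    "space": r" \t\r\n\v\f",
    "upper": "A-Z",
    "word": "a-zA-Z0-9_",
    "xdigit": "A-Fa-f0-9",
}

def expand_posix(pattern):
    """Replace each [:name:] with its ASCII range; unknown names raise KeyError."""
    return re.sub(r"\[:(\w+):\]", lambda m: POSIX_CLASSES[m.group(1)], pattern)

print(expand_posix("[[:alnum:]-]+"))
print(bool(re.fullmatch(expand_posix("[[:xdigit:]]+"), "DEADbeef42")))
```

Because only the inner [:name:] is replaced, the outer brackets of the set survive, which mirrors the rule that these classes must live inside a set.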
Regular Expressions Quantifiers & Grouping
Quantifiers & Grouping
Quantifiers are how a RE counts things.
?      Zero or one occurrence
*      Zero or more occurrences
+      One or more occurrences
*?     Zero or more occurrences, non-greedy (Perl-compatible REs)
+?     One or more occurrences, non-greedy (Perl-compatible REs)
{x}    Exactly x occurrences
{x,}   At least x occurrences
{x,y}  At least x but no more than y occurrences
Grouping is used to collect patterns together and to create
back-references. A group is simply a set of parentheses ().
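The difference between greedy and non-greedy matching, and what a back-reference does, is easiest to see on a tiny input. This sketch uses Python's re module rather than grep (an assumption of the example); the sample strings and variable names are invented.

```python
import re

html = "<b>bold</b> and <i>italic</i>"

# Greedy: .+ grabs as much as possible, so one match spans the whole line.
greedy = re.findall(r"<.+>", html)

# Non-greedy: .+? stops at the first '>' that lets the match succeed.
lazy = re.findall(r"<.+?>", html)

# {x,y}: keep only strings made of 2 to 4 digits.
counted = [s for s in ["7", "42", "2010", "12345"] if re.fullmatch(r"[0-9]{2,4}", s)]

# Back-reference: \1 must repeat whatever group 1 matched (a doubled word).
doubled = re.search(r"\b(\w+) \1\b", "it is is fine").group(1)

print(greedy, lazy, counted, doubled)
```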
Regular Expressions Online Tools
Helpful Tools
The best way to understand the rest of this presentation is to see what is
being matched live. Here are some online tools that work for our needs.
RegExr - www.gskinner.com/RegExr
beware Flash, but it works well
regexpal - regexpal.com
very simple
reanimator - osteele.com/tools/reanimator
beware Flash, recommend CS 4/570 first
rubular - rubular.com
nice on-page reference
Regular Expressions Examples
Your First RE
Let’s skip trivial REs and get on to something useful. These may be more
complex than you’re used to but the quicker you are able to read long,
complex REs the better. This is a nice, but not perfect, email address
matcher.
[[:alnum:]][[:word:]\.%+-]*@(?:[[:alnum:]-]+\.)+[[:alpha:]]{2,4}
[[:alnum:]][[:word:]\.%+-]*
Match a word that doesn’t start with [.%+-].
@(?:[[:alnum:]-]+\.)+
Match the @ symbol and any number of subdomains, each followed by a period.
[[:alpha:]]{2,4}
Match the top-level domain of 2, 3 or 4 characters.
Your First RE - Part 2
Let’s examine the first part.
[[:alnum:]][[:word:]\.%+-]*
[[:alnum:]] - Must start with an alphanumeric character.
NB: All [: ... :] classes must live in a set like [[: ... :]].
[[:word:]\.%+-] - Other characters may be a ‘word’ character,
a literal period, percent symbol, plus symbol or a dash.
NB: The period is escaped because outside a set it has special meaning
(any character). Inside a set, in Perl-compatible REs, the escape is
harmless but not required.
* - Repeat the previous set zero or more times.
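A quick way to check this first part is to test it on its own. The sketch below is a hand translation for Python's re module, which lacks the [: ... :] classes: [[:alnum:]] becomes [A-Za-z0-9] and [[:word:]] becomes A-Za-z0-9_ (ASCII only, an assumption of this example). The names LOCAL_PART and is_local_part are hypothetical.

```python
import re

# First character: alphanumeric. Rest: word characters, '.', '%', '+' or '-'.
# The '-' sits last in the set so it is taken literally, not as a range.
LOCAL_PART = re.compile(r"[A-Za-z0-9][A-Za-z0-9_.%+-]*")

def is_local_part(s):
    """True if the whole string fits the first part of the email pattern."""
    return LOCAL_PART.fullmatch(s) is not None

for candidate in ["john.doe+news", "a", ".john", "_x", "%off"]:
    print(candidate, is_local_part(candidate))
```

Only the first two candidates pass; the others start with a character that is allowed later in the name but not in first position.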
Your First RE - Part 3
Now the second part, the subdomains, sub-subdomains, etc.
@(?:[[:alnum:]-]+\.)+
@ - Well, that literally matches the ‘at’ character.
The opening parenthesis denotes the beginning of a group.
The ?: is a confusing notation that suppresses the creation of a
back reference. It is here so you’ll know of it, but it is rarely needed.
Again we see a special class for alphanumerics, but we’ve also
included a dash. The plus symbol tells us to look for one or more of
these characters, followed by a period.
And lastly we close the group and the plus symbol now tells us to
look for one or more of these groups.
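What the ?: actually changes can be shown by running the group with and without it. This is a sketch in Python's re module (an assumption; [[:alnum:]] is hand-translated to A-Za-z0-9), and the sample address is made up.

```python
import re

addr = "user@mail.cs.example.edu"

# Plain group: the text of its last repetition is remembered as group 1.
capturing = re.search(r"@([A-Za-z0-9-]+\.)+", addr)

# (?: ... ) groups the same way but remembers nothing.
non_capturing = re.search(r"@(?:[A-Za-z0-9-]+\.)+", addr)

print(capturing.group(0), capturing.groups())
print(non_capturing.group(0), non_capturing.groups())
```

Both versions match the same text, "@mail.cs.example."; only the remembered groups differ.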
Your First RE - Part 4
Finally the third part, the top-level domain.
[[:alpha:]]{2,4}
Well, now this part is easy. Just match 2, 3 or 4 alphabetical
characters.
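Putting the three parts together, here is a minimal sketch of the whole matcher in Python's re module (an assumption; the POSIX classes are hand-translated to ASCII ranges, and the names EMAIL and find_emails plus the sample addresses are hypothetical).

```python
import re

EMAIL = re.compile(
    r"[A-Za-z0-9][A-Za-z0-9_.%+-]*"   # part 1: the name before the @
    r"@(?:[A-Za-z0-9-]+\.)+"          # part 2: one or more "label." groups
    r"[A-Za-z]{2,4}"                  # part 3: top-level domain, 2-4 letters
)

def find_emails(text):
    """Return every substring of text that fits the email pattern."""
    return EMAIL.findall(text)

print(find_emails("Mail a.magee@cs.example.edu or bob+lists@example.org, not x@y"))
print(find_emails("user@example.museum"))
```

The second call shows one way the matcher is "not perfect": the {2,4} limit cuts a longer top-level domain short, so only user@example.muse is reported.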
Your Second RE
Now we’ll look at a RE that can help us build a header file for a C
program file, given that some neglectful programmer has failed to design
his/her C program properly. This will be a quicker example.
^[\w\s]*\([\w\s\*&,]*\)\s*{
^[\w\s]*\(
At the beginning of a line, match some keywords and types and
the function name, and then a literal opening parenthesis.
[\w\s\*&,]*
Match some more words, keywords, variable modifiers and commas.
\)\s*{
Finally match the closing parenthesis, some whitespace and the
left curly brace, denoting the start of the function body.
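As a rough sketch of the header-file idea, the pattern can be applied line by line (as grep would) and each hit turned into a prototype. This uses Python's re module in place of grep; the names FUNC_HEADER, C_SOURCE and prototypes, and the tiny C program, are all invented for the example.

```python
import re

# The slide's pattern, with the brace escaped for clarity.
FUNC_HEADER = re.compile(r"^[\w\s]*\([\w\s*&,]*\)\s*\{")

C_SOURCE = """\
#include <stdio.h>

static int add(int a, int b) {
    return a + b;
}

void swap(int *x, int *y) {
    int t = *x; *x = *y; *y = t;
}

int main(void) {
    return add(1, 2);
}
"""

def prototypes(source):
    """Turn each matching definition line into a prototype ending in ';'."""
    protos = []
    for line in source.splitlines():
        m = FUNC_HEADER.match(line)
        if m:
            # Drop the opening brace and trailing whitespace, then add ';'.
            protos.append(m.group(0).rstrip("{").rstrip() + ";")
    return protos

print("\n".join(prototypes(C_SOURCE)))
```

Like the email matcher, this is nice but not perfect: a control statement such as "while (n) {" fits the same pattern, so the output may need a quick review.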
Your Second RE - Fine Details
^[\w\s]*\([\w\s\*&,]*\)\s*{
In general, line-oriented tools such as grep will not match across
multiple lines, even though the \s class matches the newline character.
This is very bothersome but is easily overcome by using pcregrep with its
-M (multiline) option. PCRE stands for Perl Compatible Regular
Expressions. This is all I will ever say about Perl.
Notice that the literal * is escaped like so, \* (inside a set this is
optional, but harmless).
The parentheses must be escaped due to their special RE meaning.
Escaping so many characters is very annoying, but unfortunately it is
necessary.
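The multi-line problem can be reproduced in a few lines. In this sketch (Python's re module, used here as an assumption in place of grep and pcregrep) the same pattern is tried first on each line separately and then on the whole text at once; the C fragment and variable names are made up.

```python
import re

FUNC_HEADER = re.compile(r"^[\w\s]*\([\w\s*&,]*\)\s*\{", re.MULTILINE)

# A definition spread over three lines, brace on its own line.
SOURCE = "int\nmain(void)\n{\n    return 0;\n}\n"

# Line at a time, the way grep reads its input: no single line has it all.
per_line = [ln for ln in SOURCE.splitlines() if FUNC_HEADER.search(ln)]

# Whole buffer at once, in the spirit of a multiline search:
# \s is now free to run across the newlines.
whole = FUNC_HEADER.search(SOURCE)

print(per_line)
print(repr(whole.group(0)))
```

The per-line search finds nothing, while the whole-buffer search returns the three-line header up to and including the brace.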
Appendix
4 Appendix
16 / 16
Colloquium - grep, v1.0
A. Magee

Mais conteúdo relacionado

Mais procurados

Introduction To Python
Introduction To  PythonIntroduction To  Python
Introduction To Python
shailaja30
 
Python advanced 2. regular expression in python
Python advanced 2. regular expression in pythonPython advanced 2. regular expression in python
Python advanced 2. regular expression in python
John(Qiang) Zhang
 
Python Programming - XI. String Manipulation and Regular Expressions
Python Programming - XI. String Manipulation and Regular ExpressionsPython Programming - XI. String Manipulation and Regular Expressions
Python Programming - XI. String Manipulation and Regular Expressions
Ranel Padon
 
String & its application
String & its applicationString & its application
String & its application
Tech_MX
 

Mais procurados (20)

Python (regular expression)
Python (regular expression)Python (regular expression)
Python (regular expression)
 
Introduction To Python
Introduction To  PythonIntroduction To  Python
Introduction To Python
 
Grep
GrepGrep
Grep
 
String in python lecture (3)
String in python lecture (3)String in python lecture (3)
String in python lecture (3)
 
Grep - A powerful search utility
Grep - A powerful search utilityGrep - A powerful search utility
Grep - A powerful search utility
 
Python
PythonPython
Python
 
Strings in Python
Strings in PythonStrings in Python
Strings in Python
 
Python advanced 2. regular expression in python
Python advanced 2. regular expression in pythonPython advanced 2. regular expression in python
Python advanced 2. regular expression in python
 
Python strings
Python stringsPython strings
Python strings
 
Python language data types
Python language data typesPython language data types
Python language data types
 
Textpad and Regular Expressions
Textpad and Regular ExpressionsTextpad and Regular Expressions
Textpad and Regular Expressions
 
BayFP: Concurrent and Multicore Haskell
BayFP: Concurrent and Multicore HaskellBayFP: Concurrent and Multicore Haskell
BayFP: Concurrent and Multicore Haskell
 
2013 - Andrei Zmievski: Clínica Regex
2013 - Andrei Zmievski: Clínica Regex2013 - Andrei Zmievski: Clínica Regex
2013 - Andrei Zmievski: Clínica Regex
 
Introduction to Python - Part Three
Introduction to Python - Part ThreeIntroduction to Python - Part Three
Introduction to Python - Part Three
 
Python Programming - XI. String Manipulation and Regular Expressions
Python Programming - XI. String Manipulation and Regular ExpressionsPython Programming - XI. String Manipulation and Regular Expressions
Python Programming - XI. String Manipulation and Regular Expressions
 
Processing Regex Python
Processing Regex PythonProcessing Regex Python
Processing Regex Python
 
Regular Expressions
Regular ExpressionsRegular Expressions
Regular Expressions
 
Awk essentials
Awk essentialsAwk essentials
Awk essentials
 
String & its application
String & its applicationString & its application
String & its application
 
Maxbox starter20
Maxbox starter20Maxbox starter20
Maxbox starter20
 

Destaque (6)

Cut command in unix
Cut command in unixCut command in unix
Cut command in unix
 
UNIX - Class6 - sed - Detail
UNIX - Class6 - sed - DetailUNIX - Class6 - sed - Detail
UNIX - Class6 - sed - Detail
 
Learning Grep
Learning GrepLearning Grep
Learning Grep
 
Unix command quickref
Unix command quickrefUnix command quickref
Unix command quickref
 
4, grep
4, grep4, grep
4, grep
 
Take Your Life To The Next Level
Take Your Life To The Next LevelTake Your Life To The Next Level
Take Your Life To The Next Level
 

Semelhante a Grep Introduction

Regular expressions
Regular expressionsRegular expressions
Regular expressions
Raghu nath
 
Questions4
Questions4Questions4
Questions4
hccit
 
Regular Expressions and You
Regular Expressions and YouRegular Expressions and You
Regular Expressions and You
James Armes
 
Scala Language Intro - Inspired by the Love Game
Scala Language Intro - Inspired by the Love GameScala Language Intro - Inspired by the Love Game
Scala Language Intro - Inspired by the Love Game
Antony Stubbs
 
Washington Practitioners Significant Changes To Rpc 1.5
Washington Practitioners Significant Changes To Rpc 1.5Washington Practitioners Significant Changes To Rpc 1.5
Washington Practitioners Significant Changes To Rpc 1.5
Oregon Law Practice Management
 
Paulo Freire Pedagpogia 1
Paulo Freire Pedagpogia 1Paulo Freire Pedagpogia 1
Paulo Freire Pedagpogia 1
Alejandra Perez
 
Jerry Shea Resume And Addendum 5 2 09
Jerry  Shea Resume And Addendum 5 2 09Jerry  Shea Resume And Addendum 5 2 09
Jerry Shea Resume And Addendum 5 2 09
gshea11
 

Semelhante a Grep Introduction (20)

Regular expressions
Regular expressionsRegular expressions
Regular expressions
 
Perl Presentation
Perl PresentationPerl Presentation
Perl Presentation
 
What is the general format for a Try-Catch block Assume that amt l .docx
 What is the general format for a Try-Catch block  Assume that amt l .docx What is the general format for a Try-Catch block  Assume that amt l .docx
What is the general format for a Try-Catch block Assume that amt l .docx
 
Regex lecture
Regex lectureRegex lecture
Regex lecture
 
Questions4
Questions4Questions4
Questions4
 
Unix
UnixUnix
Unix
 
Regular Expressions and You
Regular Expressions and YouRegular Expressions and You
Regular Expressions and You
 
Template Haskell
Template HaskellTemplate Haskell
Template Haskell
 
Don't Fear the Regex - Northeast PHP 2015
Don't Fear the Regex - Northeast PHP 2015Don't Fear the Regex - Northeast PHP 2015
Don't Fear the Regex - Northeast PHP 2015
 
Don't Fear the Regex WordCamp DC 2017
Don't Fear the Regex WordCamp DC 2017Don't Fear the Regex WordCamp DC 2017
Don't Fear the Regex WordCamp DC 2017
 
RegEx Book.pdf
RegEx Book.pdfRegEx Book.pdf
RegEx Book.pdf
 
Python Workshop - Learn Python the Hard Way
Python Workshop - Learn Python the Hard WayPython Workshop - Learn Python the Hard Way
Python Workshop - Learn Python the Hard Way
 
Scala Language Intro - Inspired by the Love Game
Scala Language Intro - Inspired by the Love GameScala Language Intro - Inspired by the Love Game
Scala Language Intro - Inspired by the Love Game
 
Regex startup
Regex startupRegex startup
Regex startup
 
Brogramming - Python, Bash for Data Processing, and Git
Brogramming - Python, Bash for Data Processing, and GitBrogramming - Python, Bash for Data Processing, and Git
Brogramming - Python, Bash for Data Processing, and Git
 
Don't Fear the Regex LSP15
Don't Fear the Regex LSP15Don't Fear the Regex LSP15
Don't Fear the Regex LSP15
 
MMBJ Shanzhai Culture
MMBJ Shanzhai CultureMMBJ Shanzhai Culture
MMBJ Shanzhai Culture
 
Washington Practitioners Significant Changes To Rpc 1.5
Washington Practitioners Significant Changes To Rpc 1.5Washington Practitioners Significant Changes To Rpc 1.5
Washington Practitioners Significant Changes To Rpc 1.5
 
Paulo Freire Pedagpogia 1
Paulo Freire Pedagpogia 1Paulo Freire Pedagpogia 1
Paulo Freire Pedagpogia 1
 
Jerry Shea Resume And Addendum 5 2 09
Jerry  Shea Resume And Addendum 5 2 09Jerry  Shea Resume And Addendum 5 2 09
Jerry Shea Resume And Addendum 5 2 09
 

Último

%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
masabamasaba
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
Health
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
VictorSzoltysek
 
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
masabamasaba
 

Último (20)

Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdfPayment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
%in Durban+277-882-255-28 abortion pills for sale in Durban
%in Durban+277-882-255-28 abortion pills for sale in Durban%in Durban+277-882-255-28 abortion pills for sale in Durban
%in Durban+277-882-255-28 abortion pills for sale in Durban
 
The Top App Development Trends Shaping the Industry in 2024-25 .pdf
The Top App Development Trends Shaping the Industry in 2024-25 .pdfThe Top App Development Trends Shaping the Industry in 2024-25 .pdf
The Top App Development Trends Shaping the Industry in 2024-25 .pdf
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 
8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 
Define the academic and professional writing..pdf
Define the academic and professional writing..pdfDefine the academic and professional writing..pdf
Define the academic and professional writing..pdf
 
%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation Template
 
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
 
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) SolutionIntroducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
 
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
 
Generic or specific? Making sensible software design decisions
Generic or specific? Making sensible software design decisionsGeneric or specific? Making sensible software design decisions
Generic or specific? Making sensible software design decisions
 

Grep Introduction

  • 1. Colloquium - grep v1.0 A. Magee April 6, 2010 1 / 16 Colloquium - grep, v1.0 A. Magee
  • 2. Outline 1 Introduction What does grep offer? When should I use grep? 2 Understanding Regular Expressions Class Basics Quantifiers & Grouping Online Tools Examples 3 Using Regular Expressions With grep 2 / 16 Colloquium - grep, v1.0 A. Magee
  • 3. Outline 1 Introduction What does grep offer? When should I use grep? 2 Understanding Regular Expressions Class Basics Quantifiers & Grouping Online Tools Examples 3 Using Regular Expressions With grep 2 / 16 Colloquium - grep, v1.0 A. Magee
  • 4. Outline 1 Introduction What does grep offer? When should I use grep? 2 Understanding Regular Expressions Class Basics Quantifiers & Grouping Online Tools Examples 3 Using Regular Expressions With grep 2 / 16 Colloquium - grep, v1.0 A. Magee
  • 5. Introduction What? What does grep offer? grep matches regular expressions. Your first question should be“What is a regular expression?” A regular expression is a language pattern. grep and REs allow us to find complex things in text. Complex is relative and can vary from a single character to an IP address. Single character complex: [ajk+0-] IP complex: (25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?).){3} (25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?) 3 / 16 Colloquium - grep, v1.0 A. Magee
  • 6. Introduction What? What does grep offer? grep matches regular expressions. Your first question should be“What is a regular expression?” A regular expression is a language pattern. grep and REs allow us to find complex things in text. Complex is relative and can vary from a single character to an IP address. Single character complex: [ajk+0-] IP complex: (25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?).){3} (25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?) 3 / 16 Colloquium - grep, v1.0 A. Magee
  • 7. Introduction What? What does grep offer? grep matches regular expressions. Your first question should be“What is a regular expression?” A regular expression is a language pattern. grep and REs allow us to find complex things in text. Complex is relative and can vary from a single character to an IP address. Single character complex: [ajk+0-] IP complex: (25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?).){3} (25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?) 3 / 16 Colloquium - grep, v1.0 A. Magee
  • 8. Introduction When? When should I use grep? Always! Unless you find some better tool. P.S. - grep stands for g/re/p, an ed command that means global/reg ex/print 4 / 16 Colloquium - grep, v1.0 A. Magee
  • 9. Regular Expressions Class Basics Class Basics A character class is a symbol or collection of symbols that describes a group of characters. . (period): This matches any single character. [...]: This matches any one character in the set. [aeiou] matches one of the vowels. [a-z] matches one of the lowercase alphabet. [0-5] matches one numeral 0 through 5. You will not remember all of these until you use them often, but there are many special classes that can save you some typing. 5 / 16 Colloquium - grep, v1.0 A. Magee
  • 10. Regular Expressions Class Basics Common Classes Special Class Meaning Simple RE d Digit characters [0-9] D Non-digit characters [ˆ0-9] w Word characters [a-zA-Z 0-9] W Non-word characters [ˆa-zA-Z 0-9] s Whitespace characters characters [fnrt] S Non-space characters [ˆfnrt] b Word boundary The word boundary class is very special as it is zero length and matches transitions between s and w and vice versa. 6 / 16 Colloquium - grep, v1.0 A. Magee
  • 11. Regular Expressions Class Basics More Common Classes Special Class Meaning Simple RE [:alpha:] All alphabetic characters [a-zA-Z] [:alnum:] All alphabetic and numeric [a-zA-Z0-9] [:blank:] Tab and space [:cntrl:] Control characters [x00-x1Fx7F] [:digit:] A numeric digit [0-9] [:graph:] Any visible character [x21-x7E] [:lower:] Lowercase characters [a-z] [:print:] Printables (i.e. no controls) [x20-x7E] [:punct:] Punctuation & symbols [!”#$%&’()*+,-./:;<=>? @[ ]ˆ ‘{|}∼] [:space:] Space, tab, newline, etc [ trnvf] [:upper:] Uppercase characters [A-Z] [:word:] Word characters [a-zA-Z0-9 ] [:xdigit:] Hex digits [A-Fa-f0-9] 7 / 16 Colloquium - grep, v1.0 A. Magee
  • 12. Regular Expressions Quantifiers & Grouping Quantifiers & Grouping Quantifiers are how a RE counts things. ? Exactly zero or one occurrence * Zero or more occurrences + One or more occurrences *? Zero or more occurrences non-greedy +? One or more occurrences non-greedy {x} Exactly x occurrences {x,} At least x occurrences {x,y} At least x but no more than y occurrences Grouping is used to collect patterns together and to create back-references. A group is simply a set of parentheses (). 8 / 16 Colloquium - grep, v1.0 A. Magee
  • 13. Regular Expressions Online Tools Helpful Tools The best way to understand the rest of this presentation is to see what is being matched live. Here are some online tools that work for our needs. RegExr - www.gskinner.com/RegExr beware Flash, but it works well regexpal - regexpal.com very simple reanimator - osteele.com/tools/reanimator beware Flash, recommend CS 4/570 first rubular - rubular.com nice on-page reference 9 / 16 Colloquium - grep, v1.0 A. Magee
  • 14. Regular Expressions Examples Your First RE Let’s skip trivial REs and get on to something useful. These may be more complex than you’re used to but the quicker you are able to read long, complex REs the better. This is a nice, but not perfect, email address matcher. [[:alnum:]][[:word:].%+-]*@(?:[[:alnum:]-]+.)+[[:alpha:]]{2,4} [[:alnum:]][[:word:].%+-]* Match a word that doesn’t start with [.%+-]. @(?:[[:alnum:]-]+.)+ Match the @ symbol and any number of subdomains followed by periods. [[:alpha:]]{2,4} Match the top level domain of 2, 3 or 4 characters. 10 / 16 Colloquium - grep, v1.0 A. Magee
  • 15. Regular Expressions Examples Your First RE - Part 2 Let’s examine the first part. [[:alnum:]][[:word:].%+-]* [[:alnum:]] - Must start with an alphanumeric character. NB: All [: ... :] classes must live in a set like [[: ... :]]. [[:word:].%+-] - Other characters maybe a ‘word’ character, a literal space, percent symbol, plus symbol or a dash. NB: The period must be escaped because it has special meaning. * - repeat the previous set zero or more times. 11 / 16 Colloquium - grep, v1.0 A. Magee
  • 18. Regular Expressions Examples Your First RE - Part 3
    Now the second part, the subdomains, sub-subdomains, etc.
      @(?:[[:alnum:]-]+\.)+
      @ - Well, that literally matches the 'at' character.
      The opening parenthesis denotes the beginning of a group.
      The ?: is a confusing notation that suppresses the creation of a back-reference. It is here so you'll know of it, but it is rarely needed. It is Perl-compatible syntax and is not part of POSIX extended REs.
      Again we see a special class for alphanumerics, but we've also included a dash.
      The plus symbol tells us to look for one or more of these characters, followed by an escaped (literal) period.
      And lastly we close the group, and the plus symbol now tells us to look for one or more of these groups.
  • 22. Regular Expressions Examples Your First RE - Part 4
    Finally the third part, the top level domain.
      [[:alpha:]]{2,4}
    Well, now this part is easy. Just match 2, 3 or 4 alphabetical characters.
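Putting the three parts together, the matcher can be run over some text with grep. This is a sketch rather than the slides' exact pattern: it is adapted to grep -E by replacing [[:word:]] with [[:alnum:]_] and the non-capturing (?: ... ) with a plain group, and it relies on the -o option (print only the matched text), which is common in GNU and BSD grep but is not required by POSIX. The sample text and variable names are invented for illustration.

```shell
# ERE adaptation of the email matcher (assumption: plain group instead of (?:...)).
email_re='[[:alnum:]][[:alnum:]_.%+-]*@([[:alnum:]-]+\.)+[[:alpha:]]{2,4}'

# Hypothetical input text.
text='Contact alice@example.com or bob.smith+lists@mail.cs.example.edu today.
No address on this line.
Broken: carol@localhost'

# -o prints each match on its own line instead of the whole matching line.
found=$(printf '%s\n' "$text" | grep -Eo "$email_re")
printf '%s\n' "$found"
# prints:
# alice@example.com
# bob.smith+lists@mail.cs.example.edu
```

carol@localhost is skipped because the pattern insists on at least one label followed by a period before the top level domain, which is one of the ways this matcher is "nice, but not perfect".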
  • 23. Regular Expressions Examples Your Second RE
    Now we'll look at an RE that can help us build a header file for a C program file, given that some neglectful programmer has failed to design his/her C program properly. This will be a quicker example.
      ^[\w\s]*\([\w\s\*&,]*\)\s*{
      ^[\w\s]*\(    At the beginning of a line, match some keywords and types and the function name, and then a literal opening parenthesis.
      [\w\s\*&,]*   Match some more words, keywords, variable modifiers and commas.
      \)\s*{        Finally match the closing parenthesis, some whitespace and the left curly brace, denoting the start of the function body.
  • 24. Regular Expressions Examples Your Second RE - Fine Details
      ^[\w\s]*\([\w\s\*&,]*\)\s*{
    In general, most RE tools will not match across multiple lines, even though the \s class matches the newline character. grep, for one, tests each line separately. This is very bothersome but is easily overcome by using pcregrep. PCRE stands for Perl Compatible Regular Expressions. This is all I will ever say about Perl.
    Notice that the literal * must be escaped like so: \*. So must the parentheses, due to their special RE meaning. Escaping so many characters is very annoying, but unfortunately it is necessary.
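As a rough illustration of the header-file idea, the pattern can be adapted to grep -E and combined with sed to turn matching definition lines into prototypes. Everything here is an assumption of the sketch: the \w and \s shorthands are replaced with POSIX classes (a backslash is literal inside a POSIX set), the brace is escaped for extended REs, the sed step is not from the slides, and the C source and variable names are made up.

```shell
# Hypothetical C source, held in a variable to keep the sketch self-contained.
src='#include <stdio.h>
int add(int a, int b) {
    return a + b;
}
static void log_msg(const char *msg){
    puts(msg);
}
int main(void)
{
    return add(1, 2);
}'

# ERE adaptation: [[:alnum:]_] for \w, [[:space:]] for \s, \{ for the brace.
def_re='^[[:alnum:]_[:space:]]*\([[:alnum:]_[:space:]*&,]*\)[[:space:]]*\{'

# Keep the definition lines, then swap the trailing brace for a semicolon.
protos=$(printf '%s\n' "$src" | grep -E "$def_re" | sed 's/[[:space:]]*{$/;/')
printf '%s\n' "$protos"
# prints:
# int add(int a, int b);
# static void log_msg(const char *msg);
```

main is missed because its brace sits on the next line and grep tests one line at a time, which is the multi-line limitation described above. Note also that the pattern is loose: a control statement such as `if (ready) {` fits it as well, so the output would need a quick review before being pasted into a header.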
  • 25. Appendix