SlideShare uma empresa Scribd logo
1 de 12
• Lex -- a Lexical Analyzer Generator (by
M.E. Lesk and Eric. Schmidt)
– Given tokens specified as regular expressions,
Lex automatically generates a routine (yylex)
that recognizes the tokens (and performs
corresponding actions).
• Lex source program
{definition}
%%
{rules}
%%
{user subroutines}
Rules: <regular expression> <action>
Each regular expression specifies a token.
Default action for anything that is not matched: copy to the
output
Action: C/C++ code fragment specifying what to do when
a token is recognized.
• lex program examples: ex1.l and ex2.l
– ‘lex ex1.l’ produces the lex.yy.c file that
contains a routine yylex().
• The int yylex() routine is the scanner that finds all
the regular expressions specified.
– yylex() returns a non-zero value (usually token id)
normally.
– yylex() returns 0 when end of file is reached.
– Need a drive to test the routine. Main.cpp is an
example.
• You need to have a yywrap() function in the lex file
(return 1), see the function in ex1.l.
– Something to do with compiling multiple files.
• Lex regular expression: contains text
characters and operators.
– Letters of alphabet and digits are always text
characters.
• Regular expression integer matches the string
“integer”
– Operators: “[]^-?.*+|()$/{}%<>
• When these characters happen in a regular
expression, they have special meanings
– Lex regular expressions cannot have space in
them!!!
– operators (characters that have special meanings):
“[]^-?.*+|()$/{}%<>
• ‘*’, ‘+’, ‘|’, ‘(‘,’)’ -- used in regular expression
• ‘ “ ‘ -- any character in between quote is a text character.
– E.g.: “xyz++” == xyz”++”
• ‘’ -- escape character,
– To get the operators back: “xyz++” == ??
– To specify special characters: 40 == “ “
• ‘[‘ and ‘]’ -- used to specify a set of characters
– e.g: [a-z], [a-zA-Z],
– Every character in it except ^, - and  is a text character
– [-+0-9], [40-176]
• ‘^’ -- not, used as the first character after the left bracket
– E.g [^abc] – any character except ‘a’, ‘b’ or ‘c’.
– [^a-zA-Z] -- ??
– operators (characters that have special
meanings): “[]^-?.*+|()$/{}%<>
• ‘.’ -- every character
• ‘?’ -- optional ab?c matches ‘ac’ or ‘abc’
• ‘/’ -- used in character lookahead:
– e.g. ab/cd -- matches ab only if it is followed by cd
• ‘{‘’}’ -- enclose a regular definition
• ‘%’ -- has special meaning in lex
• ‘$’ -- match the end of a line, ‘^’ -- match the
beginning of a line
– ab$ == ab/n
• ‘<‘ ‘>’: start condition (more context sensitivity
support, see the paper for details).
– Order of pattern matching:
• Always matches the longest pattern.
• When multiple patterns matches, use the first pattern.
– To override, add “REJECT” in the action.
...
%%
Ab {printf(“rule 1n”);}
Abc {printf(“rule 2n”);}
{letter}{letter|digit}* {printf(“rule 3n”);}
%%
Input: Abc
What happened when at ‘.*’ as a pattern?
Should regular expressions for reserved words happen before or
after the regular expression for identifier?
– Manipulate the lexeme and/or the input stream:
• yytext -- a char pointer pointing to the matched C string
• yyleng -- the length of the matched string
• I/O routines to manipulate the input stream:
– yyinput() -- get a character from the input character, return <=0
when reaching the end of the input stream, the character
otherwise
» yyinput() for c++, input() for c.
– unput( c ) -- put c back onto the input stream
– Deal with comments: (/* ….. */
» Why is pattern “/*”.*”*/” a problem?
%%
…
“/*” {char c1;
c2 = yyinput();
if (c2 <=0) {lex_error(“unfinished comment” …}
else { c1 = c2; c2 = yyinput();
while (((c1!=‘*’) || (c2 != ‘/’)) && (c2 > 0)) {c1 = c2; c2 = yyinput();}
if (c2 <= 0) {lex_error( ….)
}
– Reporting errors:
• What kind of errors? Not too many.
– Characters that cannot lead to a token
– unended comments (can we do it in later phases?)
– unended string constants.
– Reporting errors:
• How to keep track of current position (which line, which
column)?
– Use to global variable for this: yyline, yycolumn
%{
int yyline = 1, yycolumn = 1;
%}
...
%%
“n” {yyline++; yycolumn = 0;}
[ t]+ {/* do nothing*/ yycolumn += yyleng;}
If {yycolumn += yyleng; return (IFNumber);}
“+” {yycolumn += 1; return (PLUSNumber);}
{letter}{letter|digit}* {yylval = idtable_insert(yytext); yycolumn += yyleng;
return(IDNumber);}
...
%%
• Dealing with identifiers, string constants.
– Data structures:
• Put the lexeme in a string table: e.g. vector of C strings.
• See token.l
– Recognizing constant strings with special characters
• Assuming string cannot pass line boundary.
• Use yymore()
“[^”n]* {char c;
c = yyinput();
if (c != ‘”’) error
else if (yytext[yyleng-1] == ‘’) {
unput( c ); yymore();
} else {/* find the whole string, normal process*/}
Put it all together
• Checkout token.l and main.cpp in lex2.tar

Mais conteúdo relacionado

Semelhante a Generate Lexical Analyzers with Lex

Continuation Passing Style and Macros in Clojure - Jan 2012
Continuation Passing Style and Macros in Clojure - Jan 2012Continuation Passing Style and Macros in Clojure - Jan 2012
Continuation Passing Style and Macros in Clojure - Jan 2012Leonardo Borges
 
Lex (lexical analyzer)
Lex (lexical analyzer)Lex (lexical analyzer)
Lex (lexical analyzer)Sami Said
 
Compiler Construction | Lecture 13 | Code Generation
Compiler Construction | Lecture 13 | Code GenerationCompiler Construction | Lecture 13 | Code Generation
Compiler Construction | Lecture 13 | Code GenerationEelco Visser
 
Regular expressions
Regular expressionsRegular expressions
Regular expressionsEran Zimbler
 
Programming in C Basics
Programming in C BasicsProgramming in C Basics
Programming in C BasicsBharat Kalia
 
Lecture 1 - Lexical Analysis.ppt
Lecture 1 - Lexical Analysis.pptLecture 1 - Lexical Analysis.ppt
Lecture 1 - Lexical Analysis.pptNderituGichuki1
 
Denis Lebedev, Swift
Denis  Lebedev, SwiftDenis  Lebedev, Swift
Denis Lebedev, SwiftYandex
 
Hacking Go Compiler Internals / GoCon 2014 Autumn
Hacking Go Compiler Internals / GoCon 2014 AutumnHacking Go Compiler Internals / GoCon 2014 Autumn
Hacking Go Compiler Internals / GoCon 2014 AutumnMoriyoshi Koizumi
 
Lex and Yacc ppt
Lex and Yacc pptLex and Yacc ppt
Lex and Yacc pptpssraikar
 
Sergi Álvarez & Roi Martín - Radare2 Preview [RootedCON 2010]
Sergi Álvarez & Roi Martín - Radare2 Preview [RootedCON 2010]Sergi Álvarez & Roi Martín - Radare2 Preview [RootedCON 2010]
Sergi Álvarez & Roi Martín - Radare2 Preview [RootedCON 2010]RootedCON
 
Python language data types
Python language data typesPython language data types
Python language data typesHarry Potter
 
Python language data types
Python language data typesPython language data types
Python language data typesHoang Nguyen
 
Python language data types
Python language data typesPython language data types
Python language data typesLuis Goldster
 

Semelhante a Generate Lexical Analyzers with Lex (20)

Continuation Passing Style and Macros in Clojure - Jan 2012
Continuation Passing Style and Macros in Clojure - Jan 2012Continuation Passing Style and Macros in Clojure - Jan 2012
Continuation Passing Style and Macros in Clojure - Jan 2012
 
Lex (lexical analyzer)
Lex (lexical analyzer)Lex (lexical analyzer)
Lex (lexical analyzer)
 
7645347.ppt
7645347.ppt7645347.ppt
7645347.ppt
 
Compiler Construction | Lecture 13 | Code Generation
Compiler Construction | Lecture 13 | Code GenerationCompiler Construction | Lecture 13 | Code Generation
Compiler Construction | Lecture 13 | Code Generation
 
Regular expressions
Regular expressionsRegular expressions
Regular expressions
 
Should i Go there
Should i Go thereShould i Go there
Should i Go there
 
Cbasic
CbasicCbasic
Cbasic
 
Programming in C Basics
Programming in C BasicsProgramming in C Basics
Programming in C Basics
 
Lecture 1 - Lexical Analysis.ppt
Lecture 1 - Lexical Analysis.pptLecture 1 - Lexical Analysis.ppt
Lecture 1 - Lexical Analysis.ppt
 
Denis Lebedev, Swift
Denis  Lebedev, SwiftDenis  Lebedev, Swift
Denis Lebedev, Swift
 
Hacking Go Compiler Internals / GoCon 2014 Autumn
Hacking Go Compiler Internals / GoCon 2014 AutumnHacking Go Compiler Internals / GoCon 2014 Autumn
Hacking Go Compiler Internals / GoCon 2014 Autumn
 
LEX & YACC TOOL
LEX & YACC TOOLLEX & YACC TOOL
LEX & YACC TOOL
 
Lexical
LexicalLexical
Lexical
 
Lex and Yacc ppt
Lex and Yacc pptLex and Yacc ppt
Lex and Yacc ppt
 
Sergi Álvarez & Roi Martín - Radare2 Preview [RootedCON 2010]
Sergi Álvarez & Roi Martín - Radare2 Preview [RootedCON 2010]Sergi Álvarez & Roi Martín - Radare2 Preview [RootedCON 2010]
Sergi Álvarez & Roi Martín - Radare2 Preview [RootedCON 2010]
 
Lexicalanalyzer
LexicalanalyzerLexicalanalyzer
Lexicalanalyzer
 
Lexicalanalyzer
LexicalanalyzerLexicalanalyzer
Lexicalanalyzer
 
Python language data types
Python language data typesPython language data types
Python language data types
 
Python language data types
Python language data typesPython language data types
Python language data types
 
Python language data types
Python language data typesPython language data types
Python language data types
 

Último

Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Call Girls in Nagpur High Profile
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINESIVASHANKAR N
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escortsranjana rawat
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSKurinjimalarL3
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escortsranjana rawat
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxupamatechverse
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...Soham Mondal
 
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordCCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordAsst.prof M.Gokilavani
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...ranjana rawat
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Dr.Costas Sachpazis
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxupamatechverse
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Christo Ananth
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)Suman Mia
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Christo Ananth
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college projectTonystark477637
 

Último (20)

Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptx
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
 
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordCCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
 
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINEDJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptx
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college project
 

Generate Lexical Analyzers with Lex

  • 1. • Lex -- a Lexical Analyzer Generator (by M.E. Lesk and Eric. Schmidt) – Given tokens specified as regular expressions, Lex automatically generates a routine (yylex) that recognizes the tokens (and performs corresponding actions).
  • 2. • Lex source program {definition} %% {rules} %% {user subroutines} Rules: <regular expression> <action> Each regular expression specifies a token. Default action for anything that is not matched: copy to the output Action: C/C++ code fragment specifying what to do when a token is recognized.
  • 3. • lex program examples: ex1.l and ex2.l – ‘lex ex1.l’ produces the lex.yy.c file that contains a routine yylex(). • The int yylex() routine is the scanner that finds all the regular expressions specified. – yylex() returns a non-zero value (usually token id) normally. – yylex() returns 0 when end of file is reached. – Need a drive to test the routine. Main.cpp is an example. • You need to have a yywrap() function in the lex file (return 1), see the function in ex1.l. – Something to do with compiling multiple files.
  • 4. • Lex regular expression: contains text characters and operators. – Letters of alphabet and digits are always text characters. • Regular expression integer matches the string “integer” – Operators: “[]^-?.*+|()$/{}%<> • When these characters happen in a regular expression, they have special meanings – Lex regular expressions cannot have space in them!!!
  • 5. – operators (characters that have special meanings): “[]^-?.*+|()$/{}%<> • ‘*’, ‘+’, ‘|’, ‘(‘,’)’ -- used in regular expression • ‘ “ ‘ -- any character in between quote is a text character. – E.g.: “xyz++” == xyz”++” • ‘’ -- escape character, – To get the operators back: “xyz++” == ?? – To specify special characters: 40 == “ “ • ‘[‘ and ‘]’ -- used to specify a set of characters – e.g: [a-z], [a-zA-Z], – Every character in it except ^, - and is a text character – [-+0-9], [40-176] • ‘^’ -- not, used as the first character after the left bracket – E.g [^abc] – any character except ‘a’, ‘b’ or ‘c’. – [^a-zA-Z] -- ??
  • 6. – operators (characters that have special meanings): “[]^-?.*+|()$/{}%<> • ‘.’ -- every character • ‘?’ -- optional ab?c matches ‘ac’ or ‘abc’ • ‘/’ -- used in character lookahead: – e.g. ab/cd -- matches ab only if it is followed by cd • ‘{‘’}’ -- enclose a regular definition • ‘%’ -- has special meaning in lex • ‘$’ -- match the end of a line, ‘^’ -- match the beginning of a line – ab$ == ab/n • ‘<‘ ‘>’: start condition (more context sensitivity support, see the paper for details).
  • 7. – Order of pattern matching: • Always matches the longest pattern. • When multiple patterns matches, use the first pattern. – To override, add “REJECT” in the action. ... %% Ab {printf(“rule 1n”);} Abc {printf(“rule 2n”);} {letter}{letter|digit}* {printf(“rule 3n”);} %% Input: Abc What happened when at ‘.*’ as a pattern? Should regular expressions for reserved words happen before or after the regular expression for identifier?
  • 8. – Manipulate the lexeme and/or the input stream: • yytext -- a char pointer pointing to the matched C string • yyleng -- the length of the matched string • I/O routines to manipulate the input stream: – yyinput() -- get a character from the input character, return <=0 when reaching the end of the input stream, the character otherwise » yyinput() for c++, input() for c. – unput( c ) -- put c back onto the input stream – Deal with comments: (/* ….. */ » Why is pattern “/*”.*”*/” a problem? %% … “/*” {char c1; c2 = yyinput(); if (c2 <=0) {lex_error(“unfinished comment” …} else { c1 = c2; c2 = yyinput(); while (((c1!=‘*’) || (c2 != ‘/’)) && (c2 > 0)) {c1 = c2; c2 = yyinput();} if (c2 <= 0) {lex_error( ….) }
  • 9. – Reporting errors: • What kind of errors? Not too many. – Characters that cannot lead to a token – unended comments (can we do it in later phases?) – unended string constants.
  • 10. – Reporting errors: • How to keep track of current position (which line, which column)? – Use to global variable for this: yyline, yycolumn %{ int yyline = 1, yycolumn = 1; %} ... %% “n” {yyline++; yycolumn = 0;} [ t]+ {/* do nothing*/ yycolumn += yyleng;} If {yycolumn += yyleng; return (IFNumber);} “+” {yycolumn += 1; return (PLUSNumber);} {letter}{letter|digit}* {yylval = idtable_insert(yytext); yycolumn += yyleng; return(IDNumber);} ... %%
  • 11. • Dealing with identifiers, string constants. – Data structures: • Put the lexeme in a string table: e.g. vector of C strings. • See token.l – Recognizing constant strings with special characters • Assuming string cannot pass line boundary. • Use yymore() “[^”n]* {char c; c = yyinput(); if (c != ‘”’) error else if (yytext[yyleng-1] == ‘’) { unput( c ); yymore(); } else {/* find the whole string, normal process*/}
  • 12. Put it all together • Checkout token.l and main.cpp in lex2.tar