Top down parsing(sid) (1)

Syntax Analysis
• Scanner and parser
– Scanner looks at the lower level part of the programming language
• Only the language for the tokens
– Parser looks at the higher lever part of the programming language
• E.g., if statement, functions
• The tokens are abstracted
• Parser = Syntax analyzer
– Input: sequence of tokens from lexical analysis
– Output: a parse tree of the program
• E.g., AST
– Process:
• Try to derive from a starting symbol to the input string (How?)
• Build the parse tree following the derivation

Top-Down Parsing
• Top down parsing
– Recursive descent parsing
– Making Recursive descent a Predictive parsing
algorithm
• Left recursion elimination
• Left factoring
– Predictive parsing
• First set and Follow set
• Parse table construction
• Parsing procedure
– LL grammars and languages
• LL(1) grammar

Predictive parsing
A predictive parser is an efficient way of implementing recursive-descent
parsing by handling the stack of activation records explicitly.

First set and Follow set
The construction of a predictive parser is aided by two functions associated with
a grammar G. These functions, FIRST and FOLLOW, allow us to fill in the proper entries of a
predictive parsing table for G, if such a parsing table for G exist.
First :
If a is any string of grammar symbols, let FIRST(a) be the set of terminals that
begin the strings derived from a. If a ⇒ ε then e is also in FIRST(a).
Follow :
Define FOLLOW(A), for nonterminal A, to be the set of terminals a that can
appear immediately to the right of A in some sentential form, that is, the set of terminals a
such that there exists a derivation of the form S ⇒ αAa β for some a and b. Note that there
may, at some time during the derivation, have been symbols between A and a, but if so,
they derived ε and disappeared. If A can be the rightmost symbol in some sentential form,
then $, representing the input right endmarker, is in FOLLOW(A).

First
Rules for Compute First set:
1. If X is terminal, then FIRST (X) is {X}.
2. If X is nonterminal and X  aα is a production, then add a to
FIRST (X). If X  ε is a production, then add ε to FIRST (X).
3. If XY1Y2…..Yk is a production, then for all i such that all of
Y1,Y2…..,Yi-1 are nonterminals and FIRST (Yj) contains ε for
j=1,2,…,i-1 (i.e. Y1Y2…. Yi-1 ⇒ ε), add every non-ε symbol in
FIRST (Yi) to FIRST(X). If ε is in FIRST (Yj) for all j=1, 2… k, then
add ε to FIRST(X).

Follow
Rules for Compute Follow set:
1. $ is in FOLLOW(S), where S is the starting symbol.
2. If there is a production A  αBβ, β ≠ ε, then everything is in
FIRST (Β) except for ε is placed in FOLLOW (B).
3. If there is a production A  αB, or a production A  αBβ,
where FIRST (Β) contains ε (i.e. β ⇒ ε), then everything in
FOLLOW (A) is in FOLLOW (B).

Example
Grammar
E TE’
E’ +TE’ |
T FT’
T’ *FT’ |
F (E) | id | num
Note : This grammar is non left-recursive type so,
we get Both FIRST and FOLLOW sets

First
Grammar
E TE’
E’ +TE’ |
T FT’
T’ *FT’ |
F (E) | id | num
First set
FIRST (E) = { } FIRST (+) = { + } FIRST (id) = { id }
FIRST(E') = { } FIRST (*) = { * } FIRST (num) = { num }
FIRST(T) = { } FIRST (‘(‘) = { ( }
FIRST(T') = { } FIRST (‘)’) = { ) }
FIRST( F ) = { } FIRST ( ) = { }
The First thing we do is Add terminal
itself in its FIRST set by applying 1st
rule of FIRST set

First
Grammar
E TE’
E’ +TE’ |
T FT’
T’ *FT’ |
F (E) | id | num
First set
FIRST (E) = { } FIRST (+) = { + } FIRST (id) = { id }
FIRST(E') = { + , ε } FIRST (*) = { * } FIRST (num) = { num }
FIRST(T) = { } FIRST (‘(‘) = { ( }
FIRST(T') = { * , ε } FIRST (‘)’) = { ) }
FIRST( F ) = { ( , id , num } FIRST ( ) = { }
Next, we apply rule 2 i.e. for every
production X  aα add a to First (X)
Rule 2 also says that for every
production X  ε add ε to First (X)

First
Grammar
E TE’
E’ +TE’ |
T FT’
T’ *FT’ |
F (E) | id | num
First set
FIRST (E) = { ( , id , num } FIRST (+) = { + } FIRST (id) = { id }
FIRST(E') = { + , ε } FIRST (*) = { * } FIRST (num) = { num }
FIRST(T) = { ( , id , num } FIRST (‘(‘) = { ( }
FIRST(T') = { * , ε } FIRST (‘)’) = { ) }
FIRST( F ) = { ( , id , num } FIRST ( ) = { }
For Production T FT’ 3rd rule says that
everything in FIRST (F) is into FIRST (T).
(since production is like XY1Y2…Yk and
FIRST (Y1) not contain ε )
Similarly For Production E TE’ 3rd rule says
that everything in FIRST (F) is into FIRST (T).

Follow
Grammar
E TE’
E’ +TE’ |
T FT’
T’ *FT’ |
F (E) | id | num
The First thing we do is Add $ to the
start Symbol nonterminal “E”
Follow set
FOLLOW(E) = { $ }
FOLLOW(E') = { }
FOLLOW(T) = { }
FOLLOW(T') = { }
FOLLOW( F ) = { }

Follow
Grammar
E TE’
E’ +TE’ |
T FT’
T’ *FT’ |
F (E) | id | num
Follow set
FOLLOW(E) = { $ , ) }
FOLLOW(E') = { }
FOLLOW(T) = { + }
FOLLOW(T') = { }
FOLLOW( F ) = { * }
Next we get this productions on which
we apply rule 2we apply rule 2 to E' →+TE' This says that
everything in First(E') except for ε should be
in Follow(T)Similarly we can apply this rule 2 to other
two productions T' → *FT‘ and F → (E)
which give Follow(F) & Follow(E)
respectively

Follow
Grammar
E TE’
E’ +TE’ |
T FT’
T’ *FT’ |
F (E) | id | num
Follow set
FOLLOW(E) = { $ , ) }
FOLLOW(E') = { $ , ) }
FOLLOW(T) = { + , $ , ) }
FOLLOW(T') = { + , $ , ) }
FOLLOW(F) = { * , + , $ , ) }
Now applying 3rd rule on this
production we can say everything
in FOLLOW (E) is into FOLLOW (E’).
(since here β = ε )
Now applying 3rd rule on this
production we can say everything
in FOLLOW (E’) is into FOLLOW (T).
(since here β ⇒ ε )
Same here everything in
FOLLOW (T) is into FOLLOW (T’).
(since here β = ε )
Same here everything in
FOLLOW (T’) is into FOLLOW (F).
(since here β ⇒ ε )

Parse table
 Construct a parse table M[N, T {$}]
 Non-terminals in the rows and terminals in the columns
 For each production A
 For each terminal a First( )
add A to M[A, a]
 Meaning: When at A and seeing input a, A should be used
 If First( ) then for each terminal a Follow(A)
add A to M[A, a]
 Meaning: When at A and seeing input a, A should be used
• In order to continue expansion to
• X AC A B B b | C cc
 If First( ) and $ Follow(A)
add A to M[A, $]
 Same as the above

Construct a Parse Table
Grammar
E TE’
E’ +TE’ |
T FT’
T’ *FT’ |
F (E) | id | num
First(*) = {*}
First(F) = {(, id, num}
First(T’) = {*, }
First(T) {(, id, num}
First(E’) = {+, }
First(E) {(, id, num}
Follow(E) = {$, )}
Follow(E’) = {$, )}
Follow(T) = {$, ), +}
Follow(T) = {$, ), +}
Follow(T’) = {$, ), +}
Follow(F) = {*, $, ), +}
E TE’: First(TE’) = {(, id, num}E’ +TE’: First(+TE’) = {+}E’ : Follow(E’) = {$,)}T FT’: First(FT’) = {(, id, num}T’ *FT’: First(*FT’) = {*}T’ : Follow(T’) = {$, ), +}
id num * + ( ) $
E E TE’ E TE’ E TE’
E’ E’ +TE’ E’ E’
T T FT’ T FT’ T FT’
T’ T’ *FT’ T’ T’ T’
F F id F num F (E)

 Now we can have a predictive parsing mechanism
 Use a stack to keep track of the expanded form
 Initialization
 Put starting symbol S and $ into the stack
 Add $ to the end of the input string
 $ is for the recognition of the termination configuration
 If ‘a’ is at the top of the stack and ‘a’ is the next input symbol then
 Simply pop ‘a’ from stack and advance on the input string
 If ‘A’ is on top of the stack and ‘a’ is the next input symbol then
 Assume that M [A, a] = A
 Replace A by in the stack
 Termination
 When only $ in the stack and in the input string
 If ‘A’ is on top of the stack and ‘a’ is the next input but
 M[A, a] = empty
 Error
Parsing procedure

id num * + ( ) $
E E TE’ E TE’ E TE’
E’ E’ +TE’ E’ E’
T T FT’ T FT’ T FT’
T’ T’ *FT’ T’ T’ T’
F F id F num F (E)
Stack Input Action
E $ id + num * id $ E TE’
T E’ $ id + num * id $ T FT’
F T’ E’ $ id + num * id $ F id
T’ E’ $ + num * id $ T’
E’ $ + num * id $ E’ +TE’
T E’ $ num * id $ T FT’
F T’ E’ $ num * id $ F num
T’ E’ $ * id $ T’ *FT’
F T’ E’ $ id $ F id
T’ E’ $ $ T’
E’ $ $ E’
$ $ Accept
ERROR
Example of Predictive Parsing:
id + num * id

Top down parsing(sid) (1)

Recomendados

Recomendados

Mais conteúdo relacionado

Mais procurados

Mais procurados (20)

Semelhante a Top down parsing(sid) (1)

Semelhante a Top down parsing(sid) (1) (20)

Último

Último (20)

Top down parsing(sid) (1)