SlideShare uma empresa Scribd logo
1 de 52
Baixar para ler offline
Ch-three
Regular languages and regular
grammar
Chapter programme
Regular Language
Regular Expressions
Grammar
Regular Grammar
Introduction(prelude)
What is language?
May refer either to the (human) capacity for acquiring and using
complex systems of communication, or to a specific instance of such
a system of complex communication.
...and a formal language?
A language is formal if it is provided with a mathematically rigorous
representation of its:
Alphabet of symbols, and
Formation rules specifying which strings of symbols count as
well-formed.
Describing formal languages
Generative approach: A language is
the set of strings generated by a
grammar.
Generation process:
Start symbol
Expand with rewrite rules.
Stop when a word of the language
is generated.
A grammar generates regular
language regular grammar.
Recognition approach: A language is
the set of strings accepted by an
automaton.
Recognition process:
Start in initial state.
Transitions to other states guided
by the string symbols.
Until read whole string and reach
accept/reject state.
Linguistics and formal language
Two application settings: Formal languages are studied in linguistics and computer
science.
Linguistics
In linguistics, formal languages are used for the scientific study of human language.
Linguists privilege a generative approach, as they are interested in defining a
(finite) set of rules stating the grammar based on which any reasonable sentence in
the language can be constructed.
A grammar does not describe the meaning of the sentences or what can be done
with them in whatever context __ but only their form.
Computer science and formal
languages
Computer science
In computer science, formal languages are used for the precise definition of
programming languages and, therefore, in the development of compilers.
A compiler is a computer program (or set of programs) that transforms source code
written in a programming language (the source language) into another computer
language (the target language).
Computer scientists privilege a recognition approach based on abstract machines
(automata) that take in input a sentence and decide whether it belongs to the
reference language.
Grammars and formal languages
Chomsky
Noam Chomsky (1928) is an American linguist, philosopher, cognitive scientist,
historian, and activist.
In Syntactic Structures (1957), Chomsky models knowledge of language using a
formal grammar, by claiming that formal grammars can explain the ability of a
hearer/speaker to produce and interpret an infinite number of sentences with a
limited set of grammatical rules and a finite set of terms.
The human brain contains a limited set of rules for organizing language, known as
Universal Grammar. The basic rules of grammar are hard-wired into the brain, and
manifest themselves without being taught.
Conti…
Chomsky proposed a hierarchy that partitions formal grammars into classes with
increasing expressive power, i.e. each successive class can generate a broader set of
formal languages than the one before.
Interestingly, modeling some aspects of human language requires a more complex
formal grammar (as measured by the Chomsky hierarchy) than modeling others.
Chomsky Classification of Grammars
Grammar Type Grammar
Accepted
(generator)
Language
Accepted
Automaton
(recognizers )
Type 0
Unrestricted
grammar
Recursively
enumerable
language
Turing Machine
Type 1
Context-sensitive
grammar
Context-sensitive
language
Linear-bounded
automaton
Type 2
Context-free
grammar
Context-free
language
Pushdown
automaton
Type 3 Regular grammar Regular language
Finite state
automaton
Chomsky hierarchy
It shows the scope of each type of grammar −
Automata and formal language
Stephen Kleene introduced finite automata in the 50s: abstract state machines equipped
with a finite memory. He demonstrated the equivalence between such a model and the
description of sequences of symbols using only three logical primitives (set union, set
product, and iteration).
Alan Turing (and, independently, Emil Post and John Backus a) put the ideas
underlying the notion of pushdown automata: abstract state machines with an
unbounded memory that is accessible only through a restricted mode (called a stack).
In 1936, Alan Turing described the Turing Machine: an abstract state machine
equipped with an infinite memory in the form of a strip of tape. Turing machines
simulate the logic of any computer algorithm and recognize the larger class of formal
languages.
Automata and grammar
Already we have seen Formal languages and its definition and basic notions
Which class of formal languages is recognized by a given type of
automata?__Chomsky hierarchy.
Languages described by deterministic finite acceptors (DFAs) are called regular
languages.
For any nondeterministic finite acceptor (NFA) we can find an equivalent DFA.
Thus, NFAs also describe regular languages.
Regular language
Conti…
A language L is said to be regular if there is a DFA (actually any FA ) M such that L
= L(M). Sometimes we say that M recognizes L.
The set in a regular language is a regular set.
Now we can use regular set and regular language interchangeable.
In fact most languages are not regular!.
The class of regular languages is the smallest class of languages in a hierarchy of
classes that we will consider.
To explicitly give an example of a language that is not regular though, we will need
proof technique the pumping lemma.
Regular languages form a simple but important family of formal languages.
Conti…
They can be specified in several equivalent ways:
Using FA, already have seen
Using regular expressions, see next
Using regular grammars, see next
Regular expression(REX)
An FA (NFA or DFA) is a “blueprint" for constructing a machine recognizing a
regular language.
A regular expression ,RE is a “user-friendly," declarative way of describing a
regular language.
A Regular Expression (RE) is a string of symbols that describes a regular
Language.
An algebraic-expression (notation) for describing (some) languages
Represents regular languages or regular sets A language is regular IFF some
REX describes it.
Languages are nothing, but are basically set of strings.
Each REX represents set of strings __set of strings are regular sets.
cont…
For every regular set there is a corresponding REX.
Given a regular expression E, the language it describes is l(E)={…… set of strings….)
R is a regular expression over alphabet exactly if R is one of the following:
a, for some a in Σ,
ε,
∅ ,
( R1 + R2 ), where R1 and R2 are regular expressions,
R1 ° R2 ), where R1 and R2 are regular expressions,
, where R1 is a regular expression, and

1R
*
1R
Conti…
Operators of REX represents operations on regular languages.
Regular language operations are operations to generate regular language (RL) from
other RL( union , concatenation, power, intersection, kleene closure and plus… ).
Since RLs are set, we use set operations (example union, concatenation, kleene closure).
For every RL operation there is corresponding REX operator.
Star, *
Dot, 
Plus, +
Operations on RLs
Examples:
L1={a, b} and L2={c, d}, then
union L1 U L2={a, b, c, d}.
concatenation L1L2={ac, ad, bc, bd}.
kleene closure L1*( the strings of L1 kleene closure)={, a, ab, aba,…}.
plus closure L1+ (plus closure of L1= includes all strings of L1 kleene closure
except .
Conti…
Now look at the REXs for what type of regular set(or RL) does represent.
REX Corresponding regular set or
RL
1.  {}
2.  { }
3. a where a {a}
4. r1 and r2 R1 and R2 respectively
5. r1+r2 R1 U R2
6. r1r2 R1R2 the concatenation
7. r1* R1* kleene closure
8. r1+ R1+ plus closure
Conti…
Suppose a, b, c, and d are   and are REXs, then the corresponding regular sets
are {a},{b}, {c} and {d} respectively.
Now find the regular sets for the following REXs
REXs Regular set or RLs
( a + b ) {a} U {b} = { a, b }
ab
(a +b )( c + d )
(a +b ) c ( c + d )
a*
(ab)*
(a+b)*
Order of precedence for operators:
(Precedence Rules)
1. Star
2. Dot
3. Plus
The star (*) operation has the highest precedence.
The concatenation ( . ) operation is second on the preference order.
The union (+) operation is the least preferred.
Parentheses can be omitted using these rules.
Example:
01 +1 is grouped (0(1))+1
a+bb*a can be parenthesized as a+(b(b*))a
The operations + and . in a regular expression satisfy the distributive law: For any
regular expressions r, s and t,
r(s + t) = rs + rt,
(r + s)t = rt + st.
Conti…
If L(r) = L(s) language represented by
r = language represented by s then r = s,
the two regular expressions are
equivalent, i.e. r=s.
Example:
r = (a+b)*{ε, a, b, aa, ab, ba, bb, aaa, aab, ….}
s= (a*b*)*{ε, a, b, aa, ab, ba, bb, aaa, aab, ….}
Both represent the same language,
hence r=s.
REX simplification
Simplification rules:
(+R)* = r*
(+r)r* = r*
R+rs* = rs*
+r+r* = r*
R = R =   (annihilation)
+R = R+ = R  (identity)
There are also many properties of REX
that are proven
Equivalence of FA's and REX
Regular expressions and finite automata are equivalent in their descriptive power.
This fact is expressed in the following theorem:
Theorem
A set is regular if and only if it can be described by a regular expression.
We know REX represent RLs accepted by FA
So, there is a FA accept the RL represented by the RE.
For each REX FA(equivalent to the REX) accepts the RL represented by the REX
We have already shown that DFA's, NFA's, and -NFA's are all equivalent.
Now see,
Conti…
To show FA's equivalent to REX we need to establish that
1. For every DFA A we can find (construct, in this case) a REX R, L(R) = L(A).
2. For every REX R there is an -NFAA, L(A) = L(R).
Conti…
We have a regular expression, call R,
Convert R to -NFA,
Conclude that the language must be regular accepted by some FA
RE to -NFA
For every REX R there is an equivalent -NFAA, L(A) = L(R) .
Automata for ,
, and
a
Conti…
automata for R+S automata for RS
Automata for R*
Conti…
First construct a FA to the basic (primitive) REXs
Then extend this method to any REXs
Example :
AutomatonFinitetoconvert )10(1)10( *

What is grammar
We have learned two ways of characterizing regular languages: REXs and FA
Grammar is another way of characterizing them.
A set of rewrite (production) rules , used to generate strings by successively
rewriting symbols.
Example: a language L is set of strings of symbol ‘a’ except empty string, {a, aa,
aaa, …}
Generate the strings of L by the following procedure.
Star the process from a symbol s, and rewrite s using one of the following rules.
S -> a , and s -> as , meaning s can be rewritten as either a (the first production) or
as (the last production).
Purpose: To specify how to form grammatically correct strings in the language the
grammar represents
Conti…
To generate the string aa for example,
Start with S and apply the second rule to replace S with the right hand side of the
rule, i.e. as (sentential form), to obtain as.
Then apply the first rule to as to rewrite s as a. That gives us aa (sentence).
S => as, to express that as is obtained from s by applying a single production.
Thus is written as
S => as => aa, the process of obtaining aa from s  derivation.
If we are not interested in the intermediate steps, the fact that aa is obtained from s
is written as s =>* aa .
In general if a string  is obtained from a string  by applying productions of a
grammar G, we write ( is derived from ).
Simply write =>* .( it is clear that we are using a grammar)
 
*
G
Formal definition
A grammar G is a mathematical object of four components.
General form for production rules , generates
The left hand side contains at least one symbol v
The right hand side is a string of
G=(V, T, S, P) , where
V is a nonempty, finite set of non-terminal symbols,
T is a finite set of terminal symbols , or alphabet, symbols,
S ∈ V is the start symbol
P is a finite set of rules(productions), the first rule uses the start symbol.
V and T are disjoint.
 :P  
*
)( TV 
General
form
The language of G
The set of all strings that can derived from the grammar.
The language generated by G,
All strings in that can be produced from the start symbol by applying rules of
the grammar are in the language .
All others are not.
}|{)( **
wSTwGL 
*
T
derivation
For a grammar a derivation of a string is a sequence of grammar rule applications
that transforms the start symbol in to the string
Proves the string is belongs to the grammar’s language.
Left most derivation
Replace the left most non-terminal by one of the production rules
Right most derivation
Replace the right most non- terminal by one of the production rules
Both generates the same string in the language.
Parse tree(parsing, derivation, syntax tree)
A tree representation of a derivation.
Start symbol is root node.
Non- terminals are interior nodes.
Terminals are leaf nodes.
Represents the syntactic structure of a string according to some form of the
abstract syntax trees used in computer programming, Syntax tree.
A tree representation of the abstract syntactic structure of source code written in a
programming language.
Built by parsers during the source code translation and compiling process.
Regular grammars
Generates a regular language.( To prove, convert RGFA. Since we have FA for the
language then it is regular.)
– Small number in formal language families.
Are linear.
– At most one non -terminal at the right hand side of the production rule.
Right – linear
– Left hand side of P is single non-terminal.
– Right hand side of P is a terminal followed by a single non-terminal or
– A single terminal followed by a single non-terminal __ strictly right -linear
Left – linear
– Left hand side = right linear
– Right hand side is a single non-terminal followed by a terminal or
– A single non-terminal followed by a single terminal __ strictly left- linear
Are left –linear or right – linear.
If right and left linear grammars are mixed in single grammar the result will not generate
necessarily a regular language. (Assignment? give example).
Chapter four
Context free languages and grammars
chapter plan
context-free language
context-free grammar
parsing and ambiguity
context-free grammars and programming
languages
Conti…
Context free languages (CFL)
– More general than regular languages.
– Languages in programming languages.
– Recognized by automaton __ PDA.
– Generated by type 2 grammars.
Context free grammars (CFG)
– Type 2 grammars
– Generates CFL
– Powerful than regular grammars
– Left hand side of P is a single non-terminal.
– Right hand side of P is anything (no restriction)
For a given CFG G, if w is in L(G) then there is a left most and right most
derivation of w.
 *
VT 
Conti…
Context-free grammars are powerful enough to describe the syntax of most
programming languages; in fact, the syntax of most programming languages is
specified using context-free grammars.
On the other hand, context-free grammars are simple enough to allow the
construction of efficient parsing algorithms which, for a given string, determine
whether and how it can be generated from the grammar.
Parsing and ambiguity
Parsing a string is finding a derivation (derivation tree) for that string.
Is like recognizing string.
A CFG is ambiguous if it has >1 parse tree for some string.
– More than one RMD or/and LMD for some string.
– At least one string in L(G) which has more than one derivation tree.
Ambiguity is bad
– Leaves meaning of some programs ill-defined.
– Creates conflict
Is common in programming language
– Arithmetic expressions
– If-then-else
Context free language and programming languages
Most programming languages are CFL (CFL=L(CFG) ).
A program that we write is a CFL.
For this CFL there is a grammar (CFG) that generates it.
CFG is predetermined with compiler (i.e. parser).
Syntax tree of describes the syntax of the (syntactic structure) programming
language.
Abstract syntax tree are created during compilation process for the program (code).
The structure of the string (program) is described by the syntax (parse) tree.
Then the parser checks the string (program) if it is syntactically correct.
If the written code is generated by the CFG in the compiler the program is
syntactically correct (No syntax error) unless fails( Syntax error).
Simplification of CFGs
And
Normal forms
Chapter five
CFG Simplification
In a CFG, it may happen that all the production rules (Useless productions) and
symbols (Useless symbols) are not needed for the derivation of strings.
Besides, there may be some null productions and unit productions.
Elimination of these productions and symbols is simplification of CFGs.
Simplification essentially comprises of the following steps −
Reduction of CFG
Removal of Unit Productions
Removal of Null Productions
Reduced CFG
CFG we get after removing useless symbols and productions, .
CFGs are reduced in two phases −two step procedures to obtain .
Phase 1 Remove those variables which do not derive any terminal string.
Objective Obtaining an equivalent , from the CFG, G such that each
variable derives some terminal string.
'
G
)()( '
GLGL 
'G
Derivation Procedure −
Step 1 − Include all symbols, W1, that derive some terminal and initialize i=1.
Step 2 − Include all symbols, Wi+1, that derive Wi.
Step 3 − Increment i and repeat Step 2, until Wi+1 = Wi.
Step 4 − Include all production rules that have Wi in it.
'G
Conti…
Phase 2 − Derivation of an equivalent grammar, G”, from the CFG, G’, such that
each symbol appears in a sentential form.
Objective Remove those variables and terminals which are not present in any string
derivable from state symbol S.
Derivation Procedure −
Step 1 − Include the start symbol in Y1 and initialize i = 1.
Step 2 − Include all symbols, Yi+1, that can be derived from Yi and
include all production rules that have been applied.
Step 3 − Increment i and repeat Step 2, until Yi+1 = Yi.
Removal of -production
, is null-production
If  L(G), we can remove the null-production from G.
Let the grammar we had obtained by removing -production from G=(V,T,P,S) is
If L(G) contains ‘’,
else
,where A and B are variables
A
 SPTVG ,,, 22 
  )()( 2 GLGL
)()( 2 GLGL 
Removal of unit-production
YX  (a single variable in both sides)
XX 
can be removed immediately
Unit productions of form
Summery
To sum up, Every CFL with out  can be generated by a grammar with out
Useless symbols
-production, and
Unit productions
The grammar G generates the above language is obtained by
1. Eliminate all  -productions.
2. Eliminate all unit productions.
3. Eliminate all useless symbols.
This sequence guarantees that unwanted variables and productions are removed.
Hence, The grammar is simplified
steps
Normal forms for CFG
You know that, a CFG contains a single non terminal at the  side and any thing can
be at the  side  from the general production rule.
To be applicable, an arbitrary CFG must have some specific form to describe the
language.
– The RHS must have some particular way of describing the language, i.e. must
be restricted.
Chomsky Normal Form –– CNF
Griebach Normal Form –– GNF
Backus–Naur form––BNF
Kuroda Normal Form –– KNF
Conti…
Every context-free grammar that doesn’t generate the empty string can be
transformed in to an equivalent one in normal forms.
Here equivalent means the two grammars generate the same language.
Motivations for CNF ( theoretical and practical implications)
1. Every string of length n is derivable in 2n-1 steps (in CNF) , if you know the length you
can start parsing and, hence this is really use full.
2. Proof of pumping lemma for CFL.
3. Every CFG has an equivalent NPDM, if you know the grammar in CNF then it is easy
to construct a machine that recognizes same language.
4. CYK- parsing algorithm- algorithm for membership. For instance, given a CFG, one
can use the CNF to construct time algorithm (CYK) to decide whether a given
string is in the language represented by that grammar or not.
)( 3
nO
Chomsky Normal Form (CNF)
In formal language theory, a context-free grammar G is said to be in Chomsky
normal form if all of its production rules are of the form .
where A, B, and C are non-terminal symbols, a is a terminal symbol,
S is the start symbol, and
ε denotes the empty string.
Also, neither B nor C may be the start symbol
Additionally we permit the rule where S is the start variable, for technical
reasons. this means that we allow this as one of many possible rules.
ora,A
orBC,A


S
Conti…
If G is a CFG which generates a language that consists of at least one string along
with , then there is an other CFG (in CNF) such that , no
null-productions, unit productions and useless symbols.
2G  )()( 2 GLGL
bA
SAA
aS
ASS




Chomsky
Normal Form
Not Chomsky
Normal Form
From any context-free grammar (which doesn’t produce ) not in Chomsky Normal
Form
we can obtain:
an equivalent grammar in Chomsky Normal Form

In general:
The Procedure
First remove:
1. Nullable variables
2. Unit productions
3. Long productions >2
4. Useless symbols
Conversion to Chomsky Normal Form
Example:
AcB
aabA
ABaS


 Not Chomsky
Normal Form
convert it to Chomsky Normal Form
Observations
Chomsky normal forms are good for parsing and proving
theorems
It is easy to find the Chomsky normal form for any context-free
grammar

Mais conteúdo relacionado

Mais procurados

Theory of automata and formal language
Theory of automata and formal languageTheory of automata and formal language
Theory of automata and formal languageRabia Khalid
 
Theory of Computation Lecture Notes
Theory of Computation Lecture NotesTheory of Computation Lecture Notes
Theory of Computation Lecture NotesFellowBuddy.com
 
Pushdown Automata Theory
Pushdown Automata TheoryPushdown Automata Theory
Pushdown Automata TheorySaifur Rahman
 
Deterministic Finite Automata (DFA)
Deterministic Finite Automata (DFA)Deterministic Finite Automata (DFA)
Deterministic Finite Automata (DFA)Animesh Chaturvedi
 
NFA or Non deterministic finite automata
NFA or Non deterministic finite automataNFA or Non deterministic finite automata
NFA or Non deterministic finite automatadeepinderbedi
 
Context free grammars
Context free grammarsContext free grammars
Context free grammarsRonak Thakkar
 
Regular expressions-Theory of computation
Regular expressions-Theory of computationRegular expressions-Theory of computation
Regular expressions-Theory of computationBipul Roy Bpl
 
Regular expression with DFA
Regular expression with DFARegular expression with DFA
Regular expression with DFAMaulik Togadiya
 
Nondeterministic Finite Automata
Nondeterministic Finite AutomataNondeterministic Finite Automata
Nondeterministic Finite AutomataAdel Al-Ofairi
 
Regular Languages
Regular LanguagesRegular Languages
Regular Languagesparmeet834
 
Regular expressions
Regular expressionsRegular expressions
Regular expressionsShiraz316
 
Context free grammars
Context free grammarsContext free grammars
Context free grammarsShiraz316
 
Lecture 1,2
Lecture 1,2Lecture 1,2
Lecture 1,2shah zeb
 

Mais procurados (20)

Lecture 7
Lecture 7Lecture 7
Lecture 7
 
Theory of automata and formal language
Theory of automata and formal languageTheory of automata and formal language
Theory of automata and formal language
 
Turing machine-TOC
Turing machine-TOCTuring machine-TOC
Turing machine-TOC
 
Flat unit 1
Flat unit 1Flat unit 1
Flat unit 1
 
Theory of Computation Lecture Notes
Theory of Computation Lecture NotesTheory of Computation Lecture Notes
Theory of Computation Lecture Notes
 
Pushdown Automata Theory
Pushdown Automata TheoryPushdown Automata Theory
Pushdown Automata Theory
 
Deterministic Finite Automata (DFA)
Deterministic Finite Automata (DFA)Deterministic Finite Automata (DFA)
Deterministic Finite Automata (DFA)
 
NFA or Non deterministic finite automata
NFA or Non deterministic finite automataNFA or Non deterministic finite automata
NFA or Non deterministic finite automata
 
Finite Automata
Finite AutomataFinite Automata
Finite Automata
 
Context free grammars
Context free grammarsContext free grammars
Context free grammars
 
Regular expressions-Theory of computation
Regular expressions-Theory of computationRegular expressions-Theory of computation
Regular expressions-Theory of computation
 
Regular expression with DFA
Regular expression with DFARegular expression with DFA
Regular expression with DFA
 
TOC 7 | CFG in Chomsky Normal Form
TOC 7 | CFG in Chomsky Normal FormTOC 7 | CFG in Chomsky Normal Form
TOC 7 | CFG in Chomsky Normal Form
 
Nondeterministic Finite Automata
Nondeterministic Finite AutomataNondeterministic Finite Automata
Nondeterministic Finite Automata
 
Regular Languages
Regular LanguagesRegular Languages
Regular Languages
 
Regular expressions
Regular expressionsRegular expressions
Regular expressions
 
Lecture 6
Lecture 6Lecture 6
Lecture 6
 
Context free grammars
Context free grammarsContext free grammars
Context free grammars
 
Theory of computation Lec7 pda
Theory of computation Lec7 pdaTheory of computation Lec7 pda
Theory of computation Lec7 pda
 
Lecture 1,2
Lecture 1,2Lecture 1,2
Lecture 1,2
 

Semelhante a Ch-three Regular languages and regular grammar

Types of Language in Theory of Computation
Types of Language in Theory of ComputationTypes of Language in Theory of Computation
Types of Language in Theory of ComputationAnkur Singh
 
Lecture Notes-Are Natural Languages Regular.pdf
Lecture Notes-Are Natural Languages Regular.pdfLecture Notes-Are Natural Languages Regular.pdf
Lecture Notes-Are Natural Languages Regular.pdfDeptii Chaudhari
 
ChomskyPresentation.pdf
ChomskyPresentation.pdfChomskyPresentation.pdf
ChomskyPresentation.pdfNiteshJaypal
 
Speech To Sign Language Interpreter System
Speech To Sign Language Interpreter SystemSpeech To Sign Language Interpreter System
Speech To Sign Language Interpreter Systemkkkseld
 
theory of computation lecture 02
theory of computation lecture 02theory of computation lecture 02
theory of computation lecture 028threspecter
 
INFO-2950-Languages-and-Grammars.ppt
INFO-2950-Languages-and-Grammars.pptINFO-2950-Languages-and-Grammars.ppt
INFO-2950-Languages-and-Grammars.pptLamhotNaibaho3
 
Алексей Чеусов - Расчёсываем своё ЧСВ
Алексей Чеусов - Расчёсываем своё ЧСВАлексей Чеусов - Расчёсываем своё ЧСВ
Алексей Чеусов - Расчёсываем своё ЧСВMinsk Linux User Group
 
Natural Language Processing Topics for Engineering students
Natural Language Processing Topics for Engineering studentsNatural Language Processing Topics for Engineering students
Natural Language Processing Topics for Engineering studentsRosnaPHaroon
 
NLP_KASHK:Finite-State Morphological Parsing
NLP_KASHK:Finite-State Morphological ParsingNLP_KASHK:Finite-State Morphological Parsing
NLP_KASHK:Finite-State Morphological ParsingHemantha Kulathilake
 
01-Introduction&Languages.pdf
01-Introduction&Languages.pdf01-Introduction&Languages.pdf
01-Introduction&Languages.pdfTariqSaeed80
 
Free Ebooks Download ! Edhole
Free Ebooks Download ! EdholeFree Ebooks Download ! Edhole
Free Ebooks Download ! EdholeEdhole.com
 
Mba ebooks ! Edhole
Mba ebooks ! EdholeMba ebooks ! Edhole
Mba ebooks ! EdholeEdhole.com
 
The Theory of Finite Automata.pptx
The Theory of Finite Automata.pptxThe Theory of Finite Automata.pptx
The Theory of Finite Automata.pptxssuser039bf6
 
Regular Expression in Compiler design
Regular Expression in Compiler designRegular Expression in Compiler design
Regular Expression in Compiler designRiazul Islam
 
Formal language
Formal languageFormal language
Formal languageRajendran
 

Semelhante a Ch-three Regular languages and regular grammar (20)

Lexical analyzer
Lexical analyzerLexical analyzer
Lexical analyzer
 
Types of Language in Theory of Computation
Types of Language in Theory of ComputationTypes of Language in Theory of Computation
Types of Language in Theory of Computation
 
Lecture Notes-Are Natural Languages Regular.pdf
Lecture Notes-Are Natural Languages Regular.pdfLecture Notes-Are Natural Languages Regular.pdf
Lecture Notes-Are Natural Languages Regular.pdf
 
ChomskyPresentation.pdf
ChomskyPresentation.pdfChomskyPresentation.pdf
ChomskyPresentation.pdf
 
Speech To Sign Language Interpreter System
Speech To Sign Language Interpreter SystemSpeech To Sign Language Interpreter System
Speech To Sign Language Interpreter System
 
theory of computation lecture 02
theory of computation lecture 02theory of computation lecture 02
theory of computation lecture 02
 
INFO-2950-Languages-and-Grammars.ppt
INFO-2950-Languages-and-Grammars.pptINFO-2950-Languages-and-Grammars.ppt
INFO-2950-Languages-and-Grammars.ppt
 
Алексей Чеусов - Расчёсываем своё ЧСВ
Алексей Чеусов - Расчёсываем своё ЧСВАлексей Чеусов - Расчёсываем своё ЧСВ
Алексей Чеусов - Расчёсываем своё ЧСВ
 
Regular expression
Regular expressionRegular expression
Regular expression
 
Natural Language Processing Topics for Engineering students
Natural Language Processing Topics for Engineering studentsNatural Language Processing Topics for Engineering students
Natural Language Processing Topics for Engineering students
 
NLP_KASHK:Finite-State Morphological Parsing
NLP_KASHK:Finite-State Morphological ParsingNLP_KASHK:Finite-State Morphological Parsing
NLP_KASHK:Finite-State Morphological Parsing
 
Unit ii
Unit iiUnit ii
Unit ii
 
01-Introduction&Languages.pdf
01-Introduction&Languages.pdf01-Introduction&Languages.pdf
01-Introduction&Languages.pdf
 
Unit2 Toc.pptx
Unit2 Toc.pptxUnit2 Toc.pptx
Unit2 Toc.pptx
 
Free Ebooks Download ! Edhole
Free Ebooks Download ! EdholeFree Ebooks Download ! Edhole
Free Ebooks Download ! Edhole
 
Mba ebooks ! Edhole
Mba ebooks ! EdholeMba ebooks ! Edhole
Mba ebooks ! Edhole
 
The Theory of Finite Automata.pptx
The Theory of Finite Automata.pptxThe Theory of Finite Automata.pptx
The Theory of Finite Automata.pptx
 
Knowledge Extraction
Knowledge ExtractionKnowledge Extraction
Knowledge Extraction
 
Regular Expression in Compiler design
Regular Expression in Compiler designRegular Expression in Compiler design
Regular Expression in Compiler design
 
Formal language
Formal languageFormal language
Formal language
 

Último

🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 

Último (20)

🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 

Ch-three Regular languages and regular grammar

  • 1. Ch-three Regular languages and regular grammar Chapter programme Regular Language Regular Expressions Grammar Regular Grammar
  • 2. Introduction(prelude) What is language? May refer either to the (human) capacity for acquiring and using complex systems of communication, or to a specific instance of such a system of complex communication. ...and a formal language? A language is formal if it is provided with a mathematically rigorous representation of its: Alphabet of symbols, and Formation rules specifying which strings of symbols count as well-formed.
  • 3. Describing formal languages Generative approach: A language is the set of strings generated by a grammar. Generation process: Start symbol Expand with rewrite rules. Stop when a word of the language is generated. A grammar generates regular language regular grammar. Recognition approach: A language is the set of strings accepted by an automaton. Recognition process: Start in initial state. Transitions to other states guided by the string symbols. Until read whole string and reach accept/reject state.
  • 4. Linguistics and formal language Two application settings: Formal languages are studied in linguistics and computer science. Linguistics In linguistics, formal languages are used for the scientific study of human language. Linguists privilege a generative approach, as they are interested in defining a (finite) set of rules stating the grammar based on which any reasonable sentence in the language can be constructed. A grammar does not describe the meaning of the sentences or what can be done with them in whatever context __ but only their form.
  • 5. Computer science and formal languages Computer science In computer science, formal languages are used for the precise definition of programming languages and, therefore, in the development of compilers. A compiler is a computer program (or set of programs) that transforms source code written in a programming language (the source language) into another computer language (the target language). Computer scientists privilege a recognition approach based on abstract machines (automata) that take in input a sentence and decide whether it belongs to the reference language.
  • 6. Grammars and formal languages Chomsky Noam Chomsky (1928) is an American linguist, philosopher, cognitive scientist, historian, and activist. In Syntactic Structures (1957), Chomsky models knowledge of language using a formal grammar, by claiming that formal grammars can explain the ability of a hearer/speaker to produce and interpret an infinite number of sentences with a limited set of grammatical rules and a finite set of terms. The human brain contains a limited set of rules for organizing language, known as Universal Grammar. The basic rules of grammar are hard-wired into the brain, and manifest themselves without being taught.
  • 7. Conti… Chomsky proposed a hierarchy that partitions formal grammars into classes with increasing expressive power, i.e. each successive class can generate a broader set of formal languages than the one before. Interestingly, modeling some aspects of human language requires a more complex formal grammar (as measured by the Chomsky hierarchy) than modeling others.
  • 8. Chomsky Classification of Grammars Grammar Type Grammar Accepted (generator) Language Accepted Automaton (recognizers ) Type 0 Unrestricted grammar Recursively enumerable language Turing Machine Type 1 Context-sensitive grammar Context-sensitive language Linear-bounded automaton Type 2 Context-free grammar Context-free language Pushdown automaton Type 3 Regular grammar Regular language Finite state automaton
  • 9. Chomsky hierarchy It shows the scope of each type of grammar −
  • 10. Automata and formal language Stephen Kleene introduced finite automata in the 50s: abstract state machines equipped with a finite memory. He demonstrated the equivalence between such a model and the description of sequences of symbols using only three logical primitives (set union, set product, and iteration). Alan Turing (and, independently, Emil Post and John Backus a) put the ideas underlying the notion of pushdown automata: abstract state machines with an unbounded memory that is accessible only through a restricted mode (called a stack). In 1936, Alan Turing described the Turing Machine: an abstract state machine equipped with an infinite memory in the form of a strip of tape. Turing machines simulate the logic of any computer algorithm and recognize the larger class of formal languages.
  • 11. Automata and grammar Already we have seen Formal languages and its definition and basic notions Which class of formal languages is recognized by a given type of automata?__Chomsky hierarchy. Languages described by deterministic finite acceptors (DFAs) are called regular languages. For any nondeterministic finite acceptor (NFA) we can find an equivalent DFA. Thus, NFAs also describe regular languages. Regular language
  • 12. Conti… A language L is said to be regular if there is a DFA (actually any FA ) M such that L = L(M). Sometimes we say that M recognizes L. The set in a regular language is a regular set. Now we can use regular set and regular language interchangeable. In fact most languages are not regular!. The class of regular languages is the smallest class of languages in a hierarchy of classes that we will consider. To explicitly give an example of a language that is not regular though, we will need proof technique the pumping lemma. Regular languages form a simple but important family of formal languages.
  • 13. Conti… They can be specified in several equivalent ways: Using FA, already have seen Using regular expressions, see next Using regular grammars, see next
  • 14. Regular expression(REX) An FA (NFA or DFA) is a “blueprint" for constructing a machine recognizing a regular language. A regular expression ,RE is a “user-friendly," declarative way of describing a regular language. A Regular Expression (RE) is a string of symbols that describes a regular Language. An algebraic-expression (notation) for describing (some) languages Represents regular languages or regular sets A language is regular IFF some REX describes it. Languages are nothing, but are basically set of strings. Each REX represents set of strings __set of strings are regular sets.
  • 15. cont… For every regular set there is a corresponding REX. Given a regular expression E, the language it describes is l(E)={…… set of strings….) R is a regular expression over alphabet exactly if R is one of the following: a, for some a in Σ, ε, ∅ , ( R1 + R2 ), where R1 and R2 are regular expressions, R1 ° R2 ), where R1 and R2 are regular expressions, , where R1 is a regular expression, and  1R * 1R
  • 16. Conti… Operators of REX represents operations on regular languages. Regular language operations are operations to generate regular language (RL) from other RL( union , concatenation, power, intersection, kleene closure and plus… ). Since RLs are set, we use set operations (example union, concatenation, kleene closure). For every RL operation there is corresponding REX operator. Star, * Dot,  Plus, +
  • 17. Operations on RLs Examples: L1={a, b} and L2={c, d}, then union L1 U L2={a, b, c, d}. concatenation L1L2={ac, ad, bc, bd}. kleene closure L1*( the strings of L1 kleene closure)={, a, ab, aba,…}. plus closure L1+ (plus closure of L1= includes all strings of L1 kleene closure except .
  • 18. Conti… Now look at the REXs for what type of regular set(or RL) does represent. REX Corresponding regular set or RL 1.  {} 2.  { } 3. a where a {a} 4. r1 and r2 R1 and R2 respectively 5. r1+r2 R1 U R2 6. r1r2 R1R2 the concatenation 7. r1* R1* kleene closure 8. r1+ R1+ plus closure
  • 19. Conti… Suppose a, b, c, and d are   and are REXs, then the corresponding regular sets are {a},{b}, {c} and {d} respectively. Now find the regular sets for the following REXs REXs Regular set or RLs ( a + b ) {a} U {b} = { a, b } ab (a +b )( c + d ) (a +b ) c ( c + d ) a* (ab)* (a+b)*
  • 20. Order of precedence for operators: (Precedence Rules) 1. Star 2. Dot 3. Plus The star (*) operation has the highest precedence. The concatenation ( . ) operation is second on the preference order. The union (+) operation is the least preferred. Parentheses can be omitted using these rules. Example: 01 +1 is grouped (0(1))+1 a+bb*a can be parenthesized as a+(b(b*))a The operations + and . in a regular expression satisfy the distributive law: For any regular expressions r, s and t, r(s + t) = rs + rt, (r + s)t = rt + st.
  • 21. Conti… If L(r) = L(s) language represented by r = language represented by s then r = s, the two regular expressions are equivalent, i.e. r=s. Example: r = (a+b)*{ε, a, b, aa, ab, ba, bb, aaa, aab, ….} s= (a*b*)*{ε, a, b, aa, ab, ba, bb, aaa, aab, ….} Both represent the same language, hence r=s. REX simplification Simplification rules: (+R)* = r* (+r)r* = r* R+rs* = rs* +r+r* = r* R = R =   (annihilation) +R = R+ = R  (identity) There are also many properties of REX that are proven
  • 22. Equivalence of FA's and REX Regular expressions and finite automata are equivalent in their descriptive power. This fact is expressed in the following theorem: Theorem A set is regular if and only if it can be described by a regular expression. We know REX represent RLs accepted by FA So, there is a FA accept the RL represented by the RE. For each REX FA(equivalent to the REX) accepts the RL represented by the REX We have already shown that DFA's, NFA's, and -NFA's are all equivalent. Now see,
  • 23. Conti… To show FA's equivalent to REX we need to establish that 1. For every DFA A we can find (construct, in this case) a REX R, L(R) = L(A). 2. For every REX R there is an -NFAA, L(A) = L(R).
  • 24. Conti… We have a regular expression, call R, Convert R to -NFA, Conclude that the language must be regular accepted by some FA
  • 25. RE to -NFA For every REX R there is an equivalent -NFAA, L(A) = L(R) . Automata for , , and a
  • 26. Conti… automata for R+S automata for RS Automata for R*
  • 27. Conti… First construct a FA to the basic (primitive) REXs Then extend this method to any REXs Example : AutomatonFinitetoconvert )10(1)10( * 
  • 28. What is grammar We have learned two ways of characterizing regular languages: REXs and FA Grammar is another way of characterizing them. A set of rewrite (production) rules , used to generate strings by successively rewriting symbols. Example: a language L is set of strings of symbol ‘a’ except empty string, {a, aa, aaa, …} Generate the strings of L by the following procedure. Star the process from a symbol s, and rewrite s using one of the following rules. S -> a , and s -> as , meaning s can be rewritten as either a (the first production) or as (the last production). Purpose: To specify how to form grammatically correct strings in the language the grammar represents
  • 29. Conti… To generate the string aa for example, Start with S and apply the second rule to replace S with the right hand side of the rule, i.e. as (sentential form), to obtain as. Then apply the first rule to as to rewrite s as a. That gives us aa (sentence). S => as, to express that as is obtained from s by applying a single production. Thus is written as S => as => aa, the process of obtaining aa from s  derivation. If we are not interested in the intermediate steps, the fact that aa is obtained from s is written as s =>* aa . In general if a string  is obtained from a string  by applying productions of a grammar G, we write ( is derived from ). Simply write =>* .( it is clear that we are using a grammar)   * G
  • 30. Formal definition A grammar G is a mathematical object of four components. General form for production rules , generates The left hand side contains at least one symbol v The right hand side is a string of G=(V, T, S, P) , where V is a nonempty, finite set of non-terminal symbols, T is a finite set of terminal symbols , or alphabet, symbols, S ∈ V is the start symbol P is a finite set of rules(productions), the first rule uses the start symbol. V and T are disjoint.  :P   * )( TV  General form
  • 31. The language of G The set of all strings that can derived from the grammar. The language generated by G, All strings in that can be produced from the start symbol by applying rules of the grammar are in the language . All others are not. }|{)( ** wSTwGL  * T
  • 32. derivation For a grammar a derivation of a string is a sequence of grammar rule applications that transforms the start symbol in to the string Proves the string is belongs to the grammar’s language. Left most derivation Replace the left most non-terminal by one of the production rules Right most derivation Replace the right most non- terminal by one of the production rules Both generates the same string in the language.
  • 33. Parse tree(parsing, derivation, syntax tree) A tree representation of a derivation. Start symbol is root node. Non- terminals are interior nodes. Terminals are leaf nodes. Represents the syntactic structure of a string according to some form of the abstract syntax trees used in computer programming, Syntax tree. A tree representation of the abstract syntactic structure of source code written in a programming language. Built by parsers during the source code translation and compiling process.
  • 34. Regular grammars Generates a regular language.( To prove, convert RGFA. Since we have FA for the language then it is regular.) – Small number in formal language families. Are linear. – At most one non -terminal at the right hand side of the production rule. Right – linear – Left hand side of P is single non-terminal. – Right hand side of P is a terminal followed by a single non-terminal or – A single terminal followed by a single non-terminal __ strictly right -linear Left – linear – Left hand side = right linear – Right hand side is a single non-terminal followed by a terminal or – A single non-terminal followed by a single terminal __ strictly left- linear Are left –linear or right – linear. If right and left linear grammars are mixed in single grammar the result will not generate necessarily a regular language. (Assignment? give example).
  • 35. Chapter four Context free languages and grammars chapter plan context-free language context-free grammar parsing and ambiguity context-free grammars and programming languages
  • 36. Conti… Context free languages (CFL) – More general than regular languages. – Languages in programming languages. – Recognized by automaton __ PDA. – Generated by type 2 grammars. Context free grammars (CFG) – Type 2 grammars – Generates CFL – Powerful than regular grammars – Left hand side of P is a single non-terminal. – Right hand side of P is anything (no restriction) For a given CFG G, if w is in L(G) then there is a left most and right most derivation of w.  * VT 
  • 37. Conti… Context-free grammars are powerful enough to describe the syntax of most programming languages; in fact, the syntax of most programming languages is specified using context-free grammars. On the other hand, context-free grammars are simple enough to allow the construction of efficient parsing algorithms which, for a given string, determine whether and how it can be generated from the grammar.
  • 38. Parsing and ambiguity Parsing a string is finding a derivation (derivation tree) for that string. Is like recognizing string. A CFG is ambiguous if it has >1 parse tree for some string. – More than one RMD or/and LMD for some string. – At least one string in L(G) which has more than one derivation tree. Ambiguity is bad – Leaves meaning of some programs ill-defined. – Creates conflict Is common in programming language – Arithmetic expressions – If-then-else
  • 39. Context free language and programming languages Most programming languages are CFL (CFL=L(CFG) ). A program that we write is a CFL. For this CFL there is a grammar (CFG) that generates it. CFG is predetermined with compiler (i.e. parser). Syntax tree of describes the syntax of the (syntactic structure) programming language. Abstract syntax tree are created during compilation process for the program (code). The structure of the string (program) is described by the syntax (parse) tree. Then the parser checks the string (program) if it is syntactically correct. If the written code is generated by the CFG in the compiler the program is syntactically correct (No syntax error) unless fails( Syntax error).
  • 40. Simplification of CFGs And Normal forms Chapter five
  • 41. CFG Simplification In a CFG, it may happen that all the production rules (Useless productions) and symbols (Useless symbols) are not needed for the derivation of strings. Besides, there may be some null productions and unit productions. Elimination of these productions and symbols is simplification of CFGs. Simplification essentially comprises of the following steps − Reduction of CFG Removal of Unit Productions Removal of Null Productions
  • 42. Reduced CFG CFG we get after removing useless symbols and productions, . CFGs are reduced in two phases −two step procedures to obtain . Phase 1 Remove those variables which do not derive any terminal string. Objective Obtaining an equivalent , from the CFG, G such that each variable derives some terminal string. ' G )()( ' GLGL  'G Derivation Procedure − Step 1 − Include all symbols, W1, that derive some terminal and initialize i=1. Step 2 − Include all symbols, Wi+1, that derive Wi. Step 3 − Increment i and repeat Step 2, until Wi+1 = Wi. Step 4 − Include all production rules that have Wi in it. 'G
  • 43. Conti… Phase 2 − Derivation of an equivalent grammar, G”, from the CFG, G’, such that each symbol appears in a sentential form. Objective Remove those variables and terminals which are not present in any string derivable from state symbol S. Derivation Procedure − Step 1 − Include the start symbol in Y1 and initialize i = 1. Step 2 − Include all symbols, Yi+1, that can be derived from Yi and include all production rules that have been applied. Step 3 − Increment i and repeat Step 2, until Yi+1 = Yi.
  • 44. Removal of -production , is null-production If  L(G), we can remove the null-production from G. Let the grammar we had obtained by removing -production from G=(V,T,P,S) is If L(G) contains ‘’, else ,where A and B are variables A  SPTVG ,,, 22    )()( 2 GLGL )()( 2 GLGL  Removal of unit-production YX  (a single variable in both sides) XX  can be removed immediately Unit productions of form
  • 45. Summery To sum up, Every CFL with out  can be generated by a grammar with out Useless symbols -production, and Unit productions The grammar G generates the above language is obtained by 1. Eliminate all  -productions. 2. Eliminate all unit productions. 3. Eliminate all useless symbols. This sequence guarantees that unwanted variables and productions are removed. Hence, The grammar is simplified steps
  • 46. Normal forms for CFG You know that, a CFG contains a single non terminal at the  side and any thing can be at the  side  from the general production rule. To be applicable, an arbitrary CFG must have some specific form to describe the language. – The RHS must have some particular way of describing the language, i.e. must be restricted. Chomsky Normal Form –– CNF Griebach Normal Form –– GNF Backus–Naur form––BNF Kuroda Normal Form –– KNF
  • 47. Conti… Every context-free grammar that doesn’t generate the empty string can be transformed in to an equivalent one in normal forms. Here equivalent means the two grammars generate the same language. Motivations for CNF ( theoretical and practical implications) 1. Every string of length n is derivable in 2n-1 steps (in CNF) , if you know the length you can start parsing and, hence this is really use full. 2. Proof of pumping lemma for CFL. 3. Every CFG has an equivalent NPDM, if you know the grammar in CNF then it is easy to construct a machine that recognizes same language. 4. CYK- parsing algorithm- algorithm for membership. For instance, given a CFG, one can use the CNF to construct time algorithm (CYK) to decide whether a given string is in the language represented by that grammar or not. )( 3 nO
  • 48. Chomsky Normal Form (CNF) In formal language theory, a context-free grammar G is said to be in Chomsky normal form if all of its production rules are of the form . where A, B, and C are non-terminal symbols, a is a terminal symbol, S is the start symbol, and ε denotes the empty string. Also, neither B nor C may be the start symbol Additionally we permit the rule where S is the start variable, for technical reasons. this means that we allow this as one of many possible rules. ora,A orBC,A   S
  • 49. Conti… If G is a CFG which generates a language that consists of at least one string along with , then there is an other CFG (in CNF) such that , no null-productions, unit productions and useless symbols. 2G  )()( 2 GLGL bA SAA aS ASS     Chomsky Normal Form Not Chomsky Normal Form
  • 50. From any context-free grammar (which doesn’t produce ) not in Chomsky Normal Form we can obtain: an equivalent grammar in Chomsky Normal Form  In general: The Procedure First remove: 1. Nullable variables 2. Unit productions 3. Long productions >2 4. Useless symbols
  • 51. Conversion to Chomsky Normal Form Example: AcB aabA ABaS    Not Chomsky Normal Form convert it to Chomsky Normal Form
  • 52. Observations Chomsky normal forms are good for parsing and proving theorems It is easy to find the Chomsky normal form for any context-free grammar