Unit ii

REGULAR EXPRESSION
BY DR.T.P.LATCHOUMI

REGULAR EXPRESSION
• The languages accepted by finite automata are easily described by simple expressions called regular expressions.
Regular Set
• Regular sets are the sets of strings that are accepted by finite automata (FA).

DEFINITION
Let ∑ be an alphabet, i.e. the set of input symbols. The regular expressions over ∑ are defined as follows:
1) ∅ is a regular expression which denotes the empty set.
2) ε is a regular expression which denotes the set {ε}, containing only the null string.
3) For each a in ∑, a is a regular expression which denotes the set {a}.
4) If r and s are regular expressions denoting the languages L1 and L2 respectively, then
• r+s denotes L1 ∪ L2, i.e. union
• rs denotes L1L2, i.e. concatenation
• r* denotes L1*, i.e. Kleene closure

EXAMPLES
• Example 1
Write the regular expression for the language accepting all combinations of a's over the set ∑={a}.
• Solution
Regular set = {ε, a, aa, aaa, …}
Regular expression (RE) = a*
• Example 2
Design the RE for the language accepting all combinations of a's except the null string over ∑={a}.
• Solution
Regular set = {a, aa, aaa, …}
Regular expression (RE) = a⁺, i.e. aa* (here ⁺ is the positive closure, not the union operator)

EXAMPLES
• Example 3
Design the RE for the language containing any number of a's and b's.
• Solution
Regular set = {ε, a, b, aa, ab, bb, ba, aaa, …}
Regular expression (RE) = (a+b)*
• Example 4
Construct the RE for the language accepting all the strings which end with 00 over ∑={0,1}.
• Solution
Regular set = {00, 000, 100, 0000, 0100, 1000, 1100, …}
Regular expression (RE) = (0+1)*00
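The four examples can be checked mechanically. The following is a minimal sketch, assuming Python's `re` module as the matcher; the textbook union operator + is written as | there, and the helper name `language` is only illustrative.

```python
import re
from itertools import product

# Textbook notation -> (Python pattern, alphabet). Union '+' becomes '|'.
EXAMPLES = {
    "a*": (r"a*", "a"),                # Example 1: all strings of a's
    "aa*": (r"aa*", "a"),              # Example 2: non-empty strings of a's
    "(a+b)*": (r"(a|b)*", "ab"),       # Example 3: any number of a's and b's
    "(0+1)*00": (r"(0|1)*00", "01"),   # Example 4: strings ending with 00
}

def language(pattern, alphabet, max_len):
    """List every string over `alphabet` up to `max_len` that fully matches."""
    compiled = re.compile(pattern)
    words = ("".join(chars)
             for n in range(max_len + 1)
             for chars in product(alphabet, repeat=n))
    return [w for w in words if compiled.fullmatch(w)]

for name, (pattern, alphabet) in EXAMPLES.items():
    print(name, "->", language(pattern, alphabet, 3))
```

Listing the strings up to a small length reproduces the first members of each regular set given in the solutions.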

FA WITH REGULAR EXPRESSION
• Thompson's construction is used to obtain a finite automaton from a regular expression.
• We reduce the regular expression to its smallest sub-expressions, convert these to an NFA, and finally convert the NFA to a DFA.
• [Figure: NFAs for the basic regular expressions]

THEOREM
Equivalence of Regular Expression and FA

EXAMPLE 1
Construct an NFA for the given RE (01+2*)1
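One way to carry out this construction in code is sketched below. It is a minimal sketch of Thompson's construction, assuming the expression is already given as nested calls rather than parsed from text; the names `NFA`, `symbol`, `union`, `concat`, `star` and `accepts` are illustrative and not taken from the slides.

```python
from itertools import count

_ids = count()  # source of fresh state numbers

class NFA:
    """An ε-NFA fragment with one start and one accept state.
    `edges` is a list of (source, label, target); label None means ε."""
    def __init__(self, start, accept, edges):
        self.start, self.accept, self.edges = start, accept, edges

def symbol(a):
    s, f = next(_ids), next(_ids)
    return NFA(s, f, [(s, a, f)])

def union(m, n):
    s, f = next(_ids), next(_ids)
    extra = [(s, None, m.start), (s, None, n.start),
             (m.accept, None, f), (n.accept, None, f)]
    return NFA(s, f, m.edges + n.edges + extra)

def concat(m, n):
    return NFA(m.start, n.accept, m.edges + n.edges + [(m.accept, None, n.start)])

def star(m):
    s, f = next(_ids), next(_ids)
    extra = [(s, None, m.start), (s, None, f),
             (m.accept, None, m.start), (m.accept, None, f)]
    return NFA(s, f, m.edges + extra)

def eps_closure(nfa, states):
    """All states reachable from `states` using ε-edges only."""
    seen, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for src, label, dst in nfa.edges:
            if src == q and label is None and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen

def accepts(nfa, word):
    current = eps_closure(nfa, {nfa.start})
    for ch in word:
        moved = {dst for src, label, dst in nfa.edges
                 if src in current and label == ch}
        current = eps_closure(nfa, moved)
    return nfa.accept in current

# (01 + 2*)1 written as nested constructor calls
nfa = concat(union(concat(symbol("0"), symbol("1")), star(symbol("2"))), symbol("1"))
print(accepts(nfa, "011"), accepts(nfa, "221"), accepts(nfa, "01"))
```

The NFA is simulated directly by tracking the set of current states; converting it to a DFA would apply the subset construction to the same state sets.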

ARDEN’S THEOREM
Step1: let q1 be the initial state.

EXAMPLE 3
CONSTRUCT A REGULAR EXPRESSION CORRESPONDING TO THE AUTOMATON GIVEN BELOW
[Figure: automaton with states q1, q2 and q3]
• Solution
Here q1 is both the initial state and the final state.
The equations for the three states q1, q2, and q3 are as follows:
q1 = q1a + q3a + ε (the ε term is present because q1 is the initial state)
q2 = q1b + q2b + q3b
q3 = q2a
Now, we will solve these three equations:
q2 = q1b + q2b + q3b
= q1b + q2b + (q2a)b (substituting the value of q3)
= q1b + q2(b + ab)
= q1b(b + ab)* (applying Arden's Theorem: R = Q + RP has the solution R = QP* when P does not contain ε)
q1 = q1a + q3a + ε
= q1a + q2aa + ε (substituting the value of q3)
= q1a + q1b(b + ab)*aa + ε (substituting the value of q2)
= q1(a + b(b + ab)*aa) + ε
= ε(a + b(b + ab)*aa)* (applying Arden's Theorem)
= (a + b(b + ab)*aa)*
• Hence, the regular expression is (a + b(b + ab)*aa)*.
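The result can be sanity-checked by brute force. The sketch below assumes the transition table implied by the three state equations (for example, the term q1b in the equation for q2 is read as an edge from q1 to q2 on b), since the figure itself is not reproduced here, and it uses Python's `re` with | for union.

```python
import re
from itertools import product

# Transition table reconstructed from the state equations (an assumption,
# because the original diagram is not available).
DELTA = {
    ("q1", "a"): "q1", ("q1", "b"): "q2",
    ("q2", "a"): "q3", ("q2", "b"): "q2",
    ("q3", "a"): "q1", ("q3", "b"): "q2",
}
START, FINAL = "q1", {"q1"}

def dfa_accepts(word):
    state = START
    for ch in word:
        state = DELTA[(state, ch)]
    return state in FINAL

# (a + b(b + ab)*aa)* in Python syntax
ARDEN_RE = re.compile(r"(a|b(b|ab)*aa)*")

def agree_up_to(max_len):
    """True if the DFA and the expression agree on every string up to max_len."""
    for n in range(max_len + 1):
        for chars in product("ab", repeat=n):
            w = "".join(chars)
            if dfa_accepts(w) != bool(ARDEN_RE.fullmatch(w)):
                return False
    return True

print(agree_up_to(8))
```

Agreement on all short strings is evidence for the derivation, not a proof; the proof is the algebra above.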

APPLICATIONS OF REGULAR EXPRESSION
• Regular expressions are useful in a wide variety of text processing tasks and, more generally, in string processing where the data need not be textual.
• Common applications include data validation, data scraping (especially web scraping), data wrangling, simple parsing, the production of syntax highlighting systems, and many other tasks.
• While regular expressions would be useful in Internet search engines, processing them across the entire database could consume excessive computing resources, depending on the complexity and design of the expression.

CONT.….
• Regular expressions are widely used in search tools.
• Text file search in Unix (tool: egrep)
• This command searches files for a text pattern and prints the matching lines; with the -l option it lists the names of the files containing that pattern.

ALGEBRAIC LAWS FOR REGULAR EXPRESSION
• Two expressions with variables are equivalent if, whatever languages we substitute for the variables, the two expressions denote the same language.
Examples from the algebra of arithmetic: 1 + 2 = 2 + 1 or x + y = y + x.

CONT.….
• Commutativity is the property of an operator that lets us switch the order of its operands and get the same result.
• Associativity is the property of an operator that allows us to regroup the operands when the operator is applied twice.
For regular expressions, we have: L + M = M + L

CONT.….
• Commutative law for union: we may take the union of two languages in either order.
(L + M) + N = L + (M + N)
• Associative law for union: we may take the union of three languages either by taking the union of the first two first or by taking the union of the last two first.
Together with the commutative law, this lets us take the union of any collection of languages in any order and grouping, and the result will be the same.
Intuitively, a string is in L1 ∪ L2 ∪ … ∪ Lk iff it is in one or more of the Li.
(L.M).N = L.(M.N)

CONT.….
• Associative law for concatenation: we can concatenate three languages by concatenating either the first two or the last two first.
• Concatenation is not commutative: the law L.M = M.L is false in general.
• An identity for an operator is a value such that, when the operator is applied to the identity and some other value, the result is the other value.

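CONT.….
• An annihilator for an operator is a value such that, when the operator is applied to the annihilator and some other value, the result is the annihilator.
For regular expressions we have:
∅ + L = L + ∅ = L (∅ is the identity for union)
ε.L = L.ε = L (ε is the identity for concatenation)
∅.L = L.∅ = ∅ (∅ is the annihilator for concatenation)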

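CONT.….
• Distributive law
L.(M + N) = L.M + L.N (left distributive law of concatenation over union)
(M + N).L = M.L + N.L (right distributive law of concatenation over union)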

CONT.….
• The Idempotent Law
An operator is idempotent if the result of applying it to two copies of the same value is that value.
Idempotent law for union:
L + L = L
If we take the union of two identical expressions, we can replace them by one copy of the expression.
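The laws in this section can be tried out on small finite languages. This is a minimal sketch, assuming languages are modelled as Python sets of strings; the sample languages L, M and N are arbitrary choices made for illustration.

```python
def union(l, m):
    return l | m

def concat(l, m):
    """Concatenation of two languages: every x from l followed by every y from m."""
    return {x + y for x in l for y in m}

EMPTY = frozenset()          # the empty language ∅
EPSILON = frozenset({""})    # the language {ε}

L = {"a", "ab"}
M = {"b", ""}
N = {"ba"}

laws = {
    "commutative (union)": union(L, M) == union(M, L),
    "associative (union)": union(union(L, M), N) == union(L, union(M, N)),
    "associative (concat)": concat(concat(L, M), N) == concat(L, concat(M, N)),
    "identity for union": union(EMPTY, L) == L,
    "identity for concat": concat(EPSILON, L) == L and concat(L, EPSILON) == L,
    "annihilator for concat": concat(EMPTY, L) == EMPTY and concat(L, EMPTY) == EMPTY,
    "left distributive": concat(L, union(M, N)) == union(concat(L, M), concat(L, N)),
    "right distributive": concat(union(M, N), L) == union(concat(M, L), concat(N, L)),
    "idempotent (union)": union(L, L) == L,
}

for name, holds in laws.items():
    print(name, holds)

# Concatenation is not commutative for these sample languages.
print("L.M == M.L:", concat(L, M) == concat(M, L))
```

A check on sample languages only illustrates the laws; it also gives a concrete counterexample showing that L.M and M.L can differ.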

TO PROVE LANGUAGES ARE NOT REGULAR

CLOSURE PROPERTIES OF REGULAR LANGUAGES (RL)
• If certain languages are regular and a new language L is formed from them by certain operations (such as union or concatenation), then L is also regular. These properties are called closure properties of regular languages.
• We say that the class of regular languages is closed under these operations.
• The closure properties express the idea that when one or more languages are regular, certain related languages are also regular.

THE CLOSURE PROPERTIES OF REGULAR LANGUAGES ARE AS GIVEN BELOW
1. The union of two regular languages is regular.
2. The intersection of two regular languages is regular.
3. The complement of a regular language is regular.
4. The difference of two regular languages is regular.
5. The reversal of a regular language is regular.
6. The Kleene closure of a regular language is regular.
7. The concatenation of two regular languages is regular.
8. The homomorphic image of a regular language is regular.
9. The inverse homomorphic image of a regular language is regular.

THE UNION OF TWO REGULAR LANGUAGES IS REGULAR
• Theorem 1: If L1 and L2 are two regular languages then L1 ∪ L2 is regular.
• Proof: If L1 and L2 are regular then they have regular expressions R1 and R2 with L1 = L(R1) and L2 = L(R2). Then L1 ∪ L2 = L(R1 + R2), so L1 ∪ L2 is a regular language. (Any language given by some regular expression is regular.)

THE COMPLEMENT OF A REGULAR LANGUAGE IS REGULAR
• Theorem 2: The complement of a regular language is regular.
• Proof: Let L1 be a regular language which is accepted by a DFA M = (Q, Σ, δ, q0, F).
The complement L̅1 is accepted by M' = (Q, Σ, δ, q0, Q − F).
That is, M is a DFA with final states F, and M' is the DFA in which all the non-final states of M become final and all the final states of M become non-final.
In other words, the strings that are accepted by M are rejected by M'; similarly, the strings rejected by M are accepted by M'.
Thus L̅1 is accepted by the DFA M' and is therefore regular.

THE INTERSECTION OF TWO REGULAR LANGUAGES IS REGULAR
• Theorem 3: If L1 and L2 are two regular languages then L1 ∩ L2 is regular.
• Proof: Let L1 be regular. Then there exists some DFA M1 = (Q1, Σ, δ1, q1, F1) that accepts L1. Similarly, since L2 is regular, there is another DFA M2 = (Q2, Σ, δ2, q2, F2) that accepts L2.
Let L = L1 ∩ L2. We construct a DFA M = (Q, Σ, δ, q, F) that simulates M1 and M2 in parallel, where
Q = Q1 × Q2, the set of pairs of states,
δ((p1, p2), a) = (δ1(p1, a), δ2(p2, a)), a transition function derived from both DFAs,
q = (q1, q2), which is the initial state of machine M,
F = F1 × F2, the set of pairs in which both states are final.
M accepts a string exactly when both M1 and M2 accept it, so M accepts L1 ∩ L2, i.e. L. Hence L is a regular language. This proves that if L1 and L2 are two regular languages then L1 ∩ L2 is regular. In other words, the class of regular languages is closed under intersection.
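The intersection argument can be turned into code with the product construction on pairs of states. The sketch below assumes a DFA is a tuple (states, alphabet, delta, start, finals) with delta stored as a dictionary; the two example machines, EVEN_ZEROS and ENDS_IN_1, are hypothetical and chosen only for illustration.

```python
from itertools import product

def intersect(d1, d2):
    """Product DFA: runs d1 and d2 in parallel on pairs of states."""
    q1, sigma, t1, s1, f1 = d1
    q2, _, t2, s2, f2 = d2
    states = set(product(q1, q2))
    delta = {((p, q), a): (t1[(p, a)], t2[(q, a)])
             for (p, q) in states for a in sigma}
    finals = {(p, q) for (p, q) in states if p in f1 and q in f2}
    return states, sigma, delta, (s1, s2), finals

def accepts(dfa, word):
    _, _, delta, state, finals = dfa
    for ch in word:
        state = delta[(state, ch)]
    return state in finals

# Hypothetical DFAs over {0, 1}
EVEN_ZEROS = ({"e", "o"}, {"0", "1"},
              {("e", "0"): "o", ("e", "1"): "e", ("o", "0"): "e", ("o", "1"): "o"},
              "e", {"e"})
ENDS_IN_1 = ({"n", "y"}, {"0", "1"},
             {("n", "0"): "n", ("n", "1"): "y", ("y", "0"): "n", ("y", "1"): "y"},
             "n", {"y"})

BOTH = intersect(EVEN_ZEROS, ENDS_IN_1)
print(accepts(BOTH, "001"), accepts(BOTH, "01"))
```

Changing the rule for `finals` to "p in f1 and q not in f2" would give a machine for the difference L1 - L2 on the same set of state pairs.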

THE DIFFERENCE OF TWO REGULAR LANGUAGES IS REGULAR
• Theorem 4: If L1 and L2 are two regular languages then L1 - L2 is regular.
• Proof: L1 - L2 can also be written as L1 ∩ L̅2.
Let L2 be a regular language which is accepted by a DFA M = (Q, Σ, δ, q0, F).
The complement L̅2 is accepted by M' = (Q, Σ, δ, q0, Q − F). That is, M is a DFA with final state set F, and M' is the DFA in which all the non-final states of M become final states and all the final states of M become non-final states.
Thus, L1 and L̅2 are two regular languages.
As regular languages are closed under intersection, L1 ∩ L̅2 is regular.
In other words, L1 - L2 is regular. Thus, regular languages are closed under difference.

THE REVERSAL OF A REGULAR LANGUAGE IS REGULAR
• Theorem 5: The reversal of a regular language is regular.
• Proof: The reversal of a string is the string written backward. For a regular expression w we define an expression w^R such that L(w^R) = (L(w))^R, the set of reversals of the strings in L(w).
The proof is by induction on the structure of w.
• Basis: If w = ε, ∅ or a single symbol a, then w^R = w, i.e. (ε)^R = ε, (∅)^R = ∅ and (a)^R = a. Hence L(w^R) is also regular.
• Induction:
• Case 1: If w = w1 + w2 then w^R = (w1)^R + (w2)^R.
As regular languages are closed under union, L(w^R) is also regular.
• Case 2: If w = w1 w2 then w^R = (w2)^R (w1)^R, because reversing a concatenation also reverses the order of its parts.
As regular languages are closed under concatenation, L(w^R) is also regular.
• Case 3: If w = w1* then w^R = ((w1)^R)*, which is regular because regular languages are closed under Kleene closure.
Thus, the reversal of a regular language is regular.
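The inductive cases translate directly into a recursive function on expression trees. The following is a minimal sketch, assuming a regular expression is represented as nested tuples such as ("concat", r, s); this tuple encoding and the helper `to_pattern`, which converts a tree to Python `re` syntax for testing, are illustrative assumptions.

```python
import re
from itertools import product

def reverse(r):
    """Return an expression tree for the reversal of the language of r."""
    kind = r[0]
    if kind in ("eps", "empty", "sym"):      # basis: unchanged
        return r
    if kind == "union":                       # (r + s)^R = r^R + s^R
        return ("union", reverse(r[1]), reverse(r[2]))
    if kind == "concat":                      # (rs)^R = s^R r^R
        return ("concat", reverse(r[2]), reverse(r[1]))
    if kind == "star":                        # (r*)^R = (r^R)*
        return ("star", reverse(r[1]))
    raise ValueError(kind)

def to_pattern(r):
    """Translate an expression tree into a Python regular expression."""
    kind = r[0]
    if kind == "eps":
        return "(?:)"
    if kind == "empty":
        return "(?!)"                          # a pattern that never matches
    if kind == "sym":
        return re.escape(r[1])
    if kind == "union":
        return "(?:%s|%s)" % (to_pattern(r[1]), to_pattern(r[2]))
    if kind == "concat":
        return "(?:%s%s)" % (to_pattern(r[1]), to_pattern(r[2]))
    if kind == "star":
        return "(?:%s)*" % to_pattern(r[1])
    raise ValueError(kind)

# R = ab c*  and its reversal c* ba
R = ("concat", ("concat", ("sym", "a"), ("sym", "b")), ("star", ("sym", "c")))
R_REV = reverse(R)

def matches(r, word):
    return bool(re.fullmatch(to_pattern(r), word))

print(matches(R, "abcc"), matches(R_REV, "ccba"))
```

A string w matches R exactly when its reversal matches `reverse(R)`, which can be checked exhaustively for short strings.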

THE KLEENE CLOSURE OPERATION ON A REGULAR LANGUAGE IS REGULAR
• Theorem 6: The Kleene closure of a regular language is regular.
• Proof: If the language L1 is regular then it can be expressed as L1 = L(R1) for some regular expression R1. Then L1* = L(R1*).
Thus the closure of L1 is the language of a regular expression. Hence L1* is a regular language.

THE CONCATENATION OF REGULAR LANGUAGES IS REGULAR
• Theorem 7: If L1 and L2 are two regular languages then L1·L2 is regular. In other words, regular languages are closed under concatenation.
• Proof: If L1 and L2 are regular then they can be expressed as L1 = L(R1) and L2 = L(R2). Then L1·L2 = L(R1·R2), so we get a regular language. Hence it is proved that regular languages are closed under concatenation.

A HOMOMORPHISM OF REGULAR LANGUAGES IS REGULAR
• Theorem 8: The homomorphic image of a regular language is regular.
• Proof: A homomorphism substitutes a string for each symbol. For instance, the string "aabb" can be written as 0011 under a homomorphism in which a is replaced by 0 and b is replaced by 1.
Let Σ be the input alphabet and Γ be the set of substitution symbols; then a homomorphism is a function h: Σ* → Γ*.
The definition of h on symbols is extended to strings as follows:
Let w = a1 a2 … an
h(w) = h(a1) h(a2) … h(an)
If L is a language over Σ, i.e. L ⊆ Σ*, then the homomorphic image of L is defined as h(L) = {h(w) : w ∈ L}.
If R is a regular expression for L, replacing every symbol a in R by the string h(a) gives a regular expression for h(L). Hence h(L) is regular.

EXAMPLE
• Let Σ = {a, b} and w = abab
• Put h(a) = 00 and h(b) = 11
• Then we can write h(w) = h(a) h(b) h(a) h(b) = 00110011
• A homomorphism is applied to a language by applying it to each string of the language.
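The example can be reproduced in a few lines. This is a minimal sketch, assuming the homomorphism is stored as a dictionary from symbols to strings; the function names `apply_h` and `image` are illustrative.

```python
def apply_h(h, word):
    """Extend h from symbols to strings: h(a1...an) = h(a1)...h(an)."""
    return "".join(h[symbol] for symbol in word)

def image(h, language):
    """Homomorphic image of a finite language given as a set of strings."""
    return {apply_h(h, w) for w in language}

h = {"a": "00", "b": "11"}
print(apply_h(h, "abab"))          # 00110011
print(image(h, {"", "a", "ab"}))
```

The empty string maps to the empty string, since there are no symbols to substitute.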

THE INVERSE HOMOMORPHISM OF REGULAR LANGUAGE IS REGULAR
• Theorem 9: The inverse homomorphic image of a regular language is regular.
• Proof: Let h: Σ* → Γ* be a homomorphism, where Σ is the input alphabet and Γ is the set of substitution symbols used by h. Let L ⊆ Γ* be a regular language. The inverse homomorphic image of L is
h⁻¹(L) = {w ∈ Σ* | h(w) ∈ L}
Since L is regular, there exists a DFA M = (Q, Γ, δ, q, F) which accepts L.
Construct the DFA M' = (Q, Σ, δ', q, F), where δ'(p, a) = δ*(p, h(a)) for every state p and every symbol a in Σ. That is, on input a, M' moves to the state that M reaches from p after reading the whole string h(a).
By induction on the length of w, M' on input w reaches the same state that M reaches on input h(w). So M' accepts w exactly when M accepts h(w), i.e. when h(w) ∈ L.
Hence h⁻¹(L) is accepted by the DFA M' and is regular.

EXAMPLE
• Let L = (010)* be a regular language, and let h(0) = a and h(1) = bb define a homomorphism. Then
• h(L) = (abba)*
• h⁻¹(h(L)) = (010)* = L, since a is the image of 0 only and bb is the image of 1 only.
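The inverse image in this example can be computed by building a new DFA whose move on a symbol a follows the old DFA along the string h(a). The sketch below assumes a hand-written five-state DFA for (abba)* (state 4 is a dead state) and the tuple layout (states, alphabet, delta, start, finals); both are illustrative assumptions.

```python
import re
from itertools import product

def run(delta, state, word):
    """Extended transition function: follow `word` from `state`."""
    for ch in word:
        state = delta[(state, ch)]
    return state

def inverse_hom_dfa(dfa, h):
    """DFA for the inverse image: on symbol a, move as the old DFA does on h(a)."""
    states, _, delta, start, finals = dfa
    new_delta = {(q, a): run(delta, q, h[a]) for q in states for a in h}
    return states, set(h), new_delta, start, finals

def accepts(dfa, word):
    _, _, delta, start, finals = dfa
    return run(delta, start, word) in finals

# Hand-written DFA for (abba)* over {a, b}; state 4 is the dead state.
ABBA_STAR = (
    {0, 1, 2, 3, 4}, {"a", "b"},
    {(0, "a"): 1, (0, "b"): 4, (1, "a"): 4, (1, "b"): 2,
     (2, "a"): 4, (2, "b"): 3, (3, "a"): 0, (3, "b"): 4,
     (4, "a"): 4, (4, "b"): 4},
    0, {0},
)
h = {"0": "a", "1": "bb"}
INV = inverse_hom_dfa(ABBA_STAR, h)

print(accepts(INV, "010010"), accepts(INV, "01"))
```

The new machine keeps the states, start state and final states of the old one; only the alphabet and the transition table change.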

DECISION PROPERTIES OF REGULAR LANGUAGES (RL)
(i) Emptiness
(ii) Non-emptiness
(iii) Finiteness
(iv) Infiniteness
(v) Membership
(vi) Equality

EMPTINESS AND NON-EMPTINESS
• Step 1: Select the states that cannot be reached from the initial state and delete them (remove unreachable states).
• Step 2: If the resulting machine contains at least one final state, then the finite automaton accepts a non-empty language.
• Step 3: If the resulting machine has no final state, then the finite automaton accepts the empty language.
In that case every accepting state is cut off from the start state, i.e. the language is ∅.
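The steps above can be sketched as a reachability search. This is a minimal sketch, assuming the transition function is a dictionary from (state, symbol) to state; the two small machines are hypothetical examples.

```python
def reachable(delta, start):
    """Step 1: collect every state reachable from the start state."""
    seen, stack = {start}, [start]
    while stack:
        q = stack.pop()
        for (src, _), dst in delta.items():
            if src == q and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen

def is_empty(delta, start, finals):
    """Steps 2 and 3: the language is empty iff no final state is reachable."""
    return not (reachable(delta, start) & set(finals))

# Hypothetical machines over {a}: in the first, the final state f is unreachable.
CUT_OFF = {("s", "a"): "s", ("f", "a"): "f"}
CONNECTED = {("s", "a"): "f", ("f", "a"): "f"}

print(is_empty(CUT_OFF, "s", {"f"}), is_empty(CONNECTED, "s", {"f"}))
```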

FINITENESS AND INFINITENESS
• Step 1: Select the states that cannot be reached from the initial state and delete them (remove unreachable states).
• Step 2: Select the states from which we cannot reach a final state and delete them (remove dead states).
• Step 3: If the resulting machine contains loops or cycles, then the finite automaton accepts an infinite language.
• Step 4: If the resulting machine does not contain loops or cycles, then the finite automaton accepts a finite language.
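The four steps can be sketched as follows, assuming the transition function is a dictionary from (state, symbol) to state. The cycle test used here repeatedly removes states with no remaining successor, which is one of several ways to detect a cycle; the machines ONLY_AB and A_STAR are hypothetical examples.

```python
def useful_states(delta, start, finals):
    """Steps 1 and 2: keep states that are reachable and can reach a final state."""
    succ, pred = {}, {}
    for (src, _), dst in delta.items():
        succ.setdefault(src, set()).add(dst)
        pred.setdefault(dst, set()).add(src)

    def closure(seeds, edges):
        seen, stack = set(seeds), list(seeds)
        while stack:
            q = stack.pop()
            for nxt in edges.get(q, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    return closure({start}, succ) & closure(finals, pred), succ

def is_finite(delta, start, finals):
    """Steps 3 and 4: the language is finite iff no cycle remains."""
    remaining, succ = useful_states(delta, start, finals)
    remaining = set(remaining)
    changed = True
    while changed:                      # peel off states with no successor left
        changed = False
        for q in list(remaining):
            if not (succ.get(q, set()) & remaining):
                remaining.discard(q)
                changed = True
    return not remaining                # anything left lies on or leads to a cycle

# Accepts only "ab"; state "d" is a dead state with self-loops.
ONLY_AB = {(0, "a"): 1, (0, "b"): "d", (1, "a"): "d", (1, "b"): 2,
           (2, "a"): "d", (2, "b"): "d", ("d", "a"): "d", ("d", "b"): "d"}
# Accepts a*.
A_STAR = {("s", "a"): "s"}

print(is_finite(ONLY_AB, 0, {2}), is_finite(A_STAR, "s", {"s"}))
```

The loops on the dead state of the first machine do not count, because that state is deleted in Step 2 before cycles are looked for.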

MEMBERSHIP
• Membership is the property of verifying whether an arbitrary string is accepted by a finite automaton, i.e. whether it is a member of the language or not.
• Let M be a finite automaton that accepts some strings over an alphabet, and let w be any string over the alphabet. If there exists a transition path in M labelled w which starts at the initial state and ends in any one of the final states, then w is a member of L(M); otherwise w is not a member of L(M).
• The question is whether a string is in a language, i.e. whether w ∈ L.
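For a DFA the membership test is a single pass over the string. The sketch below assumes a hand-written three-state DFA for the strings over {0, 1} that end with 00, with the state number recording how many trailing 0's have been seen (capped at two); the table is an illustrative assumption.

```python
# DFA for strings ending with 00: the state counts trailing 0's (0, 1, or 2+).
DELTA = {
    (0, "0"): 1, (0, "1"): 0,
    (1, "0"): 2, (1, "1"): 0,
    (2, "0"): 2, (2, "1"): 0,
}
START, FINALS = 0, {2}

def is_member(word):
    """Follow the unique path labelled `word`; accept if it ends in a final state."""
    state = START
    for ch in word:
        if (state, ch) not in DELTA:    # symbol outside the alphabet
            return False
        state = DELTA[(state, ch)]
    return state in FINALS

print(is_member("0100"), is_member("010"))
```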

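EQUALITY
• Two finite automata M1 and M2 are said to be equal (equivalent) if and only if they accept the same language. To test this, minimise both automata: the minimal DFA of a language is unique up to renaming of states, so M1 and M2 are equivalent exactly when their minimal DFAs are identical apart from state names.
• The question is whether two languages are equal, i.e. whether L1 = L2.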
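Equivalence can also be tested without minimising. The sketch below uses a different technique from the one named above: it explores pairs of states of the two DFAs in parallel and looks for a reachable pair in which exactly one state is final. Each DFA is assumed to be a tuple (delta, start, finals), and the three machines are hypothetical examples.

```python
def equivalent(d1, d2, alphabet):
    """True iff the two complete DFAs accept the same language."""
    (t1, s1, f1), (t2, s2, f2) = d1, d2
    seen, stack = {(s1, s2)}, [(s1, s2)]
    while stack:
        p, q = stack.pop()
        if (p in f1) != (q in f2):      # some string reaches here and separates them
            return False
        for a in alphabet:
            nxt = (t1[(p, a)], t2[(q, a)])
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return True

# A: strings ending with 00 (three states).
A = ({(0, "0"): 1, (0, "1"): 0, (1, "0"): 2, (1, "1"): 0,
      (2, "0"): 2, (2, "1"): 0}, 0, {2})
# B: the same language with a redundant copy of the start state.
B = ({("x", "0"): "z1", ("x", "1"): "y", ("y", "0"): "z1", ("y", "1"): "y",
      ("z1", "0"): "z2", ("z1", "1"): "y", ("z2", "0"): "z2", ("z2", "1"): "y"},
     "x", {"z2"})
# C: strings ending with 0.
C = ({("n", "0"): "e", ("n", "1"): "n", ("e", "0"): "e", ("e", "1"): "n"},
     "n", {"e"})

print(equivalent(A, B, "01"), equivalent(A, C, "01"))
```

Minimising A and B would collapse the redundant state of B and yield the same three-state machine, which is the test described in the text.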
