2. String Matching
String matching with finite automata
• The string-matching automaton is very
Effective tool which is used in string
matching Algorithms.it examines each
character in the text exactly once and
reports all the valid shifts in O(n) time.
3. The basic idea is to build a automaton in which
•
•
•
Each character in the pattern has a state.
Each match sends the automaton into a new state.
If all the characters in the pattern has been
matched, the automaton enters the accepting state.
Otherwise, the automaton will return to a suitable
state according to the current state and the input
Character.
the matching takes O(n) time since each character
is examined once.
•
•
4. • The construction of the string-matching
automaton is based on the given pattern.
The time of this construction may be
O(m3||).
The finite automaton begins in state q0 and
read the characters of its input string one at
a time. If the automaton is in state q and
reads input character a, it moves from state
q to state (q,a).
•
6. Finite automata:
Afinite automaton M is a 5-tuple (Q,q0,A,,),
where
••
•
•
•
•
Q is a finite set of states.
q0 Q is the start state.
A Q is a distinguish set of accepting states.
is a finite input alphabet
is a function from Q × into Q, called the
transition function of M.
7. The following inputs it accepts:
(Odd number of a’s accepted and any
number of bb’s. )
-“aaa”
-“abb”
-“bababab”
-“babababa”
Rejected: (Even number of a’s not accepted)
-“aabb”
-“aaaa”
•
8. input
State
0
1
a b
1 0
0 1
(a)Transition Table (b) Finite Automata
The automaton can also be represented as a
state-transition diagram as shown in right
hand side of the figure.
•
10. Build DFA from pattern.
Run DFA on text.
3 4
a a
5 6
a
0 1
a a
2
b
b
b
b
b
b
a
a a b a a a
a a a b a a
Search Text
b a a a b
accept state
Example
11. 3 4
a a
5 6
a
0 1
a a
2
b
b
b
b
b
b
a
accept state
a a b a a a
a a a b a a
Search Text
b a a a b
12. 3 4
a a
5 6
a
0 1
a a
2
b
b
b
b
b
b
a
accept state
a a b a a a
a a a b a a
Search Text
b a a a b
13. 3 4
a a
5 6
a
0 1
a a
2
b
b
b
b
b
b
a
accept state
a a b a a a
a a a b a a
Search Text
b a a a b
a a b a a a
14. 3 4
a a
5 6
a
0 1
a a
2
b
b
b
b
b
b
a
accept state
a a b a a a
a a a b a a
Search Text
b a a a b
a a b a a a
15. 3 4
a a
5 6
a
0 1
a a
2
b
b
b
b
b
b
a
accept state
a a b a a a
a a a b a a
Search Text
b a a a b
a a b a a a
16. 3 4
a a
5 6
a
0 1
a a
2
b
b
b
b
b
b
a
accept state
a a b a a a
a a a b a a
Search Text
b a a a b
a a b a a a
17. 3 4
a a
5 6
a
0 1
a a
2
b
b
b
b
b
b
a
accept state
a a b a a a
a a a b a a
Search Text
b a a a b
a a b a a a
a a b a a a
18. 3 4
a a
5 6
a
0 1
a a
2
b
b
b
b
b
b
a
accept state
a a b a a a
a a a b a a
Search Text
b a a a b
a a b a a a
a a b a a a
19. 3 4
a a
5 6
a
0 1
a a
2
b
b
b
b
b
b
a
accept state
a a b a a a
a a a b a a
Search Text
b a a a b
a a b a a a
a a b a a a
20. 3 4
a a
5 6
a
0 1
a a
2
b
b
b
b
b
b
a
accept state
a a b a a a
a a a b a a
Search Text
b a a a b
a a b a a a
a a b a a a
21. 3 4
a a
5 6
a
0 1
a a
2
b
b
b
b
b
b
a
accept state
a a b a a a
a a a b a a
Search Text
b a a a b
a a b a a a
a a b a a a
23. • This algorithm was conceived by Donald Knuth and
Vaughan Pratt and independently by James H.Morris in
1977.
24. Text: abcabb
Pattern: abd
• Prefix: Prefix of a string any number of leading symbols of that
String.
o Ex: λ,a,ab,abc,abca,abca,abcab,abcabb.
• Suffix : Suffix is any number of trailing symbols of that string.
o Ex: λ,b,bb,abb,cabb,bcabb,abcabb.
25. • Propare Prefix and Propare Suffix: Prefix and Suffix other
than string is called Propare Prefix and Propare Suffix.
o Ex: Propare Prefix: λ,a,ab,abc,abca,abca,abcab
Propare Suffix: λ,b,bb,abb,cabb,bcabb.
• Border: Border of the given string is intersection of Propare
prefix and Propare suffix.
o Ex: λ
So, Border is 1.
26. • Shift Distance = Length of Pattern – Border.
o Ex: Length of a Pattern = 3 &
Length of a Border = 1
So, Shift Distance = 2.
27. Step 1: Initialize Input Variables:
m = Length of the Pattern.
u = Prefix-Function of Pattern( p ) .
q = Number of character matched .
Step 2: Define the variable :
q=0 , the beginning of the match .
Step 3: Compare the first character with first character of Text.
If match is not found ,Substitute the value of u[ q ] to q .
If match is found , then increment the value of q by 1.
Step 4: Check whether all the pattern elements are matched with the text elements .
If not , repeat the search process .
If yes , print the number of shifts taken by the pattern.
Step 5: look for the next match .
28. • O(m) - It is to compute the prefix function values.
• O(n) - It is to compare the pattern to the text. Total of
• O(n + m) run time.