4. How does the O(mn) approach work?
Below is an illustration of how the previously described O(mn) approach works.
String S:  a b c a b a a b c a b a c
Pattern p: a b a a
5. Step 1: compare p[1] with S[1]
S: a b c a b a a b c a b a c
p: a b a a
Step 2: compare p[2] with S[2]
S: a b c a b a a b c a b a c
p: a b a a
6. Step 3: compare p[3] with S[3]
S: a b c a b a a b c a b a c
p: a b a a
A mismatch occurs here: p[3] = ‘a’ but S[3] = ‘c’.
Since a mismatch is detected, shift ‘p’ one position to the right and
perform steps analogous to steps 1 to 3. Whenever a further mismatch
is detected, shift ‘p’ one more position to the right and repeat the
matching procedure.
7. S: a b c a b a a b c a b a c
   p:       a b a a
Finally, a match is found after shifting ‘p’ three times to the right.
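The shift-and-compare procedure above can be sketched in Python. This is a minimal illustration, not code from the slides: the name naive_search is hypothetical, and the code uses 0-based indices where the slides use 1-based ones.

```python
def naive_search(text, pattern):
    """Return the 0-based shift of the first occurrence of pattern, or -1."""
    n, m = len(text), len(pattern)
    for shift in range(n - m + 1):
        j = 0
        # Compare left to right until a mismatch or the end of the pattern.
        while j < m and text[shift + j] == pattern[j]:
            j += 1
        if j == m:
            return shift  # every character of the pattern matched
        # Otherwise move the pattern one position to the right and retry.
    return -1


S = "abcabaabcabac"
p = "abaa"
print(naive_search(S, p))  # 3
```

The returned value 3 corresponds to the three right shifts shown in the illustration.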
Drawbacks of this approach: if ‘m’ is the length of pattern ‘p’ and ‘n’ the length
of string ‘S’, the matching time is of the order O(mn) in the worst case, since each
of the n - m + 1 alignments may need up to m comparisons. This makes the
algorithm slow on long inputs.
What makes this approach so slow is that elements of ‘S’ that have already been
compared are compared again in later iterations. For example, when the first
mismatch is detected in the comparison of p[3] with S[3], pattern ‘p’ is
moved one position to the right and the matching procedure resumes from
there. The first comparison is then between p[1] = ‘a’ and S[2] = ‘b’.
Note that S[2] = ‘b’ was already involved in a comparison in step 2, so this
is a repeated use of S[2].
It is these repeated comparisons that lead to the runtime of O(mn).
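To make the cost of the repeated comparisons concrete, the sketch below counts character comparisons when every alignment is tried. The function name and the test input (a run of ‘a’ characters searched for "aaaab") are assumptions chosen to hit the worst case; they do not come from the slides.

```python
def count_naive_comparisons(text, pattern):
    """Count character comparisons made by the naive method over all alignments."""
    n, m = len(text), len(pattern)
    comparisons = 0
    for shift in range(n - m + 1):
        for j in range(m):
            comparisons += 1
            if text[shift + j] != pattern[j]:
                break  # mismatch: move on to the next alignment
    return comparisons


text = "a" * 20           # n = 20
pattern = "a" * 4 + "b"   # m = 5, mismatches only on its last character
print(count_naive_comparisons(text, pattern))  # 80, i.e. (n - m + 1) * m
```

Here each of the 16 alignments re-reads four characters of the text that earlier alignments had already examined.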
10. The prefix function, Π
The following pseudocode computes the prefix function, Π:
Compute-Prefix-Function(p)
    m ← length[p]        // ‘p’ is the pattern to be matched
    Π[1] ← 0
    k ← 0
    for q ← 2 to m
        do while k > 0 and p[k+1] != p[q]
               do k ← Π[k]
           if p[k+1] = p[q]
               then k ← k + 1
           Π[q] ← k
    return Π
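A possible Python rendering of the pseudocode above is sketched here. It assumes 0-based indexing, so pi[q] in the code corresponds to Π[q+1] in the slides; the function and variable names are illustrative.

```python
def compute_prefix_function(p):
    """Return a list pi where pi[q] is the slides' Π[q+1] (0-based indices)."""
    m = len(p)
    pi = [0] * m
    k = 0  # length of the currently matched prefix
    for q in range(1, m):
        # Fall back to shorter prefixes until one can be extended by p[q].
        while k > 0 and p[k] != p[q]:
            k = pi[k - 1]
        if p[k] == p[q]:
            k += 1
        pi[q] = k
    return pi


print(compute_prefix_function("ababaca"))  # [0, 0, 1, 2, 3, 0, 1]
```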
11. Example: compute Π for the pattern ‘p’ below:
p: a b a b a c a
Initially: m = length[p] = 7
Π[1] = 0
k = 0
Step 1: q = 2, k = 0
Π[2] = 0
q  1 2 3 4 5 6 7
p  a b a b a c a
Π  0 0
Step 2: q = 3, k = 0
Π[3] = 1
q  1 2 3 4 5 6 7
p  a b a b a c a
Π  0 0 1
Step 3: q = 4, k = 1
Π[4] = 2
q  1 2 3 4 5 6 7
p  a b a b a c a
Π  0 0 1 2
12. Step 4: q = 5, k = 2
Π[5] = 3
q  1 2 3 4 5 6 7
p  a b a b a c a
Π  0 0 1 2 3
Step 5: q = 6, k = 3
p[4] does not match p[6], so k falls back to Π[3] = 1 and then to Π[1] = 0.
Π[6] = 0
q  1 2 3 4 5 6 7
p  a b a b a c a
Π  0 0 1 2 3 0
Step 6: q = 7, k = 0
Π[7] = 1
q  1 2 3 4 5 6 7
p  a b a b a c a
Π  0 0 1 2 3 0 1
After iterating 6 times, the prefix function computation is complete:
q  1 2 3 4 5 6 7
p  a b a b a c a
Π  0 0 1 2 3 0 1
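The finished table can be cross-checked against the usual definition of the prefix function: Π[q] is the length of the longest proper prefix of p[1..q] that is also a suffix of p[1..q]. The brute-force sketch below applies that definition directly. It is an illustrative check with a hypothetical function name, not the efficient method.

```python
def prefix_function_by_definition(p):
    """For each prefix of p, find the longest proper prefix that is also its suffix."""
    pi = []
    for q in range(1, len(p) + 1):
        head = p[:q]          # the slides' p[1..q]
        best = 0
        for k in range(1, q):  # proper prefixes only, so k < q
            if head[:k] == head[-k:]:
                best = k
        pi.append(best)
    return pi


print(prefix_function_by_definition("ababaca"))  # [0, 0, 1, 2, 3, 0, 1]
```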
14. Illustration: given a string ‘S’ and pattern ‘p’ as follows:
S: b a c b a b a b a b a c a c a
p: a b a b a c a
Let us execute the KMP algorithm to find whether ‘p’ occurs in ‘S’.
For ‘p’, the prefix function Π was computed previously and is as follows:
q  1 2 3 4 5 6 7
p  a b a b a c a
Π  0 0 1 2 3 0 1
15. Initially: n = size of S = 15;
m = size of p = 7
Step 1: i = 1, q = 0
Comparing p[1] with S[1]
S: b a c b a b a b a b a c a a b
p: a b a b a c a
p[1] does not match S[1]. ‘p’ will be shifted one position to the right.
Step 2: i = 2, q = 0
Comparing p[1] with S[2]
S: b a c b a b a b a b a c a a b
p:   a b a b a c a
p[1] matches S[2]. Since there is a match, ‘p’ is not shifted.
16. Step 3: i = 3, q = 1
Comparing p[2] with S[3]
S: b a c b a b a b a b a c a a b
p:   a b a b a c a
p[2] does not match S[3].
Backtracking on p (q = Π[1] = 0), comparing p[1] with S[3]. p[1] does not match S[3] either.
Step 4: i = 4, q = 0
Comparing p[1] with S[4]
S: b a c b a b a b a b a c a a b
p:       a b a b a c a
p[1] does not match S[4].
Step 5: i = 5, q = 0
Comparing p[1] with S[5]
S: b a c b a b a b a b a c a a b
p:         a b a b a c a
p[1] matches S[5].
17. Step 6: i = 6, q = 1
Comparing p[2] with S[6]
S: b a c b a b a b a b a c a a b
p:         a b a b a c a
p[2] matches S[6].
Step 7: i = 7, q = 2
Comparing p[3] with S[7]
S: b a c b a b a b a b a c a a b
p:         a b a b a c a
p[3] matches S[7].
Step 8: i = 8, q = 3
Comparing p[4] with S[8]
S: b a c b a b a b a b a c a a b
p:         a b a b a c a
p[4] matches S[8].
18. Step 9: i = 9, q = 4
Comparing p[5] with S[9]
S: b a c b a b a b a b a c a a b
p:         a b a b a c a
p[5] matches S[9].
Step 10: i = 10, q = 5
Comparing p[6] with S[10]
S: b a c b a b a b a b a c a a b
p:         a b a b a c a
p[6] does not match S[10].
Backtracking on p, comparing p[4] with S[10], because after the mismatch q = Π[5] = 3. p[4] matches S[10].
Step 11: i = 11, q = 4
Comparing p[5] with S[11]
S: b a c b a b a b a b a c a a b
p:             a b a b a c a
p[5] matches S[11].
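The backtracking in step 10 can also be read as a jump of the pattern: after a mismatch that follows q matched characters, a single fallback q = Π[q] moves ‘p’ right by q - Π[q] positions. The sketch below tabulates that jump using the Π values computed earlier. The helper name is hypothetical, and the formula covers one fallback only; a further mismatch would trigger another fallback.

```python
# Prefix function of "ababaca" as tabulated in the slides; PI[q - 1] holds Π[q].
PI = [0, 0, 1, 2, 3, 0, 1]


def shift_after_mismatch(q, pi):
    """Positions the pattern moves right after one fallback from q matched characters (q >= 1)."""
    return q - pi[q - 1]


# Step 10: five characters were matched before the mismatch, and Π[5] = 3.
print(shift_after_mismatch(5, PI))  # 2

# Jump for every possible number of matched characters before a mismatch.
print([shift_after_mismatch(q, PI) for q in range(1, 7)])  # [1, 2, 2, 2, 2, 6]
```

A jump of 2 agrees with the walkthrough: ‘p’ starts at S[5] in step 10 and at S[7] in step 11.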
19. Step 12: i = 12, q = 5
Comparing p[6] with S[12]
S: b a c b a b a b a b a c a a b
p:             a b a b a c a
p[6] matches S[12].
Step 13: i = 13, q = 6
Comparing p[7] with S[13]
S: b a c b a b a b a b a c a a b
p:             a b a b a c a
p[7] matches S[13].
Pattern ‘p’ has been found to occur completely in string ‘S’. The total number of shifts
that took place before the match was found is i - m = 13 - 7 = 6.
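The thirteen steps above can be reproduced with a short Python sketch of the matching loop. The slides do not list the matcher itself, so this follows the common textbook formulation and should be read as an assumption. The names kmp_search and trace are hypothetical, and S is taken as shown in the step diagrams.

```python
def compute_prefix_function(p):
    """0-based prefix function: pi[q] is the slides' Π[q+1]."""
    pi = [0] * len(p)
    k = 0
    for q in range(1, len(p)):
        while k > 0 and p[k] != p[q]:
            k = pi[k - 1]
        if p[k] == p[q]:
            k += 1
        pi[q] = k
    return pi


def kmp_search(text, pattern, trace=None):
    """Return the shift of the first occurrence of pattern in text, or -1.

    If trace is a list, (i, q) is appended for each step in the slides'
    convention: i is the 1-based text position examined and q is the number
    of pattern characters matched before the step.
    """
    m = len(pattern)
    if m == 0:
        return 0
    pi = compute_prefix_function(pattern)
    q = 0
    for i, ch in enumerate(text, start=1):
        if trace is not None:
            trace.append((i, q))
        # On a mismatch, fall back on the pattern; the text index never moves back.
        while q > 0 and pattern[q] != ch:
            q = pi[q - 1]
        if pattern[q] == ch:
            q += 1
        if q == m:
            return i - m  # number of positions the pattern was shifted
    return -1


S = "bacbabababacaab"
p = "ababaca"
steps = []
print(kmp_search(S, p, steps))  # 6
print(steps[9], steps[10])      # (10, 5) (11, 4)
```

Each character of ‘S’ is read once by the outer loop, which is the property the naive approach lacks.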