Rabin Karp - String Matching Algorithm
1. Yet another string matching algorithm
PRESENTED BY
P14-6011
P14-6016
Rabin Karp
2. Proposed in 1987
Michael O. Rabin
Richard M. Karp
Improves on naive string matching
by hashing
Average running time O(m+n)
Introduction
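For comparison, the naive matcher that Rabin-Karp improves on can be sketched as below. This is a minimal illustration, not code from the slides; the name naiveSearch is hypothetical. It compares the pattern against every window of the text character by character, which costs up to m comparisons for each of the n - m + 1 windows.

```cpp
#include <cstddef>
#include <string>

// Naive matcher (hypothetical helper, not from the slides).
// Returns the index of the first occurrence of p in t, or -1 if there is none.
int naiveSearch(const std::string& t, const std::string& p) {
    if (p.size() > t.size()) return -1;
    const std::size_t limit = t.size() - p.size() + 1;  // n - m + 1 windows
    for (std::size_t i = 0; i < limit; ++i) {
        std::size_t j = 0;
        // Up to m character comparisons for this window.
        while (j < p.size() && t[i + j] == p[j]) ++j;
        if (j == p.size()) return static_cast<int>(i);
    }
    return -1;
}
```

Rabin-Karp keeps the same outer loop but replaces most of the inner character loop with a single hash comparison.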
3. Let the text string be T of length n
Let the pattern string be P of length m
Example
T=“hello world”; n=11;
P=“llo”; m=3
Application
Keyword matching in large files
Good for Plagiarism detection
Statement
4. Calculate the hash of the pattern
As well as the hash of the first m characters of the text
If the hashes are not equal
Shift by one position and calculate the hash of the next m characters
If the hashes are equal
Compare the two strings character by character
Note: only one hash comparison is needed per substring
Working
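6. C++ implementation (Hash is any string hash function; it is not defined on the slide)
#include <string>
using std::string;

int Hash(const string& s);  // assumed helper, defined elsewhere

// Returns the index of the first occurrence of p in t, or -1 if there is none.
int RabinKarp(const string& t, const string& p) {
    if (p.size() > t.size()) return -1;  // avoid unsigned underflow below
    int pHash = Hash(p);
    int limit = t.size() - p.size() + 1;  // n - m + 1 windows
    for (int i = 0; i < limit; i++) {
        string substr = t.substr(i, p.size());
        int tHash = Hash(substr);  // recomputed from scratch: O(m) per window
        if (pHash == tHash && p == substr) return i;
    }
    return -1;
}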
Implementation
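The function on slide 6 rehashes every window from scratch, which costs O(m) per shift. The O(m+n) average quoted on slide 2 relies on updating the window hash in constant time, usually called a rolling hash. The following is a sketch of that variant under stated assumptions: the base 256, the modulus 1000000007, and the names kBase, kMod and rabinKarpRolling are illustrative choices, not taken from the slides.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Assumed parameters (not from the slides): base 256 and a prime modulus.
constexpr std::uint64_t kBase = 256;
constexpr std::uint64_t kMod = 1000000007ULL;

// Returns the index of the first occurrence of p in t, or -1 if there is none.
int rabinKarpRolling(const std::string& t, const std::string& p) {
    const std::size_t n = t.size(), m = p.size();
    if (m == 0) return 0;
    if (m > n) return -1;

    // Hash the pattern and the first window; high = kBase^(m-1) mod kMod.
    std::uint64_t pHash = 0, tHash = 0, high = 1;
    for (std::size_t i = 0; i < m; ++i) {
        pHash = (pHash * kBase + static_cast<unsigned char>(p[i])) % kMod;
        tHash = (tHash * kBase + static_cast<unsigned char>(t[i])) % kMod;
        if (i + 1 < m) high = (high * kBase) % kMod;
    }

    for (std::size_t i = 0;; ++i) {
        // On a hit, confirm with a character-by-character comparison.
        if (pHash == tHash && t.compare(i, m, p) == 0) return static_cast<int>(i);
        if (i + m >= n) break;  // no further window
        // Slide the window in O(1): drop t[i], then append t[i + m].
        const std::uint64_t lead = (static_cast<unsigned char>(t[i]) * high) % kMod;
        tHash = (tHash + kMod - lead) % kMod;
        tHash = (tHash * kBase + static_cast<unsigned char>(t[i + m])) % kMod;
    }
    return -1;
}
```

With the slide's example, rabinKarpRolling("hello world", "llo") returns 2, the same index the slide 6 function would report.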
7. When the hashes of two strings match, it is called a hit
A hit is possible even when the strings differ
Hash of “LLO” is 100
Hash of “OLL” is 100
This is called a hash collision
Minimize collisions by
Scaling each character value by its index position (here L = 30, O = 40)
LLO: 30(3)+30(2)+40(1) = 190 | OLL: 40(3)+30(2)+30(1) = 210
Taking the result modulo a prime number
Hash Collision
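The collision on slide 7 can be reproduced with a small sketch. It uses ASCII character codes (L = 76, O = 79) instead of the slide's illustrative values 30 and 40, and the prime 101 is an arbitrary assumption; additiveHash and weightedHash are hypothetical names.

```cpp
#include <cstddef>
#include <string>

// Additive hash: sum of character codes, so character order is ignored.
unsigned additiveHash(const std::string& s) {
    unsigned h = 0;
    for (unsigned char c : s) h += c;
    return h;
}

// Position-weighted hash reduced modulo a prime.
// The weights m, m-1, ..., 1 mirror the slide; the prime 101 is an assumption.
unsigned weightedHash(const std::string& s, unsigned prime = 101) {
    unsigned h = 0;
    const std::size_t m = s.size();
    for (std::size_t i = 0; i < m; ++i) {
        const unsigned weight = static_cast<unsigned>(m - i);
        h = (h + static_cast<unsigned char>(s[i]) * weight) % prime;
    }
    return h;
}
```

Here additiveHash gives 231 for both "LLO" and "OLL", while weightedHash gives 55 and 61 respectively. Weighting and the prime modulus make collisions rarer but do not rule them out, which is why a hit is still verified character by character.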
8. Hash of pattern
O(m)
Hash comparisons
O(n-m+1) = O(n) ; m < n
Average running time
O(m+n)
Worst-case running time
m character comparisons in each iteration (every window is a hit)
O(mn)
Analysis
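The worst case on slide 8 can be made concrete by counting the character comparisons spent on verifying hits. The sketch below is an illustration under assumptions: it scans for all occurrences (the slide 6 function stops at the first), uses a simple additive hash, and the name verificationComparisons is hypothetical.

```cpp
#include <cstddef>
#include <string>

// Counts the character comparisons spent verifying hash hits while scanning
// for ALL occurrences of p in t. A simple additive hash is assumed.
std::size_t verificationComparisons(const std::string& t, const std::string& p) {
    const std::size_t n = t.size(), m = p.size();
    if (m == 0 || m > n) return 0;

    unsigned pHash = 0, tHash = 0;
    for (std::size_t i = 0; i < m; ++i) {
        pHash += static_cast<unsigned char>(p[i]);
        tHash += static_cast<unsigned char>(t[i]);
    }

    std::size_t comparisons = 0;
    for (std::size_t i = 0; i + m <= n; ++i) {
        if (pHash == tHash) {
            // Hit: verify character by character, stopping at the first mismatch.
            for (std::size_t j = 0; j < m; ++j) {
                ++comparisons;
                if (t[i + j] != p[j]) break;
            }
        }
        if (i + m < n) {
            // Roll the additive hash one position to the right.
            tHash -= static_cast<unsigned char>(t[i]);
            tHash += static_cast<unsigned char>(t[i + m]);
        }
    }
    return comparisons;
}
```

For T = "aaaaaaaa" (n = 8) and P = "aaa" (m = 3), every one of the n - m + 1 = 6 windows is a hit and needs m = 3 comparisons, so the function returns 18 = m(n - m + 1), matching the O(mn) bound. For T = "abcdefgh" and P = "xyz" no window hash matches and the count is 0.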