Prefix & Suffix
Example: W = ab is a prefix of X = abefac, where Y = efac.
Example: W = cdaa is a suffix of X = acbecdaa, where Y = acbe.
A string W is a prefix of a string X if X = WY for some string Y, denoted W ⊏ X.
A string W is a suffix of a string X if X = YW for some string Y, denoted W ⊐ X.
The empty string ε is a prefix of any string.
ε is a suffix of any string.
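The two definitions can be checked directly with string slicing. This is a minimal sketch; the function names `is_prefix` and `is_suffix` are illustrative and not part of the slides.

```python
def is_prefix(w, x):
    """True if x = w + y for some string y."""
    return x[:len(w)] == w


def is_suffix(w, x):
    """True if x = y + w for some string y."""
    return len(w) <= len(x) and x[len(x) - len(w):] == w


# The two examples from the slide, plus the empty string.
print(is_prefix("ab", "abefac"))      # True
print(is_suffix("cdaa", "acbecdaa"))  # True
print(is_prefix("", "abefac"), is_suffix("", "abefac"))  # True True
```

Python's built-in `str.startswith` and `str.endswith` give the same answers; the slicing version only makes the definition X = WY (or X = YW) explicit.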
Overlapping Suffix
Lemma Suppose X ⊐ Z and Y ⊐ Z.
a) if |X| ≤ |Y|, then X ⊐ Y;
b) if |X| ≥ |Y|, then Y ⊐ X;
c) if |X| = |Y|, then X = Y.
[Figure: cases a), b), and c), each showing X and Y aligned at the right end of Z.]
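The lemma can be sanity-checked by brute force: take every pair of suffixes X, Y of a string Z and test the three cases. This is only an empirical check on small inputs, not a proof, and the helper name `check_overlapping_suffix_lemma` is made up for illustration.

```python
from itertools import product


def is_suffix(w, x):
    """True if w is a suffix of x."""
    return x.endswith(w)


def check_overlapping_suffix_lemma(z):
    """Check cases a), b), c) for every pair of suffixes X, Y of z."""
    suffixes = [z[i:] for i in range(len(z) + 1)]  # includes z itself and ""
    for x, y in product(suffixes, repeat=2):
        if len(x) <= len(y) and not is_suffix(x, y):
            return False  # case a) violated
        if len(x) >= len(y) and not is_suffix(y, x):
            return False  # case b) violated
        if len(x) == len(y) and x != y:
            return False  # case c) violated
    return True


print(check_overlapping_suffix_lemma("acbecdaa"))  # True
```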
The Knuth-Morris-Pratt Algorithm
Achieve running time O(n+m)!
Use one auxiliary function (the prefix function π).
Key idea on improvement: instead of precomputing the transition function δ in O(m|Σ|) time, efficiently compute it “on the fly” as needed.
Minimum Shifting
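[Figure: pattern P aligned against text T at shift s, with q matching characters, P[1..q] = T[s+1..s+q]; below it, P again at the minimum next shift s'.]
Question: What is the least shift s' > s that is consistent with the q characters already matched?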
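One way to make the question concrete is a brute-force search over candidate offsets. The sketch below assumes q ≥ 1 characters of P have matched and returns the smallest d ≥ 1 such that sliding P right by d keeps it consistent with those q characters, that is, P[1..q−d] is a suffix of P[1..q]. The name `least_shift_offset` is hypothetical. Note that the function takes only the pattern as input.

```python
def least_shift_offset(p, q):
    """Smallest d >= 1 such that P[1..q-d] is a suffix of P[1..q].

    p is the pattern (0-based Python string), q >= 1 is the number of
    characters already matched at the current shift.
    """
    if not 1 <= q <= len(p):
        raise ValueError("q must satisfy 1 <= q <= len(p)")
    matched = p[:q]
    for d in range(1, q + 1):
        # After sliding by d, the first q - d pattern characters must line
        # up with the last q - d characters that were already matched.
        if matched.endswith(p[:q - d]):
            return d
    return q  # not reached: d = q always succeeds (empty prefix)


print(least_shift_offset("ababaca", 5))  # 2
```

This search costs O(q²) per call; it is meant only to pin down what is being asked, not as an efficient method.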
The Prefix Function
How much to shift depends on the pattern not the text.
The prefix function π[q] measures the length of the longest prefix of P[1..m] that is also a proper suffix of P[1..q].
π[q] = max{ k : k < q and P[1..k] is a suffix of P[1..q] }
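Example: T = bacbababaabcbab, P = ababaca, and q = 5 characters match at shift s = 4.

T:   b a c b a b a b a a b c b a b
P:           a b a b a c a              shift s = 4, P[1..q] = ababa
P:               a b a b a c a          shift s' = s + 2, P[1..k] = aba

π[5] = 3, so shift by q − π[q] = 5 − 3 = 2.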
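The definition of the prefix function can be evaluated literally, trying every k < q. The sketch below does that; it is a direct transcription of the max{...} formula, not the efficient algorithm, and the name `prefix_function_by_definition` is an assumption for illustration. The returned list is 1-indexed with entry 0 unused.

```python
def prefix_function_by_definition(p):
    """pi[q] = max{k : k < q and P[1..k] is a suffix of P[1..q]}.

    Returns a list pi of length len(p) + 1; pi[0] is unused padding so
    that pi[q] uses 1-based q.
    """
    m = len(p)
    pi = [0] * (m + 1)
    for q in range(1, m + 1):
        # k = 0 always qualifies (the empty prefix), so max() is never empty.
        pi[q] = max(k for k in range(q) if p[:q].endswith(p[:k]))
    return pi


pi = prefix_function_by_definition("ababaca")
print(pi[5])   # 3
print(pi[1:])  # [0, 0, 1, 2, 3, 0, 1]
```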
Example
π[ ] measures how well the pattern matches against a shift of itself.
i      1  2  3  4  5  6  7  8  9  10
P[i]   a  b  a  b  a  b  a  b  c  a
π[i]   0  0  1  2  3  4  5  6  0  1
Iterating π from q = 8 (each line is a prefix of P that is also a suffix of P[1..8]):
P[1..8] = abababab    π[8] = 6
P[1..6] = ababab      π[6] = 4
P[1..4] = abab        π[4] = 2
P[1..2] = ab          π[2] = 0
P[1..0] = ε
Computing the Prefix Function
Compute-Prefix-Function(P)
    m ← length[P]
    π[1] ← 0
    k ← 0
    for q ← 2 to m        // invariant: k = π[q − 1]
        do while k > 0 and P[k+1] ≠ P[q]
               do k ← π[k]
           if P[k+1] = P[q]
               then k ← k+1
           π[q] ← k
    return π
Ex. q = 9 and k = 6: P[k+1] = a ≠ c = P[q], so k falls back along π: 6 → π[6] = 4 → π[4] = 2 → π[2] = 0. At each step P[k+1] = a ≠ c, hence π[9] = π[π[π[6]]] = 0.
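The pseudocode translates almost line for line into Python. The sketch below keeps the slide's 1-based indexing for π by padding the list with an unused entry 0; since Python strings are 0-based, P[k+1] becomes `p[k]` and P[q] becomes `p[q - 1]`. The function name is an illustrative choice.

```python
def compute_prefix_function(p):
    """Return pi as a list of length len(p) + 1 (pi[0] unused)."""
    m = len(p)
    pi = [0] * (m + 1)  # pi[1] = 0 already
    k = 0
    for q in range(2, m + 1):
        # Invariant: k == pi[q - 1] at the top of each iteration.
        while k > 0 and p[k] != p[q - 1]:  # P[k+1] != P[q]
            k = pi[k]                      # fall back to a shorter border
        if p[k] == p[q - 1]:               # P[k+1] == P[q]
            k += 1
        pi[q] = k
    return pi


print(compute_prefix_function("ababababca")[1:])
# [0, 0, 1, 2, 3, 4, 5, 6, 0, 1]
```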
Running-time Analysis
1  Compute-Prefix-Function(P)
2      m ← length[P]
3      π[1] ← 0
4      k ← 0
5      for q ← 2 to m
6          do while k > 0 and P[k+1] ≠ P[q]
7                 do k ← π[k]        // decrease k by at least 1
8             if P[k+1] = P[q]
9                 then k ← k+1       // at most m − 1 increments, each by 1
10            π[q] ← k
11     return π
# decrements ≤ # increments (k starts at 0 and never goes negative), thus line 7 is executed at most m − 1 times in total.
Total time Θ(m).
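The counting argument can be observed by instrumenting the loop. The sketch below runs the same computation but counts how often k is incremented and how often the fallback step k ← π[k] runs; the name `count_k_updates` is made up for illustration.

```python
def count_k_updates(p):
    """Return (increments, decrements) of k while computing pi for p."""
    m = len(p)
    pi = [0] * (m + 1)
    k = 0
    increments = decrements = 0
    for q in range(2, m + 1):
        while k > 0 and p[k] != p[q - 1]:
            k = pi[k]        # the fallback step: k strictly decreases
            decrements += 1
        if p[k] == p[q - 1]:
            k += 1           # at most one increment per value of q
            increments += 1
        pi[q] = k
    return increments, decrements


inc, dec = count_k_updates("ababababca")
print(inc, dec)  # 7 3
```

For this pattern m = 10, so both counts stay within the m − 1 = 9 bound, and the number of decrements does not exceed the number of increments.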
KMP Algorithm
KMP-Matcher(T, P)        // n = |T| and m = |P|
    π ← Compute-Prefix-Function(P)        // Θ(m) time
    q ← 0
    for i ← 1 to n
        do while q > 0 and P[q+1] ≠ T[i]
               do q ← π[q]
           if P[q+1] = T[i]
               then q ← q+1        // at most n total increments
           if q = m
               then print “Pattern occurs with shift” i − m
                    q ← π[q]
    // the for loop takes Θ(n) time
Total time Θ(m+n).
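A runnable version of the matcher might look as follows. This is a sketch under a few assumptions: it returns the list of shifts instead of printing them, it rejects an empty pattern, and it inlines the prefix-function computation so the block is self-contained. The name `kmp_matcher` is illustrative.

```python
def kmp_matcher(t, p):
    """Return all shifts s such that t[s:s + len(p)] == p."""
    n, m = len(t), len(p)
    if m == 0:
        raise ValueError("pattern must be non-empty")

    # Prefix function, 1-indexed (pi[0] unused); Theta(m) time.
    pi = [0] * (m + 1)
    k = 0
    for q in range(2, m + 1):
        while k > 0 and p[k] != p[q - 1]:
            k = pi[k]
        if p[k] == p[q - 1]:
            k += 1
        pi[q] = k

    # Scan the text; q is the number of pattern characters matched so far.
    shifts = []
    q = 0
    for i in range(1, n + 1):
        while q > 0 and p[q] != t[i - 1]:  # P[q+1] != T[i]
            q = pi[q]
        if p[q] == t[i - 1]:               # P[q+1] == T[i]
            q += 1
        if q == m:
            shifts.append(i - m)           # pattern occurs with shift i - m
            q = pi[q]                      # keep going to find overlaps
    return shifts


print(kmp_matcher("abababbababbaababbababaa", "ababbababaa"))  # [13]
print(kmp_matcher("aaaa", "aa"))                               # [0, 1, 2]
```

Resetting q to π[q] after a full match, rather than to 0, is what lets overlapping occurrences such as those in the second call be reported.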
A KMP Example
i      1  2  3  4  5  6  7  8  9  10 11
P[i]   a  b  a  b  b  a  b  a  b  a  a
π[i]   0  0  1  2  0  1  2  3  4  3  1

T = abababbababbaababbababaa, P = ababbababaa

abababbababbaababbababaa
ababbababaa                     q = 4 matched; shift by q − π[q] = 4 − 2 = 2
  ababbababaa                   q = 9 matched; shift by 9 − 4 = 5
       ababbababaa              q = 6 matched; shift by 6 − 1 = 5
            ababbababaa         q = 1 matched; shift by 1 − 0 = 1
             ababbababaa        all 11 characters match: P occurs with shift 13