prefix & suffix example w = ab is a prefix of x = abefac where y = efac. example w = cdaa is a...


Upload: felicity-johnston

Post on 18-Dec-2015


Page 1:

Prefix & Suffix

Example W = ab is a prefix of X = abefac where Y = efac.

Example W = cdaa is a suffix of X = acbecdaa where Y = acbe.

A string W is a prefix of a string X if X = WY for some string Y, denoted W ⊏ X.

A string W is a suffix of a string X if X = YW for some string Y, denoted W ⊐ X.

The empty string ε is a prefix of any string.

ε is also a suffix of any string.
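The two definitions translate directly into a check. The following is a minimal Python sketch; the helper names `is_prefix` and `is_suffix` are illustrative and do not come from the slides.

```python
def is_prefix(w: str, x: str) -> bool:
    """Return True if x = w + y for some string y."""
    return x[:len(w)] == w


def is_suffix(w: str, x: str) -> bool:
    """Return True if x = y + w for some string y."""
    # Guard the length first so the slice start is never negative.
    return len(w) <= len(x) and x[len(x) - len(w):] == w


print(is_prefix("ab", "abefac"))      # True, with y = "efac"
print(is_suffix("cdaa", "acbecdaa"))  # True, with y = "acbe"
print(is_prefix("", "abefac"), is_suffix("", "abefac"))  # True True
```

Python's built-in `str.startswith` and `str.endswith` perform the same tests.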

Page 2:

Overlapping Suffix

Lemma. Suppose X ⊐ Z and Y ⊐ Z.

a) if |X| ≤ |Y|, then X ⊐ Y;
b) if |X| ≥ |Y|, then Y ⊐ X;
c) if |X| = |Y|, then X = Y.

[Figure: X and Y aligned at the right end of Z, one panel for each of the cases a), b) and c).]
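The lemma can be sanity-checked by brute force on a small string. This is only an illustrative sketch, not a proof; the function name `check_overlapping_suffix` and the choice of Z are assumptions made for the example.

```python
def is_suffix(w: str, x: str) -> bool:
    """Return True if w is a suffix of x."""
    return x.endswith(w)


def check_overlapping_suffix(x: str, y: str, z: str) -> bool:
    """Check the lemma's three conclusions for x and y, both suffixes of z."""
    assert is_suffix(x, z) and is_suffix(y, z), "precondition of the lemma"
    if len(x) <= len(y) and not is_suffix(x, y):
        return False  # case a) would be violated
    if len(x) >= len(y) and not is_suffix(y, x):
        return False  # case b) would be violated
    if len(x) == len(y) and x != y:
        return False  # case c) would be violated
    return True


z = "acbecdaa"
suffixes = [z[i:] for i in range(len(z) + 1)]  # includes the empty suffix
print(all(check_overlapping_suffix(x, y, z) for x in suffixes for y in suffixes))  # True
```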

Page 3:

The Knuth-Morris-Pratt Algorithm

Use one auxiliary function (prefix function).

Achieve running time O(n+m)!

Key idea for improvement: instead of precomputing the transition function in O(m |Σ|) time, efficiently compute it "on the fly" as needed.
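One way to picture "on the fly" is a function that derives a single automaton transition from the prefix function whenever it is needed, instead of storing a full table over the alphabet. The sketch below is an assumption-laden illustration: the names `compute_prefix_function` and `delta` and the sample pattern are chosen for the example, and Python's 0-based lists mean `pi[q - 1]` plays the role of π[q].

```python
def compute_prefix_function(p: str) -> list[int]:
    """pi[q - 1] holds pi[q] in 1-based notation."""
    pi = [0] * len(p)
    k = 0
    for q in range(1, len(p)):
        while k > 0 and p[k] != p[q]:
            k = pi[k - 1]
        if p[k] == p[q]:
            k += 1
        pi[q] = k
    return pi


def delta(p: str, pi: list[int], q: int, a: str) -> int:
    """Transition from state q (characters matched so far) on character a."""
    # Fall back through shorter prefixes until one can be extended by a.
    while q > 0 and (q == len(p) or p[q] != a):
        q = pi[q - 1]
    if q < len(p) and p[q] == a:
        q += 1
    return q


p = "ababaca"
pi = compute_prefix_function(p)
print([delta(p, pi, 5, a) for a in "abc"])  # [1, 4, 6]
```

Only the m values of the prefix function are stored; each transition is recovered by following them.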


Page 4:

Minimum Shifting

Question: What is the least shift s' > s?

[Figure: pattern P aligned against text T at shift s, with q matching characters P[1..q] = T[s+1..s+q]; below it, P is drawn again at the minimum next shift s'.]

Page 5:

The Prefix Function

How much to shift depends on the pattern, not the text.

The prefix function π measures the length of the longest prefix of P[1..m] that is also a proper suffix of P[1..q].

π[q] = max{ k : k < q and P[1..k] is a suffix of P[1..q] }

T = b a c b a b a b a a b c b a b
P = a b a b a c a

P[1..q] = ababa (q = 5 matching characters)

π[5] = 3, so P[1..k] = aba

shift by 2 (= q - π[q] = 5 - 3)
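The definition of π can be evaluated literally, which is slow but useful for checking small cases such as the one above. This is a minimal sketch; `prefix_function_naive` is a hypothetical name, and the returned list is 0-based, so `pi[q - 1]` corresponds to π[q].

```python
def prefix_function_naive(p: str) -> list[int]:
    """pi[q - 1] = max k < q such that p[:k] is a suffix of p[:q]."""
    pi = []
    for q in range(1, len(p) + 1):
        k = q - 1
        # Try the longest proper prefix first and shrink until it fits.
        while k > 0 and not p[:q].endswith(p[:k]):
            k -= 1
        pi.append(k)
    return pi


pi = prefix_function_naive("ababaca")
print(pi[5 - 1])      # 3, i.e. pi[5] = 3 in 1-based notation
print(5 - pi[5 - 1])  # 2, the shift for q = 5
```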

Page 6:

Example

π[ ] measures how well the pattern matches against a shift of itself.

i      1  2  3  4  5  6  7  8  9  10
P[i]   a  b  a  b  a  b  a  b  c  a
π[i]   0  0  1  2  3  4  5  6  0  1

Ex. Applying π repeatedly, starting from q = 8, lists the prefixes of P = a b a b a b a b c a that are also suffixes of P[1..8]:

P[1..8] = abababab
P[1..6] = ababab, since π[8] = 6
P[1..4] = abab, since π[6] = 4
P[1..2] = ab, since π[4] = 2
ε (the empty string), since π[2] = 0
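The chain of values visited by repeatedly applying π can be followed mechanically. The sketch below copies the π table for P = ababababca from the example; the name `prefix_chain` and the dictionary used to keep 1-based indices are assumptions made for illustration.

```python
def prefix_chain(pi: dict[int, int], q: int) -> list[int]:
    """Return q, pi[q], pi[pi[q]], ... down to 0."""
    chain = [q]
    while q > 0:
        q = pi[q]
        chain.append(q)
    return chain


p = "ababababca"
values = [0, 0, 1, 2, 3, 4, 5, 6, 0, 1]
pi = {i: v for i, v in enumerate(values, start=1)}  # 1-based, as on the slide

for k in prefix_chain(pi, 8):
    # Each P[1..k] in the chain is a suffix of P[1..8].
    print(k, repr(p[:k]), p[:8].endswith(p[:k]))
```

The lengths printed are 8, 6, 4, 2 and 0, and every line ends with `True`.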

Page 7:

Computing the Prefix Function

Compute-Prefix-Function(P)
    m ← length[P]
    π[1] ← 0
    k ← 0
    for q ← 2 to m                // invariant: k = π[q - 1]
        do while k > 0 and P[k+1] ≠ P[q]
               do k ← π[k]
           if P[k+1] = P[q]
               then k ← k+1
           π[q] ← k
    return π

Ex. q = 9 and k = 6: P[k+1] = a ≠ c = P[q], and the same mismatch recurs at every fallback, so π[9] = π[π[π[6]]] = 0.

[Figure: P = a b a b a b a b c a compared at positions k+1 and q, with k = 6, 4, 2, 0 in turn; after each mismatch the next position tried is π[k]+1.]
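The pseudocode carries over to Python almost line for line. The following is a sketch under one stated assumption: indices are 0-based, so `pi[q]` here stores what the slides call π[q+1], and `p[k]` is the slides' P[k+1].

```python
def compute_prefix_function(p: str) -> list[int]:
    """Return pi where pi[i] is the slides' pi[i + 1]."""
    m = len(p)
    pi = [0] * m
    k = 0  # invariant at the top of each iteration: k = pi[q - 1]
    for q in range(1, m):
        # Fall back to shorter prefixes until one can be extended by p[q].
        while k > 0 and p[k] != p[q]:
            k = pi[k - 1]
        if p[k] == p[q]:
            k += 1
        pi[q] = k
    return pi


print(compute_prefix_function("ababababca"))
# [0, 0, 1, 2, 3, 4, 5, 6, 0, 1]
```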

Page 8:

Running-time Analysis

1   Compute-Prefix-Function(P)
2       m ← length[P]
3       π[1] ← 0
4       k ← 0
5       for q = 2 to m
6           do while k > 0 and P[k+1] ≠ P[q]
7                  do k ← π[k]          // decrease k by at least 1
8              if P[k+1] = P[q]
9                  then k ← k+1         // at most m - 1 increments, each by 1
10             π[q] ← k
11      return π

# decrements ≤ # increments, thus line 7 is executed at most m - 1 times in total.

Total time Θ(m).
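The counting argument can be observed directly by instrumenting the loop. This is a rough sketch for experimentation; `count_prefix_function_steps` is a hypothetical helper, and the sample patterns are arbitrary choices.

```python
def count_prefix_function_steps(p: str) -> tuple[int, int]:
    """Return (executions of the fallback step, increments of k) for pattern p."""
    pi = [0] * len(p)
    k = decrements = increments = 0
    for q in range(1, len(p)):
        while k > 0 and p[k] != p[q]:
            k = pi[k - 1]      # corresponds to line 7
            decrements += 1
        if p[k] == p[q]:
            k += 1             # corresponds to line 9
            increments += 1
        pi[q] = k
    return decrements, increments


for p in ["ababababca", "ababbababaa", "aaaaaaaab"]:
    dec, inc = count_prefix_function_steps(p)
    # The last column checks: decrements <= increments <= m - 1.
    print(p, dec, inc, dec <= inc <= len(p) - 1)
```

Since k never drops below 0 and only rises by 1 per iteration, the final column is `True` for every pattern.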

Page 9:

KMP Algorithm

KMP-Matcher(T, P)                           // n = |T| and m = |P|
    π ← Compute-Prefix-Function(P)          // Θ(m) time
    q ← 0
    for i ← 1 to n                          // Θ(n) time for the whole loop
        do while q > 0 and P[q+1] ≠ T[i]
               do q ← π[q]
           if P[q+1] = T[i]
               then q ← q+1                 // at most n total increments
           if q = m
               then print "Pattern occurs with shift" i - m
                    q ← π[q]

Total time Θ(m+n).
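A runnable counterpart to the matcher might look as follows. It is a sketch under a few assumptions not taken from the slides: indices are 0-based, shifts are returned in a list instead of printed, and an empty pattern is treated as matching at every position.

```python
def compute_prefix_function(p: str) -> list[int]:
    """pi[i] is the slides' pi[i + 1]."""
    pi = [0] * len(p)
    k = 0
    for q in range(1, len(p)):
        while k > 0 and p[k] != p[q]:
            k = pi[k - 1]
        if p[k] == p[q]:
            k += 1
        pi[q] = k
    return pi


def kmp_matcher(t: str, p: str) -> list[int]:
    """Return every shift s such that t[s:s + len(p)] == p."""
    if not p:
        return list(range(len(t) + 1))  # assumption: empty pattern occurs everywhere
    pi = compute_prefix_function(p)
    shifts = []
    q = 0  # number of pattern characters currently matched
    for i, ch in enumerate(t):
        while q > 0 and p[q] != ch:
            q = pi[q - 1]
        if p[q] == ch:
            q += 1
        if q == len(p):
            shifts.append(i + 1 - len(p))  # same value as i - m with 1-based i
            q = pi[q - 1]                  # keep going to find overlapping matches
    return shifts


print(kmp_matcher("abcabcabc", "abc"))  # [0, 3, 6]
print(kmp_matcher("aaaa", "aa"))        # [0, 1, 2]
```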

Page 10:

A KMP Example

i      1  2  3  4  5  6  7  8  9  10 11
P[i]   a  b  a  b  b  a  b  a  b  a  a
π[i]   0  0  1  2  0  1  2  3  4  3  1

T = abababbababbaababbababaa
P = ababbababaa

Shift 0: q = 4 characters match; shift by q - π[q] = 4 - 2 = 2.
Shift 2: q = 9 characters match; shift by 9 - 4 = 5.
Shift 7: q = 6 characters match; shift by 6 - 1 = 5.
Shift 12: q = 1 character matches; shift by 1 - 0 = 1.
Shift 13: all 11 characters match; the pattern occurs with shift 13.
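The sequence of shifts in this example can be reproduced by recording, at each mismatch, the current shift, the number q of matched characters and the jump q - π[q]. The sketch below is illustrative only: `kmp_trace` is a hypothetical helper, indices are 0-based, and mismatches that occur with q = 0 are not recorded.

```python
def compute_prefix_function(p: str) -> list[int]:
    """pi[i] is the slides' pi[i + 1]."""
    pi = [0] * len(p)
    k = 0
    for q in range(1, len(p)):
        while k > 0 and p[k] != p[q]:
            k = pi[k - 1]
        if p[k] == p[q]:
            k += 1
        pi[q] = k
    return pi


def kmp_trace(t: str, p: str) -> tuple[list[tuple[int, int, int]], list[int]]:
    """Return ([(shift, q, jump), ...] for each fallback, [match shifts])."""
    pi = compute_prefix_function(p)
    fallbacks, matches = [], []
    q = 0
    for i, ch in enumerate(t):
        while q > 0 and p[q] != ch:
            # Pattern currently starts at i - q; it moves right by q - pi[q].
            fallbacks.append((i - q, q, q - pi[q - 1]))
            q = pi[q - 1]
        if p[q] == ch:
            q += 1
        if q == len(p):
            matches.append(i + 1 - len(p))
            q = pi[q - 1]
    return fallbacks, matches


fallbacks, matches = kmp_trace("abababbababbaababbababaa", "ababbababaa")
print(fallbacks)  # [(0, 4, 2), (2, 9, 5), (7, 6, 5), (12, 1, 1)]
print(matches)    # [13]
```

The jumps 2, 5, 5 and 1 add up to 13, the shift at which the pattern is found.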