
Fast Parallel and Serial Approximate String Matching

Journal of Algorithms, Vol. 10 (1989), pp. 157-169.

G. Landau and U. Vishkin

Advisor: Prof. R. C. T. Lee

Speaker: L. Y. Huang

Problem

• Given two arrays, P = p1p2…pm (the pattern) and T = t1t2…tn (the text), and an integer k (k ≧ 1), find all occurrences of the pattern in the text with edit distance at most k.


• This algorithm improves the Alternative Dynamic Programming Computation.

• First, we introduce the Dynamic Programming Computation.

The Dynamic Programming Algorithm [S80]

• In the dynamic programming approach, we construct an (n+1) × (m+1) matrix D, where Di,j is the minimum edit distance between P(1, j) and any substring of T that ends at Ti.

• Example:

T = gggtcta

P = gttc

k = 2

    i          0  1  2  3  4  5  6  7
    T             g  g  g  t  c  t  a
    j = 0      0  0  0  0  0  0  0  0
    j = 1  g   1  0  0  0  1  1  1  1
    j = 2  t   2  1  1  1  0  1  1  2
    j = 3  t   3  2  2  2  1  1  1  2
    j = 4  c   4  3  3  3  2  1  2  2

• We found the following substrings of T whose edit distance to P = gttc is at most k = 2:

– (1) gt, ending at T4: Distance = 2

– (2) gtc, ending at T5: Distance = 1

– (3) gtct, ending at T6: Distance = 2

– (4) gtct, ending at T6, under a different alignment: Distance = 2

– (5) gtcta, ending at T7: Distance = 2

An Alternative Dynamic Programming Computation

• We make heavy use of the concept of a diagonal.

• Diagonal d is defined as all of the Di,j’s where d = i – j.

• Example with T = abc and P = bc. Diagonal 0 contains D0,0, D1,1 and D2,2; Diagonal 2 contains D2,0 and D3,1.

    i          0  1  2  3
    T             a  b  c
    j = 0      0  0  0  0
    j = 1  b   1  1  0  1
    j = 2  c   2  2  1  0

• We first have the following:

– (a) If Ti = Pj, Di,j = Di-1,j-1;

– (b) otherwise, Di,j is the minimum of Di-1,j-1 + 1 (substitution), Di,j-1 + 1 (deletion) and Di-1,j + 1 (insertion).
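The recurrence can be sketched in Python as follows. This is a minimal illustration, not code from the paper: the function name `sellers_table` and the list layout `D[j][i]` (row j for the pattern, column i for the text) are assumptions made here. Row 0 is all zeros because an occurrence may start anywhere in the text, and column 0 holds j because matching P(1, j) against the empty substring costs j edits.

```python
def sellers_table(text, pattern):
    """Return D as a list of rows, where D[j][i] plays the role of Di,j."""
    n, m = len(text), len(pattern)
    # Row j = 0: the empty pattern prefix matches everywhere at cost 0.
    D = [[0] * (n + 1) for _ in range(m + 1)]
    for j in range(1, m + 1):
        D[j][0] = j  # P(1, j) against the empty substring of T
        for i in range(1, n + 1):
            if text[i - 1] == pattern[j - 1]:
                # Case (a): Ti = Pj
                D[j][i] = D[j - 1][i - 1]
            else:
                # Case (b): substitution, deletion, insertion
                D[j][i] = 1 + min(D[j - 1][i - 1], D[j - 1][i], D[j][i - 1])
    return D


if __name__ == "__main__":
    for row in sellers_table("gggtcta", "gttc"):
        print(row)
```

Every entry D[m][i] that is at most k marks an occurrence ending at Ti; computing the whole table costs O(mn) time.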


• Consider any diagonal d. Let us find the largest j, if it exists, such that (i,j) is on Diagonal d (i - j = d) and Di,j = 0.

• Let us now label all of these locations.

    i          0  1  2  3  4  5  6  7
    T             g  g  g  t  c  t  a
    j = 0      0  0  0  0  0  0  0  0
    j = 1  g      0  0  0
    j = 2  t               0
    j = 3  t
    j = 4  c

(Diagonals 0, 1 and 2 are marked in the figure.)

• Having found the above locations (i, j) where Di,j = 0, we can further find the largest j, if it exists, such that (i, j) is on Diagonal d and Di,j = 1.

• To do this, we use the following observation: Each element in Diagonal d can only influence elements in Diagonals d-1, d and d+1.


• Let us consider any (i, j) location on Diagonal d.

• Why can Di,j suddenly become 1?

– It can only be influenced by its three neighbors: Di-1,j-1 on Diagonal d (substitution), and Di,j-1 on Diagonal d+1 and Di-1,j on Diagonal d-1 (deletion or insertion).

• Thus, we conclude that we only need to consider Diagonals d-1, d and d+1.

• Let us consider the following table, in which D4,3 on Diagonal d = 1 is still unknown.

    i          0  1  2  3  4  5  6  7
    T             g  g  g  t  c  t  a
    j = 0      0  0  0  0  0  0  0  0
    j = 1  g      0  0  0
    j = 2  t               0
    j = 3  t               ?
    j = 4  c

• Question: what is the value of D4,3?

– It cannot be 0, because we have already determined that the largest j with Di,j = 0 on Diagonal 1 is 1. Thus D4,3 = 1.

• Question: What is the value of D5,4?

– Since T5 = P4, D5,4 = D4,3 = 1.

    i          0  1  2  3  4  5  6  7
    T             g  g  g  t  c  t  a
    j = 0      0  0  0  0  0  0  0  0
    j = 1  g      0  0  0
    j = 2  t               0
    j = 3  t               1
    j = 4  c                  ?

(Diagonal d = 1 is marked in the figure.)

• Based upon the above discussion, we can find all (i, j)’s where Di,j = 1 after finding all (i’, j’)’s where Di’,j’ = 0.

• In fact, after finding all (i, j)’s where Di,j = e, we can find all (i’, j’)’s where Di’,j’ = e+1. Thus the full dynamic programming table does not have to be computed.

• In the following, we give the Alternative Dynamic Programming Computation method formally.

• Let Ld,e denote the largest row j such that Di,j is on Diagonal d (i - j = d) and Di,j = e.

• Based upon this definition, e is the minimum edit distance between P(1, Ld,e) and any substring of T ending at T[Ld,e + d], and P[Ld,e + 1] ≠ T[Ld,e + d + 1].

• Example: let d = 3. Then L3,0 = 0, L3,1 = 3 and L3,2 = 4.

    i          0  1  2  3  4  5  6  7
    T             g  g  g  t  c  t  a
    j = 0      0  0  0  0  0  0  0  0
    j = 1  g   1  0  0  0  1  1  1  1
    j = 2  t   2  1  1  1  0  1  1  2
    j = 3  t   3  2  2  2  1  1  1  2
    j = 4  c   4  3  3  3  2  1  2  2
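To make the definition of Ld,e concrete, the following sketch reads it directly off a fully computed table. It is only an illustration of the definition, since the point of the method is to avoid building D. The helper name `largest_row` and the layout `D[j][i]` are assumptions made here; the table values are the ones for T = gggtcta and P = gttc.

```python
# D[j][i] holds Di,j for T = gggtcta (columns i = 0..7) and P = gttc (rows j = 0..4).
D = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 1, 1, 1, 1],
    [2, 1, 1, 1, 0, 1, 1, 2],
    [3, 2, 2, 2, 1, 1, 1, 2],
    [4, 3, 3, 3, 2, 1, 2, 2],
]


def largest_row(D, d, e):
    """Return the largest j with Di,j = e on Diagonal d (i - j = d), or None."""
    m, n = len(D) - 1, len(D[0]) - 1
    best = None
    for j in range(m + 1):
        i = j + d  # the cell of row j that lies on Diagonal d
        if 0 <= i <= n and D[j][i] == e:
            best = j
    return best


if __name__ == "__main__":
    print([largest_row(D, 3, e) for e in range(3)])
```

For Diagonal 3 this walks the cells D3,0, D4,1, D5,2, D6,3, D7,4, whose values are 0, 1, 1, 1, 2.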


• Example:

– T = gggtcta

– P = gttc

– k = 2

• Now, L3,1 = 3. It means that we have found a substring A, which is T(3,6) = gtct, ending at T[Ld,e + d] = T[3+3] = T6, such that the edit distance between A and P(1,3) = gtt is 1.

• P[Ld,e + 1] ≠ T[Ld,e + d + 1], that is, P[3+1] = P4 = c ≠ T[3+3+1] = T7 = a.

    i          0  1  2  3  4  5  6  7
    T             g  g  g  t  c  t  a
    j = 0      0  0  0  0  0  0  0  0
    j = 1  g   1  0  0  0  1  1  1  1
    j = 2  t   2  1  1  1  0  1  1  2
    j = 3  t   3  2  2  2  1  1  1  2
    j = 4  c   4  3  3  3  2  1  2  2

• Example:

– T = gggtcta

– P = gttc

– k = 2

• Now, L1,1 = 4 = m. It means that we have found a substring A, which is T(2,5) = ggtc, ending at T[Ld,e + d] = T[4+1] = T5, such that the edit distance between A and P(1,4) = gttc is 1.

• They are T(2,5) = ggtc and P = gttc.

    i          0  1  2  3  4  5  6  7
    T             g  g  g  t  c  t  a
    j = 0      0  0  0  0  0  0  0  0
    j = 1  g   1  0  0  0  1  1  1  1
    j = 2  t   2  1  1  1  0  1  1  2
    j = 3  t   3  2  2  2  1  1  1  2
    j = 4  c   4  3  3  3  2  1  2  2

• The alternative dynamic programming computation computes the values Ld,e.

An Alternative Dynamic Programming Computation

• First, we set the initial values.

• Example:

– T = gggtcta

– P = gttc

[Table: the D table for T and P with only row 0 (all zeros) and the initial entries filled in.]

• e = 0

• For d = 0 to d = n, find the largest j such that P[1…j] is equal to T[d+1…i], and set Ld,0 = j.

• d = 0

• P1 = T1, L0,0 = 1

[Table: partial D table for T = gggtcta and P = gttc, with Diagonal d = 0 marked.]

• e = 0

• d = 1

• P1 = T2, L1,0 = 1

[Table: partial D table for T = gggtcta and P = gttc, with Diagonal d = 1 marked.]

• e = 0

• d = 2

• P1 = T3, P2 = T4, L2,0 = 2

[Table: partial D table for T = gggtcta and P = gttc, with Diagonal d = 2 marked.]
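The e = 0 pass can be sketched as a direct character-by-character extension along each diagonal. This is a minimal illustration under assumptions made here: the function name `exact_prefix_lengths` is hypothetical, strings are 0-indexed, and the result maps each d to Ld,0.

```python
def exact_prefix_lengths(text, pattern):
    """For each diagonal d = 0..n, return Ld,0: the largest j with P[1..j] = T[d+1..d+j]."""
    n, m = len(text), len(pattern)
    L0 = {}
    for d in range(n + 1):
        j = 0
        # Extend while the next pattern character matches the next text character.
        while j < m and d + j < n and pattern[j] == text[d + j]:
            j += 1
        L0[d] = j
    return L0


if __name__ == "__main__":
    print(exact_prefix_lengths("gggtcta", "gttc"))
```

For T = gggtcta and P = gttc this gives L0,0 = 1, L1,0 = 1 and L2,0 = 2, as on the slides, and 0 for the remaining diagonals.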


• Our approach is based upon Rule 1 proposed by Professor Lee.

• Consider two substrings A1 and A2, each made of a prefix followed by a suffix, as shown below:

A1 = P1 S1

A2 = P2 S2

• If d(A1, A2) ≦ k and S1 = S2, then d(P1, P2) ≦ k.

• Observe the following:

• If d(A1, A2) = k, S1 = S2 and x ≠ y, then d(A1+S1+x, A2+S2+y) ≦ k+1.

A1 S1 x

A2 S2 y
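Both observations can be checked on small strings with an ordinary edit distance routine. The sketch below is an illustration only; the function name `edit_distance` and the example strings are assumptions made here, not taken from the slides.

```python
def edit_distance(a, b):
    """Classic edit distance between whole strings a and b, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # drop ca
                           cur[j - 1] + 1,             # drop cb
                           prev[j - 1] + (ca != cb)))  # match or substitute
        prev = cur
    return prev[-1]


if __name__ == "__main__":
    P1, P2, S, x, y = "gg", "g", "t", "c", "t"
    k = edit_distance(P1 + S, P2 + S)
    # Rule 1: removing the common suffix S keeps the distance within k.
    print(k, edit_distance(P1, P2))
    # Appending S and two different characters costs at most one more edit.
    print(edit_distance(P1, P2), edit_distance(P1 + S + x, P2 + S + y))
```

A shared run S therefore costs nothing, and the first mismatch after it costs at most one extra edit, which is why each step from e-1 to e only needs to extend along a diagonal until the first mismatch.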


• For e ≠ 0, we search through d = -e to d = n.

• Let row = max[(Ld,e-1 + 1), (Ld-1,e-1), (Ld+1,e-1 + 1)], where the three terms correspond to substitution, deletion and insertion, respectively.

• Find the largest j, if it exists, such that P(row+1, j) = T(row+1+d, i) = T(row+1+i-j, i), and set Ld,e = j. If no such j exists, set Ld,e = row.

• Let row = max[(Ld,e-1 + 1), (Ld-1,e-1), (Ld+1,e-1 + 1)], where the three terms correspond to substitution, deletion and insertion, respectively.

[Figure: Ld,e-1 on Diagonal d (substitution), Ld-1,e-1 on Diagonal d-1 (deletion) and Ld+1,e-1 on Diagonal d+1 (insertion).]

• d = -1, e = 1

• row = max[(Ld,e-1 + 1), (Ld-1,e-1), (Ld+1,e-1 + 1)] = max[0+1, 0, 1+1] = max[1, 0, 2] = 2, using the boundary values L-1,0 = 0 and L-2,0 = 0.

• P(row+1, j) ≠ T(row+1+d, i), since P3 ≠ T2.

• L-1,1 = 2

[Table: partial D table for T = gggtcta and P = gttc, with Diagonal d = -1 marked.]

• d = 0, e = 1

• row = max[(Ld,e-1 + 1), (Ld-1,e-1), (Ld+1,e-1 + 1)] = max[1+1, 0, 1+1] = max[2, 0, 2] = 2, using the boundary value L-1,0 = 0.

• P(row+1, j) ≠ T(row+1+d, i), since P3 ≠ T3.

• L0,1 = 2

[Table: partial D table for T = gggtcta and P = gttc, with Diagonal d = 0 marked.]

• d = 1, e = 1

• row = max[(Ld,e-1 + 1), (Ld-1,e-1), (Ld+1,e-1 + 1)] = max[1+1, 1, 2+1] = max[2, 1, 3] = 3

• P(row+1, j) = T(row+1+d, i), since P4 = T5 = c.

• L1,1 = 4 = m

• We find an occurrence of the pattern in the text with edit distance at most 1 that ends at T[d+m] = T[1+4] = T5.

[Table: partial D table for T = gggtcta and P = gttc, with Diagonal d = 1 marked.]

• d = 3, e = 1

• row = max[(Ld,e-1 + 1), (Ld-1,e-1), (Ld+1,e-1 + 1)] = max[0+1, 2, 0+1] = max[1, 2, 1] = 2

• P(row+1, j) = T(row+1+d, i): P3 = T6, but P4 ≠ T7.

• L3,1 = 3

[Table: partial D table for T = gggtcta and P = gttc, with Diagonal d = 3 marked.]

• d = 3, e = 2

• row = max[(Ld,e-1 + 1), (Ld-1,e-1), (Ld+1,e-1 + 1)] = max[3+1, 3, 2+1] = max[4, 3, 3] = 4

• L3,2 = 4 = m

• We find an occurrence of the pattern in the text with edit distance at most 2 that ends at T[d+m] = T[3+4] = T7.

[Table: partial D table for T = gggtcta and P = gttc, with Diagonal d = 3 marked.]

An Alternative Dynamic Programming Computation

Initialization
    for all d, 0 ≦ d ≦ n, Ld,-1 = -1
    for all d, -(k+1) ≦ d ≦ -1, Ld,|d|-1 = |d|-1, Ld,|d|-2 = |d|-2
    for all e, -1 ≦ e ≦ k, Ln+1,e = -1

For e = 0 to k do
    For d = -e to n do
        row = max[(Ld,e-1 + 1), (Ld-1,e-1), (Ld+1,e-1 + 1)]
        row = min(row, m)
        while row < m and row + d < n and P[row+1] = T[row+1+d] do
            row = row + 1
        Ld,e = row
        if Ld,e = m then
            print *there is an occurrence ending at T[d+m]*
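The pseudocode translates into Python almost line by line. The sketch below is an illustration under assumptions made here, not the authors' code: the function name `diagonal_matching` is hypothetical, L is stored in a dictionary keyed by (d, e), and the boundary values for negative diagonals are |d|-1 and |d|-2, which makes the first cell of Diagonal d (row |d|, cost |d|) the starting point. The guard `0 < d + m <= n` is an addition made here so that only positions inside the text are reported.

```python
def diagonal_matching(text, pattern, k):
    """Return (L, ends): L[(d, e)] = Ld,e, and ends maps an end position i
    (1-based) to the smallest e for which an occurrence ending at Ti was found."""
    n, m = len(text), len(pattern)
    L = {}
    # Boundary values.
    for d in range(0, n + 1):
        L[(d, -1)] = -1
    for d in range(-(k + 1), 0):
        L[(d, abs(d) - 1)] = abs(d) - 1
        L[(d, abs(d) - 2)] = abs(d) - 2
    for e in range(-1, k + 1):
        L[(n + 1, e)] = -1

    ends = {}
    for e in range(k + 1):
        for d in range(-e, n + 1):
            row = max(L[(d, e - 1)] + 1, L[(d - 1, e - 1)], L[(d + 1, e - 1)] + 1)
            row = min(row, m)
            # Slide down Diagonal d while pattern and text keep matching.
            while row < m and row + d < n and pattern[row] == text[row + d]:
                row += 1
            L[(d, e)] = row
            if row == m and 0 < d + m <= n:
                ends.setdefault(d + m, e)  # keep the first (smallest) e
    return L, ends


if __name__ == "__main__":
    L, ends = diagonal_matching("gggtcta", "gttc", 2)
    print(sorted(ends.items()))
```

On the running example this reports occurrences ending at T4, T5, T6 and T7, with T5 found at e = 1 and the others at e = 2, and it reproduces L-1,1 = 2, L0,1 = 2, L1,1 = 4, L3,1 = 3 and L3,2 = 4. The inner while loop is the part that the suffix tree and LCA queries later replace.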


Differences in this algorithm

• In the alternative dynamic programming computation, we must search for j such that P(row+1, j) = T(row+1+d, i) = T(row+1+i-j, i).

• Essentially, we are looking for S1 and S2 in T and P respectively, as shown below:

A1 S1 x

A2 S2 y

• This paper uses the LCA (lowest common ancestor) to speed up this search.

Algorithm

• This algorithm has two steps:

– Concatenate the text and the pattern into one string t1,…,tn,p1,…,pm. Compute the “suffix tree” of this string.

– Find all occurrences of the pattern in the text with edit distance at most k.

T = ABCDEA

P = DDBE

S = ABCDEADDBE

The suffix tree of a string of length n can be constructed in O(n) time (Weiner, 1973; McCreight, 1976; Ukkonen, 1995).

[Figure: suffix tree of ABCDEADDBE$, with leaves numbered 1 to 10 by the starting position of each suffix.]

The lowest common ancestor of two leaf nodes can be found in O(1) time after O(n) preprocessing (Harel and Tarjan, 1984).

[Figure: the same suffix tree of ABCDEADDBE$.]

• To find such an S, if it exists, we concatenate T and P to form a new string.

• Obviously, on the suffix tree, suffixes S1 and S2 have a common ancestor S.

[Figure: in T, A1 is followed by S and then x; in P, A2 is followed by S and then y. The suffix S1 begins with S x and the suffix S2 begins with S y.]

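• If we want to compute L3,1, we will use L2,0, L3,0, L4,0 to decide the row value (row = 2).

[Table: partial D table for the text gggtctac and the pattern gttaa, with Diagonal d = 3 marked.]

• In this paper, we find that the length of LCA2,3 is 2. With q = 2, L3,1 = row + q = 4.

• Here S1 is the suffix starting at T[row+1+d] = T6 and S2 is the suffix starting at P[row+1] = P3; they share the prefix ta.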
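The query answered by the LCA is a longest common extension: the length of the longest common prefix of two suffixes of the concatenated string. The sketch below answers the same query with a different, swapped-in structure: a suffix array, an LCP array (Kasai's method) and a sparse table for range minima, instead of the suffix tree with LCA preprocessing used in the paper. The suffix array is built naively by sorting, so the construction here is not linear time; only the query is O(1). The names `build_lce` and `lce` are assumptions made here.

```python
def build_lce(s):
    """Return lce(a, b): length of the longest common prefix of s[a:] and s[b:]."""
    n = len(s)
    sa = sorted(range(n), key=lambda i: s[i:])  # naive suffix array
    rank = [0] * n
    for r, i in enumerate(sa):
        rank[i] = r

    # Kasai: lcp[r] = common prefix length of suffixes sa[r-1] and sa[r].
    lcp = [0] * n
    h = 0
    for i in range(n):
        if rank[i] > 0:
            j = sa[rank[i] - 1]
            while i + h < n and j + h < n and s[i + h] == s[j + h]:
                h += 1
            lcp[rank[i]] = h
            if h > 0:
                h -= 1
        else:
            h = 0

    # Sparse table: table[p][i] = min(lcp[i .. i + 2**p - 1]).
    table = [lcp]
    p = 1
    while (1 << p) <= n:
        prev, half = table[-1], 1 << (p - 1)
        table.append([min(prev[i], prev[i + half])
                      for i in range(n - (1 << p) + 1)])
        p += 1

    def lce(a, b):
        if a == b:
            return n - a
        lo, hi = sorted((rank[a], rank[b]))
        lo += 1  # the answer is min(lcp[lo+1 .. hi])
        p = (hi - lo + 1).bit_length() - 1
        return min(table[p][lo], table[p][hi - (1 << p) + 1])

    return lce


if __name__ == "__main__":
    text, pattern = "gggtctac", "gttaa"
    lce = build_lce(text + pattern + "$")
    row, d = 2, 3
    # Compare the suffix starting at T[row+1+d] with the one starting at P[row+1].
    q = lce(row + d, len(text) + row)
    print(q, row + q)
```

In the diagonal computation, q would additionally be capped by m - row and n - (row + d), because without a separator a text suffix continues into the pattern part of the concatenated string.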


S = gggtctacgttaa (the text gggtctac followed by the pattern gttaa)

[Figure: suffix tree of S$.]

Time Complexity

• An alternative Dynamic Programming Computation takes O(mn) time.

• The suffix tree has O(n) nodes.

• LCA query responds in O(1) time.

• For each of the n+k+1 diagonals, we evaluate k+1 values Ld,e.

• This algorithm takes O(nk) time.
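The O(nk) bound can be written out as a short calculation. It assumes, as stated above, that each Ld,e costs one O(1) LCA query, and additionally that the suffix tree and the LCA preprocessing for the concatenated string of length n+m take linear time and that 1 ≤ k ≤ m ≤ n.

```latex
\underbrace{O(n+m)}_{\text{suffix tree and LCA preprocessing}}
\;+\;
\underbrace{(n+k+1)}_{\text{diagonals}}
\cdot
\underbrace{(k+1)}_{\text{values } L_{d,e}}
\cdot
\underbrace{O(1)}_{\text{LCA query}}
\;=\; O(n) + O(nk + k^{2}) \;=\; O(nk)
```

By comparison, filling in the full table D costs O(mn), so the saving grows as k becomes small relative to m.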


References

• [AHU-74] A. V. AHO, J. E. HOPCROFT, AND J. D. ULLMAN, “The Design and Analysis of Computer Algorithms,” Addison-Wesley, Reading, MA, 1974.

• [AILSV-88] A. APOSTOLICO, C. ILIOPOULOS, G. M. LANDAU, B. SCHIEBER, AND U. VISHKIN, Parallel construction of a suffix tree with applications, Algorithmica 3 (1988), 347-365.

• [BM-77] R. S. BOYER AND J. S. MOORE, A fast string searching algorithm, Comm. ACM 20 (1977), 762-772.

• [CS-85] M. T. CHEN AND J. SEIFERAS, Efficient and elegant subword tree construction, in “Combinatorial Algorithms on Words” (A. Apostolico and Z. Galil, Eds.), NATO ASI Series F: Computer and System Sciences Vol. 12, pp. 97-107, Springer-Verlag, New York/Berlin, 1985.

• [G-84] Z. GALIL, Optimal parallel algorithms for string matching, in “Proceedings, 16th ACM Symposium on Theory of Computing, 1984,” pp. 240-248; Inform. and Control 67 (1985), 144-157.

• [GG-86] Z. GALIL AND R. GIANCARLO, Improved string matching with k mismatches, SIGACT News 17, No. 4 (1986), 52-54.

• [GG-87] Z. GALIL AND R. GIANCARLO, Parallel string matching with k mismatches, Theoret. Comput. Sci. 51 (1987), 341-348.

• [GS-83] Z. GALIL AND J. I. SEIFERAS, Time-space-optimal string matching, J. Comput. System Sci. 26 (1983), 280-294.

• [HT-84] D. HAREL AND R. E. TARJAN, Fast algorithms for finding nearest common ancestors, SIAM J. Comput. 13, No. 2 (1984), 338-355.

• [KMP-77] D. E. KNUTH, J. H. MORRIS, AND V. R. PRATT, Fast pattern matching in strings, SIAM J. Comput. 6 (1977), 323-350.

• [KR-87] R. KARP AND M. O. RABIN, Efficient randomized pattern-matching algorithms, IBM J. Res. Develop. 31, No. 2 (1987), 249-260.

• [LSV-87] G. M. LANDAU, B. SCHIEBER, AND U. VISHKIN, Parallel construction of a suffix tree, in “Proceedings 14th ICALP,” Lecture Notes in Computer Science Vol. 267, pp. 314-325, Springer-Verlag, New York/Berlin, 1987.

• [LV-86a] G. M. LANDAU AND U. VISHKIN, Introducing efficient parallelism into approximate string matching, in “Proc. 18th ACM Symposium on Theory of Computing, 1986,” pp. 220-230.

• [LV-86b] G. M. LANDAU AND U. VISHKIN, Efficient string matching with k mismatches, Theoret. Comput. Sci. 43 (1986), 239-249.

• [LV-88] G. M. LANDAU AND U. VISHKIN, Fast string matching with k differences, J. Comput. System Sci. 37, No. 1 (1988), 63-78.

• [S80] P. H. SELLERS, The theory and computation of evolutionary distances: Pattern recognition, J. Algorithms 1 (1980), 359-373.

• [SK-83] D. SANKOFF AND J. B. KRUSKAL (Eds.), “Time Warps, String Edits, and Macromolecules: The Theory and Practice of Sequence Comparison,” Addison-Wesley, Reading, MA, 1983.

• [SV-88] B. SCHIEBER AND U. VISHKIN, Parallel computation of lowest common ancestor in trees, SIAM J. Comput., in press.

• [U-83] E. UKKONEN, On approximate string matching, in “Proceedings Int. Conf. Found. Comput. Theory,” Lecture Notes in Computer Science Vol. 158, pp. 487-495, Springer-Verlag, Berlin/New York, 1983.

• [U-85] E. UKKONEN, Finding approximate pattern in strings, J. Algorithms 6 (1985), 132-137.

• [V-83] U. VISHKIN, “Synchronous parallel computation-A survey,” TR-71, Department of Computer Science, Courant Institute, NYU, 1983.

• [V-85] U. VISHKIN, Optimal parallel pattern matching in strings, in “Proceedings 12th ICALP,” Lecture Notes in Computer Science Vol. 194, pp. 497-508, Springer-Verlag, New York/Berlin; Inform. and Control 67 (1985), 91-113.

Thank you