2 Dimensional Parameterized Matching
Carmit Hazay, Moshe Lewenstein, Dekel Tsur
CPM 2005
Parameterized Matching
Input: two strings s and t, |s| = |t|, over alphabets Σs and Σt.
s parameterize matches t if there exists a bijection π: Σs → Σt such that π(s) = t.
Example:
s = a a b b b
t = x x y y y
π(a) = x, π(b) = y
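The definition can be checked directly by building the bijection one aligned character pair at a time. The following is a minimal sketch, not code from the talk; the function name `p_match` is an assumption made for illustration.

```python
def p_match(s, t):
    """Return True if some bijection maps s onto t, character by character."""
    if len(s) != len(t):
        return False
    forward = {}   # character of s -> character of t
    backward = {}  # character of t -> character of s
    for a, x in zip(s, t):
        # setdefault stores the first pairing and returns it on later visits,
        # so any conflicting pairing shows up as an inequality.
        if forward.setdefault(a, x) != x or backward.setdefault(x, a) != a:
            return False
    return True


print(p_match("aabbb", "xxyyy"))  # the example above
```

Keeping both directions of the map is what enforces a bijection: the forward map alone would accept "ab" against "xx".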
Parameterized Matching
Input: Two strings T, P; |T|=n, |P|=m.
Output: All text locations i such that π(P) = T[i:i+m-1] for some bijection π.
2D Parameterized Matching
Input: Text T and pattern P; |T| = n×n, |P| = m×m.
Output: All text locations (i,j) such that π(P) = T[i:i+m-1, j:j+m-1] for some bijection π.
Example:
T:
a b c
a a b
b b b
P:
x y z
x x y
y y y
π(x) = a, π(y) = b, π(z) = c
2D Parameterized Matching
pattern
‘A horse is a horse, it ain’t make a difference what color it is’ (John Wayne)
Parameterized Matching History
Introduced by Brenda Baker [Baker93].
Others: [AFM94], [Bak95], [Bak97].
Two dimensions: [AACLP03], [this work].
Used in scaled matching [ABL99].
Periodicity of parameterized matching [ApostolicoGiancarlo].
Approximate parameterized matching [AEL], [HLS04].
Naïve Algorithm
For every location (i,j) of the text, check whether P parameterized matches at (i,j):
1. For each character a in the alphabet of P, check that all a's of P align with the same text character.
2. For each character b in the alphabet of T, check that all b's of T align with the same pattern character.
Naïve Algorithm
Time analysis: if done properly, O(n^2 m^2).
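The two checks of the naïve algorithm can be sketched as follows. This is an illustrative version only: the function names and the 0-based indexing are assumptions, and the O(n^2 m^2) bound relies on dictionary operations taking expected constant time.

```python
def p_match_at(T, P, i, j):
    """Check whether P parameterized matches T at top-left corner (i, j)."""
    m = len(P)
    p_to_t = {}  # check 1: all copies of a pattern character see one text character
    t_to_p = {}  # check 2: all copies of a text character see one pattern character
    for r in range(m):
        for c in range(m):
            p, t = P[r][c], T[i + r][j + c]
            if p_to_t.setdefault(p, t) != t or t_to_p.setdefault(t, p) != p:
                return False
    return True


def naive_2d_p_matching(T, P):
    """Return all 0-based (row, column) locations where P p-matches T."""
    n, m = len(T), len(P)
    return [(i, j)
            for i in range(n - m + 1)
            for j in range(n - m + 1)
            if p_match_at(T, P, i, j)]


T = ["abc", "aab", "bbb"]
P = ["xyz", "xxy", "yyy"]
print(naive_2d_p_matching(T, P))
```

Each of the O(n^2) locations costs O(m^2) work, which gives the stated bound.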
Mismatch pairs
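A pair of locations whose characters disagree under parameterization: the two characters are equal in one string but different in the other.
Example:
a a b a a a
x x y x z y
Here locations 4 and 5 form a mismatch pair (a, a against x, z), as do locations 3 and 6 (b, a against y, y).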
1D Encoding
Encode every text location by its predecessor location.
T: a b a d d a b d b c b d a a b d a a a a b b b
Encoded T, shown for the a's (each a points to the first a to its left):
1 3 6 13 14 15 16 17 18
0 1 3 6 13 14 15 16 17
1D Encoding
Two p-matching strings have the same encoded texts.
S:         a b b c b a a c b b c b a
Encoded S: 0 0 2 0 3 1 6 4 5 9 8 10 7
T:         x y y z y x x z y y z y x
Encoded T: 0 0 2 0 3 1 6 4 5 9 8 10 7
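The predecessor encoding used in this example can be computed in one left-to-right pass. A minimal sketch, assuming 1-based locations and 0 for "no predecessor" as in the rows above; the name `prev_encode` is chosen here for illustration.

```python
def prev_encode(s):
    """Replace each location by the location of the previous equal character.

    Locations are 1-based; 0 means the character has not occurred before.
    """
    last = {}  # character -> most recent location where it was seen
    out = []
    for pos, ch in enumerate(s, start=1):
        out.append(last.get(ch, 0))
        last[ch] = pos
    return out


S = "abbcbaacbbcba"
T = "xyyzyxxzyyzyx"
print(prev_encode(S))
print(prev_encode(S) == prev_encode(T))
```

The encoding depends only on where equal characters repeat, not on which characters they are, which is why p-matching strings share it.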
1D Encoding
Hence, in order to check whether two strings p-match, it is enough to compare their encoded strings.
Reduction to the exact matching problem.
S:         a b b c b b a c b b c b a
Encoded S: 0 0 2 0 3 5 1 4 6 9 8 10 7
T:         x y y z y x x z y y z y x
Encoded T: 0 0 2 0 3 1 6 4 5 9 8 10 7
The encodings differ, first at location 6, so S and T do not p-match.
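When a pattern is searched inside a longer text, a predecessor that falls to the left of the current window has to be treated as "none" before the encodings are compared. The sketch below illustrates that adjustment with a direct O(nm) comparison; it is not the linear-time matcher from the literature, and the function names are assumptions.

```python
def prev_encode(s):
    """1-based location of the previous equal character, 0 if none."""
    last, out = {}, []
    for pos, ch in enumerate(s, start=1):
        out.append(last.get(ch, 0))
        last[ch] = pos
    return out


def p_match_positions(text, pattern):
    """Return all 1-based text locations where the pattern p-matches."""
    n, m = len(text), len(pattern)
    enc_t, enc_p = prev_encode(text), prev_encode(pattern)

    def local_pred(start, k):
        # Predecessor of window offset k (1-based), re-based to the window.
        # A predecessor left of the window counts as "none" (0).
        pred = enc_t[start + k - 2]
        return pred - start + 1 if pred >= start else 0

    return [start for start in range(1, n - m + 2)
            if all(local_pred(start, k) == enc_p[k - 1] for k in range(1, m + 1))]


print(p_match_positions("abab", "xy"))
```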
2D Mismatch Pairs
Same as 1D mismatch pairs, but with 2D strings.
Example:
a b a
b a b
b a b
against
x y x
y y y
y y y
2D Encoding
First idea: encode the linearization of text and pattern.
2D text:
As you will see this box
frames the texts that it
Contains. That is 2D text
All in this little box.
Linearization:
As you will see this box frames the texts that it Contains. That is 2D text All in this little box.
2D Encoding
First idea: encode the linearization of text and pattern.
Overflow problem!!
[Figure: text locations holding b, b, b and a, a, with one location labeled "Different character than b".]
2D Encoding
Second idea: use strips.
Strip: a substring of T of size n×m.
The i-th strip of T is the n×m substring T[1:n, i:i+m-1].
Second Solution
For Pattern P compute predecessors on its linearization.
For each strip of T, compute predecessors on its linearization.
Do Pattern Matching for each strip.
Time: O(n^2 m).
Can we do better?
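The strip idea can be sketched as below. This is an illustration under stated assumptions, not the talk's implementation: indices are 0-based, each strip is linearized row by row, and every window is compared directly, so the sketch runs in O(n^2 m^2) in the worst case. Reaching the O(n^2 m) bound above would need a linear-time parameterized matcher on each strip, which is not shown. The function names are hypothetical.

```python
def prev_encode(seq):
    """1-based location of the previous equal character, 0 if none."""
    last, out = {}, []
    for pos, ch in enumerate(seq, start=1):
        out.append(last.get(ch, 0))
        last[ch] = pos
    return out


def strip_p_matching(T, P):
    """Return all 0-based (row, column) locations where P p-matches T."""
    n, m = len(T), len(P)
    enc_p = prev_encode([ch for row in P for ch in row])
    hits = []
    for j in range(n - m + 1):  # one strip per column offset
        # Linearize the n-by-m strip row by row; predecessors stay inside it.
        strip = [ch for row in T for ch in row[j:j + m]]
        enc_s = prev_encode(strip)
        for i in range(n - m + 1):
            start = i * m + 1  # 1-based start of the pattern-sized window
            window = enc_s[start - 1:start - 1 + m * m]
            # Predecessors above the window count as "none".
            local = [p - start + 1 if p >= start else 0 for p in window]
            if local == enc_p:
                hits.append((i, j))
    return hits


print(strip_p_matching(["abc", "aab", "bbb"], ["xyz", "xxy", "yyy"]))
```

Inside a strip, the rows of a pattern-sized window are contiguous in the linearization, which is what the linearization of the whole text fails to provide.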
A Faster Solution
Set into the duel-and-sweep setting.
Needs special care for duel and sweep.
Especially difficult: pattern preprocessing.
Desired time: O(n^2 + poly(m)).
We achieve: O(n^2 + m^2.5 polylog m).
Remember…
Observation:
T p-matches P
↔
every text location and its predecessor are not a mismatch pair
+ the number of distinct characters in P and T is equal.
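The observation can be turned into a direct test on two equal-length strings (a text window and the pattern). A minimal sketch with an assumed function name: checking each text location against its predecessor shows that equal text characters always face equal pattern characters, and the distinct-character count then rules out two different text characters facing the same pattern character.

```python
def matches_by_observation(t, p):
    """Test the observation on a text window t and a pattern p of equal length."""
    if len(t) != len(p):
        return False
    last = {}  # text character -> most recent location (0-based)
    for q, ch in enumerate(t):
        pred = last.get(ch)
        # t[pred] == t[q] by construction, so a difference in p at the same
        # two locations makes (pred, q) a mismatch pair.
        if pred is not None and p[pred] != p[q]:
            return False
        last[ch] = q
    return len(set(t)) == len(set(p))


print(matches_by_observation("aabbb", "xxyyy"))
```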
Algorithm Outline
Duel-and-sweep paradigm:
1. Find candidates (dueling).
2. Divide candidates by strips.
3. Update predecessors of every new strip.
4. Check new predecessors (sweep).
Assume the pattern witness table is given.
Witness
Witness: a mismatch pair between P and its alignment to location (a,b).
[Figure: P and its alignment at offset (+a, +b).]
Set Candidates
Using duels: two text locations that have a witness within their alignment cannot both be occurrences, so a duel eliminates at least one of them.
Apply the algorithm of [ABF94] and return a list of candidates.
Time: O(n^2).
Sweep Technique
Observation: all candidates agree with each other.
Hence, a mismatch pair eliminates all candidates containing it.
Therefore, for every predecessor, it is enough to find one candidate that contains it.
Sweep Technique
How to find? Create a new 2m×2m array A such that
A[i,j] = the largest row among the candidates that start at column j and overlap row i.
Sweep Technique
For every predecessor pair (i,j), (x,y), use a range minima query to find the highest candidate that contains the pair.
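The slides do not say which range-minimum structure is used. As a generic illustration only, a sparse table answers range-minimum queries over a fixed list of values (for instance, one row of A) in constant time after O(k log k) preprocessing for k values. The class name and interface below are assumptions.

```python
class SparseTableMin:
    """Static range-minimum queries over a non-empty list of values."""

    def __init__(self, values):
        # table[level][i] holds min(values[i : i + 2**level]).
        self.table = [list(values)]
        n, span = len(values), 1
        while 2 * span <= n:
            prev = self.table[-1]
            self.table.append([min(prev[i], prev[i + span])
                               for i in range(n - 2 * span + 1)])
            span *= 2

    def query(self, lo, hi):
        """Minimum of values[lo..hi], both ends inclusive, with lo <= hi."""
        level = (hi - lo + 1).bit_length() - 1
        row = self.table[level]
        # Two overlapping blocks of length 2**level cover the whole range.
        return min(row[lo], row[hi - (1 << level) + 1])


rmq = SparseTableMin([5, 2, 4, 7, 1, 3])
print(rmq.query(0, 2), rmq.query(3, 5))
```

Replacing `min` with `max` gives range-maximum queries in the same way, if the largest row is what is needed.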
Sweep Technique
In case of a mismatch pair, eliminate all candidates containing it.
How? Use a mismatch vector.
Every mismatch pair translates into a range.
All candidates within this range are eliminated.
For new strips, delete old mismatches and add new ones.
Sweep Technique
Reminder:
T p-matches P
↔
every text location and its predecessor are not a mismatch pair
+ the number of distinct characters in P and T is equal.
Left to do?
Count distinct characters for every candidate. Use the algorithm of Amir and Cole, time O(m^2).
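To make the counting step concrete, the sketch below counts the distinct characters in every m×m window of a square text by sliding a multiset across each band of rows. It is a plain illustration and not the Amir and Cole algorithm cited above; it takes O(n^2 m) time, and the function name is an assumption.

```python
from collections import Counter


def distinct_counts(T, m):
    """result[i][j] = number of distinct characters in the m-by-m window at (i, j)."""
    n = len(T)
    result = [[0] * (n - m + 1) for _ in range(n - m + 1)]
    for i in range(n - m + 1):
        # Multiset of characters in the leftmost window of this row band.
        counts = Counter(T[i + r][c] for r in range(m) for c in range(m))
        result[i][0] = len(counts)
        for j in range(1, n - m + 1):
            # Slide one column right: drop column j-1, add column j+m-1.
            for r in range(i, i + m):
                out_ch, in_ch = T[r][j - 1], T[r][j + m - 1]
                counts[out_ch] -= 1
                if counts[out_ch] == 0:
                    del counts[out_ch]
                counts[in_ch] += 1
            result[i][j] = len(counts)
    return result


print(distinct_counts(["abc", "aab", "bbb"], 2))
```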
Overview
Checking all predecessors takes linear time.
Total time: O(n^2).
Pattern Preprocessing
Witness: a mismatch pair between P and its alignment to location (a,b).
[Figure: P and its alignment at offset (+a, +b).]
Pattern Preprocessing
Find the witness table for P in time O(m^2.5 polylog m).
For every pattern location (i,j), create a list of O(√m) pointers.
Pointer i is the predecessor in the lines above (i,j).
Reduce to exact matching with don’t cares.
[Figure label: (i√m, i√m+1)]
Pattern Preprocessing
End cases, multiple cases.
[Figure: regions A1, A2, A3, A4 and B1, B2, B3, B4, with one dimension labeled "Less than m".]
Open Questions
Can the algorithm's time complexity be reduced to O(n^2 + m^2)?