internal pattern matching queries in a text and applicationskociumaka/files/soda2015b_slides.pdf ·...
TRANSCRIPT
![Page 1: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/1.jpg)
Internal Pattern Matching Queries in a Text andApplications
Tomasz Kociumaka Jakub RadoszewskiWojciech Rytter Tomasz Waleń
University of Warsaw, Poland
SODA 2015San Diego, California, USA
January 4, 2015
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 1/16
![Page 2: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/2.jpg)
Classic Data Structures for Text Processing
T : a b a a b a b a a b a a b a b a a b
a b a a b a b a a b a a b a b a a bconstruction
data structurequeryquery
P : a b a b apos: 4, 12
T [4..8]?= T [12..16]
YES
Indexing SubwordsGiven a pattern P, find alloccurrences of P in T .
Indexing queries:
a string is part of the queryinput.
Subword EqualityGiven two subwords x , y of T ,decide whether x = y .
Internal queries:
concern subwords of T only(identified by positions).
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 2/16
![Page 3: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/3.jpg)
Classic Data Structures for Text Processing
T : a b a a b a b a a b a a b a b a a b
a b a a b a b a a b a a b a b a a b
construction
data structure
queryquery
P : a b a b apos: 4, 12
T [4..8]?= T [12..16]
YES
Indexing SubwordsGiven a pattern P, find alloccurrences of P in T .
Indexing queries:
a string is part of the queryinput.
Subword EqualityGiven two subwords x , y of T ,decide whether x = y .
Internal queries:
concern subwords of T only(identified by positions).
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 2/16
![Page 4: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/4.jpg)
Classic Data Structures for Text Processing
T : a b a a b a b a a b a a b a b a a b
a b a a b a b a a b a a b a b a a b
construction
data structurequeryquery
P : a b a b apos: 4, 12
T [4..8]?= T [12..16]
YES
Indexing SubwordsGiven a pattern P, find alloccurrences of P in T .
Indexing queries:
a string is part of the queryinput.
Subword EqualityGiven two subwords x , y of T ,decide whether x = y .
Internal queries:
concern subwords of T only(identified by positions).
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 2/16
![Page 5: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/5.jpg)
Classic Data Structures for Text Processing
T : a b a a b a b a a b a a b a b a a b
a b a a b a b a a b a a b a b a a b
construction
data structurequeryquery
P : a b a b a
pos: 4, 12
T [4..8]?= T [12..16]
YES
Indexing SubwordsGiven a pattern P, find alloccurrences of P in T .
Indexing queries:
a string is part of the queryinput.
Subword EqualityGiven two subwords x , y of T ,decide whether x = y .
Internal queries:
concern subwords of T only(identified by positions).
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 2/16
![Page 6: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/6.jpg)
Classic Data Structures for Text Processing
T :
a b a a b a b a a b a a b a b a a b
a b a a b a b a a b a a b a b a a bconstruction
data structurequeryquery
P : a b a b apos: 4, 12
T [4..8]?= T [12..16]
YES
Indexing SubwordsGiven a pattern P, find alloccurrences of P in T .
Indexing queries:
a string is part of the queryinput.
Subword EqualityGiven two subwords x , y of T ,decide whether x = y .
Internal queries:
concern subwords of T only(identified by positions).
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 2/16
![Page 7: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/7.jpg)
Classic Data Structures for Text Processing
T : a b a a b a b a a b a a b a b a a b
a b a a b a b a a b a a b a b a a b
construction
data structurequeryquery
P : a b a b apos: 4, 12
T [4..8]?= T [12..16]
YES
Indexing SubwordsGiven a pattern P, find alloccurrences of P in T .
Indexing queries:
a string is part of the queryinput.
Subword EqualityGiven two subwords x , y of T ,decide whether x = y .
Internal queries:
concern subwords of T only(identified by positions).
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 2/16
![Page 8: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/8.jpg)
Classic Data Structures for Text Processing
T :
a b a a b a b a a b a a b a b a a b
a b a a b a b a a b a a b a b a a bconstruction
data structurequeryquery
P : a b a b apos: 4, 12
T [4..8]?= T [12..16]
YES
Indexing SubwordsGiven a pattern P, find alloccurrences of P in T .
Indexing queries:
a string is part of the queryinput.
Subword EqualityGiven two subwords x , y of T ,decide whether x = y .
Internal queries:
concern subwords of T only(identified by positions).
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 2/16
![Page 9: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/9.jpg)
Classic Data Structures for Text Processing
T : a b a a b a b a a b a a b a b a a b
a b a a b a b a a b a a b a b a a b
construction
data structurequeryquery
P : a b a b apos: 4, 12
T [4..8]?= T [12..16]
YES
Indexing SubwordsGiven a pattern P, find alloccurrences of P in T .
Indexing queries:
a string is part of the queryinput.
Subword EqualityGiven two subwords x , y of T ,decide whether x = y .
Internal queries:
concern subwords of T only(identified by positions).
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 2/16
![Page 10: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/10.jpg)
Classic Data Structures for Text Processing
T : a b a a b a b a a b a a b a b a a b
a b a a b a b a a b a a b a b a a b
construction
data structurequeryquery
P : a b a b apos: 4, 12
T [4..8]?= T [12..16]
YES
Indexing SubwordsGiven a pattern P, find alloccurrences of P in T .
Indexing queries:
a string is part of the queryinput.
Subword EqualityGiven two subwords x , y of T ,decide whether x = y .
Internal queries:←− this talk
concern subwords of T only(identified by positions).
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 2/16
![Page 11: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/11.jpg)
Internal Queries
Internal queries
Solve a problem for inputs being subwords of a given text T .
O(1) query input size allows for O(1) query time,Ω(|P|) required if the “pattern” P is in the input.
Motivation:
All input data must be known in advancePrimitives for algorithms and data structures
linear size and O(1) query time crucial for applicability,efficient construction important for applications in algorithms.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 3/16
![Page 12: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/12.jpg)
Internal Queries
Internal queries
Solve a problem for inputs being subwords of a given text T .
O(1) query input size allows for O(1) query time,Ω(|P|) required if the “pattern” P is in the input.
Motivation:
All input data must be known in advancePrimitives for algorithms and data structures
linear size and O(1) query time crucial for applicability,efficient construction important for applications in algorithms.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 3/16
![Page 13: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/13.jpg)
Internal Queries
Internal queries
Solve a problem for inputs being subwords of a given text T .
O(1) query input size allows for O(1) query time,Ω(|P|) required if the “pattern” P is in the input.
Motivation:
All input data must be known in advance
Primitives for algorithms and data structureslinear size and O(1) query time crucial for applicability,efficient construction important for applications in algorithms.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 3/16
![Page 14: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/14.jpg)
Internal Queries
Internal queries
Solve a problem for inputs being subwords of a given text T .
O(1) query input size allows for O(1) query time,Ω(|P|) required if the “pattern” P is in the input.
Motivation:
All input data must be known in advancePrimitives for algorithms and data structures
linear size and O(1) query time crucial for applicability,efficient construction important for applications in algorithms.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 3/16
![Page 15: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/15.jpg)
Internal Queries: Previous Techniques
13
6
8
1
10
3
12
5
7
9
2
11
4
$ab
a
b
a
b
b
$
b$ babababb$
b$ babababb$
b
$ a
b
a
babb$
bb$
bb
$abababb$
b
$abababb$
Suffix tree
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 4/16
![Page 16: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/16.jpg)
Internal Queries: Previous Techniques
Range successor queries
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 4/16
![Page 17: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/17.jpg)
Internal Queries: Previous Techniques
Suffix tree13
6
8
1
10
3
12
5
7
9
2
11
4
$ab
a
b
a
b
b
$
b$ babababb$
b$ babababb$
b
$ a
b
a
babb$
bb$
bb
$abababb$
b
$abababb$
Range successor queries
Suffix tree +problem-specific tools:
Longest common prefix
Min/Max suffix
+ efficient, often O(1) timequeries with O(n) space,
− limited use.
Suffix tree +orthogonal range queries:
Pattern matching-related
Period queries
+ wide applicability,
− lower bounds for query time,slow construction.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 4/16
![Page 18: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/18.jpg)
Internal Queries: Previous Techniques
Suffix tree13
6
8
1
10
3
12
5
7
9
2
11
4
$ab
a
b
a
b
b
$
b$ babababb$
b$ babababb$
b
$ a
b
a
babb$
bb$
bb
$abababb$
b
$abababb$
Range successor queries
Suffix tree +problem-specific tools:
Longest common prefix
Min/Max suffix
+ efficient, often O(1) timequeries with O(n) space,
− limited use.
Suffix tree +orthogonal range queries:
Pattern matching-related
Period queries
+ wide applicability,
− lower bounds for query time,slow construction.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 4/16
![Page 19: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/19.jpg)
Internal Queries: Previous Techniques
Suffix tree13
6
8
1
10
3
12
5
7
9
2
11
4
$ab
a
b
a
b
b
$
b$ babababb$
b$ babababb$
b
$ a
b
a
babb$
bb$
bb
$abababb$
b
$abababb$
Range successor queries
Suffix tree +problem-specific tools:
Longest common prefix
Min/Max suffix
+ efficient, often O(1) timequeries with O(n) space,
− limited use.
Suffix tree +orthogonal range queries:
Pattern matching-related
Period queries
+ wide applicability,
− lower bounds for query time,slow construction.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 4/16
![Page 20: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/20.jpg)
Internal Pattern Matching Queries
Fact
Let Occ(x , y) be the positions in y where occurrences of x start.Occ(x , y) forms an arithmetic progression if |y | ≤ 2|x |.
x : a b a b a b a b b a b a b a b a b a b a ay :
b b a b a b a b a b a b a ab b a b a b a b a b a b a ab b a b a b a b a b a b a a
Occ(x , y) =
3, 5, 7
T :
Problem (Internal Pattern Matching Queries)
Given subwords x and y of T with |y | ≤ 2|x |, report alloccurrences of x in y (as an arithmetic progression).
Θ(|y ||x |
)space is necessary to encode Occ(x , y),
Occ(x , y) can be computed with Θ(|y ||x |
)IPM Queries.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 5/16
![Page 21: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/21.jpg)
Internal Pattern Matching Queries
Fact
Let Occ(x , y) be the positions in y where occurrences of x start.Occ(x , y) forms an arithmetic progression if |y | ≤ 2|x |.
x : a b a b a b a
b b a b a b a b a b a b a a
y : b b a b a b a b a b a b a a
b b a b a b a b a b a b a ab b a b a b a b a b a b a a
Occ(x , y) = 3,
5, 7
T :
Problem (Internal Pattern Matching Queries)
Given subwords x and y of T with |y | ≤ 2|x |, report alloccurrences of x in y (as an arithmetic progression).
Θ(|y ||x |
)space is necessary to encode Occ(x , y),
Occ(x , y) can be computed with Θ(|y ||x |
)IPM Queries.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 5/16
![Page 22: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/22.jpg)
Internal Pattern Matching Queries
Fact
Let Occ(x , y) be the positions in y where occurrences of x start.Occ(x , y) forms an arithmetic progression if |y | ≤ 2|x |.
x : a b a b a b a
b b a b a b a b a b a b a a
y :
b b a b a b a b a b a b a a
b b a b a b a b a b a b a a
b b a b a b a b a b a b a a
Occ(x , y) = 3, 5,
7
T :
Problem (Internal Pattern Matching Queries)
Given subwords x and y of T with |y | ≤ 2|x |, report alloccurrences of x in y (as an arithmetic progression).
Θ(|y ||x |
)space is necessary to encode Occ(x , y),
Occ(x , y) can be computed with Θ(|y ||x |
)IPM Queries.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 5/16
![Page 23: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/23.jpg)
Internal Pattern Matching Queries
Fact
Let Occ(x , y) be the positions in y where occurrences of x start.Occ(x , y) forms an arithmetic progression if |y | ≤ 2|x |.
x : a b a b a b a
b b a b a b a b a b a b a a
y :
b b a b a b a b a b a b a ab b a b a b a b a b a b a a
b b a b a b a b a b a b a a
Occ(x , y) = 3, 5, 7
T :
Problem (Internal Pattern Matching Queries)
Given subwords x and y of T with |y | ≤ 2|x |, report alloccurrences of x in y (as an arithmetic progression).
Θ(|y ||x |
)space is necessary to encode Occ(x , y),
Occ(x , y) can be computed with Θ(|y ||x |
)IPM Queries.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 5/16
![Page 24: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/24.jpg)
Internal Pattern Matching Queries
Fact
Let Occ(x , y) be the positions in y where occurrences of x start.Occ(x , y) forms an arithmetic progression if |y | ≤ 2|x |.
x : a b a b a b a b b a b a b a b a b a b a ay :
b b a b a b a b a b a b a ab b a b a b a b a b a b a ab b a b a b a b a b a b a a
Occ(x , y) = 3, 5, 7
T :
Problem (Internal Pattern Matching Queries)
Given subwords x and y of T with |y | ≤ 2|x |, report alloccurrences of x in y (as an arithmetic progression).
Θ(|y ||x |
)space is necessary to encode Occ(x , y),
Occ(x , y) can be computed with Θ(|y ||x |
)IPM Queries.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 5/16
![Page 25: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/25.jpg)
Internal Pattern Matching Queries
Fact
Let Occ(x , y) be the positions in y where occurrences of x start.Occ(x , y) forms an arithmetic progression if |y | ≤ 2|x |.
x : a b a b a b a b b a b a b a b a b a b a ay :
b b a b a b a b a b a b a ab b a b a b a b a b a b a ab b a b a b a b a b a b a a
Occ(x , y) = 3, 5, 7
T :
Problem (Internal Pattern Matching Queries)
Given subwords x and y of T with |y | ≤ 2|x |, report alloccurrences of x in y (as an arithmetic progression).
Θ(|y ||x |
)space is necessary to encode Occ(x , y),
Occ(x , y) can be computed with Θ(|y ||x |
)IPM Queries.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 5/16
![Page 26: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/26.jpg)
IPM Queries: Our Results
Problem (Internal Pattern Matching Queries)
Given subwords x and y of T with |y | ≤ 2|x |, report alloccurrences of x in y (as an arithmetic progression).
Theorem
IPM Queries can be answered in O(1) time by a data structureof size O(n), which can be constructed in O(n) time.
Technical assumptions:
Word-RAM model with word-size w = Ω(log n).Construction:
randomized (Las Vegas, expected time),
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 6/16
![Page 27: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/27.jpg)
IPM Queries: Our Results
Problem (Internal Pattern Matching Queries)
Given subwords x and y of T with |y | ≤ 2|x |, report alloccurrences of x in y (as an arithmetic progression).
Theorem
IPM Queries can be answered in O(1) time by a data structureof size O(n), which can be constructed in O(n) time.
Technical assumptions:
Word-RAM model with word-size w = Ω(log n).
Construction:randomized (Las Vegas, expected time),
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 6/16
![Page 28: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/28.jpg)
IPM Queries: Our Results
Problem (Internal Pattern Matching Queries)
Given subwords x and y of T with |y | ≤ 2|x |, report alloccurrences of x in y (as an arithmetic progression).
Theorem
IPM Queries can be answered in O(1) time by a data structureof size O(n), which can be constructed in O(n) time.
Technical assumptions:
Word-RAM model with word-size w = Ω(log n).Construction:
randomized (Las Vegas, expected time),
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 6/16
![Page 29: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/29.jpg)
Applications
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 7/16
![Page 30: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/30.jpg)
Prefix-Suffix and Period Queries
T : b a b a b a b a b b a a
b a b a b a b a b b a ab a b a b a b a b b a ab a b a b a b a b b a a
x :
d
2d
b a b b a b a b a b a
b a b b a b a b a b ab a b b a b a b a b ab a b b a b a b a b a
y :
d
2d
lengths: 4
, 6, 8
Problem (Prefix-Suffix Queries)
Given subwords x and y of T and d ∈ N, report all prefixes of x oflength between d and 2d that are also suffixes of y .
O(1) time using IPM Queries.
x : a b b a b a b b a b a b b a b a b b
Problem (Period Queries)
Given a subword x of T , report all periods of x .
O(log |x |) time using IPM Queries.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 8/16
![Page 31: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/31.jpg)
Prefix-Suffix and Period Queries
T :
b a b a b a b a b b a a
b a b a b a b a b b a a
b a b a b a b a b b a ab a b a b a b a b b a a
x :
d
2d
b a b b a b a b a b a
b a b b a b a b a b a
b a b b a b a b a b ab a b b a b a b a b a
y :
d
2d
lengths: 4
, 6, 8
Problem (Prefix-Suffix Queries)
Given subwords x and y of T and d ∈ N, report all prefixes of x oflength between d and 2d that are also suffixes of y .
O(1) time using IPM Queries.
x : a b b a b a b b a b a b b a b a b b
Problem (Period Queries)
Given a subword x of T , report all periods of x .
O(log |x |) time using IPM Queries.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 8/16
![Page 32: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/32.jpg)
Prefix-Suffix and Period Queries
T :
b a b a b a b a b b a ab a b a b a b a b b a a
b a b a b a b a b b a a
b a b a b a b a b b a a
x :
d
2d
b a b b a b a b a b ab a b b a b a b a b a
b a b b a b a b a b a
b a b b a b a b a b a
y :
d
2d
lengths: 4, 6
, 8
Problem (Prefix-Suffix Queries)
Given subwords x and y of T and d ∈ N, report all prefixes of x oflength between d and 2d that are also suffixes of y .
O(1) time using IPM Queries.
x : a b b a b a b b a b a b b a b a b b
Problem (Period Queries)
Given a subword x of T , report all periods of x .
O(log |x |) time using IPM Queries.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 8/16
![Page 33: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/33.jpg)
Prefix-Suffix and Period Queries
T :
b a b a b a b a b b a ab a b a b a b a b b a ab a b a b a b a b b a a
b a b a b a b a b b a ax :
d
2d
b a b b a b a b a b ab a b b a b a b a b ab a b b a b a b a b a
b a b b a b a b a b ay :
d
2d
lengths: 4, 6, 8
Problem (Prefix-Suffix Queries)
Given subwords x and y of T and d ∈ N, report all prefixes of x oflength between d and 2d that are also suffixes of y .
O(1) time using IPM Queries.
x : a b b a b a b b a b a b b a b a b b
Problem (Period Queries)
Given a subword x of T , report all periods of x .
O(log |x |) time using IPM Queries.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 8/16
![Page 34: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/34.jpg)
Prefix-Suffix and Period Queries
T : b a b a b a b a b b a a
b a b a b a b a b b a ab a b a b a b a b b a ab a b a b a b a b b a a
x :
d
2d
b a b b a b a b a b a
b a b b a b a b a b ab a b b a b a b a b ab a b b a b a b a b a
y :
d
2d
lengths: 4, 6, 8
Problem (Prefix-Suffix Queries)
Given subwords x and y of T and d ∈ N, report all prefixes of x oflength between d and 2d that are also suffixes of y .
O(1) time using IPM Queries.
x : a b b a b a b b a b a b b a b a b b
Problem (Period Queries)
Given a subword x of T , report all periods of x .
O(log |x |) time using IPM Queries.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 8/16
![Page 35: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/35.jpg)
Prefix-Suffix and Period Queries
T : b a b a b a b a b b a a
b a b a b a b a b b a ab a b a b a b a b b a ab a b a b a b a b b a a
x :
d
2d
b a b b a b a b a b a
b a b b a b a b a b ab a b b a b a b a b ab a b b a b a b a b a
y :
d
2d
lengths: 4, 6, 8
Problem (Prefix-Suffix Queries)
Given subwords x and y of T and d ∈ N, report all prefixes of x oflength between d and 2d that are also suffixes of y .
O(1) time using IPM Queries.
x : a b b a b a b b a b a b b a b a b b
Problem (Period Queries)
Given a subword x of T , report all periods of x .
O(log |x |) time using IPM Queries.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 8/16
![Page 36: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/36.jpg)
Prefix-Suffix and Period Queries
T : b a b a b a b a b b a a
b a b a b a b a b b a ab a b a b a b a b b a ab a b a b a b a b b a a
x :
d
2d
b a b b a b a b a b a
b a b b a b a b a b ab a b b a b a b a b ab a b b a b a b a b a
y :
d
2d
lengths: 4, 6, 8
Problem (Prefix-Suffix Queries)
Given subwords x and y of T and d ∈ N, report all prefixes of x oflength between d and 2d that are also suffixes of y .
O(1) time using IPM Queries.
x : a b b a b a b b a b a b b a b a b b
Problem (Period Queries)
Given a subword x of T , report all periods of x .
O(log |x |) time using IPM Queries.Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 8/16
![Page 37: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/37.jpg)
Applications in Substring Compression
T :
y :i ji ′ j ′
x :
Substring Compressionserver
client
T [i , j ]? LZ (y)
y :
T [i ′, j ′]? LZ (x |y)
x :
Problem (Cormode & Muthukrishnan; SODA 2005)
Given subwords x and y , compute LZ (x |y), i.e., the section of theLempel-Ziv (LZ77) factorization LZ(y$x) that corresponds to x .
data structure of O(n) size,
query time improved from O(C log |x |C logε n) to
O(C log log |x |C logε n) where C is the output size,
combines orthogonal range queries with IPM Queries.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 9/16
![Page 38: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/38.jpg)
Applications in Substring Compression
T :
y :i ji ′ j ′
x :
Substring Compressionserver
clientT [i , j ]?
LZ (y)
y :
T [i ′, j ′]? LZ (x |y)
x :
Problem (Cormode & Muthukrishnan; SODA 2005)
Given subwords x and y , compute LZ (x |y), i.e., the section of theLempel-Ziv (LZ77) factorization LZ(y$x) that corresponds to x .
data structure of O(n) size,
query time improved from O(C log |x |C logε n) to
O(C log log |x |C logε n) where C is the output size,
combines orthogonal range queries with IPM Queries.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 9/16
![Page 39: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/39.jpg)
Applications in Substring Compression
T : y :i j
i ′ j ′x :
Substring Compressionserver
clientT [i , j ]?
LZ (y)
y :
T [i ′, j ′]? LZ (x |y)
x :
Problem (Cormode & Muthukrishnan; SODA 2005)
Given subwords x and y , compute LZ (x |y), i.e., the section of theLempel-Ziv (LZ77) factorization LZ(y$x) that corresponds to x .
data structure of O(n) size,
query time improved from O(C log |x |C logε n) to
O(C log log |x |C logε n) where C is the output size,
combines orthogonal range queries with IPM Queries.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 9/16
![Page 40: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/40.jpg)
Applications in Substring Compression
T : y :i j
i ′ j ′x :
Substring Compressionserver
clientT [i , j ]? LZ (y)
y :
T [i ′, j ′]? LZ (x |y)
x :
Problem (Cormode & Muthukrishnan; SODA 2005)
Given subwords x and y , compute LZ (x |y), i.e., the section of theLempel-Ziv (LZ77) factorization LZ(y$x) that corresponds to x .
data structure of O(n) size,
query time improved from O(C log |x |C logε n) to
O(C log log |x |C logε n) where C is the output size,
combines orthogonal range queries with IPM Queries.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 9/16
![Page 41: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/41.jpg)
Applications in Substring Compression
T : y :i j
i ′ j ′x :
Substring Compressionserver
clientT [i , j ]? LZ (y)
y :
T [i ′, j ′]? LZ (x |y)
x :
Problem (Cormode & Muthukrishnan; SODA 2005)
Given subwords x and y , compute LZ (x |y), i.e., the section of theLempel-Ziv (LZ77) factorization LZ(y$x) that corresponds to x .
data structure of O(n) size,
query time improved from O(C log |x |C logε n) to
O(C log log |x |C logε n) where C is the output size,
combines orthogonal range queries with IPM Queries.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 9/16
![Page 42: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/42.jpg)
Applications in Substring Compression
T : y :i j
i ′ j ′x :
Generalized Substring Compressionserver
clientT [i , j ]? LZ (y)
y :
T [i ′, j ′]?
LZ (x |y)
x :
Problem (Cormode & Muthukrishnan; SODA 2005)
Given subwords x and y , compute LZ (x |y), i.e., the section of theLempel-Ziv (LZ77) factorization LZ(y$x) that corresponds to x .
data structure of O(n) size,
query time improved from O(C log |x |C logε n) to
O(C log log |x |C logε n) where C is the output size,
combines orthogonal range queries with IPM Queries.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 9/16
![Page 43: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/43.jpg)
Applications in Substring Compression
T : y :i ji ′ j ′
x :
Generalized Substring Compressionserver
clientT [i , j ]? LZ (y)
y :
T [i ′, j ′]?
LZ (x |y)
x :
Problem (Cormode & Muthukrishnan; SODA 2005)
Given subwords x and y , compute LZ (x |y), i.e., the section of theLempel-Ziv (LZ77) factorization LZ(y$x) that corresponds to x .
data structure of O(n) size,
query time improved from O(C log |x |C logε n) to
O(C log log |x |C logε n) where C is the output size,
combines orthogonal range queries with IPM Queries.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 9/16
![Page 44: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/44.jpg)
Applications in Substring Compression
T : y :i ji ′ j ′
x :
Generalized Substring Compressionserver
clientT [i , j ]? LZ (y)
y :
T [i ′, j ′]? LZ (x |y)
x :
Problem (Cormode & Muthukrishnan; SODA 2005)
Given subwords x and y , compute LZ (x |y), i.e., the section of theLempel-Ziv (LZ77) factorization LZ(y$x) that corresponds to x .
data structure of O(n) size,
query time improved from O(C log |x |C logε n) to
O(C log log |x |C logε n) where C is the output size,
combines orthogonal range queries with IPM Queries.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 9/16
![Page 45: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/45.jpg)
Applications in Substring Compression
T : y :i ji ′ j ′
x :
Generalized Substring Compressionserver
clientT [i , j ]? LZ (y)
y :
T [i ′, j ′]? LZ (x |y)
x :
Problem (Cormode & Muthukrishnan; SODA 2005)
Given subwords x and y , compute LZ (x |y), i.e., the section of theLempel-Ziv (LZ77) factorization LZ(y$x) that corresponds to x .
data structure of O(n) size,
query time improved from O(C log |x |C logε n) to
O(C log log |x |C logε n) where C is the output size,
combines orthogonal range queries with IPM Queries.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 9/16
![Page 46: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/46.jpg)
Applications in Substring Compression
T : y :i ji ′ j ′
x :
Generalized Substring Compressionserver
clientT [i , j ]? LZ (y)
y :
T [i ′, j ′]? LZ (x |y)
x :
Problem (Cormode & Muthukrishnan; SODA 2005)
Given subwords x and y , compute LZ (x |y), i.e., the section of theLempel-Ziv (LZ77) factorization LZ(y$x) that corresponds to x .
data structure of O(n) size,
query time improved from O(C log |x |C logε n) to
O(C log log |x |C logε n) where C is the output size,
combines orthogonal range queries with IPM Queries.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 9/16
![Page 47: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/47.jpg)
Applications in Substring Compression
T : y :i ji ′ j ′
x :
Generalized Substring Compressionserver
clientT [i , j ]? LZ (y)
y :
T [i ′, j ′]? LZ (x |y)
x :
Problem (Cormode & Muthukrishnan; SODA 2005)
Given subwords x and y , compute LZ (x |y), i.e., the section of theLempel-Ziv (LZ77) factorization LZ(y$x) that corresponds to x .
data structure of O(n) size,
query time improved from O(C log |x |C logε n) to
O(C log log |x |C logε n) where C is the output size,
combines orthogonal range queries with IPM Queries.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 9/16
![Page 48: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/48.jpg)
Other Applications
Problem (Cyclic Equivalence Queries)
Decide whether given subwords x and y are cyclic shifts of eachother (x = uv and y = vu for some strings u and v).
O(1) query time using IPM Queries.
Definition (δ-subrepetition)
A δ-subrepetition is a fragment x such that per(x) ≤ |x |1+δ and
x cannot be extended (to the left or right) preserving per(x).
Improved algorithm for finding δ-subrepetitions in a word:first (expected) linear-time algorithm for δ = Θ(1).
More applications through wavelet suffix trees:substring suffix rank & selection,substring compression using Burrows-Wheeler transform.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 10/16
![Page 49: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/49.jpg)
Other Applications
Problem (Cyclic Equivalence Queries)
Decide whether given subwords x and y are cyclic shifts of eachother (x = uv and y = vu for some strings u and v).
O(1) query time using IPM Queries.
Definition (δ-subrepetition)
A δ-subrepetition is a fragment x such that per(x) ≤ |x |1+δ and
x cannot be extended (to the left or right) preserving per(x).
Improved algorithm for finding δ-subrepetitions in a word:first (expected) linear-time algorithm for δ = Θ(1).
More applications through wavelet suffix trees:substring suffix rank & selection,substring compression using Burrows-Wheeler transform.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 10/16
![Page 50: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/50.jpg)
Other Applications
Problem (Cyclic Equivalence Queries)
Decide whether given subwords x and y are cyclic shifts of eachother (x = uv and y = vu for some strings u and v).
O(1) query time using IPM Queries.
Definition (δ-subrepetition)
A δ-subrepetition is a fragment x such that per(x) ≤ |x |1+δ and
x cannot be extended (to the left or right) preserving per(x).
Improved algorithm for finding δ-subrepetitions in a word:first (expected) linear-time algorithm for δ = Θ(1).
More applications through wavelet suffix trees:substring suffix rank & selection,substring compression using Burrows-Wheeler transform.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 10/16
![Page 51: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/51.jpg)
High-level Ideas
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 11/16
![Page 52: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/52.jpg)
IPM Queries: Outline
x :
z :
2k
y :
Main idea:Design (locally consistent) representative assignment
representatives of length 2k with |x|4 < 2k ≤ |x|
2 ,store information about location of representative only.
Query algorithm:
1. Compute the representative z assigned to x .
2. Find occurrences of z within y .
3. Check which occurrences of z extend to occurrences of x .
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 12/16
![Page 53: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/53.jpg)
IPM Queries: Outline
x :
z :
2k
y :
Main idea:Design (locally consistent) representative assignment
representatives of length 2k with |x|4 < 2k ≤ |x|
2 ,store information about location of representative only.
Query algorithm:
1. Compute the representative z assigned to x .
2. Find occurrences of z within y .
3. Check which occurrences of z extend to occurrences of x .
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 12/16
![Page 54: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/54.jpg)
IPM Queries: Outline
x :
z :
2k
y :
Main idea:Design (locally consistent) representative assignment
representatives of length 2k with |x|4 < 2k ≤ |x|
2 ,store information about location of representative only.
Query algorithm:
1. Compute the representative z assigned to x .
2. Find occurrences of z within y .
3. Check which occurrences of z extend to occurrences of x .
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 12/16
![Page 55: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/55.jpg)
IPM Queries: Outline
x : z :
2k
y :
Main idea:Design (locally consistent) representative assignment
representatives of length 2k with |x|4 < 2k ≤ |x|
2 ,store information about location of representative only.
Query algorithm:
1. Compute the representative z assigned to x .
2. Find occurrences of z within y .
3. Check which occurrences of z extend to occurrences of x .
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 12/16
![Page 56: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/56.jpg)
IPM Queries: Outline
x : z :
2k
y :
Main idea:Design (locally consistent) representative assignment
representatives of length 2k with |x|4 < 2k ≤ |x|
2 ,store information about location of representative only.
Query algorithm:
1. Compute the representative z assigned to x .
2. Find occurrences of z within y .
3. Check which occurrences of z extend to occurrences of x .
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 12/16
![Page 57: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/57.jpg)
IPM Queries: Outline
x : z :
2k
y :
Main idea:Design (locally consistent) representative assignment
representatives of length 2k with |x|4 < 2k ≤ |x|
2 ,store information about location of representative only.
Query algorithm:
1. Compute the representative z assigned to x .
2. Find occurrences of z (as representatives) within y .
3. Check which occurrences of z extend to occurrences of x .
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 12/16
![Page 58: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/58.jpg)
IPM Queries: Outline
x : z :
2k
y :
Main idea:Design (locally consistent) representative assignment
representatives of length 2k with |x|4 < 2k ≤ |x|
2 ,store information about location of representative only.
Query algorithm:
1. Compute the representative z assigned to x .
2. Find occurrences of z (as representatives) within y .
3. Check which occurrences of z extend to occurrences of x .Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 12/16
![Page 59: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/59.jpg)
IPM Queries: Outline
x : z :
2k
y :
Main idea:Design (locally consistent) representative assignment
representatives of length 2k with |x|4 < 2k ≤ |x|
2 ,store information about location of representative only.
Query algorithm:
1. Compute the representative z assigned to x .
2. Find occurrences of z (as representatives) within y .
3. Check which occurrences of z extend to occurrences of x .Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 12/16
![Page 60: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/60.jpg)
Periodic Representatives
x :
y :
Representative: any periodic subword of appropriate length.
use combinatorial and algorithmic tools for repetitions.
Query algorithm:
1. Extend z to a run.
2. Find runs of the same period in y .
3. Generate candidates using run endpoints.
4. Check which candidates are indeed occurrences.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 13/16
![Page 61: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/61.jpg)
Periodic Representatives
x :
y :
Representative: any periodic subword of appropriate length.
use combinatorial and algorithmic tools for repetitions.
Query algorithm:
1. Extend z to a run.
2. Find runs of the same period in y .
3. Generate candidates using run endpoints.
4. Check which candidates are indeed occurrences.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 13/16
![Page 62: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/62.jpg)
Periodic Representatives
x :
y :
Representative: any periodic subword of appropriate length.
use combinatorial and algorithmic tools for repetitions.
Query algorithm:
1. Extend z to a run. (Suppose it does not cover x .)
2. Find runs of the same period in y .
3. Generate candidates using run endpoints.
4. Check which candidates are indeed occurrences.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 13/16
![Page 63: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/63.jpg)
Periodic Representatives
x :
y :
Representative: any periodic subword of appropriate length.
use combinatorial and algorithmic tools for repetitions.
Query algorithm:
1. Extend z to a run. (Suppose it does not cover x .)
2. Find runs of the same period in y .
3. Generate candidates using run endpoints.
4. Check which candidates are indeed occurrences.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 13/16
![Page 64: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/64.jpg)
Periodic Representatives
x :
y :
Representative: any periodic subword of appropriate length.
use combinatorial and algorithmic tools for repetitions.
Query algorithm:
1. Extend z to a run. (Suppose it does not cover x .)
2. Find runs of the same period in y .
3. Generate candidates using run endpoints.
4. Check which candidates are indeed occurrences.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 13/16
![Page 65: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/65.jpg)
Periodic Representatives
x :
y :
Representative: any periodic subword of appropriate length.
use combinatorial and algorithmic tools for repetitions.
Query algorithm:
1. Extend z to a run. (Suppose it does not cover x .)
2. Find runs of the same period in y .
3. Generate candidates using run endpoints.
4. Check which candidates are indeed occurrences.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 13/16
![Page 66: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/66.jpg)
Periodic Representatives
x :
y :
Representative: any periodic subword of appropriate length.
use combinatorial and algorithmic tools for repetitions.
Query algorithm:
1. Extend z to a run. (Suppose it covers x .)
2. Find runs of the same period in y .
3. Synchronize runs using Lyndon roots.
4. Output occurrences in bulk.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 13/16
![Page 67: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/67.jpg)
Periodic Representatives
x :
y :
Representative: any periodic subword of appropriate length.
use combinatorial and algorithmic tools for repetitions.
Query algorithm:
1. Extend z to a run. (Suppose it covers x .)
2. Find runs of the same period in y .
3. Synchronize runs using Lyndon roots.
4. Output occurrences in bulk.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 13/16
![Page 68: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/68.jpg)
Periodic Representatives
x :
y :
Representative: any periodic subword of appropriate length.
use combinatorial and algorithmic tools for repetitions.
Query algorithm:
1. Extend z to a run. (Suppose it covers x .)
2. Find runs of the same period in y .
3. Synchronize runs using Lyndon roots.
4. Output occurrences in bulk.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 13/16
![Page 69: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/69.jpg)
Periodic Representatives
x :
y :
Representative: any periodic subword of appropriate length.
use combinatorial and algorithmic tools for repetitions.
Query algorithm:
1. Extend z to a run. (Suppose it covers x .)
2. Find runs of the same period in y .
3. Synchronize runs using Lyndon roots.
4. Output occurrences in bulk.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 13/16
![Page 70: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/70.jpg)
Non-periodic Representatives
x : z :
x : z :x : z :x : z :x : z :
y :
Non-periodic representatives:
occurrences of z in y cannot overlap too much;O(1) occurrences within y .
Representative assignment:minimal subword of appropriate length wrt. a random order
guaranteed local consistence,neighbouring patterns often share the representative,in total O(n) representatives with O(n) occurrences as arepresentative.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 14/16
![Page 71: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/71.jpg)
Non-periodic Representatives
x : z :
x : z :x : z :x : z :x : z :
y :
Non-periodic representatives:occurrences of z in y cannot overlap too much;
O(1) occurrences within y .
Representative assignment:minimal subword of appropriate length wrt. a random order
guaranteed local consistence,neighbouring patterns often share the representative,in total O(n) representatives with O(n) occurrences as arepresentative.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 14/16
![Page 72: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/72.jpg)
Non-periodic Representatives
x : z :
x : z :x : z :x : z :x : z :
y :
Non-periodic representatives:occurrences of z in y cannot overlap too much;
O(1) occurrences within y .
Representative assignment:
minimal subword of appropriate length wrt. a random orderguaranteed local consistence,neighbouring patterns often share the representative,in total O(n) representatives with O(n) occurrences as arepresentative.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 14/16
![Page 73: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/73.jpg)
Non-periodic Representatives
x : z :
x : z :x : z :x : z :x : z :
y :
Non-periodic representatives:occurrences of z in y cannot overlap too much;
O(1) occurrences within y .
Representative assignment:minimal subword of appropriate length wrt. a random order
guaranteed local consistence,
neighbouring patterns often share the representative,in total O(n) representatives with O(n) occurrences as arepresentative.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 14/16
![Page 74: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/74.jpg)
Non-periodic Representatives
x : z :
x : z :x : z :x : z :x : z :
y :
Non-periodic representatives:occurrences of z in y cannot overlap too much;
O(1) occurrences within y .
Representative assignment:minimal subword of appropriate length wrt. a random order
guaranteed local consistence,neighbouring patterns often share the representative,
in total O(n) representatives with O(n) occurrences as arepresentative.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 14/16
![Page 75: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/75.jpg)
Non-periodic Representatives
x : z :
x : z :
x : z :x : z :x : z :
y :
Non-periodic representatives:occurrences of z in y cannot overlap too much;
O(1) occurrences within y .
Representative assignment:minimal subword of appropriate length wrt. a random order
guaranteed local consistence,neighbouring patterns often share the representative,
in total O(n) representatives with O(n) occurrences as arepresentative.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 14/16
![Page 76: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/76.jpg)
Non-periodic Representatives
x : z :x : z :
x : z :
x : z :x : z :
y :
Non-periodic representatives:occurrences of z in y cannot overlap too much;
O(1) occurrences within y .
Representative assignment:minimal subword of appropriate length wrt. a random order
guaranteed local consistence,neighbouring patterns often share the representative,
in total O(n) representatives with O(n) occurrences as arepresentative.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 14/16
![Page 77: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/77.jpg)
Non-periodic Representatives
x : z :x : z :x : z :
x : z :
x : z :
y :
Non-periodic representatives:occurrences of z in y cannot overlap too much;
O(1) occurrences within y .
Representative assignment:minimal subword of appropriate length wrt. a random order
guaranteed local consistence,neighbouring patterns often share the representative,
in total O(n) representatives with O(n) occurrences as arepresentative.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 14/16
![Page 78: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/78.jpg)
Non-periodic Representatives
x : z :x : z :x : z :x : z :
x : z :
y :
Non-periodic representatives:occurrences of z in y cannot overlap too much;
O(1) occurrences within y .
Representative assignment:minimal subword of appropriate length wrt. a random order
guaranteed local consistence,neighbouring patterns often share the representative,
in total O(n) representatives with O(n) occurrences as arepresentative.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 14/16
![Page 79: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/79.jpg)
Non-periodic Representatives
x : z :x : z :x : z :x : z :
x : z :
y :
Non-periodic representatives:occurrences of z in y cannot overlap too much;
O(1) occurrences within y .
Representative assignment:minimal subword of appropriate length wrt. a random order
guaranteed local consistence,neighbouring patterns often share the representative,in total O(n) representatives with O(n) occurrences as arepresentative.
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 14/16
![Page 80: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/80.jpg)
Conclusions & Open Problems
Our results:
a O(n)-size data structure for IPM Queries with O(1)-timequeries and O(n)-expected time construction,several applications of IPM Queries:
further internal queries,faster algorithms.
Open problems:
More applications of IPM Queries.Design a deterministic construction algorithm. . .
or an algorithm running in O(n) time w.h.p.
Can shortest periods be computed in o(log |x |) time?Is there any lower bound for this problem?
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 15/16
![Page 81: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/81.jpg)
Conclusions & Open Problems
Our results:
a O(n)-size data structure for IPM Queries with O(1)-timequeries and O(n)-expected time construction,several applications of IPM Queries:
further internal queries,faster algorithms.
Open problems:
More applications of IPM Queries.
Design a deterministic construction algorithm. . .or an algorithm running in O(n) time w.h.p.
Can shortest periods be computed in o(log |x |) time?Is there any lower bound for this problem?
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 15/16
![Page 82: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/82.jpg)
Conclusions & Open Problems
Our results:
a O(n)-size data structure for IPM Queries with O(1)-timequeries and O(n)-expected time construction,several applications of IPM Queries:
further internal queries,faster algorithms.
Open problems:
More applications of IPM Queries.Design a deterministic construction algorithm. . .
or an algorithm running in O(n) time w.h.p.
Can shortest periods be computed in o(log |x |) time?Is there any lower bound for this problem?
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 15/16
![Page 83: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/83.jpg)
Conclusions & Open Problems
Our results:
a O(n)-size data structure for IPM Queries with O(1)-timequeries and O(n)-expected time construction,several applications of IPM Queries:
further internal queries,faster algorithms.
Open problems:
More applications of IPM Queries.Design a deterministic construction algorithm. . .
or an algorithm running in O(n) time w.h.p.
Can shortest periods be computed in o(log |x |) time?Is there any lower bound for this problem?
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 15/16
![Page 84: Internal Pattern Matching Queries in a Text and Applicationskociumaka/files/soda2015b_slides.pdf · SODA 2015 San Diego, California, USA January 4, 2015 Tomasz Kociumaka, J. Radoszewski,](https://reader034.vdocuments.net/reader034/viewer/2022050400/5f7deee072e6845bc26143a0/html5/thumbnails/84.jpg)
Thank you
Thank you for your attention!
Tomasz Kociumaka, J. Radoszewski, W. Rytter, T. Waleń Internal Pattern Matching Queries in a Text 16/16