a linear time algorithm for seeds computation€¦ · a linear time algorithm for seeds computation...
TRANSCRIPT
![Page 1: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/1.jpg)
A Linear Time Algorithmfor Seeds Computation
Tomasz Kociumaka, Marcin Kubica, JakubRadoszewski, Wojciech Rytter, Tomasz Waleń
University of Warsaw
LSD 2012 London, February 9, 2012
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 1/24
![Page 2: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/2.jpg)
Periodicity and quasiperiodicity
Periodicity:
a a a a a a a a a a a ab b b b
a a a b
One of the key concepts in text algorithms.
A
a a a a a a a a a a a a ab b b b bb
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 2/24
![Page 3: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/3.jpg)
Periodicity and quasiperiodicity
Periodicity:
a a a a a a a a a a a ab b b ba a a b
One of the key concepts in text algorithms.
A
a a a a a a a a a a a a ab b b b bb
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 2/24
![Page 4: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/4.jpg)
Periodicity and quasiperiodicity
Periodicity:
a a a a a a a a a a a ab b b ba a a b
Quasiperiodicity:
a a a a a a a a a a a a ab b b b
bb
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 2/24
![Page 5: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/5.jpg)
Periodicity and quasiperiodicity
Periodicity:
a a a a a a a a a a a ab b b ba a a b
Quasiperiodicity:
a a a a a a a a a a a a ab b b b bb
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 2/24
![Page 6: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/6.jpg)
Covers and seeds
Cover:
a a a a a a a a a a a ab b b b b ba b b
Each letter of the word is covered by an occurrenceof the cover.
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 3/24
![Page 7: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/7.jpg)
Covers and seeds
Cover:
a a a a a a a a a a a ab b b b b ba b b
Each letter of the word is covered by an occurrenceof the cover.
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 3/24
![Page 8: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/8.jpg)
Covers and seeds
Seed:
a a a a a a a a a a a ab b b b b b
a b b
Each letter of the word is covered by an occurrenceof the seed. The occurrences can be external.
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 3/24
![Page 9: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/9.jpg)
Covers and seeds
Seed:
a a a a a a a a a a a ab b b b b b
a b b
Each letter of the word is covered by an occurrenceof the seed. The occurrences can be external.
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 3/24
![Page 10: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/10.jpg)
The main problem
Problem (Shortest-Seed)Given a word w of length n over an alphabet Σ,compute the shortest seed of w.
Problem (All-Seeds)Given a word w of length n over an alphabet Σ,compute an O(n)-sized representation of all theseeds of w.
Theorem (Our result)
The All-Seeds Problem for Σ ={
0, 1, . . . , nO(1)}
can be solved in O(n) time.
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 4/24
![Page 11: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/11.jpg)
The main problem
Problem (Shortest-Seed)Given a word w of length n over an alphabet Σ,compute the shortest seed of w.
Problem (All-Seeds)Given a word w of length n over an alphabet Σ,compute an O(n)-sized representation of all theseeds of w.
Theorem (Our result)
The All-Seeds Problem for Σ ={
0, 1, . . . , nO(1)}
can be solved in O(n) time.
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 4/24
![Page 12: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/12.jpg)
The main problem
Problem (Shortest-Seed)Given a word w of length n over an alphabet Σ,compute the shortest seed of w.
Problem (All-Seeds)Given a word w of length n over an alphabet Σ,compute an O(n)-sized representation of all theseeds of w.
Theorem (Our result)
The All-Seeds Problem for Σ ={
0, 1, . . . , nO(1)}
can be solved in O(n) time.
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 4/24
![Page 13: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/13.jpg)
Background
Seeds were introduced in 1993 by Iliopoulos,Moore & Park.In the same paper O(n log n)-time algorithmfor the All-Seeds Problem over fixed-sizealphabet was given.
No o(n log n) algorithm even for theShortest-Seed Problem for binary alphabet upto now, despite considerable effort given forthis problem. (Smyth, 2000).
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 5/24
![Page 14: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/14.jpg)
Background
Seeds were introduced in 1993 by Iliopoulos,Moore & Park.In the same paper O(n log n)-time algorithmfor the All-Seeds Problem over fixed-sizealphabet was given.No o(n log n) algorithm even for theShortest-Seed Problem for binary alphabet upto now, despite considerable effort given forthis problem. (Smyth, 2000).
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 5/24
![Page 15: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/15.jpg)
Background (cont’d)
An O(log n)-time PRAM algorithm forn processors, Ben-Amran et al., 1994.
For covers linear algorithms for similarproblems are known:
shortest covers of each prefix (Breslauer, 1992)all covers (Moore & Smyth, 1994)
Variants of seeds have been studied:approximate seeds (Christodoulakis et al., 2003)λ-seeds (Qing, Hui & Iliopoulos, 2006)
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 6/24
![Page 16: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/16.jpg)
Background (cont’d)
An O(log n)-time PRAM algorithm forn processors, Ben-Amran et al., 1994.For covers linear algorithms for similarproblems are known:
shortest covers of each prefix (Breslauer, 1992)all covers (Moore & Smyth, 1994)
Variants of seeds have been studied:approximate seeds (Christodoulakis et al., 2003)λ-seeds (Qing, Hui & Iliopoulos, 2006)
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 6/24
![Page 17: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/17.jpg)
Background (cont’d)
An O(log n)-time PRAM algorithm forn processors, Ben-Amran et al., 1994.For covers linear algorithms for similarproblems are known:
shortest covers of each prefix (Breslauer, 1992)all covers (Moore & Smyth, 1994)
Variants of seeds have been studied:approximate seeds (Christodoulakis et al., 2003)λ-seeds (Qing, Hui & Iliopoulos, 2006)
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 6/24
![Page 18: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/18.jpg)
Constraints for seeds
Two different types of constraintsBorder constraints, easier
Maxgap constrains, harder
a a a a a a a a a a a ab b b b b
≤ 5
≤ 5
Maxgap is a maximal distance between the starting positionsof two consecutive occurrences of a given substring.
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 7/24
![Page 19: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/19.jpg)
Constraints for seeds
Two different types of constraintsBorder constraints, easierMaxgap constrains, harder
a a a a a a a a a a a ab b b b b
≤ 5
≤ 5
Maxgap is a maximal distance between the starting positionsof two consecutive occurrences of a given substring.
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 7/24
![Page 20: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/20.jpg)
Quasiseeds
The All-Seeds Problem can be linearly reducedto computing the maxgaps of all substrings(encoded in a suffix tree).No o(n log n) algorithm known.
Definition (Quasiseed)A substring v is a quasiseed of w if there are lessthan |v | letters both before its first occurrence andafter the last one and each letter between those twooccurrences is covered by an occurrence of v .
a a a a a a a a a a a ab b b b b b
all letters covered< 5
< 5
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 8/24
![Page 21: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/21.jpg)
Quasiseeds
The All-Seeds Problem can be linearly reducedto computing the maxgaps of all substrings(encoded in a suffix tree).No o(n log n) algorithm known.
Definition (Quasiseed)A substring v is a quasiseed of w if there are lessthan |v | letters both before its first occurrence andafter the last one and each letter between those twooccurrences is covered by an occurrence of v .
a a a a a a a a a a a ab b b b b b
all letters covered< 5
< 5
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 8/24
![Page 22: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/22.jpg)
Useful properties of quasiseeds
An O(n) representation on the suffix tree.
v
a
a
abaaaaa
non-quasiseeds
quasiseeds
aaaa aaaaa aaaaaa aaaaaaab b b
aaa aaaaab
aaa aaab
A family of ranges represented by the lower end(explicit node) and the depth of the upper end.
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 9/24
![Page 23: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/23.jpg)
Useful properties of quasiseeds
An O(n) representation on the suffix tree.
v
a
a
abaaaaa
non-quasiseeds
quasiseeds
aaaa aaaaa aaaaaa aaaaaaab b b
aaa aaaaab
aaa aaab
A family of ranges represented by the lower end(explicit node) and the depth of the upper end.
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 9/24
![Page 24: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/24.jpg)
Useful properties of quasiseeds (cont’d)
Lemma (Restricted-Quasiseeds)Given an integer d and a word w of length n, therepresentation of all quasiseeds of length in{d , d + 1, . . . , 2d} can be found in O(n) time.
The All-Seeds Problem can be linearly reducedto computing (the representation of) allquasiseeds.
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 10/24
![Page 25: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/25.jpg)
Useful properties of quasiseeds (cont’d)
Lemma (Restricted-Quasiseeds)Given an integer d and a word w of length n, therepresentation of all quasiseeds of length in{d , d + 1, . . . , 2d} can be found in O(n) time.
The All-Seeds Problem can be linearly reducedto computing (the representation of) allquasiseeds.
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 10/24
![Page 26: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/26.jpg)
Main problem
Problem (All-Quasiseeds)Given a word w of length n, compute therepresentation of all its quasiseeds.
Our solution is a recursive algorithm of a rathernonstandard structure.
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 11/24
![Page 27: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/27.jpg)
Main problem
Problem (All-Quasiseeds)Given a word w of length n, compute therepresentation of all its quasiseeds.
Our solution is a recursive algorithm of a rathernonstandard structure.
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 11/24
![Page 28: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/28.jpg)
Structure of the algorithm
Interval staircase (parametrized by an integer m).
w :
≥ 2m≥ 2m
Lemma (Short Quasiseeds)A substring v of length ≤ m is a quasiseed of w ifand only if it is a quasiseed of each substringcorresponding to an interval of the staircase.
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 12/24
![Page 29: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/29.jpg)
Structure of the algorithm
Interval staircase (parametrized by an integer m).
w :
≥ 2m
≥ 2m
Lemma (Short Quasiseeds)A substring v of length ≤ m is a quasiseed of w ifand only if it is a quasiseed of each substringcorresponding to an interval of the staircase.
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 12/24
![Page 30: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/30.jpg)
Structure of the algorithm
Interval staircase (parametrized by an integer m).
w :
≥ 2m≥ 2m
Lemma (Short Quasiseeds)A substring v of length ≤ m is a quasiseed of w ifand only if it is a quasiseed of each substringcorresponding to an interval of the staircase.
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 12/24
![Page 31: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/31.jpg)
Structure of the algorithm
Interval staircase (parametrized by an integer m).
w :
≥ 2m≥ 2m
Lemma (Short Quasiseeds)A substring v of length ≤ m is a quasiseed of w ifand only if it is a quasiseed of each substringcorresponding to an interval of the staircase.
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 12/24
![Page 32: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/32.jpg)
Structure of the algorithm (cont’d)
The total length of the intervals in a staircase(size of the staircase) is at least n.
If it were e.g. ≤ 12n, the recursion could yield alinear algorithm.We need to reduce the staircase.
not needed
=w :
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 13/24
![Page 33: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/33.jpg)
Structure of the algorithm (cont’d)
The total length of the intervals in a staircase(size of the staircase) is at least n.If it were e.g. ≤ 12n, the recursion could yield alinear algorithm.
We need to reduce the staircase.
not needed
=w :
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 13/24
![Page 34: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/34.jpg)
Structure of the algorithm (cont’d)
The total length of the intervals in a staircase(size of the staircase) is at least n.If it were e.g. ≤ 12n, the recursion could yield alinear algorithm.We need to reduce the staircase.
not needed
=w :
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 13/24
![Page 35: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/35.jpg)
Structure of the algorithm (cont’d)
The total length of the intervals in a staircase(size of the staircase) is at least n.If it were e.g. ≤ 12n, the recursion could yield alinear algorithm.We need to reduce the staircase.
not needed
=w :
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 13/24
![Page 36: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/36.jpg)
Structure of the algorithm (cont’d)
Outline:1 Find an appropriate reduced staircase2 Find the long quasiseeds (non-recursively)3 Find the short quasiseeds (by recursive calls)4 Merge the results of those calls
Main issueHow to find an appropriate staircase such that
the reduced staircase is small,long quasiseeds can be found in O(n).
Due to the Restricted-Quasiseeds Lemma,m = Θ(n) would suffice for the second condition.
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 14/24
![Page 37: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/37.jpg)
Structure of the algorithm (cont’d)
Outline:1 Find an appropriate reduced staircase2 Find the long quasiseeds (non-recursively)3 Find the short quasiseeds (by recursive calls)4 Merge the results of those calls
Main issueHow to find an appropriate staircase such that
the reduced staircase is small,long quasiseeds can be found in O(n).
Due to the Restricted-Quasiseeds Lemma,m = Θ(n) would suffice for the second condition.
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 14/24
![Page 38: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/38.jpg)
Structure of the algorithm (cont’d)
Outline:1 Find an appropriate reduced staircase2 Find the long quasiseeds (non-recursively)3 Find the short quasiseeds (by recursive calls)4 Merge the results of those calls
Main issueHow to find an appropriate staircase such that
the reduced staircase is small,long quasiseeds can be found in O(n).
Due to the Restricted-Quasiseeds Lemma,m = Θ(n) would suffice for the second condition.
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 14/24
![Page 39: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/39.jpg)
f -factorization
A variant of a well known LZ-factorization
Definition (f -factorization)An f -factorization f1f2 . . . fk of w is constructedgreedily: fi is either just the first occurrence of aletter or the longest prefix of the remaining suffixthat is a substring of f1 . . . fi−1.
a a a a a a a a ab b b b b b c
Theorem (Crochemore, 1983; Crochemore et al.2009; Crochemore, Tischler 2011)The f -factorization over integer alphabet can becomputed in O(n) time.
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 15/24
![Page 40: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/40.jpg)
f -factorization
Definition (f -factorization)An f -factorization f1f2 . . . fk of w is constructedgreedily: fi is either just the first occurrence of aletter or the longest prefix of the remaining suffixthat is a substring of f1 . . . fi−1.
a a a a a a a a ab b b b b b c
Theorem (Crochemore, 1983; Crochemore et al.2009; Crochemore, Tischler 2011)The f -factorization over integer alphabet can becomputed in O(n) time.
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 15/24
![Page 41: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/41.jpg)
f -staircase
Intuition:Stairs lying within a factor are not needed.
In reduced staircase no stairs should overlap.
w :
> 4m
2m 2m
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 16/24
![Page 42: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/42.jpg)
f -staircase
Intuition:Stairs lying within a factor are not needed.In reduced staircase no stairs should overlap.
w :
> 4m
2m 2m
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 16/24
![Page 43: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/43.jpg)
f -staircase
Intuition:Stairs lying within a factor are not needed.In reduced staircase no stairs should overlap.
w :
> 4m
2m 2m
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 16/24
![Page 44: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/44.jpg)
f -staircase
Intuition:Stairs lying within a factor are not needed.In reduced staircase no stairs should overlap.
w :
> 4m
2m 2m
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 16/24
![Page 45: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/45.jpg)
f -staircase
Intuition:Stairs lying within a factor are not needed.In reduced staircase no stairs should overlap.
w :
> 4m
2m 2m
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 16/24
![Page 46: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/46.jpg)
f -staircase
Intuition:Stairs lying within a factor are not needed.In reduced staircase no stairs should overlap.
w :
> 4m
2m 2m
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 16/24
![Page 47: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/47.jpg)
f -staircase
Intuition:Stairs lying within a factor are not needed.In reduced staircase no stairs should overlap.
w :
> 4m
2m 2m
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 16/24
![Page 48: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/48.jpg)
Quasiseeds, staircase and factorization
LemmaLet F be the f -factorization of w (|w | = n) and vbe a quasiseed of w, |v | < n
16 . Then at most 2n|v | − 1
factors from F lie within [2n16 ,15n16 ].
2n16
n16g ≤ 2n|v | − 1 middle factors
The size of the f -staircase is at most3n16 + 4m(g + 1).
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 17/24
![Page 49: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/49.jpg)
Quasiseeds, staircase and factorization
LemmaLet F be the f -factorization of w (|w | = n) and vbe a quasiseed of w, |v | < n
16 . Then at most 2n|v | − 1
factors from F lie within [2n16 ,15n16 ].
2n16
n16g ≤ 2n|v | − 1 middle factors
The size of the f -staircase is at most3n16 + 4m(g + 1).
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 17/24
![Page 50: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/50.jpg)
Quasiseeds, staircase and factorization
LemmaLet F be the f -factorization of w (|w | = n) and vbe a quasiseed of w, |v | < n
16 . Then at most 2n|v | − 1
factors from F lie within [2n16 ,15n16 ].
2n16
n16g ≤ 2n|v | − 1 middle factors
The size of the f -staircase is at most3n16 + 4m(g + 1).
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 17/24
![Page 51: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/51.jpg)
Key lemmas
The algorithm does not know the quasiseed, butcan find the number of middle factors.
Let g be the number of middle factors of the wordw , |w | = n > 48.
LemmaThere is no quasiseed v of w such that:
2ng+1 < |v | ≤
n16 .
LemmaIf m ≤ n
16(g+1) then the size of the f -staircase is< n2 .
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 18/24
![Page 52: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/52.jpg)
Key lemmas
The algorithm does not know the quasiseed, butcan find the number of middle factors.Let g be the number of middle factors of the wordw , |w | = n > 48.
LemmaThere is no quasiseed v of w such that:
2ng+1 < |v | ≤
n16 .
LemmaIf m ≤ n
16(g+1) then the size of the f -staircase is< n2 .
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 18/24
![Page 53: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/53.jpg)
Key lemmas
The algorithm does not know the quasiseed, butcan find the number of middle factors.Let g be the number of middle factors of the wordw , |w | = n > 48.
LemmaThere is no quasiseed v of w such that:
2ng+1 < |v | ≤
n16 .
LemmaIf m ≤ n
16(g+1) then the size of the f -staircase is< n2 .
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 18/24
![Page 54: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/54.jpg)
Final structure of the algorithm
1 Find an f -factorization and the number ofmiddle factors (g)
2 m :=⌊
n16(g+1)
⌋3 Compute the f -staircase4 Compute the long quasiseeds (belonging to two
ranges of fixed ratio)5 If m > 0 compute the short quasiseeds by
recursive calls and merge the results
Merging needs some more attention.
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 19/24
![Page 55: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/55.jpg)
Final structure of the algorithm
1 Find an f -factorization and the number ofmiddle factors (g)
2 m :=⌊
n16(g+1)
⌋3 Compute the f -staircase4 Compute the long quasiseeds (belonging to two
ranges of fixed ratio)5 If m > 0 compute the short quasiseeds by
recursive calls and merge the results
Merging needs some more attention.
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 19/24
![Page 56: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/56.jpg)
Representation of quasiseeds
An O(n) representation on the suffix tree.
v
a
a
abaaaaa
non-quasiseeds
quasiseeds
aaaa aaaaa aaaaaa aaaaaaab b b
aaa aaaaab
aaa aaab
A family of ranges represented by the lower end(explicit node) and the depth of the upper end.
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 20/24
![Page 57: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/57.jpg)
Merging
We have to find the intersection of the sets ofquasiseeds of some substrings of w .
The quasiseeds of each substring are reportedas a family of ranges on the suffix tree of thissubstring.
We want the result to be on the suffix tree ofthe whole w .No direct relation between (explicit) nodes ofthose suffix trees and the suffix tree of w .But relation between leaves is straightforward.Solution: represent a range by a leaf with apair of depths.
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 21/24
![Page 58: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/58.jpg)
Merging
We have to find the intersection of the sets ofquasiseeds of some substrings of w .
The quasiseeds of each substring are reportedas a family of ranges on the suffix tree of thissubstring.We want the result to be on the suffix tree ofthe whole w .
No direct relation between (explicit) nodes ofthose suffix trees and the suffix tree of w .But relation between leaves is straightforward.Solution: represent a range by a leaf with apair of depths.
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 21/24
![Page 59: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/59.jpg)
Merging
We have to find the intersection of the sets ofquasiseeds of some substrings of w .
The quasiseeds of each substring are reportedas a family of ranges on the suffix tree of thissubstring.We want the result to be on the suffix tree ofthe whole w .No direct relation between (explicit) nodes ofthose suffix trees and the suffix tree of w .
But relation between leaves is straightforward.Solution: represent a range by a leaf with apair of depths.
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 21/24
![Page 60: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/60.jpg)
Merging
We have to find the intersection of the sets ofquasiseeds of some substrings of w .
The quasiseeds of each substring are reportedas a family of ranges on the suffix tree of thissubstring.We want the result to be on the suffix tree ofthe whole w .No direct relation between (explicit) nodes ofthose suffix trees and the suffix tree of w .But relation between leaves is straightforward.
Solution: represent a range by a leaf with apair of depths.
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 21/24
![Page 61: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/61.jpg)
Merging
We have to find the intersection of the sets ofquasiseeds of some substrings of w .
The quasiseeds of each substring are reportedas a family of ranges on the suffix tree of thissubstring.We want the result to be on the suffix tree ofthe whole w .No direct relation between (explicit) nodes ofthose suffix trees and the suffix tree of w .But relation between leaves is straightforward.Solution: represent a range by a leaf with apair of depths.
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 21/24
![Page 62: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/62.jpg)
Merging cont’d
The (leaf, depth, depth) representation is goodfor transferring only.
We need to go back to the original one. It canbe done with the aid of the following queries
Ancestor QueriesGiven an integer h and a leaf v , find the highestexplicit ancestor of v , whose depth is ≥ h.
Any q such queries can be answered offline inO(n + q) time.Once we’re back in the original representation,finding the intersection is elementary.
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 22/24
![Page 63: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/63.jpg)
Merging cont’d
The (leaf, depth, depth) representation is goodfor transferring only.We need to go back to the original one. It canbe done with the aid of the following queries
Ancestor QueriesGiven an integer h and a leaf v , find the highestexplicit ancestor of v , whose depth is ≥ h.
Any q such queries can be answered offline inO(n + q) time.Once we’re back in the original representation,finding the intersection is elementary.
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 22/24
![Page 64: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/64.jpg)
Merging cont’d
The (leaf, depth, depth) representation is goodfor transferring only.We need to go back to the original one. It canbe done with the aid of the following queries
Ancestor QueriesGiven an integer h and a leaf v , find the highestexplicit ancestor of v , whose depth is ≥ h.
Any q such queries can be answered offline inO(n + q) time.
Once we’re back in the original representation,finding the intersection is elementary.
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 22/24
![Page 65: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/65.jpg)
Merging cont’d
The (leaf, depth, depth) representation is goodfor transferring only.We need to go back to the original one. It canbe done with the aid of the following queries
Ancestor QueriesGiven an integer h and a leaf v , find the highestexplicit ancestor of v , whose depth is ≥ h.
Any q such queries can be answered offline inO(n + q) time.Once we’re back in the original representation,finding the intersection is elementary.
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 22/24
![Page 66: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/66.jpg)
Conclusions
We have presented a linear algorithm for theAll-Quasiseeds Problem (over integeralphabet).This yields a linear algorithm for the All-SeedsProblem (over integer alphabet).
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 23/24
![Page 67: A Linear Time Algorithm for Seeds Computation€¦ · A Linear Time Algorithm for Seeds Computation Tomasz Kociumaka, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń](https://reader034.vdocuments.net/reader034/viewer/2022050111/5f48d3c062c3da5adb64be9e/html5/thumbnails/67.jpg)
Thank you for your attention!
Tomasz Kociumaka A Linear Time Algorithm for Seeds Computation 24/24