lempel-ziv factorization revisited - ulm · lempel-ziv factorization revisited enno ohlebusch simon...
TRANSCRIPT
![Page 1: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/1.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
Lempel-Ziv Factorization Revisited
Enno Ohlebusch Simon Gog
Institute of Theoretical Computer ScienceUlm University
Palermo, June 27th 2011
![Page 2: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/2.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
Outline
1 Problem Definition and Motivation
2 Existing solutions
3 New fast algorithm
4 Experimental Results
![Page 3: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/3.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
Definition
Let S = S[1..n] a string of length n over an alphabet Σ.LZ-factorization of S is a factorization S = ω1ω2 · · ·ωm such thateach ωk , 1 ≤ k ≤ m, is either(a) a letter c ∈ Σ that does not occur in ω1ω2 · · ·ωk−1 or(b) the longest substring of S that occurs at least twice in
ω1ω2 · · ·ωk .Definition also know as LZ77
![Page 4: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/4.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
Example
S = acaaacatat
101 2 3 45 67 8 9 0a|c|a|aa|ca|t|a|t
suffix 4 aacatatsuffix 3 aaacatat
Encoding: (a,0), (c,0), (1,1), (3,2), (2,2), (t ,0), (7,2)
PrevOcc LPS
S = aa....a is encoded by (a,0), (1,n − 1)i.e. O(log n) bits
![Page 5: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/5.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
Applications
LZ-factorization is used in/forgzip, WinZip, PKZIP, 7zip with sliding windowcomputing all runs (Kolpakov and Kucherov)repeats with fixed gap (Kolpakov and Kucherov)branching repeats (Gusfield and Stoye)finding sequence alignment (Crochemore et al.)local periods (Duval et al.)
Space consumption is a bottleneck for finding tandemrepeats in DNA (Pokrazywa and Polanski, 2010)
![Page 6: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/6.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
Outline
1 Problem Definition and Motivation
2 Existing solutions
3 New fast algorithm
4 Experimental Results
![Page 7: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/7.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
A list of existing solutions
The construction of the LZ-factorization can be done in lineartime and space. Solutions using S and
suffix tree (ST) (Kolpakov and Kucherov)suffix array (SA) + longest common prefix array (LCP)(Crochemore et al.)...
SA + range minimum queries (RMQs) on SA (Chen et al.)SA + BWT (Okanohara and Sadakane)(C)SA + RMQ + inverse (C)SA + BWT of Srev + binare tree(Kreft and Navarro) for LZend using backward search...
fast
spac
e-ef
ficie
nt
![Page 8: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/8.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
Our contributions
A fast algorithm which uses S, SA and Φ during theconstruction
A simple space-efficient algorithm which uses Srev
BWT of Srev + backward searchCompressed Suffix Array (CSA)constant time next and previous greater value datastructure for SA which takes 2n + o(n) bits
![Page 9: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/9.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
Workflow of most fast solutions
First calculate Longest Previous String (LPS) and PreviousOccurrence (PrevOcc) for each suffix i in text order
S[i] a c a a a c a t a ti 1 2 3 4 5 6 7 8 9 10
LPF[i] or LPS[i] 0 0 1 2 3 2 1 0 2 1PrevOcc[i] 0 0 1 3 1 2 5 0 7 8
Second calculate LZ factorization from LPS(a,0)
![Page 10: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/10.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
Workflow of most fast solutions
First calculate Longest Previous String (LPS) and PreviousOccurrence (PrevOcc) for each suffix i in text order
S[i] a c a a a c a t a ti 1 2 3 4 5 6 7 8 9 10
LPF[i] or LPS[i] 0 0 1 2 3 2 1 0 2 1PrevOcc[i] 0 0 1 3 1 2 5 0 7 8
Second calculate LZ factorization from LPS(a,0) (c,0)
![Page 11: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/11.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
Workflow of most fast solutions
First calculate Longest Previous String (LPS) and PreviousOccurrence (PrevOcc) for each suffix i in text order
S[i] a c a a a c a t a ti 1 2 3 4 5 6 7 8 9 10
LPF[i] or LPS[i] 0 0 1 2 3 2 1 0 2 1PrevOcc[i] 0 0 1 3 1 2 5 0 7 8
Second calculate LZ factorization from LPS(a,0) (c,0) (1,1)
![Page 12: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/12.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
Workflow of most fast solutions
First calculate Longest Previous String (LPS) and PreviousOccurrence (PrevOcc) for each suffix i in text order
S[i] a c a a a c a t a ti 1 2 3 4 5 6 7 8 9 10
LPF[i] or LPS[i] 0 0 1 2 3 2 1 0 2 1PrevOcc[i] 0 0 1 3 1 2 5 0 7 8
Second calculate LZ factorization from LPS(a,0) (c,0) (1,1) (3,2)
![Page 13: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/13.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
Workflow of most fast solutions
First calculate Longest Previous String (LPS) and PreviousOccurrence (PrevOcc) for each suffix i in text order
S[i] a c a a a c a t a ti 1 2 3 4 5 6 7 8 9 10
LPF[i] or LPS[i] 0 0 1 2 3 2 1 0 2 1PrevOcc[i] 0 0 1 3 1 2 5 0 7 8
Second calculate LZ factorization from LPS(a,0) (c,0) (1,1) (3,2) (1,2)
![Page 14: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/14.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
Workflow of most fast solutions
First calculate Longest Previous String (LPS) and PreviousOccurrence (PrevOcc) for each suffix i in text order
S[i] a c a a a c a t a ti 1 2 3 4 5 6 7 8 9 10
LPF[i] or LPS[i] 0 0 1 2 3 2 1 0 2 1PrevOcc[i] 0 0 1 3 1 2 5 0 7 8
Second calculate LZ factorization from LPS(a,0) (c,0) (1,1) (3,2) (1,2) (t,0)
![Page 15: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/15.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
Workflow of most fast solutions
First calculate Longest Previous String (LPS) and PreviousOccurrence (PrevOcc) for each suffix i in text order
S[i] a c a a a c a t a ti 1 2 3 4 5 6 7 8 9 10
LPF[i] or LPS[i] 0 0 1 2 3 2 1 0 2 1PrevOcc[i] 0 0 1 3 1 2 5 0 7 8
Second calculate LZ factorization from LPS(a,0) (c,0) (1,1) (3,2) (1,2) (t,0) (7,2)
![Page 16: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/16.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
Calculating LPS[SA[i ]]
i SA[i] LCP[i] SSA[i] PSV[i] NSV[i] LPS[SA[i]] PrevOcc[SA[i]]0 0 ε
1 3 0 aaacatat 0 3 1 12 4 2 aacatat 1 3 2 33 1 1 acaaacatat 0 11 0 04 5 3 acatat 3 7 3 15 9 1 at 4 6 2 76 7 2 atat 4 7 1 57 2 0 caaacatat 3 11 0 08 6 2 catat 7 11 2 29 10 0 t 8 10 1 8
10 8 1 tat 8 11 0 011 0 0 ε
![Page 17: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/17.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
Calculating LPS[SA[i ]]i SA[i] LCP[i] SSA[i] PSV[i] NSV[i] LPS[SA[i]] PrevOcc[SA[i]]0 0 ε
1 3 0 aaacatat 0 3 1 12 4 2 aacatat 1 3 2 33 1 1 acaaacatat 0 11 0 04 5 3 acatat 3 7 3 15 9 1 at 4 6 2 76 7 2 atat 4 7 1 57 2 0 caaacatat 3 11 0 08 6 2 catat 7 11 2 29 10 0 t 8 10 1 8
10 8 1 tat 8 11 0 011 0 0 εN
SV
[7]
PS
V[7
]
LPS[7] = max{|lcp(SSA[3],SSA[7])|, |lcp(SSA[7],SSA[11])|}= max{0,0} = 0
![Page 18: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/18.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
Calculating LPS[SA[i ]]
i SA[i] LCP[i] SSA[i] PSV[i] NSV[i] LPS[SA[i]] PrevOcc[SA[i]]0 0 ε
1 3 0 aaacatat 0 3 1 12 4 2 aacatat 1 3 2 33 1 1 acaaacatat 0 11 0 04 5 3 acatat 3 7 3 15 9 1 at 4 6 2 76 7 2 atat 4 7 1 57 2 0 caaacatat 3 11 0 08 6 2 catat 7 11 2 29 10 0 t 8 10 1 8
10 8 1 tat 8 11 0 011 0 0 ε
LPS[SA[i]] = max{|lcp(SSA[PSV[i]],SSA[i])|, |lcp(SSA[i],SSA[NSV[i]])|}
with |lcp(SSA[x ],SSA[y ])| = RMQLCP(x + 1, y)
![Page 19: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/19.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
Peak elimination (Crochemore and Ilie)
SA[i]LPS[SA[i]]
PrevOcc[SA[i]]
graph is onlyconceptional0
3
0
42
1
1
5
3
9
1 7
2
2
0
6
2
10
0 81
0
0
![Page 20: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/20.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
Peak elimination (Crochemore and Ilie)
SA[i]LPS[SA[i]]
PrevOcc[SA[i]]
0
3
0
42
1
1
5
3
9
1 7
2
2
0
6
2
10
0 81
0
0
graph is onlyconceptional0
3
0
42
3
1
1
5
3
9
1 7
2
2
0
6
2
10
0 81
0
0
![Page 21: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/21.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
Peak elimination (Crochemore and Ilie)
SA[i]LPS[SA[i]]
PrevOcc[SA[i]]
0
3
0
42
3
1
1
5
3
9
1 7
2
2
0
6
2
10
0 81
0
0
graph is onlyconceptional0
31
1
42
3
10
5
3
9
1 7
2
2
0
6
2
10
0 81
0
0
![Page 22: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/22.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
Peak elimination (Crochemore and Ilie)
SA[i]LPS[SA[i]]
PrevOcc[SA[i]]
031
1
42
3
10
5
3
9
1 7
2
2
0
6
2
10
0 81
0
0
graph is onlyconceptional0
31
1
42
3
10
5
3
92
7
71
2
0
6
2
10
0 81
0
0
![Page 23: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/23.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
Peak elimination (Crochemore and Ilie)
SA[i]LPS[SA[i]]
PrevOcc[SA[i]]
031
1
42
3
10
5
3
92
7
71
2
0
6
2
10
0 81
0
0
graph is onlyconceptional0
31
1
42
3
10
5
3
92
7
715
2
0
6
2
10
0 81
0
0
![Page 24: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/24.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
Peak elimination (Crochemore and Ilie)
SA[i]LPS[SA[i]]
PrevOcc[SA[i]]
031
1
42
3
10
5
3
92
7
715
2
0
6
2
10
0 81
0
0
graph is onlyconceptional0
31
1
42
3
10
53
1
92
7
715
20
6
2
10
0 81
0
0
![Page 25: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/25.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
Peak elimination (Crochemore and Ilie)
SA[i]LPS[SA[i]]
PrevOcc[SA[i]]
031
1
42
3
10
53
1
92
7
715
20
6
2
10
0 81
0
0
graph is onlyconceptional0
31
1
42
3
10
53
1
92
7
715
20
6
2
101
8
80
0
0
![Page 26: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/26.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
Peak elimination (Crochemore and Ilie)
SA[i]LPS[SA[i]]
PrevOcc[SA[i]]
031
1
42
3
10
53
1
92
7
715
20
6
2
101
8
80
0
0
graph is onlyconceptional0
31
1
42
3
10
53
1
92
7
715
20
6
2
101
8
80
⊥
0
0
![Page 27: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/27.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
Peak elimination (Crochemore and Ilie)
SA[i]LPS[SA[i]]
PrevOcc[SA[i]]
031
1
42
3
10
53
1
92
7
715
20
6
2
101
8
80
⊥
0
0
graph is onlyconceptional0
31
1
42
3
10
53
1
92
7
715
20
62
2
101
8
80
⊥
0
0
![Page 28: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/28.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
Peak elimination (Crochemore and Ilie)
SA[i]LPS[SA[i]]
PrevOcc[SA[i]]
031
1
42
3
10
53
1
92
7
715
20
62
2
101
8
80
⊥
0
0
graph is onlyconceptional0
31
1
42
3
10
53
1
92
7
715
20
⊥
62
2
101
8
80
⊥
00
![Page 29: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/29.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
Peak elimination (Crochemore and Ilie)
SA[i]LPS[SA[i]]
PrevOcc[SA[i]]
031
1
42
3
10
53
1
92
7
715
20
⊥
62
2
101
8
80
⊥
00
graph is onlyconceptional0
31
1
42
3
10
⊥
53
1
92
7
715
20
⊥
62
2
101
8
80
⊥
00
![Page 30: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/30.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
Summary for peak elimination approach
LPS is a permutation of LCPMost implementations
first calculate LCP (SA order!)calculate LPS and PrevOcc also in SA ordertransform LPS and PrevOcc into text ordercalculate the LZ-factorization from LPS and PrevOcc in textorder
![Page 31: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/31.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
Outline
1 Problem Definition and Motivation
2 Existing solutions
3 New fast algorithm
4 Experimental Results
![Page 32: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/32.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
Overview: The New fast algorithm
Observation:Kasai et al.’s LCP algorithm (CPM 2001) produces LCParray in SA orderKärkkäinen at al.’s algorithm, called Φ-algorithm, (CPM2009) produces PLCP (=LCP in text order!) much faster
Our solution:Adapt Φ-algorithm to peak eliminationEliminate transformation from SA order to text order
![Page 33: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/33.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
Calculating PLCP
Main procedurefor i ← 1 to n do
Φ[SA[i]]← SA[i − 1]`← 0for i ← 1 to n do
j ← Φ[i]while S[i + `] = S[j + `] do`← ` + 1
PLCP[i]← ``← max(`− 1,0)
Procedure sop(i , `, j)if LPS[i] = ⊥ then
LPS[i]← `PrevOcc[i]← j
elseif LPS[i] < ` then
if PrevOcc[i] > j thensop(PrevOcc[i],LPS[i], j)
elsesop(j ,LPS[i],PrevOcc[i])
LPS[i]← `PrevOcc[i]← j
else /* LPS[i] ≥ ` */if PrevOcc[i] > j then
sop(PrevOcc[i], `, j)else
sop(j , `,PrevOcc[i])
![Page 34: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/34.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
The new fast algorithm
Main procedurefor i ← 1 to n do
Φ[SA[i]]← SA[i − 1]`← 0for i ← 1 to n do
j ← Φ[i]while S[i + `] = S[j + `] do`← ` + 1
if i > j thensop(i , `, j)
elsesop(j , `, i)
`← max(`− 1,0)
Procedure sop(i , `, j)if LPS[i] = ⊥ then
LPS[i]← `PrevOcc[i]← j
elseif LPS[i] < ` then
if PrevOcc[i] > j thensop(PrevOcc[i],LPS[i], j)
elsesop(j ,LPS[i],PrevOcc[i])
LPS[i]← `PrevOcc[i]← j
else /* LPS[i] ≥ ` */if PrevOcc[i] > j then
sop(PrevOcc[i], `, j)else
sop(j , `,PrevOcc[i])
![Page 35: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/35.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
The new fast algorithm
Main procedurefor i ← 1 to n do
Φ[SA[i]]← SA[i − 1]`← 0for i ← 1 to n do
j ← Φ[i]while S[i + `] = S[j + `] do`← ` + 1
if i > j thensop(i , `, j)
elsesop(j , `, i)
`← max(`− 1,0)
Procedure sop(i , `, j)if LPS[i] = ⊥ then
LPS[i]← `PrevOcc[i]← j
elseif LPS[i] < ` then
if PrevOcc[i] > j thensop(PrevOcc[i],LPS[i], j)
elsesop(j ,LPS[i],PrevOcc[i])
LPS[i]← `PrevOcc[i]← j
else /* LPS[i] ≥ ` */if PrevOcc[i] > j then
sop(PrevOcc[i], `, j)else
sop(j , `,PrevOcc[i])
![Page 36: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/36.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
iLPS[i]
PrevOcc[i]
graph is onlyconceptional
set(1, Φ[1] =4)
1
4
10 1 2 3 4
1
1
5 6 7 8 9 10 11
![Page 37: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/37.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
iLPS[i]
PrevOcc[i]
set(1, Φ[1] =4)
1
4
10 1 2 3 4
1
1
5 6 7 8 9 10 11
graph is onlyconceptional
set(2, Φ[2] =7)
1
4
1
2
7
0
0 1 2 3 41
1
5 6 70
2
8 9 10 11
![Page 38: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/38.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
iLPS[i]
PrevOcc[i]
set(2, Φ[2] =7)
1
4
1
2
7
0
0 1 2 3 41
1
5 6 70
2
8 9 10 11
graph is onlyconceptional
set(3, Φ[3] =0)
1
4
1
2
7
03
0
0
0 1 2 30
0
41
1
5 6 70
2
8 9 10 11
![Page 39: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/39.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
iLPS[i]
PrevOcc[i]
set(3, Φ[3] =0)
1
4
1
2
7
03
0
0
0 1 2 30
0
41
1
5 6 70
2
8 9 10 11
graph is onlyconceptional
set(4, Φ[4] =3)
1
4
1
2
7
03
0
0
43
2
0 1 2 30
0
42|1
3|1
5 6 70
2
8 9 10 11
![Page 40: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/40.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
iLPS[i]
PrevOcc[i]
set(4, Φ[4] =3)
1
4
1
2
7
03
0
0
43
2
0 1 2 30
0
42|1
3|1
5 6 70
2
8 9 10 11
graph is onlyconceptional
permute
0
3
0 2
7
03
1
1
0 1 2 30|1
0|1
42
3
5 6 70
2
8 9 10 11
![Page 41: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/41.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
iLPS[i]
PrevOcc[i]
permute
0
3
0 2
7
03
1
1
0 1 2 30|1
0|1
42
3
5 6 70
2
8 9 10 11
graph is onlyconceptional
permute
010
2
7
0
0 10
0
2 31
1
42
3
5 6 70
2
8 9 10 11
![Page 42: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/42.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
iLPS[i]
PrevOcc[i]
permute
010
2
7
0
0 10
0
2 31
1
42
3
5 6 70
2
8 9 10 11
graph is onlyconceptional
set(5, Φ[5] =1)
010
2
7
05
1
3
0 10
0
2 31
1
42
3
53
1
6 70
2
8 9 10 11
![Page 43: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/43.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
iLPS[i]
PrevOcc[i]
set(5, Φ[5] =1)
010
2
7
05
1
3
0 10
0
2 31
1
42
3
53
1
6 70
2
8 9 10 11
graph is onlyconceptional
set(6, Φ[6] =2)
010
2
7
05
1
36
2
20 1
0
0
2 31
1
42
3
53
1
62
2
70
2
8 9 10 11
![Page 44: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/44.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
iLPS[i]
PrevOcc[i]
set(6, Φ[6] =2)
010
2
7
05
1
36
2
20 1
0
0
2 31
1
42
3
53
1
62
2
70
2
8 9 10 11
graph is onlyconceptional
set(7, Φ[7] =9)
010
2
7
05
1
36
2
2
7
9
2
0 10
0
2 31
1
42
3
53
1
62
2
70
2
8 92
7
10 11
![Page 45: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/45.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
iLPS[i]
PrevOcc[i]
set(7, Φ[7] =9)
010
2
7
05
1
36
2
2
7
9
2
0 10
0
2 31
1
42
3
53
1
62
2
70
2
8 92
7
10 11
graph is onlyconceptional
set(8, Φ[8] =10)
010
2
7
05
1
36
2
2
7
9
2
8
10
1
0 10
0
2 31
1
42
3
53
1
62
2
70
2
8 92
7
101
8
11
![Page 46: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/46.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
iLPS[i]
PrevOcc[i]
set(8, Φ[8] =10)
010
2
7
05
1
36
2
2
7
9
2
8
10
1
0 10
0
2 31
1
42
3
53
1
62
2
70
2
8 92
7
101
8
11
graph is onlyconceptional
set(9, Φ[9] =5)
010
2
7
05
1
36
2
2
7
9
2
8
10
19
51
0 10
0
2 31
1
42
3
53
1
62
2
70
2
8 91|2
5|7
101
8
11
![Page 47: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/47.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
iLPS[i]
PrevOcc[i]
set(9, Φ[9] =5)
010
2
7
05
1
36
2
2
7
9
2
8
10
19
51
0 10
0
2 31
1
42
3
53
1
62
2
70
2
8 91|2
5|7
101
8
11
graph is onlyconceptional
permute
010
2
7
05
1
36
2
2
8
10
1
5
71
0 10
0
2 31
1
42
3
53
1
62
2
70|1
2|5
8 92
7
101
8
11
![Page 48: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/48.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
iLPS[i]
PrevOcc[i]
permute
010
2
7
05
1
36
2
2
8
10
1
5
71
0 10
0
2 31
1
42
3
53
1
62
2
70|1
2|5
8 92
7
101
8
11
graph is onlyconceptional
permute
010
5
1
36
2
2
8
10
1
5
2
0
0 10
0
2 31
1
42
3
50|3
2|1
62
2
715
8 92
7
101
8
11
![Page 49: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/49.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
iLPS[i]
PrevOcc[i]
permute
010
5
1
36
2
2
8
10
1
5
2
0
0 10
0
2 31
1
42
3
50|3
2|1
62
2
715
8 92
7
101
8
11
graph is onlyconceptional
permute
010
6
2
2
8
10
1
120
0 10
0
20
1
31
1
42
3
53
1
62
2
715
8 92
7
101
8
11
![Page 50: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/50.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
iLPS[i]
PrevOcc[i]
permute
010
6
2
2
8
10
1
120
0 10
0
20
1
31
1
42
3
53
1
62
2
715
8 92
7
101
8
11
graph is onlyconceptional
set(10, Φ[10] =6)
010
6
2
2
8
10
1
120
10
6
00 1
0
0
20
1
31
1
42
3
53
1
62
2
715
8 92
7
100|1
6|8
11
![Page 51: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/51.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
iLPS[i]
PrevOcc[i]
set(10, Φ[10] =6)
010
6
2
2
8
10
1
120
10
6
00 1
0
0
20
1
31
1
42
3
53
1
62
2
715
8 92
7
100|1
6|8
11
graph is onlyconceptional
permute
010
6
2
2
120
6
80
0 10
0
20
1
31
1
42
3
53
1
62
2
715
80
6
92
7
101
8
11
![Page 52: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/52.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
iLPS[i]
PrevOcc[i]
permute
010
6
2
2
120
6
80
0 10
0
20
1
31
1
42
3
53
1
62
2
715
80
6
92
7
101
8
11
graph is onlyconceptional
set(11, Φ[11] =8)
010
6
2
2
120
6
80
0
8
0
0 10
0
20
1
31
1
42
3
53
1
62
2
715
80|0
6|11
92
7
101
8
11
![Page 53: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/53.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
iLPS[i]
PrevOcc[i]
set(11, Φ[11] =8)
010
6
2
2
120
6
80
0
8
0
0 10
0
20
1
31
1
42
3
53
1
62
2
715
80|0
6|11
92
7
101
8
11
graph is onlyconceptional
permute
010
6
2
2
120
6
0
0
0 10
0
20
1
31
1
42
3
53
1
60|2
11|2
715
80
6
92
7
101
8
11
![Page 54: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/54.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
iLPS[i]
PrevOcc[i]
permute
010
6
2
2
120
6
0
0
0 10
0
20
1
31
1
42
3
53
1
60|2
11|2
715
80
6
92
7
101
8
11
graph is onlyconceptional
permute
010 1
20 2
0
0
0 10
0
20|0
1|11
31
1
42
3
53
1
62
2
715
80
6
92
7
101
8
11
![Page 55: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/55.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
iLPS[i]
PrevOcc[i]
permute
010 1
20 2
0
0
0 10
0
20|0
1|11
31
1
42
3
53
1
62
2
715
80
6
92
7
101
8
11
graph is onlyconceptional
permute
010 1
00
0 10|0
0|11
20
1
31
1
42
3
53
1
62
2
715
80
6
92
7
101
8
11
![Page 56: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/56.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
iLPS[i]
PrevOcc[i]
permute
010 1
00
0 10|0
0|11
20
1
31
1
42
3
53
1
62
2
715
80
6
92
7
101
8
11
graph is onlyconceptional
permute
0 00
0 10
0
20
1
31
1
42
3
53
1
62
2
715
80
6
92
7
101
8
11
![Page 57: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/57.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
Outline
1 Problem Definition and Motivation
2 Existing solutions
3 New fast algorithm
4 Experimental Results
![Page 58: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/58.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
Experimental setup
SA is already calculated and stored on diskTest cases like in Chen et. alOther implementations from
German TischlerSimon J. Puglisi
We use data structures of the succinct data structurelibrary (sdsl)http://www.uni-ulm.de/in/theo/research/sdsl
![Page 59: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/59.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
Experimental Results for Fast Algorithms
test case LCP
LPF_
sim
ple
LPF_
next
_pre
v
LPF_
sort
ing
LPF_
onlin
e
LPF_
optim
al
LZ_O
G
chr19.dna4 19.40 46.40 25.80 31.30 26.30 26.30 19.10chr22.dna4 8.90 24.60 14.80 14.60 12.30 12.20 10.00E.coli 0.80 2.00 1.20 1.40 1.20 1.20 0.90bible 0.60 1.60 0.90 1.10 0.90 0.90 0.50howto 8.10 20.60 11.80 13.40 11.60 11.50 6.70fib_s14930352 2.20 6.90 3.40 3.80 3.30 3.30 0.80fib_s9227465 1.40 4.20 2.30 2.40 2.10 2.10 0.50fss10 1.70 5.60 2.80 3.00 2.70 2.70 0.70fss9 0.40 1.10 0.60 0.70 0.60 0.60 0.20p16Mb 3.50 8.70 5.00 5.90 4.90 4.90 3.90p32Mb 8.50 20.60 11.60 13.90 11.60 11.60 9.70rndA21_8Mb 1.60 4.00 2.40 2.80 2.30 2.30 1.90rndA2_8Mb 1.50 3.90 2.20 2.60 2.20 2.20 1.50
![Page 60: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/60.jpg)
Questions?Thank you!
![Page 61: Lempel-Ziv Factorization Revisited - Ulm · Lempel-Ziv Factorization Revisited Enno Ohlebusch Simon Gog Institute of Theoretical Computer Science Ulm University Palermo, June 27th](https://reader034.vdocuments.net/reader034/viewer/2022052613/5f1ac2d34351343f9a008de5/html5/thumbnails/61.jpg)
Problem Definition and Motivation Existing solutions New fast algorithm Experimental Results
Experimental Results for Space-Efficient Algorithms
space in bytes per symbol time in seconds
test case CP
S2
LZ_b
wd1
LZ_b
wd2
LZ_b
wd4
LZ_b
wd8
LZ_b
wd1
6
LZ_b
wd3
2
CP
S2
LZ_b
wd1
LZ_b
wd2
LZ_b
wd4
LZ_b
wd8
LZ_b
wd1
6
LZ_b
wd3
2
chr19.dna4 6.0 4.7 2.7 1.7 1.2 1.0 0.8 161.9 75.2 76.5 78.9 83.3 92.1 110.5chr22.dna4 6.0 4.7 2.7 1.7 1.2 1.0 0.8 78.8 39.0 39.5 40.3 43.0 48.3 58.5E.coli 6.0 4.7 2.7 1.7 1.2 1.0 0.8 7.6 3.9 4.0 4.3 4.5 5.0 6.2bible 6.0 5.1 3.1 2.1 1.6 1.3 1.2 3.9 4.6 4.8 5.2 5.6 6.5 8.5howto 6.0 5.1 3.1 2.1 1.6 1.4 1.2 54.2 60.1 61.7 65.2 72.1 84.9 110.1fib_s14930352 6.0 4.6 2.6 1.6 1.1 0.8 0.7 2.1 14.6 15.0 14.7 14.7 14.6 14.7fib_s9227465 6.0 4.6 2.6 1.6 1.1 0.8 0.7 1.3 8.9 9.1 8.9 8.9 9.1 8.9fss10 6.0 4.6 2.6 1.6 1.1 0.8 0.7 1.8 11.5 11.6 11.6 11.6 11.5 11.6fss9 6.0 4.6 2.6 1.6 1.1 0.9 0.7 0.4 2.0 2.0 2.0 2.0 2.0 2.0p16Mb 6.0 5.0 3.0 2.0 1.5 1.3 1.2 33.7 24.8 26.3 28.5 33.8 43.7 63.9p32Mb 6.0 5.0 3.0 2.0 1.5 1.3 1.1 73.7 52.5 54.5 59.7 69.3 88.1 126.7rndA21_8Mb 6.0 5.1 3.1 2.1 1.6 1.3 1.2 17.6 13.0 13.9 15.6 19.0 26.2 40.6rndA2_8Mb 6.0 4.6 2.6 1.6 1.1 0.8 0.7 11.7 6.8 6.8 6.9 7.2 7.5 8.4