Algorithmic Aspects of Searching in the Past
Christine Kupich, Institut für Informatik, Universität Freiburg
Lecture 1: Persistent Data Structures
Advanced Topics in Algorithms & Data Structures
Overview
• Motivation
• Example: Natural search trees
• Making data structures partially persistent
• Example: Partially persistent red-black trees
• An application: Point location
• An application: Grounded 2-dimensional range searching
• Making data structures fully persistent
Motivation
Ephemeral: no mechanism to revert to previous states
A structure is called persistent if it supports access to multiple versions.
Partially persistent: All versions can be accessed but only the newest version can be
modified.
Fully persistent: All versions can be accessed and modified.
Confluently persistent: Two or more old versions can be combined into one new
version.
Oblivious: The data structure yields no knowledge about the sequence of operations
that have been applied to it other than the final result of the operations.
Example: Natural search trees
Only partially oblivious!
• Insertion history can sometimes be reconstructed.
• Deleted keys are not visible.
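The two bullets above can be illustrated with a minimal sketch of an unbalanced (natural) search tree. The helper names `insert`, `build` and `shape` are assumptions made for this illustration only. Some insertion orders leave a recognisable shape behind, while others are indistinguishable.

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def insert(root, key):
    """Insert key into an unbalanced (natural) binary search tree."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root

def build(keys):
    """Build a natural search tree by inserting the keys in the given order."""
    root = None
    for k in keys:
        root = insert(root, k)
    return root

def shape(root):
    """Nested-tuple description of the tree, used to compare shapes."""
    if root is None:
        return None
    return (root.key, shape(root.left), shape(root.right))

chain = shape(build([1, 3, 5, 7]))    # a chain: the keys must have arrived in sorted order
tree_a = shape(build([5, 3, 7, 1]))
tree_b = shape(build([5, 7, 3, 1]))   # same shape as tree_a: the order of 3 and 7 is hidden
```

Here `chain` reveals its insertion order completely, whereas `tree_a` and `tree_b` are identical, so the structure is only partially oblivious.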
[Figure: example natural search trees on the keys 1, 3, 5, 7]
Simple methods for making structures persistent
• Structure-copying method: Make a copy of the data structure each time it is changed. Yields full persistence at the price of Ω(n) time and space per update to a structure of size n.
• Log-file method: Store a log-file of all updates. In order to access version i, first carry out i updates, starting with the initial structure, and generate version i. Ω(i) time per access, O(1) space and time per update.
• Hybrid method: Store the complete sequence of updates and additionally each k-th version for a suitably chosen k. Result: Any choice of k causes a blowup in either storage space or access time.
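As a rough sketch of the hybrid method, the class below keeps the full update log and a snapshot of every k-th version. It assumes, purely for illustration, that the underlying structure is a Python `set` and that updates are `("ins", key)` or `("del", key)`; the class name `CheckpointedLog` is hypothetical.

```python
import copy

class CheckpointedLog:
    """Hybrid method: full update log plus a snapshot of every k-th version."""

    def __init__(self, k):
        self.k = k
        self.log = []                 # log[i - 1] is the update that creates version i
        self.snapshots = {0: set()}   # version number -> copy of the structure
        self.current = set()

    @staticmethod
    def _apply(state, update):
        op, key = update
        if op == "ins":
            state.add(key)
        else:
            state.discard(key)

    def update(self, op, key):
        """Apply an update to the newest version and return its version number."""
        self.log.append((op, key))
        self._apply(self.current, (op, key))
        version = len(self.log)
        if version % self.k == 0:
            self.snapshots[version] = copy.copy(self.current)  # costs O(n) space
        return version

    def access(self, i):
        """Rebuild version i from the nearest snapshot at or before i."""
        base = (i // self.k) * self.k
        state = copy.copy(self.snapshots[base])
        for update in self.log[base:i]:   # replays fewer than k updates
            self._apply(state, update)
        return state
```

A small k makes `access` fast but stores many O(n) snapshots; a large k saves space but replays long stretches of the log, which is the trade-off named in the last bullet.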
Are there any better methods?
Making data structures persistent
Several constructions to make various data structures persistent have been devised,
but no general approach has been taken until the seminal paper by Driscoll,
Sarnak, Sleator and Tarjan, 1986.
They propose methods to make linked data structures partially as well as fully
persistent.
Let’s first look at how to make structures partially persistent.
Fat node method - partial persistence
Record all changes made to node fields in the nodes
• Each fat node contains the same fields as an ephemeral node and a version stamp.
• Add a modification history to every node: each field in a node contains a list of version-value pairs.
Fat node method - partial persistence
Modifications
• Ephemeral update step i creates a new node: create a new fat node with version stamp i and the original field values.
• Ephemeral update step i changes a field: store the new field value plus a timestamp.
Each node knows what its value was at any previous point in time.
Access field f of version i: choose the value with the maximum version stamp no greater than i.
Fat node method - analysis
• Time cost per access: O(log m) slowdown per node (using binary search on the modification history).
• Time and space cost per update step: O(1) (to store the modification along with the timestamp at the end of the modification history).
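The access and update costs above can be seen in a minimal sketch of a single fat-node field. The class name `FatField` and its method names are assumptions made for this illustration; the history is kept as two parallel lists so that the standard `bisect` module can search it.

```python
import bisect

class FatField:
    """One field of a fat node: a history of (version, value) pairs."""

    def __init__(self, version, value):
        self.versions = [version]   # strictly increasing version stamps
        self.values = [value]

    def set(self, version, value):
        """O(1): record the value written in update step `version`."""
        if version == self.versions[-1]:
            self.values[-1] = value            # same step overwrites in place
        else:
            assert version > self.versions[-1], "only the newest version may change"
            self.versions.append(version)
            self.values.append(value)

    def get(self, version):
        """O(log m): value with the largest stamp that is <= version."""
        idx = bisect.bisect_right(self.versions, version) - 1
        if idx < 0:
            raise KeyError("field did not exist in version %d" % version)
        return self.values[idx]
```

Appending at the end of the history is what keeps the update cost constant, and it is only valid because partial persistence never modifies an old version.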
Fat node method - Example
A partially persistent search tree. Insertions: 5, 3, 13, 15, 1, 9, 7, 11, 10, followed by deletion of item 13.
[Figure: the resulting fat-node tree; pointer fields are annotated with the versions in which they are valid]
Path-copying method - partial persistence
• Make a copy of the node before changing it to point to the new child. Cascade the change back until the root is reached. Restructuring costs O(height of tree) per update operation.
• Every modification creates a new root.
• Maintain an array of roots indexed by timestamp.
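A minimal sketch of path copying on an unbalanced search tree follows, assuming insert-only updates. The names `insert`, `keys` and `roots` are chosen for this illustration. Version 0 is the four-key tree of the example slides (5 at the root, children 1 and 7, and 3 as right child of 1).

```python
class Node:
    __slots__ = ("key", "left", "right")

    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def insert(node, key):
    """Return the root of a new version; nodes off the search path are shared."""
    if node is None:
        return Node(key)
    if key < node.key:
        return Node(node.key, insert(node.left, key), node.right)
    if key > node.key:
        return Node(node.key, node.left, insert(node.right, key))
    return node  # key already present: the old subtree is reused unchanged

def keys(node):
    """In-order list of the keys stored below `node`."""
    return [] if node is None else keys(node.left) + [node.key] + keys(node.right)

# Array of roots indexed by version number.
roots = [Node(5, Node(1, None, Node(3)), Node(7))]   # version 0
roots.append(insert(roots[-1], 2))                   # version 1: Insert(2)
roots.append(insert(roots[-1], 4))                   # version 2: Insert(4)
```

Each insertion copies only the nodes on the search path (5, 1 and 3 here), so node 7 is shared by all three versions.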
Path-copying method - Example
[Figure: path copying on a search tree with keys 1, 3, 5, 7 (version 0). Version 1 results from Insert(2) and version 2 from Insert(4). Each insertion copies the nodes 5, 1 and 3 on the search path and adds a new root to the root array (entries 0, 1, 2); node 7 is shared by all versions.]
Node-copying method - partial persistence
Extend each node by a time-stamped modification box (initially empty)
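[Figure: a node with key k and pointers lp and rp, extended by a modification box holding the entry "t: rp". The original fields give the version before the modification at time t; the box gives the version at and after time t.]
Searching in version j
• Follow the entry pointer with the largest version number i such that i <= j.
• Compare keys and follow the newest pointer whose version stamp is no greater than j.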
[Figure: node copying on the search tree with keys 1, 3, 5, 7 (version 0). Version 1, Insert(2): the modification box of node 3 records "1 lp", pointing to the new node 2. Version 2, Insert(4): the box of node 3 is already full, so node 3 is copied with children 2 and 4, and the box of its parent 1 records "2 rp", pointing to the copy.]
Node-copying method - partial persistence
Modification
If the modification box is empty, fill it.
Otherwise, make a copy of the node that uses only the latest values (the value in the modification box plus the value to be inserted) and leaves the modification box of the copy empty.
Cascade this change to the node’s parent.
If the node is a root, add the new root to a sorted array of roots.
Access time gets an O(1) slowdown per node, plus an additive O(log m) cost for finding the correct root.
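The modification rule above can be sketched for an insert-only, unbalanced search tree as follows. This is an illustrative simplification under stated assumptions: one modification box per node, no deletions, no rebalancing, and hypothetical names (`PersistentBST`, `child`, `copies`). It is not the general construction of Driscoll et al.

```python
import bisect

class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right
        self.mod = None                      # (version, field, child) or None

    def child(self, field, version):
        """Pointer `field` ('left' or 'right') as seen by `version`."""
        if self.mod and self.mod[1] == field and self.mod[0] <= version:
            return self.mod[2]
        return getattr(self, field)

class PersistentBST:
    def __init__(self):
        self.root_versions = [0]             # sorted version stamps
        self.roots = [None]                  # roots[i] belongs to root_versions[i]
        self.version = 0
        self.copies = 0                      # number of node copies made so far

    def root(self, version):
        idx = bisect.bisect_right(self.root_versions, version) - 1   # O(log m)
        return self.roots[idx]

    def insert(self, key):
        """Insert into the newest version; returns the new version number."""
        self.version += 1
        v = self.version
        path, node = [], self.root(v)
        while node is not None:              # walk down the newest version
            if key == node.key:
                return v
            field = "left" if key < node.key else "right"
            path.append((node, field))
            node = node.child(field, v)
        new_child = Node(key)
        while path:                          # cascade pointer changes upward
            node, field = path.pop()
            if node.mod is None:
                node.mod = (v, field, new_child)       # box empty: fill it
                return v
            copy = Node(node.key, node.child("left", v), node.child("right", v))
            setattr(copy, field, new_child)            # copy holds the latest values
            self.copies += 1
            new_child = copy
        self.root_versions.append(v)         # the root itself was replaced
        self.roots.append(new_child)
        return v

    def contains(self, key, version):
        node = self.root(version)
        while node is not None:
            if key == node.key:
                return True
            node = node.child("left" if key < node.key else "right", version)
        return False
```

Inserting 5, 1, 7, 3, 2, 4 in this order creates versions 1 to 6. The counter `copies` makes it easy to observe that most insertions fill a box and stop, while only a few cascade.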
Node-copying method - Example
A partially persistent search tree. Insertions: 5,3,13,15,1,9,7,11,10, followed by deletion of item 13.
[Figure: the resulting node-copying tree, showing copied nodes and the version stamps in their modification boxes]
Node-copying method - partial persistence
The amortized costs (time and space) per modification are O(1).
Proof: Using the potential technique.
Potential technique
The potential is a function of the entire data structure.
Definition (potential function): A measure of a data structure whose change after an operation corresponds to the time cost of the operation.
• The potential has to be zero initially and non-negative for all versions.
• The amortized cost of an operation is the actual cost plus the change in potential.
• Different potential functions lead to different amortized bounds.
Node-copying method - partial persistence
Definitions
• Live nodes: nodes that form the latest version (reachable from the root of the most recent version); all other nodes are dead.
• Full live nodes: live nodes whose modification boxes are full.
Node-copying method - potential paradigm
The potential function f(T): the number of full live nodes in T (initially zero).
The amortized cost of an operation is the actual cost plus the change in potential Δf.
Each modification involves some number k of copies, each with O(1) space and time cost, and one change to a modification box with O(1) time cost.
Change in potential after update operation i: Δf ≤ 1 − k (each copy replaces a full live node by a live node with an empty box, and at most one box is newly filled).
Space: O(k + Δf), time: O(k + 1 + Δf).
Hence, a modification takes O(1) amortized space and O(1) amortized time.
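The step from amortized to total cost can be written out as a short worked derivation. The notation is an assumption for this sketch: T_i is the structure after update i, c_i the actual cost of update i in units of one copy or one box change, a_i its amortized cost, and k_i the number of copies it makes. The bound on the change in potential used in the last line assumes that every copied node was full and live and that at most one box is newly filled.

```latex
\begin{align*}
a_i &= c_i + f(T_i) - f(T_{i-1}), \\
\sum_{i=1}^{m} c_i &= \sum_{i=1}^{m} a_i - \bigl(f(T_m) - f(T_0)\bigr)
   \;\le\; \sum_{i=1}^{m} a_i
   \qquad \text{since } f(T_0) = 0 \text{ and } f(T_m) \ge 0, \\
a_i &\le (k_i + 1) + (1 - k_i) = 2 .
\end{align*}
```

Under these assumptions a sequence of m modifications costs at most 2m units in total, even though a single modification may cascade through many copies.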
Red-black trees
Constraints
• All missing nodes are regarded as black.
• Any red node has a black parent.
• From any node, all paths to a missing node contain the same number of black nodes.
Depth of an n-node red-black tree is at most 2 log n.
Root is colored black.
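The constraints can be checked mechanically. The following sketch assumes a plain node class with a `color` field holding `"red"` or `"black"`; the names `RBNode`, `black_height` and `is_red_black` are hypothetical and chosen for this illustration.

```python
class RBNode:
    def __init__(self, key, color, left=None, right=None):
        self.key, self.color, self.left, self.right = key, color, left, right

def black_height(node):
    """Number of black nodes on every path from `node` down to a missing
    node (counting the missing node itself); raises ValueError on a violation."""
    if node is None:
        return 1                               # missing nodes count as black
    if node.color == "red":
        for child in (node.left, node.right):
            if child is not None and child.color == "red":
                raise ValueError("red node %r has a red child" % (node.key,))
    lh, rh = black_height(node.left), black_height(node.right)
    if lh != rh:
        raise ValueError("unequal black counts below %r" % (node.key,))
    return lh + (1 if node.color == "black" else 0)

def is_red_black(root):
    """True if the tree rooted at `root` satisfies all constraints."""
    if root is not None and root.color != "black":
        return False
    try:
        black_height(root)
        return True
    except ValueError:
        return False
```

For instance, a black root E with black children C and N, where N has red children M and O, passes the check, while a red node with a red child does not.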
Red-black trees
Rebalancing transformations - insertion
[Figure: insertion cases 1 and 2. Both are resolved by recoloring, which may bubble the violation up the tree.]
Red-black trees
Rebalancing transformations - insertion
[Figure: insertion cases 3 and 4, resolved by rotations (labelled rr and lr) plus recoloring of parent and grandparent, leaving no inconsistency.]
An insertion requires O(log n) recolorings plus at most 2 rotations.
Red-black trees - partial persistence
A red-black tree can be made partially persistent using the node copying method at an amortized space cost of O(1) per insertion or deletion and a worst-case time cost of O(log n) per access, insertion or deletion.
Each node contains:
• a key
• 2 pointers for the successors
• a color bit
• an extra pointer (with version stamp and direction)
Colors are not used in access operations. Old colors can be overwritten.
Red-black trees - partial persistence
An Example: insert E, C, M, O, N
[Figure: step-by-step construction of the partially persistent red-black tree for the insertions E, C, M, O, N (versions 1 to 5). Nodes are annotated with their color histories (r = red, b = black) and pointers with version stamps. Insert O triggers a recoloring; Insert N triggers the steps labelled "RR" and "LR + recolor".]
Application: Grounded 2-Dimensional Range Searching
Given a set of points and a query triple (a, b, i):
Report the set of points (x, y) with a < x < b and y < i.
[Figure: the query region in the x-y plane, bounded by x = a, x = b and y = i]
Application: Grounded 2-Dimensional Range Searching
Version i contains every point for which y < i. Use x-coordinates as keys.
Persistent red-black tree: Space? Preprocessing time?
To answer a query: Report all points in version i whose x-coordinates are in [a, b]. Query time?
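The idea can be sketched with a persistent tree keyed on x, into which the points are inserted by increasing y. For brevity this sketch substitutes an unbalanced path-copying tree for the persistent red-black tree, so it does not achieve the balanced-tree bounds; the names `build`, `query` and `report` are assumptions made for this illustration.

```python
import bisect

class Node:
    __slots__ = ("x", "y", "left", "right")

    def __init__(self, x, y, left=None, right=None):
        self.x, self.y, self.left, self.right = x, y, left, right

def insert(node, x, y):
    """Path-copying insert keyed on x; returns the root of the new version."""
    if node is None:
        return Node(x, y)
    if x < node.x:
        return Node(node.x, node.y, insert(node.left, x, y), node.right)
    return Node(node.x, node.y, node.left, insert(node.right, x, y))

def report(node, a, b, out):
    """Append all points with a <= x <= b stored below `node` to `out`."""
    if node is None:
        return
    if a < node.x:
        report(node.left, a, b, out)
    if a <= node.x <= b:
        out.append((node.x, node.y))
    if node.x <= b:
        report(node.right, a, b, out)

def build(points):
    """Insert points by increasing y; roots[j] holds the j lowest points."""
    pts = sorted(points, key=lambda p: p[1])
    roots = [None]
    for x, y in pts:
        roots.append(insert(roots[-1], x, y))
    return [y for _, y in pts], roots

def query(ys, roots, a, b, i):
    """All points with x in [a, b] and y < i."""
    version = bisect.bisect_left(ys, i)      # number of points with y < i
    out = []
    report(roots[version], a, b, out)
    return out
```

The y-bound of the query only selects a version; the search inside that version is an ordinary 1-dimensional range search on x.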
1-Dimensional Range Search
Application: Planar point location
Suppose that the Euclidean plane is subdivided into polygons by n line segments that intersect only at their endpoints.
Given such a polygonal subdivision and an on-line sequence of query points in the plane, the planar point location problem is to determine for each query point the polygon containing it.
Measure an algorithm by three parameters:
1) The preprocessing time.
2) The space required for the data structure.
3) The time per query.
Planar point location - example
Solving planar point location (Cont.)
Dobkin-Lipton:
Partition the plane into vertical slabs by drawing a vertical line through each endpoint.
Within each slab the lines are totally ordered.
Allocate a search tree per slab containing the lines and with each line associate the polygon above it.
Allocate another search tree on the x-coordinates of the vertical lines
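A minimal sketch of the slab decomposition follows. It makes several simplifying assumptions that are not in the slides: segments are given as pairs of endpoint tuples, there are no vertical segments, sorted Python lists stand in for the search trees, and a query returns the segment directly below the query point (from which the polygon above that segment could be looked up). The function names are hypothetical.

```python
import bisect

def y_at(seg, x):
    """y-coordinate of the non-vertical segment ((x1, y1), (x2, y2)) at abscissa x."""
    (x1, y1), (x2, y2) = seg
    return y1 + (y2 - y1) * (x - x1) / (x2 - x1)

def build_slabs(segments):
    """Return (xs, slabs): the sorted endpoint abscissas and, for each slab
    between xs[j] and xs[j + 1], its crossing segments sorted bottom to top."""
    xs = sorted({p[0] for seg in segments for p in seg})
    slabs = []
    for lo, hi in zip(xs, xs[1:]):
        mid = (lo + hi) / 2
        crossing = [s for s in segments
                    if min(s[0][0], s[1][0]) <= lo and hi <= max(s[0][0], s[1][0])]
        crossing.sort(key=lambda s: y_at(s, mid))
        slabs.append(crossing)
    return xs, slabs

def segment_below(xs, slabs, qx, qy):
    """Segment directly below (qx, qy), or None; uses two binary searches."""
    j = bisect.bisect_right(xs, qx) - 1       # locate the slab
    if j < 0 or j >= len(slabs):
        return None
    slab = slabs[j]
    lo, hi = 0, len(slab)                     # count the segments below the point
    while lo < hi:
        m = (lo + hi) // 2
        if y_at(slab[m], qx) <= qy:
            lo = m + 1
        else:
            hi = m
    return slab[lo - 1] if lo else None
```

Because every slab stores its own list of crossing segments, a segment that spans many slabs is stored many times.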
Planar point location -- example
Solving planar point location (Cont.)
To answer a query:
first find the appropriate slab
then search the slab to find the polygon
Query time is O(log n)
How about the space?
Planar point location -- bad example
The total number of lines is O(n), but each of the O(n) slabs can contain O(n) lines, so the search trees can take O(n²) space in total.
Planar point location & persistence
So how do we improve the space bound?
Key observation: The lists of the lines in adjacent slabs are very similar.
Create the search tree for the first slab.
Then obtain the next one by deleting the lines that end at the corresponding vertex and adding the lines that start at that vertex.
How many insertions/deletions are there altogether?
2n (one insertion and one deletion per segment).
Planar point location & persistence (cont)
Updates should be persistent since we need all search trees at the end.
Partial persistence is enough.
Well, we already have the path-copying method, so let's use it. What do we get?
O(n log n) space and O(n log n) preprocessing time.
Using the node-copying method, we can improve the space bound to O(n).
Making data structures fully persistent
• With this type of persistence the versions don't form a simple linear path, they form
a version tree (since you can also update in the past). Lack of linear ordering.
• Impose a total ordering on the versions (version list)
• The version list defines a preorder on the version tree (for navigation): for any
version i, the descendants of i in the version tree occur consecutively in the version
list, starting with i.
[Figure: a version tree with versions 0 to 7 and its version list]
Making data structures fully persistent
[Figure: search tree versions. A version tree with versions 0 to 12 whose edges are labelled with the update that creates each version (iA, iC, iG, iI, iK, iM, iO, iE for insertions; dE, dM for deletions).]
Full persistence
It must be possible to:
• perform insertions in the version list and
• given two versions i and j, determine whether i precedes or follows j in the version
list
This list order problem has been addressed by Dietz and Sleator
• order queries are answered in O(1) worst case time with an O(1) amortized time
bound for insertion
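The two required operations can be sketched with a plain Python list. This is a naive stand-in that needs O(n) time per operation and is not the Dietz and Sleator structure; the class and method names are assumptions. The parent assignment in the example is also an assumption: it is one version tree that reproduces the version list 1,6,7,10,11,2,8,9,3,4,5,12 used in the fat node example.

```python
class VersionList:
    """Version list kept as a plain Python list (naive order maintenance)."""

    def __init__(self, root=0):
        self.order = [root]

    def insert_after(self, parent, new):
        """The new version `new`, derived from `parent`, goes right behind it."""
        self.order.insert(self.order.index(parent) + 1, new)   # O(n) here

    def precedes(self, i, j):
        """True if version i comes before version j in the version list."""
        return self.order.index(i) < self.order.index(j)       # O(n) here

# Assumed parent of each version (hypothetical, chosen to match the example list).
parents = {1: 0, 2: 1, 3: 2, 4: 3, 5: 4, 6: 1,
           7: 6, 8: 2, 9: 8, 10: 7, 11: 10, 12: 5}
vl = VersionList(root=0)
for version in range(1, 13):
    vl.insert_after(parents[version], version)
# vl.order[1:] is now [1, 6, 7, 10, 11, 2, 8, 9, 3, 4, 5, 12]
```

Inserting every new version directly behind its parent keeps the descendants of each version consecutive in the list, which is the preorder property required of the version list.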
Fat node method - full persistence
• Each fat node contains same fields as ephemeral node plus space for extra fields
(each with a field name and a version stamp)
• Each field in a node contains a list of version-value pairs
Access
• Versions are compared with respect to their position in the version list, not with
respect to their numeric values
• Access a field in version i: search for the version stamp rightmost in version list,
but not to the right of i
Fat node method - Example
Version list: 1,6,7,10,11,2,8,9,3,4,5,12
[Figure: a fully persistent search tree for the version tree with versions 0 to 12. Fat nodes with keys A, C, E, G, I, K, M, O; pointer fields are annotated with version stamps.]
Fat node method - full persistence
Update operation i
• Add i to the version list
• Update step creates new node: create new fat node with original field values (stamp i)
• Update step changes a field f: we have to guarantee that the new value of f will be used only in version i
Time cost per access and update step
• O(log m), provided each set of field values is stored in a search tree, ordered by version stamp
Space cost
• Worst-case space cost per update step is O(1)
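One way to guarantee that a new value is used only in version i (and in versions later derived from it) is to pin the old value for the successor of i in the version list. The sketch below follows that idea under simplifying assumptions: the version list is a shared Python list, lookups scan it linearly instead of using a search tree, and the class name `FullFatField` is hypothetical.

```python
class FullFatField:
    """One field of a fat node under full persistence (naive O(n) search)."""

    def __init__(self, order, version, value):
        self.order = order                 # shared version list (a Python list)
        self.values = {version: value}     # version stamp -> value

    def get(self, i):
        """Value stamped with the rightmost version that is not right of i."""
        for v in reversed(self.order[: self.order.index(i) + 1]):
            if v in self.values:
                return self.values[v]
        raise KeyError(i)

    def set(self, i, value):
        """Change the field in version i without affecting later list entries."""
        pos = self.order.index(i)
        if i not in self.values and pos + 1 < len(self.order):
            succ = self.order[pos + 1]
            if succ not in self.values:
                self.values[succ] = self.get(i)   # pin the old value for succ
        self.values[i] = value

order = [0]
field = FullFatField(order, 0, "A")
order.insert(order.index(0) + 1, 1)    # version 1 derived from 0, field unchanged
order.insert(order.index(0) + 1, 2)    # version 2 derived from 0: list is 0, 2, 1
field.set(2, "B")                      # also pins "A" for version 1
order.insert(order.index(2) + 1, 3)    # version 3 derived from 2: list is 0, 2, 3, 1
```

Without the pinned entry, version 1 would wrongly pick up the value written in version 2, because version 2 sits to its left in the version list. Each change adds at most two entries, in line with the O(1) space cost per update step.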
Applications
Partially persistent balanced search trees
• give a simple solution to the planar point location problem, the grounded 2-dimensional range searching problem, …
• can be used as a substitute for Chazelle's hive graph (geometric retrieval)
Fully persistent data structures can be used for
• the binary dispatching problem (OO languages: for an invocation, find the most specific applicable method)
• text editing
Oblivious data structures
• cryptography
References
• J. R. Driscoll, N. Sarnak, D. D. Sleator, and R. E. Tarjan: Making data structures persistent. Journal of Computer and System Sciences, 38:86-124, 1989 (final version).
• N. Sarnak, R. E. Tarjan: Planar Point Location Using Persistent Search Trees. Communications of the ACM, 29:669-679, July 1986.
• D. Micciancio: Oblivious Data Structures: Applications to Cryptography. 1997.