R-Trees: A Dynamic Index Structure for Spatial Searching
DESCRIPTION
R-Trees: A Dynamic Index Structure for Spatial Searching. Antonin Guttman. In Proceedings of the 1984 ACM SIGMOD International Conference on Management of Data (SIGMOD '84). ACM, New York, NY, USA. Outline: Introduction, R-Tree Index Structure, Searching and Updating, Performance Tests.
TRANSCRIPT
![Page 1: R-Trees A Dynamic Index Structure for Spatial Searching](https://reader035.vdocuments.net/reader035/viewer/2022062520/56816368550346895dd4410f/html5/thumbnails/1.jpg)
R-Trees: A Dynamic Index Structure for Spatial Searching
Antonin Guttman
In Proceedings of the 1984 ACM SIGMOD International Conference on Management of Data (SIGMOD '84). ACM, New York, NY, USA
![Page 2: R-Trees A Dynamic Index Structure for Spatial Searching](https://reader035.vdocuments.net/reader035/viewer/2022062520/56816368550346895dd4410f/html5/thumbnails/2.jpg)
- Introduction
- R-Tree Index Structure
- Searching and Updating
- Performance Tests
- Conclusion
Outline
![Page 3: R-Trees A Dynamic Index Structure for Spatial Searching](https://reader035.vdocuments.net/reader035/viewer/2022062520/56816368550346895dd4410f/html5/thumbnails/3.jpg)
- Introduction
  - Background
  - Previous Works
- R-Tree Index Structure
- Searching and Updating
- Performance Tests
- Conclusion
Outline
![Page 4: R-Trees A Dynamic Index Structure for Spatial Searching](https://reader035.vdocuments.net/reader035/viewer/2022062520/56816368550346895dd4410f/html5/thumbnails/4.jpg)
- Motivation
  - To deal with spatial data efficiently
  - Traditional databases are designed for one-dimensional data
- Traditional index structures
  - Hash tables
  - B-trees and ISAM
Background
![Page 5: R-Trees A Dynamic Index Structure for Spatial Searching](https://reader035.vdocuments.net/reader035/viewer/2022062520/56816368550346895dd4410f/html5/thumbnails/5.jpg)
Previous Works

| Method | Disadvantage |
| --- | --- |
| Cell methods | Not good for dynamic structures |
| Quad trees, K-D trees | Do not take paging of secondary memory into account |
| K-D-B trees | Useful only for point data |
| Corner stitching | Assumes homogeneous primary memory; not efficient |
| Grid files | |
![Page 6: R-Trees A Dynamic Index Structure for Spatial Searching](https://reader035.vdocuments.net/reader035/viewer/2022062520/56816368550346895dd4410f/html5/thumbnails/6.jpg)
- Introduction
- R-Tree Index Structure
  - R-Tree Index Structure
  - Properties of the R-Tree
  - Example of an R-Tree
- Searching and Updating
- Performance Tests
- Conclusion
Outline
![Page 7: R-Trees A Dynamic Index Structure for Spatial Searching](https://reader035.vdocuments.net/reader035/viewer/2022062520/56816368550346895dd4410f/html5/thumbnails/7.jpg)
- What is an R-tree?
  - A height-balanced tree similar to a B-tree
  - No periodic reorganization is needed
- What do the nodes contain?
  - (I, tuple-identifier) in a leaf node
  - (I, child-pointer) in a non-leaf node
- It must satisfy the following properties
R-Tree Index Structure
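The node layout described on this slide can be sketched in Python as follows. This is only an illustration: the names `Rect`, `Entry`, `Node` and `cover` are hypothetical, and two-dimensional rectangles stored as `(xmin, ymin, xmax, ymax)` are an assumption made for brevity.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple, Union

# A rectangle I, assumed two-dimensional: (xmin, ymin, xmax, ymax).
Rect = Tuple[float, float, float, float]


@dataclass
class Entry:
    rect: Rect               # I: the bounding rectangle
    ref: Union[int, "Node"]  # tuple-identifier (leaf) or child-pointer (non-leaf)


@dataclass
class Node:
    is_leaf: bool
    entries: List[Entry] = field(default_factory=list)
    # Excluded from repr/compare to avoid recursing through parent links.
    parent: Optional["Node"] = field(default=None, repr=False, compare=False)


def cover(rects):
    """Smallest rectangle that contains every rectangle in rects."""
    x0, y0, x1, y1 = zip(*rects)
    return (min(x0), min(y0), max(x1), max(y1))


# A leaf holding two index records, and a root that points to it.
leaf = Node(is_leaf=True, entries=[Entry((0, 0, 2, 2), 101), Entry((1, 1, 4, 3), 102)])
root = Node(is_leaf=False, entries=[Entry(cover([e.rect for e in leaf.entries]), leaf)])
leaf.parent = root
print(root.entries[0].rect)  # (0, 0, 4, 3)
```

Leaf entries carry a tuple-identifier, non-leaf entries carry a child node, and the rectangle of a non-leaf entry is computed from the rectangles stored in that child.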
![Page 8: R-Trees A Dynamic Index Structure for Spatial Searching](https://reader035.vdocuments.net/reader035/viewer/2022062520/56816368550346895dd4410f/html5/thumbnails/8.jpg)
- Let M be the maximum number of entries that will fit in one node
- Let m <= M/2 be a parameter specifying the minimum number of entries in a node
Properties of the R-Tree
![Page 9: R-Trees A Dynamic Index Structure for Spatial Searching](https://reader035.vdocuments.net/reader035/viewer/2022062520/56816368550346895dd4410f/html5/thumbnails/9.jpg)
1. Every leaf node contains between m and M index records unless it is the root
2. For each index record (I, tuple-identifier) in a leaf node, I is the smallest rectangle that spatially contains the n-dimensional data object represented by the indicated tuple
3. Every non-leaf node has between m and M children unless it is the root
4. For each entry (I, child-pointer) in a non-leaf node, I is the smallest rectangle that spatially contains the rectangles in the child node
5. The root node has at least two children unless it is a leaf
6. All leaves appear on the same level
Properties of the R-Tree
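As a rough illustration, most of these properties can be checked mechanically. The sketch below assumes a hypothetical dictionary-based node of the form `{"leaf": bool, "entries": [(rect, ref), ...]}` with two-dimensional rectangles `(xmin, ymin, xmax, ymax)`; property 2 is not checked because the data objects themselves are not stored in the index.

```python
def cover(rects):
    """Smallest rectangle that contains every rectangle in rects."""
    x0, y0, x1, y1 = zip(*rects)
    return (min(x0), min(y0), max(x1), max(y1))


def check(node, m, M, is_root=True):
    """Return the height of the subtree; raise AssertionError if a property fails."""
    n = len(node["entries"])
    assert n <= M
    if is_root:
        # Property 5: a root that is not a leaf has at least two children.
        assert node["leaf"] or n >= 2
    else:
        # Properties 1 and 3: every non-root node holds between m and M entries.
        assert m <= n
    if node["leaf"]:
        return 0
    heights = set()
    for rect, child in node["entries"]:
        # Property 4: the entry rectangle tightly covers the child's rectangles.
        assert rect == cover([r for r, _ in child["entries"]])
        heights.add(check(child, m, M, is_root=False))
    # Property 6: all leaves appear on the same level.
    assert len(heights) == 1
    return heights.pop() + 1


leaf_a = {"leaf": True, "entries": [((0, 0, 2, 2), "r1"), ((1, 1, 4, 3), "r2")]}
leaf_b = {"leaf": True, "entries": [((6, 6, 7, 7), "r3"), ((8, 5, 9, 9), "r4")]}
root = {"leaf": False, "entries": [((0, 0, 4, 3), leaf_a), ((6, 5, 9, 9), leaf_b)]}
print(check(root, m=2, M=4))  # 1
```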
![Page 10: R-Trees A Dynamic Index Structure for Spatial Searching](https://reader035.vdocuments.net/reader035/viewer/2022062520/56816368550346895dd4410f/html5/thumbnails/10.jpg)
Example of an R-Tree
![Page 11: R-Trees A Dynamic Index Structure for Spatial Searching](https://reader035.vdocuments.net/reader035/viewer/2022062520/56816368550346895dd4410f/html5/thumbnails/11.jpg)
- Introduction
- R-Tree Index Structure
- Searching and Updating
  - Searching
  - Example of Searching
  - Insertion
  - Updates and Other Operations
  - Node Splitting
- Performance Tests
- Conclusion
Outline
![Page 12: R-Trees A Dynamic Index Structure for Spatial Searching](https://reader035.vdocuments.net/reader035/viewer/2022062520/56816368550346895dd4410f/html5/thumbnails/12.jpg)
- Problem definition
  - Given an R-tree whose root node is T, find all index records whose rectangles overlap a search rectangle S
- Notation
  - EI is the rectangle part of an index entry E
  - Ep is the tuple-identifier or child-pointer of E
Searching
![Page 13: R-Trees A Dynamic Index Structure for Spatial Searching](https://reader035.vdocuments.net/reader035/viewer/2022062520/56816368550346895dd4410f/html5/thumbnails/13.jpg)
    Search(T, S, LIST) {
        IF (T is not a leaf) {
            FOR EACH (E in T)
                IF (E.EI overlaps S) Search(E.Ep, S, LIST);   // descend into the subtree
        } ELSE {
            FOR EACH (E in T)
                IF (E.EI overlaps S) LIST.ADD(E.Ep);          // qualifying record
        }
    }
Searching
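The search procedure can be sketched in Python as below. The dictionary-based node layout (`{"leaf": bool, "entries": [(rect, ref), ...]}`), the helper `overlaps` and the sample tree are assumptions made for illustration; rectangles are two-dimensional `(xmin, ymin, xmax, ymax)` tuples, and rectangles that merely touch are treated as overlapping.

```python
def overlaps(a, b):
    """True if rectangles a and b intersect (touching edges count)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def search(node, s, out):
    """Append to out the references of all index records whose rectangle overlaps s."""
    for rect, ref in node["entries"]:
        if overlaps(rect, s):
            if node["leaf"]:
                out.append(ref)       # ref is a tuple-identifier
            else:
                search(ref, s, out)   # ref is a child node
    return out


leaf_a = {"leaf": True, "entries": [((0, 0, 2, 2), "r1"), ((3, 3, 4, 4), "r2")]}
leaf_b = {"leaf": True, "entries": [((10, 10, 12, 12), "r3"), ((11, 8, 13, 9), "r4")]}
root = {"leaf": False, "entries": [((0, 0, 4, 4), leaf_a), ((10, 8, 13, 12), leaf_b)]}

print(search(root, (1, 1, 3.5, 3.5), []))  # ['r1', 'r2']
```

Unlike a B-tree lookup, more than one subtree under a node may have to be visited, because covering rectangles of sibling entries are allowed to overlap.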
![Page 14: R-Trees A Dynamic Index Structure for Spatial Searching](https://reader035.vdocuments.net/reader035/viewer/2022062520/56816368550346895dd4410f/html5/thumbnails/14.jpg)
Example of Searching
![Page 15: R-Trees A Dynamic Index Structure for Spatial Searching](https://reader035.vdocuments.net/reader035/viewer/2022062520/56816368550346895dd4410f/html5/thumbnails/15.jpg)
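Insertion is similar to inserting a record in a B-tree: new records are added to the leaves, nodes that overflow are split, and splits propagate up the tree

    Insert(T, E) {
        L = ChooseLeaf(T, E);
        INSTALL E in L;
        LL = NULL;
        IF (L now holds more than M entries) LL = SplitNode(L);   // L and LL share the entries
        AdjustTree(L, LL);   // always run, so covering rectangles are updated even without a split
        IF (the root was split) create a new root whose children are the two resulting nodes;
    }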
Insertion
![Page 16: R-Trees A Dynamic Index Structure for Spatial Searching](https://reader035.vdocuments.net/reader035/viewer/2022062520/56816368550346895dd4410f/html5/thumbnails/16.jpg)
    N ChooseLeaf(T, E) {
        SET N = T;
        IF (N is a non-leaf node) {
            find the entry F in N whose rectangle F.FI needs the least enlargement to include E.EI;
            return ChooseLeaf(F.Fp, E);     // descend and pass the result back up
        } ELSE
            return N;
    }
Insertion - ChooseLeaf()
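A minimal Python sketch of ChooseLeaf follows, written iteratively rather than recursively. The dictionary-based node layout and the helper names `area`, `union` and `enlargement` are assumptions for illustration; ties in enlargement are resolved here by preferring the entry with the smaller rectangle.

```python
def area(r):
    return (r[2] - r[0]) * (r[3] - r[1])


def union(a, b):
    """Smallest rectangle containing both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def enlargement(r, new):
    """Extra area r would need in order to include new."""
    return area(union(r, new)) - area(r)


def choose_leaf(node, rect):
    """Descend from node to the leaf whose path needs the least enlargement for rect."""
    while not node["leaf"]:
        _, child = min(node["entries"],
                       key=lambda e: (enlargement(e[0], rect), area(e[0])))
        node = child
    return node


leaf_a = {"leaf": True, "entries": [((0, 0, 2, 2), "r1"), ((3, 3, 4, 4), "r2")]}
leaf_b = {"leaf": True, "entries": [((10, 10, 12, 12), "r3"), ((11, 8, 13, 9), "r4")]}
root = {"leaf": False, "entries": [((0, 0, 4, 4), leaf_a), ((10, 8, 13, 12), leaf_b)]}

print(choose_leaf(root, (5, 5, 6, 6)) is leaf_a)    # True: enlargement 20 vs. 44
print(choose_leaf(root, (9, 9, 10, 10)) is leaf_b)  # True: enlargement 84 vs. 4
```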
![Page 17: R-Trees A Dynamic Index Structure for Spatial Searching](https://reader035.vdocuments.net/reader035/viewer/2022062520/56816368550346895dd4410f/html5/thumbnails/17.jpg)
    AdjustTree(L, LL) {
        SET N = L; SET NN = LL;
        IF (N is root) return;                  // check if done
        SET P = N.parent;
        SET En to be N's entry in P;
        ADJUST EnI so that it tightly encloses all entry rectangles in N;
        SET PP = NULL;
        IF (NN != NULL) {
            CREATE Enn;                         // Enn.p = NN, EnnI enclosing all rectangles in NN
            P.add(Enn);
            IF (P now holds more than M entries) PP = SplitNode(P);
        }
        AdjustTree(P, PP);                      // move up one level, with or without a split
    }
Insertion - AdjustTree()
The SET P, SET En and ADJUST EnI lines adjust the covering rectangle in the parent entry
![Page 18: R-Trees A Dynamic Index Structure for Spatial Searching](https://reader035.vdocuments.net/reader035/viewer/2022062520/56816368550346895dd4410f/html5/thumbnails/18.jpg)
Remove index record E from an R-tree

    Delete(T, E) {
        L = FindLeaf(T, E);
        IF (L != NULL) {
            Remove(E, L);        // remove E from L
            CondenseTree(L);
            IF (root node has only one child) make the child the new root;
        }
    }
Deletion
![Page 19: R-Trees A Dynamic Index Structure for Spatial Searching](https://reader035.vdocuments.net/reader035/viewer/2022062520/56816368550346895dd4410f/html5/thumbnails/19.jpg)
Given an R-tree whose root node is T, find the leaf node containing the index entry E

    N FindLeaf(T, E) {
        IF (T is not a leaf) {
            FOR EACH (F in T) {
                IF (F.FI overlaps E.EI) {
                    N = FindLeaf(F.Fp, E);
                    IF (N != NULL) return N;    // stop at the first subtree that holds E
                }
            }
        } ELSE {
            FOR EACH (F in T)
                IF (F MATCH E) return T;
        }
        return NULL;                            // E is not in this subtree
    }
Deletion - FindLeaf()
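FindLeaf can be sketched in Python as below, again under the assumption of a hypothetical dictionary-based node layout. The sample tree deliberately uses two covering rectangles that overlap, so the first subtree is searched without success before the record is found in the second one.

```python
def overlaps(a, b):
    """True if rectangles a and b intersect (touching edges count)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def find_leaf(node, rect, ref):
    """Return the leaf that holds the index record (rect, ref), or None."""
    if node["leaf"]:
        return node if (rect, ref) in node["entries"] else None
    for child_rect, child in node["entries"]:
        if overlaps(child_rect, rect):
            found = find_leaf(child, rect, ref)
            if found is not None:
                return found   # stop at the first subtree that holds the record
    return None


leaf_a = {"leaf": True, "entries": [((0, 0, 1, 1), "r1"), ((3, 3, 5, 5), "r2")]}
leaf_b = {"leaf": True, "entries": [((4, 4, 5, 5), "r3"), ((8, 8, 9, 9), "r4")]}
root = {"leaf": False, "entries": [((0, 0, 5, 5), leaf_a), ((4, 4, 9, 9), leaf_b)]}

print(find_leaf(root, (4, 4, 5, 5), "r3") is leaf_b)  # True
print(find_leaf(root, (4, 4, 5, 5), "r9"))            # None
```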
![Page 20: R-Trees A Dynamic Index Structure for Spatial Searching](https://reader035.vdocuments.net/reader035/viewer/2022062520/56816368550346895dd4410f/html5/thumbnails/20.jpg)
    CondenseTree(L) {
    CT1: SET N = L; SET Q = empty;              // Q is the set of eliminated nodes
    CT2: IF (N is root) GOTO CT6;
         SET P = N.parent; SET En to be N's entry in P;
    CT3: IF (N has fewer than m entries) {
             DELETE (En, P);                    // delete En from P
             Q.add(N);
         } ELSE {
    CT4:     adjust EnI to tightly contain all entries in N;
         }
    CT5: SET N = P; GOTO CT2;                   // move up one level in either case
    CT6: FOR EACH (node in Q) re-insert its entries: leaf entries with Insert(T, E),
         entries of higher-level nodes at their original level in the tree;
    }
Deletion - CondenseTree()
![Page 21: R-Trees A Dynamic Index Structure for Spatial Searching](https://reader035.vdocuments.net/reader035/viewer/2022062520/56816368550346895dd4410f/html5/thumbnails/21.jpg)
- Update
  - An update is performed as a deletion followed by a re-insertion
- Other operations
  - Find all data objects completely contained in a search area, or all objects that contain a search area
  - Range deletion
Updates and Other Operations
![Page 22: R-Trees A Dynamic Index Structure for Spatial Searching](https://reader035.vdocuments.net/reader035/viewer/2022062520/56816368550346895dd4410f/html5/thumbnails/22.jpg)
- Node splitting is needed when an entry is inserted into a full node
- The total area of the two covering rectangles after a split should be minimized, because it strongly affects search efficiency
- There are three different splitting algorithms: the exhaustive algorithm, the quadratic-cost algorithm and the linear-cost algorithm
Node Splitting
![Page 23: R-Trees A Dynamic Index Structure for Spatial Searching](https://reader035.vdocuments.net/reader035/viewer/2022062520/56816368550346895dd4410f/html5/thumbnails/23.jpg)
- It is the most straightforward approach
- Generate all possible groupings and choose the best
- Its main disadvantage is its high time complexity: the number of possible groupings grows exponentially with M, and a reasonable value of M is about 200 (4096/4/(4+1) ≈ 204)
Node Splitting- Exhaustive Algorithm
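To make the cost concrete, here is a small Python sketch of an exhaustive split together with the arithmetic from the slide. Reading 4096/4/(4+1) as a 4096-byte page, 4 bytes per number, and four rectangle coordinates plus one pointer per entry is an assumption; the function name `exhaustive_split` and the choice of total covering area as the quality measure are also illustrative.

```python
def area(r):
    return (r[2] - r[0]) * (r[3] - r[1])


def cover(rects):
    """Smallest rectangle that contains every rectangle in rects."""
    x0, y0, x1, y1 = zip(*rects)
    return (min(x0), min(y0), max(x1), max(y1))


def exhaustive_split(rects, m):
    """Try every grouping with at least m entries per group; return (cost, a, b)."""
    n = len(rects)
    best = None
    # The last entry always stays in group b, which skips mirrored duplicates
    # and leaves 2**(n-1) candidate groupings.
    for mask in range(2 ** (n - 1)):
        a = [r for i, r in enumerate(rects) if ((mask >> i) & 1) == 1]
        b = [r for i, r in enumerate(rects) if ((mask >> i) & 1) == 0]
        if len(a) < m or len(b) < m:
            continue
        cost = area(cover(a)) + area(cover(b))
        if best is None or cost < best[0]:
            best = (cost, a, b)
    return best


rects = [(0, 0, 1, 1), (1, 1, 2, 2), (10, 10, 11, 11), (11, 11, 12, 12)]
print(exhaustive_split(rects, m=2)[0])  # 8

# Assumed reading of the slide's arithmetic: page size / bytes per number / numbers per entry.
M = 4096 // (4 * (4 + 1))
print(M)                        # 204
print(2 ** (M - 1) > 10 ** 60)  # True: far too many groupings to enumerate
```

With four entries the search is trivial, but at M around 200 the number of candidate groupings is beyond anything that could be enumerated, which is why cheaper heuristics are needed.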
![Page 24: R-Trees A Dynamic Index Structure for Spatial Searching](https://reader035.vdocuments.net/reader035/viewer/2022062520/56816368550346895dd4410f/html5/thumbnails/24.jpg)
- It attempts to find a small-area split, but is not guaranteed to find one with the smallest possible area
- The cost is quadratic in M and linear in the number of dimensions
- Process
1. Pick the first entry for each group
2. Check if done
3. Select an entry to assign
Node Splitting - Quadratic-Cost Algorithm
![Page 25: R-Trees A Dynamic Index Structure for Spatial Searching](https://reader035.vdocuments.net/reader035/viewer/2022062520/56816368550346895dd4410f/html5/thumbnails/25.jpg)
- Select two entries to be the first elements of the groups
- Process
1. Calculate the inefficiency of grouping entries together
2. Choose the most wasteful pair
Quadratic-Cost Algorithm PickSeeds()
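A short Python sketch of the quadratic PickSeeds step is given below. The inefficiency of a pair is taken to be the dead space d = area(J) - area(E1) - area(E2), where J is the smallest rectangle covering both entries; the helper names and the tuple representation of rectangles are assumptions for illustration.

```python
from itertools import combinations


def area(r):
    return (r[2] - r[0]) * (r[3] - r[1])


def union(a, b):
    """Smallest rectangle containing both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def pick_seeds(rects):
    """Return the indices of the pair that would waste the most area if grouped together."""
    def waste(pair):
        i, j = pair
        return area(union(rects[i], rects[j])) - area(rects[i]) - area(rects[j])
    # Every pair is examined, which is what makes this step quadratic in M.
    return max(combinations(range(len(rects)), 2), key=waste)


rects = [(0, 0, 1, 1), (0.5, 0.5, 1.5, 1.5), (9, 9, 10, 10), (4, 4, 5, 5)]
print(pick_seeds(rects))  # (0, 2)
```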
![Page 26: R-Trees A Dynamic Index Structure for Spatial Searching](https://reader035.vdocuments.net/reader035/viewer/2022062520/56816368550346895dd4410f/html5/thumbnails/26.jpg)
- Select one remaining entry for classification in a group
- Process
1. Determine the cost of putting each entry in each group
2. Find the entry with the greatest preference for one group
Quadratic-Cost Algorithm PickNext()
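PickNext can be sketched in Python in the same style. Here the cost of putting an entry in a group is assumed to be the area increase of that group's covering rectangle, and the preference is the absolute difference between the two costs; `cover_a` and `cover_b` are hypothetical names for the current covering rectangles of the two groups.

```python
def area(r):
    return (r[2] - r[0]) * (r[3] - r[1])


def union(a, b):
    """Smallest rectangle containing both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def pick_next(remaining, cover_a, cover_b):
    """Return the unassigned rectangle with the strongest preference for one group."""
    def preference(r):
        d1 = area(union(cover_a, r)) - area(cover_a)  # cost of adding r to group a
        d2 = area(union(cover_b, r)) - area(cover_b)  # cost of adding r to group b
        return abs(d1 - d2)
    return max(remaining, key=preference)


cover_a, cover_b = (0, 0, 2, 2), (8, 8, 10, 10)
remaining = [(4, 4, 6, 6), (1, 1, 3, 3), (5, 5, 5.5, 5.5)]
print(pick_next(remaining, cover_a, cover_b))  # (1, 1, 3, 3)
```

The entry in the middle, (4, 4, 6, 6), costs the same in either group and is therefore left for later, while (1, 1, 3, 3) clearly belongs with the first group and is assigned first.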
![Page 27: R-Trees A Dynamic Index Structure for Spatial Searching](https://reader035.vdocuments.net/reader035/viewer/2022062520/56816368550346895dd4410f/html5/thumbnails/27.jpg)
- It is linear in M and in the number of dimensions
- It is identical to Quadratic Split but uses different versions of PickSeeds and PickNext
- Process
1. Find the extreme rectangles along all dimensions
2. Adjust for the shape of the rectangle cluster
3. Select the most extreme pair
Node Splitting – Linear-Cost Algorithm
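The three steps of the linear seed selection can be sketched in Python as follows. The sketch assumes rectangles stored as `(xmin, ymin, xmax, ymax)`, normalizes each separation by the width of the whole set along that dimension, and falls back to the first two entries when no dimension yields two distinct rectangles; that fallback is an assumption of this example, not part of the slides.

```python
def linear_pick_seeds(rects, dims=2):
    """Return the indices of the two seeds with the greatest normalized separation."""
    best = None
    idx = range(len(rects))
    for d in range(dims):
        lo, hi = d, d + dims  # positions of the low and high side for dimension d
        # Step 1: extreme rectangles: highest low side and lowest high side.
        high_low = max(idx, key=lambda i: rects[i][lo])
        low_high = min(idx, key=lambda i: rects[i][hi])
        if high_low == low_high:
            continue
        # Step 2: adjust for the shape of the cluster by normalizing with its width.
        width = max(r[hi] for r in rects) - min(r[lo] for r in rects)
        sep = (rects[high_low][lo] - rects[low_high][hi]) / width if width else 0.0
        # Step 3: keep the most extreme pair over all dimensions.
        if best is None or sep > best[0]:
            best = (sep, low_high, high_low)
    return best[1:] if best else (0, 1)


rects = [(0, 0, 1, 1), (2, 0, 3, 1), (9, 0, 10, 1), (4, 0.2, 5, 0.8)]
print(linear_pick_seeds(rects))  # (0, 2)
```

Each dimension needs only a single pass over the entries, in contrast to the pairwise comparison of the quadratic version.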
![Page 28: R-Trees A Dynamic Index Structure for Spatial Searching](https://reader035.vdocuments.net/reader035/viewer/2022062520/56816368550346895dd4410f/html5/thumbnails/28.jpg)
- Introduction
- R-Tree Index Structure
- Searching and Updating
- Performance Tests
  - Performance Tests
    - CPU Cost of Inserting Records
    - CPU Cost of Deleting Records
    - Search Performance: Pages Touched
    - Search Performance: CPU Cost
    - Space Efficiency
  - Second Series of Tests
    - CPU Cost of Inserts and Deletes vs. Amount of Data
    - Search Performance vs. Amount of Data: Pages Touched
    - Search Performance vs. Amount of Data: CPU Cost
    - Space Required for R-Tree vs. Amount of Data
- Conclusion
Outline
![Page 29: R-Trees A Dynamic Index Structure for Spatial Searching](https://reader035.vdocuments.net/reader035/viewer/2022062520/56816368550346895dd4410f/html5/thumbnails/29.jpg)
- R-trees were implemented in C under Unix on a VAX 11/780 computer
- The purpose was to choose values for M and m, and to evaluate the different node-splitting algorithms
- Five page sizes were tested, corresponding to different values of M
- Values tested for m were M/2, M/3 and 2
- All tests used two-dimensional data
Performance Tests
| Bytes per Page | Max Entries per Page (M) |
| --- | --- |
| 128 | 6 |
| 256 | 12 |
| 512 | 25 |
| 1024 | 50 |
| 2048 | 102 |
![Page 30: R-Trees A Dynamic Index Structure for Spatial Searching](https://reader035.vdocuments.net/reader035/viewer/2022062520/56816368550346895dd4410f/html5/thumbnails/30.jpg)
CPU Cost of Inserting Records
![Page 31: R-Trees A Dynamic Index Structure for Spatial Searching](https://reader035.vdocuments.net/reader035/viewer/2022062520/56816368550346895dd4410f/html5/thumbnails/31.jpg)
CPU Cost of Deleting Records
![Page 32: R-Trees A Dynamic Index Structure for Spatial Searching](https://reader035.vdocuments.net/reader035/viewer/2022062520/56816368550346895dd4410f/html5/thumbnails/32.jpg)
Search Performance: Pages Touched
![Page 33: R-Trees A Dynamic Index Structure for Spatial Searching](https://reader035.vdocuments.net/reader035/viewer/2022062520/56816368550346895dd4410f/html5/thumbnails/33.jpg)
Search Performance: CPU Cost
![Page 34: R-Trees A Dynamic Index Structure for Spatial Searching](https://reader035.vdocuments.net/reader035/viewer/2022062520/56816368550346895dd4410f/html5/thumbnails/34.jpg)
Space Efficiency
![Page 35: R-Trees A Dynamic Index Structure for Spatial Searching](https://reader035.vdocuments.net/reader035/viewer/2022062520/56816368550346895dd4410f/html5/thumbnails/35.jpg)
- R-tree performance was measured as a function of the amount of data in the index
- The same sequence of test operations as before was run on samples containing 1057, 2238, 3295, and 4559 rectangles
- Parameters
  - Linear algorithm with m = 2
  - Quadratic algorithm with m = M/3
  - Both with a page size of 1024 bytes (M = 50)
Second Series of Tests
![Page 36: R-Trees A Dynamic Index Structure for Spatial Searching](https://reader035.vdocuments.net/reader035/viewer/2022062520/56816368550346895dd4410f/html5/thumbnails/36.jpg)
CPU Cost of Inserts and Deletes vs. Amount of Data
![Page 37: R-Trees A Dynamic Index Structure for Spatial Searching](https://reader035.vdocuments.net/reader035/viewer/2022062520/56816368550346895dd4410f/html5/thumbnails/37.jpg)
Search Performance vs. Amount of Data: Pages Touched
![Page 38: R-Trees A Dynamic Index Structure for Spatial Searching](https://reader035.vdocuments.net/reader035/viewer/2022062520/56816368550346895dd4410f/html5/thumbnails/38.jpg)
Search Performance vs. Amount of Data: CPU Cost
![Page 39: R-Trees A Dynamic Index Structure for Spatial Searching](https://reader035.vdocuments.net/reader035/viewer/2022062520/56816368550346895dd4410f/html5/thumbnails/39.jpg)
Space Required for R-Tree vs. Amount of Data
![Page 40: R-Trees A Dynamic Index Structure for Spatial Searching](https://reader035.vdocuments.net/reader035/viewer/2022062520/56816368550346895dd4410f/html5/thumbnails/40.jpg)
- Introduction
- R-Tree Index Structure
- Searching and Updating
- Performance Tests
- Conclusion
Outline
![Page 41: R-Trees A Dynamic Index Structure for Spatial Searching](https://reader035.vdocuments.net/reader035/viewer/2022062520/56816368550346895dd4410f/html5/thumbnails/41.jpg)
- The author proposed a useful index structure, named the R-tree, for multi-dimensional data
- The author also gave three different node-splitting algorithms, ran tests on them, and concluded that the linear node-split algorithm is the most efficient approach
- The R-tree would be easy to add to any relational database system
Conclusion