Indexing Data Structure
![Page 1: Indexing Data Structure](https://reader033.vdocuments.net/reader033/viewer/2022051107/53ff98e88d7f724c088b46b4/html5/thumbnails/1.jpg)
Vivek Kantariya (09bce020)
Guided by: Prof. Vibha Patel
![Page 2: Indexing Data Structure](https://reader033.vdocuments.net/reader033/viewer/2022051107/53ff98e88d7f724c088b46b4/html5/thumbnails/2.jpg)
• Manage large data
• Provide faster access
• Easy search
• Reduce unwanted memory access
• Proper memory allocation
• Increase efficiency
![Page 3: Indexing Data Structure](https://reader033.vdocuments.net/reader033/viewer/2022051107/53ff98e88d7f724c088b46b4/html5/thumbnails/3.jpg)
An index contains a search key and a pointer.
• Search key - an attribute or set of attributes that is used to look up the records in a file.
• Pointer - contains the address of where the data is stored in memory.
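
As a minimal sketch of the idea above, assume the records sit in a Python list and a "pointer" is simply a record's position in that list. Both are assumptions made for illustration (a real index would store storage addresses), and the sample records and the `lookup` helper are hypothetical. The index keeps (search key, pointer) pairs sorted by key, so a lookup can use binary search instead of scanning every record.

```python
from bisect import bisect_left

# Hypothetical data file: each record is (roll_no, name), in arrival order.
records = [(42, "Asha"), (7, "Ravi"), (19, "Meera"), (3, "Kiran")]

# The index: (search key, pointer) pairs sorted by search key.
# Here the pointer is the record's position in `records`.
index = sorted((rec[0], pos) for pos, rec in enumerate(records))
keys = [key for key, _ in index]


def lookup(search_key):
    """Return the record with the given key, or None if it is absent."""
    i = bisect_left(keys, search_key)      # binary search on the sorted keys
    if i < len(keys) and keys[i] == search_key:
        pointer = index[i][1]              # follow the pointer to the record
        return records[pointer]
    return None
```

Calling `lookup(19)` follows the pointer stored next to key 19 and returns that record; a key that is not in the index yields `None`.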
![Page 4: Indexing Data Structure](https://reader033.vdocuments.net/reader033/viewer/2022051107/53ff98e88d7f724c088b46b4/html5/thumbnails/4.jpg)
Five factors involved when choosing the indexing technique:
1) access type
2) access time
3) insertion time
4) deletion time
5) space overhead
![Page 5: Indexing Data Structure](https://reader033.vdocuments.net/reader033/viewer/2022051107/53ff98e88d7f724c088b46b4/html5/thumbnails/5.jpg)
1) Access type - the type of access being used, such as finding records with a specified key value or within a range of key values.
2) Access time - time required to locate the data.
3) Insertion time - time required to insert the new data.
4) Deletion time - time required to delete the data.
5) Space overhead - the additional space occupied by the added data structure.
![Page 6: Indexing Data Structure](https://reader033.vdocuments.net/reader033/viewer/2022051107/53ff98e88d7f724c088b46b4/html5/thumbnails/6.jpg)
• A spatial index is for multidimensional data.
• Used to describe 2D or 3D objects.
• Widely used in real-world applications.
• Examples are: R-tree, R+-tree, k-d tree, A-tree, Hilbert R-tree, etc.
![Page 7: Indexing Data Structure](https://reader033.vdocuments.net/reader033/viewer/2022051107/53ff98e88d7f724c088b46b4/html5/thumbnails/7.jpg)
• Computer Aided Design (CAD)
• Geographic applications (like maps)
• Multimedia applications (like X-rays)
• Biological databases
![Page 8: Indexing Data Structure](https://reader033.vdocuments.net/reader033/viewer/2022051107/53ff98e88d7f724c088b46b4/html5/thumbnails/8.jpg)
• Any type of geometry
◦ Point (e.g., City)
◦ Line (e.g., Trail)
◦ Polygon (e.g., Border)
◦ A collection of geometries (e.g., Ski Resort Trails)
• Any coordinate system
◦ Meters
◦ Pixels
◦ WGS84 (GPS)
![Page 9: Indexing Data Structure](https://reader033.vdocuments.net/reader033/viewer/2022051107/53ff98e88d7f724c088b46b4/html5/thumbnails/9.jpg)
![Page 10: Indexing Data Structure](https://reader033.vdocuments.net/reader033/viewer/2022051107/53ff98e88d7f724c088b46b4/html5/thumbnails/10.jpg)
• The R-tree was proposed by Antonin Guttman (UC Berkeley)
• All spatial data is enveloped in a Minimum Bounding Rectangle (MBR)
• Objects are stored and indexed according to their MBR
• Structure resembles a B+-tree
• Height balanced
![Page 11: Indexing Data Structure](https://reader033.vdocuments.net/reader033/viewer/2022051107/53ff98e88d7f724c088b46b4/html5/thumbnails/11.jpg)
• An index record in a leaf node has the form <I, tuple-identifier>
• I = (I0, I1, …, In-1)
• n = number of dimensions in the geometry
• Each Ii is a closed interval [a, b] describing the range of the rectangle along dimension i
• a or b can be equal to infinity
• Tuple-identifier points to a record
• Entries in non-leaf nodes have the form <I, child-pointer>
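
One possible way to picture a leaf index record is sketched below. The class name `IndexRecord`, the helper `mbr_of_points`, and the sample trail coordinates are all hypothetical, chosen only for illustration. The rectangle I is held as one (a, b) interval per dimension, and the helper computes the minimum bounding rectangle of a geometry given as a list of points.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Interval = Tuple[float, float]  # (a, b) with a <= b, one per dimension


@dataclass
class IndexRecord:
    I: List[Interval]   # bounding rectangle: one interval per dimension
    tuple_id: int       # hypothetical identifier of the stored record


def mbr_of_points(points: Sequence[Sequence[float]]) -> List[Interval]:
    """Smallest axis-aligned rectangle that encloses all the points."""
    dims = len(points[0])
    return [
        (min(p[d] for p in points), max(p[d] for p in points))
        for d in range(dims)
    ]


# A 2D "trail" (line geometry) described by three points.
trail = [(1.0, 4.0), (3.0, 2.0), (6.0, 5.0)]
entry = IndexRecord(I=mbr_of_points(trail), tuple_id=17)
```

Here `entry.I` holds the interval (1.0, 6.0) along x and (2.0, 5.0) along y. A non-leaf entry could be modelled the same way with a child-node reference in place of `tuple_id`.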
![Page 12: Indexing Data Structure](https://reader033.vdocuments.net/reader033/viewer/2022051107/53ff98e88d7f724c088b46b4/html5/thumbnails/12.jpg)
• M is the maximum number of entries in one node
• m specifies the minimum number of entries in a node, where m ≤ M/2
• Properties:
1. Every leaf node contains between m and M index records unless it is the root.
2. For each index record <I, tuple-identifier> in a leaf node, I is the smallest rectangle that spatially contains the n-dimensional data object.
![Page 13: Indexing Data Structure](https://reader033.vdocuments.net/reader033/viewer/2022051107/53ff98e88d7f724c088b46b4/html5/thumbnails/13.jpg)
3. Every non-leaf node has between m and M children unless it is the root.
4. For each entry <I, child-pointer> in a non-leaf node, I is the smallest rectangle that spatially contains the rectangles in the child nodes.
5. The root node has at least two children unless it is a leaf.
6. All leaves appear on the same level.
![Page 14: Indexing Data Structure](https://reader033.vdocuments.net/reader033/viewer/2022051107/53ff98e88d7f724c088b46b4/html5/thumbnails/14.jpg)
![Page 15: Indexing Data Structure](https://reader033.vdocuments.net/reader033/viewer/2022051107/53ff98e88d7f724c088b46b4/html5/thumbnails/15.jpg)
1. Search
2. Insert
3. Delete
4. Nearest Neighbor
![Page 16: Indexing Data Structure](https://reader033.vdocuments.net/reader033/viewer/2022051107/53ff98e88d7f724c088b46b4/html5/thumbnails/16.jpg)
1. Given an R-tree with root T, find all index records whose rectangles overlap the search rectangle S.
2. If T is not a leaf, check each entry E to determine whether its rectangle E.I overlaps S.
3. For each overlapping entry, invoke search on the subtree whose root is the node pointed to by E.p.
4. If T is a leaf, check each entry E. If E.I overlaps S, output E.
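
The search steps above can be sketched as a short recursive function. This is an illustrative sketch, not a full R-tree: the `Node` class, the `overlaps` helper, and the hand-built two-level tree are assumptions made for the example, and rectangles are lists of (a, b) intervals, one per dimension.

```python
class Node:
    def __init__(self, leaf, entries):
        self.leaf = leaf
        # Leaf node:     list of (rectangle, tuple identifier)
        # Non-leaf node: list of (rectangle, child Node)
        self.entries = entries


def overlaps(r, s):
    """True if rectangles r and s intersect along every dimension."""
    return all(a1 <= b2 and a2 <= b1 for (a1, b1), (a2, b2) in zip(r, s))


def search(t, s):
    """Return identifiers of all records whose rectangle overlaps s."""
    results = []
    for rect, item in t.entries:
        if overlaps(rect, s):
            if t.leaf:
                results.append(item)              # step 4: output the entry
            else:
                results.extend(search(item, s))   # step 3: descend into child
    return results


# Hand-built example tree: one root with two leaf children.
leaf1 = Node(True, [([(0, 2), (0, 2)], "a"), ([(3, 4), (1, 3)], "b")])
leaf2 = Node(True, [([(6, 8), (6, 8)], "c"), ([(7, 9), (1, 2)], "d")])
root = Node(False, [([(0, 4), (0, 3)], leaf1), ([(6, 9), (1, 8)], leaf2)])

print(search(root, [(1, 3.5), (1, 2)]))   # ['a', 'b']
```

Because bounding rectangles of different subtrees may overlap, a search can descend into more than one child of the same node, which is why the function loops over every overlapping entry rather than following a single path.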
![Page 17: Indexing Data Structure](https://reader033.vdocuments.net/reader033/viewer/2022051107/53ff98e88d7f724c088b46b4/html5/thumbnails/17.jpg)
1) Start at the root node.
2) Select the child that needs the least enlargement in order to fit the new geometry.
3) Repeat until at a leaf node.
4) If the leaf node has available space, then insert.
![Page 18: Indexing Data Structure](https://reader033.vdocuments.net/reader033/viewer/2022051107/53ff98e88d7f724c088b46b4/html5/thumbnails/18.jpg)
5) Else split the leaf node into two nodes.
• Update parent nodes
• Update the entry that pointed to the node with a new MBR (Minimum Bounding Rectangle)
• Add a new entry for the second new node
6) If there is no space in the parent node, split it and repeat.
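
The "least enlargement" rule used while descending the tree can be sketched as follows. The function names `area`, `enlarge`, and `choose_child` are hypothetical, rectangles are lists of (a, b) intervals, and breaking ties in favour of the smaller rectangle is an assumption added here for determinism. Node splitting is deliberately left out of this sketch.

```python
def area(r):
    """Area (or volume) of a rectangle given as per-dimension intervals."""
    result = 1.0
    for a, b in r:
        result *= (b - a)
    return result


def enlarge(r, s):
    """Smallest rectangle that covers both r and s."""
    return [(min(a1, a2), max(b1, b2)) for (a1, b1), (a2, b2) in zip(r, s)]


def choose_child(child_rects, new_rect):
    """Index of the child MBR that needs the least enlargement to fit new_rect.

    Ties are broken by choosing the child with the smaller area
    (an assumption made for this sketch).
    """
    def cost(i):
        r = child_rects[i]
        return (area(enlarge(r, new_rect)) - area(r), area(r))

    return min(range(len(child_rects)), key=cost)


children = [[(0, 4), (0, 4)], [(5, 9), (5, 9)]]
new_geometry = [(4, 5), (1, 2)]
print(choose_child(children, new_geometry))   # 0
```

In this example the first child grows from area 16 to 20 (enlargement 4), while the second would grow from 16 to 40 (enlargement 24), so the first child is selected.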
![Page 19: Indexing Data Structure](https://reader033.vdocuments.net/reader033/viewer/2022051107/53ff98e88d7f724c088b46b4/html5/thumbnails/19.jpg)
Make sure nodes are split so they cover the smallest possible area.
Split should minimize average search time.
[Figure: a good split compared with a bad split]
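
To make the difference between a good and a bad split concrete, the sketch below compares the total covered area of two ways of dividing the same four rectangles into two nodes. The rectangles and the helper names `mbr`, `area`, and `split_cost` are made up for illustration, and total area is used as a simple stand-in for split quality.

```python
def mbr(rects):
    """Minimum bounding rectangle of a group of rectangles."""
    dims = len(rects[0])
    return [
        (min(r[d][0] for r in rects), max(r[d][1] for r in rects))
        for d in range(dims)
    ]


def area(r):
    result = 1.0
    for a, b in r:
        result *= (b - a)
    return result


def split_cost(group_a, group_b):
    """Total area covered by the two nodes produced by a split."""
    return area(mbr(group_a)) + area(mbr(group_b))


# Four unit squares: two on the left, two on the right.
a = [(0, 1), (0, 1)]
b = [(0, 1), (2, 3)]
c = [(5, 6), (0, 1)]
d = [(5, 6), (2, 3)]

good = split_cost([a, b], [c, d])   # neighbours grouped together
bad = split_cost([a, d], [b, c])    # distant rectangles grouped together
print(good, bad)                    # 6.0 36.0
```

The bad grouping produces two large rectangles that also overlap each other completely, so almost any search would have to visit both nodes.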
![Page 20: Indexing Data Structure](https://reader033.vdocuments.net/reader033/viewer/2022051107/53ff98e88d7f724c088b46b4/html5/thumbnails/20.jpg)
1) Goal: remove index record E from the R-tree.
2) Find the leaf node containing the record.
3) Remove E from that node.
4) If the node now contains fewer than m records, remove the node and add it to a queue.
5) Move up the tree and do the same at each level, reducing the covering rectangles.
6) Reinsert all records from the queue.
![Page 21: Indexing Data Structure](https://reader033.vdocuments.net/reader033/viewer/2022051107/53ff98e88d7f724c088b46b4/html5/thumbnails/21.jpg)
• Split entries in the tree so that there is no overlap
• No more multiple paths to reach a solution
• Child pointers duplicated within the tree
[Figure: R-Tree MBRs compared with R+-Tree MBRs]
![Page 22: Indexing Data Structure](https://reader033.vdocuments.net/reader033/viewer/2022051107/53ff98e88d7f724c088b46b4/html5/thumbnails/22.jpg)
• Do not split nodes on insert
• Take entries from the overfull node and reinsert them into the tree
• Changes MBRs
• Saves time and possibly rebalances the tree
![Page 23: Indexing Data Structure](https://reader033.vdocuments.net/reader033/viewer/2022051107/53ff98e88d7f724c088b46b4/html5/thumbnails/23.jpg)
1. www.ieeexplore.ieee.org
◦ A NEW APPROACH TO CREATING SPATIAL INDEX WITH R-TREE by Ze-Bao Zhang, Jian-Pei Zhang, Jing Yang, Yue Yang
◦ A NEW VARIATION OF R-TREE FOR INDEXING SPACIAL DATA IN GIS by Chen Yongkang, Zhou Xintie, Shi Tailai, Feng Xiaoming
2. http://wikipedia.org/wiki/R_tree
![Page 24: Indexing Data Structure](https://reader033.vdocuments.net/reader033/viewer/2022051107/53ff98e88d7f724c088b46b4/html5/thumbnails/24.jpg)