Post on 22-Dec-2015
Spatial Indexing
Spatial Queries
Given a collection of geometric objects (points, lines, polygons, ...)
organize them on disk, to answer: point queries, range queries, k-nn queries, spatial joins (‘all pairs’ queries)
Spatial Joins
Spatial joins: find (quickly) all counties intersecting lakes
R-trees – spatial join
We assume that both datasets are organized in R-trees, using the MBRs.
First find the pairs of MBRs that intersect, then check the original objects for those pairs.
R-tree – Spatial Joins
SPJ1(T1, T2)
  for each parent P1 of tree T1
    for each parent P2 of tree T2
      if their MBRs intersect, process them recursively (i.e., check their children)
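The SPJ1 traversal can be sketched in Python as follows. This is a minimal illustration, not the slides' implementation: the `Node` class, the `parent` helper and the `(xmin, ymin, xmax, ymax)` rectangle layout are assumptions made for the example, and both trees are assumed to have the same height. It performs only the MBR (filter) step; the original objects would still have to be checked afterwards.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax), assumed layout


def intersects(a: Rect, b: Rect) -> bool:
    """True if two axis-aligned rectangles share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


@dataclass
class Node:
    mbr: Rect
    children: List["Node"] = field(default_factory=list)  # empty for data entries
    obj_id: Optional[str] = None  # set only on data entries


def parent(children: List[Node]) -> Node:
    """Hypothetical helper: build a parent whose MBR covers its children."""
    xs0, ys0, xs1, ys1 = zip(*(c.mbr for c in children))
    return Node(mbr=(min(xs0), min(ys0), max(xs1), max(ys1)), children=children)


def spj1(n1: Node, n2: Node, out: List[Tuple[str, str]]) -> None:
    """Collect id pairs of data entries whose MBRs intersect (filter step only)."""
    if not intersects(n1.mbr, n2.mbr):
        return  # prune: nothing below these two nodes can intersect
    if not n1.children and not n2.children:
        out.append((n1.obj_id, n2.obj_id))
        return
    # Equal heights are assumed, so both nodes are internal here.
    for c1 in n1.children:
        for c2 in n2.children:
            spj1(c1, c2, out)
```

Calling `spj1(root_of_T1, root_of_T2, pairs)` with an empty list `pairs` fills it with candidate pairs in depth-first order.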
R-tree – Spatial Joins
We assume that the trees have the same height.
The traversal is done in DFS order.
R-tree – Spatial Joins
Optimization (SPJ2): first compute the intersection of the MBRs of the two nodes being joined. Then check for intersection only the child rectangles that overlap this common region.
Huge improvement in CPU time!
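A rough sketch of the SPJ2 idea for a single pair of nodes is shown below. The function names and the `(xmin, ymin, xmax, ymax)` rectangle layout are assumptions for illustration. The restriction is safe because every child lies inside its node's MBR, so two children can only intersect inside the region common to both node MBRs.

```python
from typing import List, Optional, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax), assumed layout


def intersects(a: Rect, b: Rect) -> bool:
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def intersection(a: Rect, b: Rect) -> Optional[Rect]:
    """Common region of two rectangles, or None if they are disjoint."""
    xmin, ymin = max(a[0], b[0]), max(a[1], b[1])
    xmax, ymax = min(a[2], b[2]), min(a[3], b[3])
    if xmin > xmax or ymin > ymax:
        return None
    return (xmin, ymin, xmax, ymax)


def join_restricted(mbr1: Rect, entries1: List[Rect],
                    mbr2: Rect, entries2: List[Rect]) -> List[Tuple[Rect, Rect]]:
    """Pairwise test only the entries that overlap the nodes' common region."""
    common = intersection(mbr1, mbr2)
    if common is None:
        return []
    cand1 = [e for e in entries1 if intersects(e, common)]
    cand2 = [e for e in entries2 if intersects(e, common)]
    return [(a, b) for a in cand1 for b in cand2 if intersects(a, b)]
```

The two linear filtering passes shrink the candidate lists before the quadratic pairwise test, which is where the CPU saving would come from.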
R-tree – Spatial Joins
Is there any way to do better? Yes, using plane sweep!
To check for intersection, naïve: O(n^2)
But with plane sweep: O(n log n)
R-tree – Spatial Joins
Move a vertical line (sweep line) from left to right.
Every time the line meets a new object, do some processing.
Objects are sorted by their x-coordinate.
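One common way to realize this sweep for two sets of rectangles is a sorted forward scan, sketched below. It is an illustrative variant, not necessarily the one the slides have in mind; the function name and the `(xmin, ymin, xmax, ymax)` layout are assumptions. Sorting costs O(n log n); the scanning cost additionally depends on how many rectangles overlap the sweep line at once.

```python
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax), assumed layout


def sweep_join(rs: List[Rect], ss: List[Rect]) -> List[Tuple[Rect, Rect]]:
    """Report all intersecting (r, s) pairs with a left-to-right sweep."""
    rs = sorted(rs, key=lambda r: r[0])  # sort both inputs by xmin
    ss = sorted(ss, key=lambda s: s[0])
    out = []
    i = j = 0
    while i < len(rs) and j < len(ss):
        if rs[i][0] <= ss[j][0]:
            # The sweep line reaches r first: scan the s rectangles that
            # start before r ends, and test the y-extent for each.
            r, k = rs[i], j
            while k < len(ss) and ss[k][0] <= r[2]:
                if r[1] <= ss[k][3] and ss[k][1] <= r[3]:
                    out.append((r, ss[k]))
                k += 1
            i += 1
        else:
            # Symmetric case: the sweep line reaches s first.
            s, k = ss[j], i
            while k < len(rs) and rs[k][0] <= s[2]:
                if s[1] <= rs[k][3] and rs[k][1] <= s[3]:
                    out.append((rs[k], s))
                k += 1
            j += 1
    return out
```

Each intersecting pair is reported exactly once, by whichever of its two rectangles the sweep line meets first.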
R-tree – Spatial Joins
What happens if only one relation has an index?
Build another index on the other relation, then join.
Use the first tree to build the second one: since we want to compute the join, we can filter out some rectangles during the construction of the second tree!
Spatial Joins
Similar idea if we have z-ordering/quadtrees.
Merge the two z-ordered lists, using the properties of z-values (10* encloses 1001).
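The prefix property of z-values can be illustrated with a small sketch. It assumes z-values are stored as bit strings, with a shorter string standing for a larger quadtree block; the function names are hypothetical. `z_join` merges the two sorted lists and keeps, per input, a stack of blocks that still enclose the current position.

```python
from typing import List, Tuple


def encloses(z_outer: str, z_inner: str) -> bool:
    """'10*' encloses '1001': drop the trailing '*' and test for a prefix."""
    return z_inner.rstrip("*").startswith(z_outer.rstrip("*"))


def z_join(a: List[str], b: List[str]) -> List[Tuple[str, str]]:
    """Pairs (za, zb) where one z-value is a prefix of the other."""
    # In lexicographic order a prefix sorts before all of its extensions.
    events = sorted([(z, 0) for z in a] + [(z, 1) for z in b])
    stacks: Tuple[List[str], List[str]] = ([], [])
    out = []
    for z, side in events:
        # Drop blocks that no longer enclose the current z-value.
        for st in stacks:
            while st and not z.startswith(st[-1]):
                st.pop()
        # Every block left on the other input's stack encloses z.
        for other in stacks[1 - side]:
            out.append((z, other) if side == 0 else (other, z))
        stacks[side].append(z)
    return out
```

For example, `encloses("10*", "1001")` holds, while the reverse does not.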
R-trees - performance analysis
How many disk (= node) accesses will we need for range queries, nn queries, and spatial joins?
Why does it matter?
A: because we can design split (etc.) algorithms accordingly; also, do query optimization.
motivating question: on, e.g., split, should we try to minimize the area (volume)? the perimeter? the overlap? or a weighted combination? why?
R-trees - performance analysis
How many disk accesses for range queries?
query distribution wrt location? uniform (or biased)
query distribution wrt size? uniform
R-trees - performance analysis
easier case: we know the positions of parent MBRs, eg:
R-trees - performance analysis
How many times will P1 be retrieved (unif. queries)?
[Figure: parent MBR P1 with sides x1 and x2]
R-trees - performance analysis
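How many times will P1 be retrieved (unif. POINT queries)?
A: x1*x2, the area of P1, since a uniformly distributed query point in the unit square falls inside P1 with that probability.
[Figure: P1 with sides x1 and x2 inside the unit square, both axes running from 0 to 1]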
R-trees - performance analysis
How many times will P1 be retrieved (unif. queries of size q1 x q2)?
[Figure: P1 with sides x1 and x2 inside the unit square, and a query rectangle with sides q1 and q2]
R-trees - performance analysis
Minkowski sum
[Figure: Minkowski sum of P1 and a query of size q1 x q2]
R-trees - performance analysis
How many times will P1 be retrieved (unif. queries of size q1 x q2)?
A: (x1+q1)*(x2+q2)
[Figure: P1 with sides x1 and x2 inside the unit square, and a query rectangle with sides q1 and q2]
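The Minkowski-sum argument can be checked numerically with a small simulation. The sketch below is only an illustration under stated assumptions: queries are identified by their centre, which is drawn uniformly from the unit square, and P1 is placed in the middle of the square so that the inflated rectangle stays inside it (no boundary effects). The function names are hypothetical.

```python
import random


def retrieval_probability(x1: float, x2: float, q1: float, q2: float) -> float:
    """Area of P1 inflated by the query sides (the Minkowski sum)."""
    return (x1 + q1) * (x2 + q2)


def simulate(x1: float, x2: float, q1: float, q2: float,
             trials: int = 200_000, seed: int = 0) -> float:
    """Fraction of random q1 x q2 queries that intersect P1."""
    rng = random.Random(seed)
    # Lower-left corner of P1, chosen so that P1 is centred in the unit square.
    a1, a2 = (1 - x1) / 2, (1 - x2) / 2
    hits = 0
    for _ in range(trials):
        c1, c2 = rng.random(), rng.random()  # query centre
        # The query intersects P1 iff its centre lies in the inflated rectangle.
        if (a1 - q1 / 2 <= c1 <= a1 + x1 + q1 / 2
                and a2 - q2 / 2 <= c2 <= a2 + x2 + q2 / 2):
            hits += 1
    return hits / trials
```

With x1 = 0.3, x2 = 0.2 and q1 = q2 = 0.1 the formula gives 0.4 * 0.3 = 0.12, and the simulated fraction should land close to that value.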
R-trees - performance analysis
Thus, given a tree with N nodes (i = 1, ..., N) we expect
#DiskAccesses(q1,q2) = sum ( xi,1 + q1 ) * ( xi,2 + q2 )
= sum ( xi,1 * xi,2 ) [‘volume’]
+ q2 * sum ( xi,1 ) + q1 * sum ( xi,2 ) [surface area]
+ q1 * q2 * N [count]
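The expansion can be verified on a toy set of node MBRs. In the sketch below each node is represented only by its side lengths (xi,1, xi,2); the function names and the sample numbers are made up for illustration.

```python
from typing import List, Tuple

Sides = Tuple[float, float]  # (x_i1, x_i2): side lengths of node i's MBR


def expected_accesses(mbrs: List[Sides], q1: float, q2: float) -> float:
    """Direct form: sum over nodes of (x_i1 + q1) * (x_i2 + q2)."""
    return sum((x1 + q1) * (x2 + q2) for x1, x2 in mbrs)


def expected_accesses_by_terms(mbrs: List[Sides], q1: float, q2: float) -> float:
    """Expanded form: volume + surface-area terms + count term."""
    volume = sum(x1 * x2 for x1, x2 in mbrs)
    sides1 = sum(x1 for x1, _ in mbrs)
    sides2 = sum(x2 for _, x2 in mbrs)
    count = len(mbrs)
    return volume + q2 * sides1 + q1 * sides2 + q1 * q2 * count
```

For the three nodes `[(0.2, 0.1), (0.4, 0.3), (0.1, 0.1)]` and a 0.1 x 0.2 query, both forms give 0.40; for a point query (q1 = q2 = 0) only the volume term, 0.15, remains.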
R-trees - performance analysis
Observations:
for point queries: only volume matters
for horizontal-line queries (q2=0): vertical length matters
for large queries (q1, q2 >> 0): the count N matters
R-trees - performance analysis
Observations (cont’d)
overlap: does not seem to matter
formula: easily extensible to n dimensions
(for even more details: [Pagel+, PODS93], [Kamel+, CIKM93])
R-trees - performance analysis
Conclusions:
splits should try to minimize area and perimeter, i.e., we want few, small, square-like parent MBRs
rule of thumb: shoot for queries with q1 = q2 = 0.1 (or = 0.5 or so).
R-trees - performance analysis
Range queries - how many disk accesses, if we just know that we have N points in n-d space?
A: cannot tell! We need to know the distribution.
R-trees - performance analysis
What are obvious and/or realistic distributions?
A: uniform
A: Gaussian / mixture of Gaussians
A: self-similar / fractal. Fractal dimension ~ intrinsic dimension
R-trees - performance analysis
Formulas for range queries and k-nn queries: use fractal dimension [Kamel+, PODS94], [Korn+ ICDE2000] [Kriegel+, PODS97]
Formulas for spatial joins of regions: open research question
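R-trees - performance analysis
Assuming uniform distribution [TS96]:
$$DA(q) = 1 + \sum_{j=1}^{h-1} \left\{ \left( \sqrt{D_j} + q \cdot \sqrt{\frac{N}{f^j}} \right)^2 \right\}$$
where
$$D_j = \left\{ 1 + \frac{\sqrt{D_{j-1}} - 1}{\sqrt{f}} \right\}^2$$
and D is the density of the dataset, f the fanout.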
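As a rough numerical sketch, the [TS96] estimate can be evaluated as below. Several points are assumptions rather than statements from the slide: the formula is read as DA(q) = 1 + sum over j = 1..h-1 of (sqrt(D_j) + q*sqrt(N/f^j))^2 with D_j = (1 + (sqrt(D_{j-1}) - 1)/sqrt(f))^2; the recursion is started at D_0 = D; N is taken to be the number of data rectangles; the query is a square of side q in 2-d; and the tree height h is passed in explicitly. The function name is hypothetical.

```python
import math


def ts96_disk_accesses(N: int, f: int, D: float, q: float, h: int) -> float:
    """Estimated node accesses for a square query of side q (2-d, uniform data)."""
    total = 1.0  # the root is always read
    d = D        # assumption: D_0 equals the density D of the dataset
    for j in range(1, h):  # levels j = 1 .. h-1
        # Density of the MBRs at level j, derived from level j-1.
        d = (1 + (math.sqrt(d) - 1) / math.sqrt(f)) ** 2
        # Expected number of level-j nodes intersected by the query.
        total += (math.sqrt(d) + q * math.sqrt(N / f ** j)) ** 2
    return total
```

As a sanity check, with D = 1 and a point query (q = 0) every level contributes exactly one access, so a tree of height 3 yields an estimate of 3.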