CSIS 7101: Spatial Data (Part 2)

Efficient Processing of Spatial Joins Using R-trees

Rollo Chan, Chu Chung Man, Mak Wai Yip, Vivian Lee, Eric Lo, Sindy Shou, Hugh Wang


Post on 18-Dec-2015




Efficient Processing of Spatial Joins Using R-trees

What is spatial data?
    Consists of points, lines, rectangles, polygons, surfaces…

Two types of queries in DBS
    Single-scan and multiple-scan queries

How to retrieve spatial objects in a GIS efficiently?
    Spatial Access Method (SAM), e.g. R*-tree


What is a Spatial Access Method?

Designed to support single-scan queries
    e.g. window query: “Find all objects which intersect a given window”

Attempts to store objects which are close together in the data space on a common page
    Reduces the number of disk accesses


How is a window query processed by a SAM?

1) Filter step
    Find all objects whose minimum bounding rectangles intersect the query rectangle

2) Refinement step
    Check whether the objects fulfill the query condition
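The two steps above can be sketched in a few lines of Python. This is only an illustration: the `SpatialObject` class, the disc-shaped objects, the `circle_hits` helper and all coordinates are assumptions made for the example, and a linear scan stands in for the index lookup a real SAM would perform in the filter step.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Rect = Tuple[float, float, float, float]  # (xl, yl, xu, yu)


def intersects(a: Rect, b: Rect) -> bool:
    """Closed rectangles intersect when they overlap on both axes."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


@dataclass
class SpatialObject:
    name: str
    mbr: Rect                            # minimum bounding rectangle
    exact_test: Callable[[Rect], bool]   # hypothetical exact-geometry predicate


def circle_hits(cx: float, cy: float, r: float) -> Callable[[Rect], bool]:
    """Exact test for a disc: does it touch the window at all?"""
    def test(w: Rect) -> bool:
        nx = min(max(cx, w[0]), w[2])    # window point nearest to the centre
        ny = min(max(cy, w[1]), w[3])
        return (nx - cx) ** 2 + (ny - cy) ** 2 <= r * r
    return test


def filter_step(objects: List[SpatialObject], window: Rect) -> List[SpatialObject]:
    """Step 1: keep objects whose MBR intersects the query rectangle."""
    return [o for o in objects if intersects(o.mbr, window)]


def refinement_step(candidates: List[SpatialObject], window: Rect) -> List[str]:
    """Step 2: check the exact geometry of each candidate."""
    return [o.name for o in candidates if o.exact_test(window)]


objects = [
    SpatialObject("disc_a", (-1, -1, 1, 1), circle_hits(0, 0, 1)),
    SpatialObject("disc_b", (1.5, 1.5, 3.5, 3.5), circle_hits(2.5, 2.5, 1)),
    SpatialObject("disc_c", (5, 5, 7, 7), circle_hits(6, 6, 1)),
]
window = (0.9, 0.9, 2.0, 2.0)

candidates = filter_step(objects, window)
result = refinement_step(candidates, window)
```

Here `disc_a` survives the filter step because the corner of its MBR reaches into the window, but the refinement step rejects it because the disc itself does not.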


What is a Spatial Join?

Combines two sets of spatial objects according to some spatial property

It is an important type of multiple-scan query in spatial DBS


Example of Spatial Join

Two relations: forests, cities
    (Assume an attribute in each relation represents the borders of the forests and cities)

Example query: “Find all forests which are in a city”
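As a baseline for comparison, the filter step of such a join can be written as a simple nested loop over the bounding rectangles. The sketch below is illustrative only: the forest and city names and their rectangles are made up, and the exact border geometry (the refinement step) is left out.

```python
from typing import Dict, List, Tuple

Rect = Tuple[float, float, float, float]  # (xl, yl, xu, yu)


def intersects(a: Rect, b: Rect) -> bool:
    """Closed rectangles intersect when they overlap on both axes."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def nested_loop_join(forests: Dict[str, Rect],
                     cities: Dict[str, Rect]) -> Tuple[List[Tuple[str, str]], int]:
    """Compare every forest MBR with every city MBR; return pairs and test count."""
    pairs = []
    comparisons = 0
    for f_name, f_mbr in forests.items():
        for c_name, c_mbr in cities.items():
            comparisons += 1
            if intersects(f_mbr, c_mbr):
                pairs.append((f_name, c_name))
    return pairs, comparisons


# Hypothetical bounding rectangles of the border attribute.
forests = {"f1": (0, 0, 2, 2), "f2": (5, 5, 6, 6), "f3": (8, 1, 9, 2)}
cities = {"c1": (1, 1, 4, 4), "c2": (4.5, 4.5, 7, 7)}

pairs, comparisons = nested_loop_join(forests, cities)
```

Every forest is compared with every city, so the number of rectangle tests grows with the product of the two relation sizes.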


Problems when performing Spatial Join

It is too expensive in terms of CPU time and I/O time

Traditional index structures are not efficient for spatial joins

How to make it more efficient?
    R*-tree


Why use R*-trees for Spatial Join?

To optimize CPU time and I/O time

Fewer comparisons than a simple nested loop

Other algorithms cannot be efficiently applied to spatial joins


R*-tree Approach for Spatial Join

Suppose there are two R*-trees R, S

Idea:
    Use the property that directory rectangles form the minimum bounding box of the data rectangles in the corresponding subtrees.

    Only if the rectangles of two directory entries ER and ES have a common intersection can there be a pair (rectR, rectS) of intersecting data rectangles in their subtrees; otherwise the two subtrees need not be joined.
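A minimal sketch of this idea as a synchronized descent of both trees follows. It assumes, for simplicity, that both trees have the same height and that nodes fit in memory; the `Node` class, the `leaf` and `directory` helpers and the sample rectangles are hypothetical and are not taken from the slides.

```python
from dataclasses import dataclass
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (xl, yl, xu, yu)


def intersects(a: Rect, b: Rect) -> bool:
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def mbr(rects: List[Rect]) -> Rect:
    """Minimum bounding box of a non-empty list of rectangles."""
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))


@dataclass
class Node:
    is_leaf: bool
    entries: list  # (rect, child): child is a Node, or an object id in a leaf


def leaf(items: List[Tuple[Rect, str]]) -> Node:
    return Node(True, list(items))


def directory(children: List[Node]) -> Node:
    # Each directory rectangle is the MBR of the rectangles in its subtree.
    return Node(False, [(mbr([rect for rect, _ in c.entries]), c) for c in children])


def spatial_join(r: Node, s: Node, out: List[Tuple[str, str]]) -> None:
    """Descend only into pairs of entries whose rectangles intersect."""
    for rect_r, child_r in r.entries:
        for rect_s, child_s in s.entries:
            if not intersects(rect_r, rect_s):
                continue  # no intersecting data rectangles below this pair
            if r.is_leaf and s.is_leaf:
                out.append((child_r, child_s))
            else:
                spatial_join(child_r, child_s, out)


tree_r = directory([
    leaf([((0, 0, 2, 2), "r1"), ((1, 3, 3, 5), "r2")]),
    leaf([((10, 10, 12, 12), "r3")]),
])
tree_s = directory([
    leaf([((1, 1, 4, 4), "s1")]),
    leaf([((11, 11, 13, 13), "s2"), ((20, 20, 21, 21), "s3")]),
])

result: List[Tuple[str, str]] = []
spatial_join(tree_r, tree_s, result)
```

In this example only two of the four leaf pairs are visited, because the directory rectangles of the other two pairs do not intersect.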


Minimum Bounding Box


Is there any way to be more efficient?

There are two areas we need to take into account in order to be more efficient:
    CPU-time tuning
    I/O-time tuning


CPU-Time Tuning

Two ways to improve CPU time:
    Restricting the search space
    Spatial sorting and plane sweep


Restricting the search space

Idea:
    Scan through the entries of each of the two nodes and mark all entries which are required for performing the join (i.e. which intersect the intersection rectangle of the two nodes).

    Then, each marked entry of one node is tested against all marked entries of the other node.
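The marking step can be sketched as follows. The function names, the entry layout `(name, rect)` and the coordinates are assumptions chosen for illustration; the count returned alongside the pairs is the number of pairwise tests after marking.

```python
from typing import List, Optional, Tuple

Rect = Tuple[float, float, float, float]  # (xl, yl, xu, yu)
Entry = Tuple[str, Rect]                  # (name, rectangle)


def intersects(a: Rect, b: Rect) -> bool:
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def intersection(a: Rect, b: Rect) -> Optional[Rect]:
    """Common rectangle of a and b, or None if they do not intersect."""
    if not intersects(a, b):
        return None
    return (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))


def restricted_join(r_rect: Rect, r_entries: List[Entry],
                    s_rect: Rect, s_entries: List[Entry]):
    """Mark entries that reach into the common region, then test marked pairs only."""
    common = intersection(r_rect, s_rect)
    if common is None:
        return [], 0
    marked_r = [e for e in r_entries if intersects(e[1], common)]  # one scan of R
    marked_s = [e for e in s_entries if intersects(e[1], common)]  # one scan of S
    pairs = [(nr, ns) for nr, rr in marked_r for ns, rs in marked_s
             if intersects(rr, rs)]
    return pairs, len(marked_r) * len(marked_s)


# Hypothetical nodes: their rectangles overlap in the strip 8 <= x <= 10.
r_rect, s_rect = (0, 0, 10, 10), (8, 0, 18, 10)
r_entries = [("r1", (0, 0, 3, 3)), ("r2", (7, 1, 9, 4)), ("r3", (9, 6, 10, 9))]
s_entries = [("s1", (8, 2, 11, 5)), ("s2", (12, 0, 15, 3)), ("s3", (9.5, 7, 12, 10))]

pairs, tests = restricted_join(r_rect, r_entries, s_rect, s_entries)
```

With these sample nodes, marking costs 3 + 3 scans and leaves 2 * 2 = 4 pairwise tests instead of 3 * 3 = 9.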


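Restricting the search space (cont’d)

[Figure: entries 1 to 7 of a node of R and entries 1 to 7 of a node of S; after restriction, 3 entries of R and 2 entries of S remain]

Original: 7 of R * 7 of S = 49 joins

Now: 3 of R * 2 of S = 6 joins
Plus scanning: 7 of R + 7 of S = 14 times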


Spatial sorting and plane sweep

Idea:
    Sort the entries in a node of the R*-tree according to the spatial location of the corresponding rectangles.

    Then move the sweep-line perpendicular to one of the axes from left to right to compute the intersections.


Example of Sorted Intersection Test

[Figure: rectangles r1, r2, r3 and s1, s2, s3 with the sweep-line; labels r1.xu and s1.xl, with s1.xl < r1.xu]

t = r1 : r1 <--> s1
t = s1 : s1 <--> r2
t = r2 : r2 <--> s2, r2 <--> s3
t = s2 : -
t = r3 : r3 <--> s3
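The trace above can be reproduced with a short sketch of a sorted intersection test. The coordinates below are hypothetical, chosen only so that the rectangles behave like the ones in the example; the function names and the tuple layout `(name, xl, yl, xu, yu)` are likewise assumptions.

```python
from typing import List, Tuple

# A named rectangle: (name, xl, yl, xu, yu).
NamedRect = Tuple[str, float, float, float, float]


def internal_loop(t: NamedRect, start: int, seq: List[NamedRect],
                  out: List[Tuple[str, str]]) -> None:
    """Test t against entries of seq from `start` on, while they begin before t ends in x."""
    k = start
    while k < len(seq) and seq[k][1] <= t[3]:       # other.xl <= t.xu
        other = seq[k]
        if t[2] <= other[4] and other[2] <= t[4]:   # y-intervals overlap
            out.append((t[0], other[0]))
        k += 1


def sorted_intersection_test(rs: List[NamedRect],
                             ss: List[NamedRect]) -> List[Tuple[str, str]]:
    """Sort both sequences by xl and sweep from left to right."""
    rs = sorted(rs, key=lambda e: e[1])
    ss = sorted(ss, key=lambda e: e[1])
    out: List[Tuple[str, str]] = []
    i = j = 0
    while i < len(rs) and j < len(ss):
        if rs[i][1] < ss[j][1]:
            internal_loop(rs[i], j, ss, out)   # sweep-line stops at rs[i]
            i += 1
        else:
            internal_loop(ss[j], i, rs, out)   # sweep-line stops at ss[j]
            j += 1
    return out


# Hypothetical coordinates (not taken from the slide figure).
R = [("r1", 0, 6, 3, 9), ("r2", 4, 1, 10, 5), ("r3", 8, 6, 12, 9)]
S = [("s1", 1, 4, 5, 7), ("s2", 6, 0, 7, 2), ("s3", 9, 3, 11, 7)]

pairs = sorted_intersection_test(R, S)
```

Each output pair lists the rectangle at the sweep-line first, so `pairs` follows the same order as the lines of the trace; the stop at s2 produces no pair.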


I/O-Time Tuning

Achieve good I/O performance with a buffer size as small as possible
    The R*-tree might occupy only a small portion of the LRU buffer

Compute a read schedule of the pages to minimize the number of disk accesses
    Local optimization policy based on spatial locality

Idea of the read schedule:
    If a frequently used page always resides in the buffer, the number of disk accesses can be reduced considerably
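To make the effect of the buffer concrete, the following sketch counts disk accesses for a sequence of page requests under an LRU buffer. It is a simplified model: the buffer size is given in pages rather than KByte, and the request sequence is invented for the example.

```python
from collections import OrderedDict
from typing import Iterable


def count_disk_accesses(page_requests: Iterable[str], buffer_pages: int) -> int:
    """Count buffer misses for the request sequence with an LRU buffer of the given size."""
    buffer: "OrderedDict[str, bool]" = OrderedDict()
    misses = 0
    for page in page_requests:
        if page in buffer:
            buffer.move_to_end(page)        # mark as most recently used
            continue
        misses += 1                         # page must be read from disk
        if buffer_pages > 0:
            if len(buffer) >= buffer_pages:
                buffer.popitem(last=False)  # evict the least recently used page
            buffer[page] = True
    return misses


# Hypothetical request sequence in which page s1 is needed again and again.
requests = ["s1", "r1", "s1", "r2", "s1", "r3"]

no_buffer = count_disk_accesses(requests, 0)
one_page = count_disk_accesses(requests, 1)
two_pages = count_disk_accesses(requests, 2)
```

With two buffer pages the frequently used page s1 stays resident and is read only once; with one page or none it is read three times.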


Three such techniques

Local plane sweep
Local plane sweep with pinning
Local z-order


Local Plane-Sweep Order

Idea:
    Based on spatial ordering, the plane-sweep algorithm creates a sequence of pairs of intersecting rectangles.

    This sequence can be used to determine the read schedule of the spatial join.
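One way to read this, sketched below, is that the pages are requested in the order in which the pair sequence touches them. The sketch uses the pair sequence from the sorted intersection test example; treating the read schedule as the order of first appearance of each page is an assumption made for illustration, and the function names are hypothetical.

```python
from typing import List, Tuple


def page_requests(pairs: List[Tuple[str, str]]) -> List[str]:
    """Flatten the pair sequence into the sequence of page references."""
    requests: List[str] = []
    for first, second in pairs:
        requests.extend([first, second])
    return requests


def read_schedule(pairs: List[Tuple[str, str]]) -> List[str]:
    """Order in which each page is first needed (assumed meaning of 'read schedule')."""
    seen = set()
    schedule: List[str] = []
    for page in page_requests(pairs):
        if page not in seen:
            seen.add(page)
            schedule.append(page)
    return schedule


# Pairs in plane-sweep order, sweep-line rectangle first.
sweep_pairs = [("r1", "s1"), ("s1", "r2"), ("r2", "s2"), ("r2", "s3"), ("r3", "s3")]

requests = page_requests(sweep_pairs)
schedule = read_schedule(sweep_pairs)
```

Whether a repeated reference such as the second use of s1 costs a disk access depends on whether the page is still in the buffer at that point.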


Local Plane-Sweep Order (cont’d)

[Figure: rectangles s1, s2 and r1, r2, r3, r4, labelled 1 to 6 in plane-sweep order, with the resulting read schedule over these six pages]


Local Plane-Sweep Order w/ Pinning

Idea:

1. Determine a pair (Er, Es) of entries with respect to the local plane-sweep order. Compute the degree of the rectangles of both entries:
    Deg(E.rect) = number of intersections between E.rect and the rectangles which belong to entries of the other tree that are not yet processed

2. Pin the page in the buffer whose corresponding rectangle has maximal degree

3. Perform the spatial join on the pinned page with all other pages
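The three steps can be sketched on top of a list of intersecting pairs given in plane-sweep order. The function names, the tie-break (pin the entry of R when both degrees are equal) and the sample pairs are assumptions; page names are assumed to be unique across both trees.

```python
from typing import List, Tuple

Pair = Tuple[str, str]  # (page of R, page of S)


def degree(page: str, remaining: List[Pair]) -> int:
    """Number of not yet processed pairs in which the page still takes part."""
    return sum(1 for pair in remaining if page in pair)


def schedule_with_pinning(pairs: List[Pair]) -> List[Pair]:
    """Reorder the plane-sweep pair sequence by pinning the page of maximal degree."""
    remaining = list(pairs)
    processed: List[Pair] = []
    while remaining:
        er, es = remaining.pop(0)                  # step 1: next pair in sweep order
        processed.append((er, es))
        deg_r, deg_s = degree(er, remaining), degree(es, remaining)
        if max(deg_r, deg_s) == 0:
            continue                               # neither page is needed again
        pinned = er if deg_r >= deg_s else es      # step 2 (tie-break is an assumption)
        # Step 3: join the pinned page with all pages it still intersects.
        processed.extend(p for p in remaining if pinned in p)
        remaining = [p for p in remaining if pinned not in p]
    return processed


# Hypothetical pair sequence in plane-sweep order.
sweep_pairs = [("r1", "s1"), ("r2", "s1"), ("r1", "s2"), ("r3", "s1")]

order = schedule_with_pinning(sweep_pairs)
```

After the first pair, s1 has degree 2 and r1 has degree 1, so s1 is pinned and both of its remaining pairs are processed before the sweep order resumes.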


Local Plane-Sweep Order w/ Pinning (cont’d)

[Figure: rectangles s1, s2 and r1, r2, r3, r4, with the current pair (Er, Es) marked]

Er.rect = r1, Es.rect = s2

Deg(r1) = 0, Deg(s2) = 2


Local Z-Order

Idea:

1. Compute the intersections between each rectangle of one node and all rectangles of the other node

2. Sort the resulting rectangles according to the spatial location of their centers

3. For the sorting, decompose the underlying space into cells of equal size and provide an ordering on this set of cells
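A common way to order equal-sized cells is to interleave the bits of their column and row numbers. The sketch below does this and sorts rectangles by the cell of their center. The bit convention (x in the even bits, y in the odd bits), the cell size, the function names and the sample rectangles are all assumptions; coordinates must be non-negative.

```python
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (xl, yl, xu, yu)


def z_value(cx: int, cy: int, bits: int = 16) -> int:
    """Interleave the bits of the cell coordinates: x in even bits, y in odd bits."""
    z = 0
    for i in range(bits):
        z |= ((cx >> i) & 1) << (2 * i)
        z |= ((cy >> i) & 1) << (2 * i + 1)
    return z


def center_cell(rect: Rect, cell_size: float) -> Tuple[int, int]:
    """Cell (column, row) that contains the center of the rectangle."""
    center_x = (rect[0] + rect[2]) / 2
    center_y = (rect[1] + rect[3]) / 2
    return int(center_x // cell_size), int(center_y // cell_size)


def z_order_sort(items: List[Tuple[str, Rect]],
                 cell_size: float) -> List[Tuple[str, Rect]]:
    """Sort named rectangles by the z-value of the cell holding their center."""
    return sorted(items, key=lambda item: z_value(*center_cell(item[1], cell_size)))


# Hypothetical intersection rectangles in a 4 x 4 space cut into four 2 x 2 cells.
items = [("a", (2, 2, 3, 3)), ("b", (0, 0, 1, 1)),
         ("c", (0, 2, 1, 3)), ("d", (2, 0, 3, 1))]

ordered = [name for name, _ in z_order_sort(items, cell_size=2.0)]
```

With this convention the four cells are visited bottom-left, bottom-right, top-left, top-right; other bit conventions give a mirrored curve.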


Local Z-Order (cont’d)

[Figure: rectangles s1, s2 and r1, r2, r3, r4 over a space decomposed into four cells, I to IV]

Read schedule: <s1, r2, r1, s2, r4, r3>


Number of Disk Accesses

[Figure: chart of the number of disk accesses (axis from 0 to 7000) for LPS order, LPS order w/ Pinning and Z-order, with LRU buffer sizes of 0, 8, 32, 128 and 512 KByte. Labelled values: 5384, 5290, 2373, 2392]


Number of Disk Accesses (cont’d)

[Figure: chart of the number of disk accesses (axis from 0 to 6000) for Original and LPS order w/ Pinning, with LRU buffer sizes of 0, 8, 32, 128 and 512 KByte]


Q & A

That’s it for the presentation. Any questions?


Reference

1. Brinkhoff T., Kriegel H.P., Seeger B. (1993). Efficient Processing of Spatial Joins Using R-trees. Institute of Computer Science, University of Munich. Washington, DC, USA: ACM-SIGMOD.