Post on 20-Jan-2018

Storage Representations for Set-Oriented Selection Predicates

Karthikeyan Ramasamy

with

Jeffrey F. Naughton and David Maier

Object Relational DBMS

• OR-DBMS are gaining acceptance
• Market for OR-DBMS is growing
• Many vendors are working on a version of OR-DBMS
• Main features of OR-DBMS
  – Type extensibility
  – Collections

Set Valued Attributes

• Many semantic notions of the real world can be described by sets (e.g., a set of courses, a set of products)

• Set-valued attributes provide conciseness and ease of expression

Classification of Representations

            Internal   External
Nested      Yes        Yes
Unnested    No         Yes

Nested Internal Representation

• Stored at the end of the tuple
• Requires support for handling large tuples
• Retrieval cost of a tuple increases
• Updates could reorganize the whole tuple
• Might do well when the size of the set is small

Nested Internal Representation

[Figure: a tuple with attributes A1, A2, A3, followed by the set stored inline with Length and Cardinality fields and Element 1, Element 2, ..., Element N]
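The inline layout on this slide can be pictured as a byte encoding. The sketch below is only an illustration: the field widths (4-byte little-endian integers for length and cardinality, a 2-byte size per element), the order of the two header fields, and the helper names `pack_set` and `unpack_set` are assumptions, not details taken from Paradise.

```python
import struct

def pack_set(elements):
    """Encode a set as: payload length, cardinality, then the elements.

    Each element is a 2-byte size followed by its UTF-8 bytes.
    All widths are illustrative assumptions.
    """
    payload = b""
    for e in sorted(elements):
        data = e.encode("utf-8")
        payload += struct.pack("<H", len(data)) + data
    return struct.pack("<II", len(payload), len(elements)) + payload

def unpack_set(buf):
    """Decode the bytes produced by pack_set back into a Python set."""
    length, cardinality = struct.unpack_from("<II", buf, 0)
    offset = 8
    out = set()
    for _ in range(cardinality):
        (size,) = struct.unpack_from("<H", buf, offset)
        offset += 2
        out.add(buf[offset:offset + size].decode("utf-8"))
        offset += size
    assert offset == 8 + length  # payload length must match the header
    return out

# The fixed attributes (A1, A2, A3) come first; the set goes at the end.
fixed = b"A1|A2|A3|"
tuple_bytes = fixed + pack_set({"movieA1", "movieA2"})
```

Because the set travels with the tuple, every read of the tuple also reads these bytes, which is one way to see why retrieval cost grows with set size.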

Unnested External Representation

• Set-valued attributes are stored separately in an auxiliary relation

• Set instances are unnested and each element stored as a tuple

• Uses a key/foreign-key relationship to connect a tuple with its set elements

• Requires a join to assemble the elements

Unnested External Representation

• Example
  – Moviegoer(name, street, city, state, zip, {movies})
• Base Relation
  – Moviegoer-Base(name, street, city, state, zip, id)
• Set Relation
  – Moviegoer-Set(id, movie-name)
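As a rough illustration of this decomposition, the sketch below splits rows carrying a set-valued attribute into a base relation and a set relation with one tuple per element. The function name `unnest`, the sequential id assignment, and the sample rows are hypothetical.

```python
def unnest(moviegoers):
    """Split rows with a set-valued attribute into a base and a set relation.

    moviegoers: list of (name, street, city, state, zip, movies), movies a set.
    Returns (base_rows, set_rows); ids are assigned sequentially from 1.
    """
    base_rows, set_rows = [], []
    for id_, (name, street, city, state, zip_, movies) in enumerate(moviegoers, start=1):
        base_rows.append((name, street, city, state, zip_, id_))
        for movie in sorted(movies):
            set_rows.append((id_, movie))  # one tuple per set element
    return base_rows, set_rows

rows = [
    ("Ann", "1 Main St", "Madison", "WI", "53703", {"movieA1", "movieA2"}),
    ("Bob", "2 Oak Ave", "Portland", "OR", "97201", {"movieA2"}),
]
base, set_rel = unnest(rows)
# set_rel → [(1, "movieA1"), (1, "movieA2"), (2, "movieA2")]
```

The set relation has as many tuples as there are set elements in total, and each one repeats the id.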

Nested External Representation

• Set valued attributes are stored in an auxiliary relation
• Set instances are nested in auxiliary relation
• Uses key - foreign key
• Number of tuples is the same as base relation
• Resorts to join

Nested External Representation

• Example
  – Moviegoer(name, street, city, state, zip, {movies})
• Base Relation
  – Moviegoer-Base(name, street, city, state, zip, id)
• Set Relation
  – Moviegoer-Set(id, {movies})
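A small sketch of the nested external layout, assuming in-memory lists stand in for the two relations: each set stays whole in the auxiliary relation, so it has exactly one tuple per base tuple, and a join on id reassembles the original row. The names `split_nested` and `assemble` are hypothetical.

```python
def split_nested(moviegoers):
    """Move the set-valued attribute to an auxiliary relation, kept nested."""
    base_rows, set_rows = [], []
    for id_, (*attrs, movies) in enumerate(moviegoers, start=1):
        base_rows.append((*attrs, id_))
        set_rows.append((id_, frozenset(movies)))  # one tuple per set instance
    return base_rows, set_rows

def assemble(base_rows, set_rows):
    """Join the two relations on id to rebuild the original rows."""
    sets_by_id = dict(set_rows)
    return [(*row[:-1], set(sets_by_id[row[-1]])) for row in base_rows]

rows = [
    ("Ann", "1 Main St", "Madison", "WI", "53703", {"movieA1", "movieA2"}),
    ("Bob", "2 Oak Ave", "Portland", "OR", "97201", {"movieA2"}),
]
base, nested = split_nested(rows)
```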

Indexed Variants

• Augmentation with Indexes
• Nested Representations
  – Unnested and unclustered Index
• Unnested Representations
  – Clustered Index
  – Unclustered Index
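One way to picture an index over the elements of nested sets is a map from each element to the ids of the tuples whose set contains it. The sketch below uses a Python dict in place of a disk-based index, which is an assumption made for illustration; the function names are hypothetical and nothing here reflects how Paradise implements its indexes.

```python
from collections import defaultdict

def build_element_index(set_rows):
    """Map each set element to the ids of the tuples whose set contains it."""
    index = defaultdict(set)
    for id_, elements in set_rows:
        for e in elements:
            index[e].add(id_)
    return index

def ids_containing_all(index, wanted):
    """Ids whose set contains every wanted element (conjunctive predicate).

    Simplification: an empty `wanted` returns no ids.
    """
    postings = [index.get(e, set()) for e in wanted]
    return set.intersection(*postings) if postings else set()

def ids_containing_any(index, wanted):
    """Ids whose set contains at least one wanted element (disjunctive predicate)."""
    result = set()
    for e in wanted:
        result |= index.get(e, set())
    return result

nested = [(1, {"a", "b"}), (2, {"b"}), (3, {"c"})]
idx = build_element_index(nested)
```

With such a map, a selective predicate touches only the tuples that can qualify instead of scanning every set.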

Performance - Settings

• Implementation in Paradise - Set ADT
• Intel Pentium 333 MHz - Solaris 2.6
• Main memory - 128 MB
• Buffer pool size - 32 MB
• Used raw disks of size 4 GB
• Each experiment was run against a cold database

Performance - Experimental Schema

Moviegoer(name, street, city, state, zipcode, {movies})

– Average tuple size: 68 bytes
– Number of base relation tuples: 10,000
– Number of set elements: 1,000,000
– Set element size: 20 bytes
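A quick arithmetic check of these figures follows. It assumes the elements are spread evenly over the base tuples and that the 68-byte tuple size excludes the set; it also ignores keys, headers, and page overhead, so the totals are lower bounds rather than measured sizes.

```python
base_tuples = 10_000
set_elements = 1_000_000
element_size = 20      # bytes per set element
base_tuple_size = 68   # bytes, assumed to exclude the set

avg_cardinality = set_elements // base_tuples   # elements per set
set_bytes = set_elements * element_size         # raw set data
base_bytes = base_tuples * base_tuple_size      # raw base data

print(avg_cardinality, set_bytes, base_bytes)
# prints: 100 20000000 680000
```

Under these assumptions the raw set data is about 20 million bytes, far larger than the base data and of the same order as the 32 MB buffer pool.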

Performance - Queries

• Queries run
  – Conjunctive queries
  – Disjunctive queries
  – Queries not referring to the set-valued attribute
• Sets are not in the result
• Sets in the result

Performance - Parameters Varied

• Cardinality
• Selectivity of the predicate
• Number of elements in the predicate
• Size of each set element

Conjunctive Queries

SELECT m.name, m.street, m.city, m.state, m.zipcode
FROM Moviegoer m
WHERE { "movieA50061", "movieA50062" } SUBSET OF m.movies

No Set in the Result

SELECT m.name, m.street, m.city, m.state, m.zipcode, m.movies
FROM Moviegoer m
WHERE { "movieA50061", "movieA50062" } SUBSET OF m.movies

Set in the Result
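To make the semantics of the conjunctive predicate concrete, here is a minimal in-memory sketch, assuming each row carries its movies as a Python set. The function name `conjunctive` and the sample rows are hypothetical; `include_set` switches between the two result shapes.

```python
def conjunctive(moviegoers, wanted, include_set=False):
    """Return rows whose movies set contains every element of `wanted`."""
    wanted = set(wanted)
    result = []
    for name, street, city, state, zipcode, movies in moviegoers:
        if wanted <= movies:  # { ... } SUBSET OF m.movies
            row = (name, street, city, state, zipcode)
            result.append(row + (movies,) if include_set else row)
    return result

rows = [
    ("Ann", "1 Main St", "Madison", "WI", "53703", {"movieA50061", "movieA50062", "movieA7"}),
    ("Bob", "2 Oak Ave", "Portland", "OR", "97201", {"movieA50061"}),
]
no_set = conjunctive(rows, ["movieA50061", "movieA50062"])
with_set = conjunctive(rows, ["movieA50061", "movieA50062"], include_set=True)
```

Only Ann qualifies here, since Bob's set is missing one of the two requested movies.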

Disjunctive Queries

SELECT m.name, m.street, m.city, m.state, m.zipcode
FROM Moviegoer m
WHERE "movieA50061" IN m.movies OR "movieA50062" IN m.movies

No Set in the Result

SELECT m.name, m.street, m.city, m.state, m.zipcode, m.movies
FROM Moviegoer m
WHERE "movieA50061" IN m.movies OR "movieA50062" IN m.movies

Set in the Result
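The disjunctive predicate can be sketched the same way, again assuming rows that carry their movies as a Python set: a row qualifies when its set shares at least one element with the requested movies. The function name `disjunctive` and the sample rows are hypothetical.

```python
def disjunctive(moviegoers, wanted, include_set=False):
    """Return rows whose movies set contains at least one element of `wanted`."""
    wanted = set(wanted)
    result = []
    for name, street, city, state, zipcode, movies in moviegoers:
        if not wanted.isdisjoint(movies):  # x IN m.movies OR y IN m.movies
            row = (name, street, city, state, zipcode)
            result.append(row + (movies,) if include_set else row)
    return result

rows = [
    ("Ann", "1 Main St", "Madison", "WI", "53703", {"movieA50061", "movieA50062"}),
    ("Bob", "2 Oak Ave", "Portland", "OR", "97201", {"movieA50061"}),
    ("Cy", "3 Elm Rd", "Austin", "TX", "73301", {"movieA9"}),
]
matches = disjunctive(rows, ["movieA50061", "movieA50062"])
```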

No Set in Result - Varying Cardinality

[Figure: response time (sec) versus cardinality (0 to 120) for Nested Internal, Indexed Nested Internal, Nested External, Indexed Nested External, Unnested External, and Indexed Unnested External. Selectivity of 1% for six-element predicate query.]

No Set in Result - Varying Selectivity

[Figure: response time (sec) versus selectivity (0.01, 0.1, 1, 10, 25, 50 %) for Nested Internal, Indexed Nested Internal, Nested External, Indexed Nested External, Unnested External, and Indexed Unnested External. Six-element predicate query with cardinality of 100.]

No Set in Result - Varying Number of Elements in Predicate

[Figure: response time (sec) versus number of elements in the predicate (1, 2, 4, 6) for Nested Internal, Indexed Nested Internal, Nested External, Indexed Nested External, Unnested External, and Indexed Unnested External. Selectivity of 1% with cardinality of 100.]

No Set in Result - Varying Size of Set Element

[Figure: response time (sec) versus size of set element (11, 20, 30) for Nested Internal, Indexed Nested Internal, Nested External, Indexed Nested External, Unnested External, and Indexed Unnested External. Selectivity of 1% with cardinality of 100.]

Queries - Not Referring Set Valued Attribute

SELECT m1.name, m1.street, m1.city, m1.state, m1.zipcode
FROM Moviegoer m1, Moviegoer m2
WHERE m1.id = m2.id

Join Query

SELECT m.name, m.street, m.city, m.state, m.zipcode
FROM Moviegoer m

Select Query

Select Query

[Figure: response time (sec) versus cardinality (10, 25, 50, 100) for Unnested External, Nested External, and Nested Internal.]

Conclusions and Future Work

• Nested representations perform better for set-oriented selection predicates

• Indexes on nested representations are more effective than indexes on unnested representations

• Evaluation of these representations for nested set joins

• Specialized operators for nested representations

Unnested External Representation

• Ability to handle any cardinality
• Easily slides into existing relational engine
• Set operations might be inefficient since elements are scattered
• Keys add overhead when set elements are small
• Cardinality Explosion

No Set in Result - Cost Breakdown

[Figure: two panels, Nested Internal and Unnested External, each showing response time (sec) versus cardinality (10, 25, 50, 100), broken down into I/O cost, buffer pool cost, predicate evaluation cost, and other system cost. Selectivity of 1% for six-element predicate query.]

Conjunctive Queries - Unnested External

SELECT mb.set-id, mb.name, mb.street, mb.city, mb.state, mb.zipcode
FROM Moviegoer-Base mb, Moviegoer-Set ms
WHERE mb.set-id = ms.set-id AND
      (ms.movie-name = "movieA50061" OR ms.movie-name = "movieA50062")
GROUP BY mb.set-id, mb.name, mb.street, mb.city, mb.state, mb.zipcode
HAVING count(*) = 2

No Set in the Result
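The counting formulation can be tried on a toy database. The sketch below uses SQLite through Python's `sqlite3` module rather than Paradise; the underscored identifiers (`Moviegoer_Base`, `set_id`, `movie_name`), the reduced column list, and the sample data are assumptions made so the query runs. It relies on `Moviegoer_Set` holding no duplicate (id, movie) pairs, otherwise the count could reach 2 from a single movie.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Moviegoer_Base (set_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Moviegoer_Set (set_id INTEGER, movie_name TEXT);
    INSERT INTO Moviegoer_Base VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO Moviegoer_Set VALUES
        (1, 'movieA50061'), (1, 'movieA50062'), (1, 'movieA7'),
        (2, 'movieA50061'),
        (3, 'movieA50062'), (3, 'movieA9');
""")

# Keep only groups in which both requested movies were found.
conjunctive_rows = conn.execute("""
    SELECT mb.set_id, mb.name
    FROM Moviegoer_Base mb, Moviegoer_Set ms
    WHERE mb.set_id = ms.set_id
      AND (ms.movie_name = 'movieA50061' OR ms.movie_name = 'movieA50062')
    GROUP BY mb.set_id, mb.name
    HAVING count(*) = 2
""").fetchall()
print(conjunctive_rows)  # prints: [(1, 'Ann')]
```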

Conjunctive Queries - Unnested External

SELECT mb.set-id, mb.name, mb.street, mb.city, mb.state, mb.zipcode
FROM Moviegoer-Base mb, Moviegoer-Set ms1, Moviegoer-Set ms2
WHERE mb.set-id = ms1.set-id AND
      mb.set-id = ms2.set-id AND
      ms1.movie-name = "movieA50061" AND
      ms2.movie-name = "movieA50062"

No Set in the Result
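The self-join formulation joins the set relation once per predicate element. The sketch below runs it on SQLite via Python's `sqlite3` module; the underscored identifiers, the reduced column list, and the sample data are assumptions for illustration, not the experimental setup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Moviegoer_Base (set_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Moviegoer_Set (set_id INTEGER, movie_name TEXT);
    INSERT INTO Moviegoer_Base VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO Moviegoer_Set VALUES
        (1, 'movieA50061'), (1, 'movieA50062'), (1, 'movieA7'),
        (2, 'movieA50061'),
        (3, 'movieA50062'), (3, 'movieA9');
""")

# One alias of the set relation per element of the predicate.
self_join_rows = conn.execute("""
    SELECT mb.set_id, mb.name
    FROM Moviegoer_Base mb, Moviegoer_Set ms1, Moviegoer_Set ms2
    WHERE mb.set_id = ms1.set_id
      AND mb.set_id = ms2.set_id
      AND ms1.movie_name = 'movieA50061'
      AND ms2.movie_name = 'movieA50062'
""").fetchall()
print(self_join_rows)  # prints: [(1, 'Ann')]
```

A six-element predicate would need six aliases in this style, so the number of joins grows with the predicate.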

Conjunctive Queries - Unnested External

INSERT INTO temp
SELECT DISTINCT mb.set-id, mb.name, mb.street, mb.city, mb.state, mb.zipcode
FROM Moviegoer-Base mb, Moviegoer-Set ms
WHERE mb.set-id = ms.set-id AND
      (ms.movie-name = "movieA50061" OR ms.movie-name = "movieA50062")
GROUP BY mb.set-id, mb.name, mb.street, mb.city, mb.state, mb.zipcode
HAVING count(*) = 2

SELECT t.name, t.street, t.city, t.state, t.zipcode, ms.movie-name
FROM temp t, Moviegoer-Set ms
WHERE t.set-id = ms.set-id

Set in the Result
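The two-step plan (materialize the qualifying tuples, then join back to fetch their full sets) can be sketched on SQLite via Python's `sqlite3` module. The intermediate table is named `qualifying` here instead of `temp`, and the underscored identifiers, reduced column list, and sample data are likewise assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Moviegoer_Base (set_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Moviegoer_Set (set_id INTEGER, movie_name TEXT);
    INSERT INTO Moviegoer_Base VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO Moviegoer_Set VALUES
        (1, 'movieA50061'), (1, 'movieA50062'), (1, 'movieA7'),
        (2, 'movieA50061'),
        (3, 'movieA50062'), (3, 'movieA9');
""")

# Step 1: materialize the tuples that satisfy the conjunctive predicate.
conn.execute("""
    CREATE TABLE qualifying AS
    SELECT mb.set_id, mb.name
    FROM Moviegoer_Base mb, Moviegoer_Set ms
    WHERE mb.set_id = ms.set_id
      AND (ms.movie_name = 'movieA50061' OR ms.movie_name = 'movieA50062')
    GROUP BY mb.set_id, mb.name
    HAVING count(*) = 2
""")

# Step 2: join back to the set relation to return every element of each set.
with_sets = conn.execute("""
    SELECT t.name, ms.movie_name
    FROM qualifying t, Moviegoer_Set ms
    WHERE t.set_id = ms.set_id
    ORDER BY ms.movie_name
""").fetchall()
print(with_sets)
# prints: [('Ann', 'movieA50061'), ('Ann', 'movieA50062'), ('Ann', 'movieA7')]
```

The result repeats the base attributes once per set element, so the caller still has to regroup the rows to obtain one set per moviegoer.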

Disjunctive Queries - Unnested External

SELECT DISTINCT mb.set-id, mb.name, mb.street, mb.city, mb.state, mb.zipcode
FROM Moviegoer-Base mb, Moviegoer-Set ms
WHERE mb.set-id = ms.set-id AND
      (ms.movie-name = "movieA50061" OR ms.movie-name = "movieA50062")

No Set in the Result
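The role of DISTINCT in the disjunctive query shows up on a toy database: a moviegoer who has seen both movies matches twice in the join. The sketch below uses SQLite via Python's `sqlite3` module; the underscored identifiers, the reduced column list, and the sample data are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Moviegoer_Base (set_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Moviegoer_Set (set_id INTEGER, movie_name TEXT);
    INSERT INTO Moviegoer_Base VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy'), (4, 'Dee');
    INSERT INTO Moviegoer_Set VALUES
        (1, 'movieA50061'), (1, 'movieA50062'),
        (2, 'movieA50061'),
        (3, 'movieA50062'), (3, 'movieA9'),
        (4, 'movieA9');
""")

predicate = """
    FROM Moviegoer_Base mb, Moviegoer_Set ms
    WHERE mb.set_id = ms.set_id
      AND (ms.movie_name = 'movieA50061' OR ms.movie_name = 'movieA50062')
"""

# With DISTINCT each qualifying moviegoer appears once.
distinct_rows = conn.execute(
    "SELECT DISTINCT mb.set_id, mb.name" + predicate + "ORDER BY mb.set_id"
).fetchall()

# Without DISTINCT, Ann is returned once per matching movie.
raw_count = conn.execute("SELECT count(*)" + predicate).fetchone()[0]

print(distinct_rows)  # prints: [(1, 'Ann'), (2, 'Bob'), (3, 'Cy')]
print(raw_count)      # prints: 4
```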

Disjunctive Queries - Unnested External

SELECT DISTINCT mb.set-id, mb.name, mb.street, mb.city, mb.state, mb.zipcode, ms2.movie-name
FROM Moviegoer-Base mb, Moviegoer-Set ms1, Moviegoer-Set ms2
WHERE mb.set-id = ms1.set-id AND
      ms1.set-id = ms2.set-id AND
      (ms1.movie-name = "movieA50061" OR ms1.movie-name = "movieA50062")

Set in the Result
