1 query evaluation partially using prof. hector garcia-molina’s slides (notes06, notes07)...

Post on 01-Apr-2015


1

Query Evaluation

Partially using Prof. Hector Garcia-Molina’s slides (Notes06, Notes07)
http://www-db.stanford.edu/~ullman/dscb.html

Donghui Zhang
Northeastern University

2

Query Evaluation

SQL Query → Server → Query Result

SQL Query:
SELECT E.Name FROM Emp E WHERE E.SSN<5000 AND E.Age>50

Server: ???
• Check the data and meta data;
• Produce query result

Query Result:
Michael Jordan
Donghui Zhang

3

Query Evaluation Steps

• Query Compiling: get logical Q.P.
• Query Optimization: choose a physical Q.P.
• Query Execution: execute

4

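[Figure: query evaluation pipeline]

SQL query
→ parse → parse tree
→ convert → logical query plan
→ apply laws → “improved” l.q.p
→ estimate result sizes (using statistics) → l.q.p. + sizes
→ consider physical plans → {P1, P2, …}
→ estimate costs → {(P1,C1), (P2,C2), …}
→ pick best → Pi
→ execute → answer

Phases:
• query compiling: parse, convert, apply laws
• query optimization: estimate result sizes, consider physical plans, estimate costs, pick best
• query execution: execute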

5

Query Compiling Parse

• Background knowledge: Grammar.
• Input: SQL query.
• Output: a parse tree.

• Start with a simple grammar:
– Only SFW (no group by, having, nested query)
– Simple AND condition (no OR, UNION, EXISTS, IN, …)
– One table (no conditions like E.did=D.did)

SELECT E.Name FROM Emp E WHERE E.SSN<5000 AND E.Age>50

6

Query Compiling Parse Grammar

• <SFW> := SELECT <SelList> FROM <Table> WHERE <CondList>
• <SelList> := <Attribute> | <Attribute>, <SelList>
• <CondList> := <Condition> | <Condition> AND <CondList>
• <Condition> := <Attribute> <op> <value>
• <op> := > | < | = | >= | <=

SELECT E.Name FROM Emp E WHERE E.SSN<5000 AND E.Age>50
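As a rough illustration of how this grammar can drive a parser, the sketch below implements a small recursive-descent parser in Python, with one helper per nonterminal. The names `tokenize` and `parse_sfw`, the nested-tuple tree shape, and the treatment of the table alias (`Emp E`) as part of <Table> are assumptions made for this example; they do not come from the slides.

```python
import re

# Tokens: comparison operators, commas, identifiers such as E.SSN, integers.
TOKEN = re.compile(r"\s*(>=|<=|[<>=,]|[A-Za-z_][\w.]*|\d+)")
OPS = {">", "<", "=", ">=", "<="}

def tokenize(sql):
    sql, pos, tokens = sql.strip(), 0, []
    while pos < len(sql):
        m = TOKEN.match(sql, pos)
        if m is None:
            raise SyntaxError(f"unexpected input at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

def parse_sfw(sql):
    """Parse a single-table SFW query into a nested-tuple parse tree."""
    toks, i = tokenize(sql), 0

    def peek():
        return toks[i] if i < len(toks) else None

    def take():
        nonlocal i
        if i >= len(toks):
            raise SyntaxError("unexpected end of query")
        i += 1
        return toks[i - 1]

    def expect(keyword):
        if take().upper() != keyword:
            raise SyntaxError(f"expected {keyword}")

    def sel_list():      # <SelList> := <Attribute> | <Attribute>, <SelList>
        attr = ("Attribute", take())
        if peek() == ",":
            take()
            return ("SelList", attr, sel_list())
        return ("SelList", attr)

    def condition():     # <Condition> := <Attribute> <op> <value>
        attr, op, value = take(), take(), take()
        if op not in OPS:
            raise SyntaxError(f"bad operator {op}")
        return ("Condition", ("Attribute", attr), ("op", op), ("value", value))

    def cond_list():     # <CondList> := <Condition> | <Condition> AND <CondList>
        cond = condition()
        if peek() is not None and peek().upper() == "AND":
            take()
            return ("CondList", cond, cond_list())
        return ("CondList", cond)

    expect("SELECT")
    sel = sel_list()
    expect("FROM")
    table = [take()]
    if peek() is not None and peek().upper() != "WHERE":
        table.append(take())   # optional alias, e.g. "Emp E" (assumption)
    expect("WHERE")
    conds = cond_list()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()}")
    return ("SFW", sel, ("Table", " ".join(table)), conds)

tree = parse_sfw("SELECT E.Name FROM Emp E WHERE E.SSN<5000 AND E.Age>50")
```

The right-recursive helpers mirror the right-recursive grammar rules, so the resulting tuples nest the same way as the parse tree for the example query.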

7

SELECT E.Name FROM Emp E WHERE E.SSN<5000 AND E.Age>50

Query Compiling Parse Parse Tree

<SFW>
  SELECT
  <SelList>
    <Attribute>: E.Name
  FROM
  <Table>: Emp E
  WHERE
  <CondList>
    <Condition>
      <Attribute>: E.SSN
      <op>: <
      <value>: 5000
    AND
    <CondList>
      <Condition>
        <Attribute>: E.Age
        <op>: >
        <value>: 50

8

Query Compiling Convert

• Input: a parse tree.
• Output: a logical query plan.

• Algorithm: σ followed by π.
π_{E.Name}(σ_{E.SSN<5000 AND E.Age>50}(E))

• Alternatively, a l.q.p tree:

π E.Name
  σ E.SSN<5000 AND E.Age>50
    Emp E

SELECT E.Name FROM Emp E WHERE E.SSN<5000 AND E.Age>50

9

Query Compiling Apply Laws

• Replace × with ⋈, push σ [and π] down.

• Only used for multiple tables. So skip.

SELECT E.Name FROM Emp E WHERE E.SSN<5000 AND E.Age>50

π E.Name
  σ E.SSN<5000 AND E.Age>50
    Emp E


11

Query Optimization Estimate Result Sizes

• The size of each input table is stored as meta data.

• Intermediate result: size not known, but needed to estimate the I/O cost of a physical plan.

• But for the simple case, π can be evaluated on the fly. So no need to estimate the size of σ(E). So skip.

SELECT E.Name FROM Emp E WHERE E.SSN<5000 AND E.Age>50

12

Query Optimization Consider Physical Plans

• Associate each RA operator with an implementation scheme.

• Multiple implementation schemes? Enumerate all.

SELECT E.Name FROM Emp E WHERE E.SSN<5000 AND E.Age>50

Plan 1 (always works!)

π E.Name
  σ E.SSN<5000 AND E.Age>50
    Emp E

– Emp E: scan
– σ, π: on-the-fly

13

Query Optimization Consider Physical Plans

• For the other physical plans, need to know what indices exist.

• Primary index: controls the actual storage of a table.
– Suppose a primary B+-tree index exists on SSN.

• Secondary index: built on some other attribute. Does not store the actual record. Each leaf entry stores a set of page IDs in the primary index.
– Suppose a secondary B+-tree index exists on Age.

SELECT E.Name FROM Emp E WHERE E.SSN<5000 AND E.Age>50

e.g. entry in Age index: Age=50, pageIDs={1, 4, 6}

[Figure: SSN index with leaf pages 1 to 6]

14

Query Optimization Consider Physical Plans

SELECT E.Name FROM Emp E WHERE E.SSN<5000 AND E.Age>50

Plan 2

π E.Name
  σ E.SSN<5000 AND E.Age>50
    Emp E

– Emp E: range search in SSN index
– σ, π: on-the-fly

15

Query Optimization Consider Physical Plans

SELECT E.Name FROM Emp E WHERE E.SSN<5000 AND E.Age>50

Plan 3

π E.Name
  σ E.SSN<5000 AND E.Age>50
    Emp E

– Emp E: range search in Age index, follow pointers to SSN index
– σ, π: on-the-fly

16

Query Optimization Estimate Costs

• Estimate #I/Os for each physical plan.
• Pick the cheapest one.

• Input: physical plan.
• Additional input:
– meta data (e.g. how many levels a B+-tree has)
– assumptions (e.g. the root node of every B+-tree is pinned)
– memory buffer size.

SELECT E.Name FROM Emp E WHERE E.SSN<5000 AND E.Age>50

17

Query Optimization Estimate Costs Meta Data

• All the database tables.
• For each table R:
– Schema
– T(R): #records in R
– For every attribute A:
  • V(R, A): #distinct values of A
  • min(R, A): minimum value of A
  • max(R, A): maximum value of A
– Primary index: #levels, #leaf nodes.
– Secondary index: #levels, #leaf nodes, average #pageIDs per leaf entry.

SELECT E.Name FROM Emp E WHERE E.SSN<5000 AND E.Age>50

18

Query Optimization Estimate Costs sample input

• Assume for table E:
– Schema = (SSN: int, Name: string, Age: int, Salary: int)
– T(E) = 100 tuples.
– For attribute SSN:
  • V(E, SSN)=100, min(E, SSN)=0000, max(E, SSN)=9999
– For attribute Age:
  • V(E, Age)=20, min(E, Age)=21, max(E, Age)=60
– Primary index on SSN: 3 level B+-tree, 50 leaf nodes.
– Secondary index on Age: 2 level B+-tree, 10 leaf nodes, every leaf entry points to 3.5 pageIDs (on average).

• Assumptions: all B+-tree roots are pinned. Can reach the first leaf page of a B+-tree directly.

• Memory buffer size: 2 pages.

SELECT E.Name FROM Emp E WHERE E.SSN<5000 AND E.Age>50

19

Query Optimization Estimate Costs

SELECT E.Name FROM Emp E WHERE E.SSN<5000 AND E.Age>50

Plan 1 (always works!)

π E.Name
  σ E.SSN<5000 AND E.Age>50
    Emp E

– Emp E: scan
– σ, π: on-the-fly

• Cost = 50. (The primary index has 50 leaf nodes. Assume we can reach the first leaf page of a B+-tree directly.)

20

Query Optimization Estimate Costs

SELECT E.Name FROM Emp E WHERE E.SSN<5000 AND E.Age>50

Plan 2

π E.Name
  σ E.SSN<5000 AND E.Age>50
    Emp E

– Emp E: range search in SSN index
– σ, π: on-the-fly

• Cost = 25. SSN<5000 selects half of the employees, so 50/2=25 leaf nodes.

• Note: if the condition is E.SSN>5000, it needs 1 more I/O.

21

Query Optimization Estimate Costs

SELECT E.Name FROM Emp E WHERE E.SSN<5000 AND E.Age>50

Plan 3

π E.Name
  σ E.SSN<5000 AND E.Age>50
    Emp E

– Emp E: range search in Age index, follow pointers to SSN index
– σ, π: on-the-fly

• Cost = ⌈10/4⌉ + ⌈20/4 × 3.5⌉ = 3 + 18 = 21.
– First term: #I/Os in the Age index.
– Second term: #I/Os in the SSN index.

22

Query Optimization Estimate Costs

SELECT E.Name FROM Emp E WHERE E.SSN<5000 AND E.Age>50

Plan 3

π E.Name
  σ E.SSN<5000 AND E.Age>50
    Emp E

– Emp E: range search in Age index, follow pointers to SSN index
– σ, π: on-the-fly

• Cost = ⌈10/4⌉ + ⌈20/4 × 3.5⌉ = 3 + 18 = 21.

First term: the Age index has 10 leaf nodes. Check 1/4 of them, since [51,60] is 1/4 of [21,60].

23

Query Optimization Estimate Costs

SELECT E.Name FROM Emp E WHERE E.SSN<5000 AND E.Age>50

Plan 3

π E.Name
  σ E.SSN<5000 AND E.Age>50
    Emp E

– Emp E: range search in Age index, follow pointers to SSN index
– σ, π: on-the-fly

• Cost = ⌈10/4⌉ + ⌈20/4 × 3.5⌉ = 3 + 18 = 21.

Second term: 20 distinct ages divided by 4 to get #ages in [51,60], times 3.5 (#pageIDs per leaf entry) to get #I/Os in the SSN index.

24

Query Optimization Pick Best

SELECT E.Name FROM Emp E WHERE E.SSN<5000 AND E.Age>50

physical plan                     I/O cost
Plan 1: scan                      50
Plan 2: range search SSN index    25
Plan 3: range search Age index    21   ← Pick!
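The three cost estimates and the final choice can be reproduced with a few lines of Python. This is only a sketch under stated assumptions: the constant and function names are invented for the example, values are assumed uniformly distributed over their [min, max] domain, and fractional I/O counts are rounded up, which is what makes Plan 3 come out at 21.

```python
import math

# Statistics copied from the sample input for table E.
PRIMARY_LEAVES = 50        # leaf nodes of the primary index on SSN
AGE_LEAVES = 10            # leaf nodes of the secondary index on Age
V_AGE = 20                 # distinct Age values
PAGE_IDS_PER_ENTRY = 3.5   # average pageIDs per Age leaf entry

def fraction_less_than(value, lo, hi):
    """Fraction of the integer domain [lo, hi] that is < value (uniformity assumed)."""
    return (value - lo) / (hi - lo + 1)

def fraction_greater_than(value, lo, hi):
    """Fraction of the integer domain [lo, hi] that is > value (uniformity assumed)."""
    return (hi - value) / (hi - lo + 1)

sel_ssn = fraction_less_than(5000, 0, 9999)    # SSN < 5000
sel_age = fraction_greater_than(50, 21, 60)    # Age > 50

costs = {
    "Plan 1: scan": PRIMARY_LEAVES,
    "Plan 2: range search SSN index": math.ceil(PRIMARY_LEAVES * sel_ssn),
    # Age-index leaves visited, plus SSN-index pages reached through pageIDs.
    "Plan 3: range search Age index": (
        math.ceil(AGE_LEAVES * sel_age)
        + math.ceil(V_AGE * sel_age * PAGE_IDS_PER_ENTRY)
    ),
}

best = min(costs, key=costs.get)
print(costs, "->", best)
```

Keeping the costs in a dictionary keyed by plan name makes the "pick best" step a single `min` call.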


26

Another case study: two tables.

• Extended grammar:
– Only SFW (no group by, having, nested query)
– Simple AND condition (no OR, UNION, EXISTS, IN, …)
– Allow two tables (allow conditions like E.did=D.did)

• Example query:
SELECT E.Name, D.Dname
FROM Emp E, Dept D
WHERE E.Did=D.Did AND E.SSN<5000 AND D.budget=1000

27

Query Compiling Parse Grammar

• <SFW> := SELECT <SelList> FROM <TableList> WHERE <CondList>
• <SelList> := <Attribute> | <Attribute>, <SelList>
• <TableList> := <Table> | <Table>, <Table>
• <CondList> := <Condition> | <Condition> AND <CondList>
• <Condition> := <Attribute> <op> <value> | <Attribute> = <Attribute>
• <op> := > | < | = | >= | <=

SELECT E.Name, D.Dname FROM Emp E, Dept D WHERE E.Did=D.Did AND E.SSN<5000 AND D.budget=1000

28

Query Compiling Parse Parse Tree

SELECT E.Name, D.Dname FROM Emp E, Dept D WHERE E.Did=D.Did AND E.SSN<5000 AND D.budget=1000

<SFW>
  SELECT <SelList> FROM <TableList> WHERE <CondList>

<SelList>
  <Attribute>: E.Name
  ,
  <SelList>
    <Attribute>: D.Dname

29

Query Compiling Parse Parse Tree

SELECT E.Name, D.Dname FROM Emp E, Dept D WHERE E.Did=D.Did AND E.SSN<5000 AND D.budget=1000

<SFW>
  SELECT <SelList> FROM <TableList> WHERE <CondList>

<TableList>
  <Table>: Emp E
  ,
  <Table>: Dept D

30

Query Compiling Parse Parse Tree

SELECT E.Name, D.Dname FROM Emp E, Dept D WHERE E.Did=D.Did AND E.SSN<5000 AND D.budget=1000

<SFW>
  SELECT <SelList> FROM <TableList> WHERE <CondList>

<CondList>
  <Condition>
    <Attribute>: E.Did
    =
    <Attribute>: D.Did
  AND
  <CondList>
    <Condition>
    AND
    <CondList>
      <Condition>

31

Query Compiling Convert

• Algorithm: × then σ then π.

π_{E.Name, D.Dname}(σ_{E.Did=D.Did AND E.SSN<5000 AND D.budget=1000}(E × D))

• The l.q.p tree:

π E.Name, D.Dname
  σ E.Did=D.Did AND E.SSN<5000 AND D.budget=1000
    ×
      Emp E
      Dept D

SELECT E.Name, D.Dname FROM Emp E, Dept D WHERE E.Did=D.Did AND E.SSN<5000 AND D.budget=1000

32

Query Compiling Apply Laws

• Always always: (try to) replace × with ⋈!

SELECT E.Name, D.Dname FROM Emp E, Dept D WHERE E.Did=D.Did AND E.SSN<5000 AND D.budget=1000

π E.Name, D.Dname
  σ E.Did=D.Did AND E.SSN<5000 AND D.budget=1000
    ×
      Emp E
      Dept D

33

Query Compiling Apply Laws

• Always always: (try to) replace × with ⋈!

• Also, push σ down.

SELECT E.Name, D.Dname FROM Emp E, Dept D WHERE E.Did=D.Did AND E.SSN<5000 AND D.budget=1000

π E.Name, D.Dname
  σ E.SSN<5000 AND D.budget=1000
    ⋈
      Emp E
      Dept D


35

Query Compiling Apply Laws

• Always always: (try to) replace × with ⋈!

• Also, push σ down.

SELECT E.Name, D.Dname FROM Emp E, Dept D WHERE E.Did=D.Did AND E.SSN<5000 AND D.budget=1000

π E.Name, D.Dname
  ⋈
    σ E.SSN<5000
      Emp E
    σ D.budget=1000
      Dept D

36

Query Compiling Apply Laws Theory Behind

• Let
  p = predicate with only E attributes
  q = predicate with only D attributes
  m = E & D’s common attributes are equal
• We have:

σ_{p∧q∧m}(E × D) = σ_p(E) ⋈ σ_q(D)

SELECT E.Name, D.Dname FROM Emp E, Dept D WHERE E.Did=D.Did AND E.SSN<5000 AND D.budget=1000
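The law can be checked on a tiny instance. In the Python sketch below, the rows of `E` and `D` are made-up sample data (not from the slides), and `p`, `q`, `m` play the roles of the three predicates for the example query: filtering the cross product gives the same pairs as joining the two pre-filtered inputs.

```python
# Hypothetical sample data: (Name, SSN, Did) and (Did, Dname, budget).
E = [("Ann", 1000, 1), ("Bob", 7000, 1), ("Cy", 3000, 2), ("Di", 4000, 3)]
D = [(1, "Toys", 1000), (2, "Books", 500), (3, "Games", 1000)]

def p(e):            # predicate with only E attributes
    return e[1] < 5000

def q(d):            # predicate with only D attributes
    return d[2] == 1000

def m(e, d):         # common attribute Did is equal
    return e[2] == d[0]

# Left-hand side: selection applied on top of the cross product E x D.
lhs = [(e[0], d[1]) for e in E for d in D if p(e) and q(d) and m(e, d)]

# Right-hand side: push the selections down, then join on Did.
E_sel = [e for e in E if p(e)]
D_sel = [d for d in D if q(d)]
rhs = [(e[0], d[1]) for e in E_sel for d in D_sel if m(e, d)]

print(lhs, rhs)
```

The right-hand side examines 3 × 2 pairs instead of 4 × 3, which is the point of pushing selections below the join.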


38

Query Optimization Consider Physical Plans

• Because join is so important, let’s skip result size estimation for now, and let’s assume selections are not pushed down.

SELECT E.Name, D.Dname FROM Emp E, Dept D WHERE E.Did=D.Did AND E.SSN<5000 AND D.budget=1000

π E.Name, D.Dname
  σ E.SSN<5000 AND D.budget=1000
    ⋈
      Emp E
      Dept D

39

Four Join Algorithms

• Iteration join (nested loop join)
• Merge join
• Hash join
• Join with index

40

Example E ⋈ D over common attribute Did

• E:
– T(E)=10,000
– primary index on SSN, 3 levels.
– |E| = 1,000 leaf nodes.

• D:
– T(D)=5,000
– primary index on Did, 3 levels.
– |D| = 500 leaf nodes.

• Memory available = 101 blocks

41

Iteration Join

1. for every block in E
2.   scan through D;
3.   join records in the E block with records in the D block.

• I/O cost = |E| + |E| * |D| = 1000 + 1000*500 = 501,000.

• Works well for a small buffer (e.g. two blocks).

42

• Can we do better? Use our memory:
(1) Read 100 blocks of E
(2) Read all of D (using 1 block) + join
(3) Repeat until done

• I/O cost = |E| + |E|/100 * |D| = 1000 + 10*500 = 6,000.

43

• Can we do better? Reverse join order: D ⋈ E, i.e. for every 100 D blocks, go through E.

• I/O cost = |D| + |D|/100 * |E| = 500 + 5*1000 = 5,500.
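The three iteration-join costs follow one formula, which the Python sketch below captures. The function name `block_nested_loop_cost` is invented for the example, and it assumes one memory block is reserved for reading the inner table while the rest hold a chunk of the outer table.

```python
import math

def block_nested_loop_cost(outer_blocks, inner_blocks, memory_blocks):
    """I/O cost of an iteration join that reads the outer table in chunks."""
    chunk = memory_blocks - 1                  # blocks available for the outer table
    passes = math.ceil(outer_blocks / chunk)   # number of scans of the inner table
    return outer_blocks + passes * inner_blocks

E_BLOCKS, D_BLOCKS = 1000, 500

small_buffer = block_nested_loop_cost(E_BLOCKS, D_BLOCKS, memory_blocks=2)
e_outer = block_nested_loop_cost(E_BLOCKS, D_BLOCKS, memory_blocks=101)
d_outer = block_nested_loop_cost(D_BLOCKS, E_BLOCKS, memory_blocks=101)
print(small_buffer, e_outer, d_outer)
```

With two blocks of memory the chunk size is one block, which gives the block-at-a-time cost; with 101 blocks the smaller table is the cheaper choice for the outer loop.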

44

• Merge join (conceptually)
(1) if R1 and R2 not sorted, sort them
(2) i ← 1; j ← 1;
    While (i ≤ T(R1)) ∧ (j ≤ T(R2)) do
      if R1{ i }.C = R2{ j }.C then outputTuples
      else if R1{ i }.C > R2{ j }.C then j ← j+1
      else if R1{ i }.C < R2{ j }.C then i ← i+1

45

Procedure Output-Tuples
  While (R1{ i }.C = R2{ j }.C) ∧ (i ≤ T(R1)) do
    [ jj ← j;
      while (R1{ i }.C = R2{ jj }.C) ∧ (jj ≤ T(R2)) do
        [ output pair R1{ i }, R2{ jj };
          jj ← jj+1 ]
      i ← i+1 ]

46

Example

i    R1{i}.C    R2{j}.C    j
1    10         5          1
2    20         20         2
3    20         20         3
4    30         30         4
5    40         30         5
                50         6
                52         7
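The merge join pseudocode translates fairly directly into Python. The sketch below is an in-memory illustration only (no block I/O); the function name `merge_join` and the `key` parameter are assumptions of this example, indices are 0-based, and the bounds are checked before the comparison so that no element past the end is read. It is run on the C values from the example table.

```python
def merge_join(r1, r2, key=lambda t: t):
    """Join two lists on key(t); both inputs are sorted first if necessary."""
    r1, r2 = sorted(r1, key=key), sorted(r2, key=key)
    i = j = 0
    out = []
    while i < len(r1) and j < len(r2):
        if key(r1[i]) < key(r2[j]):
            i += 1
        elif key(r1[i]) > key(r2[j]):
            j += 1
        else:
            # Output-Tuples: pair every r1 tuple carrying this key with
            # every r2 tuple carrying it, restarting from j each time.
            k = key(r1[i])
            while i < len(r1) and key(r1[i]) == k:
                jj = j
                while jj < len(r2) and key(r2[jj]) == k:
                    out.append((r1[i], r2[jj]))
                    jj += 1
                i += 1
    return out

R1 = [10, 20, 20, 30, 40]
R2 = [5, 20, 20, 30, 30, 50, 52]
pairs = merge_join(R1, R2)
print(pairs)
```

As in the pseudocode, j is not advanced inside Output-Tuples; once i moves past the matching group, the main loop sees R1{i}.C > R2{j}.C and advances j.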

47

Merge Join Cost

• Recall that |E|=1000, |D|=500. And D is already sorted on Did.

• External sort E: pass 0, by reading and writing E, produces a file with 10 sorted runs. Another read is enough.

• No need to write! Can pipeline to join operator.

• Cost = 3*1000 + 500 = 3,500.

48

• Hash join (conceptual)
– Hash function h, range 0 → k
– Buckets for R1: G0, G1, ... Gk
– Buckets for R2: H0, H1, ... Hk

Algorithm
(1) Hash R1 tuples into G buckets
(2) Hash R2 tuples into H buckets
(3) For i = 0 to k do
      match tuples in Gi, Hi buckets

49

Simple example hash: even/odd

R1: 2, 4, 3, 5, 8, 9
R2: 5, 4, 12, 3, 13, 8, 11, 14

Buckets    R1        R2
Even:      2 4 8     4 12 8 14
Odd:       3 5 9     5 3 13 11
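The even/odd example can be reproduced with a short in-memory sketch in Python. The names `partition` and `hash_join` are invented for this illustration, the hash function is assumed to be `key mod k` on integer keys, and `k` here is the number of buckets (numbered 0 to k-1), which differs slightly from the 0..k numbering on the conceptual slide.

```python
def partition(relation, k, key=lambda t: t):
    """Split a relation into k buckets using h(x) = key(x) mod k."""
    buckets = [[] for _ in range(k)]
    for t in relation:
        buckets[key(t) % k].append(t)
    return buckets

def hash_join(r1, r2, k=2, key=lambda t: t):
    """Partition both inputs, then match tuples bucket by bucket."""
    g, h = partition(r1, k, key), partition(r2, k, key)
    out = []
    for g_bucket, h_bucket in zip(g, h):
        for s in g_bucket:
            for t in h_bucket:
                if key(s) == key(t):
                    out.append((s, t))
    return out

R1 = [2, 4, 3, 5, 8, 9]
R2 = [5, 4, 12, 3, 13, 8, 11, 14]
print(partition(R1, 2), partition(R2, 2))
print(hash_join(R1, R2))
```

Only tuples that land in the same bucket are ever compared, so an even value of R1 is never checked against an odd value of R2.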

50

Hash Join Cost

• Read + write both E and D for partitioning, then read to join.

• Cost = 3 * (1000 + 500) = 4,500.

51

• Join with index (conceptually)

For each r ∈ E do
  Find the corresponding D tuple by probing index.

• Assuming the root is pinned in memory,
Cost = |E| + T(E)*2 = 1000 + 10,000*2 = 21,000.
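For comparison, the merge, hash and index join costs for E ⋈ D can be written as three small formulas. The Python sketch below uses invented function names and bakes in the assumptions made in the cost discussion: the merge join needs only one run-forming pass over the unsorted input, the hash join partitions both inputs once, and the index root is pinned so a probe costs (levels - 1) I/Os.

```python
def merge_join_cost(unsorted_blocks, sorted_blocks):
    # Read + write the unsorted input to form runs, read it again to merge;
    # the already sorted input is read once.
    return 3 * unsorted_blocks + sorted_blocks

def hash_join_cost(r_blocks, s_blocks):
    # Read + write both inputs for partitioning, then read them to join.
    return 3 * (r_blocks + s_blocks)

def index_join_cost(outer_blocks, outer_tuples, index_levels):
    # Scan the outer table; probe the inner index once per outer tuple,
    # skipping the pinned root.
    return outer_blocks + outer_tuples * (index_levels - 1)

E_BLOCKS, D_BLOCKS, T_E, D_INDEX_LEVELS = 1000, 500, 10_000, 3

join_costs = {
    "merge join": merge_join_cost(E_BLOCKS, D_BLOCKS),
    "hash join": hash_join_cost(E_BLOCKS, D_BLOCKS),
    "join with index": index_join_cost(E_BLOCKS, T_E, D_INDEX_LEVELS),
}
print(join_costs)
```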

52

Note:

• The costs are different if we integrate the selection conditions!

• E.g. for the index join, only check half of E. So should be 500+5,000*2=10,500.

• A selection condition that is not used during the join should be evaluated afterwards to filter the join result. E.g. the index join probes D without evaluating the selection condition on D.

53

physical plan with selections being pushed down

• Finally, let’s consider pushing down selections.
• Now that the join operator takes intermediate results (which could be written to disk), we need to estimate their sizes…

SELECT E.Name, D.Dname FROM Emp E, Dept D WHERE E.Did=D.Did AND E.SSN<5000 AND D.budget=1000

π E.Name, D.Dname
  ⋈
    σ E.SSN<5000
      Emp E
    σ D.budget=1000
      Dept D


55

Estimating result size

• Keep statistics for relation R
– T(R): # tuples in R
– S(R): # of bytes in each R tuple
– V(R, A): # distinct values in R for attribute A
– min(R, A)
– max(R, A)

56

Example R

A: 20 byte string
B: 4 byte integer
C: 8 byte date
D: 5 byte string

A    B  C   D
cat  1  10  a
cat  1  20  b
dog  1  30  a
dog  1  40  c
bat  1  50  d

T(R) = 5    S(R) = 37
V(R,A) = 3  V(R,C) = 5
V(R,B) = 1  V(R,D) = 4

57

Size estimates for W = R1 × R2

T(W) = T(R1) × T(R2)

S(W) = S(R1) + S(R2)

58

Size estimate for W = σ_{A=a}(R)

S(W) = S(R)

T(W) = ?

59

Example R

A    B  C   D
cat  1  10  a
cat  1  20  b
dog  1  30  a
dog  1  40  c
bat  1  50  d

V(R,A)=3  V(R,B)=1  V(R,C)=5  V(R,D)=4

W = σ_{Z=val}(R)

T(W) = T(R) / V(R,Z)

60

Assumption:

Values in select expression Z = val are uniformly distributed over possible V(R,Z) values.

61

What about W = σ_{Z ≥ val}(R)?

T(W) = ?

• T(W) = T(R)/2?

62

• Solution: Estimate values in range

Example R: attribute Z with Min=1, Max=20, V(R,Z)=10

W = σ_{Z ≥ 16}(R)

f = 5/20 (fraction of range)

T(W) = f × T(R)
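Both selection estimates fit in a few lines of Python. This is a sketch under the uniformity assumption stated earlier; the function names are invented, the range estimate treats Z as an integer attribute with an inclusive ≥ comparison, and T(R) = 100 is a hypothetical value chosen only to make the example concrete (the example relation does not state its T(R)).

```python
def estimate_equality(t_r, v_r_z):
    """T(W) for W = select[Z = val](R): T(R) / V(R, Z)."""
    return t_r / v_r_z

def estimate_range_ge(t_r, val, lo, hi):
    """T(W) for W = select[Z >= val](R), Z an integer in [lo, hi]."""
    f = (hi - val + 1) / (hi - lo + 1)   # fraction of the range selected
    f = max(0.0, min(1.0, f))            # clamp when val lies outside [lo, hi]
    return f * t_r

T_R = 100  # hypothetical number of tuples

print(estimate_equality(T_R, 10))         # V(R,Z) = 10
print(estimate_range_ge(T_R, 16, 1, 20))  # f = 5/20
```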

63

Size estimate for W = R1 ⋈ R2

Let X = attributes of R1, Y = attributes of R2

Case 1: X ∩ Y = ∅

Same as R1 × R2

64

Case 2: W = R1 ⋈ R2, X ∩ Y = A

R1(A, B, C)    R2(A, D)

Assumption:

V(R1,A) ≤ V(R2,A) ⇒ Every A value in R1 is in R2

V(R2,A) ≤ V(R1,A) ⇒ Every A value in R2 is in R1

65

R1(A, B, C)    R2(A, D)

Computing T(W) when V(R1,A) ≤ V(R2,A)

Take 1 tuple of R1 and match it against R2.

1 tuple matches with T(R2) / V(R2,A) tuples...

so T(W) = T(R2) × T(R1) / V(R2,A)

66

• V(R1,A) ≤ V(R2,A): T(W) = T(R2) × T(R1) / V(R2,A)

• V(R2,A) ≤ V(R1,A): T(W) = T(R2) × T(R1) / V(R1,A)

[A is common attribute]

67

In general, for W = R1 ⋈ R2:

T(W) = T(R2) × T(R1) / max{ V(R1,A), V(R2,A) }

68

S(W) = S(R1) + S(R2) - S(A), where S(A) is the size of attribute A

69

Note: for complex expressions, need intermediate T, S, V results.

E.g. W = [σ_{A=a}(R1)] ⋈ R2

Treat σ_{A=a}(R1) as relation U

T(U) = T(R1)/V(R1,A)    S(U) = S(R1)

Also need V(U, *) !!

70

To estimate Vs

E.g., U = σ_{A=a}(R1). Say R1 has attribs A, B, C, D.

V(U, A) = ?    V(U, B) = ?
V(U, C) = ?    V(U, D) = ?

71

Example R1

A    B  C   D
cat  1  10  10
cat  1  20  20
dog  1  30  10
dog  1  40  30
bat  1  50  10

V(R1,A)=3  V(R1,B)=1  V(R1,C)=5  V(R1,D)=3

U = σ_{A=a}(R1)

V(U,A) = 1    V(U,B) = 1    V(U,C) = T(R1) / V(R1,A)

V(U,D) ... somewhere in between

72

For an arbitrary attribute D other than A (the attribute being selected), V(R1,D) ranges from 1 to T(R1), and V(U,D) ranges from 1 to T(R1)/V(R1,A).

Let’s make

V(U,D) / (T(R1)/V(R1,A)) = V(R1,D) / T(R1)

Or, V(U,D) = V(R1,D)/V(R1,A)

73

For Joins U = R1(A,B) ⋈ R2(A,C)

V(U,A) = min { V(R1, A), V(R2, A) }
V(U,B) = V(R1, B)
V(U,C) = V(R2, C)

74

Example:

Z = R1(A,B) ⋈ R2(B,C) ⋈ R3(C,D)

R1: T(R1) = 1000  V(R1,A)=50  V(R1,B)=100

R2: T(R2) = 2000  V(R2,B)=200  V(R2,C)=300

R3: T(R3) = 3000  V(R3,C)=90  V(R3,D)=500

75

Partial Result: U = R1 ⋈ R2

T(U) = 1000 × 2000 / 200    V(U,A) = 50
                            V(U,B) = 100
                            V(U,C) = 300

76

Z = U ⋈ R3

T(Z) = 1000 × 2000 × 3000 / (200 × 300)    V(Z,A) = 50
                                           V(Z,B) = 100
                                           V(Z,C) = 90
                                           V(Z,D) = 500
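The three-way example can be replayed by applying the join rules for T and V twice. In the Python sketch below, the dictionary layout and the name `join_estimate` are assumptions of this illustration, and the function only handles inputs that share at most one attribute, which is all the example needs.

```python
def join_estimate(r1, r2):
    """Estimate T and V for r1 joined with r2; each input is {"T": n, "V": {attr: count}}."""
    common = set(r1["V"]) & set(r2["V"])
    if len(common) > 1:
        raise ValueError("sketch handles at most one common attribute")
    t = r1["T"] * r2["T"]
    v = {}
    for a in list(r1["V"]) + [b for b in r2["V"] if b not in r1["V"]]:
        if a in common:
            # T(W) = T(R1) T(R2) / max{V(R1,A), V(R2,A)}; V(W,A) = min of the two.
            t = t / max(r1["V"][a], r2["V"][a])
            v[a] = min(r1["V"][a], r2["V"][a])
        else:
            v[a] = r1["V"][a] if a in r1["V"] else r2["V"][a]
    return {"T": t, "V": v}

R1 = {"T": 1000, "V": {"A": 50, "B": 100}}
R2 = {"T": 2000, "V": {"B": 200, "C": 300}}
R3 = {"T": 3000, "V": {"C": 90, "D": 500}}

U = join_estimate(R1, R2)
Z = join_estimate(U, R3)
print(U, Z)
```

Evaluating the expressions gives T(U) = 10,000 and T(Z) = 100,000 tuples.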

77

Example

SELECT E.Name, D.Dname FROM Emp E, Dept D WHERE E.Did=D.Did AND E.SSN<5000 AND D.budget=1000

• E:
– T(E)=10,000
– primary index on SSN, 3 levels.
– |E| = 1,000 leaf nodes.
– V(E,SSN)=10,000: from 0000 to 9999.

• D:
– T(D)=5,000
– primary index on Did, 3 levels.
– |D| = 500 leaf nodes.
– V(D,budget)=20: from 100 to 10,000.

• Memory available = 11 blocks
• ?? What’s the best physical plan?

Note: |E’| = 500, |D’| = 25

78

SELECT E.Name, D.Dname FROM Emp E, Dept D WHERE E.Did=D.Did AND E.SSN<5000 AND D.budget=1000

l.q.p

π E.Name, D.Dname
  ⋈
    σ E.SSN<5000
      Emp E
    σ D.budget=1000
      Dept D

79

SELECT E.Name, D.Dname FROM Emp E, Dept D WHERE E.Did=D.Did AND E.SSN<5000 AND D.budget=1000

p.q.p #1

π E.Name, D.Dname
  ⋈
    σ E.SSN<5000
      Emp E
    σ D.budget=1000
      Dept D

– σ E.SSN<5000: range search
– σ D.budget=1000: scan
– ⋈: iteration join; D is outer table

Cost = 500 (read D) + 25 (write D’) + 25 + ceiling(25/10)*500 = 2050

80

SELECT E.Name, D.Dname FROM Emp E, Dept D WHERE E.Did=D.Did AND E.SSN<5000 AND D.budget=1000

p.q.p #2

π E.Name, D.Dname
  ⋈
    σ E.SSN<5000
      Emp E
    σ D.budget=1000
      Dept D

– σ E.SSN<5000: range search
– σ D.budget=1000: scan
– ⋈: sort merge

Cost = 5*500 (sort E’; no write) + 500 (read D) = 3000

81

SELECT E.Name, D.Dname FROM Emp E, Dept D WHERE E.Did=D.Did AND E.SSN<5000 AND D.budget=1000

p.q.p #3

π E.Name, D.Dname
  ⋈
    σ E.SSN<5000
      Emp E
    σ D.budget=1000
      Dept D

– σ E.SSN<5000: range search
– σ D.budget=1000: scan
– ⋈: hash join

Cost = 3*500 (for E’) + 500 (read D) + 25 (write D’) + 3*25 (for D’) = 2100

Note: M should be bigger than sqrt(min{|E’|, |D’|})+1.
- Why?
- What if not?

82

SELECT E.Name, D.Dname FROM Emp E, Dept D WHERE E.Did=D.Did AND E.SSN<5000 AND D.budget=1000

p.q.p #4

π E.Name, D.Dname
  ⋈
    σ E.SSN<5000
      Emp E
    σ D.budget=1000
      Dept D

– σ E.SSN<5000: range search
– ⋈: index nested loop join

Cost = 500 (scan E’) + 5000*(3-1) (for D) = 10,500
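To compare the four physical plans side by side, the cost expressions can be evaluated term by term. The Python sketch below uses invented constant names and simply adds up the terms as they are listed for each plan; it makes no attempt to model buffer management beyond what those terms state.

```python
import math

E_SEL_BLOCKS = 500     # |E'|: blocks of E with SSN < 5000
E_SEL_TUPLES = 5000    # tuples of E with SSN < 5000
D_BLOCKS = 500         # |D|
D_SEL_BLOCKS = 25      # |D'|: blocks of D with budget = 1000
MEMORY = 11            # blocks of memory
D_INDEX_LEVELS = 3     # levels of the primary index on Did

plan_costs = {
    # read D, write D', read D' as outer, scan E' once per 10-block chunk of D'
    "#1 iteration join": D_BLOCKS + D_SEL_BLOCKS + D_SEL_BLOCKS
        + math.ceil(D_SEL_BLOCKS / (MEMORY - 1)) * E_SEL_BLOCKS,
    # sort E' (5 passes over 500 blocks, last pass pipelined), read D
    "#2 sort merge": 5 * E_SEL_BLOCKS + D_BLOCKS,
    # partition and re-read E', read D, write D', partition and re-read D'
    "#3 hash join": 3 * E_SEL_BLOCKS + D_BLOCKS + D_SEL_BLOCKS + 3 * D_SEL_BLOCKS,
    # scan E', probe the Did index (root pinned) once per E' tuple
    "#4 index nested loop join": E_SEL_BLOCKS + E_SEL_TUPLES * (D_INDEX_LEVELS - 1),
}

best_plan = min(plan_costs, key=plan_costs.get)
print(plan_costs, "->", best_plan)
```

Under these numbers the iteration join with D’ as the outer table is the cheapest plan, because D’ fits in three memory-sized chunks.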

83

Some notes

• For BNL, merge, hash joins: always push selection!

• For index join, do not push selection on the inner table (the one whose primary key is involved in the join condition).

• For BNL, make the smaller table be the outer table – join could be free if it fits in memory!
