Cache Conscious Indexing for Decision-Support in Main Memory, Pradip Dhara

Post on 20-Dec-2015


Page 1: Cache Conscious Indexing for Decision-Support in Main Memory Pradip Dhara

Cache Conscious Indexing for Decision-Support in Main Memory

Pradip Dhara

Page 2:

Why In-memory databases

• Telecommunications

• CAD tools

• Moore’s law will allow us to store relations in memory

Page 3:

Redesigning DBMS’s

• Optimize memory-CPU performance rather than disk-memory performance

• Re-evaluate space/time tradeoff – space isn’t cheap

• Given a fixed space budget, optimize the response time of lookups

Page 4:

Indices in In-Memory DBMS’s

• Little extra space vs. Increased performance

• Index design takes on new dimensions when looking at in-memory databases

• Space overhead cannot be ignored – hash tables are unacceptable

Page 5:

Hardware solutions

• Caches

• Growing disparity between CPU performance and memory performance.

• Cache misses can’t be overlapped

Page 6:

Solution

• CSS-tree indices exploit cache behavior to improve performance

Page 7:

Direct Mapped Cache

Page 8:

Fully Associative Cache

Page 9:

2-Way Set Associative Cache
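The three cache organizations named on these slides differ only in how many lines share a set. A minimal sketch of the address-to-set mapping follows; the function name, the 16 KB / 32-byte configuration, and the sample address are illustrative assumptions, not taken from the slides.

```python
def cache_set_index(address, cache_bytes, line_bytes, ways):
    """Return the set that a byte address maps to.

    ways == 1 models a direct-mapped cache; ways == number of lines
    models a fully associative cache (a single set).
    """
    num_lines = cache_bytes // line_bytes
    num_sets = num_lines // ways
    line_number = address // line_bytes  # which memory line holds the byte
    return line_number % num_sets


# Hypothetical 16 KB cache with 32-byte lines.
CACHE, LINE = 16 * 1024, 32
direct = cache_set_index(0x12345, CACHE, LINE, ways=1)
two_way = cache_set_index(0x12345, CACHE, LINE, ways=2)
fully = cache_set_index(0x12345, CACHE, LINE, ways=CACHE // LINE)
```

In the direct-mapped case, two addresses exactly one cache size apart map to the same set and evict each other; higher associativity gives such lines more places to live.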

Page 10:

Binary Search on Sorted Array

Store the relation in sorted order on a key

Cache performance depends on tuple size

[Figure: sorted array of keys 1 to 14]
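To illustrate why tuple size matters, here is a rough sketch of a binary search that also counts how many distinct cache lines its probes touch. It assumes element i starts at byte offset i * tuple_bytes and that lines are 32 bytes; both the layout and the function name are assumptions for illustration.

```python
def binary_search_lines(keys, target, tuple_bytes, line_bytes=32):
    """Binary search over a sorted list; also report distinct cache lines probed.

    Element i is assumed to start at byte offset i * tuple_bytes.
    Returns (index or -1, number of distinct lines touched).
    """
    lo, hi = 0, len(keys) - 1
    lines = set()
    while lo <= hi:
        mid = (lo + hi) // 2
        lines.add(mid * tuple_bytes // line_bytes)  # line holding the probed tuple
        if keys[mid] == target:
            return mid, len(lines)
        if keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, len(lines)
```

With small tuples the last few probes land in the same line; with tuples as large as a line, every probe is a separate line and a potential miss.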

Page 11:

T-trees

[Figure: T-tree whose nodes hold <key, pointer to record> entries; the nodes shown contain (4, *), (8, *), …; (0, *), (3, *), …; and (10, *), (16, *), …]

Page 12:

Enhanced B+ trees

[Figure: B+ tree with internal node keys 5, 9, 13, 17 over leaf nodes holding <key, pointer> entries 1 to 4, 5 to 8, 9 to 12, 13 to 16, and 17 to 20]

Page 13:

Hash Indices

[Figure: hash directory with entries 000 through 111; one bucket holds the entries (0, *), (8, *), (80, *), …]

Put however many <key, rid> pairs fit into a cache line
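As a small worked example of that sizing rule, the sketch below computes the bucket capacity. The function name and the optional overflow pointer are assumptions; the 32-byte line and 4-byte key and rid sizes match the parameter values listed on the Time Analysis slide.

```python
def pairs_per_cache_line(line_bytes, key_bytes, rid_bytes, next_ptr_bytes=0):
    """How many <key, rid> pairs fit in one cache-line-sized bucket.

    next_ptr_bytes reserves room for an optional overflow-chain pointer.
    """
    return (line_bytes - next_ptr_bytes) // (key_bytes + rid_bytes)


plain = pairs_per_cache_line(32, 4, 4)            # bucket is all pairs
chained = pairs_per_cache_line(32, 4, 4, 4)       # 4 bytes kept for a pointer
```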

Page 14:

Idea Behind CSS-trees

• Save space by not storing pointers

• Use an array as a tree

• Implicitly store pointers as offsets into the array

Page 15:

Useful Formulas for CSS-trees

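Children of a node b are nodes $b(m+1) + 1$ to $b(m+1) + (m+1)$ (EQ 1)

$n = N \cdot m$ (EQ 2)

n = # of elements in the sorted array

m = # of elements per node

N = # of leaf nodes

$k = \lceil \log_{m+1} N \rceil$

# of Internal Nodes = $\frac{(m+1)^k - 1}{m} - \left\lfloor \frac{(m+1)^k - N}{m} \right\rfloor$ (EQ 3)

First leaf node in bottom level = $\frac{(m+1)^k - 1}{m}$ (EQ 4)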
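These formulas can be checked numerically. The sketch below is an illustration under stated assumptions: the function names are hypothetical, nodes are numbered from 0 in breadth-first order, and N is taken as the ceiling of n/m so that array sizes that are not a multiple of m still work.

```python
def css_tree_layout(n, m):
    """Return (N, k, internal_nodes, first_bottom_leaf) for a full CSS-tree.

    n: number of keys in the sorted array (n >= 1); m: keys per node.
    """
    N = -(-n // m)  # ceiling division: number of leaf nodes
    k, capacity = 0, 1
    while capacity < N:  # smallest k with (m+1)**k >= N
        capacity *= m + 1
        k += 1
    first_bottom_leaf = (capacity - 1) // m
    internal_nodes = first_bottom_leaf - (capacity - N) // m
    return N, k, internal_nodes, first_bottom_leaf


def children(b, m):
    """Node numbers of the m+1 children of node b."""
    return range(b * (m + 1) + 1, b * (m + 1) + (m + 1) + 1)
```

For m = 2 and n = 10 this gives 5 leaf nodes, k = 2, 2 internal nodes, and first bottom-level leaf 4, and the children of node 0 are nodes 1 to 3.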

Page 16:

How it works

[Figure: a sorted array of the keys 1 to 10, split into leaf nodes of two keys each; the CSS-tree array (directory) holding the keys 2, 4, 6, 8; and the resulting full CSS-tree with nodes 0 to 6, divided into internal nodes and leaf nodes]

Values (Lemma 4.1)

m (# keys per node) = 2

n (# keys) = 10

k = $\lceil \log_{m+1} N \rceil$ = 2

N (# of leaf nodes) = 5

Internal nodes = 2

First leaf node in bottom level = 4

Page 17:

Building a full CSS-tree
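The slide's figure did not survive extraction, so the following is only one possible way to build the directory, not necessarily the procedure shown on the slide. It assumes a non-empty sorted list, stores in each directory slot the largest key of the corresponding child subtree, and repeats the last key for children that do not exist. The function name and the helper `leaf_start` are hypothetical.

```python
def build_css_tree(keys, m):
    """Build the directory of a full CSS-tree over the sorted list `keys`.

    Returns (directory, internal, first_bottom): a flat list with m keys
    per internal node, the number of internal nodes, and the number of
    the first leaf node in the bottom level.
    """
    n = len(keys)
    num_leaves = -(-n // m)            # N, by ceiling division
    capacity = 1
    while capacity < num_leaves:       # (m+1)**k, the smallest power >= N
        capacity *= m + 1
    first_bottom = (capacity - 1) // m
    internal = first_bottom - (capacity - num_leaves) // m
    total = internal + num_leaves

    def leaf_start(node):
        # Bottom-level leaves cover the front of the array; the remaining
        # leaves (one level up) cover the back.
        if node >= first_bottom:
            return (node - first_bottom) * m
        return (node + num_leaves - first_bottom) * m

    directory = [None] * (internal * m)
    subtree_max = [None] * total
    for node in range(total - 1, -1, -1):        # children before parents
        if node >= internal:                      # leaf node
            subtree_max[node] = keys[min(leaf_start(node) + m, n) - 1]
            continue
        largest = None
        for j in range(m + 1):
            child = node * (m + 1) + 1 + j
            if child < total:
                largest = subtree_max[child]
            if j < m:
                directory[node * m + j] = largest  # missing child: repeat last key
        subtree_max[node] = largest
    return directory, internal, first_bottom
```

For the keys 1 to 10 with m = 2, this yields the directory keys 6, 8 for node 0 and 2, 4 for node 1, the same four keys that appear in the "How it works" example.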

Page 18:

Searching Within a Node

[Figure: a node holding the keys 1 to 8]
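A lookup repeats one step at every internal node: find the first key that is not smaller than the search key and follow that branch. The sketch below uses Python's `bisect_left` as a stand-in for whatever in-node search the slide depicted, and takes the directory values from the m = 2, n = 10 example; the function names and argument layout are assumptions.

```python
from bisect import bisect_left


def search_node(directory, node, m, target):
    """Branch to follow from an internal node: index of the first key >= target."""
    start = node * m
    return bisect_left(directory, target, start, start + m) - start


def css_lookup(keys, directory, m, internal, first_bottom, target):
    """Return the position of target in the sorted list keys, or -1."""
    n = len(keys)
    num_leaves = -(-n // m)
    node = 0
    while node < internal:  # child address is computed, never stored
        node = node * (m + 1) + 1 + search_node(directory, node, m, target)
    if node >= internal + num_leaves:
        return -1  # branch with no node behind it
    if node >= first_bottom:  # bottom-level leaves map to the front of the array
        start = (node - first_bottom) * m
    else:
        start = (node + num_leaves - first_bottom) * m
    end = min(start + m, n)
    pos = bisect_left(keys, target, start, end)
    return pos if pos < end and keys[pos] == target else -1


# Values from the m = 2, n = 10 example.
example_keys = list(range(1, 11))
example_dir = [6, 8, 2, 4]
found = css_lookup(example_keys, example_dir, 2, 2, 4, 7)
```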

Page 19:

Level CSS-trees

[Figure: a node holding the keys 1 to 7, with a label reading "Value of largest key in subtree"]

$m = 2^t$

Entries per node = m - 1

Page 20:

Level vs. Full CSS-trees

• Level CSS-trees will be deeper due to the difference in branching factor

• Level CSS-trees have fewer comparisons per node

• Level CSS-trees have more cache accesses and node traversals

Total comparisons (level vs. full): $\log_2 N$ vs. $\log_2 N \cdot \log_{m+1} m \cdot (1 + 2/(m+1))$

Tree depth (level vs. full): $\log_m N$ vs. $\log_{m+1} N$
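Plugging numbers into the two pairs of formulas makes the tradeoff concrete. The sketch below simply evaluates them; the function names and the sample values N = 10^6 and m = 8 are assumptions chosen for illustration.

```python
import math


def full_css_cost(N, m):
    """(depth, total comparisons) for a full CSS-tree, per the slide's formulas."""
    depth = math.log(N, m + 1)
    comparisons = math.log2(N) * math.log(m, m + 1) * (1 + 2 / (m + 1))
    return depth, comparisons


def level_css_cost(N, m):
    """(depth, total comparisons) for a level CSS-tree, per the slide's formulas."""
    return math.log(N, m), math.log2(N)


full_depth, full_cmp = full_css_cost(10**6, 8)
level_depth, level_cmp = level_css_cost(10**6, 8)
```

For these sample values the level tree is slightly deeper (about 6.6 vs. 6.3 levels) but needs fewer comparisons (about 19.9 vs. 23.1).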

Page 21:

Time Analysis

R (size of rid) = 4 bytes

K (size of key) = 4 bytes

P (size of pointer) = 4 bytes

h = 1.2

n (# records) = $10^7$

c (cache line) = 32 bytes

s (node size / c) = 1

D = time to dereference a pointer

$A_b$ = time to compute child address for binary search

$A_{fcss}$ = time to compute child address for full CSS

$A_{lcss}$ = time to compute child address for level CSS

$s = mK/c$

Page 22:

Space Analysis

R (size of rid) = 4 bytes

K (size of key) = 4 bytes

P (size of pointer) = 4 bytes

h = 1.2

n (# records) = $10^7$

c (cache line) = 32 bytes

s (node size / c) = 1

D = time to dereference a pointer

$A_b$ = time to compute child address for binary search

$A_{fcss}$ = time to compute child address for full CSS

$A_{lcss}$ = time to compute child address for level CSS

$s = mK/c$
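As a rough worked example with these parameters, the sketch below derives m from s = mK/c and estimates the size of a full CSS-tree directory as (internal nodes) × m × K. It counts only the directory keys, not the sorted array itself, and reuses the internal-node formula from the "Useful Formulas" slide; the function name is hypothetical and the result is an estimate under these assumptions, not a figure quoted from the slide.

```python
def css_directory_bytes(n, key_bytes=4, line_bytes=32, s=1):
    """Return (m, directory size in bytes) for a full CSS-tree over n keys."""
    m = s * line_bytes // key_bytes        # from s = mK/c
    num_leaves = -(-n // m)                # N, by ceiling division
    capacity = 1
    while capacity < num_leaves:           # (m+1)**k, smallest power >= N
        capacity *= m + 1
    internal = (capacity - 1) // m - (capacity - num_leaves) // m
    return m, internal * m * key_bytes


m_keys, dir_bytes = css_directory_bytes(10**7)
```

With n = 10^7, K = 4, c = 32, and s = 1 this gives m = 8 keys per node and a directory of 5,000,000 bytes.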

Page 23:

Experiment

• Results are for Ultra Sparc II
  – <16K, 32B, 1>
  – <1M, 64B, 1>

• Keys are randomly generated integers between 0 and 1 million

• Performed 5 tests of 100,000 searches for random keys

Page 24:

Figure 5a: Array Size vs. time

Page 25:

Figure 5b: Array Size vs. Time

Page 26:

Figure 6a: Array Size vs. 2nd cache accesses

Page 27:

Figure 6b: Array Size vs. 2nd cache misses

Page 28:

Figure 7: Node Size vs. Time

Page 29:

CSS Performance on Other Queries

• CSS is very good for individual selection queries

• CSS will probably perform the best in range queries

• Index nested loops join vs. Sort merge join
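One way to see why range queries are likely to do well: once the lower bound has been located, the remaining matches sit contiguously in the sorted array. The sketch below uses `bisect_left` as a stand-in for the index descent; the function name is hypothetical and this is an illustration, not a measured result.

```python
from bisect import bisect_left


def range_query(keys, low, high):
    """Return all keys k with low <= k <= high from the sorted list `keys`."""
    start = bisect_left(keys, low)  # stand-in for one index lookup
    result = []
    for i in range(start, len(keys)):  # sequential scan over adjacent entries
        if keys[i] > high:
            break
        result.append(keys[i])
    return result
```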

Page 30:

Doubts About CSS

• Flexibility of CSS-trees across different cache designs

• Any applicability to variable sized records

• Multiple CSS-tree indices on different keys

Page 31:

Conclusion

• CSS-trees improve search performance by making the index layout cache conscious.

Page 32:

One Last Thought

• Cache designs

• Should we redesign them to let programmers have control?