SILT: A Memory-Efficient, High-Performance Key-Value Store

Hyeontaek Lim, Bin Fan, David G. Andersen, Michael Kaminsky
Carnegie Mellon University, Intel Labs

Upload: mahdi-atawneh

Post on 16-Apr-2017



Outline

1. Introduction
2. Problem statement
3. Contributions
4. Experiments
5. Results and conclusion

Hello! I am Mahdi Atawneh.

You can find me at: @[email protected]

1. Introduction

Introduction

What is a key-value store?

A data storage paradigm designed for storing, retrieving, and managing associative arrays.
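As a minimal illustration of the associative-array interface described above, the sketch below wraps a Python dict behind put, get, and delete operations. The class name `TinyKVStore` and its method names are hypothetical and chosen only for this example; they are not SILT's API.

```python
from typing import Optional


class TinyKVStore:
    """Toy in-memory key-value store (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self._items = {}  # the associative array: key -> value

    def put(self, key: bytes, value: bytes) -> None:
        # Insert a new pair or overwrite the value of an existing key.
        self._items[key] = value

    def get(self, key: bytes) -> Optional[bytes]:
        # Return the stored value, or None when the key is absent.
        return self._items.get(key)

    def delete(self, key: bytes) -> None:
        # Remove the key if present; deleting a missing key is a no-op.
        self._items.pop(key, None)


store = TinyKVStore()
store.put(b"user:42", b"Alice")
value = store.get(b"user:42")
```

Everything here lives in DRAM, which is exactly the cost that the rest of the talk tries to avoid for large tables.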

Key-value stores are used in:

• E-commerce (Amazon)
• Picture stores (Facebook)
• Web object caching

DRAM vs. Flash

Many projects have examined flash memory-based key-value stores: flash is faster than disk and cheaper than DRAM.

DRAM (main memory):
• More expensive per byte.
• Volatile: requires constant power to retain data.
• Much faster.

Flash memory (SSD storage):
• Lower cost per byte.
• Nonvolatile: retains data when power is removed.
• Slower than DRAM.

Three Metrics to Minimize

• Memory overhead: index size per entry. Ideally 0 (no memory overhead).
• Read amplification: flash reads per query. Limits query throughput. Ideally 1 (no wasted flash reads).
• Write amplification: flash writes per entry. Limits insert throughput and also reduces flash life expectancy.

2. Motivation and Problem Statement

Motivation

As key-value stores scale in both size and importance, index memory efficiency is increasingly becoming one of the most important factors for the system's scalability and overall cost effectiveness.

Challenge

Combining memory efficiency with high performance.

This talk introduces SILT, which uses drastically less memory than previous systems while retaining high performance.

Related Work

Many studies have tried to reduce in-memory index overhead, but their solutions either:

• require more memory, or
• keep parts of their index on disk, which lowers performance because each query needs extra reads (called "read amplification").

3. Contributions

SILT (Small Index Large Table)

Contributions

1. The design and implementation of three basic key-value stores (LogStore, HashStore, and SortedStore).
2. Synthesis of these basic stores to build SILT.
3. An analytic model that enables an explicit and careful balance between memory, storage, and computation.


Basic store design

LogStore, HashStore, and SortedStore (overall overview).

SILT: 1. LogStore

1. Write-friendly key-value store.
2. Uses a short tag (15 bits) as the index entry rather than the entire hashed key (160 bits).
3. Uses a customized version of cuckoo hashing.
4. Keeps an in-memory hash table that maps keys to their contents on flash.
5. Only one instance.

SILT: 1. LogStore: how does it work?

[Figure, animated over slides 20 to 29: inserting keys into the LogStore with cuckoo hashing. Keys K2, K1, K4, and K3 arrive in that order and are appended to the log on flash at offsets 1 to 4; flash stores the full 160-bit keys. DRAM holds a hash table whose entries are (Tag, Offset) pairs with short 15-bit tags. Each key k has two candidate buckets, h1(k) and h2(k); inserting K3 first moves existing entries to make room. The final table holds the entries (h2(k3), 4), (h1(k4), 3), (h2(k2), 1), and (h1(k1), 2). A lookup for K5 then probes h1(k5) and h2(k5).]

What happens when the LogStore is full?

1. SILT freezes the LogStore.
2. It initializes a new one without expensive rehashing.
3. It converts the old LogStore to a HashStore in the background.
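The insertion and lookup steps shown in the LogStore slides can be sketched roughly as follows. This is a simplified illustration, not the authors' implementation: it assumes one slot per bucket, a table of 2^15 buckets, two bucket indices taken from slices of a SHA-1 digest, and a limit of 128 displacements, and the names `ToyLogStore`, `candidate_buckets`, and `MAX_KICKS` are hypothetical. The entry stored in one candidate bucket carries the other candidate bucket as its tag, so an entry can be displaced without reading the full key back from flash.

```python
import hashlib

TAG_BITS = 15                   # tag width given on the slides
NUM_BUCKETS = 1 << TAG_BITS     # assumption: one slot per bucket
MAX_KICKS = 128                 # assumption: displacement limit before "full"


def candidate_buckets(key):
    """Derive h1(key) and h2(key) from two 15-bit slices of SHA-1(key)."""
    digest = int.from_bytes(hashlib.sha1(key).digest(), "big")
    mask = NUM_BUCKETS - 1
    return digest & mask, (digest >> TAG_BITS) & mask


class ToyLogStore:
    def __init__(self):
        self.log = []                      # stands in for the append-only log on flash
        self.table = [None] * NUM_BUCKETS  # DRAM index: bucket -> (tag, offset)

    def _find(self, key):
        """Return the bucket that indexes `key`, or None."""
        h1, h2 = candidate_buckets(key)
        for bucket, tag in ((h1, h2), (h2, h1)):
            entry = self.table[bucket]
            # The tag check uses DRAM only; the full-key check costs one "flash" read.
            if entry is not None and entry[0] == tag and self.log[entry[1]][0] == key:
                return bucket
        return None

    def get(self, key):
        bucket = self._find(key)
        return None if bucket is None else self.log[self.table[bucket][1]][1]

    def put(self, key, value):
        """Append to the log and index it. Returns False if the index is full."""
        found = self._find(key)
        offset = len(self.log)
        self.log.append((key, value))
        if found is not None:              # update: repoint the existing entry
            self.table[found] = (self.table[found][0], offset)
            return True
        h1, h2 = candidate_buckets(key)
        bucket, entry, undo = h1, (h2, offset), []
        if self.table[h1] is not None and self.table[h2] is None:
            bucket, entry = h2, (h1, offset)
        for _ in range(MAX_KICKS):
            victim = self.table[bucket]
            undo.append((bucket, victim))
            self.table[bucket] = entry
            if victim is None:
                return True
            # The victim's tag names its other bucket; its new tag is the bucket it left.
            bucket, entry = victim[0], (bucket, victim[1])
        for bucket, victim in reversed(undo):  # roll back so no entry is lost
            self.table[bucket] = victim
        self.log.pop()
        return False
```

When `put` returns `False`, a caller would treat the store as full: freeze it, start a fresh `ToyLogStore`, and retry the insert there.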


SILT: 2. HashStore

1. Converts a LogStore into a more memory-efficient data structure.
2. Sorts the LogStore entries in hash order.
3. Saves a large amount of memory by eliminating the index and reordering the on-flash pairs from insertion order to hash order.

SILT: 2. HashStore: converting a LogStore into a more memory-efficient data structure

[Figure, animated over slides 32 to 38: the LogStore table from the previous figure, with (Tag, Offset) entries (h2(k3), 4), (h1(k4), 3), (h2(k2), 1), and (h1(k1), 2) in DRAM and the full 160-bit keys K2 K1 K4 K3 on flash at offsets 1 to 4, is converted in two steps.]

1. Remove the Offset column from the in-memory table, leaving only the short tags.
2. Sort the on-flash entries according to the hash, so their order changes from K2 K1 K4 K3 (insertion order) to K3 K4 K2 K1 (hash order).

SILT can hold many HashStores, each converted from one LogStore.
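The two conversion steps above can be sketched as follows. This is a hedged illustration, not SILT's code: it assumes the LogStore index is a list of `(tag, offset)` entries indexed by bucket and the log is a list of `(key, value)` pairs, and the function names are hypothetical. Writing the pairs out in bucket order makes the offset implicit (position equals bucket number), so only the tags need to stay in memory.

```python
def to_hash_store(table, log):
    """Convert a LogStore index and log into HashStore form.

    table: list indexed by bucket of (tag, offset) or None.
    log:   list of (key, value) pairs in insertion order.
    Returns (tags, flash): tags stay in DRAM, flash is in hash (bucket) order.
    """
    tags = [None if entry is None else entry[0] for entry in table]
    flash = [None if entry is None else log[entry[1]] for entry in table]
    return tags, flash


def hash_store_get(tags, flash, key, candidate_buckets):
    """Look up `key`; candidate_buckets(key) returns its two bucket indices."""
    h1, h2 = candidate_buckets(key)
    for bucket, tag in ((h1, h2), (h2, h1)):
        # The tag is compared in DRAM; the key comparison reads one flash slot.
        if tags[bucket] is not None and tags[bucket] == tag and flash[bucket][0] == key:
            return flash[bucket][1]
    return None


# Toy 4-bucket example with hand-picked (hypothetical) bucket assignments.
buckets = {"K1": (2, 0), "K2": (1, 3)}
log = [("K2", "v2"), ("K1", "v1")]            # insertion order
table = [(2, 1), (3, 0), None, None]          # K1 sits in bucket 0, K2 in bucket 1
tags, flash = to_hash_store(table, log)
found = hash_store_get(tags, flash, "K1", buckets.__getitem__)
```

After conversion, `flash` lists K1 before K2 even though K2 was inserted first, which mirrors the reordering from insertion order to hash order on the slides.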


SILT: 3. SortedStore

• Multiple HashStores can be aggregated into one SortedStore.
• Focuses on minimizing the bit representation of the index by keeping the data sorted on flash.
• From the sorted results, the index is rebuilt as a trie data structure that uses 0.4 bytes of index memory per key on average.
• Keeps read amplification low (exactly 1) by pointing directly to the correct location on flash.
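The aggregation of several HashStores into one SortedStore can be pictured with the small sketch below. It is a simplification under stated assumptions: everything is merged in memory, deletions are ignored, stores are plain lists of `(key, value)` pairs, and the function name `merge_into_sorted_store` is hypothetical. A real system would stream the merge through flash rather than build a dict.

```python
def merge_into_sorted_store(sorted_store, hash_stores):
    """Merge an existing SortedStore with HashStores into a new sorted list.

    sorted_store: list of (key, value) pairs sorted by key (the oldest data).
    hash_stores:  list of HashStores, oldest first, each a list of (key, value).
    """
    merged = dict(sorted_store)
    for store in hash_stores:      # oldest first, so newer values overwrite older ones
        merged.update(store)
    return sorted(merged.items())  # sorted by key, ready to be indexed by a trie


old = [(b"a", 1), (b"c", 3)]
new = merge_into_sorted_store(old, [[(b"c", 30), (b"b", 2)], [(b"a", 10)]])
```

The result keeps one pair per key, with the most recently written value winning.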

SILT: 3. SortedStore: indexing sorted data with a trie

• Leaf = key.
• Internal node = common prefix of the keys represented by its descendants.

How does it work?

SILT: 3. SortedStore: indexing sorted data with a trie (example)

[Figure, animated over slides 43 to 53: a binary trie is built step by step over eight sorted keys. Each column below is one key, with its most significant bit in the top row; the last row gives the key's position (0 to 7) in the sorted array on flash. Each branch of the trie is labeled 0 or 1, each leaf is labeled with the key's position, and the bits after a key's shortest unique prefix are marked "Ignored".]

Key bits:  0 0 0 1 1 1 1 1
           0 0 1 0 0 0 1 1
           0 1 1 0 1 1 0 1
           1 0 1 1 0 1 1 0
           0 1 0 0 0 1 0 1
Position:  0 1 2 3 4 5 6 7

SortedStore uses a compact recursive representation to eliminate pointers.
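One way to picture a pointer-free recursive representation is sketched below. It assumes that each internal node is encoded as the number of keys in its left subtrie, followed by the encodings of its left and right subtries, and that leaves need no entry; this is an illustration of the idea rather than the exact format SILT uses. The function names are hypothetical, and the eight keys are read column by column from the example bit matrix on the slides.

```python
def build(keys):
    """Encode a trie over sorted, distinct, equal-length bit strings."""
    if len(keys) <= 1:
        return []                                  # a leaf needs no entry
    left = [k[1:] for k in keys if k[0] == "0"]
    right = [k[1:] for k in keys if k[0] == "1"]
    return [len(left)] + build(left) + build(right)


def skip(rep, pos, n):
    """Return the cursor just past the subtrie of n keys that starts at pos."""
    if n <= 1:
        return pos
    left = rep[pos]
    pos = skip(rep, pos + 1, left)
    return skip(rep, pos, n - left)


def lookup(rep, n, key):
    """Return the position a key would have among the n sorted keys."""
    pos = index = depth = 0
    while n > 1:
        left = rep[pos]
        pos += 1
        if key[depth] == "0":
            n = left                               # descend into the left subtrie
        else:
            pos = skip(rep, pos, left)             # jump over the left subtrie
            index += left                          # all left keys sort before us
            n -= left
        depth += 1
    return index


keys = ["00010", "00101", "01110", "10010", "10100", "10111", "11010", "11101"]
rep = build(keys)
position = lookup(rep, len(keys), "10111")
```

The representation is a flat list of small integers with no child pointers. Because the trie stores only distinguishing prefixes, `lookup` also returns a position for keys that were never inserted, so the caller still has to read that position on flash and compare the full key.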

SILT Lookup Process

• Queries look up the stores in sequence, from newest to oldest.
• Note: inserts only go to the LogStore.
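The lookup order can be sketched as below. Each store is modeled as a plain dict purely for illustration, and the class and attribute names (`ToySilt`, `log_store`, `hash_stores`, `sorted_store`, `freeze_log_store`) are hypothetical; the real stores use the structures described on the earlier slides.

```python
class ToySilt:
    """Toy model of the three-store layout; every store is just a dict here."""

    def __init__(self):
        self.log_store = {}      # newest data, the only store that takes inserts
        self.hash_stores = []    # frozen LogStores, newest first
        self.sorted_store = {}   # oldest data

    def put(self, key, value):
        self.log_store[key] = value

    def freeze_log_store(self):
        # Stand-in for "LogStore is full": keep it as a HashStore, start a new one.
        self.hash_stores.insert(0, self.log_store)
        self.log_store = {}

    def get(self, key):
        # Probe from newest to oldest so the latest value of a key wins.
        for store in [self.log_store, *self.hash_stores, self.sorted_store]:
            if key in store:
                return store[key]
        return None


silt = ToySilt()
silt.sorted_store["k"] = "oldest"
silt.put("k", "older")
silt.freeze_log_store()
silt.put("k", "newest")
latest = silt.get("k")
```

Probing the newest store first is what lets an insert into the LogStore shadow older copies of the same key without touching the other stores.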


Analytic model

\[
\text{Memory overhead (MO)} = \frac{\text{total memory consumed}}{\text{number of items}}
\]

\[
\text{Read amplification (RA)} = \frac{\text{data read from flash}}{\text{data read by application}}
\]

\[
\text{Write amplification (WA)} = \frac{\text{data written to flash}}{\text{data written by application}}
\]
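The three ratios can be computed directly from byte and item counts. The sketch below is a small worked example; the function names are hypothetical and the input numbers are invented for illustration, not measurements from the experiments.

```python
def memory_overhead(total_memory_bytes, num_items):
    """Bytes of memory consumed per stored item."""
    return total_memory_bytes / num_items


def read_amplification(flash_bytes_read, app_bytes_read):
    """Bytes read from flash per byte the application asked for."""
    return flash_bytes_read / app_bytes_read


def write_amplification(flash_bytes_written, app_bytes_written):
    """Bytes written to flash per byte the application wrote."""
    return flash_bytes_written / app_bytes_written


# Hypothetical numbers: 100 million items indexed with 64 MB of memory,
# 1 GB requested by queries at the cost of 1.5 GB of flash reads,
# and 1 GB of inserts that caused 5 GB of flash writes.
mo = memory_overhead(64_000_000, 100_000_000)
ra = read_amplification(1_500_000_000, 1_000_000_000)
wa = write_amplification(5_000_000_000, 1_000_000_000)
```

With these inputs the memory overhead is below one byte per item, while the read and write amplification are both above their ideal value of 1.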

4. Evaluation & Experiments

Experiment Setup

CPU: 2.80 GHz (4 cores)
Flash drive: SATA 256 GB (48 K random 1024-byte reads/sec)
Workload size: 20-byte key, 1000-byte value, ≥ 50 M keys
Query pattern: uniformly distributed

Experiment 1. LogStore Alone: Too Much Memory
Workload: 90% GET (50-100 M keys) + 10% PUT (50 M keys)

Experiment 2. LogStore + SortedStore: Still Much Memory
Workload: 90% GET (50-100 M keys) + 10% PUT (50 M keys)

Experiment 3. Full SILT: Very Memory Efficient
Workload: 90% GET (50-100 M keys) + 10% PUT (50 M keys)

Thanks!