new hardware platforms the bw-tree: a b-tree...

29
The Bw-Tree: A B-tree for New Hardware Platforms Author: J. Levandoski et al.

Upload: others

Post on 21-Aug-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: New Hardware Platforms The Bw-Tree: A B-tree forcs.brown.edu/courses/cs227/archives/2015/slides/week2/2-sam-BW… · Bw-tree search logic In-memory pages Bw-tree Layer Cache Layer

The Bw-Tree: A B-tree for New Hardware Platforms

Author: J. Levandoski et al.

Page 2: New Hardware Platforms The Bw-Tree: A B-tree forcs.brown.edu/courses/cs227/archives/2015/slides/week2/2-sam-BW… · Bw-tree search logic In-memory pages Bw-tree Layer Cache Layer

The Bw-Tree: A B-tree for New Hardware Platforms

Author: J. Levandoski et al.

Buzz word

DRAM + Flash storage

Page 3: New Hardware Platforms The Bw-Tree: A B-tree forcs.brown.edu/courses/cs227/archives/2015/slides/week2/2-sam-BW… · Bw-tree search logic In-memory pages Bw-tree Layer Cache Layer

Hardware Trends● Multi-core + large main memories

○ Latch contention■ Worker threads set latches for accessing data

○ Cache invalidation■ Worker threads access data from different NUMA nodes

Page 4: New Hardware Platforms The Bw-Tree: A B-tree forcs.brown.edu/courses/cs227/archives/2015/slides/week2/2-sam-BW… · Bw-tree search logic In-memory pages Bw-tree Layer Cache Layer

Hardware Trends● Multi-core + large main memories

○ Latch contention■ Worker threads set latches for accessing data

○ Cache invalidation■ Worker threads access data from different NUMA nodes

Delta updates○ No updates in place○ Reduces cache invalidation○ Enable latch-free tree operation

Page 5: New Hardware Platforms The Bw-Tree: A B-tree forcs.brown.edu/courses/cs227/archives/2015/slides/week2/2-sam-BW… · Bw-tree search logic In-memory pages Bw-tree Layer Cache Layer

Hardware Trends● Flash storage

○ Good at random reads and sequential reads/writes○ Bad at random writes

■ Erase cycle

Page 6: New Hardware Platforms The Bw-Tree: A B-tree forcs.brown.edu/courses/cs227/archives/2015/slides/week2/2-sam-BW… · Bw-tree search logic In-memory pages Bw-tree Layer Cache Layer

Hardware Trends● Flash storage

○ Good at random reads and sequential reads/writes○ Bad at random writes

■ Erase cycle

Log-structured storage design

Page 7: New Hardware Platforms The Bw-Tree: A B-tree forcs.brown.edu/courses/cs227/archives/2015/slides/week2/2-sam-BW… · Bw-tree search logic In-memory pages Bw-tree Layer Cache Layer

Architecture● CRUD API● Bw-tree search logic● In-memory pages

Bw-tree Layer

Cache Layer

Flash Layer

● Logical page abstraction● Paging between flash and RAM

● Sequential writes to log-structured storage

● Flash garbage collection

Page 8: New Hardware Platforms The Bw-Tree: A B-tree forcs.brown.edu/courses/cs227/archives/2015/slides/week2/2-sam-BW… · Bw-tree search logic In-memory pages Bw-tree Layer Cache Layer

Architecture● CRUD API● Bw-tree search logic● In-memory pages

Bw-tree Layer

Cache Layer

Flash Layer

● Logical page abstraction● Paging between flash and RAM

● Sequential writes to log-structured storage

● Flash garbage collection

Atomic record store, not an ACID transactional database

Page 9: New Hardware Platforms The Bw-Tree: A B-tree forcs.brown.edu/courses/cs227/archives/2015/slides/week2/2-sam-BW… · Bw-tree search logic In-memory pages Bw-tree Layer Cache Layer

Architecture● CRUD API● Bw-tree search logic● In-memory pages

Bw-tree Layer

Cache Layer

Flash Layer

● Logical page abstraction● Paging between flash and RAM

● Sequential writes to log-structured storage

● Flash garbage collection

Atomic record store, not an ACID transactional database

Page 10: New Hardware Platforms The Bw-Tree: A B-tree forcs.brown.edu/courses/cs227/archives/2015/slides/week2/2-sam-BW… · Bw-tree search logic In-memory pages Bw-tree Layer Cache Layer

Logical Pages and Mapping Table

● Logical pages are identified by PIDs stored as Mapping Table keys.● Physical addresses can be either in main memory or in flash storage.

Page 11: New Hardware Platforms The Bw-Tree: A B-tree forcs.brown.edu/courses/cs227/archives/2015/slides/week2/2-sam-BW… · Bw-tree search logic In-memory pages Bw-tree Layer Cache Layer

Delta Updates

● Tree operations are atomic.● Update operations are “logged” as a lineage of delta records.● Delta records are incorporated to the base page asynchronously.● Updates are “installed” to Mapping Table through compare-and-swap.● Important enabler for latch-freedom and cache-efficiency.

Page 12: New Hardware Platforms The Bw-Tree: A B-tree forcs.brown.edu/courses/cs227/archives/2015/slides/week2/2-sam-BW… · Bw-tree search logic In-memory pages Bw-tree Layer Cache Layer

Delta Updates

● Tree operations are atomic.● Update operations are “logged” as a lineage of delta records.● Delta records are incorporated to the base page asynchronously.● Updates are “installed” to Mapping Table through compare-and-swap.● Important enabler for latch-freedom and cache-efficiency.

Q: What is the performance of reading data from page P?

Page 13: New Hardware Platforms The Bw-Tree: A B-tree forcs.brown.edu/courses/cs227/archives/2015/slides/week2/2-sam-BW… · Bw-tree search logic In-memory pages Bw-tree Layer Cache Layer

Other details

● SMO: structure modification operations○ split, merge, consolidate○ has multiple phases -> how to make SMO atomic?

● In-memory page garbage collection○ epoch-based.

Page 14: New Hardware Platforms The Bw-Tree: A B-tree forcs.brown.edu/courses/cs227/archives/2015/slides/week2/2-sam-BW… · Bw-tree search logic In-memory pages Bw-tree Layer Cache Layer

Architecture● CRUD API● Bw-tree search logic● In-memory pages

Bw-tree Layer

Cache Layer

Flash Layer

● Logical page abstraction● Paging between flash and RAM

● Sequential writes to log-structured storage

● Flash garbage collection

Page 15: New Hardware Platforms The Bw-Tree: A B-tree forcs.brown.edu/courses/cs227/archives/2015/slides/week2/2-sam-BW… · Bw-tree search logic In-memory pages Bw-tree Layer Cache Layer

Flash Layer

Page 16: New Hardware Platforms The Bw-Tree: A B-tree forcs.brown.edu/courses/cs227/archives/2015/slides/week2/2-sam-BW… · Bw-tree search logic In-memory pages Bw-tree Layer Cache Layer

Flushing Pages

PID Physical Address

P

Page P

Insert 40

Insert 50

Delete 33

Modify 40 to 60

Q: Why flushing pages?Q: When to flush pages?Q: How many pages to flush?Q: What if you crash during a flush?

Log-structured Store

Page 17: New Hardware Platforms The Bw-Tree: A B-tree forcs.brown.edu/courses/cs227/archives/2015/slides/week2/2-sam-BW… · Bw-tree search logic In-memory pages Bw-tree Layer Cache Layer

Flushing Pages

PID Physical Address

P

Page P

Flush Write Buffer

Log-structured Store

Page 18: New Hardware Platforms The Bw-Tree: A B-tree forcs.brown.edu/courses/cs227/archives/2015/slides/week2/2-sam-BW… · Bw-tree search logic In-memory pages Bw-tree Layer Cache Layer

Flushing Pages

PID Physical Address

P

Page P

Flush Write Buffer

Page P

Log-structured Store

Page 19: New Hardware Platforms The Bw-Tree: A B-tree forcs.brown.edu/courses/cs227/archives/2015/slides/week2/2-sam-BW… · Bw-tree search logic In-memory pages Bw-tree Layer Cache Layer

Flushing Pages

PID Physical Address

P

Page P

Flush Write Buffer

Page P Page T

Log-structured Store

Page 20: New Hardware Platforms The Bw-Tree: A B-tree forcs.brown.edu/courses/cs227/archives/2015/slides/week2/2-sam-BW… · Bw-tree search logic In-memory pages Bw-tree Layer Cache Layer

Flushing Pages

PID Physical Address

P

Page P

Flush Write Buffer

Page P Page T

Flush

Log-structured Store

Page 21: New Hardware Platforms The Bw-Tree: A B-tree forcs.brown.edu/courses/cs227/archives/2015/slides/week2/2-sam-BW… · Bw-tree search logic In-memory pages Bw-tree Layer Cache Layer

Flushing Pages

PID Physical Address

P

Page P

Flush Write BufferFlush

Insert 50

Delete 33

Page P Page T Log-structured Store

Page 22: New Hardware Platforms The Bw-Tree: A B-tree forcs.brown.edu/courses/cs227/archives/2015/slides/week2/2-sam-BW… · Bw-tree search logic In-memory pages Bw-tree Layer Cache Layer

Flushing Pages

PID Physical Address

P

Page P

Flush Write BufferFlush

Insert 50

Delete 33

Page P Page T Log-structured Store

Page 23: New Hardware Platforms The Bw-Tree: A B-tree forcs.brown.edu/courses/cs227/archives/2015/slides/week2/2-sam-BW… · Bw-tree search logic In-memory pages Bw-tree Layer Cache Layer

Flushing Pages

PID Physical Address

P

Page P

Flush Write Buffer

Page P

Flush

Insert 50

Delete 33

Page T

Insert 50

Delete 33

Log-structured Store

Page 24: New Hardware Platforms The Bw-Tree: A B-tree forcs.brown.edu/courses/cs227/archives/2015/slides/week2/2-sam-BW… · Bw-tree search logic In-memory pages Bw-tree Layer Cache Layer

Flushing Pages

PID Physical Address

P

Page P

Flush Write Buffer

Page P

Flush

Insert 50

Delete 33

Page T

Insert 50

Delete 33

Page E

Log-structured Store

Page 25: New Hardware Platforms The Bw-Tree: A B-tree forcs.brown.edu/courses/cs227/archives/2015/slides/week2/2-sam-BW… · Bw-tree search logic In-memory pages Bw-tree Layer Cache Layer

Flushing Pages

PID Physical Address

P

Page P

Flush Write Buffer

Page P

Flush

Insert 50

Delete 33

Page T

Flush

Page EInsert 50 Delete 33 Log-structured Store

Page 26: New Hardware Platforms The Bw-Tree: A B-tree forcs.brown.edu/courses/cs227/archives/2015/slides/week2/2-sam-BW… · Bw-tree search logic In-memory pages Bw-tree Layer Cache Layer

Other details

● Log-structured Store garbage collection○ Cleans orphaned data unreachable from mapping

table○ Relocates entire pages in sequential blocks (to

reduce fragmentation)● Access method recovery

○ Occasionally checkpoint mapping table○ Redo-scan starts from last checkpoint

Page 27: New Hardware Platforms The Bw-Tree: A B-tree forcs.brown.edu/courses/cs227/archives/2015/slides/week2/2-sam-BW… · Bw-tree search logic In-memory pages Bw-tree Layer Cache Layer

Experiment

● Against○ BerkeleyDB (without transaction)○ latch-free skip-list

Page 28: New Hardware Platforms The Bw-Tree: A B-tree forcs.brown.edu/courses/cs227/archives/2015/slides/week2/2-sam-BW… · Bw-tree search logic In-memory pages Bw-tree Layer Cache Layer

Experiment

Over BerkerleyDB:- 18x speedup in read-intensive workload- 5-8x speedup in update-intensive workload

Over Skip-list:- 4.4x speedup in read-only workload.- 3.7x speedup in update-intensive workload.

Page 29: New Hardware Platforms The Bw-Tree: A B-tree forcs.brown.edu/courses/cs227/archives/2015/slides/week2/2-sam-BW… · Bw-tree search logic In-memory pages Bw-tree Layer Cache Layer

Thank you!

Slides adapted from http://www.hpts.ws/papers/2013/bw-tree-hpts2013.pdf