monetdb intro ppt

MonetDB/X100, a (very) fast column-store. Marcin Zukowski, Peter Boncz, Sándor Héman, Niels Nes. CWI, Amsterdam, The Netherlands

Upload: nikescar

Post on 04-Mar-2015


TRANSCRIPT

Page 1: Monetdb Intro PPT

MonetDB/X100

a (very) fast column-store

Marcin Zukowski, Peter Boncz, Sándor Héman, Niels Nes

CWI, Amsterdam, The Netherlands

Page 2: Monetdb Intro PPT

Disclaimer… a different one ☺

This talk is about data-intensive applications

Data warehousing, analytical processing

Scientific data, information retrieval

Transaction processing is a different story…

A lot of content… wake up!

MonetDB/X100 overview 2

Page 3: Monetdb Intro PPT

Outline

Traditional database performance

Improvements in MonetDB

MonetDB/X100

Query execution

Storage

Page 4: Monetdb Intro PPT

Database performance

TPC-H 1GB, Query 1

Selects 98% of fact table (6M rows), computes net prices and aggregates all

Performance:

C program: ?

MySQL: 26.2s

DBMS “X”: 28.1s

Page 5: Monetdb Intro PPT

Database performance

TPC-H 1GB, Query 1

Selects 98% of fact table (6M rows), computes net prices and aggregates all

Performance:

C program: 0.2s

MySQL: 26.2s

DBMS “X”: 28.1s

Page 6: Monetdb Intro PPT

Database performance analyzed

Why so slow?

Inefficient data storage format

Inefficient query processing model

Page 7: Monetdb Intro PPT

N-ary storage model (NSM)

Fixed-width attributes in a record

Joe 27 Black 101

Edward 21 Scissorhand 103

Page 8: Monetdb Intro PPT

Real-life NSM implementation

“Slotted” pages, example:

27 Joe Black 101

21 Edward Scissorhand 103

Page 9: Monetdb Intro PPT

NSM problems

Poor bandwidth use

Always read all the attributes

Terrible on disk

Bad in memory

Complex attribute access

Variable-length fields

Null fields

Page 10: Monetdb Intro PPT

Column stores to the rescue!

Store attributes separately

Read only attributes used by a query
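The difference between the two layouts can be sketched in C. This is only an illustration under assumptions: the struct names, field widths and the two summing functions are hypothetical and do not come from the slides; the sample values are the ones shown on the NSM slide.

```c
#include <assert.h>
#include <stddef.h>

/* NSM: one fixed-width record per tuple (field sizes are illustrative). */
struct person_row {
    char first_name[16];
    int  age;
    char last_name[16];
    int  id;
};

/* Column store: each attribute lives in its own contiguous array. */
struct person_columns {
    const int *age;
    const int *id;
    size_t     n;
};

/* Sums ages from row storage: every whole record is pulled through memory. */
long sum_age_rows(const struct person_row *rows, size_t n)
{
    assert(rows != NULL || n == 0);
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += rows[i].age;
    return sum;
}

/* Sums ages from column storage: only the age array is touched. */
long sum_age_columns(const struct person_columns *t)
{
    assert(t != NULL);
    long sum = 0;
    for (size_t i = 0; i < t->n; i++)
        sum += t->age[i];
    return sum;
}
```

Both functions return the same sum; the row version strides over 40-byte records to use 4 bytes of each, while the column version reads a dense array of ints.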

Page 11: Monetdb Intro PPT

“Traditional” column stores

Data path

Read columns from disk

Convert into NSM

Use NSM-based processing

Examples: Sybase IQ, Vertica

Not enough!

Only I/O problem addressed

Page 12: Monetdb Intro PPT

How databases run a query

Query

SELECT name, salary*.19 AS tax

FROM employee

WHERE age > 25

Page 13: Monetdb Intro PPT

Database operators

Tuple-at-a-time iterator interface:
- open()
- next(): tuple
- close()

next() is called:
- for each operator
- for each tuple

Complex code repeated over and over
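A minimal sketch of such a tuple-at-a-time interface in C, assuming a single struct with function pointers for open/next/close. The struct layout, the helper names (make_scan, make_select, run) and the sample rows are hypothetical and only serve to show where the per-tuple calls come from.

```c
#include <assert.h>
#include <stddef.h>

struct tuple { const char *name; int age; double salary; };

/* Volcano-style operator: every operator exposes the same three calls. */
struct op {
    void (*open)(struct op *self);
    const struct tuple *(*next)(struct op *self);  /* NULL means end of input */
    void (*close)(struct op *self);
    struct op *child;           /* input operator, NULL for a scan */
    const struct tuple *table;  /* scan only: base table */
    size_t n, pos;              /* scan only: row count and cursor */
    int min_age;                /* select only: predicate constant */
};

static void noop(struct op *self) { (void)self; }

static void scan_open(struct op *self) { self->pos = 0; }
static const struct tuple *scan_next(struct op *self)
{
    return self->pos < self->n ? &self->table[self->pos++] : NULL;
}

static void select_open(struct op *self) { self->child->open(self->child); }
static const struct tuple *select_next(struct op *self)
{
    const struct tuple *t;
    /* One indirect call into the child per input tuple. */
    while ((t = self->child->next(self->child)) != NULL)
        if (t->age > self->min_age)
            return t;
    return NULL;
}
static void select_close(struct op *self) { self->child->close(self->child); }

struct op make_scan(const struct tuple *table, size_t n)
{
    struct op o = { scan_open, scan_next, noop, NULL, table, n, 0, 0 };
    return o;
}

struct op make_select(struct op *child, int min_age)
{
    struct op o = { select_open, select_next, select_close, child, NULL, 0, 0, min_age };
    return o;
}

/* Drives the plan to completion and counts the tuples it produces. */
size_t run(struct op *root)
{
    assert(root != NULL);
    size_t count = 0;
    root->open(root);
    while (root->next(root) != NULL)
        count++;
    root->close(root);
    return count;
}
```

Every tuple that reaches the root costs one function-pointer call per operator on its path, which is the overhead the slide refers to.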

Page 14: Monetdb Intro PPT

Primitive functions

Provide data-specific computational functionality

Called once for every operation on every tuple.

Even worse with complex tuple representation

Perform one operation (e.g. addition) in one call

Page 15: Monetdb Intro PPT

DBMS performance - IPT

Lots of repeated, unnecessary code

Operator logic

Function calls

Attribute access

Most instructions NOT processing any actual data!

High instructions-per-tuple (IPT) factor

Page 16: Monetdb Intro PPT

Modern CPUs

New CPU features over last 20 years

Instruction and data cache

Deep pipeline with out-of-order execution

Superscalar features – multiple instructions at once

SIMD instructions (SSE)

Great for e.g. multimedia processing…

… but bad for database code!

Page 17: Monetdb Intro PPT

DBMS performance - CPI

CPU-unfriendly code

Complicated code

Poor use of CPU cache (both data and instructions)

Processing one value at a time

Compilers can’t help much

High cycles-per-instruction (CPI) factor

Page 18: Monetdb Intro PPT

DBMS performance

Performance factors:

High instructions-per-tuple

High cycles-per-instruction

Very high cycles-per-tuple (CPT)

Others can do better

Scientific computing, …

How can we?
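The three factors on these slides combine as a simple product. Written out as an identity (the running-time estimate additionally assumes a tuple count N and a clock frequency f, neither of which is stated on the slide):

```latex
\[
\mathrm{CPT} = \mathrm{IPT} \times \mathrm{CPI},
\qquad
t_{\text{query}} \approx \frac{N \cdot \mathrm{CPT}}{f}
\]
```

Lowering either factor lowers the cycles spent per tuple, which is why the talk attacks both.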

Page 19: Monetdb Intro PPT

MonetDB

MonetDB – 1993-now, developed at CWI

Peter’s PhD ☺

Column store

Improves computational efficiency

Predecessor of MonetDB/X100

Page 20: Monetdb Intro PPT

MonetDB: a column store

“save disk I/O when scan-intensive queries need a few columns”

Page 21: Monetdb Intro PPT

MonetDB: a column store

“save disk I/O when scan-intensive queries need a few columns”

“avoid an expression interpreter to improve computational efficiency”

Page 22: Monetdb Intro PPT

MonetDB in action

SELECT id, name, (age-30)*50 as bonus
FROM people
WHERE age > 30

Page 23: Monetdb Intro PPT

MonetDB in action

SELECT id, name, (age-30)*50 as bonus
FROM people
WHERE age > 30

Page 24: Monetdb Intro PPT

MonetDB in action

SELECT id, name, (age-30)*50 as bonus
FROM people
WHERE age > 30

CPU Efficiency depends on “nice” code
- out-of-order execution
- few dependencies (control, data)
- compiler support

Compilers love simple loops over arrays
- loop-pipelining
- automatic SIMD

Simple, hard-coded semantics in operators

    /* oid: row position type; its width is assumed here */
    typedef unsigned int oid;

    int select_gt_float(oid *res, float *column, float val, int n)
    {
        int j = 0;  /* qualifying rows so far; declared outside the loop so it can be returned */
        for (int i = 0; i < n; i++)
            if (column[i] > val) res[j++] = i;
        return j;
    }

Page 25: Monetdb Intro PPT

MonetDB: a column store

“save disk I/O when scan-intensive queries need a few columns”

“avoid an expression interpreter to improve computational efficiency”

Simple algebra – Monet Interpreter Language (MIL)

Hard-coded operator semantics – no function calls

Array-like processing

Page 26: Monetdb Intro PPT

MonetDB problem

SELECT id, name, (age-30)*50 as bonus
FROM people
WHERE age > 30

MATERIALIZED intermediate results
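To make the materialization cost concrete, here is a hedged C sketch of column-at-a-time evaluation of the bonus expression. It is not MonetDB's MIL code; the function name and the split into four loops are assumptions chosen to show that each operator writes a full intermediate array before the next one starts.

```c
#include <assert.h>
#include <stdlib.h>

/* Computes (age-30)*50 for all rows with age > 30, one whole column at a time.
 * Writes the bonuses to out (capacity n) and returns how many were written,
 * or -1 if an intermediate could not be allocated. */
int bonus_column_at_a_time(const int *age, int n, int *out)
{
    assert(n >= 0);
    size_t cap = (size_t)(n > 0 ? n : 1);
    int *sel  = malloc(cap * sizeof *sel);   /* intermediate 1: qualifying positions */
    int *aged = malloc(cap * sizeof *aged);  /* intermediate 2: selected ages */
    int *diff = malloc(cap * sizeof *diff);  /* intermediate 3: age - 30 */
    int k = -1;
    if (sel && aged && diff) {
        k = 0;
        for (int i = 0; i < n; i++)          /* operator 1: select(age > 30) */
            if (age[i] > 30) sel[k++] = i;
        for (int i = 0; i < k; i++)          /* operator 2: fetch ages by position */
            aged[i] = age[sel[i]];
        for (int i = 0; i < k; i++)          /* operator 3: subtract constant */
            diff[i] = aged[i] - 30;
        for (int i = 0; i < k; i++)          /* operator 4: multiply constant */
            out[i] = diff[i] * 50;
    }
    free(sel); free(aged); free(diff);
    return k;
}
```

Each loop is simple and compiler-friendly, but for a large table every intermediate is written to and re-read from main memory.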

Page 27: Monetdb Intro PPT

Materialization problem

Extra main-memory bandwidth

Performance is sub-optimal…

… but still faster than anything else (5 years ago ☺)

Reduces scalability

Can’t afford writing to disk

Only effective for limited data sizes and not all query types

Page 28: Monetdb Intro PPT

MonetDB: a Faustian Pact

You want efficiency

Simple hard-coded operators

I take scalability

Result materialization

Supports SQL and XQuery

Open-source download: monetdb.cwi.nl

C program: 0.2s

MonetDB: 3.7s

MySQL: 26.2s

DBMS “X”: 28.1s

Page 29: Monetdb Intro PPT

MonetDB/X100

My PhD thesis

Motivation:

let’s fix MonetDB scalability…

… and improve the performance on the way ☺

Core ideas:

New execution model

High performance column storage

Page 30: Monetdb Intro PPT

Typical Relational DBMS Engine

Query

SELECT name, salary*.19 AS tax

FROM employee

WHERE age > 25

Page 31: Monetdb Intro PPT

MonetDB/X100: “Vectors”

Page 32: Monetdb Intro PPT

MonetDB/X100: “Vectors”

Vector contains data of multiple tuples (~100-1000)

All operations work on entire vectors

Effect: far fewer operator.next() and primitive calls.
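A sketch of what vector-at-a-time evaluation of the tax query could look like in C, under assumptions: the vector size constant, the primitive names and the use of a position list between the two primitives are illustrative choices, not code from X100.

```c
#include <assert.h>
#include <stddef.h>

enum { VECTOR_SIZE = 1024 };  /* illustrative; the slides say ~100-1000 */

/* Primitive: positions in [0,n) where col[i] > val; returns the match count. */
static int select_gt_int_col_int_val(int *sel, const int *col, int val, int n)
{
    int j = 0;
    for (int i = 0; i < n; i++)
        if (col[i] > val) sel[j++] = i;
    return j;
}

/* Primitive: out[i] = col[sel[i]] * val for the k selected positions. */
static void map_mul_dbl_col_dbl_val(double *out, const double *col, double val,
                                    const int *sel, int k)
{
    for (int i = 0; i < k; i++)
        out[i] = col[sel[i]] * val;
}

/* Evaluates SELECT salary*.19 FROM employee WHERE age > 25 one vector at a
 * time. tax must have room for n results; returns the number produced. */
size_t tax_vectorized(const int *age, const double *salary, size_t n, double *tax)
{
    assert(tax != NULL || n == 0);
    int sel[VECTOR_SIZE];  /* per-vector intermediate, small enough to stay in cache */
    size_t produced = 0;
    for (size_t start = 0; start < n; start += VECTOR_SIZE) {
        size_t left = n - start;
        int len = left < (size_t)VECTOR_SIZE ? (int)left : VECTOR_SIZE;
        int k = select_gt_int_col_int_val(sel, age + start, 25, len);
        map_mul_dbl_col_dbl_val(tax + produced, salary + start, 0.19, sel, k);
        produced += (size_t)k;
    }
    return produced;
}
```

The function-call overhead is paid once per vector rather than once per tuple, and the only intermediate is a fixed-size array that is reused for every vector.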

Page 33: Monetdb Intro PPT

Vectors

Column slices as unary arrays

NOT:
Vertical is a better table storage layout than horizontal (though we still think it often is)

RATIONALE:
- Simple array operations are well-supported by compilers
- SIMD friendly layout
- Assumed cache-resident

Page 34: Monetdb Intro PPT

Vectorized Primitives

    int select_lt_int_col_int_val(int *res, int *col, int val, int n)
    {
        int j = 0;  /* number of qualifying positions written to res */
        for (int i = 0; i < n; i++)
            if (col[i] < val) res[j++] = i;
        return j;
    }

Most primitives take just 0.5 (!) to 10 cycles per tuple

10-100+ times faster than tuple-at-a-time
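One well-known way to make a selection loop like this even friendlier to a pipelined CPU is to remove the data-dependent branch. The variant below is a sketch of that general technique, not something shown on the slide; the function name with the `_predicated` suffix is hypothetical.

```c
#include <assert.h>

/* Predicated variant of a less-than selection: res[j] is always written and
 * j advances by the 0/1 comparison result, so the loop body has no
 * data-dependent branch. res needs room for n entries. */
int select_lt_int_col_int_val_predicated(int *res, const int *col, int val, int n)
{
    assert(n >= 0);
    int j = 0;
    for (int i = 0; i < n; i++) {
        res[j] = i;
        j += (col[i] < val);
    }
    return j;
}
```

The result is the same position list as the branching version; the trade-off is one unconditional store per input value in exchange for avoiding branch mispredictions when the selectivity is hard to predict.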

Page 35: Monetdb Intro PPT

MonetDB/X100

Both efficiency…

Vectorized primitives

… and scalability

Pipelined query evaluation

C program: 0.2s

MonetDB/X100: 0.6s

MonetDB: 3.7s

MySQL: 26.2s

DBMS “X”: 28.1s

Page 36: Monetdb Intro PPT

Memory Hierarchy

X100 query engine

CPU cache

ColumnBM (buffer manager)

RAM

(raid) Disk(s)

Page 37: Monetdb Intro PPT

Optimal Vector size?

All vectors together should fit the CPU cache

Depends on the query

X100 query engine

CPU cache

ColumnBM (buffer manager)

RAM

(raid) Disk(s)
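The rule "all vectors together should fit the CPU cache" can be turned into a back-of-the-envelope bound. The helper below is a hypothetical sketch: it assumes fixed-width values and that the caller knows how many vectors a query keeps live at once, which is exactly the part that "depends on the query".

```c
#include <assert.h>
#include <stddef.h>

/* Largest number of tuples per vector such that one vector of every live
 * column fits in cache_bytes. widths[i] is the value width in bytes of the
 * i-th vector a query keeps live at the same time. */
size_t max_vector_length(size_t cache_bytes, const size_t *widths, size_t ncols)
{
    assert(widths != NULL && ncols > 0);
    size_t bytes_per_tuple = 0;
    for (size_t i = 0; i < ncols; i++)
        bytes_per_tuple += widths[i];
    assert(bytes_per_tuple > 0);
    return cache_bytes / bytes_per_tuple;
}
```

For example, with an assumed 512 KB cache and four live vectors of 4, 8, 8 and 4 bytes per value, the bound is 524288 / 24 = 21845 tuples; a query with more or wider vectors gets a smaller bound.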

Page 38: Monetdb Intro PPT

Varying the Vector size

Fewer and fewer operator.next() and primitive function calls (“interpretation overhead”)

Page 39: Monetdb Intro PPT

Varying the Vector size

Vectors start to exceed the CPU cache, causing additional memory traffic

Page 40: Monetdb Intro PPT

MonetDB/MIL materializes columns

MonetDB/MIL

X100 query engine

CPU cache

ColumnBM (buffer manager)

RAM

(raid) Disk(s)

Page 41: Monetdb Intro PPT

Why is X100 so fast?

Reduced interpretation overhead
- 100x fewer function calls

Good CPU cache use
- High locality in the primitives
- Cache-conscious data placement

No Tuple Navigation
- Primitives only see arrays

Vectorization allows algorithmic optimization

CPU and compiler-friendly function bodies
- Multiple work units, loop-pipelining, SIMD…

Page 42: Monetdb Intro PPT

Feeding the Beast

X100 uses < 100 cycles per tuple for TPC-H Q1

Q1 has ~30 bytes of used columns per tuple

3GHz CPU core eats 900MB/s

No problem for RAM

But disk-based data?
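The 900MB/s figure follows directly from the three numbers on the slide. A small C helper (the function name is hypothetical) makes the arithmetic explicit, taking 100 cycles per tuple as the upper bound:

```c
#include <assert.h>

/* Bytes per second a core must be fed to run at full speed. */
double required_bandwidth(double clock_hz, double cycles_per_tuple, double bytes_per_tuple)
{
    assert(cycles_per_tuple > 0.0);
    double tuples_per_second = clock_hz / cycles_per_tuple;
    return tuples_per_second * bytes_per_tuple;
}
```

With a 3GHz clock, 100 cycles per tuple and 30 bytes per tuple this gives 30 million tuples per second times 30 bytes, i.e. 900 million bytes per second. Since X100 needs fewer than 100 cycles per tuple, the real demand is at least that high.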

Page 43: Monetdb Intro PPT

Using Disk in the 21st century

Poor random disk access needs to be compensated with more and more disk heads.
(tens, hundreds… thousands!)

You’re better off with scanning!

Page 44: Monetdb Intro PPT

Using Disk in Data Warehousing

Goals:
1. Scan-based disk access *only* (full or partial scans)
2. Minimize bandwidth
3. Benefit, not suffer from concurrency

Database strategies:
- replicate tables in multiple orders (goal 1)
- clustering join-tables in foreign-key order (goal 1)
- keep dimension tables in RAM (goal 1&2)
- scan-optimized indices (goal 1&2)
- use a column-store (goal 2)
- increase disk bandwidth with lightweight compression (goal 2)
- coordinate concurrent disk access (goals 2&3)

Page 45: Monetdb Intro PPT

Feeding the Beast (1)

Two ideas pursued:

Lightweight compression to enhance disk bandwidth

Maximizing disk scan sharing in concurrent queries.

Page 46: Monetdb Intro PPT

Compression to improve I/O bandwidth

0.9GB/s query consumption

1/3 CPU for decompression → 1.8GB/s needed

new lightweight compression schemes
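One way to read the 1.8GB/s figure, offered here as an interpretation rather than as the speakers' own derivation: let Q be the rate at which the query consumes decompressed data and C the raw decompression speed, both with the full CPU available. If decompression and query processing share one core and decompression may take only one third of the time, then:

```latex
\[
\frac{1/C}{1/Q + 1/C} = \frac{1}{3}
\;\Longrightarrow\;
C = 2Q = 2 \times 0.9\ \text{GB/s} = 1.8\ \text{GB/s},
\qquad
\frac{QC}{Q + C} = \frac{0.9 \times 1.8}{0.9 + 1.8}\ \text{GB/s} = 0.6\ \text{GB/s}
\]
```

Under this reading the decompressor must produce 1.8GB/s of output, and the combined pipeline then delivers 0.6GB/s of processed data, which is why only very cheap compression schemes qualify.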

Page 47: Monetdb Intro PPT

Key Ingredients

Compress relations on a per-column basis
- Easy to exploit redundancy

Keep data compressed in main-memory
- More data can be buffered

Decompress vector at a time
- Minimize main-memory overhead

Use light-weight, CPU-efficient algorithms
- Exploit processing power of modern CPUs
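As an illustration of "light-weight" and "decompress vector at a time", here is a C sketch of plain frame-of-reference decoding. The slide does not name a scheme, so this choice is an assumption; the struct, the 8-bit code width and the function name are hypothetical, and outlier handling is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Frame-of-reference block: each value is stored as an 8-bit offset from a
 * per-block base. Real schemes also handle outliers; this sketch does not. */
struct for_block {
    int32_t        base;
    const uint8_t *codes;
    size_t         n;
};

/* Decompresses up to max_out values starting at position pos into out, so a
 * caller can pull one vector at a time. Returns the number written. */
size_t for_decode_vector(const struct for_block *b, size_t pos, int32_t *out, size_t max_out)
{
    assert(b != NULL && out != NULL);
    if (pos >= b->n)
        return 0;
    size_t len = b->n - pos < max_out ? b->n - pos : max_out;
    for (size_t i = 0; i < len; i++)   /* one add per value, no per-value branch */
        out[i] = b->base + (int32_t)b->codes[pos + i];
    return len;
}
```

The block stays compressed in memory (1 byte per value instead of 4 here) and only a vector-sized output buffer is ever decompressed, which matches the ingredients listed above.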

Page 48: Monetdb Intro PPT

Results


Page 49: Monetdb Intro PPT

TPC-H 100 GB

                            MonetDB/X100 on 1 CPU                       DB2 – 8 CPUs
                            4 disks               12 disks              142 disks
TPC-H query  Compr. ratio   Speedup   Time (s)    Speedup   Time (s)    Time (s)
01           4.33           4.41      69.6        1.29      50.9        111.9
03           3.04           3.10      11.3        1.48      6.0         15.1
04           8.15           7.58      2.4         2.67      1.8         12.5
05           3.81           3.55      15.3        1.06      16.2        84.0
06           4.39           4.50      10.7        2.35      4.6         17.1
07           1.71           1.66      72.0        0.84      40.8        86.5

Linear speedup with slow disks

Decent improvement with fast disks

Competes with DB2 using ~10x less resources

Page 50: Monetdb Intro PPT

Feeding the Beast (2)

Two ideas pursued:

Lightweight compression to enhance disk bandwidth

Maximizing disk scan sharing in concurrent queries.

Page 51: Monetdb Intro PPT

Concurrent scans

Multiple queries scanning the same table

Different start times

Different scan ranges

Compete for disk access and buffer space

FCFS request scheduling: poor latency

Page 52: Monetdb Intro PPT

“Normal” scans in real life


Page 53: Monetdb Intro PPT

Shared scans

Observation: queries often do not need data in a sequential order

Idea: make queries “share” the scanning process

Two existing types:

Attach

Elevator
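The slides illustrate "Attach" only with a figure. As commonly described, a new query joins an ongoing scan at its current position and wraps around to pick up the part it missed. A minimal C sketch under that assumption (the struct and function names are hypothetical, and chunks stand in for whatever unit the scan reads):

```c
#include <assert.h>
#include <stddef.h>

/* A scan over a table of nchunks chunks that joins a running scan wherever
 * it currently is, then wraps around until it has seen every chunk once. */
struct attached_scan {
    size_t nchunks;
    size_t start;  /* chunk at which this query attached */
    size_t seen;   /* chunks delivered so far */
};

struct attached_scan attach(size_t nchunks, size_t current_position)
{
    assert(nchunks > 0);
    struct attached_scan s = { nchunks, current_position % nchunks, 0 };
    return s;
}

/* Returns the next chunk number, or nchunks when the scan is complete. */
size_t attached_next(struct attached_scan *s)
{
    assert(s != NULL);
    if (s->seen == s->nchunks)
        return s->nchunks;
    size_t chunk = (s->start + s->seen) % s->nchunks;
    s->seen++;
    return chunk;
}
```

A query attaching at chunk 2 of a 4-chunk table would see chunks 2, 3, 0, 1. The decision where to attach is made once, when the query starts.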

Page 54: Monetdb Intro PPT

“Attach” in real life


Page 55: Monetdb Intro PPT

“Elevator” in real life


Page 56: Monetdb Intro PPT

Existing shared scans

Benefits

Fewer I/O operations

Better data reuse

Problems

Sharing decisions static (when a query starts)

Misses opportunities in a dynamic environment

Not sensitive to different query types

Page 57: Monetdb Intro PPT

“Relevance” scans

Core ideas

Dynamically adapt to the current situation

Allow fully arbitrary data order

Goals:

Maximize data sharing

Optimize latency and throughput

Work for different types of queries
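The slides do not spell out how relevance is computed. Purely as a toy illustration of "dynamically adapt" and "arbitrary data order", the C sketch below picks, at each step, the chunk that the largest number of running queries still need. This greedy rule, the bitmap layout and the function name are all assumptions, not the policy from the talk.

```c
#include <assert.h>
#include <stddef.h>

/* needs[q * nchunks + c] is nonzero when query q still needs chunk c.
 * Returns the chunk wanted by the most queries (lowest index wins ties),
 * or nchunks when no query needs anything. */
size_t pick_most_wanted_chunk(const unsigned char *needs, size_t nqueries, size_t nchunks)
{
    assert(needs != NULL || nqueries == 0 || nchunks == 0);
    size_t best = nchunks, best_count = 0;
    for (size_t c = 0; c < nchunks; c++) {
        size_t count = 0;
        for (size_t q = 0; q < nqueries; q++)
            count += needs[q * nchunks + c] != 0;
        if (count > best_count) {
            best_count = count;
            best = c;
        }
    }
    return best;
}
```

Because the choice is re-evaluated for every chunk, queries that arrive or finish mid-scan change what is loaded next, unlike the static decisions of Attach and Elevator. A real policy would also have to weigh latency for queries that risk starvation.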

Page 58: Monetdb Intro PPT

“Relevance” in real life


Page 59: Monetdb Intro PPT

Results


Page 60: Monetdb Intro PPT

Conclusion

Presented MonetDB/X100
- A new database kernel developed at CWI
- Uses block-oriented iterator model (vectorization)
- works amazingly well

So fast that it must reduce its hunger for hard disks
- Column storage specialized in sequential access
- + Lightweight compression schemes (give ~ factor 3)
- + Cooperative Bandwidth Sharing (gives ~ factor 2)

Good performance results
- Fastest raw 100GB TPC-H performance around (** not fair)
- Beats IR systems on Terabyte TREC

Page 61: Monetdb Intro PPT

Literature

MonetDB

P.A. Boncz. MIL Primitives for Querying a Fragmented World. VLDB Journal, 1999.

P.A. Boncz. Monet: A Next-Generation DBMS Kernel For Query-Intensive Applications. Ph.D. thesis, 2002.

MonetDB/X100

P.A. Boncz, M. Zukowski, N. Nes. MonetDB/X100: Hyper-pipelining Query Execution. CIDR 2005.

M. Zukowski, S. Heman, N. Nes, P.A. Boncz. Super-Scalar RAM-CPU Cache Compression. ICDE 2006.

M. Zukowski, S. Heman, N. Nes, P.A. Boncz. Cooperative Scans: Dynamic Bandwidth Sharing in a DBMS. VLDB 2007.

Other

S. Padmanabhan, T. Malkemus, R. Agarwal, A. Jhingran. Block oriented processing of relational database operations in modern computer architectures. ICDE 2001.

J. Goldstein, R. Ramakrishnan, U. Shaft. Compressing relations and indexes. ICDE 1998.

M. Stonebraker, et al. C-Store: A Column Oriented DBMS. VLDB 2005.

D.J. Abadi, S.R. Madden, M.C. Ferreira. Integrating Compression and Execution in Column-Oriented Database Systems. SIGMOD 2006.

S. Harizopoulos, V. Liang, D. Abadi, S. Madden. Performance Tradeoffs in Read-Optimized Databases. VLDB 2006.

D.J. Abadi, D.S. Myers, D.J. DeWitt, S.R. Madden. Materialization Strategies in a Column-Oriented DBMS. ICDE 2007.

C.A. Lang, B. Bhattacharjee, T. Malkemus, S. Padmanabhan, K. Wong. Increasing buffer-locality for multiple relational table scans through grouping and throttling. ICDE 2007.

Page 62: Monetdb Intro PPT

The End

Thank you!

Questions?
