voltdb - stonebraker live! - new york city 2013

VoltDB presents: Stonebraker Live! Navigating the Database Universe

Upload: voltdbevents

Post on 10-May-2015


DESCRIPTION

Dr. Michael Stonebraker of VoltDB, known for his work at MIT and UC Berkeley and for Ingres and Postgres, presents the founding principles for solving modern data-velocity problems. “Data is growing faster than hard drives,” “Move the computation to the data, never move the data to the computation,” “Bet on main memory because there's no other way to go fast,” and “Run transactions to completion, and you eliminate locking and multithreading” are central to his beliefs. Not-so-coincidentally, they’re also central concepts of VoltDB.

TRANSCRIPT

Page 1: VoltDB - Stonebraker Live! - New York City 2013

Stonebraker Live! Navigating the Database Universe

VoltDB presents

Page 2: VoltDB - Stonebraker Live! - New York City 2013

BRUCE READING

President and CEO

Page 3: VoltDB - Stonebraker Live! - New York City 2013

Agenda

• Traditional RDBMS is all wrong

– Presented by Dr. Michael Stonebraker, Co-founder

• Making sense of the database universe

– Presented by Bruce Reading, President and CEO

• Hello VoltDB 3.0

– Presented by Ryan Betts, Field CTO

Page 4: VoltDB - Stonebraker Live! - New York City 2013

TRADITIONAL RDBMS WISDOM IS ALL WRONG

Dr. Michael Stonebraker

Page 5: VoltDB - Stonebraker Live! - New York City 2013

Traditional RDBMS Wisdom

• Data is in disk block formatting (heavily encoded)

• With a main memory buffer pool of blocks

• Query plans

– Optimize CPU, I/O

– Fundamental operation is read a row

• Indexing via B-trees

– Clustered or unclustered

Page 6: VoltDB - Stonebraker Live! - New York City 2013

Traditional RDBMS Wisdom

• Dynamic row-level locking

• Aries-style write-ahead log

• Replication (asynchronous or synchronous)

– Update the primary first

– Then move the log to other sites

– And roll forward at the secondary(s)

Page 7: VoltDB - Stonebraker Live! - New York City 2013

Traditional RDBMS Wisdom

• Describes MySQL, DB2, Postgres, SQLServer, Oracle…

• Focus of most college-level DBMS courses

– Including M.I.T.

• Focus of most DBMS textbooks

Page 8: VoltDB - Stonebraker Live! - New York City 2013

Traditional RDBMS Wisdom

• Is completely wrong

• (More charitably) is obsolete

Page 9: VoltDB - Stonebraker Live! - New York City 2013

The DBMS Marketplace

• About 1/3 “data warehouses”

– Lots of big reads

– Bulk-loaded from OLTP systems

• About 1/3 “OLTP”

– Lots of small updates

– And a few reads

• About 1/3 “everything else”

– Hadoop, NoSQL, graph DBMS, Array DBMS…

Page 10: VoltDB - Stonebraker Live! - New York City 2013

The DBMS Marketplace

• Data warehouses

– Market already moving strongly in the direction of column stores

– Which have nothing to do with the traditional wisdom

– Because column stores are 50–100X faster than row stores

Page 11: VoltDB - Stonebraker Live! - New York City 2013

The Participants

• Native column store vendors

– HP/Vertica, SAP/Hana, Redshift (Amazon/ParAccel), SAP/Sybase/IQ

• Native row store vendors

– Microsoft, Oracle, DB2, Netezza

• In transition

– Teradata, Asterdata, Greenplum

• If you are running a row store, then be prepared to switch!

Page 12: VoltDB - Stonebraker Live! - New York City 2013

The DBMS Marketplace

• OLTP

– NewSQL systems are wildly faster than the traditional wisdom

• Everything else

– Not an RDBMS market

Page 13: VoltDB - Stonebraker Live! - New York City 2013

OLTP Databases – 3 Big Decisions

• Main memory vs. disk orientation

• Replication strategy

• Concurrency control strategy

Page 14: VoltDB - Stonebraker Live! - New York City 2013

Reality Check on OLTP Databases

• TP database size grows at the rate transactions increase

• 1 Tbyte of main memory buyable for around $30K (or less)

– (say) 64 Gbytes per server in 16 servers

• 10+ Tbytes possible

• If your data doesn’t fit in main memory now, then wait a couple of years and it will…
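The sizing on this slide can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: the server count, memory per server, and price are the slide's figures, while the per-gigabyte cost is derived here and is not a number from the talk.

```python
# Back-of-the-envelope check of the slide's sizing claim.
# Inputs are the slide's figures; the per-GB cost is derived, not quoted.
GB_PER_SERVER = 64
SERVERS = 16
MEMORY_PRICE_USD = 30_000  # "around $30K (or less)" per the slide

total_gb = GB_PER_SERVER * SERVERS          # 64 * 16 = 1024 GB
total_tb = total_gb / 1024                  # 1.0 TB
usd_per_gb = MEMORY_PRICE_USD / total_gb    # about 29.3 USD per GB

print(f"{total_gb} GB = {total_tb:.0f} TB, roughly ${usd_per_gb:.2f} per GB")
```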

Page 15: VoltDB - Stonebraker Live! - New York City 2013

Reality Check – Main Memory Performance

• TPC-C CPU cycles

• On the Shore DBMS prototype

• “Elephants” should be similar

Page 16: VoltDB - Stonebraker Live! - New York City 2013

To Go Fast

• Must focus on overhead

– B-trees affect only a small fraction of the path length

• Must get rid of all four pie slices

– Anything less gives you a marginal win

– TimesTen as an example

Page 17: VoltDB - Stonebraker Live! - New York City 2013

Buffer Pool Overhead

• Get rid of the buffer pool

• i.e., run a main-memory DBMS

– Like VoltDB

Page 18: VoltDB - Stonebraker Live! - New York City 2013

Single Threading

• Hosed unless you do this

– Unless you get rid of queuing (somehow)

– Or eliminate shared data structures (somehow)

• VoltDB statically divides shared memory among the cores

– And cores are single threaded
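The idea of statically dividing memory among single-threaded cores can be sketched as follows. This is a toy model under stated assumptions, not VoltDB's implementation: `Partition`, `PartitionedStore`, and the CRC32-based routing are hypothetical names and choices made for illustration, and the partitions are drained in a loop rather than on separate cores.

```python
import zlib
from collections import deque

class Partition:
    """One slice of the data, owned by exactly one worker (one core, in the slide's terms)."""
    def __init__(self):
        self.rows = {}          # private data: no other partition touches it
        self.queue = deque()    # transactions waiting to run on this partition

    def run_to_completion(self):
        # Transactions run one at a time, start to finish, so no locks or latches are needed.
        while self.queue:
            key, delta = self.queue.popleft()
            self.rows[key] = self.rows.get(key, 0) + delta

class PartitionedStore:
    def __init__(self, n_partitions):
        self.partitions = [Partition() for _ in range(n_partitions)]

    def partition_for(self, key):
        # Static assignment: a stable hash of the key picks the owning partition.
        return self.partitions[zlib.crc32(key.encode()) % len(self.partitions)]

    def submit(self, key, delta):
        self.partition_for(key).queue.append((key, delta))

    def drain(self):
        # A real system would drain each partition on its own core; a loop keeps the sketch simple.
        for p in self.partitions:
            p.run_to_completion()

    def get(self, key):
        return self.partition_for(key).rows.get(key, 0)

store = PartitionedStore(n_partitions=4)
for _ in range(3):
    store.submit("alice", 10)
store.submit("bob", 5)
store.drain()
print(store.get("alice"), store.get("bob"))  # 30 5
```

Because every key has exactly one owner, two transactions on the same key are always serialized by that owner's queue, which is what removes the need for shared data structures.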

Page 19: VoltDB - Stonebraker Live! - New York City 2013

Concurrency Control

• MVCC popular (NuoDB, Hekaton)

• Time stamp order popular (VoltDB)

• I don’t know anybody who is doing normal dynamic locking

– It’s too slow!!!!

Page 20: VoltDB - Stonebraker Live! - New York City 2013

Reality Check – High Availability (HA)

• Requirement in today’s OLTP systems

• Nobody will take down time

• Must be solved through replication

Page 21: VoltDB - Stonebraker Live! - New York City 2013

How to Implement HA

• I am only interested in ACID outcomes!!!!

• Eventual consistency actually means “creates garbage”

– Consider 2 customers at 2 sites, each buying the last “widget”

• Even Jeff Dean (Google) has come around to this point of view
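The "last widget" case can be made concrete with a small sketch. It assumes two sites that each check only their local copy of the stock and reconcile later; the site names and helper functions are hypothetical and chosen only for illustration.

```python
def buy(stock, site):
    """Local purchase at one site: succeeds if that site's copy still shows stock."""
    if stock[site] > 0:
        stock[site] -= 1
        return True
    return False

# Eventual consistency: each site checks its own copy; copies are reconciled later.
stock = {"site_a": 1, "site_b": 1}       # both replicas believe one widget is left
sold_a = buy(stock, "site_a")            # customer 1 buys at site A
sold_b = buy(stock, "site_b")            # customer 2 buys at site B before replication arrives
widgets_sold = sold_a + sold_b           # 2 sales of 1 physical widget

# ACID outcome: both purchases are serialized against a single logical inventory.
inventory = {"widget": 1}

def buy_serialized(inv):
    if inv["widget"] > 0:
        inv["widget"] -= 1
        return True
    return False

results = [buy_serialized(inventory), buy_serialized(inventory)]

print(widgets_sold, results)  # 2 [True, False]
```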

Page 22: VoltDB - Stonebraker Live! - New York City 2013

How to Implement HA

• Active-Passive

– Effectively requires you to write a log

– One of the four pie slices

• Active-Active (VoltDB solution)

– Send only the transaction, not the effect of the transaction

– Allows read-queries to be sent to any replica
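"Send only the transaction, not the effect" can be sketched as replicas that each execute the same ordered list of transaction invocations. The `Replica` class and its procedures are hypothetical; the sketch assumes every transaction is deterministic, which is what lets the replicas converge without shipping a log of row changes.

```python
class Replica:
    def __init__(self):
        self.balances = {}

    # Stored-procedure-like transactions; they must be deterministic
    # (no clocks, no random numbers, nothing that differs between replicas).
    def deposit(self, account, amount):
        self.balances[account] = self.balances.get(account, 0) + amount

    def transfer(self, src, dst, amount):
        if self.balances.get(src, 0) >= amount:
            self.balances[src] -= amount
            self.balances[dst] = self.balances.get(dst, 0) + amount

    def execute(self, txn):
        name, args = txn
        getattr(self, name)(*args)

# Only the invocation (procedure name plus parameters) is sent to each replica.
transactions = [
    ("deposit", ("alice", 100)),
    ("transfer", ("alice", "bob", 30)),
    ("transfer", ("bob", "carol", 50)),   # rejected at every replica: bob has only 30
]
replicas = [Replica(), Replica()]
for txn in transactions:
    for r in replicas:
        r.execute(txn)

print(replicas[0].balances == replicas[1].balances)  # True
print(replicas[0].balances)  # {'alice': 70, 'bob': 30}
```

Since both copies hold the same state after each transaction, a read-only query can be answered by either one.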

Page 23: VoltDB - Stonebraker Live! - New York City 2013

Reality Check – Power Failures

• What to do if you don’t have UPS…

• Cannot lose data on a power failure!!!!

• Two options

– Bring back the log (and the pie slice)

– Command log plus asynchronous checkpoints
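The second option, a command log plus checkpoints, can be sketched as below. This is a simplified model under stated assumptions: the `Database` class is hypothetical, the "log" is an in-memory list standing in for durable storage, and the checkpoint is taken synchronously rather than asynchronously.

```python
import copy

class Database:
    def __init__(self):
        self.state = {}
        self.command_log = []       # stands in for a durable on-disk log
        self.checkpoint = ({}, 0)   # (snapshot of state, number of logged commands it covers)

    def apply(self, command):
        key, delta = command
        self.state[key] = self.state.get(key, 0) + delta

    def execute(self, command):
        # Log the command itself (small), not before/after images of the rows it changes.
        self.command_log.append(command)
        self.apply(command)

    def take_checkpoint(self):
        # Asynchronous in a real system; a plain copy keeps the sketch short.
        self.checkpoint = (copy.deepcopy(self.state), len(self.command_log))

    def recover(self):
        # After a power failure: load the last checkpoint, then re-run later commands.
        snapshot, covered = self.checkpoint
        self.state = copy.deepcopy(snapshot)
        for command in self.command_log[covered:]:
            self.apply(command)

db = Database()
db.execute(("x", 1))
db.execute(("y", 5))
db.take_checkpoint()
db.execute(("x", 2))
before_crash = dict(db.state)
db.state = {}            # simulate losing main memory
db.recover()
print(db.state == before_crash, db.state)  # True {'x': 3, 'y': 5}
```

Replay only works because re-running a command reproduces the same result, the same determinism requirement that active-active replication relies on.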

Page 24: VoltDB - Stonebraker Live! - New York City 2013

Some Data From Nirmesh Malviya

• Implemented Aries in VoltDB

• Compared against the VoltDB command logging

• Command logging about 3X faster in total throughput

Page 25: VoltDB - Stonebraker Live! - New York City 2013

The Nail in the Coffin

• Time stamp order compatible with active-active

– As are any deterministic schemes

• Locking and MVCC are not

– Need a 2 phase commit between the replicas

– Slow, slow, slow

Page 26: VoltDB - Stonebraker Live! - New York City 2013

Net-Net on OLTP

• Main memory DBMS

• Deterministic concurrency control

• HA via active-active

• Has nothing to do with the traditional wisdom

• Even if your data is too big for main memory

– The traditional wisdom is still wrong

– Stay tuned for a paper on this topic

Page 27: VoltDB - Stonebraker Live! - New York City 2013

Summary

• What we teach our DBMS students is all wrong

• Implementations from the “elephants” are all obsolete

– One-size-does-not-fit-all

– Several million lines of code per vendor are obsolete

• I expect a lot of turmoil in the market off into the future

Page 28: VoltDB - Stonebraker Live! - New York City 2013

MAKING SENSE OF THE DATABASE UNIVERSE

Bruce Reading

Page 29: VoltDB - Stonebraker Live! - New York City 2013

The fact is…

Record amounts of data are being created every day…

And it’s not slowing down…

There’s only more and more to come.

Page 30: VoltDB - Stonebraker Live! - New York City 2013

And if that data is most valuable at the moment it’s created, how do you put it to use NOW?

How do you automate decisioning against it NOW?

Page 31: VoltDB - Stonebraker Live! - New York City 2013

NOW

Page 32: VoltDB - Stonebraker Live! - New York City 2013

Imagine…

Page 33: VoltDB - Stonebraker Live! - New York City 2013

Nice story. So what?

Page 34: VoltDB - Stonebraker Live! - New York City 2013

Large, busy bank

Rogue trader

“Mistyped number” (repeated)

-$ Small sum lost (repeated)

Oblivious

[Slide graphic: many small “-$” losses accumulating across the slide]

Page 35: VoltDB - Stonebraker Live! - New York City 2013

-$2BN

Large sum lost

Third largest loss in banking history

Page 36: VoltDB - Stonebraker Live! - New York City 2013

UBS couldn't flag it among all the data... until it was too late.

Page 37: VoltDB - Stonebraker Live! - New York City 2013

This is our world now.

Page 38: VoltDB - Stonebraker Live! - New York City 2013

Same old, same old won’t cut it.

Page 39: VoltDB - Stonebraker Live! - New York City 2013

What’s a developer to do?

Page 40: VoltDB - Stonebraker Live! - New York City 2013

Data Value Chain

• Interactive (milliseconds)
– Place trade
– Serve ad
– Enrich stream
– Examine packet
– Approve trans.

• Real-time Analytics (hundredths of seconds)
– Calculate risk
– Leaderboard
– Aggregate
– Count

• Record Lookup (second(s))
– Retrieve click stream
– Show orders

• Historical Analytics (minutes)
– Backtest algo
– BI
– Daily reports

• Exploratory Analytics (hours)
– Algo discovery
– Log analysis
– Fraud pattern match

[Horizontal axis: Age of Data]

Page 41: VoltDB - Stonebraker Live! - New York City 2013

Data Value Chain

[Figure: the five stages of the previous slide (Interactive, Real-time Analytics, Record Lookup, Historical Analytics, Exploratory Analytics) along the horizontal axis “Age of Data”; vertical axis “Data Value”; series labeled “Value of Individual Data Item” and “Aggregate Data Value”]

Page 42: VoltDB - Stonebraker Live! - New York City 2013

The Database Universe

[Figure labels: vertical axis “Application Complexity” (Simple, Slow, Small / Fast, Complex, Large); horizontal axis “Data Value” (Value of Individual Data Item / Aggregate Data Value); stages Interactive, Real-time Analytics, Record Lookup, Historical Analytics, Exploratory Analytics; Transactional / Analytic; region shown: Traditional RDBMS]

Page 43: VoltDB - Stonebraker Live! - New York City 2013

The Database Universe

[Figure labels: same axes and stages as the previous slide; regions shown: Traditional RDBMS, Data Warehouse, Hadoop etc., NoSQL, NewSQL; additional label: Velocity]

Page 44: VoltDB - Stonebraker Live! - New York City 2013

VoltDB

The fastest, most scalable database on the market today

Ingest massive quantities of data and perform automated decisioning in real time

3 MILLION transactions per second

Dramatically lowering your cost per transaction

A huge impact on the bottom line

VoltDB enables NOW.

Page 45: VoltDB - Stonebraker Live! - New York City 2013

PREVENT

ACHIEVE

Anything is possible…

Page 46: VoltDB - Stonebraker Live! - New York City 2013

Electrical smart grids

Page 47: VoltDB - Stonebraker Live! - New York City 2013

Micro-personalization

Page 48: VoltDB - Stonebraker Live! - New York City 2013

Real-time display targeting

Page 49: VoltDB - Stonebraker Live! - New York City 2013

Dynamic airline ticket purchasing

Page 50: VoltDB - Stonebraker Live! - New York City 2013

State-of-the-art social networking

Page 51: VoltDB - Stonebraker Live! - New York City 2013

Session management

Page 52: VoltDB - Stonebraker Live! - New York City 2013

Network monitoring

Page 53: VoltDB - Stonebraker Live! - New York City 2013

We enable NOW.

www.VoltDB.com

Page 54: VoltDB - Stonebraker Live! - New York City 2013

HELLO 3.0!

Ryan Betts

Page 55: VoltDB - Stonebraker Live! - New York City 2013

Introducing VoltDB 3.0

VoltDB 3.0

VoltDB: a modern OLTP database built for a high velocity world.

– Horizontal scalability

– Hundreds of thousands of transactions per second

– Relational SQL

Page 56: VoltDB - Stonebraker Live! - New York City 2013

Latency and Throughput, 50-50 Read/Write Workload

[Chart: latency (ms), axis 0 to 16, against TPS, axis 0 to 200,000; series: VoltDB 3.0 and v2.8.4.1]

VoltDB 3.0 vs. v2.8.4.1

Key/Value 50/50 read/write workload

3 Node, K=1 Cluster

Page 57: VoltDB - Stonebraker Live! - New York City 2013

Read/Write Workload Latency/Throughput

[Chart: avg. latency (ms), axis 0 to 9, against TPS, axis 0 to 350,000; series: 10% read/90% write, 50% read/50% write, 90% read/10% write]

VoltDB 3.0

Key/Value various read/write workload

3 Node, K=1 Cluster

Page 58: VoltDB - Stonebraker Live! - New York City 2013

Faster: Ad Hoc SQL Performance

• Conversational SQL

• Thousands to 10,000+ ad hoc SQL transactions/second

• Single or multiple (batch) SQL statement transaction

Page 59: VoltDB - Stonebraker Live! - New York City 2013

Easier Development: New SQL Support

• SQL LIKE and NOT LIKE

• UNION

• Column Functions

• Counting function (leaderboard ranking queries)

• Ability to define index using column functions


Page 60: VoltDB - Stonebraker Live! - New York City 2013

Easier Development: JSON Support

• JSON values stored in a varchar column

• Field() column function

• Indexing on JSON elements

CREATE INDEX session_site_moderator
    ON user_session_table (field(json_data, 'site'),
    field(json_data, 'moderator'), username);

• New JSON sample in kit
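What the CREATE INDEX statement above buys can be illustrated with a small sketch outside the database. Everything here is an assumption for illustration: the `field` helper is only a rough stand-in for the slide's Field() column function, the rows are made-up sample data, and the index is simplified to a lookup keyed on the two JSON elements.

```python
import json
from collections import defaultdict

def field(json_text, name):
    """Rough stand-in for the slide's Field() column function: pull one top-level element."""
    return json.loads(json_text).get(name)

# Hypothetical rows: a username plus a JSON document stored as a string (varchar).
rows = [
    {"username": "ann", "json_data": '{"site": "nyc", "moderator": "true"}'},
    {"username": "bob", "json_data": '{"site": "nyc", "moderator": "false"}'},
    {"username": "cy",  "json_data": '{"site": "sf",  "moderator": "true"}'},
]

# Build the lookup once; later queries avoid re-parsing every row's JSON.
index = defaultdict(list)
for row in rows:
    key = (field(row["json_data"], "site"), field(row["json_data"], "moderator"))
    index[key].append(row["username"])
for users in index.values():
    users.sort()

print(index[("nyc", "true")])  # ['ann']
```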

Page 61: VoltDB - Stonebraker Live! - New York City 2013

Easier Development: Online Operations


• Ability to re-join a failed node to cluster with no impact to existing operations

• Online schema update

• No service window

Page 62: VoltDB - Stonebraker Live! - New York City 2013

Easier Development: Streamlined Development

• Elimination of project.xml

• VoltDB-specific configuration now defined in DDL

• Defaulting of deployment.xml

• New Volt Compiler CLI: voltdb compile

Page 63: VoltDB - Stonebraker Live! - New York City 2013

Expanded Reach: Cloud-Friendly

• Reduce impact of variable node performance and latency

• Elimination of strict NTP configuration

• Scales to large # of nodes

Page 64: VoltDB - Stonebraker Live! - New York City 2013

Integration: High-Performance Export

• Parallelized export

• New connectors: JDBC, Netezza, Vertica

Page 65: VoltDB - Stonebraker Live! - New York City 2013

Integration: Client Library Updates

• New PHP Client

• Node.js client v1.0

• Go Client

• Coming soon: updated Erlang client


http://golang.org

Page 66: VoltDB - Stonebraker Live! - New York City 2013

Other Notable New Features

• Explain command

• CSV loader utility

• CSV snapshots

• New Administration CLI: voltadmin

– voltadmin save

– voltadmin restore

– voltadmin pause

– voltadmin resume

– voltadmin shutdown


Page 67: VoltDB - Stonebraker Live! - New York City 2013

More Samples Available for Download


http://voltdb.com/community/volt-labs.php

Page 68: VoltDB - Stonebraker Live! - New York City 2013

Volt University

• Portfolio of instructional content, classes, tools, and other resources to help developers build applications quickly

• Curriculum and supporting material range from beginner to advanced

• Three types of instruction:

– Volt University Online

– Volt University Classroom

– Volt Vanguard Certification


Page 69: VoltDB - Stonebraker Live! - New York City 2013

Summary: VoltDB v3.0

• Run faster: transactions at high velocity scale.

• Create faster: write and scale your ACID application.

• Learn faster: Volt Labs & Volt University

Page 70: VoltDB - Stonebraker Live! - New York City 2013

DOWNLOAD 3.0 at

www.voltdb.com

Imagine the Possibilities

Page 71: VoltDB - Stonebraker Live! - New York City 2013

More Information?

E-mail [email protected]

Visit our forums: http://community.voltdb.com/forum

Read the VoltDB “Getting Started Guide”: http://community.voltdb.com/docs/GettingStarted/index

Follow @VoltDB on Twitter


Page 72: VoltDB - Stonebraker Live! - New York City 2013

QUESTIONS?

Page 73: VoltDB - Stonebraker Live! - New York City 2013

THANK YOU