![Page 1: Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB](https://reader035.vdocuments.net/reader035/viewer/2022062712/55d50fc5bb61eb632e8b45aa/html5/thumbnails/1.jpg)
Stonebraker Live! Navigating the Database Universe
VoltDB presents
![Page 2: Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB](https://reader035.vdocuments.net/reader035/viewer/2022062712/55d50fc5bb61eb632e8b45aa/html5/thumbnails/2.jpg)
SCOTT JARR
Co-founder and Chief Strategy Officer
![Page 3: Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB](https://reader035.vdocuments.net/reader035/viewer/2022062712/55d50fc5bb61eb632e8b45aa/html5/thumbnails/3.jpg)
• The (proper) design of DBMSs
– Presented by Dr. Michael Stonebraker, Co-founder
• The database universe
– Presented by Scott Jarr, Co-founder and Chief Strategy Officer
• Introducing VoltDB 3.0
– Presented by Mark Hydar, VP of Market Technology and Strategy
Agenda
![Page 4: Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB](https://reader035.vdocuments.net/reader035/viewer/2022062712/55d50fc5bb61eb632e8b45aa/html5/thumbnails/4.jpg)
• “Big Data” is a rare, transformative market
• Velocity is becoming the cornerstone
• Specialized databases (working together) are the answer
• Products must provide tangible customer value... Fast
We Believe…
![Page 5: Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB](https://reader035.vdocuments.net/reader035/viewer/2022062712/55d50fc5bb61eb632e8b45aa/html5/thumbnails/5.jpg)
THE (PROPER) DESIGN OF THE DBMS
Dr. Michael Stonebraker
![Page 6: Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB](https://reader035.vdocuments.net/reader035/viewer/2022062712/55d50fc5bb61eb632e8b45aa/html5/thumbnails/6.jpg)
Lessons from 40 Years of Database Design
1. Get the user interaction right
– Bet on a small number of easy-to-understand constructs
– Plus standards
2. Get the implementation right
– Bet on a small number of easy-to-understand constructs
3. One size does not fit all
– At least not if you want fast, big or complex
“Those who don’t learn from history are destined to repeat it.”
- Winston Churchill
![Page 7: Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB](https://reader035.vdocuments.net/reader035/viewer/2022062712/55d50fc5bb61eb632e8b45aa/html5/thumbnails/7.jpg)
#1: Get the User Interaction Right
Winner: RDBMS
• Simple data model (tables)
• Simple access language (SQL)
• ACID (transactions)
• Standards (SQL)
Loser: CODASYL
• Complicated data model (records; participate in “sets”; set has one owner and, perhaps, many members, etc.)
• Messy access language (sea of “cursors”; some -- but not all -- move on every command, navigation programming)
Loser: OODBs
• Complex data model (hierarchical records, pointers, sets, arrays, etc.)
• Complex access language (navigation, through this sea)
• No standards
Historical Lesson: RDBMS vs. CODASYL vs. OODB
![Page 8: Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB](https://reader035.vdocuments.net/reader035/viewer/2022062712/55d50fc5bb61eb632e8b45aa/html5/thumbnails/8.jpg)
Interaction Take Away − Simple is Good
• ACID was easy for people to understand
• SQL provided a standard, high-level language and made people productive (transportable skills)
![Page 9: Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB](https://reader035.vdocuments.net/reader035/viewer/2022062712/55d50fc5bb61eb632e8b45aa/html5/thumbnails/9.jpg)
#2: Get the Implementation Right
• Leverage a few simple ideas: Early relational implementations
– System R storage system dropped links
– Views (protection, schema modification, performance)
– Cost-based optimizer
• Leverage a few simple ideas: Postgres
– User-defined data types and functions (adopted by most everybody)
– Rules/triggers
– No-overwrite storage
• Leverage a few simple ideas: Vertica
– Store data by column
– Compressed up the ging gong
– Parallel load without compromising ACID
Historical Winners
![Page 10: Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB](https://reader035.vdocuments.net/reader035/viewer/2022062712/55d50fc5bb61eb632e8b45aa/html5/thumbnails/10.jpg)
#3: One Size Does NOT Fit All
• OSFA is an old technology with hundreds of bags hanging off it
• It breaks 100% of the time when under load
• Load = size or speed or complexity
• Load is increasing at a startling rate
• Purpose-built will exceed by 10x to 100x
• History has not been completely written yet…but let’s look at VoltDB as an example
“…specialized systems can each be a factor of 50 faster than the single ‘one size fits all’ system…A factor of 50 is nothing to sneeze at.”
- My Top 10 Assertions About Data Warehouses, 2010
![Page 11: Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB](https://reader035.vdocuments.net/reader035/viewer/2022062712/55d50fc5bb61eb632e8b45aa/html5/thumbnails/11.jpg)
Example: VoltDB
• Get the interface right
– SQL
– ACID
• Implementation: Leverage a few simple ideas
– Main memory
– Stored procedures
– Deterministic scheduling
• Specialization
– OLTP focus allowed for above implementation choices
![Page 12: Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB](https://reader035.vdocuments.net/reader035/viewer/2022062712/55d50fc5bb61eb632e8b45aa/html5/thumbnails/12.jpg)
Proving the Theory
• Challenge: OLTP performance
– TPC-C CPU cycles
– On the Shore DBMS prototype
– Elephants should be similar
Recovery 24%
Latching 24%
Buffer Pool 24%
Locking 24%
Useful Work 4%
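The breakdown above implies a ceiling on what removing overhead can buy. The sketch below is plain arithmetic on the slide's percentages; the function name `max_speedup` is made up for illustration, and it assumes an overhead can be removed entirely without adding any new cost, which is an idealization.

```python
# CPU cycle shares from the slide (percent of cycles, TPC-C on Shore).
profile = {
    "recovery": 24,
    "latching": 24,
    "buffer_pool": 24,
    "locking": 24,
    "useful_work": 4,
}


def max_speedup(profile, removed):
    """Idealized upper bound on speedup if the named overheads cost zero cycles."""
    total = sum(profile.values())
    remaining = total - sum(profile[name] for name in removed)
    return total / remaining


# Removing all four overheads leaves only the useful work.
all_overheads = ["recovery", "latching", "buffer_pool", "locking"]
best_case = max_speedup(profile, all_overheads)

# Removing a single overhead helps far less than removing all of them.
latching_only = max_speedup(profile, ["latching"])
```

Under these assumptions `best_case` works out to 100 / 4 = 25, while removing any single overhead on its own gives 100 / 76, roughly 1.3. That gap is the argument for attacking all four together.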
![Page 13: Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB](https://reader035.vdocuments.net/reader035/viewer/2022062712/55d50fc5bb61eb632e8b45aa/html5/thumbnails/13.jpg)
Single Threaded
• Gets rid of the latching problem
• What about Multicore?
– Divide the memory on an N-core node so it looks like N single-core nodes
– Which are single threaded…
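A minimal sketch of the idea above: an N-core node is treated as N independent partitions, each owning its own slice of data and running its transactions one at a time. The class names and the hash-based routing rule are assumptions for illustration, not VoltDB's implementation, and the partitions are drained sequentially here rather than on one thread per core.

```python
from collections import deque


class Partition:
    """One single-threaded slice of memory; only its own loop touches `data`."""

    def __init__(self):
        self.data = {}
        self.queue = deque()

    def run_pending(self):
        # Transactions run one at a time to completion,
        # so no latches are needed to guard `data`.
        while self.queue:
            txn = self.queue.popleft()
            txn(self.data)


class Node:
    """An N-core node modeled as N independent single-core partitions."""

    def __init__(self, cores):
        self.partitions = [Partition() for _ in range(cores)]

    def partition_for(self, key):
        # Hypothetical routing rule: hash the partitioning key.
        return self.partitions[hash(key) % len(self.partitions)]

    def submit(self, key, txn):
        self.partition_for(key).queue.append(txn)

    def run_all(self):
        # A real system would drive each partition from its own thread.
        for partition in self.partitions:
            partition.run_pending()


node = Node(cores=4)
for account_id in range(8):
    node.submit(account_id, lambda data, k=account_id: data.update({k: k * 10}))
node.run_all()
```

Because every key maps to exactly one partition, two transactions never touch the same data structure concurrently, which is what removes the latching cost.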
![Page 14: Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB](https://reader035.vdocuments.net/reader035/viewer/2022062712/55d50fc5bb61eb632e8b45aa/html5/thumbnails/14.jpg)
Implementation Construct #1: Main Memory
• Main memory format for data
– Disk format gets you buffer pool overhead
• What happens if data doesn’t fit?
– Return to disk-buffer pool architecture (slow)
– Anti-caching
• Main memory format for data
• When memory fills up, then bundle together elderly tuples and write them out
• Run a transaction in “sleuth mode”; find the required records and move to main memory (and pin)
• Run Xact normally
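The anti-caching steps above can be sketched roughly as follows. This is a toy model under stated assumptions: `AntiCacheTable` is a hypothetical name, a plain dict stands in for disk, "elderly" is approximated by least-recently-used order, and the caller passes the needed keys explicitly instead of the system discovering them with a real "sleuth mode" pass.

```python
from collections import OrderedDict


class AntiCacheTable:
    """Toy anti-cache: memory is primary storage, cold tuples spill to 'disk'."""

    def __init__(self, max_in_memory):
        self.max_in_memory = max_in_memory
        self.memory = OrderedDict()  # key -> tuple, least recently used first
        self.disk = {}               # stand-in for evicted blocks on disk
        self.pinned = set()

    def _evict_cold(self):
        # When memory fills up, write out the oldest unpinned tuples.
        while len(self.memory) > self.max_in_memory:
            victim = next((k for k in self.memory if k not in self.pinned), None)
            if victim is None:
                break  # everything left is pinned
            self.disk[victim] = self.memory.pop(victim)

    def put(self, key, value):
        self.disk.pop(key, None)
        self.memory[key] = value
        self.memory.move_to_end(key)
        self._evict_cold()

    def run(self, keys, txn):
        # Stand-in for the "sleuth" pass: bring required records
        # back into main memory and pin them.
        for key in keys:
            if key in self.disk:
                self.memory[key] = self.disk.pop(key)
            if key in self.memory:
                self.memory.move_to_end(key)
                self.pinned.add(key)
        self._evict_cold()
        try:
            # Run the transaction normally against main memory.
            return txn(self.memory)
        finally:
            self.pinned.difference_update(keys)


table = AntiCacheTable(max_in_memory=2)
for name, balance in [("a", 1), ("b", 2), ("c", 3)]:
    table.put(name, balance)
result = table.run(["a"], lambda rows: rows["a"] + 1)
```

The point of the design is that the in-memory format stays the primary one; disk only ever receives tuples that have gone cold, so hot transactions never pay buffer pool overhead.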
![Page 15: Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB](https://reader035.vdocuments.net/reader035/viewer/2022062712/55d50fc5bb61eb632e8b45aa/html5/thumbnails/15.jpg)
Implementation Construct #2: Stored Procedures
• Round trip to the DBMS is expensive
– Do it once per transaction
– Not once per command
– Or even once per cursor move
• Ad-hoc queries supported
– Turn them into dynamic stored procedures
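To make the round-trip argument concrete, here is a small sketch that counts client/server round trips for the same transfer done two ways. `CountingServer`, its `execute` and `call` methods, and the `transfer` procedure are all invented for this illustration; they are not a real client API.

```python
class CountingServer:
    """Toy server that counts how many round trips a client makes."""

    def __init__(self):
        self.round_trips = 0
        self.accounts = {"alice": 100, "bob": 50}
        self.procedures = {"transfer": self._transfer}

    def execute(self, op, *args):
        # One command per round trip.
        self.round_trips += 1
        if op == "read":
            return self.accounts[args[0]]
        if op == "write":
            self.accounts[args[0]] = args[1]
            return None
        raise ValueError(f"unknown op: {op}")

    def call(self, name, *args):
        # The whole transaction runs server-side in a single round trip.
        self.round_trips += 1
        return self.procedures[name](*args)

    def _transfer(self, src, dst, amount):
        self.accounts[src] -= amount
        self.accounts[dst] += amount


def transfer_per_command(server, src, dst, amount):
    src_balance = server.execute("read", src)
    dst_balance = server.execute("read", dst)
    server.execute("write", src, src_balance - amount)
    server.execute("write", dst, dst_balance + amount)


chatty = CountingServer()
transfer_per_command(chatty, "alice", "bob", 10)

batched = CountingServer()
batched.call("transfer", "alice", "bob", 10)
```

Both servers end in the same state, but the per-command version pays four round trips where the stored procedure pays one.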
![Page 16: Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB](https://reader035.vdocuments.net/reader035/viewer/2022062712/55d50fc5bb61eb632e8b45aa/html5/thumbnails/16.jpg)
Implementation Construct #3: Deterministic Scheduling
• Transactions are ordered and run to completion
– No locking
• Active-active replication (HA)
– Run transaction at all replicas – in the same pre-determined order
• What about a cluster-wide power failure?
– Asynchronous checkpointing
– With a command log
– Wildly faster than data logging
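The following sketch illustrates why a fixed transaction order makes both active-active replication and command logging work: if every replica applies the same deterministic commands in the same order, the replicas end in the same state, and recovery only needs a checkpoint plus the commands logged after it. The command format, `Replica`, `run_cluster`, and `recover` are assumptions made up for this example.

```python
class Replica:
    """Applies commands one at a time, in the order given, with no locking."""

    def __init__(self, state=None):
        self.state = dict(state or {})

    def apply(self, command):
        name, key, amount = command
        if name == "deposit":
            self.state[key] = self.state.get(key, 0) + amount
        elif name == "withdraw":
            # Deterministic rule: an overdraft is rejected on every replica.
            if self.state.get(key, 0) >= amount:
                self.state[key] -= amount
        else:
            raise ValueError(f"unknown command: {name}")


def run_cluster(commands, replica_count):
    replicas = [Replica() for _ in range(replica_count)]
    command_log = []
    for command in commands:
        # Log the command itself, not the data it changes.
        command_log.append(command)
        for replica in replicas:
            replica.apply(command)
    return replicas, command_log


def recover(checkpoint_state, log_tail):
    """Rebuild state from a checkpoint plus the commands logged after it."""
    replica = Replica(checkpoint_state)
    for command in log_tail:
        replica.apply(command)
    return replica


commands = [
    ("deposit", "a", 100),
    ("withdraw", "a", 30),
    ("withdraw", "a", 100),  # rejected: insufficient funds
    ("deposit", "b", 5),
]
replicas, command_log = run_cluster(commands, replica_count=3)

# Pretend a checkpoint was taken after the first two commands.
checkpoint = {"a": 70}
recovered = recover(checkpoint, command_log[2:])
```

A log entry here is one small tuple per transaction regardless of how many records the transaction touches, which is the intuition behind command logging being cheaper than logging the data changes.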
![Page 17: Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB](https://reader035.vdocuments.net/reader035/viewer/2022062712/55d50fc5bb61eb632e8b45aa/html5/thumbnails/17.jpg)
Result of Design Principles: VoltDB Example
• Good interface decisions – made developers more productive
– SQL & ACID
• Leveraging a few simple implementation ideas – made VoltDB wicked fast
– Main memory
– Stored procedures
– Deterministic scheduling
![Page 18: Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB](https://reader035.vdocuments.net/reader035/viewer/2022062712/55d50fc5bb61eb632e8b45aa/html5/thumbnails/18.jpg)
Proving the Theory
• Answer: OLTP performance
– 3 million transactions per second
– 7x Cassandra
– 15 million SQL statements per second
– 100,000+ transactions per commodity server
“…we are heading toward a world with at least 5 (and probably more) specialized engines and the death of the ‘one size fits all’ legacy systems.”
- The End of an Architectural Era (It’s Time for a Complete Rewrite), 2007
![Page 19: Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB](https://reader035.vdocuments.net/reader035/viewer/2022062712/55d50fc5bb61eb632e8b45aa/html5/thumbnails/19.jpg)
THE DATABASE UNIVERSE
Scott Jarr
![Page 20: Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB](https://reader035.vdocuments.net/reader035/viewer/2022062712/55d50fc5bb61eb632e8b45aa/html5/thumbnails/20.jpg)
Technology Meets the Market
Believe
– “Big Data” is a rare, transformative market
– Velocity is becoming the cornerstone
– Specialized databases (working together) are the answer
– Products must provide tangible customer value… Fast
Observations
– Noisy, crowded and new – kinda like Christmas shopping at the mall
– Everyone wants to understand where the pieces fit
– Analysts build maps on technology NOT use cases
What we need is…
![Page 21: Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB](https://reader035.vdocuments.net/reader035/viewer/2022062712/55d50fc5bb61eb632e8b45aa/html5/thumbnails/21.jpg)
Data Value Chain
• Interactive (milliseconds): place trade, serve ad, enrich stream, examine packet, approve trans.
• Real-time Analytics (hundredths of seconds): calculate risk, leaderboard, aggregate, count
• Record Lookup (second(s)): retrieve click stream, show orders
• Historical Analytics (minutes): backtest algo, BI, daily reports
• Exploratory Analytics (hours): algo discovery, log analysis, fraud pattern match
Axis: Age of Data
![Page 22: Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB](https://reader035.vdocuments.net/reader035/viewer/2022062712/55d50fc5bb61eb632e8b45aa/html5/thumbnails/22.jpg)
Data Value Chain
[Figure: the same five stages and examples as the previous slide, overlaid with two curves, Value of Individual Data Item and Aggregate Data Value. Vertical axis: Data Value. Horizontal axis: Age of Data.]
![Page 23: Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB](https://reader035.vdocuments.net/reader035/viewer/2022062712/55d50fc5bb61eb632e8b45aa/html5/thumbnails/23.jpg)
The Database Universe
[Figure: Traditional RDBMS placed on a two-axis map. Horizontal axis: Interactive, Real-time Analytics, Record Lookup, Historical Analytics, Exploratory Analytics, spanning Transactional to Analytic, with the Data Value curves (Value of Individual Data Item, Aggregate Data Value) shown along it. Vertical axis: Application Complexity, from Simple/Slow/Small to Fast/Complex/Large.]
![Page 24: Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB](https://reader035.vdocuments.net/reader035/viewer/2022062712/55d50fc5bb61eb632e8b45aa/html5/thumbnails/24.jpg)
The Database Universe
[Figure: the same map with more regions labeled: Traditional RDBMS, Data Warehouse, Hadoop etc., NoSQL, NewSQL, and Velocity. Horizontal axis: Interactive, Real-time Analytics, Record Lookup, Historical Analytics, Exploratory Analytics, spanning Transactional to Analytic, with the Data Value curves (Value of Individual Data Item, Aggregate Data Value) shown along it. Vertical axis: Application Complexity, from Simple/Slow/Small to Fast/Complex/Large.]
![Page 25: Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB](https://reader035.vdocuments.net/reader035/viewer/2022062712/55d50fc5bb61eb632e8b45aa/html5/thumbnails/25.jpg)
Closed-loop Big Data
Interactive & Real-time Analytics
Historical Reports & Analytics
Exploratory Analytics
Inputs: logins, sensors, impressions, orders, authorizations, clicks, trades
![Page 26: Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB](https://reader035.vdocuments.net/reader035/viewer/2022062712/55d50fc5bb61eb632e8b45aa/html5/thumbnails/26.jpg)
Closed-loop Big Data
• Make the most informed decision every time there is an interaction
• Real-time decisions are informed by operational analytics and past knowledge
Knowledge
Interactive & Real-time Analytics
Historical Reports & Analytics
Exploratory Analytics
Inputs: logins, sensors, impressions, orders, authorizations, clicks, trades
![Page 27: Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB](https://reader035.vdocuments.net/reader035/viewer/2022062712/55d50fc5bb61eb632e8b45aa/html5/thumbnails/27.jpg)
The Velocity Use Case
What’s it look like?
– High throughput, relentless data feeds
– Fast decisions on high-value data
– Real-time, operational analytics present immediate visibility
What’s the big deal?
– Batch visibility converts to real time = immediate business impact
– Decisions made at time of event = higher impact decisions with immediate returns
– Ability to ingest and manage massive amounts of data = business differentiation and disruption
![Page 28: Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB](https://reader035.vdocuments.net/reader035/viewer/2022062712/55d50fc5bb61eb632e8b45aa/html5/thumbnails/28.jpg)
HELLO 3.0!
Mark Hydar
![Page 29: Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB](https://reader035.vdocuments.net/reader035/viewer/2022062712/55d50fc5bb61eb632e8b45aa/html5/thumbnails/29.jpg)
Introducing VoltDB 3.0
• Available now!
– Both commercial and open source offerings
– www.voltdb.com/downloads
• Key improvements
– Even faster
– Easier to build high-velocity applications
– Expanded reach across developers and applications
– Extensible to integrate with existing data infrastructure
![Page 30: Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB](https://reader035.vdocuments.net/reader035/viewer/2022062712/55d50fc5bb61eb632e8b45aa/html5/thumbnails/30.jpg)
Latency and Throughput, 50-50 Read/Write Workload
[Figure: latency (ms) vs. TPS for VoltDB 3.0 and v2.8.4.1. Horizontal axis: TPS, 0 to 200,000. Vertical axis: latency, 0 to 16 ms.]
VoltDB 3.0 vs. v2.8.4.1
Key/Value 50/50 read/write workload
3 Node, K=1 Cluster
![Page 31: Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB](https://reader035.vdocuments.net/reader035/viewer/2022062712/55d50fc5bb61eb632e8b45aa/html5/thumbnails/31.jpg)
Read/Write Workload Latency/Throughput
[Figure: average latency (ms) vs. TPS for three workloads: 10% read/90% write, 50% read/50% write, 90% read/10% write. Horizontal axis: TPS, 0 to 350,000. Vertical axis: average latency, 0 to 9 ms.]
VoltDB 3.0
Key/Value various read/write workload
3 Node, K=1 Cluster
![Page 32: Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB](https://reader035.vdocuments.net/reader035/viewer/2022062712/55d50fc5bb61eb632e8b45aa/html5/thumbnails/32.jpg)
Faster: Ad Hoc SQL Performance
• Conversational SQL
• Thousands to 10,000+ ad hoc SQL transactions/second
• Single or multiple (batch) SQL statement transaction
![Page 33: Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB](https://reader035.vdocuments.net/reader035/viewer/2022062712/55d50fc5bb61eb632e8b45aa/html5/thumbnails/33.jpg)
Easier Development: New SQL Support
• SQL LIKE and NOT LIKE
• UNION
• Column Functions
• Counting function (leaderboard ranking queries)
• Ability to define index using column functions
![Page 34: Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB](https://reader035.vdocuments.net/reader035/viewer/2022062712/55d50fc5bb61eb632e8b45aa/html5/thumbnails/34.jpg)
• JSON values stored in a varchar column
• Field() column function
• Indexing on JSON elements
CREATE INDEX session_site_moderator
ON user_session_table (field(json_data, 'site'),
field(json_data, 'moderator'), username);
• New JSON sample in kit
Easier Development: JSON Support
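As a rough mental model of the index definition above, the sketch below builds a lookup structure keyed on two values pulled out of a JSON column plus an ordinary column. The Python `field` helper is a hypothetical stand-in for the column function on the slide, the sample rows are invented, and a plain dictionary only supports exact-match lookups, whereas a database index would typically be ordered.

```python
import json
from collections import defaultdict

# Invented sample rows: JSON text stored in a varchar-like column.
rows = [
    {"username": "ann", "json_data": json.dumps({"site": "forum", "moderator": "true"})},
    {"username": "bob", "json_data": json.dumps({"site": "forum", "moderator": "false"})},
    {"username": "cy", "json_data": json.dumps({"site": "wiki", "moderator": "true"})},
]


def field(json_text, name):
    """Hypothetical stand-in for field(): one top-level element of a JSON value."""
    return json.loads(json_text).get(name)


def build_index(rows):
    # Key order mirrors the slide's index: site, moderator, username.
    index = defaultdict(list)
    for position, row in enumerate(rows):
        key = (
            field(row["json_data"], "site"),
            field(row["json_data"], "moderator"),
            row["username"],
        )
        index[key].append(position)
    return index


index = build_index(rows)
matches = [rows[i]["username"] for i in index[("forum", "true", "ann")]]
```

The JSON is parsed once when the index is built, so a lookup on `site` and `moderator` does not need to re-parse every row's JSON text.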
![Page 35: Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB](https://reader035.vdocuments.net/reader035/viewer/2022062712/55d50fc5bb61eb632e8b45aa/html5/thumbnails/35.jpg)
Easier Development: Online Operations
• Ability to re-join a failed node to cluster with no impact to existing operations
• Online schema update
• No service window
![Page 36: Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB](https://reader035.vdocuments.net/reader035/viewer/2022062712/55d50fc5bb61eb632e8b45aa/html5/thumbnails/36.jpg)
Easier Development: Streamlined Development
• Elimination of project.xml
• VoltDB-specific configuration now defined in DDL
• Defaulting of deployment.xml
• New Volt Compiler CLI:
voltdb compile
![Page 37: Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB](https://reader035.vdocuments.net/reader035/viewer/2022062712/55d50fc5bb61eb632e8b45aa/html5/thumbnails/37.jpg)
Expanded Reach: Cloud-Friendly
• Reduce impact of variable node performance and latency
• Elimination of strict NTP configuration
• Scales to a large number of nodes
![Page 38: Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB](https://reader035.vdocuments.net/reader035/viewer/2022062712/55d50fc5bb61eb632e8b45aa/html5/thumbnails/38.jpg)
Integration: High-Performance Export
• Parallelized export
• New connectors: JDBC, Netezza, Vertica
![Page 39: Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB](https://reader035.vdocuments.net/reader035/viewer/2022062712/55d50fc5bb61eb632e8b45aa/html5/thumbnails/39.jpg)
Integration: Client Library Updates
• New PHP Client
• Node.js client v1.0
• Go Client
• Coming soon: updated Erlang client
http://golang.org
![Page 40: Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB](https://reader035.vdocuments.net/reader035/viewer/2022062712/55d50fc5bb61eb632e8b45aa/html5/thumbnails/40.jpg)
Other Notable New Features
• Explain command
• CSV loader utility
• CSV snapshots
• New Administration CLI: voltadmin
– voltadmin save
– voltadmin restore
– voltadmin pause
– voltadmin resume
– voltadmin shutdown
![Page 41: Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB](https://reader035.vdocuments.net/reader035/viewer/2022062712/55d50fc5bb61eb632e8b45aa/html5/thumbnails/41.jpg)
More Samples Available for Download
http://voltdb.com/community/volt-labs.php
![Page 42: Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB](https://reader035.vdocuments.net/reader035/viewer/2022062712/55d50fc5bb61eb632e8b45aa/html5/thumbnails/42.jpg)
Volt University
• Portfolio of instructional content, classes, tools, and other resources to help developers build applications quickly
• Curriculum and supporting material range from beginner to advanced
• Three types of instruction:
– Volt University Online
– Volt University Classroom
– Volt Vanguard Certification
![Page 43: Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB](https://reader035.vdocuments.net/reader035/viewer/2022062712/55d50fc5bb61eb632e8b45aa/html5/thumbnails/43.jpg)
Summary: VoltDB v3.0 Features
• Even faster
• Easier to build high-velocity applications
• Expanded reach across developers and applications
• Extensible to integrate with existing data infrastructure
• Volt Labs
• Volt University
VoltDB v3.0
![Page 44: Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB](https://reader035.vdocuments.net/reader035/viewer/2022062712/55d50fc5bb61eb632e8b45aa/html5/thumbnails/44.jpg)
DOWNLOAD 3.0 at
www.voltdb.com
Imagine the Possibilities
![Page 45: Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB](https://reader035.vdocuments.net/reader035/viewer/2022062712/55d50fc5bb61eb632e8b45aa/html5/thumbnails/45.jpg)
More Information?
E-mail [email protected]
Visit our forums: http://community.voltdb.com/forum
Read the VoltDB “Getting Started Guide”: http://community.voltdb.com/docs/GettingStarted/index
Follow @VoltDB on Twitter
![Page 46: Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB](https://reader035.vdocuments.net/reader035/viewer/2022062712/55d50fc5bb61eb632e8b45aa/html5/thumbnails/46.jpg)
QUESTIONS?
![Page 47: Big Data Cloud Meetup - Jan 29 2013 - Mike Stonebraker & Scott Jarr of VoltDB](https://reader035.vdocuments.net/reader035/viewer/2022062712/55d50fc5bb61eb632e8b45aa/html5/thumbnails/47.jpg)
THANK YOU