indexing big data 30,000 foot view of databases big data...
![Page 1: Indexing Big Data 30,000 Foot View of Databases Big data ...wsga.sandia.gov/docs/Bender.streaming.pdf · Tokutek’s high-performance MySQL and MongoDB. File System MySQL Database--](https://reader033.vdocuments.net/reader033/viewer/2022050500/5f92d0b952f02f4db969db5b/html5/thumbnails/1.jpg)
Indexing Big Data
Michael A. Bender
30,000 Foot View of Databases
[Figure: the big data problem. Data is ingested, organized on disks, and queried; diagram labels include “oy vey”, “??????”, “???”, “365”, and “42”.]
Goal: Index big data so that it can be queried quickly.
I’ll focus on streaming/indexing. I’m interested in learning how it can help with graph analysis.
Setting of Talk-- “Type II streaming”
Type II streaming: Collect and store streams of data to answer queries.
Data is stored on disks.
• Dealing with disk latency has similarities to dealing with network latency.
We’ll discuss indexing.
• We want to store data so that it can be queried efficiently.
• To discuss: relationship with graphs.
Sorry it didn’t render.
For on-disk data, one traditionally sees funny tradeoffs among data ingestion speed, query speed, and data freshness.
[Figure: ingest data, organize data on disks, query your data; diagram labels include “oy vey”, “??????”, “???”, “365”, and “42”.]
Don’t Thrash: How to Cache Your Hash in Flash
Funny tradeoff in ingestion, querying, freshness
[Figure: data ingestion feeds data indexing; the query processor takes queries (“???”) and returns answers (“42”).]
• “Select queries were slow until I added an index onto the timestamp field... Adding the index really helped our reporting, BUT now the inserts are taking forever.”
  ‣ Comment on mysqlperformanceblog.com
• “I'm trying to create indexes on a table with 308 million rows. It took ~20 minutes to load the table but 10 days to build indexes on it.”
  ‣ MySQL bug #9544
• “They indexed their tables, and indexed them well,
  And lo, did the queries run quick!
  But that wasn’t the last of their troubles, to tell–
  Their insertions, like treacle, ran thick.”
  ‣ Not from Alice in Wonderland by Lewis Carroll
Tradeoffs come from different ways to organize data on disk
• Like a librarian? Fast to find stuff. Slow to add stuff. (“Indexing”)
• Like a teenager? Fast to add stuff. Slow to find stuff. (“Logging”)
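The librarian/teenager contrast can be sketched in a few lines of Python. This is only an in-memory illustration under simple assumptions, not how any database lays out data on disk; the class names `Log` and `SortedIndex` are made up for this sketch.

```python
import bisect


class Log:
    """'Teenager' layout: append in arrival order, scan to find."""

    def __init__(self):
        self.items = []

    def add(self, key):
        self.items.append(key)            # cheap: drop it at the end

    def find(self, key):
        return key in self.items          # expensive: look through everything


class SortedIndex:
    """'Librarian' layout: keep keys sorted, binary-search to find."""

    def __init__(self):
        self.items = []

    def add(self, key):
        # Expensive: later keys are shifted to make room for the new one.
        bisect.insort(self.items, key)

    def find(self, key):
        # Cheap: O(log N) comparisons via binary search.
        i = bisect.bisect_left(self.items, key)
        return i < len(self.items) and self.items[i] == key


log, index = Log(), SortedIndex()
for k in [365, 42, 7, 1000]:
    log.add(k)
    index.add(k)

print(log.items)     # arrival order
print(index.items)   # sorted order
```

Both structures answer the same questions; they differ only in whether the work is paid when data arrives or when it is queried.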
This talk: we don’t need tradeoffs
Write-optimized data structures:
• Faster indexing (10x-100x)
• Faster queries
• Fresh data
These structures efficiently scale to very big data sizes.
[Figure labels: Fractal-tree® index, LSM tree, Bɛ-tree]
Our algorithmic work appears in two commercial products
Tokutek’s high-performance MySQL and MongoDB.
[Figure: two software stacks over Disk/SSD.]
• TokuDB: Application; MySQL Database (SQL processing, query optimization); libFT; File System
• TokuMX: Application; Standard MongoDB (drivers, query language, and data model); libFT; File System
The Fractal Tree engine implements the persistent structures for storing data on disk.
Write-Optimized Data Structures
Would you like them with an algorithmic performance model?
An algorithmic performance model
How computation works:
• Data is transferred in blocks between RAM and disk.
• The number of block transfers dominates the running time.
Goal: Minimize # of block transfers
• Performance bounds are parameterized by block size B, memory size M, data size N.
[Figure: RAM of size M and disk, exchanging blocks of size B.]
[Aggarwal+Vitter ’88]
Memory and disk access times
Disks: ~6 milliseconds per access.
RAM: ~60 nanoseconds per access.
Analogy:
• disk = walking speed of the giant tortoise (0.3 mph)
• RAM = escape velocity from earth (25,000 mph)
Analogy:
• disk = elevation of Sandia peak
• RAM = height of a watermelon
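As a rough check on the size of the gap, using only the two access times and the two speeds quoted on this slide:

```latex
\[
\frac{t_{\text{disk}}}{t_{\text{RAM}}}
  = \frac{6\times 10^{-3}\,\text{s}}{60\times 10^{-9}\,\text{s}}
  = 10^{5},
\qquad
\frac{25{,}000\ \text{mph}}{0.3\ \text{mph}} \approx 8.3\times 10^{4}.
\]
```

One disk access costs about as much as 100,000 RAM accesses, which is the same order of magnitude as the tortoise and escape-velocity comparison.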
The traditional data structure for disks is the B-tree
Adding a new datum to an N-element B-tree uses $O(\log_B N)$ block transfers in the worst case.
(Even paying one block transfer is too expensive.)
[Figure: a B-tree of height $O(\log_B N)$.]
Write-optimized data structures perform better

| | B-tree | Some write-optimized structures |
|---|---|---|
| Insert/delete | $O(\log_B N) = O\left(\frac{\log N}{\log B}\right)$ | $O\left(\frac{\log N}{B}\right)$ |

• If B=1024, then insert speedup is $B/\log B \approx 100$.
• Hardware trends mean bigger B, bigger speedup.
• Less than 1 I/O per insert.
Data structures: [O'Neil, Cheng, Gawlick, O'Neil 96], [Buchsbaum, Goldwasser, Venkatasubramanian, Westbrook 00], [Arge 03], [Graefe 03], [Brodal, Fagerberg 03], [Bender, Farach, Fineman, Fogel, Kuszmaul, Nelson 07], [Brodal, Demaine, Fineman, Iacono, Langerman, Munro 10], [Spillane, Shetty, Zadok, Archak, Dixit 11].
Systems: BigTable, Cassandra, HBase, LevelDB, TokuDB.
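The B=1024 speedup figure on this slide follows from dividing the two insert bounds. The short calculation below assumes base-2 logarithms and ignores the constant factors hidden by the O-notation:

```latex
\[
\frac{\text{B-tree insert cost}}{\text{write-optimized insert cost}}
  \sim \frac{(\log N)/(\log B)}{(\log N)/B}
  = \frac{B}{\log B},
\qquad
B = 1024 \;\Rightarrow\; \frac{1024}{\log_2 1024} = \frac{1024}{10} \approx 100 .
\]
```

The $\log N$ factor cancels, so the speedup depends only on the block size B, which is why a bigger B gives a bigger speedup.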
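Optimal Search-Insert Tradeoff [Brodal, Fagerberg 03]

| | insert | point query |
|---|---|---|
| Optimal tradeoff (function of ɛ=0...1) | $O\left(\frac{\log_{1+B^{\varepsilon}} N}{B^{1-\varepsilon}}\right)$ | $O\left(\log_{1+B^{\varepsilon}} N\right)$ |
| B-tree (ɛ=1) | $O(\log_B N)$ | $O(\log_B N)$ |
| ɛ=1/2 | $O\left(\frac{\log_B N}{\sqrt{B}}\right)$ | $O(\log_B N)$ |
| ɛ=0 | $O\left(\frac{\log N}{B}\right)$ | $O(\log N)$ |

[Annotation on the insert column: 10x-100x faster inserts]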
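To get a feel for the tradeoff, the two bounds can be evaluated numerically. The sketch below drops all constant factors, so the numbers are only relative; the function name `tradeoff_costs` and the sizes N and B are illustrative choices, not values from the talk.

```python
import math


def tradeoff_costs(n, b, eps):
    """Return (insert_cost, point_query_cost) in block transfers, constants dropped.

    Uses the shape of the tradeoff quoted on the slide:
      insert      ~ log_{1 + B^eps}(N) / B^(1 - eps)
      point query ~ log_{1 + B^eps}(N)
    """
    height = math.log(n) / math.log(1 + b ** eps)
    return height / b ** (1 - eps), height


N, B = 10**9, 1024   # illustrative sizes only
for eps in (1.0, 0.5, 0.0):
    ins, qry = tradeoff_costs(N, B, eps)
    print(f"eps={eps}: insert ~ {ins:.4f}, point query ~ {qry:.2f}")
```

With these sizes, moving from ɛ=1 to ɛ=1/2 cuts the insert cost by more than a factor of ten while the point-query cost at most roughly doubles; moving on to ɛ=0 makes inserts cheaper still but point queries noticeably slower.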
Illustration of Optimal Tradeoff [Brodal, Fagerberg 03]
[Figure: point-query speed (slow to fast) plotted against insert speed (slow to fast). Labels: B-tree, Logging, Optimal Curve, Target of opportunity.]
Insertions improve by 10x-100x with almost no loss of point-query performance.
Performance of write-optimized data structures
Write performance on large data
[Figure: two benchmark plots, labeled MongoDB and MySQL, with annotations “16x faster” and “>100x faster”.]
Later: why fast indexing leads to faster queries.
How to Build Write-Optimized Structures
write-optimized
A simple write-optimized structure
$O(\log N)$ queries and $O((\log N)/B)$ inserts:
• A balanced binary tree with buffers of size B
Inserts + deletes:
• Send insert/delete messages down from the root and store them in buffers.
• When a buffer fills up, flush.
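The structure described on this slide can be sketched in memory as follows. This is a simplified illustration, not the Fractal Tree implementation: it assumes integer keys in a fixed range `[0, universe)` so that the binary tree is balanced by construction and never needs rebalancing, and the names `Node`, `BufferedTree`, and `flushes` are made up for this sketch. Each flush stands in for O(1) block transfers.

```python
class Node:
    def __init__(self, lo, hi, leaf_width):
        self.buffer = []                  # pending (key, op, value) messages, oldest first
        if hi - lo <= leaf_width:
            self.left = self.right = None
            self.data = {}                # leaf: fully applied key/value pairs
        else:
            self.mid = (lo + hi) // 2
            self.left = Node(lo, self.mid, leaf_width)
            self.right = Node(self.mid, hi, leaf_width)


class BufferedTree:
    def __init__(self, universe=1024, buffer_size=8, leaf_width=16):
        self.B = buffer_size
        self.root = Node(0, universe, leaf_width)
        self.flushes = 0                  # proxy for block transfers spent on writes

    def insert(self, key, value):
        self._send(self.root, (key, "put", value))

    def delete(self, key):
        self._send(self.root, (key, "del", None))

    def _send(self, node, msg):
        if node.left is None:             # leaf: apply the message immediately
            key, op, value = msg
            if op == "put":
                node.data[key] = value
            else:
                node.data.pop(key, None)
            return
        node.buffer.append(msg)
        if len(node.buffer) >= self.B:    # buffer full: flush one level down
            self._flush(node)

    def _flush(self, node):
        self.flushes += 1
        pending, node.buffer = node.buffer, []
        for msg in pending:               # oldest first, so message order is preserved
            child = node.left if msg[0] < node.mid else node.right
            self._send(child, msg)

    def get(self, key):
        """Check every buffer on one root-to-leaf path; the newest message wins."""
        node = self.root
        while node.left is not None:
            for k, op, value in reversed(node.buffer):
                if k == key:
                    return value if op == "put" else None
            node = node.left if key < node.mid else node.right
        return node.data.get(key)


tree = BufferedTree()
for k in range(0, 1000, 3):
    tree.insert(k, 2 * k)
tree.delete(300)
print(tree.get(303), tree.get(300), tree.flushes)
```

Messages higher in the tree are always newer than messages for the same key lower down, which is why a query can stop at the first matching message it meets on the way down.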
Analysis of writes
An insert or delete costs amortized $O((\log N)/B)$.
• A buffer flush costs $O(1)$ and sends B elements down one level.
• It costs $O(1/B)$ to send an element down one level of the tree.
• There are $O(\log N)$ levels in a tree.
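Multiplying the per-level cost by the number of levels gives the amortized bound. The numeric example uses illustrative sizes ($N = 10^9$, $B = 1024$, base-2 logarithm) that are not from the talk:

```latex
\[
\underbrace{\frac{O(1)\ \text{I/Os per flush}}{B\ \text{elements per flush}}}_{O(1/B)\ \text{per element per level}}
\times
\underbrace{O(\log N)}_{\text{levels}}
= O\!\left(\frac{\log N}{B}\right),
\qquad
\frac{\log_2 10^{9}}{1024} \approx \frac{30}{1024} \approx 0.03 .
\]
```

Under these assumptions an insert costs a small fraction of one I/O on average, consistent with the earlier claim of less than 1 I/O per insert.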
Difficulty of Key Accesses
Analysis of point queries
To search:
• Examine each buffer along a single root-to-leaf path.
• This costs $O(\log N)$.
Obtaining optimal point queries + very fast inserts
Point queries cost $O(\log_{\sqrt{B}} N) = O(\log_B N)$
• This is the tree height.
Inserts cost $O((\log_B N)/\sqrt{B})$
• Each flush costs $O(1)$ I/Os and flushes $\sqrt{B}$ elements.
[Figure: a tree with fanout √B; figure labels: √B, B.]
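Both bounds on this slide follow from a change of logarithm base and the same per-level accounting used for the binary-tree version:

```latex
\[
\log_{\sqrt{B}} N
  = \frac{\log N}{\log \sqrt{B}}
  = \frac{\log N}{\tfrac{1}{2}\log B}
  = 2\log_B N
  = O(\log_B N),
\]
\[
\text{insert cost}
  = \frac{O(1)\ \text{I/Os per flush}}{\sqrt{B}\ \text{elements per flush}}
    \times O(\log_B N)\ \text{levels}
  = O\!\left(\frac{\log_B N}{\sqrt{B}}\right).
\]
```

Shrinking the fanout from B to √B only doubles the tree height, but it frees enough room in each node to move √B elements per flush.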
Powerful and Explainable
Write-optimized data structures are very powerful. They are also not hard to teach in a standard algorithms course.
What the world looks like
Insert/point query asymmetry
• Inserts can be fast: >50K high-entropy writes/sec/disk.
• Point queries are necessarily slow: <200 high-entropy reads/sec/disk.
We are used to reads and writes having about the same cost, but writing is easier than reading.
Reading is hard. Writing is easier.
The right read optimization is write optimization
The right index makes queries run fast.
• Write-optimized structures maintain indexes efficiently.
Fast writing is a currency we use to accelerate queries. Better indexing means faster queries.
[Figure: data ingestion feeds data indexing; the query processor takes queries (“???”) and returns answers (“42”).]
The right read optimization is write optimization
Adding more indexes leads to faster queries.
Index maintenance has been notoriously slow.
If we can insert faster, we can afford to maintain more indexes (i.e., organize the data in more ways).
[Figure: plots labeled “query I/O load on a production server” and “indexing rate”, comparing MongoDB and TokuMX.]
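A toy example of why more indexes help queries but cost more per insert. The rows, field names, and function names here are invented for illustration and have nothing to do with MongoDB's or TokuMX's actual index machinery:

```python
from collections import defaultdict

rows = [
    {"id": 1, "user": "ann", "day": 3},
    {"id": 2, "user": "bob", "day": 3},
    {"id": 3, "user": "ann", "day": 5},
]

# One index per way we want to look the data up.
by_user = defaultdict(list)
by_day = defaultdict(list)


def index_row(row):
    """Every insert must also update every index: this is the maintenance cost."""
    by_user[row["user"]].append(row["id"])
    by_day[row["day"]].append(row["id"])


for r in rows:
    index_row(r)


def ids_for_user_scan(user):
    # No index: examine every row.
    return [r["id"] for r in rows if r["user"] == user]


def ids_for_user_indexed(user):
    # With an index: go straight to the matching ids.
    return by_user.get(user, [])


print(ids_for_user_scan("ann"), ids_for_user_indexed("ann"))
```

Each additional index adds one more update to `index_row`, so the faster a single index update is, the more indexes a system can afford to keep current.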
Summary Slide
How do write-optimized data structures help us with graph queries?
We don’t need to trade off ingestion speed, query speed, and data freshness.
The right read optimization is write optimization.
Can we use write-optimization on large Sandia machines?
There are similar challenges with network and I/O latency.