6.814/6.830 lecture 8
DESCRIPTION
6.814/6.830 Lecture 8. Memory Management. Column Representation Reduces Scan Time. Idea: Store each column in a separate file. Column Representation. Reads Just 3 Columns. Assuming each column is the same size, this cuts the bytes read from disk to 3/5 of a full-row scan.
![Page 1: 6.814/6.830 Lecture 8](https://reader036.vdocuments.net/reader036/viewer/2022062422/5681378f550346895d9f2ee4/html5/thumbnails/1.jpg)
6.814/6.830 Lecture 8
Column Stores
Memory Management
![Page 2: 6.814/6.830 Lecture 8](https://reader036.vdocuments.net/reader036/viewer/2022062422/5681378f550346895d9f2ee4/html5/thumbnails/2.jpg)
Recap: Column Representation Reduces Scan Time
• Idea: Store each column as a collection of sorted, compressed files
  – Only scan the columns that are needed
• Simple per-column compression (RLE/Delta/Dictionary/LZ)
  – Direct operation on compressed data
• Buffer writes in write-optimized store, periodically compress, sort, and write out as new column fragments
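To make "direct operation on compressed data" concrete, here is a minimal sketch of run-length encoding (RLE) on a sorted column, with an aggregate computed straight from the runs. The function names and the sample column are illustrative assumptions, not C-Store code.

```python
from itertools import groupby

def rle_encode(sorted_column):
    """Collapse runs of equal values into (value, run_length) pairs."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(sorted_column)]

def rle_sum(runs):
    """SUM over the column without decompressing: one multiply per run."""
    return sum(value * length for value, length in runs)

def rle_count_equal(runs, target):
    """COUNT(*) WHERE col = target, touching only the runs."""
    return sum(length for value, length in runs if value == target)

# Hypothetical column, already sorted as a column file would be.
column = [3, 3, 3, 3, 7, 7, 9, 9, 9]
runs = rle_encode(column)
print(runs)                       # [(3, 4), (7, 2), (9, 3)]
print(rle_sum(runs))              # 53
print(rle_count_equal(runs, 9))   # 3
```

Sorting matters here: it produces long runs, so the aggregate touches three pairs instead of nine values.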
![Page 3: 6.814/6.830 Lecture 8](https://reader036.vdocuments.net/reader036/viewer/2022062422/5681378f550346895d9f2ee4/html5/thumbnails/3.jpg)
Write Performance
• Tuple Mover: asynchronous data movement
• Queries read from both WOS (write-optimized store) and ROS (read-optimized store)
• Batched
• Amortizes seeks
• Amortizes recompression
• Enables continuous load
• Trickle load: very fast inserts
![Page 4: 6.814/6.830 Lecture 8](https://reader036.vdocuments.net/reader036/viewer/2022062422/5681378f550346895d9f2ee4/html5/thumbnails/4.jpg)
When to Rewrite ROS Objects?
• Store multiple ROS objects, instead of just one
  – Each of which must be scanned to answer a query
• Tuple mover writes new objects
  – Avoids rewriting whole ROS on merge
• Periodically merge ROS objects to limit number of distinct objects that must be scanned (like Bigtable)
[Diagram: the Tuple Mover moves data from the WOS into the ROS, which also holds older objects]
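The WOS/ROS life cycle can be sketched as a toy single-column store. The class name, the `max_ros_objects` threshold, and the "merge everything once the threshold is exceeded" policy are all assumptions made for illustration; the slides do not specify the actual merge policy.

```python
import bisect
import heapq

class ColumnStoreSketch:
    """Toy single-column store with a WOS and a list of sorted ROS objects."""

    def __init__(self, max_ros_objects=3):
        self.wos = []                        # unsorted buffered inserts
        self.ros = []                        # sorted runs, oldest first
        self.max_ros_objects = max_ros_objects  # assumed merge threshold

    def insert(self, value):
        self.wos.append(value)               # trickle load: append only

    def tuple_mover(self):
        """Sort the buffered writes and publish them as a new ROS object."""
        if self.wos:
            self.ros.append(sorted(self.wos))
            self.wos = []
        if len(self.ros) > self.max_ros_objects:
            self.merge_ros()

    def merge_ros(self):
        """Merge all ROS objects into one to bound how many a query scans."""
        self.ros = [list(heapq.merge(*self.ros))]

    def count_in_range(self, lo, hi):
        """A query consults the WOS and every ROS object."""
        total = sum(1 for v in self.wos if lo <= v <= hi)
        for obj in self.ros:
            total += bisect.bisect_right(obj, hi) - bisect.bisect_left(obj, lo)
        return total

store = ColumnStoreSketch(max_ros_objects=2)
for batch in ([5, 1, 9], [4, 8], [2, 7]):
    for value in batch:
        store.insert(value)
    store.tuple_mover()
store.insert(6)                              # still in the WOS
print(len(store.ros), store.wos)             # 1 [6]
print(store.count_in_range(4, 8))            # 5
```

The third batch pushes the ROS past the threshold, so the three objects are merged into one. The final query still has to combine that object with the unsorted WOS.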
![Page 5: 6.814/6.830 Lecture 8](https://reader036.vdocuments.net/reader036/viewer/2022062422/5681378f550346895d9f2ee4/html5/thumbnails/5.jpg)
When Are Columns Right?
• Warehousing (OLAP)
  – Read-mostly; batch update
  – Queries: Scan and aggregate a few columns
• Vs. Transaction Processing (OLTP)
  – Write-intensive, mostly single record ops.
• Column-stores: OLAP optimized
• In practice >10x performance on comparable HW, for many real world analytic applications
  – True even w/ Flash or main memory!
Different architectures for different workloads
![Page 6: 6.814/6.830 Lecture 8](https://reader036.vdocuments.net/reader036/viewer/2022062422/5681378f550346895d9f2ee4/html5/thumbnails/6.jpg)
C-Store Performance
• How much do these optimizations matter?
• Wanted to compare against best you could do with a commercial system
![Page 7: 6.814/6.830 Lecture 8](https://reader036.vdocuments.net/reader036/viewer/2022062422/5681378f550346895d9f2ee4/html5/thumbnails/7.jpg)
Emulating a Column Store
• Two approaches:
1. Vertical partitioning: for an n-column table, store n two-column tables, with the i-th table containing a tuple-id and attribute i
• Sort on tuple-id
• Merge joins for query results
2. Index-only plans
• Create a secondary index on each column
• Never follow pointers to base table
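As a rough illustration of the vertical-partitioning approach, the sketch below splits rows into per-column (tuple-id, value) tables and reassembles two of them with a merge join on tuple-id. The table contents and helper names are hypothetical.

```python
def vertical_partition(rows, column_names):
    """Build one (tuple_id, value) table per column, sorted by tuple_id."""
    return {
        name: [(tid, row[i]) for tid, row in enumerate(rows)]
        for i, name in enumerate(column_names)
    }

def merge_join(left, right):
    """Join two (tuple_id, value) tables that are both sorted on tuple_id."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        ltid, rtid = left[i][0], right[j][0]
        if ltid == rtid:
            out.append((ltid, left[i][1], right[j][1]))
            i += 1
            j += 1
        elif ltid < rtid:
            i += 1
        else:
            j += 1
    return out

# Hypothetical three-column table.
rows = [("Boston", "US", 10), ("Toronto", "Canada", 20), ("Ottawa", "Canada", 5)]
tables = vertical_partition(rows, ["store_name", "country", "revenue"])

# Filter one column table, then merge-join with another to fetch revenue.
canada = [(tid, v) for tid, v in tables["country"] if v == "Canada"]
print(merge_join(canada, tables["revenue"]))
# [(1, 'Canada', 20), (2, 'Canada', 5)]
```

Every column table repeats the tuple-id, and every multi-column query needs a join like this one.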
![Page 8: 6.814/6.830 Lecture 8](https://reader036.vdocuments.net/reader036/viewer/2022062422/5681378f550346895d9f2ee4/html5/thumbnails/8.jpg)
Two Emulation Approaches
![Page 9: 6.814/6.830 Lecture 8](https://reader036.vdocuments.net/reader036/viewer/2022062422/5681378f550346895d9f2ee4/html5/thumbnails/9.jpg)
Bottom Line

| System | Time (s) |
|---|---|
| C-Store, Compression | 4 |
| C-Store, No Compression | 15 |
| C-Store, Early Materialize | 41 |
| Rows | 26 |
| Rows, Vert. Part. | 80 |
| Rows, All Indexes | 221 |

• SSBM (Star Schema Benchmark, O’Neil et al., ICDE 08): data warehousing benchmark based on TPC-H
• Scale 100 (60 M row table), 17 columns
• Average across 12 queries
• Row store is a commercial DB, tuned by professional DBA, vs. C-Store
Commercial System Does Not Benefit From Vertical Partitioning
![Page 10: 6.814/6.830 Lecture 8](https://reader036.vdocuments.net/reader036/viewer/2022062422/5681378f550346895d9f2ee4/html5/thumbnails/10.jpg)
Problems with Vertical Partitioning
① Tuple headers
  – Total table is 4 GB
  – Each column table is ~1.0 GB
  – Factor of 4 overhead from tuple headers and tuple-ids
② Merge joins
  – Answering queries requires joins
  – Row-store doesn’t know that column-tables are sorted
  – Sort hurts performance
Would need to fix these, plus add direct operation on compressed data, to approach C-Store performance
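A rough check of the "factor of 4" figure, under two assumptions the slide does not state: the 4 GB total refers to the 17-column benchmark table, and each of the 17 column tables is about 1.0 GB.

```latex
\frac{17 \times 1.0\ \text{GB}}{4\ \text{GB}} = 4.25
```

Under those assumptions the vertically partitioned layout takes about four times the space of the original table, which matches the overhead quoted above.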
![Page 11: 6.814/6.830 Lecture 8](https://reader036.vdocuments.net/reader036/viewer/2022062422/5681378f550346895d9f2ee4/html5/thumbnails/11.jpg)
Problems with Index-Only Plans
Consider the query:
    SELECT store_name, SUM(revenue)
    FROM Facts, Stores
    WHERE Facts.store_id = Stores.store_id
      AND Stores.country = 'Canada'
    GROUP BY store_name
• The two WHERE predicates yield a list of tuple IDs that pass all predicates
• Need to go pick up values from the store_name and revenue columns
• But indexes map from value → tuple ID!
• Column stores can efficiently go from tuple ID → value in each column
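The direction mismatch can be sketched in a few lines. A column is modeled as a list indexed by tuple ID, and a secondary index as a dictionary from value to tuple IDs. Both structures and the sample data are simplified assumptions, not how any particular system lays them out.

```python
from collections import defaultdict

# Column store: position i holds the value for tuple ID i (hypothetical data).
revenue = [10, 20, 5, 20]

# Secondary index on revenue: value -> list of tuple IDs.
revenue_index = defaultdict(list)
for tid, value in enumerate(revenue):
    revenue_index[value].append(tid)

def values_from_column(column, tuple_ids):
    """Column store: direct positional lookup per tuple ID."""
    return [column[tid] for tid in tuple_ids]

def values_from_index(index, tuple_ids):
    """Index-only plan: the index is keyed by value, so recovering the value
    for a tuple ID means walking the index entries."""
    wanted = set(tuple_ids)
    found = {}
    for value, tids in index.items():
        for tid in tids:
            if tid in wanted:
                found[tid] = value
    return [found[tid] for tid in tuple_ids]

qualifying = [1, 3]                      # tuple IDs that passed the predicates
print(values_from_column(revenue, qualifying))        # [20, 20]
print(values_from_index(revenue_index, qualifying))   # [20, 20]
```

Both calls return the same values, but the index version visits every index entry to answer a two-tuple lookup.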
![Page 12: 6.814/6.830 Lecture 8](https://reader036.vdocuments.net/reader036/viewer/2022062422/5681378f550346895d9f2ee4/html5/thumbnails/12.jpg)
Recommendations for Row-Store Designers
• Might be possible to get C-Store like performance
① Need to store tuple headers elsewhere (not require that they be read from disk w/ tuples)
② Need to provide efficient merge join implementation that understands sorted columns
③ Need to support direct operation on compressed data
  – Requires “late materialization” design
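A minimal sketch of what a late-materialization plan might look like: predicates run on individual columns and yield position lists, the lists are intersected, and row tuples are built only for the survivors. Column names, data, and helper names are assumptions for illustration.

```python
def positions_where(column, predicate):
    """Scan one column and return the qualifying positions (tuple IDs)."""
    return [pos for pos, value in enumerate(column) if predicate(value)]

def intersect_sorted(a, b):
    """Intersect two ascending position lists."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out

def materialize(columns, positions, names):
    """Stitch the requested columns into row tuples, only at the very end."""
    return [tuple(columns[name][pos] for name in names) for pos in positions]

# Hypothetical column data.
columns = {
    "store_name": ["Boston", "Toronto", "Ottawa", "Seattle", "Halifax"],
    "country":    ["US", "Canada", "Canada", "US", "Canada"],
    "year":       [2007, 2008, 2007, 2008, 2008],
    "revenue":    [10, 20, 5, 7, 1],
}

in_canada = positions_where(columns["country"], lambda c: c == "Canada")
in_2008 = positions_where(columns["year"], lambda y: y == 2008)
survivors = intersect_sorted(in_canada, in_2008)
print(survivors)                                                # [1, 4]
print(materialize(columns, survivors, ["store_name", "revenue"]))
# [('Toronto', 20), ('Halifax', 1)]
```

An early-materialization plan would instead build all five row tuples first and filter afterwards. That forces every column to be read and decompressed up front.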
![Page 13: 6.814/6.830 Lecture 8](https://reader036.vdocuments.net/reader036/viewer/2022062422/5681378f550346895d9f2ee4/html5/thumbnails/13.jpg)
Summary
• C-Store is a “next gen” column-oriented database
• Key New Ideas:
  – Late materialization
  – Compression & direct operation
  – Fast load via “write optimized store”
• Row-stores do a poor job of emulation
  – Need better support for compression, late materialization
  – Need support for narrow tuples, efficient merge joins
C-Store: http://db.csail.mit.edu/cstore13
![Page 14: 6.814/6.830 Lecture 8](https://reader036.vdocuments.net/reader036/viewer/2022062422/5681378f550346895d9f2ee4/html5/thumbnails/14.jpg)
Study Break
pgadmin3 demo
![Page 15: 6.814/6.830 Lecture 8](https://reader036.vdocuments.net/reader036/viewer/2022062422/5681378f550346895d9f2ee4/html5/thumbnails/15.jpg)
Database Internals Outline
• Front End
  – Admission Control
  – Connection Management
• Query System
  – (sql) → Parser
  – (parse tree) → Rewriter
  – (parse tree) → Planner & Optimizer
  – (query plan) → Executor
• Storage System
  – Access Methods
  – Lock Manager
  – Buffer Manager
  – Log Manager
This time
Last time
![Page 16: 6.814/6.830 Lecture 8](https://reader036.vdocuments.net/reader036/viewer/2022062422/5681378f550346895d9f2ee4/html5/thumbnails/16.jpg)
Buffer Manager
• Cache of recently accessed pages
• Mediates all access to page data / index methods
• Today: DBMIN Algorithm for Page Management
![Page 21: 6.814/6.830 Lecture 8](https://reader036.vdocuments.net/reader036/viewer/2022062422/5681378f550346895d9f2ee4/html5/thumbnails/21.jpg)
LRU w/ 3 pages

| A | B | C | Read | Hit/miss? |
|---|---|---|---|---|
| 1 |   |   | 1 | m |
| 1 | 2 |   | 2 | m |
| 1 | 2 | 3 | 3 | m |
| 4 | 2 | 3 | 4 | m |
| 4 | 1 | 3 | 1 | m |
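The LRU walkthrough can be replayed with a short simulation. This is a sketch using Python's `collections.OrderedDict`; the function name is hypothetical, and it tracks which pages are resident rather than which frame (A, B, C) each one occupies.

```python
from collections import OrderedDict

def lru_trace(reads, num_frames):
    """Return (page, 'h' or 'm', resident pages afterwards) for each read."""
    cache = OrderedDict()     # keys ordered from least to most recently used
    trace = []
    for page in reads:
        if page in cache:
            cache.move_to_end(page)          # refresh recency on a hit
            outcome = "h"
        else:
            if len(cache) == num_frames:
                cache.popitem(last=False)    # evict the least recently used page
            cache[page] = True
            outcome = "m"
        trace.append((page, outcome, sorted(cache)))
    return trace

for page, outcome, resident in lru_trace([1, 2, 3, 4, 1], num_frames=3):
    print(page, outcome, resident)
# 1 m [1]
# 2 m [1, 2]
# 3 m [1, 2, 3]
# 4 m [2, 3, 4]
# 1 m [1, 3, 4]
```

Reading page 4 evicts page 1 just before it is needed again, so all five reads miss.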
![Page 33: 6.814/6.830 Lecture 8](https://reader036.vdocuments.net/reader036/viewer/2022062422/5681378f550346895d9f2ee4/html5/thumbnails/33.jpg)
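MRU w/ 3 pages

| A | B | C | Read | Hit/miss? |
|---|---|---|---|---|
| 1 |   |   | 1 | m |
| 1 | 2 |   | 2 | m |
| 1 | 2 | 3 | 3 | m |
| 1 | 2 | 4 | 4 | m |
| 1 | 2 | 4 | 1 | h |
| 1 | 2 | 4 | 2 | h |
| 1 | 3 | 4 | 3 | m |
| 1 | 3 | 4 | 4 | h |
| 1 | 3 | 4 | 1 | h |
| 2 | 3 | 4 | 2 | m |
| 2 | 3 | 4 | 3 | h |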
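To compare the two policies on the same looping scan, here is a small frame-level simulator. It is a sketch with hypothetical names: `policy` selects the victim by logical access time, and `None` marks an empty frame.

```python
def simulate(reads, num_frames, policy):
    """Replay `reads` against a buffer pool; policy is 'lru' or 'mru'.

    Returns (page, 'h' or 'm', frame contents after the read) per read.
    """
    frames = [None] * num_frames
    last_used = {}                 # page -> logical time of its latest access
    trace = []
    for now, page in enumerate(reads):
        if page in frames:
            outcome = "h"
        else:
            outcome = "m"
            if None in frames:
                slot = frames.index(None)          # fill an empty frame first
            else:
                pick = min if policy == "lru" else max
                victim = pick(frames, key=lambda p: last_used[p])
                slot = frames.index(victim)
            frames[slot] = page
        last_used[page] = now
        trace.append((page, outcome, list(frames)))
    return trace

reads = [1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3]
mru = simulate(reads, 3, "mru")
lru = simulate(reads, 3, "lru")
print("".join(outcome for _, outcome, _ in mru))   # mmmmhhmhhmh
print("".join(outcome for _, outcome, _ in lru))   # mmmmmmmmmmm
print(mru[-1][2])                                  # [2, 3, 4]
```

On this repeated scan of four pages through three frames, MRU gets five hits while LRU misses on every read.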