Google Mesa
My brief read on the Google Mesa paper.

What is Mesa?
● Geo-Replicated, Near Real-Time, Scalable Data Warehousing for Google's Internet Advertising Business.
● OK, so what is it really?
  o It's an atomic, consistent, available, near real-time, scalable store

Salient features
● Data warehouse (DW) for ad serving at Google
● Metadata on BigTable
● Data on Colossus
● Billions of queries/day fetching trillions of rows; millions of row updates/second
● Supports multiple indexes
● Runs on tens of thousands of machines across geos

Data Model
● Tables are specified by table schemas
● A table schema specifies a key space (K) and a value space (V)
  o K and V are sets
  o Each is represented as a tuple of columns
  o The schema also specifies an aggregation function
● Each column is stored separately
● Updates are multi-versioned for consistency and batched for throughput
● Data is amenable to aggregation
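A minimal sketch of the key/value model above, assuming one aggregation function per value column. The `Table` class and the column names (date, publisher, clicks, max_bid) are hypothetical and only illustrate the idea; they are not Mesa's API.

```python
from typing import Callable, Dict, Tuple

Key = Tuple[str, ...]
Value = Tuple[int, ...]


class Table:
    """Toy table: each key maps to exactly one aggregated value tuple."""

    def __init__(self, aggregators: Tuple[Callable[[int, int], int], ...]):
        # One aggregation function per value column (assumed associative).
        self.aggregators = aggregators
        self.rows: Dict[Key, Value] = {}

    def apply(self, key: Key, value: Value) -> None:
        """Fold an incoming value into the row stored under `key`."""
        current = self.rows.get(key)
        if current is None:
            self.rows[key] = value
            return
        self.rows[key] = tuple(
            agg(old, new)
            for agg, old, new in zip(self.aggregators, current, value)
        )


# Hypothetical schema: key = (date, publisher), value = (clicks, max_bid).
table = Table(aggregators=(lambda a, b: a + b, max))
table.apply(("2014-01-01", "pub1"), (10, 3))
table.apply(("2014-01-01", "pub1"), (5, 7))
table.apply(("2014-01-01", "pub2"), (1, 2))
```

Because every update is folded into the existing row, the table never holds two rows with the same key.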

Data Model
● Pre-aggregates data into deltas (no repeated row keys within a delta) and applies a version
● Compaction is multi-level
● A Controller handles updates/maintenance and works with BigTable
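A rough sketch of how two deltas could be compacted into one, assuming a single SUM-aggregated value column and deltas that cover adjacent version ranges. The `Delta` type and `merge` function are illustrative names, not taken from the paper.

```python
from typing import Dict, NamedTuple, Tuple


class Delta(NamedTuple):
    first_version: int
    last_version: int
    rows: Dict[Tuple[str, ...], int]  # key -> pre-aggregated value (SUM assumed)


def merge(older: Delta, newer: Delta) -> Delta:
    """Combine two adjacent deltas into one covering both version ranges."""
    if newer.first_version != older.last_version + 1:
        raise ValueError("deltas must cover adjacent version ranges")
    rows = dict(older.rows)
    for key, value in newer.rows.items():
        # Aggregating on merge keeps the "no repeated keys per delta" property.
        rows[key] = rows.get(key, 0) + value
    # Keep rows in key order, as a sorted on-disk delta would be.
    return Delta(older.first_version, newer.last_version, dict(sorted(rows.items())))


base = Delta(0, 5, {("pub1",): 100, ("pub2",): 40})
update = Delta(6, 6, {("pub2",): 2, ("pub3",): 9})
compacted = merge(base, update)
```

Repeating this merge over several levels of deltas is one way to read "multi-level" compaction: small recent deltas are folded into larger ones, which are eventually folded into the base.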

Controller
● 4 sub-systems
  o Updates
  o Compaction
  o Checksum
  o Schema change
● Does not do any work itself, only schedules it

Storage and Indexes
- Append-only (AO), log structured, read-only
- Rows organized as compressed row-blocks
- Index entries hold the starting entry of each row-block
- Naive lookup
  - Binary search on the index to find the row-block
  - Binary search within the row-block
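The naive lookup above can be sketched with two binary searches. This is a simplified illustration: row blocks are plain in-memory lists (compression is left out), and `build_index` and `lookup` are hypothetical helper names.

```python
from bisect import bisect_left, bisect_right
from typing import List, Optional, Tuple

Row = Tuple[str, int]


def build_index(blocks: List[List[Row]]) -> List[str]:
    """Index entry i is the first key of row block i."""
    return [block[0][0] for block in blocks]


def lookup(index: List[str], blocks: List[List[Row]], key: str) -> Optional[int]:
    # Step 1: binary search the index for the last block whose first key <= key.
    block_no = bisect_right(index, key) - 1
    if block_no < 0:
        return None  # key sorts before every block
    block = blocks[block_no]
    # Step 2: binary search inside the block. The 1-tuple (key,) sorts just
    # before any (key, value) row, so bisect_left lands on the matching row.
    pos = bisect_left(block, (key,))
    if pos < len(block) and block[pos][0] == key:
        return block[pos][1]
    return None


blocks = [[("a", 1), ("c", 2)], [("f", 3), ("h", 4)], [("m", 5)]]
index = build_index(blocks)
```

Only one row block has to be read (and, in the real system, decompressed) per lookup, which is the point of keeping the first key of each block in the index.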

Query subsystem
● Limited query engine with filtering/predicate support
● Used by higher-level systems such as Dremel/MySQL
● Has multiple stateless Query Servers
● Works on both BigTable and Colossus
● Provides a nice sharding and load-balancing (LB) mechanism
● Routes similar queries to a subset of Query Servers

Multi Datacenter Deployment
● Tables are multi-versioned
  o Old data is served while a new version is in progress
● The Committer is stateless and sends updates to multiple datacenters
  o Built on top of versionsDB, a globally replicated and consistent store built on distributed Paxos
● Data is replicated asynchronously across Mesa instances
● Only metadata is replicated synchronously, using the Paxos-based versionsDB

Optimizations
● Delta pruning - similar to filter pushdown
● Resume key - one key per data block
  o Data is returned a block at a time, so if a Query Server dies, another one can pick up from the last resume key.
● Parallelizing workloads: uses MapReduce (MR) to shard
  o While writing a delta, Mesa samples row keys, which are used to figure out the right number of Mappers/Reducers.
  o The workers are the same four kinds of workers (update, compaction, checksum, schema change) scheduled by the Controller
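A small sketch of the resume-key idea from this slide, assuming rows are streamed in key order and the resume key is simply the last key of each block. `QueryServer`, `run_query` and the `fail_after_blocks` switch (used only to simulate a crash) are made up for the example.

```python
from bisect import bisect_right
from typing import Iterator, List, Optional, Tuple

Row = Tuple[str, int]
DATA: List[Row] = [("a", 1), ("b", 2), ("c", 3), ("d", 4), ("e", 5)]


class QueryServer:
    """Stateless toy server: all it needs to continue is the resume key."""

    def __init__(self, rows: List[Row], block_size: int = 2,
                 fail_after_blocks: Optional[int] = None):
        self.rows = rows
        self.block_size = block_size
        self.fail_after_blocks = fail_after_blocks  # simulate a crash

    def scan(self, resume_key: Optional[str] = None) -> Iterator[Tuple[List[Row], str]]:
        keys = [k for k, _ in self.rows]
        # Start just past the last key the client has already received.
        start = 0 if resume_key is None else bisect_right(keys, resume_key)
        sent = 0
        for i in range(start, len(self.rows), self.block_size):
            if self.fail_after_blocks is not None and sent >= self.fail_after_blocks:
                raise ConnectionError("query server died")
            block = self.rows[i:i + self.block_size]
            sent += 1
            yield block, block[-1][0]  # block plus its resume key


def run_query(servers: List[QueryServer]) -> List[Row]:
    results: List[Row] = []
    resume_key: Optional[str] = None
    for server in servers:
        try:
            for block, last_key in server.scan(resume_key):
                results.extend(block)
                resume_key = last_key
            return results
        except ConnectionError:
            continue  # switch to the next server, keeping resume_key
    raise RuntimeError("all query servers failed")


rows = run_query([QueryServer(DATA, fail_after_blocks=1), QueryServer(DATA)])
```

The first server dies after one block; the second continues after key "b", so the client sees every row once and no server has to remember anything about the query.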

Optimizations
● Schema changes - two techniques
  o Create, copy, replay and delete - expensive
  o Link and add default values - this is used in Mesa
● New instances of Mesa use P2P mechanisms to come online.
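One possible reading of "link and add default values", sketched under assumptions: old deltas stay untouched on disk, and rows written before a column was added are padded with a default at read time. The function names, the (clicks, cost) columns and the SUM aggregation are all hypothetical.

```python
from typing import Dict, List, Tuple

Rows = Dict[str, Tuple[int, ...]]


def read_with_schema(delta_rows: Rows, defaults: Tuple[int, ...]) -> Rows:
    """Pad rows written under an older schema with defaults for newer columns."""
    return {
        key: value + defaults[len(value):]
        for key, value in delta_rows.items()
    }


def query(deltas: List[Rows], defaults: Tuple[int, ...]) -> Rows:
    """Aggregate (SUM assumed) across deltas written under different schemas."""
    out: Rows = {}
    for delta in deltas:
        for key, value in read_with_schema(delta, defaults).items():
            if key in out:
                out[key] = tuple(a + b for a, b in zip(out[key], value))
            else:
                out[key] = value
    return out


old_delta = {"pub1": (10,), "pub2": (4,)}  # written before `cost` existed
new_delta = {"pub1": (1, 7)}               # written with (clicks, cost)
defaults = (0, 0)                          # hypothetical column defaults

result = query([old_delta, new_delta], defaults)
```

Compared with create/copy/replay/delete, nothing is rewritten up front, which is why this path is the cheaper one.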

Handling Data Corruption
● Mesa runs on ~50K boxes
● Online - during updates
  o Fact: each Mesa instance is logically the same but may differ physically in its deltas
  o Check checksums of indexes/data
  o Row order, key range, and aggregate values should be the same across instances
● Offline
  o Run global checksums of all indexes
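A sketch of why a checksum has to be taken over the logical content rather than the physical deltas, assuming SUM aggregation. SHA-256 and the helper names are choices made for this example; the slides do not say which checksum Mesa uses.

```python
import hashlib
from typing import Dict, List, Tuple

Delta = Dict[str, int]


def logical_rows(deltas: List[Delta]) -> List[Tuple[str, int]]:
    """Aggregate all deltas (SUM assumed) and return rows in key order."""
    merged: Dict[str, int] = {}
    for delta in deltas:
        for key, value in delta.items():
            merged[key] = merged.get(key, 0) + value
    return sorted(merged.items())


def checksum(deltas: List[Delta]) -> str:
    """Checksum of the logical table, independent of how deltas are laid out."""
    digest = hashlib.sha256()
    for key, value in logical_rows(deltas):
        digest.update(f"{key}\x00{value}\n".encode("utf-8"))
    return digest.hexdigest()


instance_a = [{"pub1": 10, "pub2": 4}, {"pub1": 1}]  # two separate deltas
instance_b = [{"pub1": 11, "pub2": 4}]               # already compacted
corrupted = [{"pub1": 11, "pub2": 5}]                # one bad aggregate
```

Here `instance_a` and `instance_b` differ physically but produce the same checksum, while the corrupted copy does not.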
Reference
http://static.googleusercontent.com/media/research.google.com/en/us/pubs/archive/42851.pdf