Piccolo – Paper Discussion Big Data Reading Group 9/20/2010

TRANSCRIPT

Page 1: Piccolo – Paper Discussion Big Data Reading Group

Piccolo – Paper Discussion

Big Data Reading Group

9/20/2010

Page 2: Piccolo – Paper Discussion Big Data Reading Group

Motivation / Goals

• Rising demand for distributing computation
  • PageRank, K-Means, N-Body simulation
• Data-centric frameworks simplify programming
• Existing models (e.g. MapReduce) are insufficient
  • Designed for large-scale data analysis as opposed to in-memory computation
• Make in-memory computations fast
• Enable asynchronous computation


Page 3: Piccolo – Paper Discussion Big Data Reading Group

Overview

• Global in-memory key-value tables for sharing state (sketched below)
• Concurrently running instances of kernel applications modifying global state
• Locality optimized (user-specified policies)
• Reduced synchronization (accumulation, global barriers)
• Checkpoint-based recovery
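The transcript contains no code, so the following is a minimal, single-process Python sketch of the model these bullets describe: a global key-value table with a user-supplied accumulator, updated by a kernel function running over one partition. Every name in it (Table, pagerank_kernel, the damping factor 0.85) is an illustrative assumption, not the Piccolo API.

    # Single-process stand-in for the model above (illustrative only).
    # A global key-value table holds shared state; a user-supplied accumulator
    # merges write/write conflicts so kernels need no pairwise synchronization.
    class Table:
        def __init__(self, accumulator, default=0.0):
            self.accumulator = accumulator
            self.default = default
            self.data = {}

        def get(self, key):
            return self.data.get(key, self.default)

        def update(self, key, value):
            # concurrent updates to the same key are merged by accumulation
            self.data[key] = self.accumulator(self.get(key), value)

    # Hypothetical PageRank-style kernel: each instance walks its partition of
    # the link graph and accumulates rank contributions into the next table.
    def pagerank_kernel(links, curr, nxt, partition):
        for page in partition:
            share = curr.get(page) / max(len(links[page]), 1)
            for target in links[page]:
                nxt.update(target, 0.85 * share)

    add = lambda old, new: old + new
    curr, nxt = Table(add), Table(add)
    links = {"a": ["b"], "b": ["a"]}
    for page in links:
        curr.update(page, 1.0)               # initial ranks
    pagerank_kernel(links, curr, nxt, partition=list(links))

In Piccolo itself many such kernel instances run concurrently on different machines and synchronize only at global barriers; the accumulator is what makes their overlapping writes safe.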


Page 4: Piccolo – Paper Discussion Big Data Reading Group

System Design


Page 5: Piccolo – Paper Discussion Big Data Reading Group

Table interface
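This slide shows the table interface only as a figure, which the transcript did not capture. As a rough outline of the operations the rest of the talk relies on, a Piccolo-style table could be sketched as follows; the method names are assumptions for illustration, not necessarily the paper's exact API.

    # Assumed outline of a Piccolo-style table interface (illustrative names).
    class TableInterface:
        def get(self, key): ...                 # read a value, possibly remote
        def put(self, key, value): ...          # overwrite a value
        def update(self, key, value): ...       # merge via the table's accumulator
        def flush(self): ...                    # push buffered remote updates
        def get_iterator(self, partition): ...  # scan a locally stored partition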


Page 6: Piccolo – Paper Discussion Big Data Reading Group

Optimization

• Ensure locality (see the partitioning sketch below)
  • Group kernel with partition
  • Group partitions
  • Guarantee: one partition resides completely on a single machine
• Reduce synchronization
  • Accumulation to avoid write/write conflicts
  • No pairwise kernel synchronization
  • Global barriers are sufficient
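To make the locality grouping concrete, here is a small assumed sketch (not Piccolo's implementation): keys hash to partitions, each partition lives entirely on one worker, and the kernel instance that processes a partition is placed on that same worker.

    # Illustrative partition-to-worker placement (assumed scheme).
    def partition_of(key, num_partitions):
        # a key always maps to one partition, and a partition is stored
        # completely on a single machine
        return hash(key) % num_partitions

    def assign_partitions(num_partitions, workers):
        # partition p and the kernel instance that works on p are grouped on
        # the same worker, so most table operations stay local
        return {p: workers[p % len(workers)] for p in range(num_partitions)}

    placement = assign_partitions(8, ["w0", "w1", "w2", "w3"])
    owner = placement[partition_of("some-key", 8)]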


Page 7: Piccolo – Paper Discussion Big Data Reading Group

Load balancing

• Assigning partitions
  • Round robin
  • Optimized for data location
• Work stealing (see the sketch below)
  • Biggest task first (the master estimates task size from the number of keys in a partition)
  • The master decides
• Restrictions
  • A running task cannot be killed (it modifies shared state; restoring would be very expensive)
  • Partitions need to be moved
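A minimal sketch of the "biggest task first" policy, with assumed bookkeeping rather than Piccolo's actual scheduler: the master orders unstarted partitions by key count and gives the largest one to whichever worker runs out of work.

    # Illustrative master-side bookkeeping for work stealing (assumed).
    import heapq

    def build_task_queue(partition_key_counts):
        # max-heap via negated sizes: the biggest partition is stolen first
        heap = [(-count, pid) for pid, count in partition_key_counts.items()]
        heapq.heapify(heap)
        return heap

    def steal_task(heap):
        # called by the master when a worker becomes idle
        if not heap:
            return None
        _, pid = heapq.heappop(heap)
        return pid

    queue = build_task_queue({0: 5_000, 1: 120_000, 2: 800})
    print(steal_task(queue))   # -> 1, the partition with the most keys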


Page 8: Piccolo – Paper Discussion Big Data Reading Group

Table migration

• Migrate a table partition from worker wa to worker wb (simulated below)
  • The master sends message M1 to all workers
  • All workers flush pending updates to wa
  • All workers send new requests to wb
  • wb buffers all incoming requests
  • wa sends its paused state to wb
  • When all workers acknowledge phase 1, the master sends M2 to wa and wb
  • wa flushes its remaining updates to wb and leaves the “paused” state
  • wb first works off the buffered requests, then resumes normal operation
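To make the ordering of the two phases concrete, here is a deliberately simplified single-process Python simulation; the dict-based stand-ins for wa and wb and all names are assumptions, not Piccolo's implementation.

    # Simplified simulation of the two-phase migration above (illustrative).
    def migrate_partition(wa, wb, pending_updates, new_requests):
        # Phase 1 (after M1): workers flush outstanding updates to wa, send
        # new requests to wb, and wb only buffers them for now
        for key, delta in pending_updates:
            wa[key] = wa.get(key, 0) + delta
        buffered = list(new_requests)
        wb.update(wa)                      # wa sends its paused state to wb

        # Phase 2 (after M2): wa hands over ownership; wb works off the
        # buffered requests and then serves the partition normally
        wa.clear()
        for key, delta in buffered:
            wb[key] = wb.get(key, 0) + delta

    owner_a, owner_b = {"url1": 3}, {}
    migrate_partition(owner_a, owner_b,
                      pending_updates=[("url1", 1)],
                      new_requests=[("url2", 5)])
    print(owner_b)                         # {'url1': 4, 'url2': 5}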


Page 9: Piccolo – Paper Discussion Big Data Reading Group

Fault tolerance

• User-assisted checkpoint / restore
  • Chandy-Lamport snapshots
  • Asynchronous -> periodic checkpoints
  • Synchronous -> barrier checkpoints
• Problem: when to start the barrier checkpoint
  • Started too early, the replay log might get very long
  • Started too late, the checkpoint might not use enough free CPU time before the barrier
• Solution: start when the first worker has finished all of its jobs (see the sketch below)
• No checkpoint during table migration, and no migration during a checkpoint
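A tiny sketch of the trigger described in the "Solution" bullet, with assumed data structures: the master starts the barrier checkpoint as soon as the first worker has no tasks left, so the idle time before the barrier is spent writing the checkpoint instead of being wasted.

    # Illustrative checkpoint trigger (assumed bookkeeping).
    def should_start_checkpoint(remaining_tasks_per_worker):
        # start as soon as any worker has finished all of its assigned tasks
        return any(len(tasks) == 0 for tasks in remaining_tasks_per_worker.values())

    remaining = {"w0": [], "w1": ["p3", "p7"]}
    if should_start_checkpoint(remaining):
        print("master: begin barrier checkpoint while w1 keeps computing")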


Page 10: Piccolo – Paper Discussion Big Data Reading Group

Applications

• PageRank, k-means, n-body, matrix multiplication
  • Parallel, iterative computations
  • Local reads + local/remote writes, or local/remote reads + local writes
  • Can be implemented as multiple MapReduce jobs
• Distributed web crawler
  • Idempotent operations (see the example below)
  • Cannot be realized in MapReduce
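As a small illustration of why idempotence matters here (an assumed example, not the paper's crawler code): if the crawl table's accumulator takes the maximum of the old and new state, replaying the same update after a failure leaves the table unchanged.

    # Illustrative idempotent table update for a crawler (assumed example).
    TO_FETCH, FETCHING, DONE = 0, 1, 2

    url_state = {}

    def update_state(url, state):
        # max() as the accumulator: applying the same update twice is a no-op
        url_state[url] = max(url_state.get(url, TO_FETCH), state)

    update_state("http://example.com/", FETCHING)
    update_state("http://example.com/", FETCHING)   # replayed after a failure
    print(url_state)                                # {'http://example.com/': 1}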


Page 11: Piccolo – Paper Discussion Big Data Reading Group

Scaling


(Two scaling figures: one with fixed input size, one with scaled input size.)

Page 12: Piccolo – Paper Discussion Big Data Reading Group

Comparison with Hadoop / MPI


• PageRank, k-means (vs. Hadoop)
  • Piccolo is 4x and 11x faster, respectively
  • For PageRank, Hadoop spends:
    • 50% of the time in sort (to join data streams)
    • 15% in (de)serialization (reading/writing HDFS)
• Matrix multiplication (vs. MPI)
  • Piccolo is 10% faster
  • MPI waits for the slowest node many times

Page 13: Piccolo – Paper Discussion Big Data Reading Group

Work stealing / slow worker / checkpoints


• Work stealing / slow worker
  • PageRank has skewed partitions
  • One slow worker (50% CPU)
• Checkpoints
  • Naïve: start after all workers have finished
  • Optimized: start after the first worker has finished

Page 14: Piccolo – Paper Discussion Big Data Reading Group

Checkpoint limits / scalability


• Hypothetical data center
  • Typical machine uptime of 1 year
  • Worst-case scenario
  • Optimistic?

• Looked different on some older slides

Page 15: Piccolo – Paper Discussion Big Data Reading Group

Distributed Crawler


• 32 machines saturate 100 Mbps
• There are single servers doing this
• Piccolo would scale higher

Page 16: Piccolo – Paper Discussion Big Data Reading Group

Summary

• Piccolo provides an easy-to-use distributed shared memory model
• It applies many restrictions
  • Simple interface
  • Reduced synchronization
  • Relaxed consistency
  • Accumulation
  • Locality
• But it performs well
  • Iterative computations
  • Saves going to disk compared to MapReduce
• A specialized tool for data-intensive in-memory computing
