cloud computingmhhammou/15319-s12/lectures/... · 2012-05-14 · cloud computing cs 15-319 dryad...

46
Cloud Computing CS 15-319 Dryad and GraphLab Lecture 11, Feb 22, 2012 Majd F. Sakr, Suhail Rehman and Mohammad Hammoud

Upload: others

Post on 11-Jul-2020

3 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: Cloud Computingmhhammou/15319-s12/lectures/... · 2012-05-14 · Cloud Computing CS 15-319 Dryad and GraphLab Lecture 11, Feb 22, 2012 Majd F. Sakr, Suhail Rehman and Mohammad Hammoud

Cloud ComputingCS 15-319

Dryad and GraphLabLecture 11, Feb 22, 2012

Majd F. Sakr, Suhail Rehman and Mohammad Hammoud

Page 2: Cloud Computingmhhammou/15319-s12/lectures/... · 2012-05-14 · Cloud Computing CS 15-319 Dryad and GraphLab Lecture 11, Feb 22, 2012 Majd F. Sakr, Suhail Rehman and Mohammad Hammoud

Today…

Last session Pregel

Today’s session Dryad and GraphLab

Announcement: Project Phases I-A and I-B are due today

2© Carnegie Mellon University in Qatar

Page 3: Cloud Computingmhhammou/15319-s12/lectures/... · 2012-05-14 · Cloud Computing CS 15-319 Dryad and GraphLab Lecture 11, Feb 22, 2012 Majd F. Sakr, Suhail Rehman and Mohammad Hammoud

Objectives

Discussion on Programming Models

Why parallelism?

Parallel computer architectures

Traditional models of parallel programming

Examples of parallel processing

Message Passing Interface (MPI)

Pregel, Dryad and GraphLab

Last 3 Sessions

MapReduce

Pregel, Dryad and GraphLab

© Carnegie Mellon University in Qatar 3

Page 4: Cloud Computingmhhammou/15319-s12/lectures/... · 2012-05-14 · Cloud Computing CS 15-319 Dryad and GraphLab Lecture 11, Feb 22, 2012 Majd F. Sakr, Suhail Rehman and Mohammad Hammoud

4

Dryad

© Carnegie Mellon University in Qatar

Page 5: Cloud Computingmhhammou/15319-s12/lectures/... · 2012-05-14 · Cloud Computing CS 15-319 Dryad and GraphLab Lecture 11, Feb 22, 2012 Majd F. Sakr, Suhail Rehman and Mohammad Hammoud

Dryad In this part, the following concepts of Dryad will

be described:

Dryad Model Dryad Organization Dryad Description Language and An Example Program Fault Tolerance in Dryad

5© Carnegie Mellon University in Qatar

Page 6: Cloud Computingmhhammou/15319-s12/lectures/... · 2012-05-14 · Cloud Computing CS 15-319 Dryad and GraphLab Lecture 11, Feb 22, 2012 Majd F. Sakr, Suhail Rehman and Mohammad Hammoud

Dryad In this part, the following concepts of Dryad will

be described:

Dryad Model Dryad Organization Dryad Description Language and An Example Program Fault Tolerance in Dryad

6© Carnegie Mellon University in Qatar

Page 7: Cloud Computingmhhammou/15319-s12/lectures/... · 2012-05-14 · Cloud Computing CS 15-319 Dryad and GraphLab Lecture 11, Feb 22, 2012 Majd F. Sakr, Suhail Rehman and Mohammad Hammoud

Dryad Dryad is a general purpose, high-performance, distributed

computation engine

Dryad is designed for: High-throughput Data-parallel computation Use in a private datacenter

Computation is expressed as a directed-acyclic-graph (DAG) Vertices represent programs Edges represent data channels between vertices

7© Carnegie Mellon University in Qatar

Page 8: Cloud Computingmhhammou/15319-s12/lectures/... · 2012-05-14 · Cloud Computing CS 15-319 Dryad and GraphLab Lecture 11, Feb 22, 2012 Majd F. Sakr, Suhail Rehman and Mohammad Hammoud

Unix Pipes vs. Dryad DAG

8© Carnegie Mellon University in Qatar

Page 9: Cloud Computingmhhammou/15319-s12/lectures/... · 2012-05-14 · Cloud Computing CS 15-319 Dryad and GraphLab Lecture 11, Feb 22, 2012 Majd F. Sakr, Suhail Rehman and Mohammad Hammoud

Dryad Job Structure

9

grep

sed

sortawk

perlgrep

grepsed

sort

sortawk

Inputfiles

Vertices (processes)

Outputfiles

ChannelsStage

grep1000 | sed500 | sort1000 | awk500 | perl50

© Carnegie Mellon University in Qatar

Page 10: Cloud Computingmhhammou/15319-s12/lectures/... · 2012-05-14 · Cloud Computing CS 15-319 Dryad and GraphLab Lecture 11, Feb 22, 2012 Majd F. Sakr, Suhail Rehman and Mohammad Hammoud

Dryad In this part, the following concepts of Dryad will

be described:

Dryad Model Dryad Organization Dryad Description Language and An Example Program Fault Tolerance in Dryad

10© Carnegie Mellon University in Qatar

Page 11: Cloud Computingmhhammou/15319-s12/lectures/... · 2012-05-14 · Cloud Computing CS 15-319 Dryad and GraphLab Lecture 11, Feb 22, 2012 Majd F. Sakr, Suhail Rehman and Mohammad Hammoud

Dryad System Organization

11

There are 3 roles for machines in Dryad Job Manager (JM) Name Server (NS) Daemon (D)

© Carnegie Mellon University in Qatar

Page 12: Cloud Computingmhhammou/15319-s12/lectures/... · 2012-05-14 · Cloud Computing CS 15-319 Dryad and GraphLab Lecture 11, Feb 22, 2012 Majd F. Sakr, Suhail Rehman and Mohammad Hammoud

Program Execution (1) The Job Manager (JM):

Creates the job communication graph (job schedule) Contacts the NS to determine the

number of Ds and the topology Assigns Vs to each D (using a

simple task scheduler-not described) for execution

Coordinates data flow through the data plane

Data is distributed using a distributed storage system that shares with the Google File System some properties (e.g., data are split into chunks and replicated across machines)

Dryad also supports the use of NTFS for accessing files locally

12© Carnegie Mellon University in Qatar

Page 13: Cloud Computingmhhammou/15319-s12/lectures/... · 2012-05-14 · Cloud Computing CS 15-319 Dryad and GraphLab Lecture 11, Feb 22, 2012 Majd F. Sakr, Suhail Rehman and Mohammad Hammoud

JM code

vertex code

Program Execution (2)1. Build

2. Send .exe

3. Start JM

5. Generate graph

7. Serializevertices

8. MonitorVertex execution

4. Querycluster resources

Cluster services6. Initialize vertices

© Carnegie Mellon University in Qatar 13

Page 14: Cloud Computingmhhammou/15319-s12/lectures/... · 2012-05-14 · Cloud Computing CS 15-319 Dryad and GraphLab Lecture 11, Feb 22, 2012 Majd F. Sakr, Suhail Rehman and Mohammad Hammoud

Data Channels in Dryad Data items can be shuffled between vertices through

data channels

Data channels can be: Shared Memory FIFOs (intra-machine)

TCP Streams (inter-machine)

SMB/NTFS Local Files (temporary)

Distributed File System (persistent)

The performance and fault tolerance of these mechanisms vary

Data channels are abstracted for maximum flexibility

14

X

M

Items

© Carnegie Mellon University in Qatar

Page 15: Cloud Computingmhhammou/15319-s12/lectures/... · 2012-05-14 · Cloud Computing CS 15-319 Dryad and GraphLab Lecture 11, Feb 22, 2012 Majd F. Sakr, Suhail Rehman and Mohammad Hammoud

Dryad In this part, the following concepts of Dryad will

be described:

Dryad Model Dryad Organization Dryad Description Language and An Example Program Fault Tolerance in Dryad

15© Carnegie Mellon University in Qatar

Page 16: Cloud Computingmhhammou/15319-s12/lectures/... · 2012-05-14 · Cloud Computing CS 15-319 Dryad and GraphLab Lecture 11, Feb 22, 2012 Majd F. Sakr, Suhail Rehman and Mohammad Hammoud

Dryad Graph Description Language

16

A An

AS = A^n

(Cloning)

A An

AS >= BS

(Pointwise Composition)

B Bn

A An

AS >> BS

(Bipartite Composition)

B Bn

B

C D

(B>=C) || (B>=D)

(Merge)

Here are some operators in the Dryad graph description language:

© Carnegie Mellon University in Qatar

Page 17: Cloud Computingmhhammou/15319-s12/lectures/... · 2012-05-14 · Cloud Computing CS 15-319 Dryad and GraphLab Lecture 11, Feb 22, 2012 Majd F. Sakr, Suhail Rehman and Mohammad Hammoud

Example Program in Dryad (1) Skyserver SQL Query (Q18):

Find all the objects in the database that have neighboring objects within 30 arc seconds such that at least one of the neighbors has a color similar to the primary object’s color

There are two tables involved photoObjAll and it has 354,254,163

records Neighbors and it has 2,803,165,372

records

For the equivalent Dryad computation, they extracted the columns of interest into two binary files, “ugriz.bin” and “neighbors.bin”

D D

MM 4n

SS 4n

YY

H

n

n

X Xn

U UN N

L L

© Carnegie Mellon University in Qatar 17

Page 18: Cloud Computingmhhammou/15319-s12/lectures/... · 2012-05-14 · Cloud Computing CS 15-319 Dryad and GraphLab Lecture 11, Feb 22, 2012 Majd F. Sakr, Suhail Rehman and Mohammad Hammoud

D D

MM 4n

SS 4n

YY

H

n

n

X Xn

U UN N

L L

Example Program in Dryad (2)

Took SQL plan Manually coded in Dryad Manually partitioned data

u: objid, colorn: objid, neighborobjid[partition by objid]

selectu.color,n.neighborobjid

from u join nwhere

u.objid = n.objid

(u.color,n.neighborobjid)[re-partition by n.neighborobjid][order by n.neighborobjid]

[distinct][merge outputs]

selectu.objid

from u join <temp>where

u.objid = <temp>.neighborobjid and|u.color - <temp>.color| < d

© Carnegie Mellon University in Qatar 18

Page 19: Cloud Computingmhhammou/15319-s12/lectures/... · 2012-05-14 · Cloud Computing CS 15-319 Dryad and GraphLab Lecture 11, Feb 22, 2012 Majd F. Sakr, Suhail Rehman and Mohammad Hammoud

Example Program in Dryad (3)

19

D D

MM 4n

SS 4n

YY

H

n

n

X Xn

U UN N

L L

GraphBuilder XSet = moduleX^N;GraphBuilder DSet = moduleD^N;GraphBuilder MSet = moduleM^(N*4);GraphBuilder SSet = moduleS^(N*4);GraphBuilder YSet = moduleY^N;GraphBuilder HSet = moduleH^1;

GraphBuilder XInputs = (ugriz1 >= XSet) || (neighbor >= XSet);

GraphBuilder YInputs = ugriz2 >= YSet;GraphBuilder XToY = XSet >= DSet >> MSet >= SSet;

for (i = 0; i < N*4; ++i){XToY = XToY || (SSet.GetVertex(i) >= YSet.GetVertex(i/4));}

GraphBuilder YToH = YSet >= HSet;GraphBuilder HOutputs = HSet >= output;

GraphBuilder final = XInputs || YInputs || XToY || YToH || HOutputs;

Here is the corresponding Dryad code:

© Carnegie Mellon University in Qatar

Page 20: Cloud Computingmhhammou/15319-s12/lectures/... · 2012-05-14 · Cloud Computing CS 15-319 Dryad and GraphLab Lecture 11, Feb 22, 2012 Majd F. Sakr, Suhail Rehman and Mohammad Hammoud

Dryad In this part, the following concepts of Dryad will

be described:

Dryad Model Dryad Organization Dryad Description Language and An Example Program Fault Tolerance in Dryad

20© Carnegie Mellon University in Qatar

Page 21: Cloud Computingmhhammou/15319-s12/lectures/... · 2012-05-14 · Cloud Computing CS 15-319 Dryad and GraphLab Lecture 11, Feb 22, 2012 Majd F. Sakr, Suhail Rehman and Mohammad Hammoud

Fault Tolerance in Dryad (1) Dryad is designed to handle two types of failures:

Vertex failures Channel failures

Vertex failures are handled by the JM and the failed vertex is re-executed on another machine

Channel failures cause the preceding vertex to be re-executed

21© Carnegie Mellon University in Qatar

Page 22: Cloud Computingmhhammou/15319-s12/lectures/... · 2012-05-14 · Cloud Computing CS 15-319 Dryad and GraphLab Lecture 11, Feb 22, 2012 Majd F. Sakr, Suhail Rehman and Mohammad Hammoud

X[0] X[1] X[3] X[2] X’[2]

Completed vertices Slow vertex

Duplicatevertex

Fault Tolerance in Dryad (2)

Duplication Policy = f(running times, data volumes)

© Carnegie Mellon University in Qatar 22

Page 23: Cloud Computingmhhammou/15319-s12/lectures/... · 2012-05-14 · Cloud Computing CS 15-319 Dryad and GraphLab Lecture 11, Feb 22, 2012 Majd F. Sakr, Suhail Rehman and Mohammad Hammoud

23

GraphLab

© Carnegie Mellon University in Qatar

Page 24: Cloud Computingmhhammou/15319-s12/lectures/... · 2012-05-14 · Cloud Computing CS 15-319 Dryad and GraphLab Lecture 11, Feb 22, 2012 Majd F. Sakr, Suhail Rehman and Mohammad Hammoud

GraphLab In this part, the following concepts of GraphLab will

be described:

Motivation for GraphLab GraphLab Data Model and Update Mechanisms Scheduling in GraphLab Consistency Models in GraphLab PageRank in GraphLab

24© Carnegie Mellon University in Qatar

Page 25: Cloud Computingmhhammou/15319-s12/lectures/... · 2012-05-14 · Cloud Computing CS 15-319 Dryad and GraphLab Lecture 11, Feb 22, 2012 Majd F. Sakr, Suhail Rehman and Mohammad Hammoud

GraphLab In this part, the following concepts of GraphLab will

be described:

Motivation for GraphLab GraphLab Data Model and Update Mechanisms Scheduling in GraphLab Consistency Models in GraphLab PageRank in GraphLab

25© Carnegie Mellon University in Qatar

Page 26: Cloud Computingmhhammou/15319-s12/lectures/... · 2012-05-14 · Cloud Computing CS 15-319 Dryad and GraphLab Lecture 11, Feb 22, 2012 Majd F. Sakr, Suhail Rehman and Mohammad Hammoud

Motivation for GraphLab Shortcomings of MapReduce Interdependent data computation difficult to perform Overheads of running jobs iteratively – disk access and

startup overhead Communication pattern is not user definable/flexible

Shortcomings of Pregel BSP model requires synchronous computation One slow machine can slow down the entire computation considerably

Shortcomings of Dryad Very flexible but steep learning curve for the programming model

26© Carnegie Mellon University in Qatar

Page 27: Cloud Computingmhhammou/15319-s12/lectures/... · 2012-05-14 · Cloud Computing CS 15-319 Dryad and GraphLab Lecture 11, Feb 22, 2012 Majd F. Sakr, Suhail Rehman and Mohammad Hammoud

GraphLab GraphLab is a framework for parallel machine learning

27

Data Graph

Shared Data Table

Scheduling

Update Functions and Scopes

© Carnegie Mellon University in Qatar

Page 28: Cloud Computingmhhammou/15319-s12/lectures/... · 2012-05-14 · Cloud Computing CS 15-319 Dryad and GraphLab Lecture 11, Feb 22, 2012 Majd F. Sakr, Suhail Rehman and Mohammad Hammoud

GraphLab In this part, the following concepts of GraphLab will

be described:

Motivation for GraphLab GraphLab Data Model and Update Mechanisms Scheduling in GraphLab Consistency Models in GraphLab PageRank in GraphLab

28© Carnegie Mellon University in Qatar

Page 29: Cloud Computingmhhammou/15319-s12/lectures/... · 2012-05-14 · Cloud Computing CS 15-319 Dryad and GraphLab Lecture 11, Feb 22, 2012 Majd F. Sakr, Suhail Rehman and Mohammad Hammoud

Data Graph A graph in GraphLab is associated with data at every vertex and edge

29

Data Graph

Arbitrary blocks of data can be assigned to vertices and edges

© Carnegie Mellon University in Qatar

Page 30: Cloud Computingmhhammou/15319-s12/lectures/... · 2012-05-14 · Cloud Computing CS 15-319 Dryad and GraphLab Lecture 11, Feb 22, 2012 Majd F. Sakr, Suhail Rehman and Mohammad Hammoud

Update Functions The data graph is modified using update functions

The update function can modify a vertex v and its neighborhood, defined as thescope of v (Sv)

30

v

Sv

© Carnegie Mellon University in Qatar

Page 31: Cloud Computingmhhammou/15319-s12/lectures/... · 2012-05-14 · Cloud Computing CS 15-319 Dryad and GraphLab Lecture 11, Feb 22, 2012 Majd F. Sakr, Suhail Rehman and Mohammad Hammoud

Shared Data Table

Certain algorithms require global information that is shared among all vertices (Algorithm Parameters, Statistics, etc.) GraphLab exposes a Shared Data Table (SDT)

SDT is an associative map between keys and arbitrary blocks of data T[Key] → Value

The shared data table is updated using the sync mechanism

31

Shared Data Table

© Carnegie Mellon University in Qatar

Page 32: Cloud Computingmhhammou/15319-s12/lectures/... · 2012-05-14 · Cloud Computing CS 15-319 Dryad and GraphLab Lecture 11, Feb 22, 2012 Majd F. Sakr, Suhail Rehman and Mohammad Hammoud

Sync Mechanism Similar to Reduce in MapReduce

User can define fold, merge and apply functions that are triggered during the global sync mechanism

Fold function allows the user to sequentially aggregate information across all vertices

Merge optionally allows user to perform a parallel tree reduction on the aggregated data collected during the fold operation

Apply function allows the user to finalize the resulting value from the fold/merge operations (such as normalization etc.)

32

syncShared Data Table

© Carnegie Mellon University in Qatar

Page 33: Cloud Computingmhhammou/15319-s12/lectures/... · 2012-05-14 · Cloud Computing CS 15-319 Dryad and GraphLab Lecture 11, Feb 22, 2012 Majd F. Sakr, Suhail Rehman and Mohammad Hammoud

GraphLab In this part, the following concepts of GraphLab will

be described:

Motivation for GraphLab GraphLab Data Model and Update Mechanisms Scheduling in GraphLab Consistency Models in GraphLab PageRank in GraphLab

33© Carnegie Mellon University in Qatar

Page 34: Cloud Computingmhhammou/15319-s12/lectures/... · 2012-05-14 · Cloud Computing CS 15-319 Dryad and GraphLab Lecture 11, Feb 22, 2012 Majd F. Sakr, Suhail Rehman and Mohammad Hammoud

Scheduling in GraphLab (1)

CPU 1

CPU 2

The scheduler determines the order that vertices are updated

ee ff gg

kkjjiihh

ddccbbaa bb

iihh

aa

ii

bb ee ff

jj

cc

Sche

dule

r

The process repeats until the scheduler is empty

© Carnegie Mellon University in Qatar 34

Page 35: Cloud Computingmhhammou/15319-s12/lectures/... · 2012-05-14 · Cloud Computing CS 15-319 Dryad and GraphLab Lecture 11, Feb 22, 2012 Majd F. Sakr, Suhail Rehman and Mohammad Hammoud

Scheduling in GraphLab (2)

An update schedule defines the order in which update functions are applied to vertices A parallel data-structure called the scheduler represents an abstract list of tasks

to be executed in Graphlab

Base (Vertex) schedulers in GraphLab Synchronous scheduler Round-robin scheduler

Job Schedulers in GraphLab FIFO scheduler Priority scheduler

Custom schedulers can be defined by the set scheduler Termination Assessment

If the scheduler has no remaining tasks Or, a termination function can be defined to check for convergence in the data

35© Carnegie Mellon University in Qatar

Page 36: Cloud Computingmhhammou/15319-s12/lectures/... · 2012-05-14 · Cloud Computing CS 15-319 Dryad and GraphLab Lecture 11, Feb 22, 2012 Majd F. Sakr, Suhail Rehman and Mohammad Hammoud

GraphLab In this part, the following concepts of GraphLab will

be described:

Motivation for GraphLab GraphLab Data Model and Update Mechanisms Scheduling in GraphLab Consistency Models in GraphLab PageRank in GraphLab

36© Carnegie Mellon University in Qatar

Page 37: Cloud Computingmhhammou/15319-s12/lectures/... · 2012-05-14 · Cloud Computing CS 15-319 Dryad and GraphLab Lecture 11, Feb 22, 2012 Majd F. Sakr, Suhail Rehman and Mohammad Hammoud

Need for Consistency Models How much can computation overlap?

© Carnegie Mellon University in Qatar 37

Page 38: Cloud Computingmhhammou/15319-s12/lectures/... · 2012-05-14 · Cloud Computing CS 15-319 Dryad and GraphLab Lecture 11, Feb 22, 2012 Majd F. Sakr, Suhail Rehman and Mohammad Hammoud

Consistency Models in GraphLab

GraphLab guarantees sequential consistency Guaranteed to give the same result as a sequential execution of the

computational steps

User-defined consistency models Full Consistency Vertex Consistency Edge Consistency

38© Carnegie Mellon University in Qatar

Page 39: Cloud Computingmhhammou/15319-s12/lectures/... · 2012-05-14 · Cloud Computing CS 15-319 Dryad and GraphLab Lecture 11, Feb 22, 2012 Majd F. Sakr, Suhail Rehman and Mohammad Hammoud

GraphLab In this part, the following concepts of GraphLab will

be described:

Motivation for GraphLab GraphLab Data Model and Update Mechanisms Scheduling in GraphLab Consistency Models in GraphLab PageRank in GraphLab

39© Carnegie Mellon University in Qatar

Page 40: Cloud Computingmhhammou/15319-s12/lectures/... · 2012-05-14 · Cloud Computing CS 15-319 Dryad and GraphLab Lecture 11, Feb 22, 2012 Majd F. Sakr, Suhail Rehman and Mohammad Hammoud

PageRank (1) PageRank is a link analysis algorithm

The rank value indicates an importance of a particular web page

A hyperlink to a page counts as a vote of support

A page that is linked to by many pages with high PageRank receives a high rank itself

A PageRank of 0.5 means there is a 50% chance that a person clicking on a random link will be directed to the document with the 0.5 PageRank

© Carnegie Mellon University in Qatar 40

Page 41: Cloud Computingmhhammou/15319-s12/lectures/... · 2012-05-14 · Cloud Computing CS 15-319 Dryad and GraphLab Lecture 11, Feb 22, 2012 Majd F. Sakr, Suhail Rehman and Mohammad Hammoud

PageRank (2) Iterate:

Where: α is the random reset probability L[j] is the number of links on page j

1 32

4 65

© Carnegie Mellon University in Qatar 41

Page 42: Cloud Computingmhhammou/15319-s12/lectures/... · 2012-05-14 · Cloud Computing CS 15-319 Dryad and GraphLab Lecture 11, Feb 22, 2012 Majd F. Sakr, Suhail Rehman and Mohammad Hammoud

pagerank(i, scope){// Get Neighborhood data(R[i], Wij, R[j]) scope;

// Update the vertex data

// Reschedule Neighbors if neededif R[i] changes then reschedule_neighbors_of(i);

}

;][)1(][][

×−+←iNj

ji jRWiR αα

PageRank Example in GraphLab PageRank algorithm is defined as a per-vertex operation working on the scope

of the vertex

Dynamic computation

© Carnegie Mellon University in Qatar 42

Page 43: Cloud Computingmhhammou/15319-s12/lectures/... · 2012-05-14 · Cloud Computing CS 15-319 Dryad and GraphLab Lecture 11, Feb 22, 2012 Majd F. Sakr, Suhail Rehman and Mohammad Hammoud

43

How MapReduce, Pregel, Dryad and GraphLab

Compare Against Each Other?

© Carnegie Mellon University in Qatar

Page 44: Cloud Computingmhhammou/15319-s12/lectures/... · 2012-05-14 · Cloud Computing CS 15-319 Dryad and GraphLab Lecture 11, Feb 22, 2012 Majd F. Sakr, Suhail Rehman and Mohammad Hammoud

Comparison of the Programming Models

MapReduce Pregel Dryad GraphLab

ProgrammingModel

Fixed Functions –Map and Reduce

Supersteps over a data graph with messages passed

DAG with program vertices and data edges

Data graph with shared data table and update functions

44

MapReduce Pregel Dryad GraphLab

ProgrammingModel

Fixed Functions –Map and Reduce

Supersteps over a data graph with messages passed

DAG with program vertices and data edges

Data graph with shared data table and update functions

Parallelism Concurrent execution of tasks within map and reduce phases

Concurrent executionof user functions over vertices within a superstep

Concurrent execution of vertices during a stage

Concurrent execution of non-overlapping scopes, defined by consistency model

MapReduce Pregel Dryad GraphLab

ProgrammingModel

Fixed Functions –Map and Reduce

Supersteps over a data graph with messages passed

DAG with program vertices and data edges

Data graph with shared data table and update functions

Parallelism Concurrent execution of tasks within map and reduce phases

Concurrent executionof user functions over vertices within a superstep

Concurrent execution of vertices during a stage

Concurrent execution of non-overlapping scopes, defined by consistency model

Data Handling Distributed file system

Distributed file system

Flexible data channels: Memory, Files, DFS etc.

Undefined – Graphs can be in memory or on disk

MapReduce Pregel Dryad GraphLab

ProgrammingModel

Fixed Functions –Map and Reduce

Supersteps over a data graph with messages passed

DAG with program vertices and data edges

Data graph with shared data table and update functions

Parallelism Concurrent execution of tasks within map and reduce phases

Concurrent executionof user functions over vertices within a superstep

Concurrent execution of vertices during a stage

Concurrent execution of non-overlapping scopes, defined by consistency model

Data Handling Distributed file system

Distributed file system

Flexible data channels: Memory, Files, DFS etc.

Undefined – Graphs can be in memory or on disk

Task Scheduling Fixed Phases –HDFS Locality based map task assignment

Partitioned Graph and Inputs assigned by assignment functions

Job and Stage Managers assign vertices to avaiableddaemons

Pluggable schedulers to schedule update functions

MapReduce Pregel Dryad GraphLab

ProgrammingModel

Fixed Functions –Map and Reduce

Supersteps over a data graph with messages passed

DAG with program vertices and data edges

Data graph with shared data table and update functions

Parallelism Concurrent execution of tasks within map and reduce phases

Concurrent executionof user functions over vertices within a superstep

Concurrent execution of vertices during a stage

Concurrent execution of non-overlapping scopes, defined by consistency model

Data Handling Distributed file system

Distributed file system

Flexible data channels: Memory, Files, DFS etc.

Undefined – Graphs can be in memory or on disk

Task Scheduling Fixed Phases –HDFS Locality based map task assignment

Partitioned Graph and Inputs assigned by assignment functions

Job and Stage Managers assign vertices to avaiableddaemons

Pluggable schedulers to schedule update functions

Fault Tolerance DFS replication + Task reassignment / Speculative execution of Tasks

Checkpointing and superstep re-execution

Vertex and Edge failure recovery

Synchronous and asychronoussnapshots

MapReduce Pregel Dryad GraphLab

ProgrammingModel

Fixed Functions –Map and Reduce

Supersteps over a data graph with messages passed

DAG with program vertices and data edges

Data graph with shared data table and update functions

Parallelism Concurrent execution of tasks within map and reduce phases

Concurrent executionof user functions over vertices within a superstep

Concurrent execution of vertices during a stage

Concurrent execution of non-overlapping scopes, defined by consistency model

Data Handling Distributed file system

Distributed file system

Flexible data channels: Memory, Files, DFS etc.

Undefined – Graphs can be in memory or on disk

Task Scheduling Fixed Phases –HDFS Locality based map task assignment

Partitioned Graph and Inputs assigned by assignment functions

Job and Stage Managers assign vertices to availabledaemons

Pluggable schedulers to schedule update functions

Fault Tolerance DFS replication + Task reassignment / Speculative execution of Tasks

Checkpointing and superstep re-execution

Vertex and Edge failure recovery

Synchronous and asychronoussnapshots

Developed by Google Google Microsoft Carnegie Mellon

© Carnegie Mellon University in Qatar

Page 45: Cloud Computingmhhammou/15319-s12/lectures/... · 2012-05-14 · Cloud Computing CS 15-319 Dryad and GraphLab Lecture 11, Feb 22, 2012 Majd F. Sakr, Suhail Rehman and Mohammad Hammoud

References This presentation has elements borrowed from various papers and

presentations:

Papers: Pregel: http://kowshik.github.com/JPregel/pregel_paper.pdf Dryad: http://research.microsoft.com/pubs/63785/eurosys07.pdf GraphLab: http://www.select.cs.cmu.edu/publications/paperdir/uai2010-low-gonzalez-kyrola-

bickson-guestrin-hellerstein.pdf

Presentations: Dryad Presentation at Berkeley by M. Budiu:

http://budiu.info/work/dryad-talk-berkeley09.pptx GraphLab1 Presentation: http://graphlab.org/uai2010_graphlab.pptx GraphLab2 Presentation: http://graphlab.org/presentations/nips-biglearn-2011.pptx

45© Carnegie Mellon University in Qatar

Page 46: Cloud Computingmhhammou/15319-s12/lectures/... · 2012-05-14 · Cloud Computing CS 15-319 Dryad and GraphLab Lecture 11, Feb 22, 2012 Majd F. Sakr, Suhail Rehman and Mohammad Hammoud

Next Class

Distributed File Systems

© Carnegie Mellon University in Qatar 46