Graph Processing - GraphLab


Page 1: Graph processing - Graphlab

GraphLab: A New Framework For Parallel Machine Learning

Amir H. Payberah
[email protected]

Amirkabir University of Technology (Tehran Polytechnic)

Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 1 / 42

Page 2: Graph processing - Graphlab

Reminder


Page 3: Graph processing - Graphlab

Data-Parallel Model for Large-Scale Graph Processing

I The platforms that have worked well for developing parallel applications are not necessarily effective for large-scale graph problems.

Page 4: Graph processing - Graphlab

Graph-Parallel Processing

I Restricts the types of computation.

I New techniques to partition and distribute graphs.

I Exploit graph structure.

I Executes graph algorithms orders of magnitude faster than more general data-parallel systems.

Page 5: Graph processing - Graphlab

Data-Parallel vs. Graph-Parallel Computation


Page 6: Graph processing - Graphlab

Pregel

I Vertex-centric

I Bulk Synchronous Parallel (BSP)

I Runs in sequence of iterations (supersteps)

I A vertex in superstep S can:
  • read messages sent to it in superstep S-1.
  • send messages to other vertices, to be received in superstep S+1.
  • modify its state.
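The superstep loop above can be sketched as a toy BSP engine in Python (a minimal sketch; `run_supersteps`, the `compute` signature, and the vertex-dict layout are illustrative assumptions, not Pregel's actual API):

```python
from collections import defaultdict

def run_supersteps(vertices, compute, max_steps=30):
    """Minimal BSP loop: messages sent in superstep S are delivered
    in superstep S+1; stops when every vertex has voted to halt and
    no messages are in flight."""
    inbox = defaultdict(list)            # vertex id -> messages from S-1
    active = set(vertices)
    for _ in range(max_steps):
        outbox = defaultdict(list)
        for v in list(active):
            # compute reads this step's inbox, may mutate vertex state,
            # and returns (outgoing (dst, msg) pairs, votes_to_halt)
            msgs, halt = compute(v, vertices[v], inbox[v])
            for dst, m in msgs:
                outbox[dst].append(m)
            if halt:
                active.discard(v)
        inbox = outbox
        active |= set(outbox)            # an incoming message reactivates
        if not active:
            break
    return vertices
```

As in Pregel, a vertex that has voted to halt is reactivated by an incoming message; the computation ends when all vertices are halted and no messages remain.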



Page 10: Graph processing - Graphlab

Pregel Limitations

I Inefficient if different regions of the graph converge at different speeds.

I Can suffer if one task is more expensive than the others.

I The runtime of each phase is determined by the slowest machine.

Page 11: Graph processing - Graphlab


Page 12: Graph processing - Graphlab

Data Model

I A directed graph that stores the program state, called the data graph.

Page 13: Graph processing - Graphlab

Vertex Scope

I The scope of vertex v is the data stored in v itself and in all adjacent vertices and adjacent edges.

Page 14: Graph processing - Graphlab

Programming Model (1/3)

I Rather than adopting message passing as in Pregel, GraphLab allows the user-defined function of a vertex to read and modify any of the data in its scope.

Page 15: Graph processing - Graphlab

Programming Model (2/3)

I Update function: user-defined function similar to Compute in Pregel.

I Can read and modify the data within the scope of a vertex.

I Schedules the future execution of other update functions.
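A toy shared-memory engine makes the contrast with Pregel concrete: the update function reads neighbor data straight out of the graph (its scope) and schedules further updates instead of sending messages. All names here (`graphlab_engine`, `min_label`, the graph-dict layout) are hypothetical sketch names, not GraphLab's real C++ interface:

```python
from collections import deque

def graphlab_engine(graph, update, initial_vertices):
    """Toy shared-memory engine: a task is an (update, vertex) pair.
    The update reads/writes data in its scope directly and returns
    the vertices whose tasks it wants to (re)schedule."""
    tasks = deque((update, v) for v in initial_vertices)
    scheduled = set(initial_vertices)
    while tasks:
        f, v = tasks.popleft()
        scheduled.discard(v)
        for u in f(graph, v):
            if u not in scheduled:
                scheduled.add(u)
                tasks.append((f, u))

def min_label(graph, v):
    """Example update: pull the minimum label from the scope; if it
    changed, reschedule the neighbors (no messages involved)."""
    data = graph[v]
    new = min([data["value"]] + [graph[u]["value"] for u in data["nbrs"]])
    if new < data["value"]:
        data["value"] = new
        return data["nbrs"]
    return []
```

Running `min_label` from every vertex spreads the smallest label through each connected component, converging when no update changes anything.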


Page 16: Graph processing - Graphlab

Programming Model (3/3)

I Sync function: similar to aggregate in Pregel.

I Maintains global aggregates.

I Runs periodically in the background.

Page 17: Graph processing - Graphlab

Execution Model

I Each task in the set of tasks T is a tuple (f, v) consisting of an update function f and a vertex v.

I After executing an update function, the modified scope data Sv is written back to the data graph.


Page 20: Graph processing - Graphlab

Example: PageRank

GraphLab_PageRank(i)
  // compute sum over neighbors
  total = 0
  foreach(j in in_neighbors(i)):
    total = total + R[j] * w_ji
  // update the PageRank
  R[i] = 0.15 + total
  // trigger neighbors to run again
  foreach(j in out_neighbors(i)):
    signal vertex-program on j

R[i] = 0.15 + Σ_{j ∈ Nbrs(i)} w_ji · R[j]
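The same vertex program can be iterated to a fixed point in a few lines of Python (a sketch: the `pagerank` function, the edge-weight dict `w`, and plain synchronous iteration are illustrative; the slide's w_ji is assumed to already fold in the 0.85 damping factor):

```python
def pagerank(in_nbrs, w, iters=200):
    """Iterate the slide's update R[i] = 0.15 + sum_j w[(j, i)] * R[j]
    synchronously over all vertices until (approximately) converged.

    in_nbrs: vertex -> list of in-neighbors
    w:       (src, dst) -> edge weight (assumed to include damping)"""
    R = {i: 1.0 for i in in_nbrs}
    for _ in range(iters):
        R = {i: 0.15 + sum(w[(j, i)] * R[j] for j in in_nbrs[i])
             for i in in_nbrs}
    return R
```

Because every w_ji ≤ 0.85, the iteration is a contraction and converges geometrically regardless of the starting ranks.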


Page 21: Graph processing - Graphlab

Data Consistency (1/3)

I Overlapped scopes: a race condition can arise when two update functions execute simultaneously.

I Full consistency: during the execution of f(v), no other function reads or modifies data within the scope of v.


Page 23: Graph processing - Graphlab

Data Consistency (2/3)

I Edge consistency: during the execution of f(v), no other function reads or modifies any of the data on v or on any of the edges adjacent to v.

Page 24: Graph processing - Graphlab

Data Consistency (3/3)

I Vertex consistency: during the execution of f(v), no other function will be applied to v.

Page 25: Graph processing - Graphlab

Sequential Consistency (1/2)

I Proving the correctness of a parallel algorithm relies on sequential consistency.

I Sequential consistency: for every parallel execution, there exists a sequential execution of update functions that produces an equivalent result.


Page 27: Graph processing - Graphlab

Sequential Consistency (2/2)

I A simple method to achieve serializability is to ensure that the scopes of concurrently executing update functions do not overlap:

• The full consistency model is used.

• The edge consistency model is used and update functions do not modify data in adjacent vertices.

• The vertex consistency model is used and update functions only access local vertex data.


Page 31: Graph processing - Graphlab

Consistency vs. Parallelism

[Low, Y., GraphLab: A Distributed Abstraction for Large Scale Machine Learning (Doctoral dissertation, University of California), 2013.]

Page 32: Graph processing - Graphlab

GraphLab Implementation

I Shared memory implementation

I Distributed implementation



Page 34: Graph processing - Graphlab

Task Schedulers (1/2)

I In what order should the tasks (vertex-update function pairs) be called?

• A collection of base schedulers, e.g., round-robin and synchronous.
• Set scheduler: enables users to compose custom update schedules.


Page 36: Graph processing - Graphlab

Task Schedulers (2/2)

I How are new tasks added to the queue?

• FIFO: permits only task creation; does not permit task reordering.
• Prioritized: permits task reordering at the cost of increased overhead.
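The FIFO/prioritized trade-off can be sketched with two small queues (illustrative class names and a simplified `add`/`pop` interface, not GraphLab's scheduler API):

```python
import heapq
from collections import deque

class FifoScheduler:
    """Tasks run in creation order; re-adding a pending task is a no-op,
    so no reordering is possible."""
    def __init__(self):
        self.q, self.pending = deque(), set()
    def add(self, v, priority=0.0):
        if v not in self.pending:
            self.pending.add(v)
            self.q.append(v)
    def pop(self):
        v = self.q.popleft()
        self.pending.discard(v)
        return v

class PriorityScheduler:
    """Highest-priority task runs first; re-adding with a higher
    priority promotes a pending task (the extra bookkeeping is the
    overhead the slide mentions)."""
    def __init__(self):
        self.heap, self.best = [], {}
    def add(self, v, priority=0.0):
        if priority > self.best.get(v, float("-inf")):
            self.best[v] = priority
            heapq.heappush(self.heap, (-priority, v))
    def pop(self):
        while True:
            p, v = heapq.heappop(self.heap)
            if self.best.get(v) == -p:   # skip stale (superseded) entries
                del self.best[v]
                return v
```

Lazy deletion (skipping stale heap entries) keeps `add` cheap at the cost of occasional extra pops, one common way to pay the reordering overhead.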



Page 38: Graph processing - Graphlab

Consistency

I Implemented in C++ using PThreads for parallelism.

I Consistency is enforced with read-write locks:

I Vertex consistency
  • Central vertex (write-lock)

I Edge consistency
  • Central vertex (write-lock)
  • Adjacent vertices (read-locks)

I Full consistency
  • Central vertex (write-lock)
  • Adjacent vertices (write-locks)

I Deadlocks are avoided by acquiring locks sequentially following a canonical order.
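The canonical-order idea can be shown in a few lines (a hypothetical shared-memory sketch with one exclusive lock per vertex; the real engine uses read-write locks, which Python's threading module lacks natively, so this illustrates only the ordering discipline):

```python
import threading

# toy lock table: one exclusive lock per vertex (sketch assumption)
locks = {v: threading.Lock() for v in range(4)}

def run_update(v, nbrs, update):
    """Lock v's whole scope in ascending vertex-id order, run the
    update, then release in reverse. Because every thread acquires
    locks in the same global order, no wait-for cycle (deadlock)
    can ever form between overlapping scopes."""
    order = sorted({v, *nbrs})
    for u in order:
        locks[u].acquire()
    try:
        update(v)
    finally:
        for u in reversed(order):
            locks[u].release()
```

If one thread locks scope {1, 2} and another locks scope {2, 3}, both walk their lock lists in ascending id order, so whichever reaches lock 2 first simply makes the other wait; they can never hold resources in a cycle.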



Page 40: Graph processing - Graphlab

GraphLab Implementation

I Shared memory implementation

I Distributed implementation


Page 41: Graph processing - Graphlab

Distributed Implementation

I Graph partitioning
  • How to efficiently load, partition, and distribute the data graph across machines?

I Consistency
  • How to achieve consistency in the distributed setting?

I Fault tolerance

Page 42: Graph processing - Graphlab

Graph Partitioning


Page 43: Graph processing - Graphlab

Graph Partitioning - Phase 1 (1/2)

I Two-phase partitioning.

I Phase one partitions the data graph into k parts, called atoms.
  • k ≫ number of machines

I Meta-graph: the graph of atoms (one vertex for each atom).

I Atom weight: the amount of data the atom stores.

I Edge weight: the number of edges crossing between atoms.


Page 47: Graph processing - Graphlab

Graph Partitioning - Phase 1 (2/2)

I Each atom is stored as a separate file on a distributed storage system, e.g., HDFS.

I Each atom file is a simple binary that stores the interior and the ghosts of the partition.

I Ghost: the set of vertices and edges adjacent to the partition boundary.


Page 50: Graph processing - Graphlab

Graph Partitioning - Phase 2

I The meta-graph is very small.

I A fast balanced partitioning of the meta-graph over the physical machines.

I Assigns graph atoms to machines.

Page 51: Graph processing - Graphlab

Consistency


Page 52: Graph processing - Graphlab

Consistency

I Goal: to achieve a serializable parallel execution of a set of dependent tasks.

• Chromatic engine
• Distributed locking engine

Page 53: Graph processing - Graphlab

Consistency - Chromatic Engine

I Construct a vertex coloring: assign a color to each vertex such that no adjacent vertices share the same color.

I Edge consistency: execute, synchronously, all update tasks associated with vertices of the same color before proceeding to the next color.

I Full consistency: no vertex shares the same color as any of its distance-two neighbors.

I Vertex consistency: assign all vertices the same color.
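For the edge-consistency case, the color steps can be sketched with a simple greedy coloring (function names and the adjacency-dict layout are illustrative; GraphLab's actual coloring heuristic may differ):

```python
def greedy_coloring(adj):
    """Greedy vertex coloring: scan vertices in id order and give each
    the smallest color not used by an already-colored neighbor, so no
    two adjacent vertices share a color."""
    color = {}
    for v in sorted(adj):
        taken = {color[u] for u in adj[v] if u in color}
        c = 0
        while c in taken:
            c += 1
        color[v] = c
    return color

def color_steps(adj):
    """Group vertices by color: all updates within one color step can
    run in parallel under edge consistency, one step at a time."""
    color = greedy_coloring(adj)
    steps = {}
    for v, c in color.items():
        steps.setdefault(c, []).append(v)
    return [steps[c] for c in sorted(steps)]
```

On a path graph 0-1-2, two colors suffice, so vertices 0 and 2 update in parallel in step one and vertex 1 in step two; a triangle needs three steps because every pair is adjacent.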



Page 57: Graph processing - Graphlab

Consistency - Distributed Locking Engine

I Associate a readers-writer lock with each vertex.

I Vertex consistency
  • Central vertex (write-lock)

I Edge consistency
  • Central vertex (write-lock), adjacent vertices (read-locks)

I Full consistency
  • Central vertex (write-lock), adjacent vertices (write-locks)

I Deadlocks are avoided by acquiring locks sequentially following a canonical order.


Page 62: Graph processing - Graphlab

Fault Tolerance


Page 63: Graph processing - Graphlab

Fault Tolerance - Synchronous

I The system periodically signals all computation activity to halt.

I It then synchronizes all caches (ghosts) and saves to disk all data modified since the last snapshot.

I Simple, but eliminates the system's advantage of asynchronous computation.


Page 66: Graph processing - Graphlab

Fault Tolerance - Asynchronous

I Based on the Chandy-Lamport algorithm.

I The snapshot function is implemented as an update function on vertices.

I The snapshot update takes priority over all other update functions.

I Edge consistency is used on all update functions.


Page 71: Graph processing - Graphlab

Summary


Page 72: Graph processing - Graphlab

GraphLab Summary

I Asynchronous model

I Vertex-centric

I Communication: distributed shared memory

I Three consistency levels: full, edge-level, and vertex-level

I Partitioning: two-phase partitioning

I Consistency: chromatic engine (graph coloring), distributed locking engine (readers-writer locks)

I Fault tolerance: synchronous, asynchronous (Chandy-Lamport)

Page 73: Graph processing - Graphlab

GraphLab Limitations

I Poor performance on natural graphs (skewed, power-law degree distributions).

Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 41 / 42

Page 74: Graph processing - Graphlab

Questions?
