
Page 1

QSX: Querying Social Graphs

Parallel models for querying graphs beyond

MapReduce

Vertex-centric models

– Pregel (BSP)

– GraphLab

GRAPE

Page 2

Inefficiency of MapReduce

[Figure: MapReduce data flow: input <k1, v1> pairs go to mappers; intermediate results are shipped all-to-all to reducers, which output <k2, v2> and <k3, v3> pairs]

Blocking: Reduce does not start until all Map tasks are completed

Other reasons?

Intermediate results shipping: all to all

Write to disk and read from disk in each step, although the data does not change in loops

Page 3

The need for parallel models beyond MapReduce

Can we do better for graph algorithms?

MapReduce:

• Inefficiency: blocking, intermediate result shipping (all to all); write to disk and read from disk in each step, even for invariant data in a loop

• Does not support iterative graph computations:

• External driver

• No mechanism to support global data structures that can be accessed and updated by all mappers and reducers

• Support for incremental computation?

• Have to re-cast algorithms in MapReduce, hard to reuse existing (incremental) algorithms

• General model, not limited to graphs

Page 4

Vertex-centric models


Page 5

Bulk Synchronous Parallel Model (BSP)

Processing: a series of supersteps

Vertex: computation is defined to run on each vertex

Superstep S: all vertices compute in parallel; each vertex v may

– receive messages sent to v from superstep S – 1;

– perform some computation: modify its state and the state of its outgoing edges

– send messages to other vertices (to be received in the next superstep)

Message passing

Vertex-centric, message passing

Leslie G. Valiant: A Bridging Model for Parallel Computation. Commun. ACM 33(8): 103-111 (1990)

Supersteps are analogous to MapReduce rounds

Page 6

Pregel: think like a vertex

Vertex: modify its state/edge state/edge sets (topology)

Supersteps: within each, all vertices compute in parallel

Termination:

– Each vertex votes to halt

– When all vertices are inactive and no messages in transit

Synchronization: supersteps

Asynchronous: all vertices within each superstep

Input: a directed graph G

– Each vertex v: a node id, and a value

– Edges: contain values (associated with vertices)

Page 7

Example: maximum value

Superstep 0: 3 6 2 1
Superstep 1: 6 6 2 6
Superstep 2: 6 6 6 6
Superstep 3: 6 6 6 6

Shaded vertices: voted to halt

message passing
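The superstep trace above can be reproduced with a small Python sketch of the BSP loop (a simplification, not the real Pregel API; the function name bsp_max is illustrative): each vertex adopts the maximum of its value and incoming messages, re-sends only when its value changed, and otherwise votes to halt.

```python
# Sketch of the max-value example as synchronous BSP supersteps.
def bsp_max(graph, values):
    """graph: adjacency dict (vertex -> list of neighbours); values: vertex -> int."""
    active = set(graph)                      # all vertices start active
    inbox = {v: [] for v in graph}
    superstep = 0
    while active or any(inbox.values()):     # halt: all inactive, no messages in transit
        outbox = {v: [] for v in graph}
        for v in graph:
            new_val = max([values[v]] + inbox[v])
            changed = new_val != values[v]
            values[v] = new_val
            if changed or superstep == 0:    # send only when the value changed
                for u in graph[v]:
                    outbox[u].append(new_val)
                active.add(v)
            else:
                active.discard(v)            # vote to halt
        inbox = outbox
        superstep += 1
    return values, superstep
```

Running it on a four-vertex chain with values 3, 6, 2, 1 propagates the maximum 6 to every vertex, as in the figure.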

Page 8

Vertex API

Template (VertexValue, EdgeValue, MessageValue)

class Vertex {

  void Compute(MessageIterator: msgs);

  const vertex_id;

  const superstep();

  const VertexValue& GetValue();

  VertexValue* MutableValue();

  OutEdgeIterator GetOutEdgeIterator();

  void SendMessageTo(dest_vertex, MessageValue& message);

  void VoteToHalt();

}

User defined

Iteration control

Vertex value: mutable

Outgoing edges

Message passing: messages can be sent to any vertex whose id is known

All messages received

Think like a vertex: local computation

Page 9

PageRank

The likelihood that page v is visited by a random walk:

P(v) = ε/|V| + (1 − ε) · Σ_{u ∈ L(v)} P(u)/C(u)

where L(v) is the set of pages with a link to v, and C(u) is the number of outgoing links of u

Recursive computation: for each page v in G,
• compute P(v) by using P(u) for all u ∈ L(v)

until
• convergence: no changes to any P(v), or
• a fixed number of iterations have been performed

ε/|V|: random jump; the second term: following a link from other pages

A BSP algorithm?

Page 10

PageRank in Pregel

PageRankVertex {

  Compute (MessageIterator: msgs) {

    if (superstep() >= 1) then {
      sum := 0;
      for all messages m in msgs do
        sum := sum + m.Value();
      *MutableValue() := ε/NumVertices() + (1 − ε) · sum;
    }

    if (superstep() < 30)
      then n := GetOutEdgeIterator().size();
           SendMessageToAllNeighbors(GetValue() / n);
      else VoteToHalt();

  }

}

P(v) = ε/|V| + (1 − ε) · Σ_{u ∈ L(v)} P(u)/C(u)

Assume 30 iterations

Pass revised rank to its neighbors

VertexValue: the current rank
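The vertex logic above can be sketched in plain Python as a synchronous superstep loop. The value ε = 0.15 is an illustrative assumption (the slide leaves the constant unnamed), and the sketch assumes every vertex has at least one outgoing edge.

```python
# Sketch of the slide's Pregel PageRank as synchronous supersteps.
def pregel_pagerank(graph, eps=0.15, supersteps=30):
    """graph: adjacency dict; every vertex is assumed to have out-edges."""
    n = len(graph)
    rank = {v: 1.0 / n for v in graph}
    inbox = {v: [] for v in graph}
    for step in range(supersteps + 1):
        outbox = {v: [] for v in graph}
        for v in graph:
            if step >= 1:                    # superstep >= 1: combine messages
                rank[v] = eps / n + (1 - eps) * sum(inbox[v])
            if step < supersteps:            # pass revised rank to neighbours
                share = rank[v] / len(graph[v])
                for u in graph[v]:
                    outbox[u].append(share)
        inbox = outbox
    return rank
```

On a symmetric 3-cycle every vertex keeps rank 1/3, and the ranks always sum to 1.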

Page 11

Dijkstra’s algorithm for distance queries

Distance: single-source shortest-path problem
• Input: A directed weighted graph G, and a node s in G
• Output: The lengths of shortest paths from s to all nodes in G

Dijkstra (G, s, w):
1. for all nodes v in V do
   a. d[v] ← ∞;
2. d[s] ← 0; Que ← V;
3. while Que is nonempty do
   a. u ← ExtractMin(Que);
   b. for all nodes v in adj(u) do
      a) if d[v] > d[u] + w(u, v) then d[v] ← d[u] + w(u, v);

Use a priority queue Que; w(u, v): weight of edge (u, v); d(u): the distance from s to u

ExtractMin(Que): extract the node u with the minimum d(u)

An algorithm in Pregel?
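For reference, a runnable Python version of the pseudocode above; it uses a binary heap with lazy deletion as the priority queue Que (a common substitute for ExtractMin plus decrease-key).

```python
import heapq

# Runnable version of the slide's Dijkstra pseudocode.
def dijkstra(graph, s):
    """graph: dict u -> list of (v, weight); returns distances from s."""
    dist = {v: float('inf') for v in graph}
    dist[s] = 0
    que = [(0, s)]
    while que:
        d, u = heapq.heappop(que)            # ExtractMin(Que)
        if d > dist[u]:
            continue                         # stale heap entry, skip
        for v, w in graph[u]:
            if dist[v] > d + w:              # relax edge (u, v)
                dist[v] = d + w
                heapq.heappush(que, (dist[v], v))
    return dist
```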

Page 12

Distance queries in Pregel

ShortestPathVertex {

Compute (MessageIterator: msgs) {

if isSource(vertex_id())
  then minDist := 0
  else minDist := ∞;

for all messages m in msgs do
  minDist := min(minDist, m.Value());

if minDist < GetValue()
  then *MutableValue() := minDist;

for all nodes v linked to from the current node u do

SendMessageTo(v, minDist + w(u, v));

VoteToHalt();

}

Think like a vertex

MutableValue: the current distance

Pass revised distance to its neighbors

Messages: distances to u

Refer to the current node as u

aggregation

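The same vertex logic, run synchronously in plain Python (the name pregel_sssp and the dict-based message delivery are illustrative simplifications of a real Pregel worker):

```python
# Sketch of ShortestPathVertex as synchronous supersteps.
INF = float('inf')

def pregel_sssp(graph, weights, source):
    """graph: adjacency dict; weights: dict (u, v) -> edge weight."""
    value = {v: INF for v in graph}          # current best distance per vertex
    inbox = {v: ([0] if v == source else []) for v in graph}
    while any(inbox.values()):               # halt when no messages in transit
        outbox = {v: [] for v in graph}
        for u in graph:
            if not inbox[u]:
                continue                     # no messages: vertex stays halted
            min_dist = min(inbox[u])         # aggregate incoming distances
            if min_dist < value[u]:
                value[u] = min_dist          # update the mutable value
                for v in graph[u]:           # pass revised distance onward
                    outbox[v].append(min_dist + weights[(u, v)])
        inbox = outbox
    return value
```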

Page 13

Combiner and Aggregation

Each vertex can provide a value to an aggregator in any superstep S.

System aggregates these values (“reduce”)

The aggregated values are made available to all vertices in superstep S + 1.

optimization

Combine several messages intended for a vertex

– Provided that the messages can be aggregated (“reduced”) by using some associative and commutative function

– Reduce the number of messages

Global data structures
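A minimal sketch of message combining (the function name combine is hypothetical): messages headed to the same destination vertex are reduced with an associative, commutative function before shipping, which cuts the number of messages.

```python
# Sketch of a combiner: reduce messages per destination before sending.
# For shortest paths, min is a valid (associative, commutative) combiner.
def combine(outgoing, combiner=min):
    """outgoing: list of (dest_vertex, message) pairs -> one message per dest."""
    combined = {}
    for dest, msg in outgoing:
        combined[dest] = msg if dest not in combined else combiner(combined[dest], msg)
    return combined
```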

Page 14

Topology mutation

Function compute can add or remove vertices

Possible conflicts:

– Vertex 1 adds an edge to vertex 100

– Vertex 2 deletes vertex 100

– Vertex 1 creates a vertex 10 with value 10

– Vertex 2 also creates a vertex 10 with value 12

Handling conflicts:

– Partial order on operations: edge removal < vertex removal < vertex addition < edge addition

– System: random action or user specified

Extra power, yet increased complication

Page 15

Pregel implementation

Cross edges: minimize edges across partitions

– Sparsest Cut Problem

Master, worker

Vertices are assigned to machines: hash(vertex.id) mod N

Partitions can be user-specified, to co-locate all Web pages from the same site, for instance

Master: coordinate a set of workers (partitions, assignments)

Worker: processes one or more partitions, local computation

– Know the partition function and partitions assigned to it

– All vertices in a partition are initially active

– Worker notifies master of the number of active vertices at the end of a superstep

Giraph: http://www.apache.org/dyn/closer.cgi/giraph

Page 16

Fault tolerance

recovery

Checkpoints: master instructs workers to save state to HDFS
– Vertex values
– Edge values
– Incoming messages

Master saves aggregated values to disk

Worker failure:

– detected by regular “ping” messages issued by the master (a worker is marked failed if it does not respond within a specified interval)

– Recovered by creating a new worker, with the state stored from previous checkpoint

Page 17

The vertex-centric model of GraphLab

Vertex: computation is defined to run on each vertex

All vertices compute in parallel

– Each vertex reads and writes to data on adjacent nodes or edges

Consistency: serialization

– Full consistency: no overlap for concurrent updates

– Edge consistency: exclusive read-write to its vertex and adjacent edges; read only to adjacent vertices

– Vertex consistency: all updates in parallel (sync operations)

Asynchronous: all vertices

No supersteps

asynchronous

Machine learning, data mining

https://dato.com/products/create/open_source.html

Page 18

Vertex-centric models vs. MapReduce

Vertex centric: think like a vertex; MapReduce: think like a graph

Can we do better?

Vertex centric: maximize parallelism – asynchronous, minimize data shipment via message passing; support iterations

MapReduce: inefficiency caused by blocking; distributing intermediate results (all to all), unnecessary write/read; does not provide a mechanism to support iteration

Vertex centric: limited to graphs; MapReduce: general

Lack of global control: ordering for processing vertices in recursive computation, incremental computation, etc

New programming models, have to re-cast algorithms in MapReduce, hard to reuse existing (incremental) algorithms

Page 19

GRAPE: A parallel model based on partial evaluation


Page 20

Querying distributed graphs

Given a big graph G, and n processors S1, …, Sn:
• G is partitioned into fragments (G1, …, Gn)
• G is distributed to the n processors: Gi is stored at Si

Dividing a big G into small fragments of manageable size

Each processor Si processes its local fragment Gi in parallel

Parallel query answering
• Input: G = (G1, …, Gn), distributed to (S1, …, Sn), and a query Q
• Output: Q(G), the answer to Q in G

[Figure: Q posed against the whole graph G vs. Q evaluated on each fragment G1, G2, …, Gn]

How does it work?


Page 21

GRAPE (GRAPh Engine)


Divide and conquer

partition G into fragments (G1, …, Gn), distributed to various sites

manageable sizes

upon receiving a query Q,

• evaluate Q( Gi ) in parallel

• collect partial answers at a coordinator site, and assemble them to find the answer Q( G ) in the entire G

evaluate Q on smaller Gi

data-partitioned parallelism

Each machine (site) Si processes the same query Q, uses only data stored in its local fragment Gi


Page 22

Partial evaluation

The connection between partial evaluation and parallel processing

compute f(x) as f(s, d)

• s: the part of the input that is known (at each site, Gi is the known input)

• d: the yet unavailable input (the other fragments Gj)

Conduct the part of the computation that depends only on s, and generate a partial answer: a residual function

Partial evaluation in distributed query processing:

• evaluate Q(Gi) in parallel (the partial answers are residual functions)

• collect partial matches at a coordinator site, and assemble them to find the answer Q(G) in the entire G


Page 23

Coordinator


Coordinator: receive/post queries, control termination, and assemble answers

Upon receiving a query Q

• post Q to all workers

• Initialize a status flag for each worker, mutable by the worker

Terminate the computation when all flags are true

• Assemble partial answers from workers, and produce the final answer Q(G)

Termination, partial answer assembling

Each machine (site) Si is either a coordinator or a worker; a worker conducts local computation and produces partial answers


Page 24

Workers


Worker: conduct local computation and produce partial answers

upon receiving a query Q,

• evaluate Q( Gi ) in parallel

• send messages to request data for “border nodes”

use local data Gi only

Local computation, partial evaluation, recursion, partial answers

Incremental computation: upon receiving new messages M

• evaluate Q( Gi + M) in parallel

set its flag true if no more changes to partial results, and send the partial answer to the coordinator

This step repeats until the partial answer at site Si is ready

With edges to other fragments

Incremental computation

Page 25

Costly when G is big

Reachability and regular path queries

Reachability
• Input: A directed graph G, and a pair of nodes s and t in G
• Question: Does there exist a path from s to t in G?

Regular path
• Input: A node-labelled directed graph G, a pair of nodes s and t in G, and a regular expression R
• Question: Does there exist a path p from s to t such that the labels of adjacent nodes on p form a string in R?

O(|V| + |E|) time

O(|G||R|) time

Parallel algorithms?


Page 26

Reachability queries


Worker: conduct local computation and produce partial answers

upon receiving a query Q,

• evaluate Q( Gi ) in parallel

• send messages to request data for “border nodes”

Local computation: computing the value of Xv in Gi

With edges to other fragments

Boolean formulas as partial answers

• For each node v in Gi, a Boolean variable Xv, indicating whether v reaches destination t

• The truth value of Xv can be expressed as a Boolean formula over the variables Xv’ of the border nodes v’ in Gi


Page 27

Boolean variables

Partial evaluation by introducing Boolean variables

Locally evaluate each qr(v, t) in Gi in parallel: for each in-node v’ of Fi, decide whether v’ reaches t; introduce a Boolean variable Xv’ for each v’

Partial answer to qr(v, t): a set of Boolean formulas, the disjunction of the variables of the in-nodes v’ that v can reach:

Xv’ = qr(v’, t)

qr(v, t) = Xv1’ or … or Xvn’

Page 28

Distributed reachability: assembling

Assembling partial answers

Collect the Boolean equations at the coordinator

Solve the system of linear Boolean equations by using a dependency graph

qr(s, t) is true if and only if Xs = true in the equation system

Xv = Xv’’ or Xv’
Xv’’ = false
Xt = true
Xv’ = Xt
Xs = Xv

O(|Vf|)

Coordinator: Assemble partial answers from workers, and produce the final answer Q(G)

Only Vf, the set of border nodes in all fragments

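The assembling step can be sketched as a least-fixpoint computation over the dependency graph (the function name and the encoding of each equation as an OR-list of variables are assumptions for illustration; constants are encoded as an empty right-hand side, with known-true variables passed in separately):

```python
# Sketch: solve a system of Boolean equations, each of the form
# X = Y1 or ... or Yk, by iterating to the least fixpoint.
def solve_boolean_equations(equations, base_true):
    """equations: dict X -> list of variables whose OR defines X;
    base_true: set of variables known to be true (e.g. Xt)."""
    value = {x: x in base_true for x in equations}
    changed = True
    while changed:                           # iterate until no variable flips
        changed = False
        for x, rhs in equations.items():
            new = value[x] or any(value.get(y, False) for y in rhs)
            if new != value[x]:
                value[x] = new
                changed = True
    return value
```

On the slide's system (Xs = Xv, Xv = Xv’’ or Xv’, Xv’ = Xt, Xv’’ = false, Xt = true), the fixpoint yields Xs = true, i.e. qr(s, t) holds.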

Page 29


1. Dispatch Q to fragments (at Sc)

2. Partial evaluation: generating Boolean equations (at each Gi)

3. Assembling: solving the equation system (at Sc)

Example

[Figure: example graph at coordinator Sc, partitioned into fragments G1 (Fred "HR", Walt "HR", Bill "DB"), G2 (Jack "MK", Emmy "HR", Mat "HR"), G3 (Pat "SE", Tom "AI", Ross "HR"); query: does Ann reach Mark?]

No messages between different fragments

Page 30

Reachability queries in GRAPE


upon receiving a query Q,

• evaluate Q( Gi ) in parallel

• collect partial answers at a coordinator site, and assemble them to find the answer Q( G ) in the entire G

Think like a graph

Complexity analysis

• Parallel computation: in O(|Vf||Gm|) time
• One round: no incremental computation is needed
• Data shipment: O(|Vf|²), to send partial answers to the coordinator; no message passing between different fragments

Gm: the largest fragment

speedup? |Gm| = |G|/n

Complication: minimizing Vf? An NP-complete problem.

Approximation algorithm

F. Rahimian, A. H. Payberah, S. Girdzijauskas, M. Jelasity, and S. Haridi. Ja-be-ja: A distributed algorithm for balanced graph partitioning. Technical report, Swedish Institute of Computer Science, 2013

Page 31

Regular path queries in GRAPE

Incorporating the states of the NFA for R

Boolean formulas as partial answers

• Treat R as an NFA (with states)
• For each node v in Gi, a Boolean variable X(v, w), indicating whether v matches state w of R and reaches destination t
• X(v, f), for the final state f of the NFA, corresponds to the destination node t

Regular path queries

• Input: A node-labelled directed graph G, a pair of nodes s and t in G, and a regular expression R

• Question: Does there exist a path p from s to t such that the labels of adjacent nodes on p form a string in R?

Adding a regular expression R
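One way to realize this NFA-based view, sketched in Python (this is a centralized sketch, not GRAPE itself): a regular path query becomes reachability in the product of G with the NFA for R, where a pair (v, w) plays the role of the slide's X(v, w). The NFA encoding and the convention that the path string is the sequence of node labels starting at s are assumptions.

```python
from collections import deque

# Sketch: regular path query as BFS over the product of G and an NFA for R.
def regular_reach(adj, labels, nfa, start_state, final_states, s, t):
    """adj: node -> list of successors; labels: node -> label;
    nfa: (state, label) -> set of successor states."""
    seen = {(s, w) for w in nfa.get((start_state, labels[s]), ())}
    queue = deque(seen)
    while queue:
        v, w = queue.popleft()
        if v == t and w in final_states:
            return True                      # path from s to t spells a word in R
        for u in adj.get(v, []):
            for w2 in nfa.get((w, labels[u]), ()):   # consume u's label
                if (u, w2) not in seen:
                    seen.add((u, w2))
                    queue.append((u, w2))
    return False
```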

Page 32

Boolean variables

Partial answers as Boolean formulas

For each node v in Gi, assign v.rvec: a vector of O(|Vq|) Boolean formulas; each entry v.rvec[w] denotes whether v matches state w

Introduce a Boolean variable X(v’, w) for each border node v’ of Gi and each state w in Vq, denoting whether v’ matches w

Partial answer to qrr(s, t): a set of Boolean formulas, one from each in-node of Fi

|Vq|: the number of states in R


Page 33

Regular path queries

[Figure: pattern (Ann, DB, HR, Mark) and the fragment graph: fragment F1 with "virtual nodes" standing for nodes in other fragments, connected by cross edges]


Page 34

Regular reachability queries


[Figure: the example graph partitioned into fragments F1, F2, F3, with labelled nodes (Fred "HR", Walt "HR", Bill "DB", Ann, Mat "HR", Pat "SE", Emmy "HR", Ross "HR", Mark)]

F1: Y(Ann, Mark) = X(Pat, DB) or X(Mat, HR); X(Fred, HR) = X(Emmy, HR)

F2: X(Emmy, HR) = X(Ross, HR); X(Mat, HR) = X(Fred, HR)

F3: X(Pat, DB) = false; X(Ross, HR) = true

Solving the system: Y(Ann, Mark) = true

Boolean variables for “virtual nodes” reachable from Ann

Boolean equations at each site, in parallel

The same query is partially evaluated at each site in parallel

Assemble partial answers: solve the system of Boolean equations

Only the query and the Boolean equations need to be shipped

Each site is visited once


Page 35

Regular path queries in GRAPE


upon receiving a query Q,

• evaluate Q( Gi ) in parallel

• collect partial answers at a coordinator site, and assemble them to find the answer Q( G ) in the entire G

Think like a graph: process an entire fragment

Complexity analysis

• Parallel computation: in O((|Vf|² + |Gm|) |R|²) time
• One round: no incremental computation is needed
• Data shipment: O(|R|² |Vf|²), to send partial answers to the coordinator; no message passing between different fragments

Gm: the largest fragment

Speedup: |Gm| = |G|/n, and R is small in practice

Page 36

Graph pattern matching by graph simulation

Input: A directed graph G, and a graph pattern Q
Output: the maximum simulation relation R

A parallel algorithm in GRAPE?

Maximum simulation relation: always exists and is unique
• If a match relation exists, then there exists a maximum one
• Otherwise, it is the empty set – still maximum

Complexity: O((|V| + |VQ|) (|E| + |EQ|)) time

The output is a unique relation, possibly of size |Q||V|

Page 37

Algorithm for computing graph simulation

Similarity(Q, G)

Input: pattern Q and graph G
Output: for each u in Q, sim(u): the matches w in G

• for all nodes u in Q do
  sim(u) ← the set of candidate matches w in G: nodes with the same label as u; moreover, if u has an outgoing edge, so does w
• while there exist an edge (u, v) in Q and w in sim(u) that violate the simulation condition, i.e., successor(w) ∩ sim(v) = ∅, do
  sim(u) ← sim(u) \ {w}   (refinement)
• output sim(u) for all u in Q

Violation: there exists an edge from u to v in Q, but the candidate w of u has no corresponding edge to a node w’ that matches v

Plus optimization techniques
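The fixpoint algorithm above, as a runnable Python sketch (the data-structure choices and the function name simulation are illustrative):

```python
# Sketch of the fixpoint algorithm for graph simulation.
def simulation(q_nodes, q_edges, g_nodes, g_edges, label_q, label_g):
    """q_edges/g_edges: sets of (u, v) pairs; label_*: node -> label."""
    succ_q = {u: {v for (a, v) in q_edges if a == u} for u in q_nodes}
    succ_g = {w: {x for (a, x) in g_edges if a == w} for w in g_nodes}
    # candidate matches: same label; if u has an out-edge, so must w
    sim = {u: {w for w in g_nodes
               if label_g[w] == label_q[u] and (not succ_q[u] or succ_g[w])}
           for u in q_nodes}
    changed = True
    while changed:                           # refine until a fixpoint
        changed = False
        for (u, v) in q_edges:
            for w in list(sim[u]):
                if not (succ_g[w] & sim[v]):  # w has no successor matching v
                    sim[u].discard(w)
                    changed = True
    return sim
```

As the next slide notes, a two-node pattern cycle matches cycles of unbounded length: running the sketch with a two-node a/b pattern cycle against a four-node a/b cycle keeps all four graph nodes in the relation.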

Page 38

Complication

A cycle with two nodes matches a cycle of unbounded length

Fixpoint computation: revise the match relation until no further changes

Graph simulation does not have data locality


In a parallel setting, data shipment is a must

Page 39

Coordinator

Coordinator: Upon receiving a query Q

post Q to all workers

Initialize a status flag for each worker, mutable by the worker

Again, Boolean formulas

Given a big graph G, and n processors S1, …, Sn:
• G is partitioned into fragments (G1, …, Gn)
• G is distributed to the n processors: Gi is stored at Si

Boolean formulas as partial answers

• For each node v in Gi and each pattern node u in Q, a Boolean variable X(u, v), indicating whether v matches u

• The truth value of X(u, v) can be expressed as a Boolean formula over X(u’, v’), for border nodes v’ in Vf

Page 40

Worker: initial evaluation


Worker: conduct local computation and produce partial answers

upon receiving a query Q,

• evaluate Q( Gi ) in parallel

• send messages to request data for “border nodes”

use local data Gi only

Partial evaluation: using an existing algorithm

Local evaluation

• Invoke an existing algorithm to compute Q(Gi)
• Minor revision: incorporating Boolean variables

Messages:

• For each node to which there is an edge from another fragment Gj, send the truth value of its Boolean variable to Gj

With edges from other fragments

Page 41

Worker: incremental evaluation

Incremental computation, recursion, termination

set its flag true and send partial answer Q(Gi) to the coordinator

Repeat until the truth values of all Boolean variables in Gi are determined

• evaluate Q( Gi + M) in parallel

Messages from other fragments

Recursive computation

Termination:

Coordinator:

Terminate the computation when all flags are true

The union of partial answers from all the workers is the final answer Q(G)

Use an existing incremental algorithm

Partial answer assembling


Page 42

Graph simulation in GRAPE

Input: G = (G1, …, Gn), a pattern query Q
Output: the unique maximum match of Q in G

Parallel query processing with performance guarantees

Performance guarantees

• Response time: O((|VQ| + |Vm|) (|EQ| + |Em|) |VQ| |Vf|)

• the total amount of data shipped is in O( |Vf| |VQ| )

Speed up

where
• Q = (VQ, EQ)
• Gm = (Vm, Em): the largest fragment in G
• Vf: the set of nodes with edges across different fragments

In contrast, centralized graph simulation takes O((|V| + |VQ|) (|E| + |EQ|)) time

Speedup: Vf is small in practice, and |Gm| = |G|/n

with 20 machines, 55 times faster than first collecting data and then using a centralized algorithm


Page 43

GRAPE vs. other parallel models

Implement a GRAPE platform?

Reduce unnecessary computation and data shipment

• Message passing only between fragments, vs all-to-all (MapReduce) and messages between vertices

• Incremental computation: on the entire fragment;

Flexibility: MapReduce and vertex-centric models as special cases

• MapReduce: a single Map (partitioning), multiple Reduce steps by capitalizing on incremental computation

• Vertex-centric: local computation can be implemented this way

Think like a graph, via minor revisions of existing algorithms;

• no need to re-cast algorithms in MapReduce or BSP

• Iterative computations: inherited from existing ones

Page 44

Summing up


Page 45

Summary and review

What is the MapReduce framework? Pros? Pitfalls?

Develop algorithms in MapReduce

What are vertex-centric models for querying graphs? Why do we need them?

What is GRAPE? Why does it need incremental computation? How to terminate computation in GRAPE?

Develop algorithms in vertex-centric models and in GRAPE.

Compare the four parallel models: MapReduce, BSP, vertex-centric, and GRAPE

Page 46

Project (1)

Recall PageRank (Lecture 2)


Implement two algorithms for PageRank, in
• BSP
• GRAPE

Develop optimization strategies

Experimentally evaluate your algorithms, especially their scalability with the size of G

Write a survey on parallel algorithms for PageRank, as part of the related work.

A development project

Page 47

Project (2)

Recall strongly connected components (Lecture 2)


Implement two algorithms for computing strongly connected components, in
• BSP
• GRAPE

Develop optimization strategies

Experimentally evaluate your algorithms, especially their scalability with the size of G

Write a survey on parallel algorithms for computing strongly connected components, as part of the related work.

A development project

Page 48

Project (3)

Recall bounded simulation (Lecture 3)


Implement a parallel algorithm for graph pattern matching via bounded simulation, in GRAPE

Develop optimization strategies

Experimentally evaluate your algorithm, especially its scalability with the size of G

Write a survey on parallel algorithms for graph pattern matching, as part of the related work.

A research and development project

Page 49

Project (4)

Recall graph partitioning: given a directed graph G and a natural number n, we want to partition G into n fragments of roughly even size such that the total number of border nodes in Vf is minimized

Read existing work on graph partitioning

Develop an approximation algorithm for graph partitioning

Implement your algorithm in any parallel programming model of your choice

Develop optimization strategies

Experimentally evaluate your algorithm, especially its scalability with the size of G and the size of |Vf|

Write a survey on graph partitioning algorithms, as part of the related work.

A research and development project

Page 50

Papers for you to review

• Pregel: a system for large-scale graph processing. http://kowshik.github.io/JPregel/pregel_paper.pdf

• Da Yan, J. Cheng, K. Xing, Y. Lu, W. Ng, Y. Bu: Pregel Algorithms for Graph Connectivity Problems with Performance Guarantees. http://www.vldb.org/pvldb/vol7/p1821-yan.pdf

• Distributed GraphLab: A Framework for Machine Learning in the Cloud. http://vldb.org/pvldb/vol5/p716_yuchenglow_vldb2012.pdf

• PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs. http://select.cs.cmu.edu/publications/paperdir/osdi2012-gonzalez-low-gu-bickson-guestrin.pdf

• W. Fan, X. Wang, and Y. Wu. Distributed Graph Simulation: Impossibility and Possibility. VLDB 2014. (parallel scalability)

• W. Fan, X. Wang, and Y. Wu. Performance Guarantees for Distributed Reachability Queries. VLDB 2012. (parallel model)