
CS535 Big Data 4/7/2020 Week 11-A Sangmi Lee Pallickara

http://www.cs.colostate.edu/~cs535 Spring 2020 Colorado State University

CS535 BIG DATA

PART B. GEAR SESSIONS
SESSION 3: BIG GRAPH ANALYSIS

Sangmi Lee Pallickara
Computer Science, Colorado State University
http://www.cs.colostate.edu/~cs535


FAQs
• Online GEAR presentation will be available on 4/6
• You will have 3 days of discussion period on Piazza
  • 4/6 ~ 4/8




Topics of Today's Class
• GraphX: Graph Processing in a Distributed Dataflow Framework
  • Part 1: Introduction and Graph Parallelism
  • Part 2: Distributed Graph Representation
  • Part 3: Implementation of Distributed Graph Processing


GEAR Session 3. Big Graph Analysis
Lecture 2. Distributed Large Graph Analysis-II

GraphX: Graph Processing in a Distributed Dataflow Framework




This material is built based on
• Gonzalez, J.E., Xin, R.S., Dave, A., Crankshaw, D., Franklin, M.J. and Stoica, I., 2014. GraphX: Graph processing in a distributed dataflow framework. In 11th USENIX Symposium on Operating Systems Design and Implementation (OSDI 14) (pp. 599-613).
• Karypis, G. and Kumar, V., 1998. Multilevel k-way partitioning scheme for irregular graphs. J. Parallel Distrib. Comput. 48, 1, 96–129.
• GraphX Programming Guide https://spark.apache.org/docs/latest/graphx-programming-guide.html


Introduction
• GraphX is a library built on top of Apache Spark for graphs and graph-parallel computation
• Introduces a Graph abstraction
  • Directed multigraph with properties attached to each vertex and edge
• Provides a set of graph operators
  • E.g. subgraph, joinVertices, and aggregateMessages
• Provides an optimized variant of the Pregel API
• Implements graph algorithms and builders
  • PageRank
  • Connected Components
  • Triangle Counting


https://spark.apache.org/docs/latest/graphx-programming-guide.html



Computational Challenges
• Specialized graph processing systems outperform general-purpose distributed dataflow frameworks by using their own specialized optimization schemes
  • E.g. Pregel, PowerGraph, BLAS, Kineograph
• Graphs are often only one part of a larger analytics process
  • Combines graphs with unstructured and tabular data
  • Analytics pipelines are forced to compose multiple systems
    • Extra data movement and duplication
    • Fault tolerance
• A graph processing system designed on top of a general-purpose distributed dataflow system is needed




Distributed Dataflow Model and Optimization Schemes for Graph Processing




Dataflow Models - Traditional Network Programming

• Message-passing between nodes (e.g. MPI)

• Very difficult to do at scale
  • How to split the problem across nodes?
    • Network communication & data locality
  • How to deal with failures? (inevitable at scale)
  • Stragglers?
    • Node not failed but slow
  • Writing programs for each machine
• Rarely used in commodity datacenters!


Dataflow Models – Modern distributed dataflow models
• Restrict the programming interface
  • System can do more automatically
• Express jobs as graphs of high-level operators
  • System picks how to split each operator into tasks and where to run each task
  • Run parts multiple times for fault recovery
  • Examples: MapReduce, Spark, Dryad, Storm, Pig, Hive…
• Examples of dataflow operators
  • join, map, groupby, … most of the operators introduced in the Apache Spark discussion




Why did these graph processing systems evolve separately from distributed dataflow frameworks?
• Early emphasis on single-stage computation and on-disk processing
  • Limited capability to handle iterative graph algorithms
    • These repeatedly and randomly access subsets of the graph
  • E.g. MapReduce
• Early distributed dataflow frameworks did not support fine-grained control over data partitioning
• Recent frameworks (e.g. Spark and Naiad) support in-memory representation and fine-grained control over data partitioning


Optimizations used in GraphX
• Encoding graphs as collections
  • Vertex-cut partitioning
• Executing graph algorithms as common dataflow operators
• Join optimizations
  • E.g. CSR indexing, join elimination, and join-site specification
• Materialized view maintenance
  • Vertex mirroring and delta updates
• GraphX applies the above techniques and provides a new set of Spark dataflow operators for graph processing
• Reducing memory overhead and improving system performance
  • Immutability: GraphX reuses indices across graph and collection views over multiple iterations






The Property Graphs as Collections and Executing Graph Algorithms


Property Graph
• User-defined properties attached to each vertex and edge
  • Meta-data
    • E.g. user profiles and time stamps
  • Program state
    • E.g. the PageRank of vertices or inferred affinities
• Applicable to natural phenomena such as social networks and web graphs
  • Often highly skewed
    • Power-law degree distributions
    • Orders of magnitude more edges than vertices




Transforming a Property Graph to a Pair of Collections

• Vertex collection
  • Vertex properties (with a unique key: Vertex Identifier)
  • Vertex identifiers are 64-bit integers
    • Derived externally (e.g. using userID) or by applying a hash function to the vertex property (e.g. URL)
• Edge collection
  • Edge properties (with source and destination vertex identifiers)
• Having a pair of collections enables the system to compute graph algorithms with existing dataflow operations
  • Join: adding additional vertex properties
  • Creating new collections: creating a new graph
    • E.g. maintaining a graph for PageRanks and another graph for membership information while sharing the same edge collection
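
The pair-of-collections idea can be sketched with plain Scala collections. This is a minimal illustration and not the GraphX API: the names `Edge`, `users`, `follows`, `ranks`, and `usersWithRank`, as well as all values, are hypothetical.

```scala
// Hypothetical names and values for illustration; not the GraphX API.
type VertexId = Long // 64-bit vertex identifier

final case class Edge[E](srcId: VertexId, dstId: VertexId, attr: E)

// Vertex collection: vertex properties keyed by vertex identifier.
val users: Map[VertexId, String] = Map((1L, "alice"), (2L, "bob"), (3L, "carol"))

// Edge collection: edge properties with source and destination identifiers.
val follows: Seq[Edge[Double]] = Seq(Edge(1L, 2L, 1.0), Edge(3L, 2L, 0.5))

// A second vertex collection (made-up PageRank values) sharing the same edges.
val ranks: Map[VertexId, Double] = Map((1L, 0.15), (2L, 0.49), (3L, 0.15))

// Join: attach an additional property to each vertex by its key.
val usersWithRank: Map[VertexId, (String, Double)] =
  users.map { case (id, name) => (id, (name, ranks.getOrElse(id, 0.0))) }
```

Here `users` and `ranks` act as two logical graphs over the single `follows` edge collection, and the join adds a property without touching the edges.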


The Graph-Parallel Abstraction (Discussed in W10-A)
• Iterative local transformations
  • E.g. PageRank algorithm
• Vertex program
  • The system launches the vertex program for each vertex, and it interacts with adjacent vertex programs through messages (e.g. Pregel) or shared state (e.g. PowerGraph)
• Example with the PageRank algorithm

def PageRank(v: Id, msgs: List[Double]) {
  // Compute the message sum
  var msgSum = 0
  for (m <- msgs) { msgSum += m }
  // Update the PageRank
  PR(v) = 0.15 + 0.85 * msgSum
  // Broadcast messages with new PR
  for (j <- OutNbrs(v)) {
    msg = PR(v) / NumLinks(v)
    send_msg(to=j, msg)
  }
  // Check for termination
  if (converged(PR(v))) voteToHalt(v)
}

PageRank in Pregel




The Graph-Parallel Abstraction (Discussed in W10-A)
• Advantage
  • Well suited for iterative graph algorithms that respect the static neighborhood structure of the graph
• Disadvantage
  • It cannot express computation where disconnected vertices interact
  • It cannot express computation that changes the graph structure during the course of the computation


The GAS Decomposition

• Gonzalez et al. [1] observed that most vertex programs interact with neighboring vertices by collecting messages in the form of a generalized commutative associative sum and then broadcasting new messages in an inherently parallel loop

[1] Gonzalez, J.E., Low, Y., Gu, H., Bickson, D. and Guestrin, C., 2012. PowerGraph: Distributed graph-parallel computation on natural graphs. In OSDI '12, USENIX Association, pp. 17–30.




Types of graph computation [1/3]
• Gather: Your computation gathers information from neighboring vertices
  • E.g. authority value of the HITS algorithm
  • E.g. current PageRank value


Types of graph computation [2/3]
• Apply: The vertex applies an update to the vertex property
  • E.g. update the authority value with the sum of the new authority values after normalizing the value
  • E.g. add the passed PageRank values, normalize the sum, and update the current PageRank value




Types of graph computation [3/3]
• Scatter: The vertex sends out information to neighboring vertices


The GAS Decomposition
• The GAS decomposition
  • Splits vertex programs into three data-parallel stages
    • Gather
    • Apply
    • Scatter

def PageRank(v: Id, msgs: List[Double]) {
  // Gather: compute the message sum
  var msgSum = 0
  for (m <- msgs) { msgSum += m }
  // Apply: update the PageRank
  PR(v) = 0.15 + 0.85 * msgSum
  // Scatter: broadcast messages with new PR
  for (j <- OutNbrs(v)) {
    msg = PR(v) / NumLinks(v)
    send_msg(to=j, msg)
  }
  // Check for termination
  if (converged(PR(v))) voteToHalt(v)
}
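
As a rough sketch of the split, the PageRank vertex program can be written as three small functions, one per stage. The function names `gather`, `applyRank`, and `scatter` and the sample inputs are assumptions made for illustration; they are not PowerGraph or GraphX calls.

```scala
// Gather: a commutative, associative sum over incoming messages.
def gather(a: Double, b: Double): Double = a + b

// Apply: update the PageRank of the vertex from the gathered sum.
def applyRank(msgSum: Double): Double = 0.15 + 0.85 * msgSum

// Scatter: the value sent along each out-edge of the vertex.
def scatter(rank: Double, numLinks: Int): Double = rank / numLinks

// Hypothetical inputs: three incoming messages and four out-links.
val incoming = List(0.2, 0.3, 0.5)
val newRank = applyRank(incoming.reduce(gather))
val outgoing = scatter(newRank, 4)
```

Because `gather` is commutative and associative, the system is free to combine messages in any order and on any machine before `applyRank` runs.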



The GAS Decomposition
• Pull-based model of message computation
  • The system asks the vertex program for the value of the message between adjacent vertices
    • Rather than the user sending messages directly from the vertex program
  • Therefore, vertex-cut is suitable for this style of computation
• Limited communication pattern
  • Supports only communication between adjacent vertices


Graph Computation as Dataflow Ops.
• The graph-parallel computation can be expressed as a sequence of join stages, group-by stages, and map operations
• Join stage
  • Vertex and edge properties are joined to form the triplets view
    • Consists of each edge and its corresponding source and destination vertex properties
• Group-by stage
  • The triplets are grouped by source or destination vertex to construct the neighborhood of each vertex and compute aggregates
    • Gathers messages destined to the same vertex
• Map operation
  • Applies the aggregated messages for a given vertex to update its vertex property
• Join operation
  • Distributes the updated values to the vertices
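
One PageRank iteration written as join, group-by, and map stages over plain Scala collections might look like the sketch below. The three-vertex graph and the names `vertices`, `edges`, `joined`, `msgSums`, and `updated` are hypothetical, and a real dataflow engine would run each stage in parallel across partitions.

```scala
type VertexId = Long

// Hypothetical graph: vertex id -> (rank, out-degree), and (src, dst) edges.
val vertices: Map[VertexId, (Double, Int)] =
  Map((1L, (1.0, 2)), (2L, (1.0, 1)), (3L, (1.0, 1)))
val edges: Seq[(VertexId, VertexId)] = Seq((1L, 2L), (1L, 3L), (2L, 3L), (3L, 1L))

// Join stage: attach the source vertex property to each edge.
val joined: Seq[(VertexId, (Double, Int))] =
  edges.map { case (src, dst) => (dst, vertices(src)) }

// Group-by stage: sum the messages destined to the same vertex.
val msgSums: Map[VertexId, Double] =
  joined.groupBy(_._1).map { case (dst, ms) =>
    (dst, ms.map { case (_, (rank, deg)) => rank / deg }.sum)
  }

// Map operation: update each vertex property from its aggregated messages.
val updated: Map[VertexId, (Double, Int)] =
  vertices.map { case (id, (_, deg)) =>
    (id, (0.15 + 0.85 * msgSums.getOrElse(id, 0.0), deg))
  }
```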




Discussions
• Assume that you implement the PageRank algorithm using three stages in GraphX. Which stage will be applied for line 5?
  a. Join stage
  b. GroupBy stage
  c. map operations
  d. All of the above

 0: def PageRank(v: Id, msgs: List[Double]) {
 1:   // Compute the message sum
 2:   var msgSum = 0
 3:   for (m <- msgs) { msgSum += m }
 4:   // Update the PageRank
 5:   PR(v) = 0.15 + 0.85 * msgSum
 6:   // Broadcast messages with new PR
 7:   for (j <- OutNbrs(v)) {
 8:     msg = PR(v) / NumLinks(v)
 9:     send_msg(to=j, msg)
10:   }
11:   // Check for termination
12:   if (converged(PR(v))) voteToHalt(v)
13: }


Discussions
• Assume that you implement the PageRank algorithm using three stages in GraphX. Which stage will be applied for line 3?
  a. Join stage
  b. GroupBy stage
  c. map operations
  d. All of the above

 0: def PageRank(v: Id, msgs: List[Double]) {
 1:   // Compute the message sum
 2:   var msgSum = 0
 3:   for (m <- msgs) { msgSum += m }
 4:   // Update the PageRank
 5:   PR(v) = 0.15 + 0.85 * msgSum
 6:   // Broadcast messages with new PR
 7:   for (j <- OutNbrs(v)) {
 8:     msg = PR(v) / NumLinks(v)
 9:     send_msg(to=j, msg)
10:   }
11:   // Check for termination
12:   if (converged(PR(v))) voteToHalt(v)
13: }




The GAS Decomposition with GraphX
• Gather
  • GroupBy stage
• Apply
  • Map operation
• Scatter
  • Join stage


Triplets view
• Each edge and its corresponding source and destination vertex properties

[Figure: the Vertices and Edges collections are joined to form the Triplets view]

CREATE VIEW triplets AS
SELECT s.Id, d.Id, s.P, e.P, d.P
FROM edges AS e
JOIN vertices AS s JOIN vertices AS d
ON e.srcId = s.Id AND e.dstId = d.Id

Constructing Triplets in SQL
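
The same three-way join can be sketched over in-memory Scala collections. The `Triplet` case class, the `triplets` function, and the sample data are hypothetical stand-ins rather than GraphX definitions. As with the SQL inner join, edges that point to a missing vertex are dropped.

```scala
type VertexId = Long

final case class Triplet[V, E](
    srcId: VertexId, dstId: VertexId, srcAttr: V, attr: E, dstAttr: V)

// Join every edge with its source and destination vertex properties.
def triplets[V, E](
    vertices: Map[VertexId, V],
    edges: Seq[(VertexId, VertexId, E)]): Seq[Triplet[V, E]] =
  for {
    (src, dst, e) <- edges
    s <- vertices.get(src).toSeq // inner join on e.srcId = s.Id
    d <- vertices.get(dst).toSeq // inner join on e.dstId = d.Id
  } yield Triplet(src, dst, s, e, d)

// Hypothetical data: the second edge points to a vertex that does not exist.
val vertexProps: Map[VertexId, String] = Map((1L, "A"), (2L, "B"))
val edgeProps: Seq[(VertexId, VertexId, String)] =
  Seq((1L, 2L, "follows"), (2L, 3L, "dangling"))
val view: Seq[Triplet[String, String]] = triplets(vertexProps, edgeProps)
```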




GraphX Graph Operators
• Transform vertex and edge collections
• Graph constructor
  • Logically binds a pair of vertex and edge property collections into a property graph
  • Verifies integrity constraints: every vertex occurs only once and edges do not link to missing vertices
  • def Graph(v: Collection[(Id, V)], e: Collection[(Id, Id, E)])
• Collection views
  • The vertices and edges operators expose the graph's vertex and edge property collections
  • The triplets operator returns the triplets view of the graph
  • def vertices: Collection[(Id, V)]
  • def edges: Collection[(Id, Id, E)]
  • def triplets: Collection[Triplet]


GraphX Graph Operators
• Graph-parallel computation
  • The MapReduce Triplets (mrTriplets) operator encodes the two-stage process of graph-parallel computation
    • Composes the map and group-by dataflow operators on the triplets view
    • The user-defined map function is applied to each triplet
    • Generates values and aggregates them at the destination vertex using a user-defined binary aggregation function
  • def mrTriplets(f: (Triplet) => M, sum: (M, M) => M): Collection[(Id, M)]
  • In SQL

SELECT t.dstId, reduce(mapF(t)) AS msgSum
FROM triplets AS t GROUP BY t.dstId




GraphX Graph Operators

• Convenience functions
  • def mapV(f: (Id, V) => V): Graph[V, E]
  • def mapE(f: (Id, Id, E) => E): Graph[V, E]
  • def leftJoinV(v: Collection[(Id, V)], f: (Id, V, V) => V): Graph[V, E]
  • def leftJoinE(e: Collection[(Id, Id, E)], f: (Id, Id, E, E) => E): Graph[V, E]
  • def subgraph(vPred: (Id, V) => Boolean, ePred: (Triplet) => Boolean): Graph[V, E]
  • def reverse: Graph[V, E]


Example use of mrTriplets
Compute the number of older followers for each user in a social network

[Figure: a social graph with vertices A to F whose vertex properties (ages) are 42, 23, 30, 75, 19, and 16. For the edge from A to B, the source property is 42 and the target property is 23, so mapF(A → B) = 1 is sent as a message to vertex B.]

Resulting vertices
Vertex id   Property
A           0
B           2
C           ?
D           ?
E           ?
F           ?

val graph: Graph[User, Double]
def mapUDF(t: Triplet[User, Double]) = ??? // What will be your computation here?
def reduceUDF(a: Int, b: Int): Int = a + b
val seniors: Collection[(Id, Int)] = graph.mrTriplets(mapUDF, reduceUDF)




Example use of mrTriplets
Compute the number of older followers for each user in a social network

[Figure: a social graph with vertices A to F whose vertex properties (ages) are 42, 23, 30, 75, 19, and 16. For the edge from A to B, the source property is 42 and the target property is 23, so mapF(A → B) = 1 is sent as a message to vertex B.]

Resulting vertices
Vertex id   Property
A           0
B           2
C           1
D           1
E           0
F           3

val graph: Graph[User, Double]
def mapUDF(t: Triplet[User, Double]) = if (t.src.age > t.dst.age) 1 else 0
def reduceUDF(a: Int, b: Int): Int = a + b
val seniors: Collection[(Id, Int)] = graph.mrTriplets(mapUDF, reduceUDF)
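
To make the example runnable without Spark, the sketch below implements a simplified `mrTriplets` over an in-memory sequence of triplets. The `User` and `Triplet` case classes, the three-vertex graph, and its edges are hypothetical (the edges of the slide's figure are not reproduced here). Only the ages of A and B come from the slide.

```scala
type VertexId = String

final case class User(age: Int)
final case class Triplet(srcId: VertexId, dstId: VertexId, src: User, dst: User)

// Simplified mrTriplets: map each triplet, group by destination, then reduce.
def mrTriplets[M](
    ts: Seq[Triplet], mapF: Triplet => M, sum: (M, M) => M): Map[VertexId, M] =
  ts.groupBy(_.dstId).map { case (dst, group) => (dst, group.map(mapF).reduce(sum)) }

// Hypothetical graph: an edge (s, d) means that s follows d.
val users: Map[VertexId, User] = Map(("A", User(42)), ("B", User(23)), ("C", User(30)))
val followEdges: Seq[(VertexId, VertexId)] = Seq(("A", "B"), ("C", "B"), ("B", "C"))
val ts: Seq[Triplet] =
  followEdges.map { case (s, d) => Triplet(s, d, users(s), users(d)) }

def mapUDF(t: Triplet): Int = if (t.src.age > t.dst.age) 1 else 0
def reduceUDF(a: Int, b: Int): Int = a + b

val seniors: Map[VertexId, Int] = mrTriplets[Int](ts, mapUDF, reduceUDF)
```

In this simplified version a vertex with no incoming edge, such as A, has no entry in the result, because no message is ever destined to it.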


Implementation of the Pregel abstraction using GraphX
• Initializes the vertex properties with an additional field to track active vertices
• While vertices are active, messages are computed using the mrTriplets operator
  • Edge-parallel map operation
    • Message computation
  • Commutative associative aggregation

def Pregel(g: Graph[V, E],
           vprog: (Id, V, M) => V,
           sendMsg: (Triplet) => M,
           gather: (M, M) => M): Collection[V] = {
  // Set all vertices as active
  g = g.mapV((id, v) => (v, halt=false))
  // Loop until convergence
  while (g.vertices.exists(v => !v.halt)) {
    // Compute the messages
    val msgs: Collection[(Id, M)] =
      // Restrict to edges with active source
      g.subgraph(ePred=(s,d,sP,eP,dP)=>!sP.halt)
      // Compute messages
      .mrTriplets(sendMsg, gather)
    // Receive messages and run vertex program
    g = g.leftJoinV(msgs).mapV(vprog)
  }
  return g.vertices
}
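
A runnable approximation of this loop over plain Scala collections is sketched below. It is a simplification under stated assumptions: instead of a `halt` field, a vertex counts as active when its value changed in the previous superstep, `sendMsg` may return no message, and a `maxIters` guard is added. The `pregel` function and the maximum-value example are hypothetical and are not the GraphX Pregel API.

```scala
type VertexId = Long

def pregel[V, M](
    vertices: Map[VertexId, V],
    edges: Seq[(VertexId, VertexId)],
    maxIters: Int
)(vprog: (VertexId, V, M) => V,
  sendMsg: (V, V) => Option[M],
  gather: (M, M) => M): Map[VertexId, V] = {
  var state = vertices
  var active: Set[VertexId] = vertices.keySet // all vertices start active
  var iter = 0
  while (active.nonEmpty && iter < maxIters) {
    // Restrict to edges with an active source, then map and aggregate messages.
    val msgs: Map[VertexId, M] = edges
      .filter { case (src, _) => active(src) }
      .flatMap { case (src, dst) => sendMsg(state(src), state(dst)).map(m => (dst, m)) }
      .groupBy(_._1)
      .map { case (dst, ms) => (dst, ms.map(_._2).reduce(gather)) }
    // Left join the messages and run the vertex program where one arrived.
    val next = state.map { case (id, v) =>
      (id, msgs.get(id).fold(v)(m => vprog(id, v, m)))
    }
    active = next.keySet.filter(id => next(id) != state(id))
    state = next
    iter += 1
  }
  state
}

// Hypothetical use: propagate the maximum value around a three-vertex cycle.
val maxValues: Map[VertexId, Int] =
  pregel[Int, Int](Map((1L, 3), (2L, 6), (3L, 2)), Seq((1L, 2L), (2L, 3L), (3L, 1L)), 10)(
    (_, v, m) => math.max(v, m),
    (src, dst) => if (src > dst) Some(src) else None,
    (a, b) => math.max(a, b))
```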






Distributed Representation of a Graph


Distributed Graph Representation
• GraphX represents graphs internally as a pair of vertex and edge collections built on the Spark RDD abstraction
  • Indexing and graph-specific partitioning as a layer on top of RDDs

[Figure: distributed representation of a graph with vertices 1 to 6]
Edges
  Edge partition A: (1, 2), (1, 3)
  Edge partition B: (4, 1), (4, 5)
  Edge partition C: (1, 5), (1, 6), (5, 6)
Vertices
  Vertex partition A: vertices 1, 2, 3 with bitmask 1, 1, 1
  Vertex partition B: vertices 4, 5, 6 with bitmask 1, 1, 0
Routing Table
  Partition A: A → 1, 2, 3; B → 1; C → 1
  Partition B: A → (none); B → 4, 5; C → 5, 6




Vertices and Edges
• The vertex collection is hash-partitioned by vertex id
  • Vertices are stored in a local hash index within each partition
• A bitmask stores the visibility of each vertex
  • Soft deletions to promote index reuse
  • If vertex 6 and its adjacent edges are restricted from the graph, they are removed from the corresponding collection by updating the bitmasks
    • Your computation can reuse this index
• Edges are divided into three edge partitions by applying a partition function
  • E.g. 2D partitioning
• Vertices are partitioned by vertex id

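
The soft-deletion idea can be sketched as a vertex partition that keeps its ids, properties, and hash index untouched and only clears bits in a visibility mask. The `VertexPartition` class below is a hypothetical illustration, not the GraphX internal class of the same name.

```scala
import scala.collection.immutable.BitSet

// A vertex partition: ids and properties in index order, a local hash index
// from id to position, and a bitmask holding the visibility of each position.
final class VertexPartition[V] private (
    val ids: Vector[Long],
    val attrs: Vector[V],
    val index: Map[Long, Int],
    val mask: BitSet) {

  // Soft delete: only the mask changes; ids, attrs, and index are reused.
  def filter(pred: (Long, V) => Boolean): VertexPartition[V] =
    new VertexPartition(ids, attrs, index, mask.filter(i => pred(ids(i), attrs(i))))

  // Look up a vertex through the index, honoring the visibility mask.
  def get(id: Long): Option[V] = index.get(id).filter(mask.contains).map(attrs)

  def visibleIds: Vector[Long] =
    ids.zipWithIndex.collect { case (id, i) if mask(i) => id }
}

object VertexPartition {
  def apply[V](entries: Seq[(Long, V)]): VertexPartition[V] = {
    val ids = entries.map(_._1).toVector
    new VertexPartition(
      ids, entries.map(_._2).toVector, ids.zipWithIndex.toMap, BitSet.empty ++ ids.indices)
  }
}

// Hypothetical partition holding vertices 4, 5, and 6; then hide vertex 6.
val partB = VertexPartition(Seq((4L, "v4"), (5L, "v5"), (6L, "v6")))
val restricted = partB.filter((id, _) => id != 6L)
```

After the filter, `restricted` shares the very same `index` object as `partB`, which is what allows later joins to reuse it.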


Routing table
• Encodes the edge partitions for each vertex
• Join-site information is stored in the routing table

Routing Table
Edge partition   Vertex partition A   Vertex partition B
A                1, 2, 3              (none)
B                1                    4, 5
C                1                    5, 6
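
The routing table can be derived from the edge partitions, as the sketch below shows for the six-vertex example graph. The `edgePartitions` map and the `routingTable` function are hypothetical names, and the edge lists are read off the example figure.

```scala
type VertexId = Long

// Edge partitions of the example graph, as (source, destination) pairs.
val edgePartitions: Map[String, Seq[(VertexId, VertexId)]] = Map(
  ("A", Seq((1L, 2L), (1L, 3L))),
  ("B", Seq((4L, 1L), (4L, 5L))),
  ("C", Seq((1L, 5L), (1L, 6L), (5L, 6L))))

// For one vertex partition, list which of its vertices each edge partition references.
def routingTable(vertexPartition: Set[VertexId]): Map[String, Set[VertexId]] =
  edgePartitions.map { case (name, es) =>
    (name, es.flatMap { case (s, d) => Seq(s, d) }.toSet.intersect(vertexPartition))
  }

val routingA = routingTable(Set(1L, 2L, 3L))
val routingB = routingTable(Set(4L, 5L, 6L))
```

Each entry tells a vertex partition where to ship a vertex property when the triplets view is built, so vertex 1 is sent to all three edge partitions while vertex 2 is sent only to edge partition A.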




Graph Partitioning: EdgePartition2D
• Inspired by the multilevel k-way partitioning [1]
• 2D graph partitioning
  • Upper bound of $2\sqrt{n} - 1$ on the vertex replication factor, where n is the number of partitions

[1] Karypis, G. and Kumar, V., 1998. Multilevel k-way partitioning scheme for irregular graphs. J. Parallel Distrib. Comput. 48, 1, 96–129.


Graph Partitioning: EdgePartition2D
• Consider a graph G = (V, E), where V is the set of vertices and E is the set of edges
  • Every vertex in V has a vertex identifier and a vertex property
  • Every edge in E has source and destination vertex identifiers and an edge property
• Goal
  • Create n partitions of G such that:
    • The partitions incur minimum communication
    • The workload is balanced




Step 1: Creating a partition table

If n is a perfect square
  rows (# of rows) = $\sqrt{n}$
  cols (# of columns) = $\sqrt{n}$
If n is not a perfect square
  cols = $\lceil \sqrt{n} \rceil$
  rows = $\lfloor (n + \mathit{cols} - 1) / \mathit{cols} \rfloor$

For example, if n = 27, cols = 6 and rows = 5
The first 5 columns hold 5 × 5 = 25 partitions, so the last column would have 27 − 25 = 2 rows
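
The table dimensions can be computed with a few lines of integer arithmetic. The function name `tableDims` and the returned triple are assumptions made for this sketch. Integer division is used to take the floor.

```scala
// Returns (rows, cols, rows in the last column) for n partitions.
def tableDims(n: Int): (Int, Int, Int) = {
  val cols = math.ceil(math.sqrt(n.toDouble)).toInt
  if (cols * cols == n) (cols, cols, cols) // perfect square: a full square table
  else {
    val rows = (n + cols - 1) / cols        // integer division takes the floor
    val lastColRows = n - rows * (cols - 1) // whatever is left for the last column
    (rows, cols, lastColRows)
  }
}

val dims25 = tableDims(25)
val dims27 = tableDims(27)
```

For n = 27 this gives 5 rows and 6 columns, matching the example above.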



Step 2: Assigning vertices and edges

Vertex assignment
  Using an elementary modular hash, v % n
  Vertices are equally distributed among the partitions

Edge assignment
  The source vertex (src) is mapped onto the columns
    col = (src × mixingPrime) % $\sqrt{n}$, if n is a perfect square
    col = $\lfloor ((src \times mixingPrime) \bmod n) / rows \rfloor$, otherwise
    where mixingPrime is a large prime number used to improve the balance of the edge distribution
  The destination vertex (des) is mapped onto the rows
    row = (des × mixingPrime) % $\sqrt{n}$, if n is a perfect square
    row = (des × mixingPrime) % rows, if n is not a perfect square and col < cols − 1
    row = (des × mixingPrime) % lastColRows, otherwise
    where lastColRows is the number of rows in the last column




Step 3: Storing edge properties

Storing Edge Properties
  An edge is stored in the partition numbered
    col × $\sqrt{n}$ + row, if n is a perfect square
    col × rows + row, otherwise



Discussions
• Let's locate a set of edges using EdgePartition2D
  • {(s, d1), (s, d2), (s, d3), (s, d4), (s, d5)} (sharing the same source vertex)
• Where will they be located?
  a. a single cell
  b. a single row
  c. a single column
  d. randomly dispersed






Understanding the effect of EdgePartition2D
• Let's locate an edge (vsrc, vdes)
• All the edges where vsrc is the source vertex
  • Would be placed in the same column, col
  • Example:
    • If vsrc = 9 and mixingPrime = 3 for the 25 (= n) partitions
    • col = (9 × 3) % 5 = 2
• The actual cell will be determined by the destination vertex
  • If vdes = 2 and mixingPrime = 3
  • row = (2 × 3) % 5 = 1
• Therefore, the edge (vsrc, vdes) is stored in partition 2 × 5 + 1 = 11 (the partition in the 2nd row and the 3rd column)
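
The worked example can be checked with a short sketch of the 2D assignment. The function `edgePartition2D` below is written for illustration and takes `mixingPrime` as an explicit parameter so that the small value 3 from the example can be used. The branch for a non-square n follows one reading of the scheme (column from the hash modulo n divided by rows, with a shorter last column) and should be treated as an assumption.

```scala
// Maps an edge (src, dst) to a partition number in [0, n).
def edgePartition2D(src: Long, dst: Long, n: Int, mixingPrime: Long): Int = {
  val ceilSqrt = math.ceil(math.sqrt(n.toDouble)).toInt
  if (ceilSqrt * ceilSqrt == n) {
    // Perfect square: a sqrt(n) by sqrt(n) table.
    val col = (math.abs(src * mixingPrime) % ceilSqrt).toInt
    val row = (math.abs(dst * mixingPrime) % ceilSqrt).toInt
    col * ceilSqrt + row
  } else {
    // Not a perfect square: the last column has fewer rows.
    val cols = ceilSqrt
    val rows = (n + cols - 1) / cols
    val lastColRows = n - rows * (cols - 1)
    val col = (math.abs(src * mixingPrime) % n / rows).toInt
    val row =
      (math.abs(dst * mixingPrime) % (if (col < cols - 1) rows else lastColRows)).toInt
    col * rows + row
  }
}

// The example from the slide: vsrc = 9, vdes = 2, n = 25, mixingPrime = 3.
val examplePartition = edgePartition2D(9L, 2L, 25, 3L)

// Edges that share the source vertex 9 all fall into the same column.
val columnsForSource9 = (1L to 5L).map(d => edgePartition2D(9L, d, 25, 3L) / 5).toSet
```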

[Figure: 5 × 5 partition grid with columns numbered 0 to 4]




Understanding the effect of EdgePartition2D
• A vertex with the vertex id v can be in any of the cells in column (v × mixingPrime) % $\sqrt{n}$
  • If it is a source vertex
• Similarly, a vertex with the vertex id v can be in any of the cells in row (v × mixingPrime) % $\sqrt{n}$
  • If it is a destination vertex
• Can a vertex v be in any cell outside the aforementioned set of cells?
  • No!


Understanding the effect of EdgePartition2D
• Therefore, any edge containing v has to be placed in one of $\sqrt{n} + \sqrt{n} - 1 = 2\sqrt{n} - 1$ partitions
  • The column and the row share one cell, which is counted only once
• The upper bound on the vertex replication factor is $2\sqrt{n} - 1$
  • This is directly related to the communication cost to synchronize the status of the vertex properties
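
The bound can be checked by enumerating the cells that may hold a given vertex when n is a perfect square. The helper `cellsFor` and its parameters are hypothetical, with `side` standing for $\sqrt{n}$ and cells numbered as col × side + row.

```scala
// All cells that can hold vertex v in a side-by-side partition table.
def cellsFor(v: Long, side: Int, mixingPrime: Long): Set[Int] = {
  val k = (math.abs(v * mixingPrime) % side).toInt
  val asSource = (0 until side).map(row => k * side + row)      // every cell of column k
  val asDestination = (0 until side).map(col => col * side + k) // every cell of row k
  (asSource ++ asDestination).toSet
}

// Vertex 9 with mixingPrime = 3 in a 5 x 5 table (n = 25).
val cellsForVertex9 = cellsFor(9L, 5, 3L)
```

The column and the row overlap in exactly one cell, so the set has 5 + 5 − 1 = 9 elements for n = 25.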


Naman Shah, Matthew Malensek, Harshil Shah, Shrideep Pallickara, and Sangmi Lee Pallickara, "Scalable Network Analytics for Characterization of Outbreak Influence in Voluminous Epidemiology Datasets," Concurrency and Computation: Practice & Experience. John Wiley. 2018

Naman Shah, Harshil Shah, Matthew Malensek, Sangmi Lee Pallickara, and Shrideep Pallickara, "Network Analysis for Identifying and Characterizing Disease Outbreak Influence from Voluminous Epidemiology Data," Proceedings of the IEEE International Conference on Big Data (IEEE BigData). Washington D.C., USA. 2016

