
Slides: mmds-data.org/presentations/2016/s-shun.pdf

A Framework for Processing Large Graphs in Shared Memory

Julian Shun, UC Berkeley (Miller Institute, AMPLab)
Based on joint work with Guy Blelloch, Laxman Dhulipala, Kimon Fountoulakis, Michael Mahoney, and Farbod Roosta-Khorasani


Graphs are everywhere!

• Can contain up to billions of vertices and edges
• Need a simple, efficient, and scalable way to analyze them

Ligra Graph Processing Framework

EdgeMap, VertexMap

• Breadth-first search
• Betweenness centrality
• Connected components
• Triangle counting
• K-core decomposition
• Maximal independent set
• Single-source shortest paths
• Eccentricity estimation
• (Personalized) PageRank
• Local graph clustering
• Biconnected components
• Collaborative filtering
• …

Simplicity, Performance, Scalability


Graph Processing Systems

• Existing: Pregel/Giraph/GPS, GraphLab, PRISM, Pegasus, Knowledge Discovery Toolbox, GraphChi, GraphX, and many others…

• Our system: Ligra, a lightweight graph processing system for shared memory

Takes advantage of “frontier-based” nature of many algorithms (active set is dynamic and often small)

Breadth-first Search (BFS)

• Compute a BFS tree rooted at source r containing all vertices reachable from r
• Can process each level in parallel
• Challenges: race conditions, load balancing

Applications: betweenness centrality, eccentricity estimation, maximum flow, web crawlers, network broadcasting, cycle detection, …

Steps for Graph Traversal

• Operate on a subset of vertices
• Map computation over subset of edges in parallel
• Return new subset of vertices
• Map computation over subset of vertices in parallel

Ligra abstractions: Graph, VertexSubset, EdgeMap, VertexMap

Think with flat data-parallel operators

We built the Ligra abstraction for these kinds of computations

Optimizations:
• Hybrid traversal
• Graph compression

Many graph traversal algorithms do this!

Shared memory parallelism

Breadth-first Search in Ligra

parents = {-1, …, -1}; //-1 indicates “unexplored”

procedure UPDATE(s, d):
    return compare_and_swap(parents[d], -1, s);

procedure COND(v):
    return parents[v] == -1; //checks if “unexplored”

procedure BFS(G, r):
    parents[r] = r;
    frontier = {r}; //VertexSubset
    while (size(frontier) > 0):
        frontier = EDGEMAP(G, frontier, UPDATE, COND);
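The BFS pseudocode above can be tried out with a minimal sequential sketch in Python. This is an illustration under stated assumptions, not Ligra's C++ code: `edge_map_sparse` and `compare_and_swap` are hypothetical stand-ins, the graph is a plain list of out-neighbor lists, and nothing runs in parallel, so the compare-and-swap is trivially safe.

```python
# Sequential sketch of the BFS pseudocode; helper names are illustrative, not Ligra's API.

def compare_and_swap(array, index, expected, new):
    """Stand-in for an atomic CAS (safe here only because the sketch is sequential)."""
    if array[index] == expected:
        array[index] = new
        return True
    return False

def edge_map_sparse(graph, frontier, update, cond):
    """Apply update(s, d) over out-edges of frontier vertices; return the new frontier."""
    next_frontier = []
    for s in frontier:
        for d in graph[s]:
            if cond(d) and update(s, d):
                next_frontier.append(d)
    return next_frontier

def bfs(graph, root):
    parents = [-1] * len(graph)          # -1 indicates "unexplored"

    def update(s, d):
        return compare_and_swap(parents, d, -1, s)

    def cond(v):
        return parents[v] == -1          # checks if "unexplored"

    parents[root] = root
    frontier = [root]                    # plays the role of a VertexSubset
    while len(frontier) > 0:
        frontier = edge_map_sparse(graph, frontier, update, cond)
    return parents

# graph[v] lists the out-neighbors of v; vertex 5 is not reachable from 0.
example_graph = [[1, 2], [3], [3], [4], [], [0]]
print(bfs(example_graph, 0))
```

Because `update` succeeds at most once per destination, each vertex enters the next frontier at most once, which is the property the compare-and-swap provides in the parallel setting.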

Actual BFS code in Ligra

Sparse or Dense EdgeMap?

[Figure: example graph on vertices 1 to 15 with the current frontier marked]

• Dense method better when frontier is large and many vertices have been visited
• Sparse (traditional) method better for small frontiers
• Switch between the two methods based on frontier size [Beamer et al. SC ’12]

Limited to BFS?

EdgeMap

procedure EDGEMAP(G, frontier, Update, Cond):
    if (size(frontier) + sum of out-degrees > threshold) then:
        return EDGEMAP_DENSE(G, frontier, Update, Cond);
    else:
        return EDGEMAP_SPARSE(G, frontier, Update, Cond);

• EDGEMAP_SPARSE: loop through outgoing edges of frontier vertices in parallel
• EDGEMAP_DENSE: loop through incoming edges of “unexplored” vertices (in parallel), breaking early if possible

• More general than just BFS!
• Generalized to many other problems
• Users need not worry about this
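The switch between the two traversal directions can be sketched as follows. This is a sequential Python illustration only: the function names, the explicit in-neighbor lists, and passing `threshold` as a parameter are assumptions made for the example, and Ligra's real implementation is parallel C++.

```python
# Sketch of the sparse/dense switch in EdgeMap; all names are illustrative.

def edge_map_sparse(out_nbrs, frontier, update, cond):
    """Push: loop over out-edges of frontier vertices."""
    result = set()
    for s in frontier:
        for d in out_nbrs[s]:
            if cond(d) and update(s, d):
                result.add(d)
    return result

def edge_map_dense(in_nbrs, frontier, update, cond):
    """Pull: loop over in-edges of vertices that still satisfy cond, breaking early."""
    result = set()
    for d in range(len(in_nbrs)):
        for s in in_nbrs[d]:
            if not cond(d):
                break                    # d needs no more updates; skip its remaining in-edges
            if s in frontier and update(s, d):
                result.add(d)
    return result

def edge_map(out_nbrs, in_nbrs, frontier, update, cond, threshold):
    work = len(frontier) + sum(len(out_nbrs[v]) for v in frontier)
    if work > threshold:
        return edge_map_dense(in_nbrs, frontier, update, cond)
    return edge_map_sparse(out_nbrs, frontier, update, cond)

def bfs(out_nbrs, root, threshold):
    n = len(out_nbrs)
    in_nbrs = [[] for _ in range(n)]     # transposed graph, needed by the dense method
    for s in range(n):
        for d in out_nbrs[s]:
            in_nbrs[d].append(s)
    parents = [-1] * n

    def update(s, d):
        if parents[d] == -1:             # sequential stand-in for compare_and_swap
            parents[d] = s
            return True
        return False

    def cond(v):
        return parents[v] == -1

    parents[root] = root
    frontier = {root}
    while frontier:
        frontier = edge_map(out_nbrs, in_nbrs, frontier, update, cond, threshold)
    return parents

g = [[1, 2], [3], [], [4], [], [0]]
print(bfs(g, 0, threshold=0))            # every round takes the dense path
print(bfs(g, 0, threshold=10**9))        # every round takes the sparse path
```

On this small graph both settings produce the same parent array. In general the two directions may choose different parents for a vertex, but they always discover the same vertices at the same BFS level, which is why the caller does not need to know which one ran.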

Frontier-based approach enables sparse/dense traversal

[Figure: 40-core running time (seconds) for BFS, Betweenness Centrality, Connected Components, and Eccentricity Estimation on the Twitter graph (41M vertices, 1.5B edges), comparing Dense, Sparse, and Sparse/Dense EdgeMap (using default threshold of |E|/20); bar labels 30.7 and 20.7 are shown]

PageRank

VertexMap

bool f(v){
    data[v] = data[v] + 1;
    return (data[v] == 1);
}

[Figure: VertexMap applied with f to a VertexSubset containing vertices 0, 4, 6, 8 of a nine-vertex example graph; f returns T, F, F, T, and the result is a new VertexSubset]
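A minimal sketch of what VertexMap does with the function f above, written in sequential Python. The name `vertex_map` and the contents of `data` are hypothetical and chosen only for illustration; they are not taken from the figure.

```python
# Sketch of VertexMap semantics; vertex_map and the data values are illustrative.

def vertex_map(subset, f):
    """Apply f to every vertex in subset; return the vertices for which f returned True."""
    return [v for v in subset if f(v)]

data = [0, 5, 5, 5, 2, 5, 7, 5, 0]       # hypothetical per-vertex values

def f(v):
    data[v] = data[v] + 1
    return data[v] == 1

out = vertex_map([0, 4, 6, 8], f)        # keeps the vertices whose value became exactly 1
print(out)
```

The function both mutates per-vertex state and decides membership in the output subset, which is the pattern the PageRank examples rely on.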

PageRank in Ligra

p_curr = {1/|V|, …, 1/|V|}; p_next = {0, …, 0}; diff = {}; error = ∞;

procedure UPDATE(s, d):
    atomic_increment(p_next[d], p_curr[s] / degree(s));
    return 1;

procedure COMPUTE(i):
    p_next[i] = α · p_next[i] + (1 - α) · (1/|V|);
    diff[i] = abs(p_next[i] - p_curr[i]);
    p_curr[i] = 0;
    return 1;

procedure PageRank(G, α, ε):
    frontier = {0, …, |V|-1};
    while (error > ε):
        frontier = EDGEMAP(G, frontier, UPDATE, COND_true);
        frontier = VERTEXMAP(frontier, COMPUTE);
        error = sum of diff entries;
        swap(p_curr, p_next);
    return p_curr;
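The PageRank pseudocode can be exercised with a sequential Python sketch like the one below. It is an illustration under assumptions: the EdgeMap and VertexMap steps are written as plain loops, COMPUTE is applied to every vertex in each round, and the example graph has no vertices with zero out-degree (so the ranks keep summing to 1).

```python
# Sequential sketch of the PageRank pseudocode; loops stand in for EDGEMAP/VERTEXMAP.

def pagerank(graph, alpha, eps):
    n = len(graph)
    p_curr = [1.0 / n] * n
    p_next = [0.0] * n
    diff = [0.0] * n
    frontier = list(range(n))            # all vertices are active in every round
    error = float("inf")

    def update(s, d):
        p_next[d] += p_curr[s] / len(graph[s])   # atomic_increment in parallel code
        return True

    def compute(i):
        p_next[i] = alpha * p_next[i] + (1 - alpha) * (1.0 / n)
        diff[i] = abs(p_next[i] - p_curr[i])
        p_curr[i] = 0.0                  # becomes the zeroed p_next after the swap
        return True

    while error > eps:
        for s in frontier:               # EDGEMAP with a condition that is always true
            for d in graph[s]:
                update(s, d)
        frontier = [v for v in frontier if compute(v)]   # VERTEXMAP
        error = sum(diff)
        p_curr, p_next = p_next, p_curr  # swap
    return p_curr

g = [[1, 2], [2], [0]]                   # graph[v] lists the out-neighbors of v
ranks = pagerank(g, 0.85, 1e-10)
print(ranks)
```

Zeroing `p_curr[i]` inside COMPUTE is what lets the swap hand back a clean accumulator for the next round without a separate reset pass.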

PageRank

• Sparse version?
• PageRank-Delta: Only update vertices whose PageRank value has changed by more than some Δ-fraction (discussed in GraphLab and McSherry WWW ’05)

PageRank-Delta in Ligra

PR = {1/|V|, …, 1/|V|}; nghSum = {0, …, 0}; Change = {}; //store changes in PageRank values

procedure UPDATE(s, d): //passed to EdgeMap
    atomic_increment(nghSum[d], Change[s] / degree(s));
    return 1;

procedure COMPUTE(i): //passed to VertexMap
    Change[i] = α · nghSum[i];
    PR[i] = PR[i] + Change[i];
    return (abs(Change[i]) > Δ); //check if absolute value of change is big enough
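The slide gives only UPDATE and COMPUTE, so the sketch below fills in a driver loop under clearly labeled assumptions. In particular, it starts PR and Change at the teleport term (1 - α)/|V| rather than at 1/|V|, so that the accumulated changes add up to the PageRank vector; the initialization, the reset of nghSum, and the `max_iters` cap are choices made for this example and are not taken from the slides or from Ligra's source.

```python
# Sequential sketch of a PageRank-Delta driver. The initialization, the loop,
# and max_iters are assumptions; only update/compute follow the pseudocode.

def pagerank_delta(graph, alpha, delta, max_iters=1000):
    n = len(graph)
    base = (1 - alpha) / n
    pr = [base] * n                      # assumed start: the teleport term only
    change = [base] * n                  # change[i]: most recent change in pr[i]
    ngh_sum = [0.0] * n
    frontier = list(range(n))            # initially every vertex is active

    def update(s, d):
        ngh_sum[d] += change[s] / len(graph[s])   # atomic_increment in parallel code
        return True

    def compute(i):
        change[i] = alpha * ngh_sum[i]
        pr[i] = pr[i] + change[i]
        return abs(change[i]) > delta    # stay active only if the change is big enough

    for _ in range(max_iters):
        if not frontier:
            break
        touched = set()
        for s in frontier:               # EdgeMap over out-edges of active vertices
            for d in graph[s]:
                if update(s, d):
                    touched.add(d)
        frontier = [v for v in sorted(touched) if compute(v)]   # VertexMap
        for v in touched:
            ngh_sum[v] = 0.0             # reset accumulators for the next round
    return pr

g = [[1, 2], [2], [0]]
print(pagerank_delta(g, 0.85, 1e-14))
```

With a very small Δ this behaves like ordinary PageRank; with a larger Δ, vertices whose change falls below the cutoff stop propagating, so the frontier shrinks and later rounds touch fewer edges at the cost of a small approximation error.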


Performance of Ligra


Ligra BFS Performance

[Figure: BFS running time (seconds) on the Twitter graph (41M vertices, 1.5B edges): Ligra (40-core machine) vs. hand-written Cilk/OpenMP (40-core machine)]

[Figure: BFS lines of code for the same implementations]

• Comparing against direction-optimizing code by Beamer et al.

Ligra PageRank Performance

[Figure: PageRank (1 iteration) running time (seconds) on the Twitter graph (41M vertices, 1.5B edges): GraphLab (64 x 8-cores), GraphLab (40-core machine), Ligra (40-core machine), hand-written Cilk/OpenMP (40-core machine)]

[Figure: PageRank lines of code for the same implementations]

• Easy to implement “sparse” version of PageRank in Ligra

Ligra Connected Components Performance

[Figure: Connected Components running time (seconds) on the Twitter graph (41M vertices, 1.5B edges): GraphLab (16 x 8-cores), GraphLab (40-core machine), Ligra (40-core machine), hand-written Cilk/OpenMP (40-core machine); bar labels 120 and 244 are shown]

[Figure: Connected Components lines of code for the same implementations]

• Performance close to hand-written code
• Faster than existing high-level frameworks at the time
• Shared-memory graph processing can be very efficient
• Requires graph to fit on a single machine

Large Graphs

6.6 billion edges

128 billion edges

~1 trillion edges [VLDB 2015]

•  Most can fit on commodity shared memory machine

Amazon EC2

Ongoing Work: Local Graph Clustering

• Significant savings if output cluster is much smaller than entire graph
• We parallelized local algorithms using Ligra
• Algorithms are frontier-based, processing small active sets on each iteration
• Ligra is crucial to getting local running time

Graph clustering (Community detection)

Local algorithms only touch small part of graph

Ligra Summary

VertexSubset, EdgeMap, VertexMap

• Breadth-first search
• Betweenness centrality
• Connected components
• Triangle counting
• K-core decomposition
• Maximal independent set
• Single-source shortest paths
• Eccentricity estimation
• (Personalized) PageRank
• Local graph clustering
• Biconnected components
• Collaborative filtering
• …

Simplicity, Performance, Scalability

Optimizations: Hybrid traversal and graph compression


Thank you!

Code: https://github.com/jshun/ligra/