TRANSCRIPT
Betweenness Centrality: Algorithms and Implementations
Dimitrios Prountzos
Keshav Pingali
The University of Texas at Austin
2
Focus of this Talk
• A novel formulation of Betweenness Centrality
  – Based on the Operator Formulation (Pingali et al. 2011)
  – Can express existing parallel solutions
  – Basis for a new class of asynchronous parallel solutions
• Systematic derivation of parallel implementations from the operator formulation
  – Ideas applicable to other irregular algorithms
3
Warm-up: Single-Source Shortest-Path
• Basic ingredient in Betweenness Centrality
• Problem formulation
  – Compute the shortest distance from source node S to every other node
• Many algorithms
  – Bellman-Ford (1957)
  – Dijkstra (1959)
  – Chaotic relaxation (Miranker 1969)
  – Delta-stepping (Meyer et al. 1998)
• Common structure
  – Each node has a label d with the currently known shortest distance from S
• Key operation
  – relax-edge(u,v): if d(u) + w(u,v) < d(v) then d(v) := d(u) + w(u,v)
[Figure: example weighted graph with source S and nodes A through G; relaxing edge (A,C) updates d(C) when d(A) + w(A,C) < d(C).]
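As a sketch of the relax-edge operation described above, here is a minimal chaotic-relaxation SSSP in Python; the graph layout and node names are illustrative, not the slide's example:

```python
import math

def sssp(graph, source):
    """Chaotic-relaxation SSSP: repeatedly pull a node from the worklist
    and relax its outgoing edges until no distance can improve.
    graph maps each node to a list of (neighbor, weight) pairs."""
    d = {u: math.inf for u in graph}
    d[source] = 0
    worklist = [source]
    while worklist:
        u = worklist.pop()
        for v, w in graph[u]:
            # relax-edge(u, v): the common key operation of all the
            # SSSP algorithms listed above
            if d[u] + w < d[v]:
                d[v] = d[u] + w
                worklist.append(v)
    return d

# Illustrative graph (not the slide's example)
graph = {'S': [('A', 2), ('B', 5)], 'A': [('C', 1)], 'B': [('C', 1)], 'C': []}
print(sssp(graph, 'S'))  # {'S': 0, 'A': 2, 'B': 5, 'C': 3}
```

The worklist order is deliberately unspecified: any order converges, which is exactly the freedom the scheduling discussion later exploits.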
4
Operator Formulation Concepts
• Operator: conditional rewrite rule on the graph
  – Example (edge relaxation): on edge (u,v) with weight w, if du + w < dv, rewrite dv to du + w
• Activity: an application of the operator at a particular location in the graph
• Parallel Graph Algorithm = Operators + Schedule
  – Operators: what should be done; identify new activities
  – Schedule: how it should be done; order activity processing
(“TAO of parallelism”, PLDI 2011)
5
Betweenness Centrality
• Identifies important nodes in a network
  – BC(v) = Σ_{s≠v≠t} σst(v) / σst, where σst is the number of shortest paths from s to t and σst(v) is the number of those that pass through v
• Brandes’ Algorithm (2001)
  – Forward Pass:
    • Compute shortest-path DAG for a given source S
    • Compute shortest-path count σ(u) for each node u
  – Backward Pass:
    • Traverse DAG and compute BC(u)
• Parallel Implementations
  – Bader et al. (2006)
  – Madduri et al. (2009)
  – Edmonds et al. (2010)
  – …
[Figure: example DAG over nodes A through E with labels (d, σ): source (0,1); middle nodes (1,1); sink (2,2).]
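To make Brandes' two passes concrete, here is a minimal Python sketch for unweighted graphs; the adjacency-dict representation is an assumption for illustration, not the talk's implementation:

```python
from collections import deque

def brandes_bc(graph):
    """Brandes' algorithm: for each source s, a BFS forward pass builds the
    shortest-path DAG and the path counts sigma; a backward pass over the
    DAG accumulates the dependencies delta into BC."""
    bc = {v: 0.0 for v in graph}
    for s in graph:
        # forward pass: shortest-path DAG (pred) and path counts (sigma)
        pred = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}
        dist = {v: -1 for v in graph}
        sigma[s], dist[s] = 1, 0
        order, queue = [], deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in graph[u]:
                if dist[v] < 0:
                    dist[v] = dist[u] + 1
                    queue.append(v)
                if dist[v] == dist[u] + 1:   # u precedes v in the DAG
                    sigma[v] += sigma[u]
                    pred[v].append(u)
        # backward pass: traverse the DAG in reverse BFS order
        delta = {v: 0.0 for v in graph}
        for v in reversed(order):
            for u in pred[v]:
                delta[u] += sigma[u] / sigma[v] * (1 + delta[v])
            if v != s:
                bc[v] += delta[v]
    return bc

# On a 3-node path A-B-C, only B lies on shortest paths between other nodes.
print(brandes_bc({'A': ['B'], 'B': ['A', 'C'], 'C': ['B']}))
# {'A': 0.0, 'B': 2.0, 'C': 0.0}
```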
6
BC Operator Formulation
Each node u carries a label (du, σu) and predecessor/successor sets Pu, Su; each edge (u,v) carries (duv, w, σuv): its weight w and the distance and path count last propagated across it.

Shortest Path (SP)
  before: u:(du, σu) --(duv, w, σuv)--> v:(dv, σv)
  guard:  dv > du + w
  after:  v:(du+w, 0); Pv = ∅, Sv = ∅

First Update (FU)
  before: u:(du, σu) --(duv, w, σuv)--> v:(du+w, σv)
  guard:  duv ≠ du
  after:  v:(du+w, σv+σu); edge:(du, w, σu); Pv ∪= {u}, Su ∪= {v}

Update Sigma (US)
  before: u:(du, σu) --(du, w, σuv)--> v:(du+w, σv)
  guard:  σu ≠ σuv
  after:  v:(du+w, σv+σu−σuv); edge:(du, w, σu)

Correct Node (CN)
  before: u:(du, σu) --(du, w, σuv)--> v:(dv, σv)
  guard:  du ≠ ∞ ⋀ du+w > dv
  after:  edge:(∞, w, σuv); Su −= {v}

[Figure: example graph over A, C, D, E with node labels (d, σ), annotated with predecessor and successor relationships as the operators fire.]
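A minimal Python rendering of the Shortest Path (SP) rule above may help; the state layout (plain dicts for d/σ and predecessor sets) is a hypothetical encoding for illustration, not the paper's data structures:

```python
import math

def sp_enabled(d, u, v, w):
    """Guard of the Shortest Path (SP) operator: the edge (u, v) with
    weight w can lower v's tentative distance."""
    return d[v] > d[u] + w

def sp_apply(d, sigma, preds, succs, u, v, w):
    """SP rewrite: v takes the shorter distance; its path count and its
    predecessor/successor sets are reset (FU later rebuilds them)."""
    d[v] = d[u] + w
    sigma[v] = 0
    preds[v].clear()
    succs[v].clear()

# Tiny illustration of guard-then-apply on one edge
d = {'u': 1, 'v': math.inf}
sigma = {'u': 2, 'v': 0}
preds, succs = {'v': {'x'}}, {'v': set()}
if sp_enabled(d, 'u', 'v', 3):
    sp_apply(d, sigma, preds, succs, 'u', 'v', 3)
print(d['v'], sigma['v'], preds['v'])  # 4 0 set()
```

The guard/apply split mirrors the conditional-rewrite-rule reading of each operator: the guard is checked (under a lock, in the parallel setting) and the rewrite fires only if it holds.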
7
BC Operator Formulation (continued)
(Same four operators SP, FU, US, CN as on the previous slide.)
[Figure: the running example after further operator applications; node labels (d, σ) updated, e.g. one node now (2,1).]
8
Operator Scheduling for Parallel Algorithm Derivation

Unordered semantics:
  while ∃ op ∈ {SP, FU, US, CN} and (u,v) such that op(u,v) is enabled:
    apply op(u,v)

Worklist refinement:
  Wl = { (u,v) : op(u,v) enabled, op ∈ {SP, FU, US, CN} }
  while ¬Wl.empty:
    apply operator(s)
    Wl ∪= …

Parallel Graph Algorithm = Operators + Schedule (identify new activities; order activity processing). The schedule can be bound with a Static Ordering or a Dynamic Ordering.
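The worklist loop above can be sketched generically; here it is instantiated with a single relaxation operator on a toy graph (all names and the graph itself are illustrative):

```python
import math

def run_unordered(initial, guard, apply_op, delta):
    """Unordered scheduling skeleton: pull any activity, apply the
    operator if its guard is enabled, and add the activities it creates."""
    wl = list(initial)
    while wl:
        act = wl.pop()
        if guard(act):
            apply_op(act)
            wl.extend(delta(act))

# Toy instantiation: SSSP edge relaxation as the only operator.
graph = {'S': [('A', 1)], 'A': [('B', 2)], 'B': []}
d = {v: math.inf for v in graph}
d['S'] = 0

def guard(edge):
    u, v, w = edge
    return d[u] + w < d[v]

def apply_op(edge):
    u, v, w = edge
    d[v] = d[u] + w

def delta(edge):
    _, v, _ = edge          # new activities: v's outgoing edges
    return [(v, x, w) for x, w in graph[v]]

run_unordered([('S', v, w) for v, w in graph['S']], guard, apply_op, delta)
print(d)  # {'S': 0, 'A': 1, 'B': 3}
```

Swapping the `wl.pop()` policy (stack, queue, priority buckets) changes the schedule without touching the operators, which is the separation the slide argues for.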
9
Dynamic Operator Scheduling
[Figure: shared worklist of activities (nodes A through J) drained in parallel by threads T1 … Tk.]
• Variation of Delta-Stepping (Meyer et al. 1998)
• Our operators are general enough to enable this scheme
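Delta-stepping orders the dynamic worklist by binning activities into distance buckets; a sequential Python sketch follows (the bucket width, graph, and function name are illustrative, and a parallel runtime would drain each bucket with many threads):

```python
import math
from collections import defaultdict

def delta_stepping(graph, source, delta=2):
    """Delta-stepping-style scheduling sketch: nodes are binned by
    tentative distance into buckets of width delta, and the lowest
    non-empty bucket is drained first."""
    d = {v: math.inf for v in graph}
    d[source] = 0
    buckets = defaultdict(list)
    buckets[0].append(source)
    while buckets:
        i = min(buckets)                 # lowest non-empty bucket
        for u in buckets.pop(i):
            for v, w in graph[u]:
                nd = d[u] + w
                if nd < d[v]:            # relax and re-bucket v
                    d[v] = nd
                    buckets[nd // delta].append(v)
    return d

graph = {'S': [('A', 1), ('B', 4)], 'A': [('B', 1)], 'B': []}
print(delta_stepping(graph, 'S'))  # {'S': 0, 'A': 1, 'B': 2}
```

Stale bucket entries (a node re-bucketed after its distance dropped) are processed harmlessly, since relaxation is monotone.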
10
Static Operator Scheduling
Bind scheduling decisions at compile time by committing to a particular code structure.
• Operator Grouping
  ⊕ Exploits locality, reduces worklist pressure
  ⊖ Load balancing
• Operator Merging
  – E.g. combine SP(u,v); FU(u,v) into SP⨀FU
  ⊕ Optimizes computation + locking
• Context-based Operator Inlining
  ⊕ Reduces worklist pressure
  ⊖ Load balancing
[Figure: node A with neighbors B1 … Bn, illustrating grouping over a node’s edges.]
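Operator merging can be illustrated by fusing SP and FU: instead of SP resetting (d, σ, Pv) and a separate FU activity re-examining the same edge to propagate σu, one combined update runs under a single guard check. The state encoding below is hypothetical, as before:

```python
import math

def sp_fu(d, sigma, preds, u, v, w):
    """Merged SP(.)FU operator: SP's distance update and reset fused with
    FU's sigma propagation and predecessor insertion, so the edge is
    examined (and, in a parallel setting, locked) only once."""
    if d[u] + w < d[v]:              # SP guard
        d[v] = d[u] + w              # SP: shorter distance for v
        sigma[v] = sigma[u]          # SP resets sigma to 0, FU adds sigma[u]
        preds[v] = {u}               # SP clears Pv, FU inserts u
        return True
    return False

d = {'u': 1, 'v': math.inf}
sigma = {'u': 2, 'v': 7}
preds = {'v': {'old'}}
sp_fu(d, sigma, preds, 'u', 'v', 3)
print(d['v'], sigma['v'], preds['v'])  # 4 2 {'u'}
```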
11
Algorithm Encodings
• Async1 : (SP|FU|US|CN)*
  – SP⨀FU
• Async2 : (SP|FU|US|CN)*
  – Group (u,v*)
  – SP⨀FU
  – SP⨀FU inline CN
• Leveled : level-by-level (SP|FU)*
  – Group (u,v*)
  – SP⨀FU
12
Experimental Evaluation
13
Experiments on Unweighted Graphs
24-core Intel Xeon @ 2 GHz
• Scale-free RMAT graph: 33 M nodes, 268 M edges
• Random graph: 67 M nodes, 268 M edges
[Charts: running time (sec) vs. number of threads (1–24) for Leveled1, Leveled2, Async2, and Leveled2-Serial on each graph.]
Leveled1 (Bader et al. 2006), Leveled2 (Madduri et al. 2009)
14
Experiments on Weighted Graphs
• USA road network: 24 M nodes, 58 M edges
• USA central road network: 14 M nodes, 34 M edges
Machines: 24-core Intel Xeon @ 2 GHz; Sun T5440 UltraSPARC T2+ @ 1.4 GHz
[Charts: running time (sec) vs. number of threads (up to 24 on the Xeon, up to 128 on the T2+) for Async1 against Boost-Serial; speedups of 9.5x and 38x, respectively.]
15
Conclusion
• New BC formulation
  – Expresses existing solutions
  – Basis for new asynchronous solutions
• Systematic derivation of parallel implementations
  – Dynamic + static schedule transformations
• Enables automatic synthesis of parallel programs
  – Elixir [OOPSLA 2012]
Thank You
[Figure: recap: Parallel Graph Algorithm = Operators + Schedule]
16
Backup
17
Betweenness Centrality
• Identifies important nodes in a network
• Brandes’ Algorithm (2001)
  – Compute shortest-path DAG
  – Update δ(v), BC(v)
[Figure: DAG from source S over nodes A, B, C, E, F, H with labels (L(u), σ(u)): (0,1), (1,1), (1,1), (2,2), (3,2), (3,2), (4,4).]
18
BC Operator Formulation
(The four operators SP, FU, US, CN as defined earlier, with node labels written (Lu, σu).)
[Figure: example graph over A, C, D, E with predecessor/successor annotations.]
19
BC Operator Formulation (continued)
(Same operators; the figure shows successive states of the running example.)
[Figure: running example with node labels (L, σ) evolving through operator applications, e.g. (0,1), (1,1), (2,1), (2,2), (3,2).]
20
Deriving Algorithm Variants

Async1 (edge-based worklist):
  Worklist Wl = { (src,w) : (src,w) ∈ G(V,E) }
  foreach (u,v) ∈ Wl {
    lock(u,v)
    if grd[SP⨀FU,u,v] {
      apply SP⨀FU ; unlock(u,v)
      Wl ∪= { (v,w) : w ∈ outNbrs(v) }
      if vHasPreds
        Wl ∪= { (w,u) : w ∈ inNbrs(v) }
    }
    else-if grd[CN,u,v] { apply CN ; unlock(u,v) }
    else-if grd[FU,u,v] { … }
    else-if grd[US,u,v] { … }
  }

Async2 (node-based, CN inlined):
  Worklist Wl = { src }
  foreach u ∈ Wl {
    forall v ∈ outNbrs(u) {
      lock(u,v)
      if grd[SP⨀FU,u,v] {
        …
        Wl ∪= { v }
        if vHasPreds {
          forall w ∈ inNbrs(v) {
            lock(v,w)
            if grd[CN,w,v] { … }
            unlock(v,w)
          }
        }
      }
      else-if …
    }
  }
21
Insights Behind Elixir
• Parallel Graph Algorithm = Operators + Schedule
  – Operators: what should be done; the operator delta identifies new activities
  – Schedule: how it should be done; order activity processing
• Handles unordered/ordered algorithms
• Supports static and dynamic schedules
(“TAO of parallelism”, PLDI 2011)