big graphs for big data: parallel matching and clustering on …bisse101/slides/asia14.pdf ·...

77
Outline Matching Introduction Greedy Parallelisable BSP algorithm GPU algorithm Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel matching and clustering on billion-vertex graphs Rob H. Bisseling Mathematical Institute, Utrecht University Collaborators: Bas Fagginger Auer, Fredrik Manne, Albert-Jan Yzelman Asia-trip A-Eskwadraat, July 2014

Upload: others

Post on 13-Oct-2020

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

1

Big graphs for big data: parallel matching andclustering on billion-vertex graphs

Rob H. Bisseling

Mathematical Institute, Utrecht University

Collaborators: Bas Fagginger Auer, Fredrik Manne, Albert-Jan Yzelman

Asia-trip A-Eskwadraat, July 2014

Page 2: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

2

Graph MatchingIntroductionGreedy algorithmParallelisable 1/2-approximation algorithmBSP algorithmGPU algorithmResults

ClusteringIntroductionSequential algorithmGPU algorithmResults

Conclusion

Page 3: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

3

Matching can win you a Nobel prize

Source: Slate magazine October 15, 2012

Page 4: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

4

Motivation of graph matching

I Graph matching is a pairing of neighbouring vertices.I It has applications in

• medicine: finding suitable donors for organs• social networks: finding partners• scientific computing: finding pivot elements in matrix

computations• graph coarsening: making the graph smaller by merging

similar vertices

Page 5: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

5

Motivation of greedy/approximation graphmatching

I Optimal solution is possible in polynomial time.

I Time for weighted matching in graph G = (V ,E ) isO(mn + n2 log n) with n = |V | the number of vertices,and m = |E | the number of edges (Gabow 1990).

I The aim is a billion vertices, n = 109, with 100 edges pervertex, i.e. m = 1011.

I Thus, a time of O(1020) = 100, 000 Petaflop units is fartoo long. Fastest supercomputer today, the ChineseTianhe-2 (Milky-Way 2), performs 33.8 Petaflop/s.

I We need linear-time greedy or approximation algorithms.

Page 6: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

6

Formal definition of graph matching

I A graph is a pair G = (V ,E ) with vertices V and edges E .

I All edges e ∈ E are of the form e = (v ,w) for verticesv ,w ∈ V .

I A matching is a collection M ⊆ E of disjoint edges.

I Here, the graph is undirected, so (v ,w) = (w , v).

Page 7: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

7

Maximal matching

I A matching is maximal if we cannot enlarge it further byadding another edge to it.

Page 8: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

8

Maximum matching

I A matching is maximum if it possesses the largest possiblenumber of edges, compared to all other matchings.

Page 9: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

9

Edge-weighted matching

I If the edges are provided with weights ω : E → R>0,finding a matching M which maximises

ω(M) =∑e∈M

ω(e),

is called edge-weighted matching.

I Greedy matching provides us with maximal matchings,but not necessarily with maximum possible weight.

Page 10: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

10

Sequential greedy matching

I In random order, vertices v ∈ V select and matchneighbours one-by-one.

I Here, we can pick• the first available neighbour w of v ,

greedy random matching• the neighbour w with maximum ω(v ,w),

greedy weighted matching

I Or: we sort all the edges by weight, and successively matchthe vertices v and w of the heaviest available edge (v ,w).This is commonly called greedy matching.

Page 11: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

11

Sequential greedy random matching

9

8

65

73

1

4

2

Page 12: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

11

Sequential greedy random matching

9

8

65

73

1

4

2

Page 13: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

11

Sequential greedy random matching

9

8

65

73

1

4

2

Page 14: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

11

Sequential greedy random matching

9

8

65

73

1

4

2

Page 15: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

11

Sequential greedy random matching

9

8

65

73

1

4

2

Page 16: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

11

Sequential greedy random matching

9

8

65

73

1

4

2

Page 17: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

11

Sequential greedy random matching

9

8

65

73

1

4

2

Page 18: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

11

Sequential greedy random matching

9

8

65

73

1

4

2

Page 19: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

11

Sequential greedy random matching

9

8

65

73

1

4

2

Page 20: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

11

Sequential greedy random matching

9

8

65

73

1

4

2

Page 21: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

11

Sequential greedy random matching

9

8

65

73

1

4

2

Page 22: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

11

Sequential greedy random matching

9

8

65

73

1

4

2

Page 23: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

11

Sequential greedy random matching

9

8

65

73

1

4

2

Page 24: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

11

Sequential greedy random matching

9

8

65

73

1

4

2

Page 25: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

11

Sequential greedy random matching

9

8

65

73

1

4

2

Page 26: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

11

Sequential greedy random matching

9

8

65

73

1

4

2

Page 27: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

11

Sequential greedy random matching

9

8

65

73

1

4

2

Page 28: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

12

Greedy matching is a 1/2-approximation algorithm

I Weight ω(M) ≥ ωoptimal/2

I Cardinality |M| ≥ |Mcard−max|/2, because M is maximal.

I Time complexity is O(m log m), because all edges must besorted.

Page 29: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

13

Parallel greedy matching: trouble

9

8

65

73

1

4

2

Suppose we match vertices simultaneously.

Page 30: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

13

Parallel greedy matching: trouble

9

8

65

73

1

4

2

Two vertices each find an unmatched neighbour. . .

Page 31: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

13

Parallel greedy matching: trouble

9

8

65

73

1

4

2

. . . but generate an invalid matching.

Page 32: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

14

Parallelisable dominant-edge algorithm

while E 6= ∅ dopick a dominant edge (v ,w) ∈ EM := M ∪ (v ,w)E := E \ (x , y) ∈ E : x = v ∨ x = wV := V \ v ,w

return M

I An edge (v ,w) ∈ E is dominant if

ω(v ,w) = maxω(x , y) : (x , y) ∈ E ∧ (x = v ∨ x = w)

9

7

32

6 wv

5

68

Page 33: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

15

Sequential approximation algorithm: initialisation

function SeqMatching(V ,E )for all v ∈ V do

pref (v) = nullD := ∅M := ∅

Find dominant edges for all v ∈ V do

Adjv := w ∈ V : (v ,w) ∈ Epref (v) := argmaxω(v ,w) : w ∈ Adjvif pref (pref (v)) = v then

D := D ∪ v , pref (v)M := M ∪ (v , pref (v))

Page 34: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

16

Mutual preferences

9

7

32

6 wv

5

68

Page 35: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

17

Non-mutual preferences

9

12

7

3

6 wv

5

68

Page 36: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

18

Sequential approximation algorithm: main loop

while D 6= ∅ dopick v ∈ DD := D \ vfor all x ∈ Adjv \ pref (v) : (x , pref (x)) /∈ M do

Adjx := Adjx \ v

Set new preference pref (x) := argmaxω(x ,w) : w ∈ Adjxif pref (pref (x)) = x then

D := D ∪ x , pref (x)M := M ∪ (x , pref (x))

return M

Page 37: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

19

Properties of the dominant-edge algorithm

I Dominant-edge algorithm is a 1/2-approximation:

ω(M) ≥ ωoptimal/2

I Dominance is a local property: easy to parallelise.

I Algorithm keeps going until set of dominant vertices D isempty and matching M is maximal.

I Assumption without loss of generality: weights are unique.Otherwise, use vertex numbering to break ties.

Page 38: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

20

Time complexity

I Linear time complexity O(|E |) if edges of each vertex aresorted by weight.

I Sorting costs are∑v

deg(v) log deg(v) ≤∑v

deg(v) log ∆ = 2|E | log ∆,

where ∆ is the maximum vertex degree.

I This algorithm is based on a dominant-edge algorithm byPreis (1999), called LAM, which is linear-time O(|E |),does not need sorting, and also is a 1/2-approximation, butis hard to parallelise.

Page 39: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

21

Parallel computer: abstract model

MP P P PP

M M M M

Communicationnetwork

Bulk synchronous parallel (BSP) computer.Proposed by Leslie Valiant, 1989.

Page 40: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

22

Parallel algorithm: supersteps

P(0) P(1) P(2) P(3) P(4)

sync

sync

sync

sync

sync

comm

comm

comm

comp

comp

Page 41: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

23

Composition with Red, Yellow, Blue and Black

Piet Mondriaan 1921

Page 42: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

24

Mondriaan data distribution for matrix prime60

I Non-Cartesian block distribution of 60× 60 matrixprime60 with 462 nonzeros, for p = 4

I aij 6= 0⇐⇒ i |j or j |i (1 ≤ i , j ≤ 60)

Page 43: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

25

Parallel algorithm (Manne & Bisseling, 2007)

I Processor P(s) has vertex set Vs , with

p−1⋃s=0

Vs = V

and Vs ∩ Vt = ∅ if s 6= t.

I This is a p-way partitioning of the vertex set.

Page 44: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

26

Halo vertices

I The adjacency set Adjv of a vertex v may contain verticesw from another processor.

I We define the set of halo vertices

Hs =⋃

v∈Vs

Adjv \ Vs

I The weights ω(v ,w) are stored with the edges, for allv ∈ Vs and w ∈ Vs ∪ Hs .

I Es = (v ,w) ∈ E : v ∈ Vsis the subset of all the edges connected to Vs .

Page 45: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

27

Parallel algorithm for P(s): initialisation

function ParMatching(Vs ,Hs ,Es , distribution φ)for all v ∈ Vs do

pref (v) = nullDs := ∅Ms := ∅

Find dominant edges for all v ∈ Vs do

Adjv := w ∈ Vs ∪ Hs : (v ,w) ∈ EsSetNewPreference(v ,Adjv , pref ,Vs ,Ds ,Ms , φ)

Sync

Page 46: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

28

Setting a vertex preference

function SetNewPreference(v ,Adj ,V ,D,M, φ)pref (v) := argmaxω(v ,w) : w ∈ Adjif pref (v) ∈ V then

if pref (pref (v)) = v thenD := D ∪ v , pref (v)M := M ∪ (v , pref (v))

elseput proposal(v , pref (v)) in P(φ(pref (v)))

Page 47: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

29

How to propose

Source: www.theguardian.com

proposal(v ,w): v proposes to w

Page 48: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

30

Parallel algorithm for P(s): main loop

process received messages

while Ds 6= ∅ dopick v ∈ Ds

Ds := Ds \ vfor all x ∈ Adjv \ pref (v) : (x , pref (x)) /∈ Ms do

if x ∈ Vs thenAdjx := Adjx \ vSetNewPreference(x ,Adjx , pref ,Vs ,Ds ,Ms , φ)

else x ∈ Hsput unavailable(v , x) in P(φ(x))

Sync

Page 49: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

31

Parallel algorithm for P(s): process receivedmessages

for all messages m received doif m = proposal(x , y) then

if pref (y) = x thenDs := Ds ∪ yMs := Ms ∪ (x , y)put accepted(x , y) in P(φ(x))

if m = accepted(x , y) thenDs := Ds ∪ xMs := Ms ∪ (x , y)

if m = unavailable(v , x) thenif (x , pref (x)) /∈ Ms then

Adjx := Adjx \ vSetNewPreference(x ,Adjx , pref ,Vs ,Ds ,Ms , φ)

Page 50: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

32

Termination

I The algorithm alternates supersteps of computationrunning the main loop and communication handling thereceived messages.

I The whole algorithm terminates when no messages havebeen received by processor P(s) and the local set Ds isempty, for all s.

I This can be checked at every synchronisation point.

Page 51: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

33

Load balance

I Processors can have different amounts of work, even ifthey have the same number of vertices or edges.

I Use can be made of a global clock based on ticks, the unitof time needed to ‘handle’ a vertex x (in O(1)).

I Here, ‘handling’ could mean setting a new preference.

I After every k ticks, everybody synchronises.

Page 52: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

34

Synchronisation frequency

I Guidance for the choice of k is provided by the BSPparameter l , the cost of a global synchronisation.

I Choosing k ≥ l guarantees that at most 50% of the totaltime is spent in synchronisation.

I Choosing k sufficiently small will cause all processors to bebusy during most supersteps.

I Good choice: k = 2l?

Page 53: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

35

MulticoreBSP enables shared-memory BSP

Albert-Jan Yzelman 2014, www.multicorebsp.org

Page 54: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

36

GPU matching

9

8

65

73

1

4

2

I A different approach, tightly coupled to the GPUarchitecture.

I To prevent matching conflicts, we create two groups ofvertices:

• Blue vertices propose.• Red vertices respond.

I Proposals that were responded to, are matched.

Page 55: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

37

GPU matching

ColourProposeRespondMatch

9

8

65

73

1

4

2

1 2 3 4 5 6 7 8 9

π - - - - - - - - -σ - - - - - - - - -

Page 56: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

37

GPU matching

ColourProposeRespondMatch

9

8

65

73

1

4

2

1 2 3 4 5 6 7 8 9

π b r r b b r b b rσ - - - - - - - - -

Page 57: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

37

GPU matching

ColourProposeRespondMatch

9

8

65

73

1

4

2

1 2 3 4 5 6 7 8 9

π b r r b b r b b rσ 3 - - 3 6 - 3 2 -

Page 58: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

37

GPU matching

ColourProposeRespondMatch

9

8

65

73

1

4

2

1 2 3 4 5 6 7 8 9

π b r r b b r b b rσ 3 8 7 3 6 5 3 2 -

Page 59: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

37

GPU matching

ColourProposeRespondMatch

9

8

65

73

1

4

2

1 2 3 4 5 6 7 8 9

π b 2 3 b 5 5 3 2 rσ 3 8 7 3 6 5 3 2 -

Page 60: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

37

GPU matching

ColourProposeRespondMatch

9

8

65

73

1

4

2

1 2 3 4 5 6 7 8 9

π r 2 3 r 5 5 3 2 bσ 3 8 7 3 6 5 3 2 -

Page 61: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

37

GPU matching

ColourProposeRespondMatch

9

8

65

73

1

4

2

1 2 3 4 5 6 7 8 9

π r 2 3 r 5 5 3 2 bσ - - - - - - - - d

Page 62: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

37

GPU matching

ColourProposeRespondMatch

9

8

65

73

1

4

2

1 2 3 4 5 6 7 8 9

π r 2 3 r 5 5 3 2 bσ - - - - - - - - d

Page 63: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

37

GPU matching

ColourProposeRespondMatch

9

8

65

73

1

4

2

1 2 3 4 5 6 7 8 9

π r 2 3 r 5 5 3 2 dσ - - - - - - - - d

Page 64: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

37

GPU matching

ColourProposeRespondMatch

9

8

65

73

1

4

2

1 2 3 4 5 6 7 8 9

π b 2 3 r 5 5 3 2 dσ - - - - - - - - d

Page 65: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

37

GPU matching

ColourProposeRespondMatch

9

8

65

73

1

4

2

1 2 3 4 5 6 7 8 9

π b 2 3 r 5 5 3 2 dσ 4 - - - - - - - -

Page 66: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

37

GPU matching

ColourProposeRespondMatch

9

8

65

73

1

4

2

1 2 3 4 5 6 7 8 9

π b 2 3 r 5 5 3 2 dσ 4 - - 1 - - - - -

Page 67: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

37

GPU matching

ColourProposeRespondMatch

9

8

65

73

1

4

2

1 2 3 4 5 6 7 8 9

π 1 2 3 1 5 5 3 2 dσ 4 - - 1 - - - - -

Page 68: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

38

Quality of the matching

0

20

40

60

80

100

0 5 10 15 20Mat

ched

ver

tices

/tota

l nr.

of v

ertic

es (

%)

Number of iterations

Saturation of matching size

ecology2 (1,997,996)ecology1 (1,998,000)

G3_circuit (3,037,674)thermal2 (3,676,134)

kkt_power (6,482,320)af_shell9 (8,542,010)

ldoor (22,785,136)af_shell10 (25,582,130)

audikw1 (38,354,076)nlpkkt120 (46,651,696)

cage15 (47,022,346)

Fraction of matched vertices as a function of the number ofiterations. Number of edges between 2 and 47 million.

Page 69: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

39

Random colouring of the vertices

I At each iteration, we colour the vertices v ∈ V differently.

I For a fixed p ∈ [0, 1]

colour(v) =

blue with probability p,red with probability 1− p.

I How to choose p? Maximise the number of matchedvertices.

I For large random graphs, the expected fraction of matchedvertices can be approximated by

2 (1− p)(1− e−

p1−p

).

This is independent of the edge density.

Page 70: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

40

Clustering of road network of the Netherlands

(a) G0 (b) G11 (c) G21

(d) G26 (e) G33 (f) Best clustering(G21)

Graph with 2,216,688 vertices and 2,441,238 edgesyields 506 clusters with modularity 0.995.

Page 71: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

41

Formal definition of a clustering

I A clustering of an undirected graph G = (V ,E ) is acollection C of disjoint subsets of V satisfying

V =⋃C∈C

C .

I Elements C ∈ C are called clusters.

I The number of clusters is not fixed beforehand.

Page 72: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

42

Quality measure for clustering: modularity

I The quality measure modularity was introduced byNewman and Girvan in 2004 for finding communities.

I Let G = (V ,E , ω) be a weighted undirected graph withoutself-edges. We define

ζ(v) =∑

(u,v)∈E

ω(u, v), Ω =∑e∈E

ω(e).

I Then, the modularity of a clustering C of G is defined by

mod(C) =

∑C∈C

∑(u,v)∈E

u,v∈C

ω(u, v)

Ω−

∑C∈C

( ∑v∈C

ζ(v)

)2

4Ω2.

I −12 ≤ mod(C) ≤ 1.

Page 73: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

43

Merging clusters: change in modularity

I The set of all cut edges between clusters C and C ′ is

cut(C ,C ′) = u, v ∈ E | u ∈ C , v ∈ C ′

I If we merge clusters C and C ′ from C into one clusterC ∪ C ′, then we get a new clustering C′ with

mod(C′) = mod(C)+ 1

4 Ω2

(4 Ω ω(cut(C ,C ′))−2 ζ(C ) ζ(C ′)

),

ζ(C ∪ C ′) = ζ(C ) + ζ(C ′).

Page 74: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

44

Agglomerative greedy clustering heuristic

max← −∞G 0 = (V 0,E 0, ω0, ζ0)i ← 0C0 ← v | v ∈ V while |V i | > 1 do

if mod(G , C i ) ≥ max thenmax← mod(G , C i )Cbest ← C i

µ← weighted match clusters(G i )(πi ,G i+1)← coarsen(G i , µ)C i+1 ← v ∈ V | (πi · · · π0)(v) = u | u ∈ V i+1i ← i + 1

return Cbest

Page 75: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

45

Results: clustering time for DIMACS graphs

10-5

10-4

10-3

10-2

10-1

100

101

102

103

101

102

103

104

105

106

107

108

109

Clu

ste

ring tim

e (

s)

Number of graph edges |E|

Clustering time

3*10-7

|E|CUDA

TBB

I DIMACS categories: clustering/, coauthor/,streets/, random/, delaunay/, matrix/, walshaw/,dyn-frames/, and redistrict/.

I CUDA implementation with the Thrust template libraryand Intel TBB implementation.

I Web link graph uk-2002 with 0.26 billion verticesclustered in 30 s using Intel TBB.

Page 76: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

46

DIMACS road networks and coauthor graphs

G |V | |E | mod t mod tCU CU TBB TBB

luxembourg 114,599 119,666 0.99 0.13 0.99 0.14belgium 1,441,295 1,549,970 0.99 0.44 0.99 1.11netherlands 2,216,688 2,441,238 0.99 0.62 0.99 1.72italy 6,686,493 7,013,978 1.00 1.54 1.00 5.26great-britain 7,733,822 8,156,517 1.00 1.79 1.00 6.00germany 11,548,845 12,369,181 1.00 2.82 1.00 9.57asia 11,950,757 12,711,603 1.00 2.69 1.00 9.33europe 50,912,018 54,054,660 - -.- 1.00 45.21coAuthorsCite 227,320 814,134 0.84 0.42 0.85 0.23coAuthorsDBLP 299,067 977,676 0.75 0.59 0.76 0.28citationCite 268,495 1,156,647 0.64 0.89 0.68 0.32coPapersDBLP 540,486 15,245,729 0.64 6.43 0.67 2.28coPapersCite 434,102 16,036,720 0.75 6.49 0.77 2.27

mod = modularity, t = time in s, CU = CUDA

Page 77: Big graphs for big data: parallel matching and clustering on …bisse101/Slides/asia14.pdf · Clustering Introduction Sequential Results Conclusion 1 Big graphs for big data: parallel

Outline

Matching

Introduction

Greedy

Parallelisable

BSP algorithm

GPU algorithm

Clustering

Introduction

Sequential

Results

Conclusion

47

Conclusions

I BSP is extremely suitable for parallel graph computations:• no worries about communication because we buffer

messages until the next synchronisation;• no send-receive pairs, but one-sided put or get operations;• BSP cost model gives synchronisation frequency;• correctness proof of algorithm becomes simpler;• no deadlock possible.

I Matching can be the basis for clustering, as demonstratedfor GPUs and multicore CPUs.

I We clustered Asia’s road network with 12M vertices and12.7M edges in 2.7 seconds on a GPU.