graph algorithms for very large graphs and their applications to...

67
Graph algorithms for very large graphs and their applications to bioinformatics and social network analysis Pasquale Foggia University of Salerno, Italy [email protected] 1 MIVIA Lab Intelligent Machines for Video, Image and Audio Analysis

Upload: others

Post on 22-Jul-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

Graph algorithms forvery large graphs

and their applications tobioinformatics and

social network analysisPasquale Foggia

University of Salerno, Italy

[email protected]

1

MIVIA LabIntelligent Machines

for Video, Image and Audio Analysis

Page 2: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

MIVIA Lab✦ 3 full professors

• Mario Vento, Francesco Tortorella, Pasquale Foggia

✦ 2 associate professors • Gennaro Percannella, Pierluigi Ritrovato

✦ 3 researchers • Alessia Saggese, Luca Greco, Vincenzo Carletti

✦ 1 post-doc ✦ 3 PhD students

2

Page 3: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

MIVIA Lab

✦ Artificial vision ✦ Cognitive robotics ✦ Medical image analysis ✦ Structural Pattern Recognition and Graph-

based Algorithms ✦ Graph-based algorithms applied to genomics

3

Page 4: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

(Attributed) Graphs✦ Representation for structured information

• Nodes (aka vertices) representing atomic entities

• Edges representing relations between entities

• Nodes and edges can have attributes (aka labels) carrying additional information about entities and relations

4

Page 5: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

Obvious examples

5

Page 6: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

Less obvious examples✦ Image segmentation

6

Page 7: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

Less obvious examples✦ Semantic data

• E.g. DBpedia stores information from Wikipedia as RDF graphs

7

Page 8: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

Less obvious examples✦ Deep Learning optimization

• Tensorflow represents the DNN operations as a dataflow graph, then uses this graph to code the computation for the available CPUs/GPUs

8

Page 9: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

Less obvious examples✦ … we even tend to imagine graphs when we look

at the night sky!

9

Page 10: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

Graph matching

✦ A common operation on graphs is matching: ✦ Find a correspondence between the nodes of the

two graphs that preserves some interesting properties

✦ Different kinds of matching depending on the properties one wants to preserve

10

Page 11: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

Isomorphism✦ Find if two graphs have the same

structure

11

1

3

2

4

a

c

b

d

Page 12: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

Isomorphism✦ Find if two graphs have the same

structure

12

1

3

2

4

a

c

b

d

Page 13: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

Subgraph isomorphism✦ Find if a graph contains the other as a

subgraph

13

3

1

2

a

c

b

d

Page 14: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

Subgraph isomorphism✦ Find if a graph contains the other as a

subgraph

14

3

1

2

a

c

b

d

Page 15: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

Maximum common subgraph✦ Find the largest subgraph common to the

two graphs

15

3

1

2

a

c

b

d

4

e

Page 16: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

Maximum common subgraph✦ Find the largest subgraph common to the

two graphs

16

3

1

2

a

c

b

d

4

e

Page 17: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

Maximum common subgraph✦ MCS is related to the Maximum Clique

problem: find the largest subgraph that is fully connected

17

a

c

b

d e

Page 18: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

Graph edit distance✦ Find the “minimum cost” set of edit

operations that makes the first graph isomorphic to the second

18

3

1

2

a

c

b

d

4

e

Page 19: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

Graph edit distance✦ Find the “minimum cost” set of edit

operations that makes the first graph isomorphic to the second

19

3

1

2

a

c

b

d

4

e

5

XRemove edge (3,4)Add edge (4,2)Add node 5Add edge (2,5)

Page 20: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

Time complexity✦ All the above problems except

isomorphism have been proved to be NP-Complete

✦ For isomorphism there is a demonstration (pending verification) that the problem is QP (Quasi-Polynomial), with complexity:

20

Page 21: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

Large graphs✦ Early applications of graph matching only

worked with very small graphs ✦ e.g. my (modest) very first scientific paper!

21

Page 22: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

Large graphs

✦ Initial generation of algorithms able to work with graphs ranging from tens of nodes (graph edit distance) to a hundred nodes (subgraph isomorphism)

✦ In the late ‘90s and early 2000s several works addressed the problem of large graphs

22

Page 23: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

Subgraph isomorphism

✦ The most popular subgraph isomorphism algorithm was Ullmann’s (1976)

✦ Ullmann required a memory space that is cubic (O(n3)) wrt to the size of the graphs • Only suitable up to a few hundred nodes

23

Page 24: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

Subgraph isomorphism

✦ In 1999 we proposed the VF algorithm, that requires O(n2) space • Cordella, L.P., Foggia, P., Sansone,

C., Vento, M. Performance evaluation of the VF graph matching algorithm (1999) Proc. ICIAP 1999, pp. 1172-1177

24

Page 25: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

Subgraph isomorphism✦ In 2004 we proposed the VF2 algorithm, that

requires O(n) space • Cordella, L.P., Foggia, P., Sansone, C., Vento, M. A (sub)graph

isomorphism algorithm for matching large graphs (2004) IEEE Trans PAMI, 26 (10), pp. 1367-1372.

25

Page 26: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

Graph edit distance

✦ In 2007, Riesen, Neuhaus and Bunke proposed a method for approximating the GED using the Linear Assignment Problem, that can be solved in O(n3) time • Riesen K., Neuhaus M., Bunke H. (2007) Bipartite Graph Matching

for Computing the Edit Distance of Graphs. In: Escolano F., Vento M. (eds) Graph-Based Representations in Pattern Recognition. GbRPR 2007. Lecture Notes in Computer Science, vol 4538.

26

Page 27: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

Graph edit distance

✦ In 2012, we used approximate GED for analyzing the dynamics of socio-economic networks • V Carletti, D De Stefano, M Fattore, P Foggia, R Grassi, Dynamical

analysis of interlocking directorates using graph edit distance, International Workshop on Network Models in Statistics (2012)

• Applied to the network induced by mutual participation in the board of directors of italian firms in the industrial, financial and publishing sector from 2007 to 2011, to detect structural changes

27

Page 28: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

Graph edit distance✦ In 2017, we proposed a better but slower

approximation using the Quadratic Assignment Problem • Bougleux, S., Brun, L., Carletti, V., Foggia, P., Gaüzère,

B., Vento, M. Graph edit distance as a quadratic assignment problem (2017) Pattern Recognition Letters, 87, pp. 38-46.

28

Page 29: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

Graph edit distance✦ LAP vs QAP

29

Time Error

Page 30: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

Large enough?

✦ Although the above mentioned algorithms made possible to use larger graphs than before, new applications raised the bar regarding what “large” means

30

Page 31: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

Examples: bioinformatics

✦ Protein Data Bank: world wide collection of information on proteins ✦ Protein chemical structures: 500 to 10000 nodes,

very sparse (max degree 4) ✦ Protein contact maps: 150 to 800 nodes, but

denser (average degree 20) ✦ Protein interaction networks:

✦ Yeast: ~3000 nodes, ~12000 edges ✦ Human: ~4600 nodes, ~86000 edges

31

Page 32: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

Examples: social networks✦ Social circles in Facebook (ego-Facebook,

2012) ✦ ~4000 nodes representing (anonymized) Fb

users, with ~90000 undirected edges representing friendship

✦ Wikipedia Admin elections (wiki-Vote, 2010) ✦ ~7000 nodes representing (anonymized)

Wikipedia users, with ~100000 directed edges representing votes for admin role

32

Page 33: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

Sequential approach

✦ Several algorithms have been proposed in the 2010s, that try to improve over 2nd generation matching algorithms using better heuristics (making assumption on the graph properties)

33

Page 34: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

RI

✦ Bonnici and Giugno proposed RI, a subgraph isomorphism algorithm suited to large and sparse graphs in bioinformatics • Bonnici, V., Giugno, R.: On the variable ordering

in subgraph isomorphism algorithms. IEEE/ACM Trans. Comput. Biol. Bioinform. (2016)

34

Page 35: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

VF3

✦ We proposed VF3, an evolution of VF2 better suited to large and dense graphs ✦ V. Carletti, P. Foggia, A. Saggese and M. Vento,

Challenging the Time Complexity of Exact Subgraph Isomorphism for Huge and Dense Graphs with VF3, in IEEE Trans. on PAMI, vol. 40, no. 4, pp. 804-818, 2018

35

Page 36: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

Node reordering

✦ Both RI and VF3 use the idea of reordering the nodes before starting the matching so as to detect as early as possible if a tentative matching is going to fail • The ordering criteria are different

• VF3 also has additional heuristics for dense graphs

36

Page 37: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

Node reordering✦ Starting here will take several steps

before discovering that the small graph is not contained in the large one

37

1

2

3

4 5

a

b

c

d g

e f

h

Page 38: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

Node reordering✦ Starting here will take several steps

before discovering that the small graph is not contained in the large one

38

1

2

3

4 5

a

b

c

d g

e f

h

Page 39: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

Node reordering✦ Starting here will take several steps

before discovering that the small graph is not contained in the large one

39

1

2

3

4 5

a

b

c

d g

e f

h

Page 40: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

Node reordering✦ Starting here will take several steps

before discovering that the small graph is not contained in the large one

40

1

2

3

4 5

a

b

c

d g

e f

h

Page 41: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

Node reordering✦ Starting here will take several steps

before discovering that the small graph is not contained in the large one

41

1

2

3

4 5

a

b

c

d g

e f

h

Page 42: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

Node reordering✦ Instead, starting here will immediately

discover that no node can be associated to “3”

42

1

2

3

4 5

a

b

c

d g

e f

h

Page 43: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

RI and VF3 performance

43

Protein structures

Page 44: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

RI and VF3 performance

44

Dense random graphs

Page 45: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

TurboISO

✦ Avoids exploring equivalent permutations of nodes by using auxiliary data structures • Wook-Shin Han, Jinsoo Lee, and Jeong-Hoon Lee,

Turboiso: towards ultrafast and robust subgraph isomorphism search in large graph databases. In Proc. SIGMOD ’13, pp. 337-348, 2013.

45

Page 46: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

TurboISO✦ Can be very efficient in matching very

small subgraphs with very large graphs

46

Human Protein Interaction Network (~4600 nodes)

Page 47: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

Parallel approach

✦ Given the wide availability of multiprocessor systems, several works have proposed parallelization as a mean for reducing the matching time ✦ Usually using a shared memory architecture

47

Page 48: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

VF3P

✦ We have developed a parallel version of VF3, called VF3P (presented at GbR2019) ✦ We cannot divide the graph among the processors

(data parallelism) ✦ So we divide the parts of the solution space

(using a State-Space-Representation) to be explored (task parallelism) ✦ Finding the right granularity ✦ Reducing the communication overhead

48

Page 49: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

VF3P✦ Our preliminary results show a speedup that

is close to the number of processors when the graphs get large

49

Page 50: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

Future challenges

✦ The algorithms we have seen allow us to work with graphs with tens of thousands nodes

✦ Still, we have new applications yielding much larger graphs

50

Page 51: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

Future challenges

✦ The Stanford Large Networks dataset collections (SNAP, http://snap.stanford.edu) includes very large graphs from social networks ✦ LiveJournal members: almost 5 millions nodes, 69

millions edges ✦ Friendster on line gaming: ~65 millions nodes, 1.8

billions edges

51

Page 52: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

Future challenges✦ A new trend in bioinformatics is the use

of graph representations for genomic data ✦ Genome graph: represents the genetic information

of a population, describing the common parts and the individual variations as paths within the graph

✦ A genome graph for 1000 human subjects: about 100 millions nodes, 3-400 millions edges

52

Page 53: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

Graph summarization

✦ One approach for working with such large graphs is Graph Summarization ✦ Convert groups of nodes into nodes for obtaining a

smaller graph with the same overall structure

53

3

1

2

54

6

7

9 8

456

123

789

Page 54: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

Graph summarization✦ A recent paper proposes a nice theoretical

result about summarization ✦ M. Pelillo, I. Elezi, M. Fiorucci, Revealing structure in

large graphs: Szemerédi’s regularity lemma and its use in pattern recognition, Pattern Recognition Letters, Volume 87, 2017, pp. 4-11

✦ Basically, the RL says that every graph large enough and dense enough is partitionable into a bounded number of regular bipartite graphs plus a small number of extra nodes and edges

✦ Does it work on practical applications? Can it be used to implement some form of hierarchical matching?

54

Page 55: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

Subquadratic algorithms

✦ Another way to tackle these huge graphs is to replace information that is expensive to compute with approximations that can be computed in less than O(n2) time

✦ Example: in Social Network Analysis, cliques can be replaced by k-cores as a way of finding groups of nodes with strong internal connections

55

Page 56: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

Subquadratic algorithms✦ k-Cores can be computed in a O(k*n)

time

56

✦ can we do something similar for (some form of) graph matching?

Page 57: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

Distributed computing

✦ Shared-memory parallelism has inherent limits on the scalability it can achieve

✦ Big Data applications often use other architectures, with distributed computing ✦ Example: the Map/Reduce programming model

made popular by Google

57

Page 58: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

Spark/GraphX✦ The Apache distributed computing framework

Spark includes an extension named GraphX offering APIs for implementing massively parallel algorithms on graph ✦ the programming model can be seen as an adaptation

to graphs of Map/Reduce ✦ Facebook uses a similar approach for running some

graph-based algorithms (e.g. for page ranking) ✦ Can graph matching (or a suitable surrogate of it) be

adapted to such a paradigm?

58

Page 59: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

GPU algorithms

✦ GPU accels make available tens to hundreds of teraflops at an affordable price ✦ we don’t have yet graph matching algorithms able to

fully exploit this huge computational power (GPUs are best suited for data parallelism, while most parallel matching algorithms use task parallelism)

✦ Can graph matching be formulated in a data-parallel way?

59

Page 60: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

Neural networks

✦ Deep neural network can approximate very complicated functions (e.g. object recognition). And what’s best is that they learn how to do it! ✦ with tensor GPU accelerators they can be quite fast ✦ Can we use deep learning to learn how to match

graphs (even approximately)?

60

Page 61: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

Neural networks

✦ Can we learn a graph similarity measure? ✦ Given two graphs, the network should output an

approximation of their edit distance ✦ Most graph NN process one graph at a time

61

Page 62: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

Neural networks

✦ Can we learn how to search a pattern subgraph inside a larger target graph? ✦ YOLO detects and classifies smaller objects inside a

large image…

62

Page 63: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

Neural networks

✦ Can we learn a useful compact representation for graphs? ✦ a sort of graph autoencoder ✦ this representation can be used as the starting point

for other operations (e.g. graph similarity) ✦ Several graph NN have already been proposed for

node embedding

63

Page 64: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…

Conclusions

✦ In the last three decades we have increased the size of the graphs in our applications from a few nodes to tens of thousands of nodes and more

✦ We are now at a point where we cannot just evolve our algorithms, if we want to face the challenge coming from the huge graphs of Bioinformatics and Social Network Analysis

64

Page 65: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…65

th

a

n k

q u

e

y o

u s

t i o

s

n ?

Page 66: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…66

th

a

n k

q u

e

y o

u s

t i o

s

n ?

Page 67: Graph algorithms for very large graphs and their applications to …sailab.diism.unisi.it/.../uploads/2019/07/FoggiaGraphs.pdf · 2019-07-29 · P. Foggia - Graph algorithms for very

P. Foggia - Graph algorithms for very large graphs and their applications…67

th

a

n k

q u

e

y o

u s

t i o

s

n ?