Complex Networks and their Analysis with Random Walks Daniel R. Figueiredo Federal University of Rio de Janeiro, Brazil Konstantin Avrachenkov INRIA Sophia Antipolis, France ITC 27 – Tutorial – Ghent, Belgium


Complex Networks and their Analysis with Random Walks

Daniel R. Figueiredo
Federal University of Rio de Janeiro, Brazil

Konstantin Avrachenkov
INRIA Sophia Antipolis, France

ITC 27 – Tutorial – Ghent, Belgium

ITC 2015

What is Complex Networks About?

Understand how and why things are connected, and the consequences of connectedness

“Connected things” → Networks

“How, why, consequences” → Complex

Network connectedness takes center stage
ᴏ necessary to understand phenomena we are yet to understand

The network is not complex!


Objectives and Organization

First contact
ᴏ “networks everywhere”

Empirical findings of networks
ᴏ important commonalities

Mathematical models for networks

Introduction to random walks
ᴏ simple yet profound

Applications of random walks
ᴏ ranking, sampling, clustering, etc.

Speakers and durations: Daniel (1h + 1h), Kostia (.5h + 1.5h), with a break


Networks

What is a network?

Set of points connected by line segments

[Figure: example network with nodes a, b, c, d, e, f]

Bureaucratic definition!


Networks, another definition

Abstraction to encode relationships among pairs of objects

objects → network nodes

relationships → network edges

Mathematical abstraction to represent structures

Networks aka graphs (or vice-versa)


Social Networks

Objects: people

Relationship: be connected on Facebook

[Figure: network over Carlos, Marcos, Maria, Bruno, Ana, Rodrigo, Carol, Pedro]

Another relationship: have kissed (in real life!)

[Figure: a different network over the same eight people]

Different relationships encoded on same set of objects


Collaboration Networks

Objects: professors

Relationship: co-authorship of papers

[Figure: co-authorship network, PESC/COPPE, DBLP 2009]

Importance of structure?


Sexual Contact Networks

Objects: people

Relationship: sexual contact

Importance of structure?


Web Network

Information network (webpage is unit of info)

Asymmetric relationship: hyperlinks

20+ billion webpages (worldwidewebsize.com)

Importance of structure?


Internet

Network of networks (AS level graph, AS = autonomous system)

40+ thousand nodes (AS)

Importance of structure?


Neural Network

C. elegans (roundworm)

Simple nervous system

Neural network
ᴏ 302 neurons
ᴏ fully mapped in 70s
ᴏ Nobel Prize in 2002

Importance of structure?

openworm.org


Protein Interaction Network

Proteins of virus EBV (circles) and proteins of humans (squares)

Importance of structure?


Talking About Networks

How to describe a network?

Draw it! “A picture is worth a thousand words”
ᴏ what if network is large, 10K+ nodes?

Adjacency matrix: simple structural description
ᴏ $a_{ij} = 1$, if nodes i and j are related
ᴏ $a_{ij} = 0$, otherwise

Encodes all relationships

Generalizes to asymmetric and weighted relationships


Adjacency Matrix

Example

[Figure: network with nodes 1, 2, 3, 4]

      1  2  3  4
  1   0  1  1  0
  2   1  0  1  0
  3   1  1  0  1
  4   0  0  1  0

?
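As a minimal sketch of the definition, the matrix in the example can be rebuilt from an edge list. The edge list below is read off the matrix entries (1-2, 1-3, 2-3, 3-4); the function name `adjacency_matrix` is an illustrative choice, not something from the tutorial.

```python
# Edges read off the 4-node example matrix (undirected, unweighted).
edges = [(1, 2), (1, 3), (2, 3), (3, 4)]
n = 4


def adjacency_matrix(n, edges):
    """Return an n x n 0/1 matrix for an undirected graph on nodes 1..n."""
    a = [[0] * n for _ in range(n)]
    for i, j in edges:
        a[i - 1][j - 1] = 1  # a_ij = 1 if i and j are related
        a[j - 1][i - 1] = 1  # symmetric relationship, so a_ji = 1 too
    return a


A = adjacency_matrix(n, edges)
for row in A:
    print(*row)
```

For an asymmetric relationship one would drop the second assignment, and for a weighted one store the weight instead of 1.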


Adjacency Matrix

Problem with describing network with adjacency matrix?

Too much information, too little intuition!
ᴏ e.g., adjacency matrix of web

Adjacency matrix is DNA of network
ᴏ do you describe people by their DNA?

Structural Summaries!

Understand structure at an intuitive level, capturing its essence


Network Characteristics

Characteristics provide structural information on network
ᴏ size, density, degrees, distances, clustering, centrality, homophily, etc.

Which are the important characteristics?

Depends on object of study
ᴏ like genes in DNA, we are yet to fully know their meaning


Important Characteristics

Two properties that make a structural characteristic important

1) predict behavior of particular process

2) influence various processes

Best example: network degree distribution

1) determines random walk behavior

2) influences search and epidemics

Yet to discover powerful characteristics of networks
ᴏ ex. eigenvalue of adjacency matrix


Nodes and Edges

Number of nodes in a network
ᴏ n = |V| (cardinality of set of nodes)

Number of edges (present relationships)
ᴏ m = |E|

Largest number of edges a network can have?
ᴏ number of unordered pairs in a set of n objects
ᴏ every node is connected to every other node

$\binom{n}{2} = \frac{n(n-1)}{2} \le n^2$


Average Degree and Density

Degree: number of edges incident on node
ᴏ for asymmetric relationships, in-degree different from out-degree
ᴏ $d_i$: degree of node i

Average degree

$g = \frac{1}{n} \sum_{i \in V} d_i$

Note that every edge has two end points

$\sum_{i \in V} d_i = 2m \quad\Rightarrow\quad g = \frac{2m}{n}$

Density: fraction of edges present in network

$\rho = \frac{m}{\binom{n}{2}} = \frac{2m}{n(n-1)} = \frac{g}{n-1}$
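The identities on this slide can be checked numerically. The sketch below assumes the 4-node graph from the adjacency matrix example (edges 1-2, 1-3, 2-3, 3-4); variable names are illustrative.

```python
from math import comb

# Assumed input: the 4-node example graph as an undirected edge list.
edges = [(1, 2), (1, 3), (2, 3), (3, 4)]
n = 4
m = len(edges)

# Degree of a node: number of edges incident on it.
degree = {v: 0 for v in range(1, n + 1)}
for i, j in edges:
    degree[i] += 1
    degree[j] += 1

g = sum(degree.values()) / n   # average degree, equals 2m / n
density = m / comb(n, 2)       # fraction of possible edges that are present

print("degrees:", degree)
print("average degree:", g)
print("density:", density)
```

Because every edge contributes to two degrees, the degree sum is 2m, and the density also equals g / (n - 1).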


Degree Distribution

Empirical distribution of node degree
ᴏ fraction of nodes with degree k

$p_k = \frac{\text{number of nodes with degree } k}{\text{total number of nodes}}, \quad k = 0, 1, \ldots, n-1$

Empirical CCDF – Complementary Cumulative Distribution Function
ᴏ fraction of nodes with degree greater than k

$P[D > k] = 1 - \sum_{i=0}^{k} p_i$

Let D be a random variable representing node degree (node chosen uniformly at random)

$P[D = k] = p_k$
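A small sketch of both empirical quantities, assuming the degree sequence (2, 2, 3, 1) of the 4-node graph from the adjacency matrix example. The list names `p` and `ccdf` are illustrative.

```python
from collections import Counter

# Assumed input: degrees of nodes 1..4 in the 4-node example graph.
degrees = [2, 2, 3, 1]
n = len(degrees)

counts = Counter(degrees)
# p[k] = fraction of nodes with degree k, for k = 0, 1, ..., n-1
p = [counts.get(k, 0) / n for k in range(n)]

# ccdf[k] = fraction of nodes with degree greater than k = 1 - sum_{i<=k} p[i]
ccdf = []
cumulative = 0.0
for k in range(n):
    cumulative += p[k]
    ccdf.append(1.0 - cumulative)

for k in range(n):
    print(k, p[k], ccdf[k])
```

The CCDF is often the more convenient of the two to plot for heavy-tailed degree sequences, since it is monotone and less noisy in the tail.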


Example

Empirical degree distribution?

[Figure: example network with node degrees labeled]

[Plot: PDF, P[D = k] versus degree k]

[Plot: CCDF, P[D > k] versus degree k]


Distances

Path: sequence of edges on the network

Distance: length of shortest path between pair of nodes

[Figure: example network with nodes 1 to 7]

d(1,7) = ?

d(5,2) = ?

note: there can be (many) more than 1 shortest path

Average distance of network
ᴏ across all (unordered) pairs

$\bar{d} = \frac{\sum_{i,j \in V,\ i<j} d(i,j)}{\binom{n}{2}}$
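Distances in an unweighted network can be computed with breadth-first search. The sketch below is one standard way to do it, not necessarily the method the authors had in mind; it assumes a connected graph and uses the 4-node graph from the adjacency matrix example, since the 7-node figure is not recoverable from the transcript.

```python
from collections import deque
from itertools import combinations

# Assumed input: adjacency lists of the 4-node example graph.
adj = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3]}


def bfs_distances(adj, source):
    """Shortest-path length (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:          # first visit gives the shortest distance
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


dist = {s: bfs_distances(adj, s) for s in adj}

# Average over all unordered pairs; assumes the graph is connected.
pairs = list(combinations(sorted(adj), 2))
avg_distance = sum(dist[i][j] for i, j in pairs) / len(pairs)

print("d(1,4) =", dist[1][4])
print("average distance =", avg_distance)
```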


Clustering Coefficient

Local transitivity
ᴏ (i,j) neighbors, (j,k) neighbors → (i,k) neighbors?

Local transitivity forms triangles

Two popular ways to measure it

1) Local measure: Fraction of edges among neighbors of a node

2) Global measure: Fraction of two hop paths that are triangles

Related but not identical for most networks
ᴏ global measure preferred
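The global measure can be sketched by enumerating two hop paths directly. This assumes the 4-node graph from the adjacency matrix example and counts each two hop path j - i - k once per center node i; it is a brute-force illustration, not an efficient algorithm for large networks.

```python
from itertools import combinations

# Assumed input: neighbor sets of the 4-node example graph.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}

two_hop_paths = 0  # paths j - i - k with center i and j != k
closed_paths = 0   # two hop paths whose end points j and k are also adjacent

for i, neighbors in adj.items():
    for j, k in combinations(sorted(neighbors), 2):
        two_hop_paths += 1
        if k in adj[j]:
            closed_paths += 1

# Each triangle closes three two hop paths (one per center node).
transitivity = closed_paths / two_hop_paths

print(two_hop_paths, closed_paths, transitivity)
```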


Local Clustering

Defined for each network node

Fraction of edges among neighbors of node

$C_i = \frac{E_i}{\binom{d_i}{2}} = \frac{2 E_i}{d_i (d_i - 1)}$

ᴏ $E_i$: number of edges among neighbors of i
ᴏ $\binom{d_i}{2}$: max number of edges among neighbors of a node with degree $d_i$

Network clustering: average clustering across all nodes

$C = \frac{1}{n} \sum_{i \in V} C_i$
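A minimal sketch of the local measure and its network average, assuming the 4-node graph from the adjacency matrix example. The slide does not say what to do for nodes with degree below 2, where the formula is undefined; setting their coefficient to 0 is an assumption made here.

```python
from itertools import combinations

# Assumed input: neighbor sets of the 4-node example graph.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}


def local_clustering(adj, i):
    """C_i = 2 E_i / (d_i (d_i - 1)), with C_i = 0 when d_i < 2 (assumption)."""
    d = len(adj[i])
    if d < 2:
        return 0.0
    # E_i: number of edges among the neighbors of i
    e_i = sum(1 for j, k in combinations(adj[i], 2) if k in adj[j])
    return 2 * e_i / (d * (d - 1))


C = {i: local_clustering(adj, i) for i in adj}
network_clustering = sum(C.values()) / len(adj)

print(C)
print("network clustering:", network_clustering)
```

On this graph the average of the local values (7/12) differs from the fraction of two hop paths that are closed (3/5), which illustrates that the two measures are related but not identical.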


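Example

$C_i = \frac{E_i}{\binom{d_i}{2}}$

ᴏ $E_i$: number of edges among neighbors of i
ᴏ $\binom{d_i}{2}$: max number of edges among neighbors of a node with degree $d_i$

[Figure: example network with nodes 1 to 7]

C1 = ?

C4 = ?

C5 = ?

$C = \frac{1}{n} \sum_{i \in V} C_i$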


Real Networks


Three Important Aspects

“Small world” effect

“My friends are also friends” effect

“Not born equal” effect

Fundamental and present in various real networks
ᴏ only recently observed (birth of network science)


“Small world” Effect

“It's a small world, after all”

Average distance is very small

Even on very large networks
ᴏ Web graph – 10^8 nodes: 7.5
ᴏ Actor collab net – 10^6 nodes: 3
ᴏ Facebook friendship – 10^9 nodes: 4.5

“Six degrees of separation”
ᴏ distance between two people in the world

Popularized by Milgram's experiment in 60's

Principle extends beyond social networks


“My friends are also friends” Effect

If A has relationship with B and C, then B and C have large chance of also being related

Two hop paths tend to become triangles on networks

Induces high clustering coefficient
ᴏ AS graph – 10^4 nodes: 0.39 (randomly paired: 0.00056)
ᴏ Facebook – 10^9 nodes: 0.14 (randomly paired: 0.00000026)
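A rough order-of-magnitude check on the “randomly paired” values, under the assumption (not stated on the slide) that random pairing behaves like a uniform random graph with the same number of nodes n and average degree g. In that case two neighbors of a node are linked with probability close to the density, so the clustering is approximately g/(n−1). Using the AS graph figures quoted in this tutorial (n on the order of 10^4, average degree 5.9):

```latex
C_{\text{random}} \approx \rho = \frac{g}{n-1} \approx \frac{5.9}{10^{4}} \approx 5.9 \times 10^{-4}
```

This is the same order of magnitude as the 0.00056 on the slide and far below the observed 0.39.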


“Not born equal” Effect

Node degrees are very unequal
ᴏ most very small, few extremely large
ᴏ similar to distribution of wealth in some countries (like Brazil)

Heavy tailed degree distribution

AS graph – 10^4 nodes
ᴏ avg degree: 5.9
ᴏ max degree: ~2100

Citation network – 10^6 nodes
ᴏ avg degree: 8.6
ᴏ max degree: ~9000


The Puzzle

Fact I: Various networks have peculiar structural properties
ᴏ degree distribution, distances, clustering, etc.

Fact II: Various networks have similar structural properties
ᴏ Web, Facebook, neural network, citation, etc.

“Million dollar question”

Why?

By and large, still searching for a satisfactory answer!