
DYNAMICS OF COMPLEX SYSTEMS: Self-similar phenomena and Networks

Guido Caldarelli, CNR-INFM Istituto dei Sistemi Complessi


4/6

1. SELF-SIMILARITY (ORIGIN AND NATURE OF POWER-LAWS)

2. GRAPH THEORY AND DATA

3. SOCIAL AND FINANCIAL NETWORKS

4. MODELS

5. INFORMATION TECHNOLOGY

6. BIOLOGY

•STRUCTURE OF THE COURSE

•STRUCTURE OF THE FOURTH LECTURE

4.1) DEFINITION OF THE MODELS

4.2) RANDOM GRAPHS

4.3) SMALL WORLD

4.4) MULTIPLICATIVE PROCESSES

4.5) BARABASI-ALBERT

4.6) REWIRING

4.7) FITNESS

Standard Theory of Random Graphs (Erdős and Rényi 1960)

Random Graphs are built starting from N vertices: every pair of vertices is connected by an edge with probability p.

Degrees are Poisson distributed:

$$P(k) = \frac{e^{-pN}(pN)^k}{k!}$$

[Figure: degree distribution P(k) versus k]

Small World (D. Watts and S.H. Strogatz 1998)

Small World Graphs are built by adding shortcuts to regular lattices.

Degrees are peaked around the mean value.

•4.1 MODELS DEFINITIONS

“Intrinsic” Fitness/Static/Hidden variable Models (K.-I. Goh, B. Kahng, D. Kim 2001; G. Caldarelli, A. Capocci, P. De Los Rios, M.A. Muñoz 2002)

1) Growth or not: nodes can be fixed at the beginning or be added over time.

2) Attachment is related to intrinsic properties: the probability of being connected depends on the vertices themselves.

Degrees are power-law distributed.

Model of Growing Networks (A.-L. Barabási and R. Albert 1999)

1) Growth: at every time step new nodes enter the system.

2) Preferential Attachment: the probability of being connected depends on the degree, $\Pi(k) \propto k$.

Degrees are power-law distributed.

•4.1 MODELS DEFINITIONS

•The number m of edges in a Random Graph is a random variable whose expectation value is

$$E(m) = p\,\frac{N(N-1)}{2}$$

•The probability to form a particular Graph G(N,m) is given by

$$P(G(N,m)) = p^m (1-p)^{\frac{N(N-1)}{2}-m}$$

•The degree has expectation value

$$E(k) = \frac{2E(m)}{N} = p(N-1) \simeq pN$$

•It is easy to check that the degree probability distribution is given by

$$P(k) = \binom{N-1}{k} p^k (1-p)^{N-1-k} \simeq \frac{(pN)^k e^{-pN}}{k!}$$
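The Poisson approximation can be checked numerically. The following is a minimal sketch, not part of the original lecture: the function names `erdos_renyi_degrees` and `poisson_pmf` and the parameter values are illustrative assumptions. It samples one graph by linking every pair with probability p and prints the measured degree frequencies next to the Poisson values.

```python
import math
import random
from collections import Counter

def erdos_renyi_degrees(n, p, seed=0):
    """Degrees of one random graph sample: each pair is linked with probability p."""
    rng = random.Random(seed)
    degree = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                degree[i] += 1
                degree[j] += 1
    return degree

def poisson_pmf(k, mean):
    """Poisson approximation e^{-pN} (pN)^k / k! of the degree distribution."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

if __name__ == "__main__":
    n, p = 1000, 0.004          # illustrative values, so that pN = 4
    hist = Counter(erdos_renyi_degrees(n, p))
    for k in range(10):
        print(k, hist[k] / n, round(poisson_pmf(k, p * n), 4))
```

The double loop is quadratic in N, which is acceptable for a demonstration but not for large graphs.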

•4.2 RANDOM GRAPHS

•We can give an estimate of the Clustering Coefficient: for a complete graph it must be 1. If the graph is sparse enough, two neighbours of a vertex are linked to each other with probability p, so that

$$E(C) = p \simeq \frac{\langle k \rangle}{N}$$

•A similar estimate can be given for the average distance l between two vertices. If a graph has average degree <k>, then a vertex has about <k> first neighbours, ~ <k>^2 second neighbours, ..., ~ <k>^n n-th neighbours.

•For the Diameter D, <k>^D must be of order N, so that

$$l \simeq D \simeq \frac{\log(N)}{\log(\langle k \rangle)}$$
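As a worked example of the two estimates above, the sketch below evaluates them for a few graph sizes. The function names and the choice <k> = 10 are assumptions made only for illustration.

```python
import math

def clustering_estimate(mean_degree, n_vertices):
    """E(C) ~ <k>/N for a sparse random graph."""
    return mean_degree / n_vertices

def distance_estimate(mean_degree, n_vertices):
    """l ~ D ~ log(N)/log(<k>), obtained from <k>^D ~ N."""
    return math.log(n_vertices) / math.log(mean_degree)

if __name__ == "__main__":
    for n in (10**3, 10**6, 10**9):
        print(n, clustering_estimate(10, n), round(distance_estimate(10, n), 2))
```

The distance grows only logarithmically with N, while the clustering estimate vanishes as 1/N.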

•4.2 RANDOM GRAPHS

Take a regular lattice and rewire some of the links with probability φ (for analytical treatment, a slight modification is recommended: instead of rewiring, add new links in proportion to the existing links).

For a ring of L sites, each linked to its two nearest neighbours, the total number of shortcuts is

$$\varphi L$$

The average degree is now

$$\langle k \rangle = 2(1+\varphi)$$

Therefore for small φ the degree distribution is peaked around 2.

•4.3 SMALL WORLD

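Clustering Coefficient of the regular lattice (shortcut probability φ → 0 and <k> < 2/3 N, otherwise C = 1):

$$C = \frac{3(\langle k \rangle - 2)}{4(\langle k \rangle - 1)}$$

For the average distance there is no analytical result, but we can define a characteristic distance in the problem, given by the mean distance between two shortcut endpoints.

In the regular lattice (start with c=1 and generalize; here N = L is the number of lattice sites) we have

$$l_i = \frac{1}{N}\sum_{j=1,L} d_{i,j} = \frac{2}{N}\left(0+1+2+\dots+\frac{N}{2}\right) = \frac{2}{N}\left(\frac{N^2}{8}+\frac{N}{4}\right) \to \frac{N}{4}$$

In the Random Graph we have

$$l \simeq \frac{\log(N)}{\log(\langle k \rangle)}$$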
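The lattice result l → N/4 and the effect of shortcuts can be checked with a breadth-first search. This is a minimal sketch under stated assumptions: a ring with c=1, one shortcut trial per lattice link with probability `phi`, and hypothetical helper names `ring_with_shortcuts` and `mean_distance_from`.

```python
import random
from collections import deque

def ring_with_shortcuts(size, phi, seed=0):
    """Ring of `size` sites (c=1) plus random shortcuts, one trial per lattice link."""
    rng = random.Random(seed)
    adj = [set() for _ in range(size)]
    for i in range(size):
        j = (i + 1) % size
        adj[i].add(j)
        adj[j].add(i)
    for _ in range(size):
        if rng.random() < phi:
            a, b = rng.randrange(size), rng.randrange(size)
            if a != b:
                adj[a].add(b)
                adj[b].add(a)
    return adj

def mean_distance_from(adj, source=0):
    """(1/N) * sum over j of d(source, j), computed by breadth-first search."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return sum(dist.values()) / len(adj)

if __name__ == "__main__":
    for phi in (0.0, 0.01, 0.1, 0.5):
        print(phi, mean_distance_from(ring_with_shortcuts(1000, phi)))
```

With `phi = 0` the function returns exactly N/4 for even N; adding shortcuts can only shorten distances.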

•4.3 SMALL WORLD

Now in Small World graphs, the behaviour must be intermediate between the regular lattice and the Random Graph. We define a characteristic length in the system, for example ξ = average distance between two endpoints of shortcuts (not of the same shortcut!):

$$\xi = \frac{L}{2(\varphi L)} = \frac{1}{2\varphi}$$

where φ is the shortcut probability, so that ξ diverges when φ → 0.

ξ is the characteristic distance we can define in the model, so we make the ansatz

$$l = L\,G(L/\xi) = L\,G(2\varphi L)$$

$$G(x) \sim \begin{cases} 1 & x \ll 1 \\ \dfrac{\log(x)}{x} & x \gg 1 \end{cases}$$

Several conjectures have been made, but neither the actual distribution of path lengths nor <l> has been found.

•4.3 SMALL WORLD

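In a multiplicative process you have S(t) = η(t) S(t−1) = η(t) η(t−1) … η(1) S(0), where the η(t) are random multipliers.

By the central limit theorem, ln S(t) is a sum of random terms, so it is normally distributed (with mean μ and variance σ²) and S is log-normal:

$$f(S) = \frac{1}{\sqrt{2\pi}\,\sigma S}\, e^{-\frac{(\ln(S)-\mu)^2}{2\sigma^2}}$$

$$\ln(f(S)) = -\ln(S) - \ln(\sqrt{2\pi}\,\sigma) - \frac{(\ln(S)-\mu)^2}{2\sigma^2}$$

$$\ln(f(S)) = -\frac{(\ln(S))^2}{2\sigma^2} + \left(\frac{\mu}{\sigma^2}-1\right)\ln(S) - \ln(\sqrt{2\pi}\,\sigma) - \frac{\mu^2}{2\sigma^2}$$

If the variance σ² is large, the quadratic term is negligible over a wide range of S and f(S) can look like a power law with apparent slope μ/σ² − 1.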
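A short simulation illustrates why ln S(t) becomes normal. This is only a sketch: the uniform multiplier on [0.5, 1.5] is an assumed choice, not taken from the lecture, and the function names are hypothetical.

```python
import math
import random
import statistics

def multiplicative_process(steps, rng, s0=1.0):
    """Return S(t) after `steps` random multiplications, starting from S(0) = s0."""
    s = s0
    for _ in range(steps):
        s *= rng.uniform(0.5, 1.5)   # assumed multiplier distribution
    return s

def log_samples(n_samples, steps, seed=0):
    """Sample ln S(t) over many independent realizations."""
    rng = random.Random(seed)
    return [math.log(multiplicative_process(steps, rng)) for _ in range(n_samples)]

def apparent_slope(mu, sigma):
    """Coefficient of ln(S) in ln f(S): mu / sigma^2 - 1."""
    return mu / sigma ** 2 - 1

if __name__ == "__main__":
    logs = log_samples(5000, 100)
    mu, sigma = statistics.mean(logs), statistics.stdev(logs)
    print("mean of ln S:", mu, "std of ln S:", sigma)
    print("apparent slope:", apparent_slope(mu, sigma))
```

The mean and variance of ln S grow linearly with the number of steps, so long processes produce broad log-normal distributions.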

•4.4 MULTIPLICATIVE PROCESS

In this case the apparent slope (μ/σ² − 1) of the blue line is 0.6.

If there is a threshold on S, you obtain real power laws instead.

M. Mitzenmacher, Internet Mathematics 1 226 (2004)

•4.4 MULTIPLICATIVE PROCESS

This is by far the most successful and widely used model in the field, along with the related models

• FITNESS MODEL
• REWIRING MODEL
• AGING EFFECTS

TWO STEPS
1. GROWTH: Every time step you add a vertex
2. PREFERENTIAL ATTACHMENT:

$$\Pi(k_i) = \frac{k_i}{\sum_{j=1,N} k_j}$$

This idea has been reformulated in different fields and has different names

•YULE PROCESS (G. Yule, Phil. Trans. Roy. Soc. 213 21 (1925))
•SIMON PROCESS (H.A. Simon, Biometrika 42 425 (1955))
•DE SOLLA PRICE MODEL (D.J. De Solla Price, Science 149 510 (1965))
•ST. MATTHEW EFFECT (R.K. Merton, Science 159 56 (1968))

•4.5 BARABASI-ALBERT

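The basic approach is through continuum theory; degree is now a continuum variable (R. Albert, A.-L. Barabási, Reviews of Modern Physics 74 47 (2002)).

Start with m0 vertices and at every time step t add a vertex with m new links:

$$\frac{\partial k_i}{\partial t} = m\,\Pi(k_i) = m\,\frac{k_i}{\sum_{j=1,N} k_j} = \frac{m k_i}{2mt} = \frac{k_i}{2t} \;\Rightarrow\; k_i(t) = m\left(\frac{t}{t_i}\right)^{\beta}, \quad \beta = \frac{1}{2}$$

As for the degree distribution, we can compute P(k_i < k):

$$P(k_i(t) < k) = P\left(t_i > \frac{m^{1/\beta}\, t}{k^{1/\beta}}\right)$$

The distribution of incoming vertices is uniform in time:

$$P(t_i) = \frac{1}{m_0 + t}$$

$$P(k_i(t) < k) = 1 - P\left(t_i \le \frac{m^{1/\beta}\, t}{k^{1/\beta}}\right) = 1 - \frac{m^{1/\beta}\, t}{k^{1/\beta}(t + m_0)}$$

From which we obtain

$$P(k) = \frac{\partial P(k_i(t) < k)}{\partial k} = \frac{2 m^{1/\beta}\, t}{m_0 + t}\,\frac{1}{k^{1/\beta + 1}} \;\Rightarrow\; \gamma = \frac{1}{\beta} + 1 = 3$$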
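The growth rule can be simulated directly. The sketch below is one common way to implement preferential attachment, not necessarily the one used for the lecture: it keeps a list in which each vertex appears once per unit of degree, so a uniform draw from the list is a draw proportional to degree. The seed graph (a complete graph on m+1 vertices) and the name `barabasi_albert` are assumptions.

```python
import random

def barabasi_albert(n, m, seed=0):
    """Degrees after growing to n vertices, each new vertex bringing m links."""
    rng = random.Random(seed)
    # Assumed seed graph: m + 1 fully connected vertices, each of degree m.
    degree = [m] * (m + 1)
    stubs = []                      # vertex i appears degree[i] times
    for i in range(m + 1):
        stubs.extend([i] * m)
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:     # m distinct targets, chosen proportionally to degree
            targets.add(rng.choice(stubs))
        degree.append(m)
        for t in targets:
            degree[t] += 1
            stubs.extend([t, new])
    return degree

if __name__ == "__main__":
    n, m = 20000, 2
    degree = barabasi_albert(n, m)
    # Continuum theory predicts k_i(t) = m * sqrt(t / t_i) for a vertex born at t_i.
    for t_i in (10, 100, 1000):
        print(t_i, degree[t_i], round(m * (n / t_i) ** 0.5, 1))
```

Single vertices fluctuate strongly around the continuum prediction, so the comparison is only meaningful on average over many runs or many vertices of similar age.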


The value of the exponent depends on details of preferential attachment

•If Π(k) ~ k^α with α ≠ 1: NO POWER LAW
•If Π(k) ~ k + a: γ = 3 + a/m

Clustering is larger than in Erdős-Rényi graphs (m>1…)

No clear Assortative/Disassortative behaviour


TWO STEPS
1. GROWTH: Every time step you add a vertex
2. PREFERENTIAL ATTACHMENT:

$$\Pi(k_i) = \frac{\eta_i k_i}{\sum_{j=1,N} \eta_j k_j}$$

But now vertices differ: some are good, some are bad. You measure that by assigning a “fitness” η_i to each vertex (G. Bianconi, A.-L. Barabási, Europhys. Lett. 54 436 (2001)).

For some choices of the distribution of fitnesses you still have a power-law degree distribution, and also assortativity. Great success in reproducing the Internet (A. Vazquez, R. Pastor-Satorras, A. Vespignani, Phys. Rev. E 65 066130 (2002)).
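The fitness-weighted attachment rule can be sketched as follows. The uniform fitness distribution, the seed graph and the name `fitness_growth` are assumptions chosen for illustration; the only ingredient taken from the slide is that the attachment weight of vertex i is its fitness times its degree.

```python
import random

def fitness_growth(n, m, seed=0):
    """Grow a network where vertex i is chosen with weight fitness[i] * degree[i]."""
    rng = random.Random(seed)
    fitness = [rng.random() for _ in range(n)]   # assumed uniform fitness in [0, 1)
    degree = [m] * (m + 1)                       # assumed seed: complete graph on m + 1 vertices
    for new in range(m + 1, n):
        weights = [fitness[i] * degree[i] for i in range(new)]
        targets = set()
        while len(targets) < m:                  # m distinct targets
            targets.add(rng.choices(range(new), weights=weights)[0])
        for t in targets:
            degree[t] += 1
        degree.append(m)
    return fitness, degree

if __name__ == "__main__":
    fitness, degree = fitness_growth(2000, 2)
    best = max(range(len(degree)), key=degree.__getitem__)
    print("largest hub:", best, "degree:", degree[best], "fitness:", round(fitness[best], 3))
```

Recomputing the weights at every step makes this quadratic in n, which keeps the sketch short at the cost of speed.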

•4.5 BARABASI-ALBERT: Fitness

P.L. Krapivsky, G.J. Rodgers, S. Redner, Phys. Rev. Lett. 86 5401 (2001)
M. Catanzaro, G. Caldarelli, L. Pietronero, Phys. Rev. E 70 037101 (2004)

TWO STEPS
• WITH PROBABILITY p
1. GROWTH: Every time step you add a vertex
2. PREFERENTIAL ATTACHMENT
• WITH PROBABILITY 1-p
1. REWIRING of existing nodes

•4.5 BARABASI-ALBERT: Rewiring

K. Klemm and V. M. Eguíluz, Phys. Rev. E 65 036123 (2002)

Only m vertices enter the dynamics. Those are the ACTIVE SITES.

DIFFERENT STEPS
1. GROWTH: Every time step you add a vertex. This new vertex draws a link to each of the m active vertices.
2. AGING: A vertex is deactivated with a probability proportional to (k_i + a)^{-1}:

$$\Pi(k_i) = \frac{(k_i + a)^{-1}}{\sum_{j=1,N} (k_j + a)^{-1}}$$

•4.5 BARABASI-ALBERT: Aging

Consider the WWW. What is the “microscopic” process of growth?

• You see a WWW page that you like (e.g. that of a friend of yours)
• You copy it, and change it a little bit

R. Kumar et al., Computer Networks 31 1481 (1999)
A. Vazquez et al., Nature Biotechnology 21 697 (2003)

TWO STEPS
1. GROWTH: Every time step you copy a vertex and its m edges
2. MUTATION (for every one of the m edges)

• With probability (1−α) you keep it
• With probability α you change the destination vertex

•4.6 REWIRING

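The rate of change is given by

$$\frac{\partial k}{\partial t} = (1-\alpha)\frac{k}{N} + \alpha\frac{m}{N}$$

It becomes clear that we have an effective preferential attachment. It can be demonstrated (NOT HERE!) that

$$k_j(t) = \frac{\alpha m}{1-\alpha}\left[\left(\frac{t}{t_j}\right)^{1-\alpha} - 1\right]$$

$$P(k_j) \propto \left[k_j + \frac{\alpha m}{1-\alpha}\right]^{-\frac{2-\alpha}{1-\alpha}}$$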
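The copy-and-mutate rule can be sketched in a few lines. The symbol `alpha` for the mutation probability, the seed graph and the name `copying_model` are assumptions for illustration; the sketch tracks directed links and returns in-degrees, and it allows repeated links for simplicity.

```python
import random

def copying_model(n, m, alpha, seed=0):
    """In-degrees after n vertices: each new vertex copies a random prototype's m links."""
    rng = random.Random(seed)
    # Assumed seed graph: m + 1 vertices, each pointing to all the others.
    out_links = [[j for j in range(m + 1) if j != i] for i in range(m + 1)]
    for new in range(m + 1, n):
        prototype = rng.randrange(new)          # vertex whose links are copied
        links = []
        for target in out_links[prototype]:
            if rng.random() < alpha:            # mutation: redirect to a random vertex
                target = rng.randrange(new)
            links.append(target)
        out_links.append(links)
    in_degree = [0] * n
    for links in out_links:
        for target in links:
            in_degree[target] += 1
    return in_degree

if __name__ == "__main__":
    in_degree = copying_model(20000, 3, alpha=0.3)
    print("max in-degree:", max(in_degree))
    print("fraction with in-degree 0:", in_degree.count(0) / len(in_degree))
```

A vertex with many incoming links is more likely to be pointed to by the copied prototype, which is the effective preferential attachment mentioned above.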

•4.6 REWIRING

In the completely different context of protein interaction networks, the same mechanism is in agreement with the current view of genome evolution. When organisms reproduce, the duplication of their DNA is accompanied by mutations. Those mutations can sometimes entail a complete duplication of a gene. Since in this case the corresponding protein can be produced by two different copies of the same gene, point-like mutations on one of them can accumulate at a rate faster than normal, since a weaker selection pressure is applied. Consequently, proteins with new properties can arise by this process. The new proteins arising by this mechanism share many physico-chemical properties with their ancestors. Many interactions remain unchanged, some are lost and some are acquired.

• CLUSTERING VERY SIMILAR TO THAT OF THE WWW

•4.6 REWIRING

Without introducing growth or preferential attachment we can still have power laws. We consider “disorder” in the Random Graph model (i.e. vertices differ from one another).

This mechanism is responsible for self-similarity in Laplacian Fractals

•Dielectric Breakdown

[Figure: dielectric breakdown in a perfect dielectric and in reality]

K.-I. Goh, B. Kahng, D. Kim, Phys. Rev. Lett. 87 278701 (2001)
G. Caldarelli et al., Phys. Rev. Lett. 89 258702 (2002)

•4.7 FITNESS MODEL

1. Assign to every vertex a real positive number x that we call fitness. Fitnesses are drawn from a probability distribution ρ(x).

2. Link two vertices with fitnesses x and y according to a probability function f(x,y)=f(y,x) (choice function).

The model can be considered
STATIC if N is kept fixed
DYNAMIC if N is growing

This is a GOOD GETS RICHER model. No preferential attachment is present.

V.D.P. Servedio, P. Buttà, G. Caldarelli, Phys. Rev. E 70 027102 (2004).


Different realizations of the model: a), b), c) have ρ(x) power law with exponent 2.5, 3, 4 respectively; d) has ρ(x)=exp(-x) and a threshold rule.

[Figure: degree distribution for cases a), b), c) with ρ(x) power law with exponent 2.5, 3, 4 respectively]

[Figure: degree distribution for case d) with ρ(x)=exp(-x) and a threshold rule]
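Case d) can be sketched as a static model. The specific form of the threshold rule used here, linking two vertices when x + y is at least a threshold `z`, is an assumption (the slide only says "threshold rule"), as are the function name and the parameter values.

```python
import random

def threshold_fitness_graph(n, z, seed=0):
    """Static fitness model: exponential fitnesses, link if x_i + x_j >= z (assumed rule)."""
    rng = random.Random(seed)
    x = [rng.expovariate(1.0) for _ in range(n)]   # fitnesses drawn from exp(-x)
    degree = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if x[i] + x[j] >= z:
                degree[i] += 1
                degree[j] += 1
    return x, degree

if __name__ == "__main__":
    x, degree = threshold_fitness_graph(2000, z=7.0)
    fittest = max(range(len(x)), key=x.__getitem__)
    print("mean degree:", sum(degree) / len(degree))
    print("degree of the fittest vertex:", degree[fittest])
```

No growth and no knowledge of other vertices' degrees is involved: the degree of a vertex is fixed entirely by its own fitness and the fitness distribution.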


The degree probability distribution P(k) is a functional of ρ(x) and f(x,y).

DIRECT PROBLEM

Given a fitness distribution ρ(x) → which choice function f(x,y) produces scale-free graphs, i.e. P(k) = c k^{-γ}?

INVERSE PROBLEM

Given a choice function f(x,y) → which fitness distribution ρ(x) produces scale-free graphs, i.e. P(k) = c k^{-γ}?


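• Fitness probability distribution

$$R(x) = \int_0^x \rho(y)\,dy, \qquad R(0) = 0, \quad R(\infty) = 1 \quad \text{(non-decreasing)}$$

• Vertex degree

$$k(x) = N K(x) = N\int_0^\infty f(x,y)\,\rho(y)\,dy, \qquad 0 \le K(x) \le 1$$

• Vertex degree Probability Distribution

$$P(k(x))\,dk(x) = \rho(x)\,dx \;\Rightarrow\; P(k(x)) = \frac{\rho(x)}{k'(x)}$$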


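• Degree Correlation

$$K_{nn}(x) = \frac{N}{k(x)}\int_0^\infty f(x,y)\,k(y)\,\rho(y)\,dy$$

• Vertex Clustering Coefficient

$$C(x) = \frac{N^2}{k(x)^2}\int_0^\infty\!\!\int_0^\infty f(x,y)\,f(y,z)\,f(z,x)\,\rho(y)\,\rho(z)\,dy\,dz$$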


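We impose P(k) = c k^{-γ}:

$$P(k) = \frac{\rho(x)}{k'(x)} \;\to\; \frac{\rho(x)}{k'(x)} = c\,(k(x))^{-\gamma}$$

Multiplying both sides of the equation by k'(x) and integrating from 0 to x:

$$R(x) = c\int_{k_0}^{k(x)} k^{-\gamma}\,dk$$

$$k(x) = \begin{cases} \left[k_0^{1-\gamma} + \dfrac{(1-\gamma)\,R(x)}{c}\right]^{\frac{1}{1-\gamma}} & \gamma \ne 1 \\ k_0\, e^{R(x)/c} & \gamma = 1 \end{cases}$$

where

$$k(x) = N K(x) = N\int_0^\infty f(x,y)\,\rho(y)\,dy \quad (k_0 \equiv k(0)), \qquad R(x) = \int_0^x \rho(y)\,dy$$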
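The relation between fitness and degree can be evaluated numerically. The sketch below assumes a degree range [k_min, k_max] to fix the constant c by normalization; `k_max`, the function names and the exponential fitness used in the demonstration are illustrative assumptions, not part of the lecture.

```python
import math

def power_law_constant(gamma, k_min, k_max):
    """c such that the integral of c * k^(-gamma) from k_min to k_max equals 1."""
    if gamma == 1:
        return 1.0 / math.log(k_max / k_min)
    return (1 - gamma) / (k_max ** (1 - gamma) - k_min ** (1 - gamma))

def degree_from_fitness(R, gamma, c, k_min):
    """k(x) obtained from R(x) = c * integral of k^(-gamma) between k_min and k(x)."""
    if gamma == 1:
        return k_min * math.exp(R / c)
    return (k_min ** (1 - gamma) + (1 - gamma) * R / c) ** (1 / (1 - gamma))

if __name__ == "__main__":
    gamma, k_min, k_max = 3.0, 1.0, 100.0
    c = power_law_constant(gamma, k_min, k_max)
    for x in (0.0, 0.5, 1.0, 2.0, 5.0):
        R = 1 - math.exp(-x)            # cumulative of an exponential fitness
        print(x, round(degree_from_fitness(R, gamma, c, k_min), 3))
```

With this normalization R = 0 maps to k_min and R = 1 maps to k_max, so the least fit vertex gets the smallest degree and the fittest one the largest.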


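We now have a constraint on the fitness distribution ρ(x) and the choice function f(x,y): the degree k(x) they generate must satisfy

$$k(x) = \begin{cases} \left[k_0^{1-\gamma} + \dfrac{(1-\gamma)\,R(x)}{c}\right]^{\frac{1}{1-\gamma}} & \gamma \ne 1 \\ k_0\, e^{R(x)/c} & \gamma = 1 \end{cases}$$

Some exact results

$$f(x,y) = g(x)\,g(y) \;\Rightarrow\; f(x,y) = \frac{k(x)\,k(y)}{N\langle k\rangle}, \qquad \langle k\rangle = \int_0^\infty k(x)\,\rho(x)\,dx$$

$$f(x,y) = f(x+y),\ \rho(x) = e^{-x} \;\Rightarrow\; f(x+y) = \frac{k(x+y) - k'(x+y)}{N}$$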


Special case f(x,y)=g(x)g(y)


Special case f(x,y)=f(x+y)


Special case f(x,y)=f(x-y)


• Using the intrinsic fitness model it is possible to create scale-free networks with any desired power-law exponent.

• This is possible for any fitness probability distribution ρ(x); it does not matter whether it is (e.g.) exponential, power-law or Gaussian.

• We found analytic expressions for the choice function f(x,y) in three cases:

• f(x,y)=f(x)f(y) for every ρ(x)
• f(x,y)=f(x±y) for ρ(x)=e^{-x}

• If f(x,y)=f(x)f(y), both the vertex degree correlation and the clustering coefficient are constant.


There are plenty of models around. To check which is more likely to reproduce the data we have to check a series of quantities
•Degree distribution
•Assortativity
•Clustering, Motifs etc.

Not all the ingredients are equally likely:

•RANDOM GRAPH: You choose your partner at random.

•INTRINSIC FITNESS: You choose your partner if you like her/him.

•BARABÁSI-ALBERT: To choose your partner:
• You must know how many partners she/he already had
• The larger this number, the better

•COPYING: You choose the partners of your close friends.

•CONCLUSIONS
