DYNAMICS OF COMPLEX SYSTEMS
Self-similar phenomena and Networks
Guido Caldarelli
CNR-INFM Istituto dei Sistemi Complessi
4/6
1. SELF-SIMILARITY (ORIGIN AND NATURE OF POWER-LAWS)
2. GRAPH THEORY AND DATA
3. SOCIAL AND FINANCIAL NETWORKS
4. MODELS
5. INFORMATION TECHNOLOGY
6. BIOLOGY
•STRUCTURE OF THE COURSE
•STRUCTURE OF THE FOURTH LECTURE
4.1) DEFINITION OF THE MODELS
4.2) RANDOM GRAPHS
4.3) SMALL WORLD
4.4) MULTIPLICATIVE PROCESSES
4.5) BARABASI-ALBERT
4.6) REWIRING
4.7) FITNESS
Standard Theory of Random Graphs (Erdős and Rényi 1960)
Random Graphs are built starting from N vertices. Every pair of vertices is connected by an edge with probability p.
Degrees are Poisson distributed:
$$P(k) = \frac{e^{-pN}\,(pN)^k}{k!}$$
[Plot: P(k) versus k]
Small World (D. Watts and S.H. Strogatz 1998)
Small World Graphs are built by adding shortcuts to regular lattices.
Degrees are peaked around the mean value.
•4.1 MODELS DEFINITIONS
“Intrinsic” Fitness/Static/Hidden variable Models (K.-I. Goh, B. Kahng, D. Kim 2001; G. Caldarelli, A. Capocci, P. De Los Rios, M.A. Muñoz 2002)
1) Growth or not: Nodes can be fixed at the beginning or be added
2) Attachment is related to intrinsic properties: The probability to be connected depends on the sites
Degrees are Power law distributed
Model of Growing Networks (A.-L. Barabási, R. Albert 1999)
1) Growth: Every time step new nodes enter the system
2) Preferential Attachment: The probability to be connected depends on the degree, $P(k) \propto k$
Degrees are Power law distributed
•4.1 MODELS DEFINITIONS
•The number m of edges in a Random Graph is a random variable whose expectation value is
$$E(m) = p\,\frac{N(N-1)}{2}$$
•The probability to form a particular Graph G(N,m) is given by
$$P(G(N,m)) = p^{m}\,(1-p)^{\frac{N(N-1)}{2}-m}$$
•The degree has expectation value
$$E(k) = \frac{2E(m)}{N} = p(N-1) \simeq pN$$
•It is easy to check that the degree probability distribution is given by
$$P(k) = \binom{N-1}{k}\,p^{k}\,(1-p)^{N-1-k} \simeq \frac{(pN)^{k}\,e^{-pN}}{k!}$$
•4.2 RANDOM GRAPHS
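The expectation values for the Random Graph can be checked numerically. The following is a minimal sketch, not part of the original lecture: the function name `erdos_renyi`, the seed and the values N = 1000, p = 0.01 are illustrative assumptions.

```python
import random
from collections import Counter
from math import exp, factorial

def erdos_renyi(n, p, rng):
    """Build G(n, p): every pair of vertices is linked with probability p."""
    adj = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj

rng = random.Random(42)              # fixed seed, only for reproducibility
N, p = 1000, 0.01                    # illustrative values (assumption)
adj = erdos_renyi(N, p, rng)

m = sum(len(nb) for nb in adj) // 2  # measured number of edges
mean_k = 2 * m / N                   # measured average degree
expected_m = p * N * (N - 1) / 2     # E(m) = p N (N-1) / 2
expected_k = p * (N - 1)             # E(k) = p (N-1), close to pN

# Compare the measured degree frequencies with the Poisson form
hist = Counter(len(nb) for nb in adj)
for k in (5, 10, 15):
    poisson = exp(-p * N) * (p * N) ** k / factorial(k)
    print(k, hist[k] / N, round(poisson, 4))
print(m, expected_m, mean_k, expected_k)
```

For a single realization the measured values fluctuate around the expectations; averaging over several seeds brings them closer.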
•We can give an estimate of the Clustering Coefficient: for a complete graph it must be 1. If the graph is sparse enough, two vertices link to each other with probability p
$$E(C) = p = \frac{\langle k \rangle}{N}$$
•The same kind of estimate can be given for the average distance l between two vertices. If a graph has average degree <k>, then the first neighbours will be <k>, the second neighbours ~ <k>^2, …, the n-th neighbours ~ <k>^n
•For the Diameter D, <k>^D must be of order N, so that
$$l \simeq D \simeq \frac{\log(N)}{\log(\langle k \rangle)}$$
•4.2 RANDOM GRAPHS
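Both estimates can be compared with a direct measurement on a small Random Graph. This is only a sketch under stated assumptions: the helper names (`erdos_renyi`, `bfs_distances`, `local_clustering`) and the values N = 500, p = 0.02 are hypothetical choices, and the logarithmic estimate of the distance is expected to hold only roughly.

```python
import random
from collections import deque
from math import log

def erdos_renyi(n, p, rng):
    """Build G(n, p) as a list of neighbour sets."""
    adj = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def bfs_distances(adj, source):
    """Shortest-path distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def local_clustering(adj, i):
    """Fraction of pairs of neighbours of i that are themselves linked."""
    nb = list(adj[i])
    k = len(nb)
    if k < 2:
        return 0.0
    links = sum(1 for a in range(k) for b in range(a + 1, k)
                if nb[b] in adj[nb[a]])
    return 2 * links / (k * (k - 1))

rng = random.Random(1)
N, p = 500, 0.02                      # illustrative values (assumption)
adj = erdos_renyi(N, p, rng)
mean_k = sum(len(nb) for nb in adj) / N

C = sum(local_clustering(adj, i) for i in range(N)) / N

total, pairs = 0, 0
for s in range(N):
    for t, d in bfs_distances(adj, s).items():
        if t != s:                    # unreachable pairs are simply skipped
            total += d
            pairs += 1
avg_l = total / pairs
estimate = log(N) / log(mean_k)

print("clustering:", C, "expected about p =", p)
print("average distance:", avg_l, "estimate log N / log <k>:", estimate)
```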
Take a regular lattice and rewire some of the links with probability $\phi$ (for analytical treatment, a slight modification is recommended: instead of rewiring, add new links in proportion to the existing links).
For a one-dimensional lattice of L sites with nearest-neighbour links, the total number of shortcuts is
$$\phi L$$
The average degree is now
$$\langle k \rangle = 2(1+\phi)$$
Therefore for small $\phi$ the degree distribution is peaked around 2
•4.3 SMALL WORLD
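A minimal sketch of the "add shortcuts" variant, written under assumptions that are not in the slides: a ring of L sites with nearest-neighbour links only, a shortcut probability called `phi`, and no filtering of repeated links.

```python
import random
from collections import Counter

def ring_with_shortcuts(L, phi, rng):
    """Ring of L sites (nearest-neighbour links) plus random shortcuts.

    For every lattice link, one shortcut between two distinct random
    sites is added with probability phi (multi-edges are not filtered).
    """
    lattice = [(i, (i + 1) % L) for i in range(L)]
    shortcuts = []
    for _ in lattice:
        if rng.random() < phi:
            u, v = rng.sample(range(L), 2)
            shortcuts.append((u, v))
    return lattice, shortcuts

rng = random.Random(7)
L, phi = 10000, 0.1                   # illustrative values (assumption)
lattice, shortcuts = ring_with_shortcuts(L, phi, rng)

degree = [0] * L
for u, v in lattice + shortcuts:
    degree[u] += 1
    degree[v] += 1
mean_k = sum(degree) / L
most_common_degree = Counter(degree).most_common(1)[0][0]

print("shortcuts:", len(shortcuts), "expected about", phi * L)
print("mean degree:", mean_k, "expected about", 2 * (1 + phi))
print("most common degree:", most_common_degree)
```

The degree histogram stays concentrated on the lattice value, which is the sense in which the distribution is "peaked" for a small shortcut probability.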
Clustering Coefficient of the regular lattice (valid for shortcut probability $\phi \to 0$ and <k> < 2N/3, otherwise C=1)
$$C = \frac{3(\langle k \rangle - 2)}{4(\langle k \rangle - 1)}$$
For the average distance there is no exact result, but we can define a distance in the problem, given by the mean distance between two shortcut endpoints.
In the regular lattice (start with c=1 and generalize) we have
$$l = \frac{1}{N}\sum_{j=1,N} d_{ij} = \frac{1}{N}\,2\,(0+1+2+\dots+N/2) \simeq \frac{2}{N}\,\frac{N^2}{8} = \frac{N}{4}$$
In the Random Graph we have
$$l \simeq \frac{\log(N)}{\log(\langle k \rangle)}$$
•4.3 SMALL WORLD
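The two regular-lattice results can be verified on a small ring. This sketch assumes a one-dimensional ring where each site is linked to its k/2 nearest neighbours on each side; the function names are hypothetical.

```python
from collections import deque

def ring_lattice(n, k):
    """Ring of n sites, each linked to its k/2 nearest neighbours per side."""
    adj = [set() for _ in range(n)]
    for i in range(n):
        for step in range(1, k // 2 + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj

def distances_from(adj, source):
    """Breadth-first distances from source to all sites of a connected graph."""
    dist = [-1] * len(adj)
    dist[source] = 0
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if dist[v] < 0:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def clustering(adj, i):
    """Fraction of pairs of neighbours of i that are linked to each other."""
    nb = list(adj[i])
    k = len(nb)
    links = sum(1 for a in range(k) for b in range(a + 1, k)
                if nb[b] in adj[nb[a]])
    return 2 * links / (k * (k - 1))

N = 100                                        # illustrative size (assumption)

# Nearest-neighbour ring (c = 1): (1/N) * sum_j d_ij should equal N/4
ring = ring_lattice(N, 2)
l_ring = sum(distances_from(ring, 0)) / N

# Ring with degree k = 4: compare with C = 3(k-2) / (4(k-1))
k = 4
lattice = ring_lattice(N, k)
C_measured = clustering(lattice, 0)
C_formula = 3 * (k - 2) / (4 * (k - 1))

print(l_ring, N / 4)
print(C_measured, C_formula)
```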
Now in Small World graphs, the behaviour must be intermediate between the regular lattice and the Random Graph. We can define a characteristic length in the system, for example $\xi$ = average distance between two endpoints of shortcuts (not of the same shortcut!). With $\phi$ the shortcut probability,
$$\xi = \frac{L}{2\,\phi L} = \frac{1}{2\phi}$$
$\xi$ diverges when $\phi \to 0$.
$\xi$ is the characteristic distance we can define in the model, so that we make the ansatz
$$l = L\,G(L/\xi) = L\,G(2\phi L)$$
$$G(x) \sim \begin{cases} 1 & x \ll 1 \\ \dfrac{\log(x)}{x} & x \gg 1 \end{cases}$$
Several conjectures have been made, but neither the actual distribution of path lengths nor <l> has been found
•4.3 SMALL WORLD
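In a multiplicative process you have
$$S(t) = \epsilon(t)\,S(t-1) = \epsilon(t)\,\epsilon(t-1)\cdots\epsilon(1)\,S(0)$$
where the $\epsilon(t)$ are random factors. By the central limit theorem the log of S(t) is normally distributed, so S is log-normally distributed:
$$f(S) = \frac{1}{\sqrt{2\pi}\,\sigma\,S}\;e^{-\frac{(\ln S - \mu)^2}{2\sigma^2}}$$
$$\ln f(S) = -\ln S - \ln(\sqrt{2\pi}\,\sigma) - \frac{(\ln S - \mu)^2}{2\sigma^2} = -\frac{(\ln S)^2}{2\sigma^2} + \left(\frac{\mu}{\sigma^2} - 1\right)\ln S - \ln(\sqrt{2\pi}\,\sigma) - \frac{\mu^2}{2\sigma^2}$$
If the variance is large, it can look like a power law with apparent slope $\mu/\sigma^2 - 1$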
•4.4 MULTIPLICATIVE PROCESS
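A short simulation illustrates why the logarithm of S(t) becomes normal: ln S(t) is a sum of independent terms. The sketch below assumes, purely for illustration, random factors drawn uniformly in [0.5, 1.5]; the slides do not specify their distribution, and the function name is hypothetical.

```python
import random
from math import log
from statistics import mean, pvariance

def multiplicative_process(steps, rng, s0=1.0):
    """Return S(steps) for S(t) = eps(t) * S(t-1), starting from S(0) = s0."""
    s = s0
    for _ in range(steps):
        s *= rng.uniform(0.5, 1.5)    # assumed distribution of the factor
    return s

rng = random.Random(3)
T, samples = 100, 2000                # illustrative values (assumption)
log_s = [log(multiplicative_process(T, rng)) for _ in range(samples)]

# Exact mean of ln(eps) for a uniform factor on [a, b]
a, b = 0.5, 1.5
mean_log_eps = (b * log(b) - a * log(a)) / (b - a) - 1

mu = mean(log_s)                      # estimate of mu (mean of ln S)
sigma2 = pvariance(log_s)             # estimate of sigma^2 (variance of ln S)

print("mean of ln S:", mu, "prediction:", T * mean_log_eps)
print("variance of ln S:", sigma2)
print("apparent slope mu/sigma^2 - 1:", mu / sigma2 - 1)
```

Both the mean and the variance of ln S grow linearly with the number of steps, so the distribution becomes broader as the process goes on.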
In this case the apparent slope ($\mu/\sigma^2 - 1$) of the blue line is 0.6
If there is a threshold on S YOU OBTAIN REAL POWER-LAWS INSTEAD
M. Mitzenmacher, Internet Mathematics 1 226 (2004)
•4.4 MULTIPLICATIVE PROCESS
This is by far the most successful and widely used model in the field, along with the related models
• FITNESS MODEL
• REWIRING MODEL
• AGING EFFECTS
TWO STEPS
1. GROWTH: Every time step you add a vertex
2. PREFERENTIAL ATTACHMENT:
$$\Pi(k_i) = \frac{k_i}{\sum_{j=1,N} k_j}$$
This idea has been reformulated in different fields and has different names
•YULE PROCESS (G. Yule Phil. Trans. Roy. Soc. 213 21 (1925))
•SIMON PROCESS (H.A. Simon Biometrika 42 425 (1955))
•DE SOLLA PRICE MODEL (D.J. De Solla Price Science 149 510 (1965))
•ST. MATTHEW EFFECT (R.K. Merton Science 159 56 (1968))
•4.5 BARABASI-ALBERT
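Start with m0 vertices and at every time step t add m new links
The basic approach is through continuum theory, where the degree is now a continuous variable:
$$\frac{\partial k_i}{\partial t} = m\,\Pi(k_i) = m\,\frac{k_i}{\sum_{j=1,N} k_j} = \frac{m\,k_i}{2mt} = \frac{k_i}{2t} \;\Rightarrow\; k_i(t) = m\left(\frac{t}{t_i}\right)^{\beta},\qquad \beta = \frac{1}{2}$$
As for the degree distribution, we can compute P(k_i(t) < k):
$$P(k_i(t) < k) = P\!\left(t_i > \frac{m^{1/\beta}\,t}{k^{1/\beta}}\right)$$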
R. Albert, A.-L. Barabási Reviews of Modern Physics 74 47 (2002)
•4.5 BARABASI-ALBERT
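The distribution of incoming vertices is uniform in time
$$P(t_i) = \frac{1}{m_0 + t}$$
$$P(k_i(t) < k) = 1 - P\!\left(t_i \le \frac{m^{1/\beta}\,t}{k^{1/\beta}}\right) = 1 - \frac{m^{1/\beta}\,t}{k^{1/\beta}\,(t + m_0)}$$
From which we obtain
$$P(k) = \frac{\partial P(k_i(t) < k)}{\partial k} = \frac{2\,m^{1/\beta}\,t}{m_0 + t}\,\frac{1}{k^{1/\beta + 1}},\qquad \gamma = \frac{1}{\beta} + 1 = 3$$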
•4.5 BARABASI-ALBERT
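The growth and preferential attachment steps can be simulated directly. The following is a sketch under assumptions not stated in the slides: the seed is a complete graph of m0 = m + 1 vertices, the m targets of a new vertex are distinct, and the "list of link endpoints" trick is used to pick a vertex with probability proportional to its degree.

```python
import random

def barabasi_albert(n, m, rng):
    """Grow a network to n vertices; each new vertex brings m links
    attached preferentially (probability proportional to degree)."""
    m0 = m + 1                        # assumed seed: a small complete graph
    degree = [0] * n
    endpoints = []                    # each vertex appears once per link end
    for i in range(m0):
        for j in range(i + 1, m0):
            endpoints += [i, j]
            degree[i] += 1
            degree[j] += 1
    for new in range(m0, n):
        targets = set()
        while len(targets) < m:
            # a uniform pick in the endpoint list is a pick proportional to k
            targets.add(rng.choice(endpoints))
        for t in targets:
            endpoints += [new, t]
            degree[new] += 1
            degree[t] += 1
    return degree

rng = random.Random(5)
N, m = 5000, 3                        # illustrative values (assumption)
degree = barabasi_albert(N, m, rng)

mean_k = sum(degree) / N
k_cut = 2 * m
frac = sum(1 for k in degree if k >= k_cut) / N
continuum = (m / k_cut) ** 2          # P(k_i >= k) ~ (m/k)^2 for t >> m0

print("mean degree:", mean_k, "expected about", 2 * m)
print("max degree:", max(degree))
print("fraction with k >=", k_cut, ":", frac, "continuum estimate:", continuum)
```

The continuum estimate treats the degree as a continuous variable, so only rough agreement with the measured fraction should be expected at small k.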
The value of the exponent depends on details of preferential attachment
•If $\Pi(k) \sim k^{\alpha}$ with $\alpha \neq 1$: NO POWER LAW
•If $\Pi(k) \sim k + a$: $\gamma = 3 + a/m$
Clustering is larger than in Erdős-Rényi graphs (m>1…)
No clear Assortative/Disassortative behaviour
•4.5 BARABASI-ALBERT
TWO STEPS
1. GROWTH: Every time step you add a vertex
2. PREFERENTIAL ATTACHMENT:
$$\Pi(k_i) = \frac{\eta_i\,k_i}{\sum_{j=1,N} \eta_j\,k_j}$$
But now vertices differ: some are good, some are bad. You measure that by assigning a “fitness” $\eta_i$
For some choices of the distribution of fitnesses you still have a Power law degree distribution and also assortativity.
Great success in reproducing the Internet (A. Vazquez, R. Pastor-Satorras, A. Vespignani Phys. Rev. E 65 066130 (2002))
G. Bianconi, A.-L. Barabási Europhys. Lett. 54 436 (2001)
•4.5 BARABASI-ALBERT: Fitness
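A sketch of growth with fitness-weighted preferential attachment. Several choices here are assumptions made only for illustration: fitnesses uniform in (0, 1), a complete-graph seed of m + 1 vertices, distinct targets, and the function name `fitness_growth`.

```python
import random

def fitness_growth(n, m, rng):
    """Grow a network where a new vertex links to vertex i with
    probability proportional to eta_i * k_i."""
    eta = [rng.random() for _ in range(n)]     # assumed: uniform fitness
    degree = [0] * n
    m0 = m + 1
    for i in range(m0):
        degree[i] = m0 - 1                     # assumed seed: complete graph
    for new in range(m0, n):
        weights = [eta[i] * degree[i] for i in range(new)]
        targets = set()
        while len(targets) < m:
            # weighted pick among the vertices already in the network
            targets.add(rng.choices(range(new), weights=weights)[0])
        for t in targets:
            degree[t] += 1
        degree[new] = m
    return eta, degree

rng = random.Random(9)
N, m = 2000, 2                                 # illustrative values (assumption)
eta, degree = fitness_growth(N, m, rng)

high = [degree[i] for i in range(N) if eta[i] > 0.5]
low = [degree[i] for i in range(N) if eta[i] <= 0.5]
mean_high = sum(high) / len(high)
mean_low = sum(low) / len(low)

print("mean degree, fitness > 0.5:", mean_high)
print("mean degree, fitness <= 0.5:", mean_low)
```

Vertices with larger fitness collect links faster than equally old vertices with smaller fitness, which is what the comparison of the two averages shows.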
P.L. Krapivsky, G.J. Rodgers, S. Redner Phys. Rev. Lett. 86 5401 (2001)
M. Catanzaro, G. Caldarelli, L. Pietronero Phys. Rev. E 70 037101 (2004)
TWO STEPS
• WITH PROBABILITY p
1. GROWTH: Every time step you add a vertex
2. PREFERENTIAL ATTACHMENT
• WITH PROBABILITY 1-p
1. REWIRING of existing nodes
•4.5 BARABASI-ALBERT: Rewiring
K. Klemm and V. M. Eguíluz Phys. Rev. E 65 036123 (2002)
Only m vertices enter in the dynamics. Those are the ACTIVE SITES
DIFFERENT STEPS
1. GROWTH: Every time step you add a vertex. This new vertex draws a link to each of the m active vertices.
2. AGING: A vertex is deactivated with a probability proportional to (k_i+a)^{-1}
$$\Pi(k_i) = \frac{(k_i + a)^{-1}}{\sum_{j=1,N} (k_j + a)^{-1}}$$
•4.5 BARABASI-ALBERT: Aging
Consider the WWW. What is the “microscopic” process of growth?
• You see a WWW page that you like (e.g. that of a friend of yours)
• You copy it, and change it a little bit
R. Kumar et al. Computer Networks 31 1481 (1999)
A. Vazquez et al. Nature Biotechnology 21 697 (2003)
TWO STEPS
1. GROWTH: Every time step you copy a vertex and its m edges
2. MUTATION (for every one of the m edges)
• With Probability $(1-\alpha)$ you keep it
• With Probability $\alpha$ you change the destination vertex
•4.6 REWIRING
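The rate of change is given by
$$\frac{\partial k}{\partial t} = (1-\alpha)\,\frac{k}{N} + \alpha\,\frac{m}{N}$$
where $\alpha$ is the probability of changing the destination of a copied edge. It becomes clear that we have an effective preferential attachment. It can be demonstrated (NOT HERE!) that
$$k_j(t) = \frac{\alpha\,m}{1-\alpha}\left[\left(\frac{t}{t_j}\right)^{1-\alpha} - 1\right]$$
$$P(k_j) \sim \left[k_j + \frac{\alpha\,m}{1-\alpha}\right]^{-\frac{2-\alpha}{1-\alpha}}$$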
•4.6 REWIRING
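The copy-and-mutate growth can be sketched in a few lines. The assumptions are illustrative and not taken from the slides: links are directed from the new vertex to its targets, the seed is m + 1 vertices pointing to each other, a mutated edge is redirected to a uniformly random existing vertex, and repeated targets are not filtered.

```python
import random

def copying_model(n, m, alpha, rng):
    """Each new vertex copies the m out-links of a random existing vertex;
    every copied link is redirected with probability alpha."""
    m0 = m + 1
    # assumed seed: m0 vertices, each pointing to the other m
    out_links = [[j for j in range(m0) if j != i] for i in range(m0)]
    for new in range(m0, n):
        prototype = rng.randrange(new)         # vertex to be copied
        links = []
        for target in out_links[prototype]:
            if rng.random() < alpha:
                target = rng.randrange(new)    # mutation: random destination
            links.append(target)
        out_links.append(links)
    in_degree = [0] * n
    for links in out_links:
        for t in links:
            in_degree[t] += 1
    return in_degree

rng = random.Random(21)
N, m, alpha = 5000, 3, 0.5                     # illustrative values (assumption)
in_degree = copying_model(N, m, alpha, rng)

exponent = (2 - alpha) / (1 - alpha)           # predicted tail exponent
print("mean in-degree:", sum(in_degree) / N)
print("max in-degree:", max(in_degree))
print("predicted exponent (2 - alpha)/(1 - alpha):", exponent)
```

Although no vertex is ever chosen according to its degree, a vertex with many incoming links is more likely to be among the targets of the copied vertex, which is the effective preferential attachment.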
In the completely different context of protein interaction networks, the same mechanism is in agreement with the current view of genome evolution. When organisms reproduce, the duplication of their DNA is accompanied by mutations. Those mutations can sometimes entail a complete duplication of a gene. Since in this case the corresponding protein can be produced by two different copies of the same gene, point-like mutations on one of them can accumulate at a rate faster than normal, since a weaker selection pressure is applied. Consequently, proteins with new properties can arise by this process. The new proteins arising by this mechanism share many physico-chemical properties with their ancestors. Many interactions remain unchanged, some are lost and some are acquired.
• CLUSTERING VERY SIMILAR TO THAT OF WWW
•4.6 REWIRING
Without introducing growth or preferential attachment we can have power-laws. We consider “disorder” in the Random Graph model (i.e. vertices differ one from the other).
This mechanism is responsible for self-similarity in Laplacian Fractals
[Figure: Dielectric Breakdown, in a perfect dielectric and in reality]
K.-I. Goh, B. Kahng, D. Kim Phys. Rev. Lett. 87 278701 (2001)
G. Caldarelli et al. Phys. Rev. Lett. 89 258702 (2002)
•4.6 FITNESS MODEL
1. Assign to every vertex a real positive number x that we call fitness. Fitnesses are drawn from a probability distribution $\rho(x)$
2. Link two vertices with fitnesses x and y according to a probability function f(x,y)=f(y,x) (choice function).
The model can be considered
STATIC if N is kept fixed
DYNAMIC if N is growing
This is a GOOD GETS RICHER model
No preferential attachment is present.
V.D.P. Servedio, P. Buttà, G. Caldarelli Phys. Rev. E 70 027102 (2004).
•4.6 FITNESS MODEL
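A static realization of the two steps can be sketched as follows. The specific choices are assumptions for illustration: exponential fitnesses, and a threshold choice function in which two vertices are linked when x + y is at least z (here z = ln N); the function name is hypothetical.

```python
import random
from math import log

def threshold_fitness_graph(n, z, rng):
    """Static fitness model: draw fitnesses, then link every pair (i, j)
    whose fitnesses satisfy x_i + x_j >= z."""
    x = [rng.expovariate(1.0) for _ in range(n)]   # rho(x) = exp(-x)
    degree = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if x[i] + x[j] >= z:                   # assumed threshold rule
                degree[i] += 1
                degree[j] += 1
    return x, degree

rng = random.Random(11)
N = 1000                                           # illustrative size (assumption)
z = log(N)                                         # assumed threshold
x, degree = threshold_fitness_graph(N, z, rng)

# "Good gets richer": degree never decreases when fitness increases
order = sorted(range(N), key=lambda i: x[i])
sorted_degrees = [degree[i] for i in order]

print("mean degree:", sum(degree) / N)
print("degree of the fittest vertex:", sorted_degrees[-1])
print("largest degree:", max(degree))
```

No vertex looks at the degree of its partner, yet the fittest vertices end up as hubs.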
Different realizations of the model: a) b) c) have $\rho(x)$ power law with exponent 2.5, 3, 4 respectively. d) has $\rho(x)=\exp(-x)$ and a threshold rule.
•4.6 FITNESS MODEL
Degree distribution for the case d) with $\rho(x)=\exp(-x)$ and a threshold rule.
Degree distribution for cases a) b) c) with $\rho(x)$ power law with exponent 2.5, 3, 4 respectively.
•4.6 FITNESS MODEL
The Degree probability distribution P(k) is a functional of $\rho(x)$ and f(x,y).
DIRECT PROBLEM
Given a fitness $\rho(x)$ → which choice function f(x,y) produces scale free graphs? i.e. $P(k) = c\,k^{-\gamma}$
INVERSE PROBLEM
Given a choice function f(x,y) → which fitness $\rho(x)$ produces scale free graphs? i.e. $P(k) = c\,k^{-\gamma}$
•4.6 FITNESS MODEL
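• Fitness probability distribution
$$\rho(x) \ge 0,\qquad R(x) = \int_0^x \rho(y)\,dy,\qquad R(0) = 0,\quad R(\infty) = 1$$
R(x) is non decreasing
• Vertex degree
$$k(x) = N\int_0^\infty f(x,y)\,\rho(y)\,dy = N\,K(x),\qquad 0 \le K(x) \le 1$$
• Vertex degree Probability Distribution
$$P(k(x))\,dk(x) = \rho(x)\,dx \;\Rightarrow\; P(k) = \rho(x(k))\,x'(k)$$
$$P(k(x)) = \frac{\rho(x)}{k'(x)}$$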
•4.6 FITNESS MODEL
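As a worked example of these formulas, consider exponential fitnesses with a threshold rule. The explicit form of the rule, $f(x,y) = \theta(x + y - z)$ with threshold z, is an assumption made here for illustration.

```latex
\rho(x) = e^{-x}, \qquad f(x,y) = \theta(x + y - z)

k(x) = N \int_0^\infty \theta(x + y - z)\, e^{-y}\, dy
     = N \int_{z-x}^\infty e^{-y}\, dy
     = N e^{-(z-x)} \qquad (x < z)

k'(x) = k(x), \qquad e^{-x} = \frac{N e^{-z}}{k(x)}

P(k) = \frac{\rho(x)}{k'(x)} = \frac{e^{-x}}{k} = \frac{N e^{-z}}{k^{2}}
```

Under these assumptions the degree distribution is a power law with exponent 2, even though the fitness distribution is exponential.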
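• Degree Correlation
$$K_{nn}(x) = \frac{N\int_0^\infty f(x,y)\,k(y)\,\rho(y)\,dy}{k(x)}$$
• Vertex Clustering Coefficient
$$C(x) = \frac{\int_0^\infty\!\!\int_0^\infty f(x,y)\,f(y,z)\,f(z,x)\,\rho(y)\,\rho(z)\,dy\,dz}{\left[k(x)/N\right]^{2}}$$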
•4.6 FITNESS MODEL
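We have
$$P(k) = \frac{\rho(x)}{k'(x)}$$
We impose $P(k) = c\,(k(x))^{-\gamma}$ →
$$\frac{\rho(x)}{k'(x)} = c\,(k(x))^{-\gamma}$$
Multiplying both sides of the equation by k'(x) and integrating from 0 to x
$$k(x) = \left[k_0^{\,1-\gamma} + \frac{1-\gamma}{c}\,R(x)\right]^{\frac{1}{1-\gamma}} \quad (\gamma \neq 1),\qquad k(x) = k_0\,e^{R(x)/c} \quad (\gamma = 1)$$
where
$$k(x) = N\,K(x) = N\int_0^\infty \rho(y)\,f(x,y)\,dy,\qquad k_0 = k(0),\qquad R(x) = \int_0^x \rho(y)\,dy$$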
•4.6 FITNESS MODEL
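The relation between k(x) and R(x) can be checked by sampling: if fitnesses are drawn from the fitness distribution and mapped through k(x), the resulting values should follow $c\,k^{-\gamma}$. The sketch below assumes exponential fitnesses and the illustrative values gamma = 3, c = 2.5, k0 = 1, and treats k as a continuous variable; none of these numbers come from the slides.

```python
import random
from math import exp

def degree_from_fitness(x, gamma, c, k0):
    """k(x) = [k0^(1-gamma) + (1-gamma) R(x) / c]^(1/(1-gamma)),
    with R(x) = 1 - exp(-x) for the assumed rho(x) = exp(-x)."""
    R = 1.0 - exp(-x)
    return (k0 ** (1 - gamma) + (1 - gamma) * R / c) ** (1 / (1 - gamma))

rng = random.Random(13)
gamma, c, k0 = 3.0, 2.5, 1.0          # illustrative values (assumption)
ks = [degree_from_fitness(rng.expovariate(1.0), gamma, c, k0)
      for _ in range(20000)]

# For P(k) = c k^(-gamma), the fraction of values below k_test is
# c * (k0^(1-gamma) - k_test^(1-gamma)) / (gamma - 1)
k_test = 1.5
measured = sum(1 for k in ks if k < k_test) / len(ks)
predicted = c * (k0 ** (1 - gamma) - k_test ** (1 - gamma)) / (gamma - 1)

print("measured fraction below", k_test, ":", measured)
print("predicted fraction:", predicted)
```

With these values k ranges between k0 and the value reached when R(x) = 1, so the power law holds on a finite interval.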
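We now have a constraint on the fitness distribution $\rho(x)$ and choice function f(x,y)
$$k(x) = \left[k_0^{\,1-\gamma} + \frac{1-\gamma}{c}\,R(x)\right]^{\frac{1}{1-\gamma}} \quad (\gamma \neq 1),\qquad k(x) = k_0\,e^{R(x)/c} \quad (\gamma = 1)$$
Some exact results
1. $f(x,y) = g(x)\,g(y)$ → $f(x,y) = \dfrac{k(x)\,k(y)}{N\langle k \rangle}$
2. $f(x,y) = f(x+y)$ with $\rho(x) = e^{-x}$ → $f(x,y) = \dfrac{k(x+y) - k'(x+y)}{N}$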
•4.6 FITNESS MODEL
Special case f(x,y)=g(x)g(y)
•4.6 FITNESS MODEL
Special case f(x,y)=f(x+y)
•4.6 FITNESS MODEL
Special case f(x,y)=f(x-y)
•4.6 FITNESS MODEL
• Using the intrinsic fitness model it is possible to create scale-free networks with any desired power-law exponent
• This is possible for any fitness probability distribution $\rho(x)$: it does not matter whether it is (e.g.) exponential, power-law or Gaussian.
• We found analytic expressions for the choice function f(x,y) in three cases:
• f(x,y)=f(x)f(y) for every $\rho(x)$
• f(x,y)=f(x ± y) for $\rho(x)=e^{-x}$
• If f(x,y)=f(x)f(y) both vertex degree correlation and clustering coefficient are constant
•4.6 FITNESS MODEL
There are plenty of models around. To check which one is more likely to reproduce the data we have to check a series of quantities
•Degree distribution
•Assortativity
•Clustering, Motifs etc.
Not all the ingredients are equally likely:
•RANDOM GRAPH: You choose your partner at random.
•INTRINSIC FITNESS: You choose your partner if you like her/him
•BARABÁSI-ALBERT: To choose your partner:
• You must know how many partners she/he already had
• The larger this number, the better
•COPYING: You choose the partners of your close friends
•CONCLUSIONS