
Source: Temporal Dynamics of Scale-Free Networks - MIT Media Lab, web.media.mit.edu/~yanival/percolation-SBP2014.pdf

Temporal Dynamics of Scale-Free Networks

Erez Shmueli, Yaniv Altshuler, and Alex “Sandy” Pentland

MIT Media Lab
{shmueli,yanival,sandy}@media.mit.edu

Abstract. Many social, biological, and technological networks display substantial non-trivial topological features. One well-known and much-studied feature of such networks is the scale-free power-law distribution of nodes’ degrees. Several works further suggest models for generating complex networks which comply with one or more of these topological features. For example, the well-known Barabasi-Albert “preferential attachment” model tells us how to create scale-free networks. Since the main focus of these generative models is on capturing one or more of the static topological features of complex networks, they are very limited in capturing the temporal dynamics of the networks’ evolution. Therefore, when studying real-world networks, the following question arises: what is the mechanism that governs changes in the network over time? To shed some light on this topic, we study two years of data that we received from eToro, the world’s largest social financial trading company. We report three key findings. First, we demonstrate that the network topology may change significantly over time. More specifically, we illustrate how popular nodes may become far less popular, and emerging new nodes may become extremely popular, in a very short time. Second, we show that although the network may change significantly over time, the degrees of its nodes obey the power-law model at any given time. Finally, we observe that the magnitude of change between consecutive states of the network also follows a power-law distribution.

1 Introduction

Many social, biological, and technological networks display substantial non-trivial topological features. One well-known and much-studied feature of such networks is the scale-free power-law distribution of nodes’ degrees [4]. That is, the degree of nodes is distributed according to the following formula: $P[d] = c \cdot d^{-\lambda}$. As the study of complex networks has continued to grow in importance and popularity, many other features have attracted attention as well. Such features include, among others: short path lengths and a high clustering coefficient [12, 2], assortativity or disassortativity among vertices [10], community structure [8] and hierarchical structure [11] for undirected networks, and reciprocity [7] and triad significance profile [9] for directed networks.

Several works further suggested models for generating complex networks which comply with one or more of these topological features. For example, the well-known Barabasi-Albert model [4] tells us how to create scale-free networks. It incorporates two important general concepts: growth and preferential attachment. Growth means that the number of nodes in the network increases over time, and preferential attachment means that the more connected a node is, the more likely it is to receive new links. More specifically, the network begins with an initial connected network of $m_0$ nodes. New nodes are added to the network one at a time. Each new node is connected to $m \le m_0$ existing nodes with a probability that is proportional to the number of links that the existing nodes already have.

More sophisticated models for creating scale-free networks exist. For example, in [6], at each time step, apart from the $m$ new edges between the new node and the old nodes, $m_c$ new edges are created between the old nodes, where the probability that a new edge is attached to existing nodes of degrees $d_1$ and $d_2$ is proportional to $d_1 \cdot d_2$. Rewiring of edges produces a very similar effect [1]. That is, instead of creating connections between nodes in the existing network, at each time step, $m_r$ randomly chosen vertices lose one of their connections. In $m_{rr}$ cases, the free end is attached to a random vertex. In the remaining $m_{rp} = m_r - m_{rr}$ cases, the free end is attached to a preferentially chosen vertex.

The main focus of these generative models is on capturing one or more of the static topological features of complex networks. However, these models are very limited in capturing the temporal dynamics of the networks’ evolution. Therefore, when studying real-world networks, the following question arises: what is the mechanism that governs changes in the network over time?

In order to shed some light on this question, we studied two years of data (from 2011/07/01 to 2013/06/30) that we received from eToro, the world’s largest social financial trading company.

We report three key findings. First, we demonstrate that the network topology may change significantly over time. More specifically, we illustrate how popular nodes may become far less popular, and emerging new nodes may become extremely popular, in a very short time. Second, we show that although the network may change significantly over time, the degrees of its nodes obey the power-law model at any given time. Finally, we observe that the magnitude of change between consecutive states of the network also follows a power-law distribution.
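The Barabasi-Albert growth procedure described in this introduction can be sketched roughly as follows. This is an illustrative sketch only, not code from the paper: the function name `barabasi_albert`, the choice of a path as the initial connected network, and the requirement that the m targets of a new node be distinct are all assumptions made here.

```python
import random


def barabasi_albert(n, m, m0=None, seed=None):
    """Grow an undirected graph on n nodes by preferential attachment.

    Returns a set of edges (u, v) with u < v.
    """
    rng = random.Random(seed)
    m0 = m if m0 is None else m0
    if m0 < 2 or not (1 <= m <= m0 < n):
        raise ValueError("need 1 <= m <= m0 < n and m0 >= 2")

    # Assumption: the initial connected network is a path on m0 nodes
    # (the model only requires that it be connected).
    edges = {(i, i + 1) for i in range(m0 - 1)}

    # Each node appears in `stubs` once per incident edge, so a uniform
    # draw from `stubs` picks a node with probability proportional to
    # its current degree.
    stubs = [v for edge in edges for v in edge]

    for new in range(m0, n):
        targets = set()
        while len(targets) < m:  # assumption: m distinct targets
            targets.add(rng.choice(stubs))
        for t in sorted(targets):
            edges.add((t, new))
            stubs.extend((t, new))
    return edges
```

Calling `barabasi_albert(200, 2, m0=3, seed=1)` yields a graph whose edge count is fixed by construction, (m0 - 1) + (n - m0) * m, while the degree sequence depends on the random draws.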

2 Datasets

Our data come from eToro, the world’s largest social financial trading company (see http://www.etoro.com). eToro is an online discounted retail broker for foreign exchange and commodities trading, with easy-to-use buying and short-selling mechanisms as well as leverage of up to 400 times.

Similarly to other trading platforms, eToro allows users to trade between currency pairs individually (see Fig ??). In addition, eToro provides a social network platform which allows users to watch the financial trading activity of other users (displayed in a number of statistical ways) and copy their trades (see Fig. 1). More specifically, users in eToro can place three types of trades: (1) Single trade: the user places a normal trade by himself; (2) Copy trade: the user copies one single trade of another user; and (3) Mirror trade: the user picks a target user to copy, and eToro automatically places all trades of the target user on behalf of the user.

Our data contain over 67 million trades that were placed between 2011/07/01 and 2013/06/30. More than 53 million of these trades are automatically executed mirror trades, fewer than 250 thousand are copy trades, and roughly 13 million are single trades. The total number of unique traders is roughly 275 thousand and the total number of unique mirror operations is roughly 850 thousand (one mirror operation may result in several mirror trades).


[Figure 1: two slides titled “eToro”. Slide 1: “The world’s largest social financial trading company. Serving 3 million users worldwide. Roughly two years of data. The platform allows users to trade between currency pairs (individually) or…”. Slide 2: “Watch the financial trading activity of other users and copy them. All trades are automatically uploaded to the network where they can be displayed in a number of statistical ways.”]

Fig. 1. The eToro platform. Illustrating the trading portfolio of a single user (left) and the trading activity of all users (right).

In the remainder of this paper, we use these trades to construct snapshot networks as follows. Given a start time s and an end time e, the snapshot network’s nodes consist of all users that had at least one trade open at some point in time between s and e. An edge from user u to user v exists if and only if user u was mirroring user v at some point in time between s and e.
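The snapshot construction above can be sketched as an interval-overlap filter. The record layouts used here, `(user, open_time, close_time)` for trades and `(follower, target, start, end)` for mirror operations, as well as the function names, are hypothetical; the paper does not describe the format of the eToro data.

```python
def snapshot_network(trades, mirrors, s, e):
    """Build the snapshot network for the time window [s, e].

    trades:  iterable of (user, open_time, close_time)      (assumed layout)
    mirrors: iterable of (follower, target, start, end)     (assumed layout)
    Returns (nodes, edges), where edges are directed (follower, target) pairs.
    """
    def overlaps(start, end):
        # An interval [start, end] intersects [s, e] iff it begins
        # no later than e and finishes no earlier than s.
        return start <= e and end >= s

    nodes = {user for user, opened, closed in trades if overlaps(opened, closed)}
    edges = {(u, v) for u, v, start, end in mirrors if overlaps(start, end)}
    return nodes, edges


def daily_sizes(trades, mirrors, days):
    """Return (day, number of nodes, number of edges) for one-day snapshots."""
    sizes = []
    for day in days:
        nodes, edges = snapshot_network(trades, mirrors, day, day)
        sizes.append((day, len(nodes), len(edges)))
    return sizes
```

Times are treated as comparable values (for instance integer day indices); any consistent timestamp type would work the same way.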

Figure 2 illustrates how the size of the eToro network grows over time in terms of both the number of nodes and the number of edges. For each day during the two-year period, a snapshot network is constructed, and the numbers of nodes and edges in that network are counted.

[Figure 2: two plots against Day (0 to 800). Left: number of nodes (axis 0 to 50000). Right: number of edges (axis 0 to 100000).]

Fig. 2. The size of the eToro network in terms of the number of nodes (left) and the number of edges (right) over time.


3 Results

First, we examined the in-degrees of nodes in the eToro network over the entire period of two years. As can be seen in Figure 3, the degree distribution presents a strong power-law pattern. Although quite expected, this result is non-trivial. One might expect to see a small group of users that are mirrored by the others, but what we actually witness is a heavy tail of users with only a few followers each. This result is consistent with the observation in [3], where the authors demonstrate by simulation that the degree distribution of social-learning networks converges to a power-law distribution, regardless of the underlying social network topology.

[Figure 3: log-log plot of density against degree (degree axis 10^1 to 10^4, density axis 10^-7 to 10^-1), annotated γ=1.64.]

Fig. 3. In-degree distribution of nodes in the entire eToro network. (The in-degree of a node depicts the number of mirroring traders for the trader represented by that node.)
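As a rough illustration of how an exponent such as the γ shown in Fig. 3 could be estimated, the sketch below counts in-degrees and applies the continuous maximum-likelihood estimator from [5], namely 1 + n / Σ ln(x_i / x_min). The paper does not state which fitting procedure produced its values, so the estimator choice is an assumption, as are the names `in_degrees` and `powerlaw_exponent_mle`. For integer degrees the continuous formula is only an approximation.

```python
import math
from collections import Counter


def in_degrees(edges):
    """Map each mirrored trader v to the number of distinct edges (u, v)."""
    return Counter(v for _, v in set(edges))


def powerlaw_exponent_mle(values, xmin):
    """Continuous MLE of the power-law exponent over the tail x >= xmin."""
    tail = [x for x in values if x >= xmin]
    log_sum = sum(math.log(x / xmin) for x in tail)
    if log_sum == 0:
        raise ValueError("need at least one value strictly above xmin")
    return 1.0 + len(tail) / log_sum
```

A typical call would be `powerlaw_exponent_mle(in_degrees(edges).values(), xmin)`, where `xmin` must be chosen separately (for example by the procedure in [5]).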

Next, we investigated how the popularity of traders in eToro, in terms of the number of mirroring traders, changes over time. Fig. 4 illustrates the popularity of four traders. As can be seen in the figure, popular traders may become far less popular, and emerging new traders may become extremely popular, in a very short time. Note how this behavior differs significantly from the commonly assumed “rich get richer” behavior.

[Figure 4: four plots of the number of mirroring traders against Day (0 to 800), one per trader.]

Fig. 4. The in-degree of four nodes in the evolving eToro network. (Depicting the popularity of the four corresponding traders over time.)

To illustrate this point further, we checked how similar different snapshots of the network are. Figure 5 presents the top 50 popular nodes for four different time periods: July-September 2011 (snapshot 1), January-March 2012 (snapshot 2), July-September 2012 (snapshot 3) and January-March 2013 (snapshot 4). That is, four three-month snapshots with three-month gaps in between. As can be seen in the figure, only 11 nodes that were included in the top 50 popular nodes of snapshot 1 remained in the top 50 popular nodes of snapshot 2; only 17 nodes that were included in the top 50 popular nodes of snapshot 2 remained in the top 50 popular nodes of snapshot 3; and only 19 nodes that were included in the top 50 popular nodes of snapshot 3 remained in the top 50 popular nodes of snapshot 4. That is, the network may change significantly over time.

[Figure 5: four panels, labeled Snapshot 1, Snapshot 2, Snapshot 3 and Snapshot 4.]

Fig. 5. The 50 most popular nodes in each one of the four snapshots. Green nodes represent nodes that are included in the 50 most popular nodes of the current snapshot but were not included in the previous one. Red nodes represent nodes that were included in the 50 most popular nodes of the previous snapshot but are not included in the current one. Blue nodes represent nodes that were included in both snapshots. The node’s circle area is proportional to its popularity.
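The top-50 comparison between consecutive snapshots amounts to a set intersection, sketched below under stated assumptions: in-degrees are given as a mapping from node to count, and ties in popularity are broken by node label. The paper does not say how ties were handled, and the names `top_k` and `top_k_retained` are hypothetical.

```python
def top_k(in_degree, k=50):
    """Return the set of the k nodes with the highest in-degree.

    Assumption: ties are broken by the string form of the node label.
    """
    ranked = sorted(in_degree.items(), key=lambda kv: (-kv[1], str(kv[0])))
    return {node for node, _ in ranked[:k]}


def top_k_retained(prev_degree, curr_degree, k=50):
    """Count nodes that are in the top k of both snapshots (blue in Fig. 5)."""
    return len(top_k(prev_degree, k) & top_k(curr_degree, k))
```

In the terms of Fig. 5, the set difference in one direction gives the green nodes and in the other direction the red nodes.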

We then examined the degree distribution for each one of the four snapshots above. As can be seen in Figure 6, although the four snapshots differ significantly, the degree distribution of each one of them obeys the power-law model.

[Figure 6: four log-log plots of density against degree, one per snapshot. Annotated exponents: γ=1.52 (Snapshot 1), γ=1.63 (Snapshot 2), γ=1.64 (Snapshot 3), γ=1.65 (Snapshot 4).]

Fig. 6. Degree distribution for each one of the four snapshots that are shown in Figure 5.

Next, we studied more carefully how the eToro network changes between consecutive days. More specifically, we measured the number of added edges (i.e., edges that did not appear in the previous day and appear in the current day) and the number of removed edges (i.e., edges that appeared in the previous day and do not appear in the current day). Since the size of the eToro network grows over time (see Fig. 2), we normalized the above quantities by dividing them by the number of edges that were present in the previous day. We found that the normalized magnitude of change between each two consecutive snapshots (according to each one of the two measures) follows a power-law distribution (see Figure 7).
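The two normalized change measures defined above reduce to set differences between consecutive daily edge sets. A minimal sketch follows, assuming edges are hashable `(u, v)` pairs; the function name `normalized_edge_change` is hypothetical.

```python
def normalized_edge_change(prev_edges, curr_edges):
    """Return (added, removed), each divided by the previous day's edge count."""
    prev_edges, curr_edges = set(prev_edges), set(curr_edges)
    if not prev_edges:
        raise ValueError("previous day has no edges to normalize by")
    added = len(curr_edges - prev_edges) / len(prev_edges)
    removed = len(prev_edges - curr_edges) / len(prev_edges)
    return added, removed
```

For example, with four edges on the previous day, one new edge and two dropped edges give the pair (0.25, 0.5).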


[Figure 7: two log-log plots of density against change. Left: gamma=2.88. Right: gamma=2.80.]

Fig. 7. Distribution of the normalized changes in the eToro network: added edges (left) and removed edges (right).

In order to better understand this finding, we broke down the overall network changes into two smaller components.

First, we measured the changes by taking into account only the nodes that were added and removed between the two consecutive days. That is, we considered only users that were not trading in the previous day but are trading in the current day, and users that were trading in the previous day but are not trading in the current day. As can be seen in the top two subfigures of Figure 8, the normalized number of added and removed nodes also follows a power-law distribution. That is, in most days, only a small number of nodes are added to or removed from the network, but occasionally, a large number of nodes are added or removed. We repeated the same analysis, taking into account only the edges for which at least one of their nodes was added or removed. As can be seen in the bottom two subfigures of Figure 8, the result was again a power-law distribution.

Then, we measured the changes by taking into account only the nodes that existed in both of the two consecutive days. That is, we considered only users that were trading in the previous day and are also trading in the current day. As can be seen in Figure 9, even when only the common nodes are considered, the normalized number of added and removed edges follows a power-law distribution.

Our results were validated using the statistical tests for power-law distributions that were suggested in [5]. First, we applied the goodness of fit test. As can be seen in Table 1, the p-values for all cases are greater than 0.1, as required. Second, we tested alternative types of distribution. As can be seen in the table, the distribution is more likely to be truncated power-law than general power-law in all cases (the GOF value is negative), and the results are significant in three out of eight of the cases (the p-values are lower than 0.05); the distribution is more likely to be truncated power-law than exponential, and the result is significant in five out of eight of the cases; and the distribution is more likely to be truncated power-law than log-normal in all cases, and the result is significant in five out of eight of the cases.
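The breakdown into node churn and common-node changes can be sketched as below. Two points are assumptions rather than statements from the paper: node counts are normalized by the previous day's node count, and every edge count is normalized by the previous day's total edge count. The function name `decompose_change` and the dictionary keys are hypothetical.

```python
def decompose_change(prev_nodes, prev_edges, curr_nodes, curr_edges):
    """Split day-to-day change into node-churn and common-node components."""
    prev_nodes, curr_nodes = set(prev_nodes), set(curr_nodes)
    prev_edges, curr_edges = set(prev_edges), set(curr_edges)
    if not prev_nodes or not prev_edges:
        raise ValueError("previous day must have nodes and edges")

    common = prev_nodes & curr_nodes
    added_edges = curr_edges - prev_edges
    removed_edges = prev_edges - curr_edges

    def on_common(edge):
        # True when both endpoints traded on both days.
        return edge[0] in common and edge[1] in common

    n_nodes, n_edges = len(prev_nodes), len(prev_edges)
    return {
        "added_nodes": len(curr_nodes - prev_nodes) / n_nodes,
        "removed_nodes": len(prev_nodes - curr_nodes) / n_nodes,
        # Edges touching at least one added or removed node.
        "churn_added_edges": sum(not on_common(e) for e in added_edges) / n_edges,
        "churn_removed_edges": sum(not on_common(e) for e in removed_edges) / n_edges,
        # Edges whose endpoints both exist on both days.
        "common_added_edges": sum(on_common(e) for e in added_edges) / n_edges,
        "common_removed_edges": sum(on_common(e) for e in removed_edges) / n_edges,
    }
```

Collecting these six values over all consecutive day pairs would give the samples whose distributions are the subject of Figures 8 and 9.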

[Figure 8: four log-log plots of density against change. Top left: gamma=3.64. Top right: gamma=3.24. Bottom left: gamma=3.13. Bottom right: gamma=3.00.]

Fig. 8. Distribution of the normalized changes in the eToro network, as reflected by the added and removed nodes: added nodes (top left), removed nodes (top right), added edges (bottom left) and removed edges (bottom right).

[Figure 9: two log-log plots of density against change. Left: gamma=2.87. Right: gamma=2.61.]

Fig. 9. Distribution of the normalized changes in the eToro network, as reflected by the common nodes: added edges (left) and removed edges (right).

Fig.  Subfigure       xmin    alpha   Goodness   Power-Law vs.       Trunc. Power-Law   Trunc. Power-Law
                                      of Fit     Trunc. Power-Law    vs. Exponential    vs. Log-Normal
7     added edges     0.024   2.88    0.121      (-) 0.108           (+) 0.012          (+) 0.396
7     removed edges   0.025   2.80    0.207      (-) 0.012           (+) 0.008          (+) 0.000
8     added nodes     0.073   3.64    0.613      (-) 0.613           (+) 0.093          (+) 0.732
8     removed nodes   0.023   3.24    0.111      (-) 0.099           (+) 0.160          (+) 0.005
8     added edges     0.018   3.13    0.545      (-) 0.544           (+) 0.063          (+) 0.411
8     removed edges   0.012   3.00    0.110      (-) 0.108           (+) 0.159          (+) 0.006
9     added edges     0.014   2.87    0.123      (-) 0.039           (+) 0.027          (+) 0.032
9     removed edges   0.014   2.61    0.131      (-) 0.009           (+) 0.014          (+) 0.000

Table 1. Statistical tests for power-law distributions. The numbers in the three right columns represent the p-value and the sign of the GOF value in brackets.

4 Summary and Future Work

In this paper, we investigate how scale-free networks evolve over time. Studying a real-world network, we find that: (1) the network topology may change significantly over time, (2) the degree distribution of nodes in the network obeys the power-law model at any given state and (3) the magnitude of change between consecutive states of the network also follows a power-law distribution.

Better understanding the temporal dynamics of scale-free networks would allow us to develop improved and more realistic algorithms for generating networks. Moreover, it would help us in better predicting future states of the network and estimating their probabilities. For example, it may help in bounding the probability that a given node remains popular over a certain period of time.

In future work we intend to check how the distribution of changes between consecutive states of the network influences the overall network’s performance. We hypothesize that in cases where the distribution of changes is closer to a power-law distribution, the overall network performance would be higher. Furthermore, we would like to investigate the mechanism that is responsible for the power-law shape of the distribution. Finally, we would like to suggest a generative model for networks based on the above findings.

References

1. Albert, R., and Barabasi, A.-L. Topology of evolving networks: local events and universality. Physical Review Letters 85, 24 (2000), 5234.

2. Amaral, L. A. N., Scala, A., Barthelemy, M., and Stanley, H. E. Classes of small-world networks. Proceedings of the National Academy of Sciences 97, 21 (2000), 11149–11152.

3. Anghel, M., Toroczkai, Z., Bassler, K. E., and Korniss, G. Competition-driven network dynamics: Emergence of a scale-free leadership structure and collective efficiency. Physical Review Letters 92, 5 (2004), 058701.

4. Barabasi, A.-L., and Albert, R. Emergence of scaling in random networks. Science 286, 5439 (1999), 509–512.

5. Clauset, A., Shalizi, C. R., and Newman, M. E. Power-law distributions in empirical data. SIAM Review 51, 4 (2009), 661–703.

6. Dorogovtsev, S. N., and Mendes, J. F. F. Scaling behaviour of developing and decaying networks. EPL (Europhysics Letters) 52, 1 (2000), 33.

7. Garlaschelli, D., and Loffredo, M. I. Patterns of link reciprocity in directed networks. Physical Review Letters 93, 26 (2004), 268701.

8. Girvan, M., and Newman, M. E. Community structure in social and biological networks. Proceedings of the National Academy of Sciences 99, 12 (2002), 7821–7826.

9. Milo, R., Itzkovitz, S., Kashtan, N., Levitt, R., Shen-Orr, S., Ayzenshtat, I., Sheffer, M., and Alon, U. Superfamilies of evolved and designed networks. Science 303, 5663 (2004), 1538–1542.

10. Newman, M. E. Assortative mixing in networks. Physical Review Letters 89, 20 (2002), 208701.

11. Ravasz, E., and Barabasi, A.-L. Hierarchical organization in complex networks. Physical Review E 67, 2 (2003), 026112.

12. Watts, D. J., and Strogatz, S. H. Collective dynamics of small-world networks. Nature 393, 6684 (1998), 440–442.