1307.2893v1.pdf


  • 8/13/2019 1307.2893v1.pdf

    1/19

    Coexistence in preferential attachment networks

    Tonči Antunović

    Elchanan Mossel

    Miklós Z. Rácz

    July 11, 2013

    Abstract

    Competition in markets is ubiquitous: cell-phone providers, computer manufacturers, and sport gear brands all vie for customers. Though several coexisting competitors are often observed in empirical data, many current theoretical models of competition on small-world networks predict a single winner taking over the majority of the network. We introduce a new model of product adoption that focuses on word-of-mouth recommendations to provide an explanation for this coexistence of competitors. The key property of our model is that customer choices evolve simultaneously with the network of customers. When a new node joins the network, it chooses neighbors according to preferential attachment, and then chooses its type based on the number of initial neighbors of each type. This can model a new cell-phone user choosing a cell-phone provider, a new student choosing a laptop, or a new athletic team member choosing a gear provider. We provide a detailed analysis of the new model; in particular, we determine the possible limiting proportions of the various types. The main qualitative feature of our model is that, unlike other current theoretical models, often several competitors will coexist, which matches empirical observations in many current markets.

    1 Introduction

    A major challenge in understanding complex networks is the interplay between the evolution of the network and the dynamical features of processes on the network. Almost all networks we know evolve dynamically: the citation graph grows every day with new papers being published, friendships are created and broken every minute, webpages and links between them are born and destroyed every second, and actin filaments of the cytoskeleton assemble and disassemble every millisecond to facilitate cell motion. The changes in network structure are closely related to changes in the features or content of individual nodes, and the processes on these nodes. For example, the content of a Facebook page is correlated with the friendship dynamics, the changing content of webpages influences the creation and destruction of links, and the connectivity of neurons is influenced by their utilization.

    The network structure of many complex networks is well understood since the work of Barabási and Albert [16], who showed that the network topology arising in these real-world networks is a consequence of two generic mechanisms: growth and preferential attachment. Subsequently many studies have underlined the universality of this network topology, confirming its relevance. However, to understand the behavior of complex systems, it is not enough to understand the underlying network structure. To quote Barabási [15], "To make progress in this direction, we need to tackle the next frontier, which is to understand the dynamics of the processes that take place on networks."

    Indeed, we argue that the only way to truly understand dynamical processes on networks is to consider them together with the network evolution dynamics. In the past decade there have been many studies on processes on networks [17], e.g., epidemic spreading [45], evolutionary games [44], and information cascades [54]. However, all of these considered the network as fixed, and then studied the process of interest on this static graph. This static viewpoint hides the fact that the networks and the processes on them coevolve. Although the study of such coevolution was initiated over a decade ago [53], only recently is it starting to be explored in greater depth (see [29, 33] and references therein), and thus many questions still remain. In particular, in the context of product adoption on networks, there is yet no clear explanation of the phenomenon of coexistence of competing products.

    University of California, Los Angeles; [email protected]
    University of California, Berkeley and Weizmann Institute of Science; [email protected]; supported by NSF grant DMS 1106999 and by DOD ONR grant N000141110140.
    University of California, Berkeley; [email protected]; supported by a UC Berkeley Graduate Fellowship, by NSF grant DMS 1106999 and by DOD ONR grant N000141110140.

    arXiv:1307.2893v1 [physics.soc-ph] 10 Jul 2013

    Our main contribution is to identify a simple model which couples the growth of a network and node feature dynamics; in particular, we focus on type adoption dynamics, where each node has a single type from a finite set of types. When a new node joins the network, both its connections to the existing nodes and its type are influenced by the current structure of the network. As a particular instance of such a general model, we consider the dynamics where the new node chooses its connections according to linear preferential attachment [16], and then chooses its type based on how many of its neighbors are of a certain type; see Figure 1 for an illustration.

    Figure 1: Illustration of our model. Each node in the initial graph has a type/color from a finite set of types/colors. At each time step a new node is added to the graph and connected to $m$ existing nodes according to linear preferential attachment (here $m = 5$). When the new node joins the graph it also adopts a type/color: it picks its type/color according to a probability distribution which depends on the types/colors of its initial neighbors. See Section 1.1 for details.

    Our model is of interest in many cases where preferential attachment is a good representation of the evolution of the network structure and where competition between types is a natural process. In particular, these include models of product adoption via word-of-mouth recommendations on social networks, such as a new cell-phone user choosing a cell-phone provider/package/device based on her friends' decisions, a new student choosing a laptop, or a new athletic team member choosing a gear provider.

    A key feature of our model is the elegance and simplicity of its analysis. We explicitly calculate the possible limiting ratios of the types. An interesting feature of our results is that for many settings of the parameters of the model, none of the types dominate (see Figure 4), which matches empirical observations in many current markets. Our results thus provide a theoretical understanding of coexistence of types in preferential attachment networks. They should be compared to results on other models of competition on scale-free networks where coexistence is rarely achieved, and typically the winner takes all [50, 23].

    We next describe our model and our results in more detail, followed by a discussion of related work.

    1.1 Model

    For simplicity, we describe our model in the case of two types, which we refer to as red and blue colors. In the following, we use the terms type and color interchangeably. Our model naturally generalizes to any number of types, see Section 3 for a description and results. The main feature of the model is that it incorporates and couples two processes: a network growing process and a type adoption process.

    We consider the standard linear preferential attachment model [16] as the network growing process in our model. Starting from an initial graph $G_0$, at each time step an additional node $v$ is added to the graph, together with $m$ edges connecting $v$ to existing nodes in the graph. Each edge is chosen independently, and according to linear preferential attachment, i.e., the probability that a given edge connects $v$ to a given existing node $u$ is proportional to the degree of $u$.

    The type adoption process on the network is as follows. All nodes in the initial graph $G_0$ start with a type, i.e., they are either red or blue. Each additional node $v$ receives a color when it is added to the graph, and this color depends on the colors of the nodes it connects to when it is added. Suppose that out of the $m$ edges connecting the new node $v$ to existing nodes exactly $k$ connect to a red node. Then, conditioned on this event, $v$ becomes red with probability $p_k$ and blue with probability $1 - p_k$. The probabilities $p_k \in [0, 1]$, $0 \le k \le m$, are parameters of the model. See Figure 1 for an illustration.

    The parameters $\{p_k\}_{0 \le k \le m}$ allow us to model different kinds of behavior. A natural choice is the linear model, when $p_k = k/m$ for all $k$. However, nonlinear models, when $p_k \ne k/m$ for some $k$, can capture a wide range of other types of behavior. In particular, they can capture diminishing and increasing returns, and even more complex behavior that combines these.
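The two coupled processes described above can be sketched in a few lines of Python. This is an illustrative simulation only, not code from the paper: the function name `simulate_graph_model`, the edge-endpoint list used to sample proportionally to degree, and the tiny two-node initial graph are all assumptions made for the example.

```python
import random

def simulate_graph_model(m, p, initial_colors, initial_edges, steps, seed=0):
    """Grow a two-type preferential attachment graph by `steps` new nodes.

    p[k] is the probability that a new node turns red when exactly k of its
    m edges lead to red nodes. Colors are booleans: True = red, False = blue.
    """
    rng = random.Random(seed)
    colors = list(initial_colors)
    # Every edge contributes both of its endpoints, so a uniform draw from
    # this list picks an existing node with probability proportional to degree.
    endpoints = [node for edge in initial_edges for node in edge]
    for _ in range(steps):
        # The m edges are drawn independently from the current graph.
        targets = [rng.choice(endpoints) for _ in range(m)]
        k = sum(colors[u] for u in targets)      # number of red neighbors
        new_node = len(colors)
        colors.append(rng.random() < p[k])       # red with probability p_k
        for u in targets:
            endpoints.extend((new_node, u))      # update degrees afterwards
    return colors

m = 5
linear = [k / m for k in range(m + 1)]                        # p_k = k/m
majority = [1.0 if k > m / 2 else 0.0 for k in range(m + 1)]  # a nonlinear choice
colors = simulate_graph_model(m, linear, [True, False], [(0, 1)], steps=2000)
red_fraction = sum(colors) / len(colors)
```

Swapping `linear` for `majority` (or any other list of $m + 1$ probabilities) changes only the type adoption rule; the network growth step is untouched, which mirrors how the model separates the two processes.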

    1.2 Results

    We are interested in the fraction of nodes of each type; this corresponds to the fraction of users using a given company's product, or in other words, the company's market share. Our main results characterize the possible limiting fractions of each color in the case of two colors. These results thus provide a complete phase diagram of the asymptotic behavior of the process as the size of the network goes to infinity; see Figure 4 for an illustration.

    To describe our results we introduce some notation. Let $G_n$ denote the graph when $n$ nodes have been added to the initial graph $G_0$. Let $A_n$ and $B_n$ denote the number of red and blue nodes, respectively, in $G_n$, and let $a_n := \frac{A_n}{A_n + B_n}$ and $b_n := \frac{B_n}{A_n + B_n}$ denote the corresponding normalized fractions. Furthermore, let $X_n$ (resp., $Y_n$) denote the sum of the degrees of red (resp., blue) nodes in $G_n$, and let $x_n := \frac{X_n}{X_n + Y_n}$ and $y_n := \frac{Y_n}{X_n + Y_n}$ denote the normalized fractions. We are primarily interested in the asymptotic proportion of red and blue nodes, i.e., in the limits $\lim_{n \to \infty} a_n$ and $\lim_{n \to \infty} b_n = 1 - \lim_{n \to \infty} a_n$.

    As we shall see, a key role in the asymptotic behavior of the process is played by the polynomial

    \[ P(z) = \frac{1}{2} \sum_{k=0}^{m} \binom{m}{k} z^k (1-z)^{m-k} \left( p_k - \frac{k}{m} \right), \tag{1} \]

    and in particular its zero set, denoted by $Z_P := \{z \in [0, 1] : P(z) = 0\}$. This is because, as we will see, $\{a_n\}_{n \ge 0}$ behaves approximately like a stochastic version of the ODE $dz/dt = P(z)$, and thus intuitively the trajectory of $\{a_n\}_{n \ge 0}$ should approximate the trajectory $\{z(t)\}_{t \ge 0}$ of this ODE.
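To make the role of the polynomial concrete, the following sketch evaluates $P$ and scans $(0, 1)$ for sign changes, labelling each as a stable (plus to minus) or unstable (minus to plus) rest point of the ODE $dz/dt = P(z)$. The helper names `P` and `sign_changes` and the grid-scan approach are assumptions chosen for illustration; a grid scan cannot detect zeros where $P$ does not change sign, and it does not examine the endpoints 0 and 1.

```python
from math import comb

def P(z, p):
    """Evaluate the polynomial in (1) for parameters p = [p_0, ..., p_m]."""
    m = len(p) - 1
    return 0.5 * sum(
        comb(m, k) * z**k * (1 - z) ** (m - k) * (p[k] - k / m)
        for k in range(m + 1)
    )

def sign_changes(p, grid=2000):
    """Approximate the interior zeros of P at which its sign changes.

    Returns (z, kind) pairs, where kind is "stable" if P goes from
    positive to negative and "unstable" if it goes from negative to positive.
    """
    found = []
    # Cell midpoints: with an even grid size, z = 1/2 is never sampled exactly.
    zs = [(i + 0.5) / grid for i in range(grid)]
    for left, right in zip(zs, zs[1:]):
        a, b = P(left, p), P(right, p)
        if a > 0 > b:
            found.append(((left + right) / 2, "stable"))
        elif a < 0 < b:
            found.append(((left + right) / 2, "unstable"))
    return found

majority = [0, 0, 0, 1, 1, 1]          # p_k = 1 if k > m/2, here m = 5
print(sign_changes(majority))          # one interior sign change, near z = 1/2
```

For the majority rule the only interior sign change is the unstable one at $1/2$; the zeros at the endpoints 0 and 1 have to be checked separately from the sign of $P$ just inside the interval.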

    The following two theorems confirm this intuition. There is an important distinction between the linear model and nonlinear models, which is due to the fact that in the linear model the polynomial $P$ is identically zero and thus $Z_P = [0, 1]$, while in nonlinear models the zero set $Z_P$ is a finite set.

    Theorem 1.1 (Linear model). Suppose that $p_k = k/m$ for all $0 \le k \le m$, and that $X_0, Y_0 > 0$. Then $a_n$ converges almost surely; furthermore, the limiting distribution has full support on $[0, 1]$ and no atoms, and depends only on $X_0$, $Y_0$, and $m$.

    See Figure 2 for empirical histograms in the linear model with various initial parameters and various values of $m$.

    Theorem 1.2 (Nonlinear models). Suppose that $p_k \ne k/m$ for some $0 \le k \le m$, and that $X_0, Y_0 > 0$. Then $a_n$ converges almost surely; furthermore, the limit is a point in the finite set $Z_P$.

    In nonlinear models we thus know that the asymptotic proportion of red nodes is contained in the finite zero set $Z_P$. But which points $z \in Z_P$ arise as the limiting proportion with positive probability? This depends on the behavior of the polynomial around the zero $z \in Z_P$. Intuitively, since $\{a_n\}_{n \ge 0}$ is a stochastic system, we expect that stable trajectories of the ODE $dz/dt = P(z)$ should appear, but unstable trajectories should not. This intuition is confirmed and formalized in the following three theorems.

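    Theorem 1.3 (Nonlinear models, stable equilibria). Suppose that $p_k \ne k/m$ for some $0 \le k \le m$, and that $X_0, Y_0 > 0$. Suppose $z \in Z_P \cap (0, 1)$ is such that there exists an $\varepsilon > 0$ such that $P > 0$ on $(z - \varepsilon, z)$ and $P < 0$ on $(z, z + \varepsilon)$. Then $\mathbb{P}(\lim_{n \to \infty} a_n = z) > 0$, i.e., $a_n$ converges to $z$ with positive probability. Similarly, if $0 \in Z_P$ and $P < 0$ on $(0, \varepsilon)$, or if $1 \in Z_P$ and $P > 0$ on $(1 - \varepsilon, 1)$, then there is a positive probability of convergence of $a_n$ to 0 or 1, respectively.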


    (a) $A_0 = B_0 = 1$, $X_0 = Y_0 = 1$. (b) $A_0 = B_0 = 2$, $X_0 = Y_0 = 4$. (c) $A_0 = B_0 = 3$, $X_0 = Y_0 = 9$.

    (d) $A_0 = 1$, $B_0 = 2$, $X_0 = 1$, $Y_0 = 3$. (e) $A_0 = 1$, $B_0 = 4$, $X_0 = 1$, $Y_0 = 11$. (f) $A_0 = 2$, $B_0 = 3$, $X_0 = 4$, $Y_0 = 8$.

    Figure 2: Empirical histograms of $a_n$ in the linear model for $n = 10^5$, from $2 \times 10^5$ simulations. Each subfigure has different initial parameters (see subcaptions), and in each case empirical histograms for ten different values of $m$ are plotted. See Fig. 2e for the key to all plots.

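    Theorem 1.4 (Nonlinear models, unstable equilibria). Suppose that $p_k \ne k/m$ for some $0 \le k \le m$, and that $X_0, Y_0 > 0$. Suppose $z \in Z_P \cap (0, 1)$ is such that there exists an $\varepsilon > 0$ such that $P < 0$ on $(z - \varepsilon, z)$ and $P > 0$ on $(z, z + \varepsilon)$. Then $\mathbb{P}(\lim_{n \to \infty} a_n = z) = 0$. Similarly, if $0 \in Z_P$ and $P > 0$ on $(0, \varepsilon)$, or if $1 \in Z_P$ and $P < 0$ on $(1 - \varepsilon, 1)$, then $a_n$ converges to 0 or 1, respectively, with probability zero.

    Theorem 1.5 (Nonlinear models, touchpoints). Suppose that $p_k \ne k/m$ for some $0 \le k \le m$, and that $X_0, Y_0 > 0$. Suppose $z \in Z_P \cap (0, 1)$ is such that there exists an $\varepsilon > 0$ such that $P$ is either strictly positive or strictly negative on the union of the intervals $(z - \varepsilon, z)$ and $(z, z + \varepsilon)$. Then $\mathbb{P}(\lim_{n \to \infty} a_n = z) > 0$.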

    See Figure 3 for an illustration of the polynomial $P$ for various values of the parameters $\{p_k\}_{0 \le k \le m}$, and what the various limiting proportions can be in each case.

    The theorems above provide a complete phase diagram of the asymptotic behavior of the process in the case of two types. To illustrate this, see Figure 4, which shows phase diagrams for $m = 3$ and $m = 4$ when there is no bias towards either color, i.e., when $p_k + p_{m-k} = 1$ for all $0 \le k \le m$. This condition implies that $P(z) = -P(1 - z)$ and so $1/2 \in Z_P$, but $1/2$ need not be a limit point (see Fig. 3).

    Coexistence. In particular, the theorems above show that in many cases the two colors coexist in the limit. Indeed, since $P(0) = \frac{1}{2} p_0$ and $P(1) = \frac{1}{2}(p_m - 1)$, $p_0 = 0$ or $p_m = 1$ is necessary for one of the colors to asymptotically take over the network. Whenever $p_0 > 0$ and $p_m$ [...]


    Figure 3: Examples of the polynomial $P$ and possible limiting proportions. In each case there is no bias towards either color, i.e., $p_k + p_{m-k} = 1$ for all $0 \le k \le m$. (a) Majority choice: $p_k = 1$ if $k > m/2$ and $p_k = 0$ otherwise ($m = 5$ in the figure). The possible limits are 0 and 1, i.e., the winner takes all. (b) The parameters here are: $m = 9$, $p_5 = p_6 = 0.5$, $p_7 = p_8 = p_9 = 0.95$. Such an example is plausible if the strength of the signal from the neighbors matters: if $3 \le k \le 6$, then the signal towards either color is weak, so just flip a fair coin to choose, but if $0 \le k \le 2$ or $7 \le k \le 9$ then there is a strong signal towards one of the colors, so pick that color with probability close to 1. In this example the possible limits are: $z_1 \approx 0.055$, $z_2 = 0.5$, $z_3 \approx 0.945$, and there are also two zeros of $P$ which cannot be limits. (c) This is an example where $P$ has touchpoints. The parameters are: $m = 6$, $p_4 = 1031/1710$, $p_5 = p_6 = 35/38$, and the two touchpoints are at $z_1 = 1/4$ and $z_2 = 3/4$. Both of these, as well as $z_3 = 1/2$, can be limits.
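The touchpoint example in panel (c) can be checked in exact rational arithmetic. The sketch below is an illustration rather than part of the paper: the helper name `P_exact` is an assumption, and the values $p_0, \dots, p_3$ are not listed in the caption but are filled in from the stated no-bias condition $p_k + p_{m-k} = 1$.

```python
from fractions import Fraction
from math import comb

def P_exact(z, p):
    """Evaluate the polynomial in (1) exactly, for Fraction inputs."""
    m = len(p) - 1
    return Fraction(1, 2) * sum(
        comb(m, k) * z**k * (1 - z) ** (m - k) * (p[k] - Fraction(k, m))
        for k in range(m + 1)
    )

# Parameters of Figure 3(c) with m = 6.
upper = {4: Fraction(1031, 1710), 5: Fraction(35, 38), 6: Fraction(35, 38)}
p = [None] * 7
for k, value in upper.items():
    p[k] = value
    p[6 - k] = 1 - value        # no-bias condition p_k + p_{m-k} = 1
p[3] = Fraction(1, 2)           # forced by p_3 + p_3 = 1

quarter = Fraction(1, 4)
delta = Fraction(1, 100)
print(P_exact(quarter, p))                  # exactly 0: z = 1/4 is a zero of P
print(P_exact(quarter - delta, p) > 0,      # P is positive just left of 1/4 ...
      P_exact(quarter + delta, p) > 0)      # ... and just right of it
```

Since $P$ vanishes at $1/4$ but has the same sign on both sides, the zero is a touchpoint in the sense of the touchpoint theorem rather than a stable or unstable equilibrium.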

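    (a) $m = 3$ (b) $m = 4$

    Figure 4: Phase diagrams when there is no bias towards either color/type, i.e., when $p_k + p_{m-k} = 1$ for all $0 \le k \le m$. Let $q_k := p_k - k/m$. (a) If $q_2 < 0$, or if $q_2 + q_3 \le 0$ and $q_3 < 0$, then the only possible limit of $a_n$ is $1/2$. If $q_2 + q_3 > 0$ and $q_2 > 0$, then let $\delta = -q_3/q_2 \in [0, 1)$; the possible limits of $a_n$ are then $\frac{1}{2} \pm \frac{1}{2} \sqrt{\frac{3 - 3\delta}{3 + \delta}}$. In particular, when $\delta = 0$ then the winner takes all, and if $\delta \in (0, 1)$, then the two types coexist in the limit. (b) This is similar to (a). Here, if $2 q_3 + q_4 > 0$ and $q_3 > 0$, then let $\delta = -q_4/q_3 \in [0, 2)$; the possible limits of $a_n$ are then $\frac{1}{2} \pm \frac{1}{2} \sqrt{\frac{2 - \delta}{2 + \delta}}$.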
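As a numerical cross-check of the $m = 3$ coexistence formula, the sketch below computes the two limits $\frac{1}{2} \pm \frac{1}{2}\sqrt{(3 - 3\delta)/(3 + \delta)}$ with $\delta = -q_3/q_2$ and confirms that the polynomial (1) vanishes there. The function names and the sample values $p_2 = 0.8$, $p_3 = 0.9$ are assumptions chosen for illustration, and the function deliberately covers only the regime $q_2 > 0$, $q_2 + q_3 > 0$.

```python
from math import comb, sqrt

def P(z, p):
    """Evaluate the polynomial in (1) for parameters p = [p_0, ..., p_m]."""
    m = len(p) - 1
    return 0.5 * sum(
        comb(m, k) * z**k * (1 - z) ** (m - k) * (p[k] - k / m)
        for k in range(m + 1)
    )

def coexistence_limits_m3(p2, p3):
    """Limits of a_n for m = 3 with no bias, in the regime q2 > 0, q2 + q3 > 0."""
    q2, q3 = p2 - 2 / 3, p3 - 1.0
    if not (q2 > 0 and q2 + q3 > 0):
        raise ValueError("formula applies only when q2 > 0 and q2 + q3 > 0")
    delta = -q3 / q2
    spread = 0.5 * sqrt((3 - 3 * delta) / (3 + delta))
    return 0.5 - spread, 0.5 + spread

p2, p3 = 0.8, 0.9
p = [1 - p3, 1 - p2, p2, p3]              # no bias: p_k + p_{3-k} = 1
low, high = coexistence_limits_m3(p2, p3)
print(low, high, P(low, p), P(high, p))   # both P values are zero up to rounding
```

With $p_3 = 1$ the parameter $\delta$ is zero and the two limits collapse to 0 and 1, which is the winner-takes-all case described in the caption.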

    In marketing, competing companies fight for customers. In essence, our model describes word-of-mouth recommendations, and thus it should be compared to other models which study the effect of such personal recommendations. A related model of word-of-mouth learning was studied by Banerjee and Fudenberg [14], where successive generations of agents make choices between two alternatives, with new agents sampling the choices of old ones. However, they considered the limit of a continuum of agents with no network structure, in contrast to our setup, where this is explicitly modeled. Furthermore, they assume that one of the two alternatives is ex-ante better than the other, and focus on whether or not the agents can learn this via word-of-mouth communication. See also [26, 27].


    The power of word-of-mouth has been a widely studied topic in the past half century, with research confirming the strong influence of word-of-mouth communication on consumer behavior [25, 4, 30, 22, 28]. This research generally supports the assertion that word-of-mouth is more influential than external marketing efforts, such as advertising. In the current information age, online feedback mechanisms have changed the way customers share opinions about products and services [24], and online social networks are being exploited for viral marketing purposes [37]. Nevertheless, traditional word-of-mouth recommendation networks still have a very important effect, and companies are advised to take advantage of this through their marketing efforts, e.g., via facilitating referrals [35, 36]. Due to the ever-changing ways individuals interact, it is important to analyze models, such as the one introduced in this paper, that study the interplay between how individuals interact and the effects of word-of-mouth communication in the given setting.

    In epidemiology, pathogens fight for survival, and a central topic is the spread of diseases [13, 2]. In classic models of epidemic spreading, individuals are characterized by the stage of the disease in them: they can be susceptible, infected, or recovered/removed, leading to the SIR, SIRS and SIS models. The main object of study is the epidemic threshold, i.e., under what conditions does the disease die out or take over the population. An important finding is that the network structure underlying the population of individuals greatly affects the epidemic threshold; in particular, on scale-free networks the epidemic threshold vanishes, and diseases can spread even when infection probabilities are tiny [45, 39, 40, 42].

    Another large area of epidemiology studies conditions under which multiple strains of a pathogen can coexist (see, e.g., [38] and references therein), while the physics community has been studying the effects of the underlying network on competing epidemics [43, 1, 34].

    This research in epidemiology is relevant in a much broader context, since many dynamical processes, such as the diffusion of information and opinions, can be modeled as epidemics. Indeed, the spread of competing products has been modeled in this way as well [50, 21]. In [50] the authors study an $SI_1I_2S$ model of competing viruses with perfect mutual immunity in a mean-field setting for fixed networks, and conclude that the winner takes all, i.e., one virus will take over, while in [21] they study what level of partial immunity allows for coexistence of the two viruses. A related model of competing first passage percolation has been studied in probability theory on various network topologies, including random regular graphs [3] and scale-free networks [23]; the conclusion again is that the winner takes all. In contrast, in many current markets we observe that competing products coexist, even when they are mutually exclusive.

    Perhaps closest to our paper is the work of Arthur and collaborators in economics [6, 8]. The central viewpoint of their research is that many economic systems are constantly evolving as opposed to being in static equilibrium [9]; our model is in line with this out-of-equilibrium viewpoint. In particular, they study several economic systems involving positive feedback due to increasing returns, such as the evolution of technology choice [5] and industry locations [7]. The behavior of these systems shares many features with our model: multiple possible long-run states, unpredictability due to stochasticity, lock-in, path dependence, and symmetry breaking. There are also technical commonalities: nonlinear Pólya urn processes feature in these [10, 11, 12] as well as in our model. However, there are also many different features in our model. Chief among these is that we also explicitly model the network underlying the agents. We argue that the inclusion of this extra layer is important and deserves further study in such out-of-equilibrium models.

    1.4 Outline of paper

    First, in Section 2 we prove the results described in Section 1.2. Then in Section 3 we study the case of three or more types, and finally we conclude with open questions and directions for future research in Section 4.

    2 Proofs

    This section contains the proofs of our main results described in Section 1.2, and is structured as follows. First, in Section 2.1 we show how the asymptotic behavior of $\{a_n\}_{n \ge 0}$ is the same as that of the sum-of-degrees process $\{x_n\}_{n \ge 0}$, which is more convenient to study, as it is a Markov process. Then in Section 2.2 we study the linear model and prove Theorem 1.1. Next in Section 2.3 we recall results from the theory of stochastic approximation processes, and finally in Section 2.4 we prove our results concerning nonlinear models.


    2.1 Reduction to the sum-of-degrees process

    To understand the process $\{A_n\}_{n \ge 0}$ (and thus the normalized process $\{a_n\}_{n \ge 0}$), it is more convenient to study the time evolution of the sum of the degrees of each type. The process $\{A_n\}_{n \ge 0}$ is not a Markov process, and therefore we study the joint process $\{(A_n, X_n)\}_{n \ge 0}$, which is indeed Markov. It evolves as follows. Given $(A_n, X_n)$, $u_{n+1}$ is drawn from the binomial distribution with parameters $m$ and $x_n$. Subsequently, $I_{n+1}$ is drawn from the Bernoulli distribution with parameter $p_{u_{n+1}}$. We then have

    \[ A_{n+1} = A_n + I_{n+1}, \tag{2} \]

    \[ X_{n+1} = X_n + u_{n+1} + m I_{n+1}. \tag{3} \]

    The following lemma tells us that in order to understand the asymptotic behavior of $\{a_n\}_{n \ge 0}$, it is enough to understand the asymptotic behavior of $\{x_n\}_{n \ge 0}$. Consequently, in the following we analyze the latter, as this is a Markov process.
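The recursion (2)-(3) can be simulated directly, without keeping the graph in memory. The following is a minimal sketch under stated assumptions: the function name `simulate_chain` and its argument layout are invented for this example, and the binomial draw is implemented as a sum of $m$ Bernoulli trials to stay within the Python standard library.

```python
import random

def simulate_chain(m, p, A0, B0, X0, Y0, steps, seed=0):
    """Run the Markov chain (A_n, X_n) of (2)-(3); return (a_n, x_n) at the end."""
    rng = random.Random(seed)
    A, X = A0, X0
    S = X0 + Y0                                       # total degree X_n + Y_n
    for _ in range(steps):
        x = X / S
        u = sum(rng.random() < x for _ in range(m))   # u ~ Binomial(m, x_n)
        I = 1 if rng.random() < p[u] else 0           # I ~ Bernoulli(p_u)
        A += I                                        # equation (2)
        X += u + m * I                                # equation (3)
        S += 2 * m                                    # each new node adds m edges
    return A / (A0 + B0 + steps), X / S

m = 4
linear = [k / m for k in range(m + 1)]
a_n, x_n = simulate_chain(m, linear, A0=1, B0=1, X0=1, Y0=1, steps=100_000)
print(a_n, x_n)   # the two fractions should be close for large n
```

Running this with different seeds in the linear model gives a spread of limiting values, while nonlinear parameter choices should concentrate the output near the zeros of $P$.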

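    Lemma 2.1. Suppose $\{x_n\}_{n \ge 0}$ converges a.s. and let $x := \lim_{n \to \infty} x_n$ denote the limit. If $P(x) = 0$ a.s., then $\{a_n\}_{n \ge 0}$ converges a.s. as well, and $\lim_{n \to \infty} a_n = x$ a.s.

    Proof. Let $\mathcal{F}_n$ denote the filtration of the process until time $n$. Given $\mathcal{F}_n$, the probability that the node added at time $n + 1$ is red is

    \[ \mathbb{P}(A_{n+1} - A_n = 1 \mid \mathcal{F}_n) = \sum_{k=0}^{m} \binom{m}{k} x_n^k (1 - x_n)^{m-k} p_k = x_n + \sum_{k=0}^{m} \binom{m}{k} x_n^k (1 - x_n)^{m-k} q_k = x_n + 2P(x_n) =: f(x_n), \]

    where $q_k = p_k - k/m$. Thus $\mathbb{E}(A_{n+1} - A_n \mid \mathcal{F}_n) = f(x_n)$. Define $M_n = A_n - A_0 - \sum_{i=0}^{n-1} f(x_i)$, with initial condition $M_0 = 0$. The previous calculation tells us that $\{M_n\}_{n \ge 0}$ is a martingale with respect to the filtration $\mathcal{F}_n$. Moreover, this martingale has bounded increments since $M_{i+1} - M_i = A_{i+1} - A_i - f(x_i) \in [-1, 1]$, and thus $\lim_{n \to \infty} M_n/n = 0$ a.s.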
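The identity $f(x) = x + 2P(x)$ used in the proof rests on one fact that is easy to overlook: a binomial random variable with parameters $m$ and $x$ has mean $mx$. Spelled out, with $q_k = p_k - k/m$ as above:

```latex
\[
  \sum_{k=0}^{m} \binom{m}{k} x^k (1-x)^{m-k} \, \frac{k}{m}
  = \frac{1}{m} \, \mathbb{E}\bigl[\mathrm{Bin}(m, x)\bigr]
  = x ,
\]
\[
  \sum_{k=0}^{m} \binom{m}{k} x^k (1-x)^{m-k} \, p_k
  = x + \sum_{k=0}^{m} \binom{m}{k} x^k (1-x)^{m-k} \, q_k
  = x + 2P(x) .
\]
```

The last equality is just the definition (1) of $P$, whose prefactor $\frac{1}{2}$ accounts for the factor 2.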

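    Let $x := \lim_{n \to \infty} x_n$. Since $P(x) = 0$, we have $f(x) = x$. Since $f$ is continuous, we have $\lim_{n \to \infty} f(x_n) = f(x) = x$ a.s., and thus the Cesàro mean of the sequence $\{f(x_n)\}_{n \ge 0}$ also converges to the same limit: $\lim_{n \to \infty} \frac{1}{n} \sum_{i=0}^{n-1} f(x_i) = x$ a.s. The claim then follows from the fact that $\frac{M_n}{n} = \frac{A_n}{n} - \frac{A_0}{n} - \frac{1}{n} \sum_{i=0}^{n-1} f(x_i)$ and $\lim_{n \to \infty} \left( a_n - \frac{A_n}{n} \right) = 0$.

    2.2 Linear model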

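    Proof of Theorem 1.1. In the linear model, when $p_k = \frac{k}{m}$ for all $k = 0, 1, \dots, m$, we have that $P \equiv 0$, and thus $\mathbb{E}(X_{n+1} - X_n \mid \mathcal{F}_n) = 2m x_n$. Write $S_n := X_n + Y_n = S_0 + 2mn$ for the sum of all degrees in $G_n$. Since $x_{n+1} - x_n = \frac{X_{n+1} - X_n - 2m x_n}{S_{n+1}}$, it follows that $\mathbb{E}(x_{n+1} - x_n \mid \mathcal{F}_n) = 0$, i.e., $\{x_n\}_{n \ge 0}$ is a martingale. Since it is also bounded, it converges almost surely. Lemma 2.1 then implies that $\{a_n\}_{n \ge 0}$ converges a.s. as well, and $\lim_{n \to \infty} a_n = \lim_{n \to \infty} x_n$ a.s.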
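The increment formula for $x_n$ used above follows from a short computation. Here $S_n := X_n + Y_n$ denotes the total degree, which grows by exactly $2m$ per step since each new node brings $m$ edges; the symbol $S_n$ is used in this sense later in the proofs.

```latex
\[
  x_{n+1} - x_n
  = \frac{X_{n+1}}{S_{n+1}} - x_n
  = \frac{X_{n+1} - x_n S_{n+1}}{S_{n+1}}
  = \frac{X_{n+1} - x_n (S_n + 2m)}{S_{n+1}}
  = \frac{X_{n+1} - X_n - 2m x_n}{S_{n+1}} ,
\]
```

where the last step uses $x_n S_n = X_n$. Taking conditional expectations and substituting $\mathbb{E}(X_{n+1} - X_n \mid \mathcal{F}_n) = 2m x_n$ gives the martingale property in the linear model.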

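    We use a variance argument to show that the distribution of $x := \lim_{n \to \infty} x_n$ has full support on $[0, 1]$. First note that

    \[ (x_{n+1} - x_n)^2 = \left( \frac{X_{n+1} - X_n - 2m x_n}{S_0 + 2m(n+1)} \right)^2 \le \frac{1}{(n+1)^2}, \]

    and consequently for any $n_0$ we have

    \[ \mathbb{E}\left( (x - x_{n_0})^2 \mid \mathcal{F}_{n_0} \right) = \lim_{n \to \infty} \mathbb{E}\left( (x_n - x_{n_0})^2 \mid \mathcal{F}_{n_0} \right) = \sum_{j=n_0}^{\infty} \mathbb{E}\left( (x_{j+1} - x_j)^2 \mid \mathcal{F}_{n_0} \right) \le \sum_{j=n_0}^{\infty} \frac{1}{(j+1)^2} \le \frac{1}{n_0}. \tag{4} \]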

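    Now let $(r, r + \varepsilon) \subset (0, 1)$ be any fixed interval. Our goal is to show that $\mathbb{P}(x \in (r, r + \varepsilon)) > 0$. Let $n_0$ be an integer such that $n_0 \ge 18 \varepsilon^{-2}$ and $\mathbb{P}\left( x_{n_0} \in \left( r + \frac{\varepsilon}{3}, r + \frac{2\varepsilon}{3} \right) \right) > 0$ (this is possible since for large enough $n_0$ there exists a sequence of events such that $x_{n_0} \in \left( r + \frac{\varepsilon}{3}, r + \frac{2\varepsilon}{3} \right)$). Now condition on this event; (4) implies that

    \[ \mathbb{E}\left( (x - x_{n_0})^2 \,\middle|\, x_{n_0} \in \left( r + \tfrac{\varepsilon}{3}, r + \tfrac{2\varepsilon}{3} \right) \right) \le \frac{1}{n_0} \le \frac{\varepsilon^2}{18}, \]

    which in turn implies that $\mathbb{P}\left( |x - x_{n_0}| \le \frac{\varepsilon}{3} \,\middle|\, x_{n_0} \in \left( r + \frac{\varepsilon}{3}, r + \frac{2\varepsilon}{3} \right) \right) \ge \frac{1}{2}$. We can conclude that

    \[ \mathbb{P}(x \in (r, r + \varepsilon)) \ge \mathbb{P}\left( |x - x_{n_0}| \le \tfrac{\varepsilon}{3} \,\middle|\, x_{n_0} \in \left( r + \tfrac{\varepsilon}{3}, r + \tfrac{2\varepsilon}{3} \right) \right) \mathbb{P}\left( x_{n_0} \in \left( r + \tfrac{\varepsilon}{3}, r + \tfrac{2\varepsilon}{3} \right) \right) > 0. \]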

    Finally, showing that the distribution of $x$ has no atoms can be done by adapting arguments by Pemantle [46]. First, let us describe how the process $\{x_n\}_{n \ge 0}$ is related to time-dependent Pólya urn processes that Pemantle studies in [46].

    Time-dependent Pólya urn processes are generalizations of the classical Pólya urn process, where the number of balls added to the urn is allowed to vary with time. Although $\{x_n\}_{n \ge 0}$ is not a time-dependent Pólya urn process, the following slight modification of the preferential attachment process does give a time-dependent Pólya urn process. When adding a new node $v$ to the graph $G_n = (V_n, E_n)$, add its $m$ neighbors one by one, and after adding each neighbor, update the degree of the neighbor. Let $\tilde{X}_n$ denote the sum of the degrees of red nodes at time $n$ in this modified model. Consider also a time-dependent Pólya urn process $\{Z_n\}_{n \ge 0}$ where at times $t \not\equiv 0 \bmod m$ a single ball is added to the urn, and at times $t \equiv 0 \bmod m$ the number of balls added to the urn is $m + 1$. It can be seen that if $\tilde{X}_0 = Z_0$, then $\tilde{X}_n$ and $Z_{mn}$ have the same distribution. Thus Pemantle's results [46, Theorem 3, Theorem 4] apply directly and show that the distribution of $\lim_{n \to \infty} \tilde{x}_n$ (this limit exists a.s.) has no atoms.

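    Since our setting is close to Pemantle's original setting, we only sketch the proof that the distribution of $x$ has no atoms, and leave the details to the reader.

    To show that the distribution of $x$ has no atoms on $(0, 1)$, we can adapt the variance arguments of [46, Theorem 3]. Fix $r \in (0, 1)$. Suppose on the contrary that $\mathbb{P}(x = r) > 0$. Then for every $\varepsilon > 0$ there exists $n_0$ and some event $A \in \mathcal{F}_{n_0}$ having positive probability such that $\mathbb{P}(x_n \to r \mid A) \ge 1 - \varepsilon$; in fact, $n_0$ can be as large as desired. Define $c := \frac{(r(1-r))^{m/2}}{10 \cdot 2^{m/2}}$ and let $N = \max\left\{ \frac{S_0}{m}, \frac{2c^2}{\min\{r, 1-r\}} \right\}$. One can then show, via variance arguments, the following two inequalities. First, for every $n \ge N$,

    \[ \mathbb{P}\left( \sup_{k \ge n} |x_k - r| \ge \frac{c}{\sqrt{n}} \,\middle|\, \mathcal{F}_n \right) \ge \frac{1}{2}. \]

    Second, defining $B = \left\{ |x_n - r| \ge \frac{c}{\sqrt{n}} \right\}$, we have that for every $n \ge N$,

    \[ \mathbb{P}\left( \inf_{k \ge n} |x_k - r| \ge \frac{c}{2\sqrt{n}} \,\middle|\, \mathcal{F}_n, B \right) \ge \frac{c^2}{16}. \]

    Putting these together we have that for every $n \ge N$, the probability given $\mathcal{F}_n$ is at least $\frac{c^2}{32}$ that some $x_{n+k}$ will be at least $\frac{c}{\sqrt{n}}$ away from $r$ and no subsequent $x_{n+k+\ell}$ will ever return to the interval $\left( r - \frac{c}{2\sqrt{n}}, r + \frac{c}{2\sqrt{n}} \right)$.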

    This contradicts our initial assumption and so $\mathbb{P}(x = r) = 0$.

    To show that the distribution of $x$ has no atoms at 0 and 1, we can adapt the arguments of [46, Theorem 4]. The main idea is a domination argument. Let $\{v_n\}_{n \ge 0}$ be the Pólya urn process where at each time step $2m$ balls are added to the urn, and let $v_0 = x_0$. Then the distribution of $x_n$ can be dominated by the distribution of $v_n$, in the sense that $\mathbb{E}(h(x_n)) \le \mathbb{E}(h(v_n))$ for every continuous bounded convex function $h$. In other words, $x_n$ is smaller than $v_n$ in the convex order [52]. Since the limiting distribution of $\{v_n\}_{n \ge 0}$ is a beta distribution, which does not have an atom at zero, one can then take $h(x) := \max\{0, 2 - x/\varepsilon\}$ and let $\varepsilon \to 0$ to conclude that the distribution of $x$ cannot have an atom at zero either. We refer the reader to [46, Theorem 4] for more details. See also the proof of Theorem 1.4 for the endpoints in Section 2.4.

    2.3 Stochastic approximation processes

    The key observation in the analysis of the asymptotic behavior of $\{x_n\}_{n \ge 0}$ is that it is a stochastic approximation process. Stochastic approximation was introduced in 1951 by Robbins and Monro [51], whose goal was to approximate the root of an unknown function via evaluation queries that are necessarily noisy. There has been much follow-up research, see, e.g., the monograph by Nevelson and Hasminskii [41]. The setup of stochastic approximation arises naturally in the study of Pólya urn processes; see the survey [49] for details. In particular, we use results of Hill, Lane and Sudderth [31], who studied generalized (nonlinear) Pólya urn processes, and we also use subsequent refinements by Pemantle [47, 48]. We state the main theorems here and refer to the original papers for more details; see also the survey [49]. Stochastic approximation results in higher dimensions will be discussed in Section 3.

    Let $\{Z_n\}_{n \ge 0}$ be a stochastic process in $\mathbb{R}$ adapted to a filtration $\{\mathcal{F}_n\}$. Suppose that it satisfies

    \[ Z_{n+1} - Z_n = \frac{1}{n} \left( F(Z_n) + \xi_{n+1} + R_n \right), \tag{5} \]

    where $F : \mathbb{R} \to \mathbb{R}$, $\mathbb{E}(\xi_{n+1} \mid \mathcal{F}_n) = 0$, and the remainder terms $R_n \in \mathcal{F}_n$ go to zero and also satisfy $\sum_{n=1}^{\infty} n^{-1} |R_n| < \infty$ almost surely. Such a process is known as a stochastic approximation process.

    Intuitively, trajectories of a stochastic approximation process $\{Z_n\}_{n \ge 0}$ should approximate the trajectories $\{Z(t)\}_{t \ge 0}$ of the corresponding ODE $dZ/dt = F(Z)$. Moreover, since $\{Z_n\}_{n \ge 0}$ is a stochastic system, we expect that stable trajectories of the ODE should appear, but unstable trajectories should not. This intuition is confirmed and formalized in the following statements (quoted from the survey [49]); for proofs and more details see the papers cited above.
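A toy instance of the recursion (5) illustrates the intuition. The sketch below is an assumption-laden example, not taken from the paper or the cited survey: it uses $F(z) = \frac{1}{2} - z$ (a single stable zero at $\frac{1}{2}$), noise drawn uniformly from $[-1, 1]$, and remainder $R_n = 0$; the function name `stochastic_approximation` is likewise invented.

```python
import random

def stochastic_approximation(F, z1, steps, seed=0):
    """Iterate Z_{n+1} = Z_n + (F(Z_n) + xi_{n+1}) / n for n = 1, ..., steps."""
    rng = random.Random(seed)
    z = z1
    for n in range(1, steps + 1):
        xi = rng.uniform(-1.0, 1.0)   # bounded noise with conditional mean zero
        z += (F(z) + xi) / n          # recursion (5) with R_n = 0
    return z

final = stochastic_approximation(lambda z: 0.5 - z, z1=0.9, steps=20_000)
print(final)   # close to the stable zero of F at 1/2
```

For this particular $F$ the iterate after $n$ steps is exactly $\frac{1}{2}$ plus the average of the first $n$ noise terms, so convergence to $\frac{1}{2}$ is just the law of large numbers; the theorems below handle general bounded continuous $F$.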

    Theorem 2.2 (Convergence to the zero set of $F$). Suppose $\{Z_n\}$ is a stochastic approximation process and that $\mathbb{E}\left( \xi_{n+1}^2 \mid \mathcal{F}_n \right) \le K$ for some finite $K$. If $F$ is bounded and continuous, then $Z_n$ converges almost surely to the zero set of $F$.

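    Theorem 2.3 (Convergence to stable equilibria). Suppose $\{Z_n\}$ is a stochastic approximation process with a bounded and continuous $F$, and that $\mathbb{E}\left( \xi_{n+1}^2 \mid \mathcal{F}_n \right) \le K$ for some finite $K$. Suppose there is a point $z$ and an $\varepsilon > 0$ with $F(z) = 0$, $F > 0$ on $(z - \varepsilon, z)$ and $F < 0$ on $(z, z + \varepsilon)$. Then $\mathbb{P}(Z_n \to z) > 0$. Similarly, when $F : [0, 1] \to \mathbb{R}$, if $F(0) = 0$ and $F < 0$ on $(0, \varepsilon)$, or if $F(1) = 0$ and $F > 0$ on $(1 - \varepsilon, 1)$, then there is a positive probability of convergence to 0 or 1, respectively.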

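    Theorem 2.4 (Nonconvergence to unstable equilibria). Suppose $\{Z_n\}$ is a stochastic approximation process with a bounded and continuous $F$. Suppose there is a point $z \in (0, 1)$ and an $\varepsilon > 0$ with $F(z) = 0$, $F < 0$ on $(z - \varepsilon, z)$ and $F > 0$ on $(z, z + \varepsilon)$. Suppose further that $\mathbb{E}\left( \xi_{n+1}^{+} \mid \mathcal{F}_n \right)$ and $\mathbb{E}\left( \xi_{n+1}^{-} \mid \mathcal{F}_n \right)$ are bounded above and below by positive numbers when $Z_n \in (z - \varepsilon, z + \varepsilon)$. Then $\mathbb{P}(Z_n \to z) = 0$.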

    Pemantle studied the case of touchpoints for generalized (nonlinear) Pólya urn processes in [48]. His proof extends to the following result.

    Theorem 2.5 (Convergence to touchpoints). Suppose $\{Z_n\}$ is a stochastic approximation process with a bounded and continuously differentiable $F$, and that $|\xi_n| \le K$ a.s. for some finite $K$. Suppose $z \in Z_P$ is a touchpoint, i.e., there exists an $\varepsilon > 0$ such that either $F > 0$ on $(z - \varepsilon, z) \cup (z, z + \varepsilon)$ or $F < 0$ on $(z - \varepsilon, z) \cup (z, z + \varepsilon)$. Then $\mathbb{P}(Z_n \to z) > 0$.

    2.4 Nonlinear models

    We first show that $\{x_n\}_{n \ge 0}$ is a stochastic approximation process (i.e., satisfies (5)) with the function $P$ as in (1). Subsequently we show how this implies our results in Section 1.2 using the results described in Section 2.3.

    Lemma 2.6. The process $\{x_n\}_{n \ge 0}$ is a stochastic approximation process with the function $F = P$ as in (1). Furthermore, the noise term $\xi_n$ is bounded: $|\xi_n| \le 2$ for all $n \ge 1$.

    Proof. From (3) we have that the conditional expectation ofXn+1 Xn is:E (Xn+1 Xn | Fn) =

    mk=0

    m

    k

    xkn(1 xn)mk (k+mpk) = 2mxn+ 2mP(xn) .

    One can check that xn+1 xn = Xn+1Xn2mxnSn+1 and consequently E (xn+1 xn | Fn) = 2mSn+1P(xn), withPas in (1). We can then write{xn}n0 as a stochastic approximation process as claimed in the statementof the lemma, i.e., we can write

    xn+1 xn= 1n

    (P(xn) +n+1+Rn)

    9

  • 8/13/2019 1307.2893v1.pdf

    10/19

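    with appropriately defined $\xi_{n+1}$ and $R_n$. Define $\xi_{n+1}$ as
    \[ \xi_{n+1} = n \left( x_{n+1} - x_n - \mathbb{E}(x_{n+1} - x_n \mid \mathcal{F}_n) \right). \tag{6} \]
    The remainder term $R_n$ can then be written as
    \[ R_n = -\frac{S_0 + 2m}{S_0 + 2m(n+1)} P(x_n). \]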
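For completeness, here is a short derivation of this expression for $R_n$. It is a sketch that assumes $S_{n+1} = S_0 + 2m(n+1)$ and the form of $P$ consistent with the drift identity in this proof, under which $|P| \le 1/2$ on $[0, 1]$ because $|p_k - k/m| \le 1$.

```latex
\begin{align*}
\frac{1}{n}\bigl(P(x_n) + R_n\bigr)
  &= \mathbb{E}(x_{n+1} - x_n \mid \mathcal{F}_n)
   = \frac{2m}{S_{n+1}}\, P(x_n), \\
R_n
  &= \Bigl(\frac{2mn}{S_{n+1}} - 1\Bigr) P(x_n)
   = \frac{2mn - S_0 - 2m(n+1)}{S_0 + 2m(n+1)}\, P(x_n)
   = -\frac{S_0 + 2m}{S_0 + 2m(n+1)}\, P(x_n), \\
|R_n|
  &\le \frac{S_0 + 2m}{2m(n+1)} \cdot \frac{1}{2}
   = \frac{S_0 + 2m}{4m(n+1)} .
\end{align*}
```

The first line uses $\mathbb{E}(\xi_{n+1} \mid \mathcal{F}_n) = 0$, which is immediate from (6). The last line shows that $|R_n|$ is of order $1/n$, so $n^{-1}|R_n|$ is summable.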

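    Clearly $R_n \in \mathcal{F}_n$. Let us now show that $\sum_{n=1}^{\infty} n^{-1} |R_n| <$ [...] $> 0$, then the probabilities $\mathbb{P}(X_{n+1} - X_n = k_1 \mid \mathcal{F}_n)$ and $\mathbb{P}(X_{n+1} - X_n = k_2 \mid \mathcal{F}_n)$ are bounded away from zero by a positive function of $\varepsilon$ and the parameters $\{p_k\}_{0 \le k \le m}$.

    Proof. In the following we always assume that $x_n \in (\varepsilon, 1 - \varepsilon)$. If $p_0 < 1$ then we can choose $k_1 = 0$ since $\mathbb{P}(X_{n+1} - X_n = 0 \mid \mathcal{F}_n) = (1 - x_n)^m (1 - p_0) \ge \varepsilon^m (1 - p_0)$. Similarly, if $p_m > 0$ then we can choose $k_2 = 2m$, since $\mathbb{P}(X_{n+1} - X_n = 2m \mid \mathcal{F}_n) = x_n^m p_m \ge \varepsilon^m p_m$. The rest of the proof deals with the cases when either $p_0 = 1$ or $p_m = 0$.

    First consider the case when $p_0 > 0$ and $p_1 = p_2 = \dots = p_m = 0$. In this case $P(s) = \frac{1}{2} [p_0 (1 - s)^m - s]$, which is decreasing in $[0, 1]$, so it has a single zero in $(0, 1)$. In fact, $P(1/2) < 0$, so the single zero of $P$ in $(0, 1)$ is in $(0, 1/2)$, and thus we can take $k_2 = m$. If $p_0 < 1$ then we can take $k_1 = 0$ as described above. Finally, if $p_0 = 1$ and $m > 2$, then we can take $k_1 = 1$. This is because the zero of $P$ in $(0, 1)$ is in $\left( \frac{1}{2m}, \frac{1}{2} \right)$, which follows from the fact that $P\left( \frac{1}{2m} \right) > 0$. The case when $p_0 = p_1 = \dots = p_{m-1} = 1$ and $p_m < 1$ follows similarly.

    Now we can assume that there exist $1 \le i \le m$ and $0 \le j \le m - 1$ such that $p_i > 0$ and $p_j < 1$. This implies that $\mathbb{P}(X_{n+1} - X_n = j \mid \mathcal{F}_n) \ge \varepsilon^m (1 - p_j) > 0$ and $\mathbb{P}(X_{n+1} - X_n = m + i \mid \mathcal{F}_n) \ge \varepsilon^m p_i > 0$. Thus if $z = 1/2$ then we can take $k_1 = j$ and $k_2 = m + i$. If $0 < z$ [...]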


    Proof of Theorem 1.4 for $z \in (0, 1)$. This follows from Lemma 2.6 and Theorem 2.4. The only condition of Theorem 2.4 that needs to be checked additionally is that $\mathbb{E}(\xi_{n+1}^{+} \mid \mathcal{F}_n)$ and $\mathbb{E}(\xi_{n+1}^{-} \mid \mathcal{F}_n)$ are bounded away from zero by positive numbers when $x_n \in (z - \varepsilon, z + \varepsilon)$ for small enough $\varepsilon > 0$; this can be done using Lemma 2.7. In the special cases (a), (b), and (c) described in Lemma 2.7, the statement of Theorem 1.4 is vacuously true, since in each case the polynomial $P$ has no zeros at which it is increasing. Thus we may assume that we are not in these special cases, and we can use Lemma 2.7. Recall that
    \[ \xi_{n+1} = \frac{n}{S_{n+1}} \left\{ X_{n+1} - X_n - 2m \left( x_n + P(x_n) \right) \right\}. \]
    Define $\delta := \frac{1}{2} \min \{ 2mz - k_1, k_2 - 2mz \}$, where $k_1$ and $k_2$ are given by Lemma 2.7, and let $\varepsilon > 0$ be small enough such that whenever $x_n \in (z - \varepsilon, z + \varepsilon)$, necessarily $2m (x_n + P(x_n)) \in (2mz - \delta, 2mz + \delta)$. If $x_n \in (z - \varepsilon, z + \varepsilon)$ then we have $\mathbb{E}(\xi_{n+1}^{+} \mid \mathcal{F}_n) \ge \frac{n}{S_{n+1}} \, \delta \, \mathbb{P}(X_{n+1} - X_n = k_2 \mid \mathcal{F}_n)$, where $\lim_{n \to \infty} \frac{n}{S_{n+1}} = \frac{1}{2m}$, and by Lemma 2.7 the probability $\mathbb{P}(X_{n+1} - X_n = k_2 \mid \mathcal{F}_n)$ is bounded from below by a positive function of $z$, $\varepsilon$, and the parameters $\{p_k\}_{0 \le k \le m}$. We can similarly bound $\mathbb{E}(\xi_{n+1}^{-} \mid \mathcal{F}_n)$ from below.

    We next prove Theorem 1.4 for the endpoints 0 and 1. The main idea of the proof is to compare the behavior near the endpoints of our process of interest to that of a standard Pólya urn process where $2m$ balls are added at each time step. In order to formalize this, we make use of several different stochastic orders; we refer to [52] for an overview of these. We proceed by defining these stochastic orders and stating a few results on them, before proving Theorem 1.4.

    Definition 1 (Stochastic orders). Let $X$ and $Y$ be random variables.

    We say that $X$ is smaller than $Y$ in the usual stochastic order (denoted by $X \le_{st} Y$) if $\mathbb{E}(\phi(X)) \le \mathbb{E}(\phi(Y))$ for all increasing continuous functions $\phi : \mathbb{R} \to \mathbb{R}$ for which these expectations exist.

    We say that $X$ is smaller than $Y$ in the convex order (denoted by $X \le_{cx} Y$) if $\mathbb{E}(\phi(X)) \le \mathbb{E}(\phi(Y))$ for all continuous convex functions $\phi : \mathbb{R} \to \mathbb{R}$ for which these expectations exist.

    We say that $X$ is smaller than $Y$ in the increasing convex order (denoted by $X \le_{icx} Y$) if $\mathbb{E}(\phi(X)) \le \mathbb{E}(\phi(Y))$ for all increasing continuous convex functions $\phi : \mathbb{R} \to \mathbb{R}$ for which these expectations exist.

    Lemma 2.8. Two random variables $X$ and $Y$ satisfy $X \le_{icx} Y$ if and only if there is a random variable $Z$ such that $X \le_{st} Z \le_{cx} Y$.

    Proof. See [52, Theorem 4.A.6. (a)].

    Lemma 2.9. Let $X$ and $Y$ be two random variables with cumulative distribution functions $F$ and $G$, respectively, and bounded supports. Suppose that $\mathbb{E}(X) \le \mathbb{E}(Y)$, and also that if $t_1 < t_2$ and $G(t_1) < F(t_1)$ then $G(t_2) \le F(t_2)$. Then $X \le_{icx} Y$.

    Proof. See [52, Theorem 4.A.22. (b)].

    Lemma 2.10. Consider the standard Pólya urn process where $2m$ balls are added at each time step. Let $x_n^1$ and $x_n^2$ be the proportions of red balls at the $n$th step of two realizations of this process. If $x_n^1 \le_{cx} x_n^2$, then $x_{n+1}^1 \le_{cx} x_{n+1}^2$, i.e., the Pólya urn process preserves dominance in the convex order.

    Proof. See Proposition 1 in [46], in particular equation (13).

    Proof of Theorem 1.4 for the endpoints. We prove nonconvergence to 1 when $P(1) = 0$ and $P < 0$ on $(1 - \varepsilon, 1)$ for some $\varepsilon > 0$; the proof for the other endpoint is analogous. In the following fix $0 < \varepsilon < 1/m$. The main idea of the proof is to compare the behavior near 1 of our process of interest to that of a standard Pólya urn process where $2m$ balls are added at each time step. To aid in this comparison we also introduce an auxiliary process which is a combination of these two. We begin by describing these processes.

    Our process of interest is $\{X_n\}_{n \ge 0}$, together with its normalized process $\{x_n\}_{n \ge 0}$. Let $\{\tilde{X}_n\}_{n \ge 0}$ denote the process of the number of red balls in a standard Pólya urn process where $2m$ balls are added at each time step, where the initial conditions are the same as those for the process $\{X_n\}_{n \ge 0}$, i.e., $\tilde{X}_0 = X_0$. Let $\{\tilde{x}_n\}_{n \ge 0}$ denote the normalized process, i.e., $\tilde{x}_n = \frac{\tilde{X}_n}{S_0 + 2mn}$. Let $\{\hat{X}_n\}_{n \ge 0}$ denote the auxiliary process, with initial condition $\hat{X}_0 = X_0$, and let $\{\hat{x}_n\}_{n \ge 0}$ denote the normalized process, i.e., $\hat{x}_n = \frac{\hat{X}_n}{S_0 + 2mn}$. We define this


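    auxiliary process as follows. For $1 - \varepsilon < x \le 1$, given $\hat{x}_n = x$ let $\hat{X}_{n+1}$ have the same distribution as $X_{n+1}$ given $x_n = x$. For $x \le 1 - \varepsilon$, let $\mathbb{P}(\hat{X}_{n+1} = \hat{X}_n \mid \hat{x}_n = x) = 1 - x$ and $\mathbb{P}(\hat{X}_{n+1} = \hat{X}_n + 2m \mid \hat{x}_n = x) = x$. In other words, when $\hat{x}_n > 1 - \varepsilon$ the auxiliary process evolves according to our process of interest, and when $\hat{x}_n \le 1 - \varepsilon$ it evolves as a Pólya urn process.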

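    We first show that it suffices to prove the claim for the auxiliary process, i.e., it suffices to show that $\mathbb{P}(\lim_{n \to \infty} \hat{x}_n = 1) = 0$. Define the following events:
    \[ A_n := \left\{ \lim_{k \to \infty} x_k = 1, \ x_k > 1 - \varepsilon \text{ for all } k \ge n \right\}, \qquad \hat{A}_n := \left\{ \lim_{k \to \infty} \hat{x}_k = 1, \ \hat{x}_k > 1 - \varepsilon \text{ for all } k \ge n \right\}. \]
    If $\mathbb{P}(\lim_{n \to \infty} x_n = 1) > 0$, then there exists $n_0$ such that $\mathbb{P}(A_{n_0}) > 0$. In particular, there exists $y_0 \in (1 - \varepsilon, 1)$ such that both probabilities $\mathbb{P}(\hat{x}_{n_0} \ge y_0)$ and $\mathbb{P}(A_{n_0} \mid x_{n_0} = y_0)$ are positive. In fact, we claim that $\mathbb{P}(A_{n_0} \mid x_{n_0} = y)$ is positive for all $y_0 \le y$ [...] $0$. Moreover, if $x_{n_0} = \hat{x}_{n_0}$, on the event $A_{n_0}$ we can couple the processes $\{x_n\}_{n \ge n_0}$ and $\{\hat{x}_n\}_{n \ge n_0}$ so that $x_n = \hat{x}_n$ for all $n \ge n_0$, which shows that $\mathbb{P}(\hat{A}_{n_0} \mid \hat{x}_{n_0} = y) \ge \mathbb{P}(A_{n_0} \mid x_{n_0} = y_0) > 0$ for all $y \ge y_0$. In particular, this shows that $\mathbb{P}(\lim_{n \to \infty} x_n = 1) > 0$ implies that $\mathbb{P}(\lim_{n \to \infty} \hat{x}_n = 1) > 0$. Thus it suffices to show that $\mathbb{P}(\lim_{n \to \infty} \hat{x}_n = 1) = 0$.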

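    We claim that $\hat{x}_n \le_{icx} \tilde{x}_n$ implies that $\mathbb{P}(\lim_{n \to \infty} \hat{x}_n = 1) = 0$. To see this, for $\delta > 0$ define the function $g_\delta : [0, 1] \to [0, 2]$ by $g_\delta(x) = \max \{ 0, 2 - 1/\delta + x/\delta \}$. This is an increasing continuous convex function, and so $\hat{x}_n \le_{icx} \tilde{x}_n$ implies that
    \[ \mathbb{P}(\hat{x}_n > 1 - \delta) \le \mathbb{E}(g_\delta(\hat{x}_n)) \le \mathbb{E}(g_\delta(\tilde{x}_n)) \le 2 \, \mathbb{P}(\tilde{x}_n > 1 - 2\delta). \tag{7} \]
    We know that the limiting distribution of $\tilde{x}_n$ is a beta distribution, and thus
    \[ \lim_{\delta \to 0} \lim_{n \to \infty} \mathbb{P}(\tilde{x}_n > 1 - 2\delta) = 0. \]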

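    By (7) this then implies that $\mathbb{P}(\lim_{n \to \infty} \hat{x}_n = 1) = 0$.

    We prove $\hat{X}_n \le_{icx} \tilde{X}_n$ (which is equivalent to $\hat{x}_n \le_{icx} \tilde{x}_n$) by induction on $n$; for $n = 0$ this is immediate since the initial conditions agree. Fix now a positive integer $n$, and consider a random variable $X$ which attains integer values in the interval $[X_0, S_0 + 2mn]$, and let $x = \frac{X}{S_0 + 2mn}$. Denote by $\tilde{X}'$ a random variable with distribution $\mathbb{P}(\tilde{X}' = X + 2m \mid X) = x$ and $\mathbb{P}(\tilde{X}' = X \mid X) = 1 - x$. Similarly, let $\hat{X}'$ denote a random variable with distribution the same as that of $\hat{X}_{n+1}$ conditioned on $\hat{X}_n = X$. Following Pemantle [46], the induction step follows from the following two claims: (1) $\hat{X}' \le_{icx} \tilde{X}'$, and (2) $X \le_{icx} Y$ implies that $\tilde{X}' \le_{icx} \tilde{Y}'$.

    First, it is enough to show that for any fixed $r$, conditioned on $x = r$ we have $\hat{X}' \le_{icx} \tilde{X}'$; one can then integrate out the conditioning to get (1). We show this by checking the conditions of Lemma 2.9. First, when $r \le 1 - \varepsilon$ we have $\mathbb{E}(\hat{X}' \mid x = r) = \mathbb{E}(\tilde{X}' \mid x = r)$ by the definition of the auxiliary process. If $r > 1 - \varepsilon$ then we have $\mathbb{E}(\tilde{X}' \mid x = r) = r (S_0 + 2mn) + 2mr$, while $\mathbb{E}(\hat{X}' \mid x = r) = r (S_0 + 2mn) + 2m (r + P(r))$. Since $r > 1 - \varepsilon$, $P(r) < 0$, and thus $\mathbb{E}(\hat{X}' \mid x = r) < \mathbb{E}(\tilde{X}' \mid x = r)$. This shows that $\mathbb{E}(\hat{X}' \mid x = r) \le \mathbb{E}(\tilde{X}' \mid x = r)$ in both cases.

    The other condition of Lemma 2.9 holds automatically due to the fact that conditioned on $X = \chi$, the distribution of $\tilde{X}'$ is supported on the two values $\{\chi, \chi + 2m\}$, while the support of the distribution of $\hat{X}'$ is contained in the interval $[\chi, \chi + 2m]$.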
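The two outer inequalities in (7) rest on a pointwise sandwich: the indicator of $(1 - \delta, 1]$ is at most the test function, which in turn is at most twice the indicator of $(1 - 2\delta, 1]$. A minimal numerical sketch of this sandwich follows; the symbol $\delta$ for the parameter of the test function, and the grid and tolerance used here, are choices made for illustration.

```python
def g(x, delta):
    """Test function from (7): max{0, 2 - 1/delta + x/delta} on [0, 1]."""
    return max(0.0, 2.0 - 1.0 / delta + x / delta)

def sandwich_holds(x, delta, tol=1e-9):
    """Check 1{x > 1 - delta} <= g(x) <= 2 * 1{x > 1 - 2*delta} up to rounding."""
    lower = 1.0 if x > 1 - delta else 0.0
    upper = 2.0 if x > 1 - 2 * delta else 0.0
    value = g(x, delta)
    return lower <= value + tol and value <= upper + tol

grid = [i / 1000 for i in range(1001)]
for delta in (0.05, 0.1, 0.25):
    assert all(sandwich_holds(x, delta) for x in grid)
```

Taking expectations of the sandwich gives the first and last inequalities in (7); the middle inequality is where the increasing convex order is used.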


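    In view of Lemmas 2.8 and 2.10, to show (2) it is enough to show that $X \le_{st} Y$ implies $\tilde{X}' \le_{st} \tilde{Y}'$, i.e., that for any increasing function $\phi$ we have $\mathbb{E}(\phi(\tilde{X}')) \le \mathbb{E}(\phi(\tilde{Y}'))$. By conditioning on $X$ and $Y$, we have $\mathbb{E}(\phi(\tilde{X}')) = \mathbb{E}(\psi(X))$ and $\mathbb{E}(\phi(\tilde{Y}')) = \mathbb{E}(\psi(Y))$, where $\psi(t) := \phi(t) (1 - \alpha t) + \phi(t + 2m) \alpha t$, where $\alpha = (S_0 + 2mn)^{-1}$ and $t$ is such that $0 \le \alpha t \le 1$. Since $X \le_{st} Y$, we only need to show that $\psi$ is increasing. Indeed, if $t_1 < t_2$ then
    \[ \psi(t_2) - \psi(t_1) = \left( \phi(t_2 + 2m) - \phi(t_1 + 2m) \right) \alpha t_1 + \left( \phi(t_2) - \phi(t_1) \right) (1 - \alpha t_1) + \left( \phi(t_2 + 2m) - \phi(t_2) \right) \alpha (t_2 - t_1), \]
    which is nonnegative, since all of the terms on the right hand side are nonnegative.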
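The three-term decomposition of $\psi(t_2) - \psi(t_1)$ can be verified numerically. The sketch below assumes $\psi(t) = \phi(t)(1 - \alpha t) + \phi(t + 2m)\alpha t$ with $\alpha = (S_0 + 2mn)^{-1}$; the increasing function `math.atan` and the values of $S_0$, $m$, $n$ are hypothetical choices for illustration.

```python
import math

def psi(phi, t, alpha, m):
    """psi(t) = phi(t) * (1 - alpha*t) + phi(t + 2m) * alpha*t."""
    return phi(t) * (1 - alpha * t) + phi(t + 2 * m) * alpha * t

def increment_terms(phi, t1, t2, alpha, m):
    """The three terms whose sum should equal psi(t2) - psi(t1)."""
    return (
        (phi(t2 + 2 * m) - phi(t1 + 2 * m)) * alpha * t1,
        (phi(t2) - phi(t1)) * (1 - alpha * t1),
        (phi(t2 + 2 * m) - phi(t2)) * alpha * (t2 - t1),
    )

# Hypothetical values: S_0 = 10, m = 2, n = 5, so alpha = 1/30 and 0 <= t <= 30.
S0, m, n = 10, 2, 5
alpha = 1.0 / (S0 + 2 * m * n)
phi = math.atan  # any increasing function works here

for t1 in range(0, 30):
    for t2 in range(t1 + 1, 31):
        terms = increment_terms(phi, t1, t2, alpha, m)
        diff = psi(phi, t2, alpha, m) - psi(phi, t1, alpha, m)
        assert abs(sum(terms) - diff) < 1e-12   # the decomposition is exact
        assert all(term >= -1e-12 for term in terms)  # each term is nonnegative
```

Each term is a product of a nonnegative weight and a nonnegative increment of $\phi$, which is why monotonicity of $\phi$ transfers to $\psi$.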

    Proof of Theorem 1.5. This follows directly from Lemma 2.6 and Theorem 2.5.

    3 Many colors/types

    It is both natural and important to study competition between more than two colors/types. Our model naturally extends in this direction, and in this section we present our results regarding $N \ge 3$ competing types. In the following, vectors will be denoted using boldface, subscripts typically correspond to time and superscripts correspond to the indices of types. Furthermore, denote by $\Delta^N$ the probability simplex in $\mathbb{R}^N$.

    The natural extension of the model to multiple competing types is as follows. At time zero, there is a graph $G_0$, where each node is of exactly one of the $N$ types. At each timestep a new node is added to the graph, and is connected to $m$ nodes of the original graph according to linear preferential attachment. The types of these $m$ neighbors induce a vector of types $\mathbf{u}$, where $u^i$ is the number of neighbors of type $i$. The type of the new node is then determined according to a random draw from the distribution $\mathbf{p}_{\mathbf{u}} = (p^i_{\mathbf{u}})_{i \in [N]}$. The probabilities $\{p^i_{\mathbf{u}}\}_{\mathbf{u}, i}$ are parameters of the model.

    As in the case of two types, our primary interest is in the fraction of nodes of each type. Let $A_n^i$ denote the number of nodes of type $i$ at time $n$, and let $\mathbf{A}_n = (A_n^1, \dots, A_n^N)$ denote the resulting vector of types. Let $\mathbf{a}_n$ denote the normalized vector of types, such that $\sum_{i=1}^{N} a_n^i = 1$. Furthermore, let $X_n^i$ denote the sum of the degrees of type $i$ nodes at time $n$, let $\mathbf{X}_n = (X_n^1, \dots, X_n^N)$ denote the resulting vector of degrees, and let $\mathbf{x}_n$ be the normalized vector of degrees, such that $\sum_{i=1}^{N} x_n^i = 1$.

    As in the $N = 2$ case, there is a clear distinction between the linear model, when $p^i_{\mathbf{u}} = \frac{u^i}{m}$ for all $\mathbf{u}$ and $i \in [N]$, and nonlinear models, when there exist $\mathbf{u}$ and $i \in [N]$ such that $p^i_{\mathbf{u}} \ne \frac{u^i}{m}$. In fact, the linear model for $N \ge 3$ types reduces to the linear model for two types. This is because in the linear model, if we want to study the evolution of the size of type $i$, then we can group all other types into a single mega-type, denoted by $-i$, and run the process with two types: type $i$ and mega-type $-i$. Due to linearity, the original process with $N$ types and the process with type $i$ and mega-type $-i$ can be coupled such that the evolution of type $i$ is identical in the two processes. Consequently, in the linear model all the results of the $N = 2$ case apply. In particular, we have the following theorem.

    Theorem 3.1 (Linear model). Assume that $p^i_{\mathbf{u}} = \frac{u^i}{m}$ for all $\mathbf{u}$ and $i \in [N]$, and that $X_0^i > 0$ for all $i \in [N]$. Then $\mathbf{a}_n$ converges almost surely, and the limiting distribution has full support on $\Delta^N$, and no atoms.

    In nonlinear models, as we will see later, a key role in the asymptotic behavior of the process $\{\mathbf{a}_n\}_{n \ge 0}$ is played by the vector field
    \[ \mathbf{P}(\mathbf{y}) = \frac{1}{2} \sum_{i=1}^{N} \sum_{\mathbf{u}} \binom{m}{\mathbf{u}} (\mathbf{y})^{\mathbf{u}} \left( p^i_{\mathbf{u}} - \frac{u^i}{m} \right) \mathbf{e}_i, \tag{8} \]
    where $\binom{m}{\mathbf{u}} = \frac{m!}{u^1! \cdots u^N!}$ denotes the multinomial coefficient, $(\mathbf{y})^{\mathbf{u}} = (y^1)^{u^1} (y^2)^{u^2} \cdots (y^N)^{u^N}$, and $\mathbf{e}_i$ is the $N$-dimensional unit vector whose $i$th coordinate is 1 and all other coordinates are 0. Let us denote the zero set of this vector field on the probability simplex by $Z_P := \{ \mathbf{y} \in \Delta^N : \mathbf{P}(\mathbf{y}) = \mathbf{0} \}$; this will be important later.

    The behavior of the process in the general nonlinear model with multiple types is involved, and its complete theoretical analysis is as yet out of our reach. Nonetheless, based on partial theoretical results, we conjecture the following asymptotic behavior, which is similar to that in the case of two types.
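The vector field (8) can be evaluated directly by enumerating all neighbor-type vectors $\mathbf{u}$. The following is a minimal sketch, not an optimized implementation: the function names and the callable interface `p(u, i)` for the parameters $p^i_{\mathbf{u}}$ are assumptions made for illustration.

```python
from itertools import product
from math import factorial, prod

def compositions(m, N):
    """All vectors u of N nonnegative integers with u^1 + ... + u^N = m."""
    return [u for u in product(range(m + 1), repeat=N) if sum(u) == m]

def multinomial(u):
    """Multinomial coefficient m! / (u^1! ... u^N!) with m = sum(u)."""
    coeff = factorial(sum(u))
    for ui in u:
        coeff //= factorial(ui)
    return coeff

def vector_field(y, p, m):
    """Evaluate P(y) from (8); p(u, i) is the probability of adopting type i
    (0-indexed here) given neighbour-type counts u."""
    N = len(y)
    out = [0.0] * N
    for u in compositions(m, N):
        weight = multinomial(u) * prod(yi**ui for yi, ui in zip(y, u))
        for i in range(N):
            out[i] += 0.5 * weight * (p(u, i) - u[i] / m)
    return out

# In the linear model p(u, i) = u^i / m the vector field vanishes identically.
m = 2
y = [0.2, 0.3, 0.5]
linear = vector_field(y, lambda u, i: u[i] / m, m)
assert all(abs(v) < 1e-12 for v in linear)
```

Because $\sum_i p^i_{\mathbf{u}} = 1 = \sum_i u^i/m$, the coordinates of $\mathbf{P}(\mathbf{y})$ always sum to zero, which is a convenient sanity check for any choice of parameters.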


    Conjecture 3.2. Assume that there exist $\mathbf{u}$ and $i \in [N]$ such that $p^i_{\mathbf{u}} \ne \frac{u^i}{m}$, and that $X_0^i > 0$ for all $i \in [N]$. Then $\mathbf{a}_n$ converges almost surely and the limit is a point in the zero set $Z_P$.

    In the rest of this section we describe theoretical progress towards this conjecture. As in the case of two competing types, the problem can be cast in a (multidimensional) stochastic approximation framework.

    The process $\{\mathbf{A}_n\}_{n \ge 0}$ is not a Markov process, and therefore we study the joint process $\{(\mathbf{A}_n, \mathbf{X}_n)\}_{n \ge 0}$, which is indeed Markov. It evolves as follows. Given $(\mathbf{A}_n, \mathbf{X}_n)$, a vector $\mathbf{u}_{n+1}$ is drawn from the multinomial distribution with parameters $m$ and $\mathbf{x}_n$. Subsequently, an index $I_{n+1} \in [N]$ is chosen from the distribution $\mathbf{p}_{\mathbf{u}_{n+1}}$. We then have
    \[ \mathbf{A}_{n+1} = \mathbf{A}_n + \mathbf{e}_{I_{n+1}}, \tag{9} \]
    \[ \mathbf{X}_{n+1} = \mathbf{X}_n + \mathbf{u}_{n+1} + m \, \mathbf{e}_{I_{n+1}}. \tag{10} \]
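The update rules (9) and (10) translate directly into a simulation that tracks only the vectors $\mathbf{A}_n$ and $\mathbf{X}_n$, without storing the graph. The sketch below is illustrative only: the function names, the callable `p(u, i)` for the parameters, and the initial condition are assumptions, and the multinomial draw is realized as $m$ independent draws from $\mathbf{x}_n$.

```python
import random

def step(A, X, m, p, rng):
    """One step of the joint process (A_n, X_n) following (9) and (10)."""
    N = len(X)
    total = sum(X)
    x = [Xi / total for Xi in X]
    # u_{n+1} ~ Multinomial(m, x_n): types of the m neighbours of the new node.
    picks = rng.choices(range(N), weights=x, k=m)
    u = [picks.count(i) for i in range(N)]
    # I_{n+1} is drawn from the distribution p_u.
    I = rng.choices(range(N), weights=[p(u, i) for i in range(N)], k=1)[0]
    A_next = list(A)
    A_next[I] += 1                                # (9)
    X_next = [Xi + ui for Xi, ui in zip(X, u)]    # each chosen neighbour gains degree 1
    X_next[I] += m                                # the new node has degree m, cf. (10)
    return A_next, X_next

def simulate(A0, X0, m, p, steps, seed=0):
    """Run the process for a number of steps and return the final (A, X)."""
    rng = random.Random(seed)
    A, X = list(A0), list(X0)
    for _ in range(steps):
        A, X = step(A, X, m, p, rng)
    return A, X

# Hypothetical example: N = 3 types, m = 2, linear adoption rule p(u, i) = u^i / m.
m = 2
A, X = simulate([1, 1, 1], [2, 2, 2], m, lambda u, i: u[i] / m, steps=200)
a = [Ai / sum(A) for Ai in A]  # normalized vector of types a_n
```

Each step adds one node and $2m$ to the total degree, so `sum(X)` equals $S_0 + 2mn$ after $n$ steps, matching the normalization used for $\mathbf{x}_n$.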

    Before analyzing the process $\{\mathbf{x}_n\}_{n \ge 0}$, let us show that in order to prove Conjecture 3.2 on the asymptotic behavior of $\{\mathbf{a}_n\}_{n \ge 0}$, it is sufficient to prove a similar result on the asymptotic behavior of $\{\mathbf{x}_n\}_{n \ge 0}$.

    Lemma 3.3. Assume that there exist $\mathbf{u}$ and $i \in [N]$ such that $p^i_{\mathbf{u}} \ne \frac{u^i}{m}$, and that $X_0^i > 0$ for all $i \in [N]$. Assume that $\mathbf{x}_n$ converges almost surely and the limit is a point in the zero set $Z_P$. Then $\mathbf{a}_n$ converges almost surely and the limit is a point in the zero set $Z_P$.

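    Proof. This is similar to the proof of Lemma 2.1. We have seen that
    \[ \mathbb{E}(\mathbf{A}_{n+1} - \mathbf{A}_n \mid \mathcal{F}_n) = \mathbb{E}(\mathbf{e}_{I_{n+1}} \mid \mathcal{F}_n) = \sum_{i=1}^{N} \sum_{\mathbf{u}} \binom{m}{\mathbf{u}} (\mathbf{x}_n)^{\mathbf{u}} p^i_{\mathbf{u}} \mathbf{e}_i = \mathbf{x}_n + \sum_{i=1}^{N} \sum_{\mathbf{u}} \binom{m}{\mathbf{u}} (\mathbf{x}_n)^{\mathbf{u}} \left( p^i_{\mathbf{u}} - \frac{u^i}{m} \right) \mathbf{e}_i = \mathbf{x}_n + 2 \mathbf{P}(\mathbf{x}_n) =: \mathbf{f}(\mathbf{x}_n). \]
    Let $\mathbf{M}_0 = \mathbf{0}$ and define the martingale $\mathbf{M}_n := \mathbf{A}_n - \mathbf{A}_0 - \sum_{j=0}^{n-1} \mathbf{f}(\mathbf{x}_j)$. This martingale has bounded increments, and thus $\lim_{n \to \infty} \mathbf{M}_n / n = \mathbf{0}$ a.s. By the definition of the martingale, this shows that a.s.
    \[ \lim_{n \to \infty} \left\| \mathbf{a}_n - \frac{1}{n} \sum_{j=0}^{n-1} \mathbf{f}(\mathbf{x}_j) \right\| = 0. \]
    Now if the limit $\lim_{n \to \infty} \mathbf{x}_n$ exists a.s., and any limit point $\mathbf{x}$ satisfies $\mathbf{P}(\mathbf{x}) = \mathbf{0}$, then also $\mathbf{f}(\mathbf{x}) = \mathbf{x}$, and thus the Cesàro mean of the sequence $\{\mathbf{f}(\mathbf{x}_n)\}_{n \ge 0}$ converges to the same limit point. This then implies that the limit $\lim_{n \to \infty} \mathbf{a}_n$ exists a.s. and is equal to $\lim_{n \to \infty} \mathbf{x}_n$.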

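    The key observation in the analysis of the asymptotic behavior of $\{\mathbf{x}_n\}_{n \ge 0}$ is that it is a stochastic approximation process. In higher dimensions, a stochastic approximation process is defined as follows. Let $\mathbf{Z}_n$ be a stochastic process in the Euclidean space $\mathbb{R}^N$, adapted to a filtration $\{\mathcal{F}_n\}_{n \ge 0}$. Suppose that it satisfies
    \[ \mathbf{Z}_{n+1} - \mathbf{Z}_n = \frac{1}{n} \left( \mathbf{F}(\mathbf{Z}_n) + \boldsymbol{\xi}_{n+1} + \mathbf{R}_n \right), \]
    where $\mathbf{F}$ is a vector field on $\mathbb{R}^N$, $\mathbb{E}(\boldsymbol{\xi}_{n+1} \mid \mathcal{F}_n) = \mathbf{0}$ and the remainder terms $\mathbf{R}_n \in \mathcal{F}_n$ go to zero and satisfy $\sum_{n=1}^{\infty} n^{-1} \|\mathbf{R}_n\| < \infty$ a.s. Such a process is known as a stochastic approximation process.

    Lemma 3.4. The process $\{\mathbf{x}_n\}_{n \ge 0}$ is a stochastic approximation process with the vector field $\mathbf{P}$ as in (8). Furthermore, the noise term $\boldsymbol{\xi}_n$ is bounded: $\|\boldsymbol{\xi}_n\|_1 \le 2N$ for all $n \ge 1$.

    Proof. From (10) we have that $\mathbb{E}(\mathbf{X}_{n+1} - \mathbf{X}_n \mid \mathcal{F}_n) = \mathbb{E}(\mathbf{u}_{n+1} \mid \mathcal{F}_n) + m \, \mathbb{E}(\mathbf{e}_{I_{n+1}} \mid \mathcal{F}_n)$. Given $\mathcal{F}_n$, $\mathbf{u}_{n+1}$ is multinomial with parameters $m$ and $\mathbf{x}_n$, and so $\mathbb{E}(\mathbf{u}_{n+1} \mid \mathcal{F}_n) = m \mathbf{x}_n$. By construction, we have that
    \[ \mathbb{E}(\mathbf{e}_{I_{n+1}} \mid \mathcal{F}_n) = \sum_{i=1}^{N} \sum_{\mathbf{u}} \binom{m}{\mathbf{u}} (\mathbf{x}_n)^{\mathbf{u}} p^i_{\mathbf{u}} \mathbf{e}_i. \]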


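    Let $S_0$ denote the sum of the degrees in $G_0$, and let $S_n = S_0 + 2mn$. A simple calculation gives that $\mathbf{x}_{n+1} - \mathbf{x}_n = \frac{\mathbf{X}_{n+1} - \mathbf{X}_n - 2m \mathbf{x}_n}{S_{n+1}}$, and so we have
    \[ \mathbb{E}(\mathbf{x}_{n+1} - \mathbf{x}_n \mid \mathcal{F}_n) = \frac{m}{S_{n+1}} \sum_{i=1}^{N} \sum_{\mathbf{u}} \binom{m}{\mathbf{u}} (\mathbf{x}_n)^{\mathbf{u}} \left( p^i_{\mathbf{u}} - \frac{u^i}{m} \right) \mathbf{e}_i = \frac{2m}{S_{n+1}} \mathbf{P}(\mathbf{x}_n). \]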

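    We can then write $\{\mathbf{x}_n\}_{n \ge 0}$ as a stochastic approximation process:
    \[ \mathbf{x}_{n+1} - \mathbf{x}_n = \frac{1}{n} \left[ \mathbf{P}(\mathbf{x}_n) + \boldsymbol{\xi}_{n+1} + \mathbf{R}_n \right], \]
    where
    \[ \boldsymbol{\xi}_{n+1} = n \left\{ \mathbf{x}_{n+1} - \mathbf{x}_n - \mathbb{E}(\mathbf{x}_{n+1} - \mathbf{x}_n \mid \mathcal{F}_n) \right\} \tag{11} \]
    is the martingale term, and the remainder term is
    \[ \mathbf{R}_n = -\frac{S_0 + 2m}{S_0 + 2m(n+1)} \mathbf{P}(\mathbf{x}_n). \]
    Clearly $\mathbf{R}_n \in \mathcal{F}_n$, and as at the end of the proof of Lemma 2.6 one can show that $\|\mathbf{R}_n\| \le c/n$ for some constant $c = c(N, S_0, m)$, which implies that $\sum_{n=1}^{\infty} n^{-1} \|\mathbf{R}_n\| < \infty$ a.s. Finally, to check that $\|\boldsymbol{\xi}_n\|_1 \le 2N$, note that
    \[ \left| x^i_{n+1} - x^i_n \right| = \frac{\left| X^i_{n+1} - X^i_n - 2m x^i_n \right|}{S_{n+1}} \le \frac{2m}{2m(n+1)} = \frac{1}{n+1}, \]
    and then use (11).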
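The last step can be spelled out as follows. This is a sketch of the omitted calculation, using only (11), the coordinatewise bound above (which holds because both $X^i_{n+1} - X^i_n$ and $2m x^i_n$ lie in $[0, 2m]$ and $S_{n+1} \ge 2m(n+1)$), and the triangle inequality.

```latex
\begin{align*}
\bigl| \xi^i_{n+1} \bigr|
  &\le n \Bigl( \bigl| x^i_{n+1} - x^i_n \bigr|
      + \mathbb{E}\bigl( \bigl| x^i_{n+1} - x^i_n \bigr| \,\big|\, \mathcal{F}_n \bigr) \Bigr)
   \le \frac{2n}{n+1} \le 2, \\
\bigl\| \boldsymbol{\xi}_{n+1} \bigr\|_1
  &= \sum_{i=1}^{N} \bigl| \xi^i_{n+1} \bigr| \le 2N .
\end{align*}
```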

    As in the one-dimensional case, intuitively, trajectories of a stochastic approximation process $\{\mathbf{Z}_n\}_{n \ge 0}$ should approximate the trajectories $\{\mathbf{Z}(t)\}_{t \ge 0}$ of the corresponding ODE $d\mathbf{Z}/dt = \mathbf{F}(\mathbf{Z})$. Moreover, since $\{\mathbf{Z}_n\}_{n \ge 0}$ is a stochastic system, we expect that stable trajectories of the ODE should appear, but unstable trajectories should not.

    The main concept in formalizing this intuition is that of an asymptotic pseudotrajectory, introduced by Benaïm and Hirsch [20]. We omit the precise definition, and refer to Benaïm's lecture notes on the topic for more details [18] (see also [49, Section 2.5] for a concise summary). There are many results that give sufficient conditions for a stochastic approximation process to be an asymptotic pseudotrajectory of the corresponding ODE. In particular, [18, Proposition 4.4 and Remark 4.5] (see also [49, Theorem 2.13]), together with Lemma 3.4 and the fact that $\mathbf{P}$ is Lipschitz, imply the following.

    Corollary 3.5. Let $\{\mathbf{x}(t)\}_{t \ge 0}$ linearly interpolate $\{\mathbf{x}_n\}_{n \ge 0}$ at nonintegral times. Then $\{\mathbf{x}(t)\}_{t \ge 0}$ is almost surely an asymptotic pseudotrajectory for the flow induced by the vector field $\mathbf{P}$ via the ODE $d\mathbf{y}/dt = \mathbf{P}(\mathbf{y})$.

    There are further general results about asymptotic pseudotrajectories that apply to the stochastic approximation process $\{\mathbf{x}_n\}_{n \ge 0}$, e.g., about convergence to attractors and nonconvergence to linearly unstable equilibria. However, we omit these, as we prefer to emphasize the main message of Corollary 3.5. The main point is that in order to understand the stochastic approximation process $\{\mathbf{x}_n\}_{n \ge 0}$, we need to understand the vector field $\mathbf{P}$, and the corresponding ODE
    \[ \frac{d\mathbf{y}}{dt} = \mathbf{P}(\mathbf{y}). \]
    Unfortunately, understanding the behavior of such nonlinear ODEs is a notoriously difficult subject (see, e.g., the book by Hirsch, Smale and Devaney [32]). The most successful tool in this area is Lyapunov theory (see, e.g., the recent preprint [19]), and this can indeed be applied to our problem for special values of the parameters; however, it seems difficult to apply this theory to the vector field $\mathbf{P}$ for generic values of the parameters $\{p^i_{\mathbf{u}}\}_{\mathbf{u}, i}$.


    For instance, if $\mathbf{P}$ is a gradient, i.e., $\mathbf{P} = \nabla V$ for some $V : \mathbb{R}^N \to \mathbb{R}$, then Corollary 3.5 and general results about asymptotic pseudotrajectories (see [18]) imply that $\mathbf{x}_n$ converges almost surely and the limit is a point in the zero set $Z_P$, which, by Lemma 3.3, implies that Conjecture 3.2 holds. An example of when $\mathbf{P}$ is a gradient is when the probability of the new node adopting type $i$ depends only on the proportion of type $i$ connections, i.e., $p^i_{\mathbf{u}} = \psi(u^i/m)$ for some function $\psi$ which does not depend on $i$. It is not difficult to show that $\psi$ must be of the form $\psi(z) = \frac{\lambda}{N} + (1 - \lambda) z$ for some $0 \le \lambda \le 1$, which corresponds to a mixture of the linear model and a uniformly random choice. In this case $\mathbf{P}(\mathbf{y}) = \frac{\lambda}{2} \left( \frac{1}{N} \mathbf{1} - \mathbf{y} \right)$, where $\mathbf{1} \in \mathbb{R}^N$ is the vector with all entries equal to 1, and thus when $\lambda > 0$ then $\mathbf{a}_n$ converges a.s. to $\frac{1}{N} \mathbf{1}$.

    However, for generic parameter values $\mathbf{P}$ will not be a gradient. To see this, note that $\mathbf{P}$ being a gradient implies that
    \[ \frac{\partial (\mathbf{P}(\mathbf{y}))^i}{\partial y^j} = \frac{\partial (\mathbf{P}(\mathbf{y}))^j}{\partial y^i} \tag{12} \]
    for every $i \ne j$. Without any restrictions, there are $(N - 1) \binom{m + N - 1}{N - 1}$ free parameters in $\mathbf{P}$. The gradient condition (12) imposes an additional $\binom{N}{2}$ constraints, which will not be satisfied for generic parameter values.

    4 Open problems and future directions

    Our paper leaves open several interesting problems. Two immediate open questions concerning our model are the following.

    Limiting distribution in the linear model for two types. Our Theorem 1.1 gives us information about the limiting behavior of $\{a_n\}_{n \ge 0}$, but it does not identify the distribution of $a := \lim_{n \to \infty} a_n$. For $m = 1$ the process $\{x_n\}_{n \ge 0}$ corresponds to a Pólya urn where whenever one draws a ball, one puts back two extra balls of the same color. This is because when a new node joins the graph, its color automatically becomes the color of its initial neighbor. Thus the distribution of $x$ (and, by Lemma 2.1, the distribution of $a$ as well) is the Beta distribution with parameters $\frac{X_0}{2}$ and $\frac{Y_0}{2}$. However, for $m > 1$ we do not know what the limiting distribution is. Note that simulations show that the limiting distribution can be bimodal; see, e.g., Figure 2b.

    Understanding the vector field $\mathbf{P}$. As discussed in Section 3, in order to understand the behavior of the general nonlinear model in the case of multiple types (and in order to prove or disprove Conjecture 3.2), a good understanding of the vector field $\mathbf{P}$ and the corresponding ODE $d\mathbf{y}/dt = \mathbf{P}(\mathbf{y})$ is needed. We leave this as our second open problem.
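The $m = 1$ case of the first open problem can be explored by simulation. The sketch below simulates the Pólya urn in which two balls of the drawn color are added at each step, and compares the empirical mean and variance of the final red fraction with those of the Beta distribution with parameters $X_0/2$ and $Y_0/2$. The initial values, run counts and seed are hypothetical choices, and a finite number of steps only approximates the limit.

```python
import random

def polya_fraction(X0, Y0, steps, rng):
    """Red fraction after `steps` draws, adding 2 balls of the drawn colour each time."""
    X, Y = X0, Y0
    for _ in range(steps):
        if rng.random() < X / (X + Y):
            X += 2
        else:
            Y += 2
    return X / (X + Y)

def beta_mean_var(a, b):
    """Mean and variance of the Beta(a, b) distribution."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, var

# Hypothetical initial condition: X_0 = 2, Y_0 = 4, i.e. Beta(1, 2) in the limit.
rng = random.Random(0)
samples = [polya_fraction(2, 4, 200, rng) for _ in range(500)]
empirical_mean = sum(samples) / len(samples)
target_mean, target_var = beta_mean_var(2 / 2, 4 / 2)
```

The red fraction is a martingale, so its mean equals $X_0 / (X_0 + Y_0)$ at every step; the shape of the distribution is what the Beta limit pins down. The same loop with a larger number of neighbors would require the full update rule of the model, for which no closed-form limit is given in the text.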

    A key property of our model is its simplicity. However, this also means that certain aspects of real-world networks and processes influencing product adoption are simplified or not considered. It would be interesting to understand the following possible extensions of our model, and, in particular, whether anything can be said analytically in these extensions.

    Changing preferences. In our model once a node receives a type, that type is then fixed and cannot change over time. A possible extension of the model is to allow the type of a node to change over time. This can model changing preferences of individuals, e.g., somebody moving from one mobile phone provider to another.

    Allowing multiple types for a single individual. In our model a node can only have a single type. This is reasonable in many situations (e.g., an individual typically has only one mobile phone provider), but modeling other situations might require allowing nodes to simultaneously have multiple types.

    Other network evolution models. The preferential attachment model is a good approximation of many real-world networks, and it has the advantageous property of being analytically tractable. How does our model behave under other network evolution models? Can similar results be shown analytically/experimentally? Are the results robust to small changes in the network evolution model?


    Other type adoption mechanisms. Our model incorporates a fairly general type adoption mechanism, but various modifications would be interesting to explore. For instance, in real life choices are often made based on the opinions of specific friends, not just based on aggregate information of one's friends.

    Marketing. In essence, our model describes word-of-mouth recommendations, and does not consider marketing efforts by the competing companies, such as advertising. How does incorporating marketing affect the results?

    In conclusion, through a simple model we have coupled network evolution and type adoption, leading to an explanation of coexistence in preferential attachment networks. Exploring various modifications and extensions of this model, such as those mentioned above, will be crucial in determining the robustness of this phenomenon, and will help elucidate our understanding of these processes.

    Acknowledgments

    We thank Erik Bodzsar, Gyorgy Korniss, Geza Meszena, Jasmine Nirody, Nathan Ross, Allan Sly, and Isabelle Stanton for helpful discussions.

    References

    [1] Y.-Y. Ahn, H. Jeong, N. Masuda, and J.D. Noh. Epidemic dynamics of two species of interacting particles on scale-free networks. Physical Review E, 74(6):066113, 2006.

    [2] R.M. Anderson and R.M. May. Infectious diseases of humans: Dynamics and control. Oxford University Press, 1991.

    [3] T. Antunović, Y. Dekel, E. Mossel, and Y. Peres. Competing first passage percolation on random regular graphs. Arxiv preprint arXiv:1109.2575, 2011.

    [4] J. Arndt. Role of Product-Related Conversations in the Diffusion of a New Product. Journal of Marketing Research, 4(3):291-295, 1967.

    [5] W.B. Arthur. Competing technologies, increasing returns, and lock-in by historical events. The Economic Journal, 99(394):116-131, 1989.

    [6] W.B. Arthur. Positive Feedbacks in the Economy. Scientific American, 262(2):92-99, 1990.

    [7] W.B. Arthur. Silicon Valley locational clusters: When do increasing returns imply monopoly? Mathematical Social Sciences, 19(3):235-251, 1990.

    [8] W.B. Arthur. Increasing Returns and Path Dependence in the Economy. The University of Michigan Press, 1994.

    [9] W.B. Arthur. Complexity and the Economy. Science, 284(5411):107-109, 1999.

    [10] W.B. Arthur, Y.M. Ermoliev, and Y.M. Kaniovski. On generalized urn schemes of the Pólya kind. Kibernetika, 19:49-56, 1983.

    [11] W.B. Arthur, Y.M. Ermoliev, and Y.M. Kaniovski. Strong laws for a class of path-dependent stochastic processes, with applications. In Proceedings of the International Conference on Stochastic Optimization, pages 287-300. Springer-Verlag, New York, 1984.

    [12] W.B. Arthur, Y.M. Ermoliev, and Y.M. Kaniovski. Path-dependent processes and the emergence of macro-structure. European Journal of Operational Research, 30(3):294-303, 1987.

    [13] N.T.J. Bailey. The Mathematical Theory of Infectious Diseases and Its Applications. Griffin, London, 1975.


    [14] A. Banerjee and D. Fudenberg. Word-of-mouth learning. Games and Economic Behavior, 46(1):1-22, 2004.

    [15] A.L. Barabási. Scale-free networks: A decade and beyond. Science, 325(5939):412-413, 2009.

    [16] A.L. Barabási and R. Albert. Emergence of scaling in random networks. Science, 286(5439):509-512, 1999.

    [17] A. Barrat, M. Barthélemy, and A. Vespignani. Dynamical Processes on Complex Networks. Cambridge University Press, 2008.

    [18] M. Benaïm. Dynamics of stochastic approximation algorithms. Séminaire de Probabilités XXXIII, pages 1-68, 1999.

    [19] M. Benaïm, I. Benjamini, J. Chen, and Y. Lima. A generalized Pólya's urn with graph based interactions. Arxiv preprint arXiv:1211.1247, 2013.

    [20] M. Benaïm and M.W. Hirsch. Asymptotic Pseudotrajectories and Chain Recurrent Flows, with Applications. Journal of Dynamics and Differential Equations, 8(1):141-176, 1996.

    [21] A. Beutel, B.A. Prakash, R. Rosenfeld, and C. Faloutsos. Interacting viruses in networks: can both survive? In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 426-434. ACM, 2012.

    [22] F.A. Buttle. Word of mouth: understanding and managing referral marketing. Journal of Strategic Marketing, 6(3):241-254, 1998.

    [23] M. Deijfen and R. van der Hofstad. The winner takes it all. Arxiv preprint arXiv:1306.6467, 2013.

    [24] C. Dellarocas. The digitization of word of mouth: Promise and challenges of online feedback mechanisms. Management Science, 49(10):1407-1424, 2003.

    [25] E. Dichter. How Word-of-Mouth Advertising Works. Harvard Business Review, 44(6):147-166, 1966.

    [26] G. Ellison and D. Fudenberg. Rules of Thumb for Social Learning. Journal of Political Economy, 101(4):612-643, 1993.

    [27] G. Ellison and D. Fudenberg. Word-of-mouth communication and social learning. The Quarterly Journal of Economics, 110(1):93-125, 1995.

    [28] J. Goldenberg, B. Libai, and E. Muller. Talk of the Network: A Complex Systems Look at the Underlying Process of Word-of-Mouth. Marketing Letters, 12(3):211-223, 2001.

    [29] T. Gross and B. Blasius. Adaptive coevolutionary networks: a review. Journal of the Royal Society Interface, 5(20):259-271, 2008.

    [30] P.M. Herr, F.R. Kardes, and J. Kim. Effects of Word-of-Mouth and Product-Attribute Information on Persuasion: An Accessibility-Diagnosticity Perspective. Journal of Consumer Research, 17(4):454-462, 1991.

    [31] B.M. Hill, D. Lane, and W. Sudderth. A strong law for some generalized urn processes. The Annals of Probability, 8(2):214-226, 1980.

    [32] M.W. Hirsch, S. Smale, and R.L. Devaney. Differential Equations, Dynamical Systems, and An Introduction to Chaos. Academic Press, 2004.

    [33] P. Holme and J. Saramäki. Temporal networks. Physics Reports, 519(3):97-125, 2012.

    [34] B. Karrer and M.E.J. Newman. Competing epidemics on complex networks. Physical Review E, 84(3):036106, 2011.


    [35] V. Kumar, J.A. Petersen, and R.P. Leone. How Valuable is Word of Mouth? Harvard Business Review, 85(10):139-144, 146, 166, 2007.

    [36] V. Kumar, J.A. Petersen, and R.P. Leone. Driving Profitability by Encouraging Customer Referrals: Who, When, and How. Journal of Marketing, 74(5):1-17, 2010.

    [37] J. Leskovec, L.A. Adamic, and B.A. Huberman. The Dynamics of Viral Marketing. ACM Transactions on the Web, 1(1):1-39, 2007.

    [38] M. Lipsitch, C. Colijn, T. Cohen, W.P. Hanage, and C. Fraser. No coexistence for free: neutral null models for multistrain pathogens. Epidemics, 1(1):2-13, 2009.

    [39] A.L. Lloyd and R.M. May. How viruses spread among computers and people. Science, 292(5520):1316-1317, 2001.

    [40] R.M. May and A.L. Lloyd. Infection dynamics on scale-free networks. Physical Review E, 64(6):066112, 2001.

    [41] M.B. Nevelson and R.Z. Hasminskii. Stochastic Approximation and Recursive Estimation, volume 47 of Translations of Mathematical Monographs. American Mathematical Society, 1976.

    [42] M.E.J. Newman. Spread of epidemic disease on networks. Physical Review E, 66(1):016128, 2002.

    [43] M.E.J. Newman. Threshold effects for two pathogens spreading on a network. Physical Review Letters, 95(10):108701, 2005.

    [44] H. Ohtsuki, C. Hauert, E. Lieberman, and M.A. Nowak. A simple rule for the evolution of cooperation on graphs and social networks. Nature, 441(7092):502-505, 2006.

    [45] R. Pastor-Satorras and A. Vespignani. Epidemic spreading in scale-free networks. Physical Review Letters, 86(14):3200-3203, 2001.

    [46] R. Pemantle. A time-dependent version of Pólya's urn. Journal of Theoretical Probability, 3(4):627-637, 1990.

    [47] R. Pemantle. Nonconvergence to unstable points in urn models and stochastic approximations. The Annals of Probability, 18(2):698-712, 1990.

    [48] R. Pemantle. When are touchpoints limits for generalized Pólya urns? Proceedings of the American Mathematical Society, 113(1):235-243, 1991.

    [49] R. Pemantle. A survey of random processes with reinforcement. Probability Surveys, 4:1-79, 2007.

    [50] B.A. Prakash, A. Beutel, R. Rosenfeld, and C. Faloutsos. Winner takes all: competing viruses or ideas on fair-play networks. In Proceedings of the 21st International Conference on World Wide Web (WWW), pages 1037-1046. ACM, 2012.

    [51] H. Robbins and S. Monro. A stochastic approximation method. The Annals of Mathematical Statistics, 22(3):400-407, 1951.

    [52] M. Shaked and J.G. Shanthikumar. Stochastic Orders. Springer, New York, 2007.

    [53] B. Skyrms and R. Pemantle. A dynamic model of social network formation. Proceedings of the National Academy of Sciences, 97(16):9340-9346, 2000.

    [54] D.J. Watts. A simple model of global cascades on random networks. Proceedings of the National Academy of Sciences, 99(9):5766-5771, 2002.
