# arxiv:1505.02592v1 [nlin.ao] 11 may 2015 ?· department of mathematics, university of portsmouth,...

Post on 21-Aug-2018

212 views

Embed Size (px)

TRANSCRIPT

arX

iv:1

505.

0259

2v1

[nl

in.A

O]

11

May

201

5

Does good memory help you win games?

James BurridgeDepartment of Mathematics, University of Portsmouth, Portsmouth, PO1 2UP, UK

Yu Gao and Yong MaoSchool of Physics and Astronomy, University of Nottingham, Nottingham, NG7 2RD, UK

(Dated: May 12, 2015)

We present a simple game model where agents with different memory lengths compete for finiteresources. We show by simulation and analytically that an instability exists at a critical memorylength, and as a result, different memory lengths can compete and co-exist in a dynamical equilib-rium. Our analytical formulation makes a connection to statistical urn models, and we show thattemperature is mirrored by the agents memory. Our analysis is easily generalisable to many othergame models with implications that we briefly discuss.

PACS numbers: 02.50.Le, 89.75.Fb, 05.65.+b

Introduction All successful forms of life must eventu-ally engage in competition for resources. The equilibriumanalysis of these competitions began with von Neumann[1] and Nash [2]. The theory of games has since found ap-plications in genetics, ecology, economics and sociology[36]. Computational implementation of games leads toagent-based models, which may be of particular impor-tance in understanding the behaviour of financial systems[6, 7]. For example, the particularly successful minoritygame model [811], captures the competition between in-telligent agents with a restricted form of memory. Recentwork suggest such games may be generalised leading toclearly separated regimes of behaviour [12]. In general,understanding the complex collective behaviour arisingfrom the non-linear interactions between individuals is amajor challenge for statistical physics [13, 14].

In this Letter we present a simple game model: in eachround, individual agents pick one of two urns, each of-fering a stochastic yield to be shared by the pickers. Anagent has a memory of these payouts for the previous rounds to aid its decision. Such stochastic yield shar-ing arises in animal foraging behaviour and stock trad-ing [15], where the urns may represent different preyspecies, foraging patches, or stocks. Some form of in-telligence is essential in order to compete [16]. As withthe minority game [8], agents memory in our model isa tool for decision making. However, unlike the minor-ity game, memory in our model is used to make directestimates of the highest paying choice, rather than tosecond guess opponents next moves. In common withthe thermal minority game [10], dynamical urn models[18], and some evolutionary games [19], our agents havea temperature, which captures the level of noise in theswitches they make in search of yields. We find that theadditional noise inherent in their finite memory samples,which is greater for shorter memories, leads the system tobehave as if its agents have a higher temperature. There-fore, increasing memory cools the system. However, ata critical memory, a Hopf bifurcation [20] emerges pro-ducing stable cycles in the numbers of agents in eachurn. Perhaps not surprisingly, a long memory is advan-

tageous, but the presence of these cycles allows shortmemory agents to compete, and a mixed memory systemwill evolve toward the bifurcation point. Our theoreticalformulation follows that of statistical urn models suchas Ehrenfests dog flea model [21], which played an im-portant role in the early development of statistical me-chanics, and more recently allowed analytical investiga-tion of effects such as slow relaxation and condensationin nonequilibrium statistical mechanics [18].

Whilst we have restricted this Letter to a yield sharinggame, the analysis may be carried through for any so-cial system where agents switch between strategies usingtheir memory to determine the optimal choice. Simpleexamples include the Hawk Dove [3] and Rock ScissorsPaper [22] games. Our formulation could also accom-modate the inclusion of topological effects and agent in-teractions [23]. However, even without such complexities,urn models exhibit a remarkable range of non-equilibriumbehavior that is connected with temperature [18]. Ouranalysis suggests that effects such as condensation andthe emergence of order [24], which have social interpre-tation, may also have a connection with memory [23], butthat long memory can also introduce instability.

Model definition Consider the case of two urns anda total of n agents. We let the urn yields, U1(t), U2(t),at round t be random variables uniformly distributed on[0, n] and [0, n] respectively, where > 1 so that urn1 yields more on average than urn 2. We allow agentsaccess to the arithmetic mean of the last payoffs, but wenote that other forms of sampling could be used. Lettingt be the fraction of agents in urn 1, then the differencein the average payoffs between urn 1 and urn 2 is

t :=1

1

s=0

[

U2(t s)n(1 ts)

U1(t s)nts

]

. (1)

We refer to as the memory length of the agents.Agent dynamics is encoded in transition probabilities be-tween urns, which are deterministic functions of t. Ateach round, each agent will switch urns using the proba-

http://arxiv.org/abs/1505.02592v1

2

0.0 0.5 1.0 1.5 2.0 2.5 3.00.60

0.62

0.64

0.66

0.68

0.70

t

t

FIG. 1. Evolution of t when n = 106, = 2, = 10 and

= 106. Memory values are {5, 10, 50, 500} (squares,circles, dots, triangles). Dashed lines are analytical equilib-rium values (see equation (12)).

bilities

W12() =

2[1 + tanh()] (2)

W21() =

2[1 tanh()] . (3)

The parameter , the inverse temperature, capturesthe degree of stochasticity with which agents makechoices. For finite , agents may decide to switch strate-gies even though their estimate of the payoff difference isunfavourable. In the limit agents will only moveif their estimate of the payoff difference indicates thatthe move is favourable. The parameter controls therate at which strategy switching takes place comparedto the rate at which yield information arrives. It mayalso be seen as the frequency with which opportunitiesto switch strategy arise. In the limit 0, at most oneagent will move at each round.Simulation (instability) We simulate the model for a

series of values of when n = 106. Two different valuesof are used; in Figure 1 we have 1 = 106 and inFigure 2 we have = 103. For = 106, the expectednumber of moves at each step is < 1, and appears verystable. For larger , experiences much larger fluctua-tions about the steady state value, driven by the yieldprocess. For shorter memory values these fluctuationsare random, but as approaches 1, periodic oscilla-tions appear and dominate. The appearance of thesestable oscillations at critical memory, c, is known as aHopf bifurcation [20]. By allowing agents access onlyto the mean of their memory, we implicitly assume thatchanges in the expected payoff over the course of theirmemory, brought about by oscillations, are too subtlefor them to infer from noise.Simulation (coexistence) We now investigate how

agents with two different memories compete against oneanother by interpreting the payoff as reproduction rate.We define and as the rates of death, and reproduction

0 1 2 3 4 50.55

0.60

0.65

0.70

0.75

t

t

FIG. 2. Evolution of t when n = 106, = 2, = 5

and = 103. Memory values are {5, 50, 500} (triangles,squares, circles). Dashed line is solution to equation (15)when = 500 and , , are as above.

per unit payoff, respectively. Reproduction is assumedto occur before death in each round, but in practice theprobability of any one agent reproducing and dying inthe same round is extremely small for the , values wechoose. Letting pi (t) be the number of agents with mem-ory in urn i at time t we set the probability of birth foran agent in urn i to be

P(birth) =Ui(t)

pi (t)

. (4)

The death probability for each agent is set equal to .If populations are fixed in size and the system is notin an oscillatory state, then we expect that in equilib-rium the longer memory agents will dominate the highyielding urn. Their long memory allows them to per-ceive smaller statistical advantages that are obscured bynoise for the short memory agents. Using the thermody-namic analogy, the higher temperature (shorter memory)agents are more likely to make moves which leave themin an urn with a lower expected payoff, corresponding toa higher energy state. Above zero temperature, andin the absence of oscillations, the high yield urn will beunder-exploited, placing high memory agents at an ad-vantage. This effect can be observed in Figure 3 wherewe have simulated a mixed population of two memories {10, 1000} beginning with a ratio of 10:1 short mem-ory to long memory agents. We see that initially theadvantage afforded the long memory agents causes theirpopulation to grow, whereas the short memory agents re-duce in number. Were this advantage to be sustained in-definitely then we would expect the short memory agentsto eventually disappear, but in fact the populations stabi-lize. This effect appears because the long memory agentscause oscillations to develop once they are in sufficientlyhigh concentration. In the presence of oscillations theshort memory agents have an advantage because they canquickly observe opportunities offered by the oscillating

3

0 1000 2000 3000 4000 50000.0

0.2

0.4

0.6

0.8

1.0

-5

-4.5

-4

-3.5

-3

t

n-1p t

Log@

E@

2D-

E@D2D

FIG. 3. Scaled populations p(t) := p1(t) + p

2(t) for = 10(circles) and = 1000 (triangles) when n = p(0) = 106, = 103,

Recommended