
Journal of Economic Literature, Vol. XXXVI (September 1998), pp. 1347–1374


Do People Play Nash Equilibrium? Lessons From Evolutionary Game Theory

GEORGE J. MAILATH 1

1. Introduction

AT THE SAME TIME that noncooperative game theory has become a standard tool in economics, it has also come under increasingly critical scrutiny from theorists and experimentalists. Noncooperative game theory, like neoclassical economics, is built on two heroic assumptions: Maximization—every economic agent is a rational decision maker with a clear understanding of the world; and consistency—the agent’s understanding, in particular, expectations, of other agents’ behavior, is correct (i.e., the overall pattern of individual optimizing behavior forms a Nash equilibrium). These assumptions are no less controversial in the context of noncooperative game theory than they are in neoclassical economics.

A major challenge facing noncooperative game theorists today is that of providing a compelling justification for these two assumptions. As I will argue here, many of the traditional justifications are not compelling. But without such justification, the use of game theory in applications is problematic. The appropriate use of game theory requires understanding when its assumptions make sense and when they do not.

In some ways, the challenge of providing a compelling justification is not a new one. A major complaint other social scientists (and some economists) have about economic methodology is the central role of the maximization hypothesis. A common informal argument is that any agent not optimizing—in particular, any firm not maximizing profits—will be driven out by market forces. This is an evolutionary argument, and as is well known, Charles Darwin was led to the idea of natural selection from reading Thomas Malthus.2 But does such a justification work? Is Nash equilibrium (or some related concept) a good predictor of behavior?

1 University of Pennsylvania. Acknowledgments: I thank Robert Aumann, Steven Matthews, Loretta Mester, John Pencavel, three referees, and especially Larry Samuelson for their comments. Email: [email protected].

2 “In October 1838, that is, fifteen months after I had begun my systematic enquiry, I happened to read for amusement ‘Malthus on Population,’ and being well prepared to appreciate the struggle for existence which everywhere goes on from long-continued observation of the habits of animals and plants, it at once struck me that under these circumstances favorable variations would tend to be preserved, and unfavorable ones to be destroyed. The results of this would be the formation of new species. Here, then I had at last got a theory by which to work;” Charles Darwin (1887, p. 83).

While the parallel between noncooperative game theory and neoclassical economics is close, it is not perfect. Certainly, the question of whether agents maximize is essentially the same in both. Moreover, the consistency assumption also appears in neoclassical economics as the assumption that prices clear markets. However, a fundamental distinction between neoclassical economics and noncooperative game theory is that, while the many equilibria of a competitive economy almost always share many of the same properties (such as efficiency or its lack),3 the many equilibria of games often have dramatically different properties. While neoclassical economics does not address the question of equilibrium selection, game theory must.

Much of the work in evolutionary game theory is motivated by two basic questions:

1. Do agents play Nash equilibrium?

2. Given that agents play Nash equilibrium, which equilibrium do they play?

Evolutionary game theory formalizes and generalizes the evolutionary argument given above by assuming that more successful behavior tends to be more prevalent. The canonical model has a population of players interacting over time, with their behavior adjusting over time in response to the payoffs (utilities, profits) that various choices have historically received. These players could be workers, consumers, firms, etc. The focus of study is the dynamic behavior of the system. The crucial assumptions are that there is a population of players, these players are interacting, and the behavior is naive (in two senses: players do not believe—understand—that their own behavior potentially affects future play of their opponents, and players typically do not take into account the possibility that their opponents are similarly engaged in adjusting their own behavior). It is important to note that successful behavior becomes more prevalent not just because market forces select against unsuccessful behavior, but also because agents imitate successful behavior.

Since evolutionary game theory studies populations playing games, it is also useful for studying social norms and conventions. Indeed, many of the motivating ideas are the same.4 The evolution of conventions and social norms is an instance of players learning to play an equilibrium. A convention can be thought of as a symmetric equilibrium of a coordination game. Examples include a population of consumers who must decide which type of good to purchase (in a world of competing standards); a population of workers who must decide how much effort to exert; a population of traders at a fair (market) who must decide how aggressively to bargain; and a population of drivers randomly meeting at intersections who must decide who gives way to whom.

Evolutionary game theory has provided a qualified affirmative answer to the first question: In a range of settings, agents do (eventually) play Nash. There is thus support for equilibrium analysis in environments where evolutionary arguments make sense. Equilibrium is best viewed as the steady state of a community whose members are myopically groping toward maximizing behavior. This is in marked contrast to the earlier view (which, as I said, lacks satisfactory foundation), according to which game theory and equilibrium analysis are the study of the interaction of (ultra-) rational agents with a large amount of (common) knowledge.5

3 Or perhaps economists have chosen to study only those properties that are shared by all equilibria. For example, different competitive equilibria have different income distributions.

4 See, for example, Jon Elster (1989), Brian Skyrms (1996), Robert Sugden (1989), and H. Peyton Young (1996).

The question of which equilibrium is played has received a lot of attention, most explicitly in the refinements literature. The two most influential ideas in that literature are backward and forward induction. Backward induction and its extensions—subgame perfection and sequentiality—capture notions of credibility and sequential rationality. Forward induction captures the idea that a player’s choice of current action can be informative about his future play. The concerns about adequate foundations extend to these refinement ideas. While evolutionary game theory does discriminate between equilibria, backward induction receives little support from evolutionary game theory. Forward induction receives more support. One important new principle for selecting an equilibrium, based on stochastic stability, does emerge from evolutionary game theory, and this principle discriminates between strict equilibria (something backward and forward induction cannot do).

The next section outlines the major justifications for Nash equilibrium, and the difficulties with them. In that section, I identify learning as the best available justification for Nash equilibrium. Section 3 introduces evolutionary game theory from a learning perspective. The idea that Nash equilibrium can be usefully thought of as an evolutionary stable state is described in Section 4. The question of which Nash equilibrium is played is then discussed in Section 5. As much as possible, I have used simple examples. Very few theorems are stated (and then only informally). Recent surveys of evolutionary game theory include Eric van Damme (1987, ch. 9), Michihiro Kandori (1997), Mailath (1992), and Jörgen Weibull (1995).

2. The Question

Economics and game theory typically assume that agents are “rational” in the sense of pursuing their own welfare, as they see it. This hypothesis is tautological without further structure, and it is usually further assumed that agents understand the world as well as (if not better than) the researcher studying the world inhabited by these agents. This often requires an implausible degree of computational and conceptual ability on the part of the agents. For example, while chess is strategically trivial,6 it is computationally impossible to solve (at least in the foreseeable future).

Computational limitations, however, are in many ways less important than conceptual limitations of agents. The typical agent is not like Gary Kasparov, the world champion chess player who not only knows the rules of chess, but also knows that he doesn’t know the winning strategy. In most situations, people do not know they are playing a game. Rather, people have some (perhaps imprecise) notion of the environment they are in, their possible opponents, the actions they and their opponents have available, and the possible payoff implications of different actions. These people use heuristics and rules of thumb (generated from experience) to guide behavior; sometimes these heuristics work well and sometimes they don’t.7 These heuristics can generate behavior that is inconsistent with straightforward maximization. In some settings, the behavior can appear as if it was generated by concerns of equity, fairness, or revenge.

5 Even in environments where an evolutionary analysis would not be appropriate, equilibrium analysis is valuable in illuminating the strategic structure of the game.

6 Since chess is a finite game of perfect information, it has been known since 1912 that either White can force a win, Black can force a win, or either player can force a draw (Ernst Zermelo 1912).

I turn now to the question of consistency. It is useful to first consider situations that have aspects of coordination, i.e., where an agent (firm, consumer, worker, etc.) maximizes his welfare by choosing the same action as the majority. For example, in choosing between computers, consumers have a choice between PCs based on Microsoft’s operating systems and Apple-compatible computers. There is significantly more software available for Microsoft-compatible computers, due to the market share of the Microsoft computers, and this increases the value of the Microsoft-compatible computers. Firms must often choose between different possible standards.

The first example concerns a team of workers in a modern version of Jean-Jacques Rousseau’s (1950) Stag Hunt.8 In the example, each worker can put in low or high effort, the team’s total output (and so each worker’s compensation) is determined by the minimum effort of all the workers, and effort is privately costly. Suppose that if all workers put in low effort, the team produces a per capita output of 3, while if all workers put in high effort, per capita output is 7. Suppose, moreover, the disutility of high effort is 2 (valued in the same units as output). We can thus represent the possibilities, as in Figure 1.9 It is worth emphasizing at this point that the characteristics that make the stag hunt game interesting are pervasive. In most organizations, the value of a worker’s effort is increasing in the effort levels of the other workers.

What should we predict to be the outcome? Consider a typical worker, whom I call Bruce for definiteness. If all the other workers are only putting in low effort, then the best choice for Bruce is also low effort: high effort is costly, and choosing high effort cannot increase output (since output is determined by the minimum effort of all the workers). Thus, if Bruce expects the other workers to put in low effort, then Bruce will also put in low effort. Since all workers find themselves in the same situation, we see that all workers choosing to put in low effort is a Nash equilibrium: each worker is behaving in his own best interest, given the behavior of the others. Now suppose the workers (other than Bruce) are putting in high effort. In this case, the best choice for Bruce is now high effort. While high effort is costly, Bruce’s choice of high rather than low effort now does affect output (since Bruce’s choice is the minimum) and so the increase in output (+4) is more than enough to justify the increase in effort. Thus, if Bruce expects all the other workers to be putting in high effort, then he will also put in high effort. As for the low-effort case, a description of behavior in which all workers choose high effort constitutes a Nash equilibrium.

7 Both Andrew Postlewaite and Larry Samuelson have made the observation that in life there are no one-shot games and no “last and final offers.” Thus, if an experimental subject is placed in an artificial environment with these properties, the subject’s heuristic will not work well until it has adjusted to this environment. I return to this in my discussion of the ultimatum game.

8 Rousseau’s (1950) stag hunt describes several hunters in the wilderness. Individually, each hunter can catch rabbits and survive. Acting together, the hunters can catch a stag and have a feast. However, in order to catch a stag, every hunter must cooperate in the stag hunt. If even one hunter does not cooperate (by catching rabbits), the stag escapes.

9 The stag hunt game is an example of a coordination game. A pure coordination game differs from the game in Figure 1 by having zeroes in the off-diagonal elements. The game in Figure 13 is a pure coordination game.

Figure 1. A “stag-hunt” played by workers in a team.

                        minimum of other workers' efforts
                              high        low
    worker's    high           5           0
    effort      low            3           3

These two descriptions of worker behavior (all choose low effort and all choose high effort) are internally consistent; they are also strict: Bruce strictly prefers to choose the same effort level as the other workers.10 This implies that even if Bruce is somewhat unsure about the minimum effort choice (in particular, as long as Bruce assigns a probability of no more than 0.4 to some worker choosing the other effort choice), this does not affect his behavior.

But are these two descriptions good predictions of behavior? Should we, as outside observers, be confident in a prediction that all the workers in Bruce’s team will play one of the two Nash equilibria? And if so, why and which one? Note that this is not the same as asking if Bruce will choose an effort that is consistent with equilibrium. After all, both choices (low and high effort) are consistent with equilibrium, and so Bruce necessarily chooses an equilibrium effort.
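Bruce's best-reply reasoning can be checked mechanically. The following is a minimal sketch using the payoff values of Figure 1; the function and variable names are illustrative, not part of the article:

```python
# Payoffs from Figure 1: row = Bruce's effort,
# column = minimum of the other workers' efforts.
payoff = {
    ("high", "high"): 5,
    ("high", "low"): 0,
    ("low", "high"): 3,
    ("low", "low"): 3,
}

def best_reply(others_min):
    """Bruce's best effort given the minimum effort of the others."""
    return max(["high", "low"], key=lambda e: payoff[(e, others_min)])

print(best_reply("high"))  # high: matching high effort is optimal
print(best_reply("low"))   # low: matching low effort is optimal

# Strictness: high effort remains optimal as long as Bruce assigns
# probability at least 0.6 to the others' minimum being high, i.e.,
# no more than 0.4 to some worker choosing low. At p = 0.6 exactly,
# the two expected payoffs coincide.
p = 0.6
exp_high = p * payoff[("high", "high")] + (1 - p) * payoff[("high", "low")]
exp_low = p * payoff[("low", "high")] + (1 - p) * payoff[("low", "low")]
print(exp_high, exp_low)  # 3.0 3.0
```

Each effort level is a best reply to itself, which is exactly the statement that both symmetric profiles are (strict) Nash equilibria.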

Rather, the question concerns the behavior of the group as a whole. How do we rule out Bruce choosing high effort because he believes everyone else will, while Sheila (another worker on Bruce’s team) chooses low effort because she believes everyone else will?11 This is, of course, ruled out by equilibrium considerations. But what does that mean? The critical feature of the scenario just described is that the expectations of Bruce or Sheila about the behavior of the other members of the team are incorrect, something that Nash equilibrium by definition does not allow.

As I said earlier, providing a compelling argument for Nash equilibrium is a major challenge facing noncooperative game theory today.12 The consistency in Nash equilibrium seems to require that players know what the other players are doing. But where does this knowledge come from? When or why is this a plausible assumption? Several justifications are typically given for Nash equilibria: preplay communication, self-fulfilling prophecies (consistent predictions), focal points, and learning.13

The idea underlying preplay communication is straightforward. Suppose the workers in Bruce’s team meet before they must choose their effort levels and discuss how much effort they each will exert. If the workers reach an agreement that they all believe will be followed, it must be a Nash equilibrium (otherwise at least one worker has an incentive to deviate). This justification certainly has appeal and some range of applicability. However, it does not cover all possible applications. It also assumes that an agreement is reached, and, once reached, is kept. While it seems clear that an agreement will be reached and kept (and which one) in our stag hunt example (at least if the team is small!), this is not true in general. The first difficulty, discussed by Aumann (1990), is that an agreement may not be kept. Suppose we change the payoff in the bottom left cell of Figure 1 from 3 to 4 (so that a worker choosing low effort, when the minimum of the other workers’ effort is high, receives a payoff of 4 rather than 3). In this version of the stag hunt, Bruce benefits from high-effort choices by the other workers, irrespective of his own choice. Bruce now has an incentive to agree to high effort, no matter what he actually intends to do (since this increases the likelihood of high effort by the other workers). But then reaching the agreement provides no information about the intended play of workers and so may not be kept. The second difficulty is that no agreement may be reached. Suppose, for example, the interaction has the characteristics of a battle-of-the-sexes game (Figure 2).

10 A strict Nash equilibrium is a Nash equilibrium in which, given the play of the opponents, each player has a unique best reply.

11 While the term “coordination failure” would seem to be an apt one to describe this scenario, that term is commonly understood (particularly in macroeconomics) to refer to coordination on an inefficient equilibrium (such as low effort chosen by all workers).

12 The best introspective (i.e., knowledge or epistemic) foundations for Nash equilibrium assume that each player’s conjectures about the behavior of other players are known by all the players; see Robert Aumann and Adam Brandenburger (1995). This assumption does not appear to be a significant improvement over the original assumption of Nash behavior. More generally, there is no compelling introspective argument for any useful equilibrium notion. The least controversial are those that do not require the imposition of a consistency condition, such as rationalizability (introduced by Douglas Bernheim 1984 and David Pearce 1984; for two players it is equivalent to the iterated deletion of strictly dominated strategies), the iterated deletion of weakly dominated strategies, and backward induction. However, in most games, rationalizability does little to constrain players’ behavior, and as we will see below, the iterated deletion of weakly dominated strategies and backward induction are both controversial.

13 What if there is only one equilibrium? Does this by itself give us a reason to believe that the unique equilibrium will be played? The answer is no.
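Aumann's point can be checked numerically. A sketch, using the modified Figure 1 payoffs (bottom-left cell raised from 3 to 4; the dictionary layout is illustrative): whatever Bruce himself does, he is better off when the others' minimum effort is high, so his urging the others toward high effort reveals nothing about his own intentions.

```python
# Modified stag hunt of Aumann's (1990) discussion:
# row = Bruce's effort, column = minimum of the others' efforts.
payoff = {
    ("high", "high"): 5, ("high", "low"): 0,
    ("low", "high"): 4, ("low", "low"): 3,
}

# Whatever Bruce chooses, his payoff rises with the others' effort,
# so agreeing to "high" is costless talk for him.
for own in ("high", "low"):
    assert payoff[(own, "high")] > payoff[(own, "low")]
    print(own, "->", payoff[(own, "high")], ">", payoff[(own, "low")])
```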

Such a game, which may describe a bargaining interaction, has several Pareto noncomparable Nash equilibria: there are several profitable agreements that can be reached, but the bargainers have opposed preferences over which agreement is reached. In this case, it is not clear that an agreement will be reached. Moreover, if the game does have multiple Pareto noncomparable Nash equilibria, then the preplay communication stage is itself a bargaining game and so perhaps should be explicitly modelled (at which point, the equilibrium problem resurfaces).14 Finally, there may be no possibility of preplay communication.
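The two Pareto noncomparable equilibria can be found mechanically. A generic sketch of a pure-strategy equilibrium finder, applied to the wage-bargaining payoffs of Figure 2 (the function and variable names are illustrative):

```python
# Pure-strategy Nash equilibria of the Figure 2 wage-bargaining game.
actions = ["high wage", "low wage"]
# (employee payoff, employer payoff), indexed by (employee, employer)
payoffs = {
    ("high wage", "high wage"): (3, 2),
    ("high wage", "low wage"): (0, 0),
    ("low wage", "high wage"): (0, 0),
    ("low wage", "low wage"): (2, 3),
}

def pure_nash(actions, payoffs):
    """Profiles where neither player gains by deviating unilaterally."""
    eq = []
    for a in actions:
        for b in actions:
            u1, u2 = payoffs[(a, b)]
            if (all(payoffs[(a2, b)][0] <= u1 for a2 in actions)
                    and all(payoffs[(a, b2)][1] <= u2 for b2 in actions)):
                eq.append((a, b))
    return eq

print(pure_nash(actions, payoffs))
# [('high wage', 'high wage'), ('low wage', 'low wage')]
```

The two profiles found give payoffs (3,2) and (2,3): each is an equilibrium, but the employee prefers the first and the employer the second, which is exactly the bargaining problem described above.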

The second justification of self-fulfilling prophecy runs as follows: If a theory uniquely predicting players’ behaviors is known by the players in the game, then it must predict Nash equilibria (see Roger Myerson (1991, pp. 105–108) for an extended discussion of this argument). The difficulty, of course, is that the justification requires a theory that uniquely predicts player behavior, and that is precisely what is at issue.

The focal point justification, due to Thomas Schelling (1960), can be phrased as “if there is an obvious way to play in a game (derived from either the structure of the game itself or from the setting), then players will know what other players are doing.” There are many different aspects of a game that can single out an “obvious way to play.” For example, considerations of fairness may make equal divisions of a surplus particularly salient in a bargaining game. Previous experience suggests that stopping at red lights and going through green is a good strategy, while another possible strategy (go at red and stop at green) is not (even though it is part of another equilibrium). It is sometimes argued that efficiency is such an aspect: if an equilibrium gives a higher payoff to every player than any other equilibrium, then players “should not have any trouble coordinating their expectations at the commonly preferred equilibrium point” (John Harsanyi and Reinhard Selten 1988, p. 81). In our earlier stag hunt example, this principle (called payoff dominance by Harsanyi and Selten 1988) suggests that the high-effort equilibrium is the obvious way to play. On the other hand, the low-effort equilibrium is less risky, with Bruce receiving a payoff of 3, no matter what the other members of his team do. In contrast, it is possible that a choice of high effort yields a payoff of only 0. As we will see, evolutionary game theory has been particularly important in addressing this issue of riskiness and payoff dominance. See Kreps (1990a) for an excellent extended discussion and further examples of focal points.15

Figure 2. A “Battle-of-the-sexes” between an employer and a potential employee bargaining over wages. Each simultaneously makes a wage demand or offer. The worker is only hired if they agree.

                               employer
                         high wage   low wage
    employee  high wage     3,2        0,0
              low wage      0,0        2,3

13 (cont.) It is possible that the unique Nash equilibrium yields each player their maximin values, while at the same time being riskier (in the sense that the Nash equilibrium strategy does not guarantee the maximin value). This is discussed by, for example, John Harsanyi (1977, p. 125) and Aumann (1985). David Kreps (1990a, p. 135) describes a complicated game with a unique equilibrium that is also unlikely to be played.

14 The preplay communication stage might also involve a correlating device, like a coin. For example, the players might agree to flip a coin: if heads, then the worker receives a high wage, while if tails, the worker receives a low wage.

Finally, agents may be able to learn to play an equilibrium. In order to learn to play an equilibrium, players must be playing the same game repeatedly, or at least, similar games that can provide valuable experience. Once all players have learned how their opponents are playing, and if all players are maximizing, then we must be at a Nash equilibrium. There are two elements to this learning story. The first is that, given maximizing behavior of players, players can learn the behavior of their opponents.16 The second is that players are maximizing. This involves, as I discussed earlier, additional considerations of learning. Even if a player knows how their opponents have played (for example, the player may be the “last mover”), they may not know what the best action is. A player will use their past experience, as well as the experience of other players (if that is available and relevant), to make forecasts as to the current or future behavior of opponents, as well as the payoff implications of different actions. Since learning itself changes the environment that the agents are attempting to learn (as other agents change their behavior in response to their own learning), the process of learning is quite subtle. Note that theories of learning have a focal point aspect, since histories (observed patterns of play of other agents) can serve as coordinating devices that make certain patterns of play “the obvious way to play,” as in the traffic example from the previous paragraph.
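The forecasting-from-history idea can be illustrated with fictitious play, a classic belief-based learning rule studied in this literature: each player best-responds to the empirical frequency of the opponent's past actions. This is a sketch under illustrative assumptions (stag hunt payoffs of Figure 1, one fictitious initial observation of each action), not any of the specific models cited here.

```python
# Fictitious play in the stag hunt of Figure 1 (two players).
payoff = {("high", "high"): 5, ("high", "low"): 0,
          ("low", "high"): 3, ("low", "low"): 3}

def best_reply(belief_high):
    """Best effort given the believed probability the opponent plays high."""
    u_high = belief_high * payoff[("high", "high")] \
        + (1 - belief_high) * payoff[("high", "low")]
    u_low = belief_high * payoff[("low", "high")] \
        + (1 - belief_high) * payoff[("low", "low")]
    return "high" if u_high > u_low else "low"

# Each player starts with one fictitious observation of each action.
counts = [{"high": 1, "low": 1}, {"high": 1, "low": 1}]
for _ in range(20):
    actions = []
    for i in (0, 1):
        opp = counts[1 - i]
        belief = opp["high"] / (opp["high"] + opp["low"])
        actions.append(best_reply(belief))
    for i in (0, 1):
        counts[i][actions[1 - i]] += 1  # update empirical frequencies

print(actions)  # play settles on one of the two equilibria
```

Starting from uniform beliefs, the believed probability of high (0.5) is below the 0.6 threshold, so both players play low and the histories reinforce the low-effort equilibrium: the risk consideration, not payoff dominance, drives the outcome here.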

The discussion so far points out someof the problems with many of the stan-dard justifications for Nash equilibriumand its two assumptions (maximizationand consistency). Besides learning, an-other approach is to concede the lack ofa sound foundation for consistency, butmaintain the hypothesis that agentsmaximize. The question then is whetherrationality and knowledge of the game(including the rationality of opponents)

15 Kreps (1990b, ch. 12) is a formal version ofKreps (1990a).

16 Non-evolutionary game theory work on learn-ing has focused on the question of when maximiz-ing players can in fact learn the behavior of theiropponents. Examples of this work include DrewFudenberg and David Kreps (1989), Drew Fuden-berg and David Levine (1993), Faruk Gul (1996),and Ehud Kalai and Ehud Lehrer (1993).

Mailath: Do People Play Nash Equilibrium? 1353

Page 8: Mailath - Do People Play Nash Equilibrium

is enough, at least in some interestingcases, to yield usable predictions. Thetwo (related) principles that peoplehave typically used in applications arebackward induction and the iterated de-letion of weakly dominated strategies.17 In many games, these procedures iden-tify a unique outcome. The difficulty isthat they require an implausible degreeof rationality and knowledge of otherplayers’ rationality. The two key exam-ples here are Rosenthal’s centipedegame (so called because of the appear-ance of its extensive form—Figure 3 isa short centipede) and the finitely re-peated Prisoner’s dilemma. The centi-pede is conceptually simpler, since it isa game of perfect information. The cru-cial feature of the game is that eachplayer, when it is their turn to move,strictly prefers to end the game imme-diately (i.e., choose En), rather thanhave the opponent end the game on thenext move by choosing En+1. More-over, at each move, each player strictlyprefers to have play proceed for a fur-ther two moves, rather than end imme-diately. For the game in Figure 3, ifplay reaches the last possible move,player I surely ends the game by choos-ing E3 rather than C3. Knowing this,player II should choose E2. The induc-tion argument then leads to the conclu-sion that player I necessarily ends thegame on the first move.18 I suspecteveryone is comfortable with the pre-diction that in a two-move game, playerI stops the game immediately. How-

ever, while this logic is the same inlonger versions, many researchers areno longer comfortable with the sameprediction.19 It only requires oneplayer to think that there is somechance that the other player is willingto play C initially to support playing Cin early moves. Similarly, if we considerthe repeated prisoner’s dilemma, thelogic of backward induction (togetherwith the property that the unique one-period dominant-strategy equilibriumyields each player their maximin payoff)implies that even in early periods coop-eration is not possible.
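The induction argument can be sketched mechanically. The payoff numbers below are illustrative (chosen, as in the text, so that each mover prefers ending now to being ended at the next move, yet prefers play to continue two further moves); the solver itself is a generic backward-induction pass over the nodes.

```python
# Backward induction on a short centipede game.
# Each node: (mover, payoff pair to (I, II) if the mover Ends here).
nodes = [("I", (10, 0)), ("II", (9, 100)), ("I", (1000, 99))]
pass_payoff = (999, 10000)  # payoffs if the final mover Continues

def solve(nodes, pass_payoff):
    """Return the backward-induction outcome and each mover's choice."""
    value = pass_payoff  # continuation value past the last node
    plan = []
    for mover, end_pay in reversed(nodes):
        i = 0 if mover == "I" else 1
        if end_pay[i] > value[i]:  # mover prefers ending the game here
            value = end_pay
            plan.append((mover, "End"))
        else:
            plan.append((mover, "Continue"))
    return value, list(reversed(plan))

outcome, plan = solve(nodes, pass_payoff)
print(outcome)  # (10, 0): player I ends the game on the first move
print(plan)     # every mover chooses End
```

Every mover chooses End, so the predicted outcome is the first-move payoff, even though both players would do far better if play continued, which is precisely the tension discussed above.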

3. Evolutionary Game Theory and Learning

The previous section argued that, ofthe various justifications that have beenadvanced for equilibrium analysis,learning is the least problematic. Evolu-tionary game theory is a particularly at-tractive approach to learning. In thetypical evolutionary game-theoreticmodel, there is a population of agents,each of whose payoff is a function ofnot only how they behave, but also howthe agents they interact with behave. Atany point in time, behavior within thepopulation is distributed over the dif-ferent possible strategies, or behaviors.If the population is finite, a state (of the

17 Backward induction is also the basic principleunderlying the elimination of equilibria relying on“incredible” threats.

18 This argument is not special to the extensiveform. If the centipede is represented as a normalform game, this backward induction is mimickedby the iterated deletion of weakly dominatedstrategies. While this game has many Nash equi-libria, they all involve the same behavior on theequilibrium path: player I chooses E1. The equi-libria only differ in the behavior of players off-the-equilibrium-path.

Figure 3. A short centipede game.

99910000

I II I

E1

100

9100

100099

E2 E3

C1 C2 C3

19 Indeed, more general knowledge-based con-siderations have led some researchers to focus onthe procedure of one round of weak dominationfollowed by iterated rounds of strict domination.A nice (although technical) discussion is EddieDekel and Faruk Gul (1997).


population) is a description of which agents are choosing which behavior. If the population is infinite, a state is a description of the fractions of the population that are playing each strategy. If a player can maximize, and knows the state, then he can choose a best reply. If he does not know the state of the population, then he must draw inferences about the state from whatever information he has. In addition, even given knowledge of the state, the player may not be able to calculate a best reply. Calculating a best reply requires that a player know all the strategies available and the payoff implications of all these strategies. The observed history of play is now valuable for two reasons. First, the history conveys information about how the opponents are expected to play. Second, the observed success or failure of various choices helps players determine what might be good strategies in the future. Imitation is often an important part of learning; successful behavior tends to be imitated. In addition, successful behavior will be taught. To the extent that players are imitating successful behavior and not explicitly calculating best replies, it is not necessary for players to distinguish between knowledge of the game being played and knowledge of how opponents are playing. Players need know only what was successful, not why it was successful.

Evolutionary game-theoretic models are either static or dynamic, and the dynamics are either in discrete or continuous time. A discrete time dynamic is a function that specifies the state of the population in period t + 1 as a function of the state in period t, i.e., given a distribution of behavior in the population, the dynamic specifies next period's distribution. A continuous time dynamic specifies the relative rates of change of the fractions of the population playing each strategy as a function of the current state. Evolutionary (also known as selection or learning) dynamics specify that behavior that is successful this period will be played by a larger fraction in the immediate future. Static models study concepts that are intended to capture stability ideas motivated by dynamic stories without explicitly analyzing dynamics.

An important point is that evolutionary dynamics do not build in any assumptions on behavior or knowledge, other than the basic principle of differential selection—apparently successful behavior increases its representation in the population, while unsuccessful behavior does not.

Evolutionary models are not structural models of learning or bounded rationality. While the motivation for the basic principle of differential selection involves an appeal to learning and bounded rationality, individuals are not explicitly modeled. The feature that successful behavior last period is an attractive choice this period does seem to require that agents are naive learners. They do not believe, or understand, that their own behavior potentially affects future play of their opponents, and they do not take into account the possibility that their opponents are similarly engaged in adjusting their own behavior. Agents do not look for patterns in historical data. They behave as if the world is stationary, even though their own behavior should suggest to them it is not.20 Moreover, agents behave as if they believe that other agents' experience is relevant for them. Imitation then seems reasonable. Note that the context here is important. This style

20 It is difficult to build models of boundedly rational agents who look for patterns. Masaki Aoyagi (1996) and Doron Sonsino (1997) are rare examples of models of boundedly rational agents who can detect cycles.


of modeling does not lend itself to small numbers of agents. If there is only a small population, is it plausible to believe that the agents are not aware of this? And if agents are aware, then imitation is not a good strategy. As we will see, evolutionary dynamics have the property that, in large populations, if they converge, then they converge to a Nash equilibrium.21 This property is a necessary condition for any reasonable model of social learning. Suppose we had a model in which behavior converged to something that is not a Nash equilibrium. Since the environment is eventually stationary and there is a behavior (strategy) available to some agent that yields a higher payoff, then that agent should eventually figure this out and so deviate.22

One concern sometimes raised about evolutionary game theory is that its agents are implausibly naive. This concern is misplaced. If an agent is boundedly rational, then he does not understand the model as written. Typically, this model is very simple (so that complex dynamic issues can be studied) and so the bounds of the rationality of the agents are often quite extreme. For example, agents are usually not able to detect any cycles generated by the dynamics. Why then are the agents not able to figure out what the modeler can? As in most of economic theory, the role of

models is to improve our intuition and to deepen our understanding of how particular economic or strategic forces interact. For this literature to progress, we must analyze (certainly now, and perhaps forever) simple and tractable games. The games are intended as examples, experiments, and allegories. Modelers do not make assumptions of bounded rationality because they believe players are stupid, but rather that players are not as sophisticated as our models generally assume. In an ideal world, modelers would study very complicated games and understand how agents who are boundedly rational in some way behave and interact. But the world is not ideal and these models are intractable. In order to better understand these issues, we need to study models that can be solved. Put differently, the bounds on rationality are to be understood relative to the complexity of the environment.

4. Nash Equilibrium as an Evolutionary Stable State

Consider a population of traders that engage in randomly determined pairwise meetings. As is usual, I will treat the large population as being infinite. Suppose that when two traders meet, each can choose one of two strategies, "bold" and "cautious." If a trader has chosen to be "bold," then he will bargain aggressively, even to the point of losing a profitable trade; on the other hand, if a trader has chosen to be "cautious," then he will never lose a profitable trade. If a bold trader bargains with a cautious trader, a bargain will be struck that leaves the majority of gains from trade with the bold trader. If two cautious traders bargain, they equally divide the gains from trade. If two bold traders bargain, no agreement is reached. One meeting between two

21 More accurately, they converge to a Nash equilibrium of the game determined by the strategies that are played along the dynamic path. It is possible that the limit point fails to be a Nash equilibrium because a strategy that is not played along the path has a higher payoff than any strategy played along the path. Even if there is "drift" (see the ultimatum game below), the limit point will be close to a Nash equilibrium.

22 Of course, this assumes that this superior strategy is something the agent could have thought of. If the strategy is never played, then the agent might never think of it.


traders is depicted as the symmetric game in Figure 4.23

Behaviors with higher payoffs are more likely to be followed in the future. Suppose that the population originally consists only of cautious traders. If no trader changes his behavior, the population will remain completely cautious. Now suppose there is a perturbation that results in the introduction of some bold traders into the population. This perturbation may be the result of entry: perhaps traveling traders from another community have arrived; or experimentation: perhaps some of the traders are not sure that they are behaving optimally and try something different.24 In a population of cautious traders, bold traders also consummate their deals and receive a higher payoff than the cautious traders. So over time, the fraction of cautious traders in the population will decrease and the fraction of bold traders will increase. However, once there are enough bold traders in the population, bold traders no longer have an advantage (on average) over cautious traders (since two bold traders cannot reach an agreement), and so the fraction of bold traders will always be strictly less than one. Moreover, if the population consists entirely of bold traders, a cautious trader can successfully invade the population. The only stable population is divided between bold and cautious traders, with the precise fraction determined by payoffs. In our example, the stable population is equally divided between bold and cautious traders. This is the distribution with the property that bold and cautious traders have equal payoffs (and so also describes a mixed-strategy Nash equilibrium). At that distribution, if the population is perturbed, so that, for example, slightly more than half the population is now bold while slightly less than half is cautious, cautious traders have a higher payoff, and so learning will lead to an increase in the number of cautious traders at the expense of bold traders, until balance is once again restored.

It is worth emphasizing that the final state is independent of the original distribution of behavior in the population, and that this state corresponds to the symmetric Nash equilibrium. Moreover, this Nash equilibrium is dynamically stable: any perturbation from this state is always eliminated.

This parable illustrates the basics of an evolutionary game theory model, in particular, the interest in the dynamic behavior of the population. The next section describes the well-known notion of an evolutionary stable strategy, a static notion that attempts to capture dynamic stability. Section 4.2 then describes explicit dynamics, while Section 4.3 discusses asymmetric games.

4.1. Evolutionary Stable Strategies

In the biological setting, the idea that a stable pattern of behavior in a population should be able to eliminate any invasion by a "mutant" motivated John Maynard Smith and G. R. Price (1973) to define an evolutionary stable strategy (ESS).25 If a population pattern of be-

23 This is the Hawk-Dove game traditionally used to introduce the concept of an evolutionary stable strategy.

24 In the biological context, this perturbation is referred to as a mutation or an invasion.

Figure 4. Payoff to a trader following the row strategy against a trader following the column strategy. The gains from trade are 4.

              Bold    Cautious
  Bold          0        3
  Cautious      1        2

25 Good references on biological dynamics and evolution are Maynard Smith (1982) and Josef Hofbauer and Karl Sigmund (1988).


havior is to eliminate invading mutations, it must have a higher fitness than the mutant in the population that results from the invasion. In biology, animals are programmed (perhaps genetically) to play particular strategies and the payoff is interpreted as "fitness," with fitter strategies having higher reproductive rates (reproduction is asexual).

It will be helpful to use some notation at this point. The collection of available behaviors (strategies) is S and the payoff to the agent choosing i when his opponent chooses j is π(i,j). We will follow most of the literature in assuming that there is only a finite number of available strategies. Any behavior in S is called pure. A mixed strategy is a probability distribution over pure strategies. While any pure strategy can be viewed as the mixed strategy that places probability one on that pure strategy, it will be useful to follow the convention that the term "mixed strategy" always refers to a mixed strategy that places strictly positive probability on at least two strategies. Mixed strategies have two leading interpretations as a description of behavior in a population: either the population is monomorphic, in which every member of the population plays the same mixed strategy, or the population is polymorphic, in which each member plays a pure strategy and the fraction of the population playing any particular pure strategy equals the probability assigned to that pure strategy by the mixed strategy.26 As will be clear, the notion of an evolutionary stable strategy is best understood by assuming that each agent can choose a

mixed strategy and the population is originally monomorphic.

Definition 1. A (potentially mixed) strategy p is an Evolutionary Stable Strategy (ESS) if:

1. the payoff from playing p against p is at least as large as the payoff from playing any other strategy against p; and

2. for any other strategy q that has the same payoff as p against p, the payoff from playing p against q is at least as large as the payoff from playing q against q.27

Thus, p is an evolutionary stable strategy if it is a symmetric Nash equilibrium, and if, in addition, when q is also a best reply to p, then p does better against q than q does. For example, a is an ESS in the game in Figure 5.

ESS is a static notion that attempts to capture dynamic stability. There are two cases to consider: The first is that p is a strict Nash equilibrium (see footnote 10). Then, p is the only best reply to itself (so p must be pure), and any agent playing q against a population whose members mostly play p receives a lower payoff (on average) than a p player. As a result, the fraction of the population playing q shrinks.

The other possibility is that p is not the only best reply to itself (if p is a mixed strategy, then any other mixture with the same or smaller support is also a best reply to p). Suppose q is a best

26 The crucial distinction is whether agents can play and inherit (learn) mixed strategies. If not, then any mixed strategy state is necessarily the result of a polymorphic population. On the other hand, even if agents can play mixed strategies, the population may be polymorphic with different agents playing different strategies.

Figure 5. The numbers in the matrix are the row player's payoff. The strategy a is an ESS in this game.

        a    b
  a     2    1
  b     2    0

27 The payoff to p against q is π(p,q) = Σ_ij π(i,j) p_i q_j. Formally, the strategy p is an ESS if, for all q, π(p,p) ≥ π(q,p), and if there exists q ≠ p such that π(p,p) = π(q,p), then π(p,q) > π(q,q).


reply to p. Then both an agent playing p and an agent playing q earn the same payoff against a population of p players. After the perturbation of the population by the entry of q players, however, the population is not simply a population of p players. There is also a small fraction of q players, and their presence will determine whether the q players are eliminated. The second condition in the definition of ESS guarantees that in the perturbed population, it is the p players who do better than the q players when they play against a q player.
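As an illustrative sketch (added here, not part of the article), the two conditions of Definition 1 can be checked numerically for the game in Figure 5. The grid of mutant mixtures and the tolerance are arbitrary choices, and condition 2 is tested in its strict form, following footnote 27.

```python
# Hypothetical sketch: testing the two ESS conditions of Definition 1
# for the game in Figure 5 against a grid of mutant strategies q.
PI = {('a', 'a'): 2, ('a', 'b'): 1,   # row player's payoffs pi(i,j)
      ('b', 'a'): 2, ('b', 'b'): 0}

def payoff(p, q):
    """Expected payoff pi(p,q) = sum_ij pi(i,j) p_i q_j (footnote 27)."""
    return sum(PI[i, j] * p[i] * q[j] for i in 'ab' for j in 'ab')

def is_ess(p, grid=101, tol=1e-9):
    for k in range(grid):
        w = k / (grid - 1)
        q = {'a': w, 'b': 1 - w}
        if abs(q['a'] - p['a']) < tol:
            continue                       # q is (numerically) p itself
        # Condition 1: p must be a best reply to itself.
        if payoff(q, p) > payoff(p, p) + tol:
            return False
        # Condition 2 (strict form): against an alternative best reply q,
        # p must do better against q than q does against itself.
        if abs(payoff(q, p) - payoff(p, p)) < tol and \
           payoff(p, q) <= payoff(q, q) + tol:
            return False
    return True

print(is_ess({'a': 1.0, 'b': 0.0}))   # strategy a: True
print(is_ess({'a': 0.0, 'b': 1.0}))   # strategy b: False
```

For strategy a, every mutant mixture is an alternative best reply (the first column of the matrix is constant), so the test rests entirely on condition 2: the margin π(p,q) − π(q,q) works out to (1 − w)², which is positive for every mutant weight w < 1.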

4.2. The Replicator and Other More General Dynamics

While plausible, the story underlying ESS suffers from its reliance on the assumption that agents can learn (in the biological context, inherit) mixed strategies. ESS is only useful to the extent that it appropriately captures some notion of dynamic stability. Suppose individuals now choose only pure strategies. Define p_i^t as the proportion of the population choosing strategy i at time t. The state of the population at time t is then p^t ≡ (p_1^t, …, p_n^t), where n is the number of strategies (of course, p^t is in the n − 1 dimensional simplex). The simplest evolutionary dynamic one could use to investigate the dynamic properties of ESS is the replicator dynamic. In its simplest form, this dynamic specifies that the proportional rate of growth in a strategy's representation in the population, p_i^t, is given by the extent to which that strategy does better than the population average.28 The payoff to strategy i when the state of the population is p^t is π(i,p^t) = Σ_j π(i,j) p_j^t, while the population average payoff is π(p^t,p^t) = Σ_ij π(i,j) p_i^t p_j^t. The continuous time replicator dynamic is then:

    dp_i^t/dt = p_i^t × (π(i,p^t) − π(p^t,p^t)).    (1)

Thus, if strategy i does better than average, its representation in the population grows (dp_i^t/dt > 0), and if another strategy i′ is even better, then its growth rate is also higher than that of strategy i. Equation (1) is a differential equation that, together with an initial condition, uniquely determines a path for the population that describes, for any time t, the state of the population.
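As a numerical sketch (added here, not in the article), equation (1) can be simulated by an Euler discretization for the trader game of Figure 4. The step size and horizon are arbitrary choices; from any interior starting point the bold fraction approaches the stable value 1/2.

```python
# Euler discretization of the replicator dynamic (1) for the trader
# game of Figure 4; the state p = (fraction bold, fraction cautious).
PI = [[0.0, 3.0],    # bold vs (bold, cautious)
      [1.0, 2.0]]    # cautious vs (bold, cautious)

def replicator_step(p, dt=0.01):
    """One Euler step: p_i <- p_i + dt * p_i * (pi(i,p) - pi(p,p))."""
    payoffs = [sum(PI[i][j] * p[j] for j in range(2)) for i in range(2)]
    avg = sum(p[i] * payoffs[i] for i in range(2))
    return [p[i] + dt * p[i] * (payoffs[i] - avg) for i in range(2)]

p = [0.9, 0.1]                   # start with 90 percent bold
for _ in range(20000):
    p = replicator_step(p)
print(round(p[0], 3))            # -> 0.5, the mixed equilibrium fraction

# An extinct strategy never reappears: the all-cautious state is a
# rest point of (1), even though it is not a Nash equilibrium.
print(replicator_step([0.0, 1.0]))   # -> [0.0, 1.0]
```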

A state is a rest point of the dynamics if the dynamics leave the state unchanged (i.e., dp_i^t/dt = 0 for all i). A rest point is Liapunov stable if the dynamics do not take states close to the rest point far away. A rest point is asymptotically stable if, in addition, any path (implied by the dynamics) that starts sufficiently close to the rest point converges to that rest point.

There are several features of the replicator dynamic to note. First, if a pure strategy is extinct (i.e., no fraction of the population plays that pure strategy) at any point of time, then it is never played. In particular, any state in which the same pure strategy is played by every agent (and so every other strategy is extinct) is a rest point of the replicator dynamic. So, being a rest point is not a sufficient condition for Nash equilibrium. This is a natural feature that we already saw in our discussion of the game in Figure 4—if everyone is bargaining cautiously and if traders are not aware of the possibilities of bold play, then there is no reason for traders to change their behavior (even though a rational agent who understood the payoff implications of the different strategies available would choose bold behavior rather than cautious).

Second, the replicator dynamic is not

28 The replicator dynamic can also be derived from more basic biological arguments.


a best reply dynamic: strategies that are not best replies to the current population will still increase their representation in the population if they do better than average (this feature only becomes apparent when there are at least three available strategies). This again is consistent with the view that this is a model of boundedly rational learning, where agents do not understand the full payoff implications of the different strategies.

Finally, the dynamics can have multiple asymptotically stable rest points. The asymptotic distribution of behavior in the population can depend upon the starting point. Returning to the stag hunt game of Figure 1, if a high fraction of workers has chosen high effort historically, then those workers who had previously chosen low effort would be expected to switch to high effort, and so the fraction playing high effort would increase. On the other hand, if workers have observed low effort, perhaps low effort will continue to be observed. Under the replicator dynamic (or any deterministic learning dynamic), if p^0_high > 3/5, then p^t_high → 1, while if p^0_high < 3/5, then p^t_high → 0. The equilibrium that players eventually learn is determined by the original distribution of players across high and low effort. If the original distribution is random (e.g., p^0_high is determined as a realization of a uniform random variable), then the low effort equilibrium arises with probability 3/5, one and a half times as likely as the high effort equilibrium. This notion of path dependence—that history matters—is important and attractive.
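The threshold dynamics can be sketched in code. Figure 1 is not reproduced in this excerpt, so the payoffs below are hypothetical stand-ins, chosen only so that the mixed rest point sits at p_high = 3/5 as the text states; everything else follows equation (1).

```python
# Hypothetical stag-hunt payoffs with the unstable mixed rest point at
# 3/5: pi(High, High) = 5, pi(High, Low) = 0, pi(Low, .) = 3.  These
# are illustrative stand-ins, not the actual payoffs of Figure 1.
def limit_of_high(p0, dt=0.01, steps=50000):
    """Replicator dynamic for p = fraction playing High:
    dp/dt = p(1-p) * (pi(High, state) - pi(Low, state)) = p(1-p)(5p - 3)."""
    p = p0
    for _ in range(steps):
        p += dt * p * (1 - p) * (5 * p - 3)
    return p

print(round(limit_of_high(0.61), 2))   # above the 3/5 threshold -> 1.0
print(round(limit_of_high(0.59), 2))   # below the threshold -> 0.0
```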

E. Zeeman (1980) and Peter Taylor and Leo Jonker (1978) have shown that if p is an ESS, it is asymptotically stable under the continuous time replicator dynamic, and that there are examples of asymptotically stable rest points of the replicator dynamic that are not ESS. If dynamics are specified that allow for

mixed strategy inheritance, then p is an ESS if and only if it is asymptotically stable (see W. G. S. Hines 1980, Arthur Robson 1992, and Zeeman 1981). A point I will come back to is that both asymptotic stability and ESS are concerned with the stability of the system after a once and for all perturbation. They do not address the consequences of continual perturbations. As we will see, depending upon how they are modeled, continual perturbations can profoundly change the nature of learning.

While the results on the replicator dynamic are suggestive, the dynamics are somewhat restrictive, and there has been some interest in extending the analysis to more general dynamics.29 Interest has focused on two classes of dynamics. The first, monotone dynamics, roughly requires that on average, players switch from worse to better (not necessarily the best) pure strategies. The second, more restrictive, class, aggregate monotone dynamics, requires that, in addition, the switching of strategies has the property that the induced distribution over strategies in the population has a higher average payoff. It is worth noting that the extension to aggregate monotone dynamics is not that substantial: aggregate monotone dynamics are essentially multiples of the replicator dynamic (Samuelson and Jianbo Zhang 1992).

29 There has also been recent work exploring the link between the replicator dynamic and explicit models of learning. Tilman Börgers and Rajiv Sarin (1997) consider a single boundedly rational decision maker using a version of the Robert Bush and Frederick Mosteller (1951, 1955) model of positive reinforcement learning, and show that the equation describing individual behavior looks like the replicator dynamic. John Gale, Kenneth Binmore, and Larry Samuelson (1995) derive the replicator dynamic from a behavioral model of aspirations. Karl Schlag (1998) derives the replicator dynamic in a bandit setting, where agents learn from others.


Since, by definition, a Nash equilibrium is a strategy profile with the property that every player is playing a best reply to the behavior of the other players, every Nash equilibrium is a rest point of any monotone dynamic. However, since the dynamics may not introduce behavior that is not already present in the population, not all rest points are Nash equilibria. If a rest point is asymptotically stable, then the learning dynamics, starting from a point close to the rest point but with all strategies being played by a positive fraction of the population, converge to the rest point. Thus, if the rest point is not a Nash equilibrium, some agents are not optimizing and the dynamics will take the system away from that rest point. This is the first major message from evolutionary game theory: if the state is asymptotically stable, then it describes a Nash equilibrium.

4.3. Asymmetric Games and Nonexistence of ESS

The assumption that the traders are drawn from a single population and that the two traders in any bargaining encounter are symmetric is important. In order for two traders in a trading encounter to be symmetric in this way, their encounter must be on "neutral" ground, and not in one of their establishments.30 Another possibility is that each encounter occurs in one of the traders' establishments, and the behavior of the traders depends upon whether he is the visitor or the owner of the establishment. This is illustrated in Figure 6. This game has three Nash equilibria: The stable profile of a 50/50 mixture between cautious and bold behavior is still a mixed strategy equilibrium (in fact, it is the only symmetric

equilibrium). There are also two pure strategy asymmetric equilibria (the owner is bold, while the visitor is cautious; and the owner is cautious, while the visitor is bold). Moreover, these two pure strategy equilibria are strict.

Symmetric games, like that in Figure 6, have the property that the strategies available to the players do not depend upon their role, i.e., row (owner) or column (visitor). The assumption that in a symmetric game, agents cannot condition on their role is called no role identification. In most games of interest, the strategies available to a player also depend upon his role. Even if not, as in the example just discussed, there is often some distinguishing feature that allows players to identify their role (such as the row player being the incumbent and the column player being an entrant in a contest for a location). Such games are called asymmetric.

The notion of an ESS can be applied to such games by either changing the definition to allow for asymmetric mutants (as in Jeroen Swinkels 1992), or, equivalently, by symmetrizing the game. The symmetrized game is the game obtained by assuming that, ex ante, players do not know which role they will have, so that players' strategies specify behavior conditional on different roles, and first having a move of nature that allocates each player to a role. (In the trader example, a coin flip for each trader determines whether the trader stays "home" this period, or visits another establishment.) However, every

30 More accurately, the traders' behavior cannot depend on the location.

Figure 6. Payoff to two traders bargaining in the row trader's establishment. The first number is the owner's payoff, the second is the visitor's. The gains from trade are 4.

                         visiting trader
                         Bold     Cautious
  owner   Bold           0,0        3,1
          Cautious       1,3        2,2


ESS in such a symmetrized game is a strict equilibrium (Selten 1980). This is an important negative result, since most games (in particular, non-trivial extensive form games) do not have strict equilibria, and so ESSs do not exist for most asymmetric games.

The intuition for the nonexistence is helpful in what comes later, and is most easily conveyed if we take the monomorphic interpretation of mixed strategies. Fix a non-strict Nash equilibrium of an asymmetric game, and suppose it is the row player that has available other best replies. Recall that a strategy specifies behavior for the agent in his role as the row player and also in his role as the column player. The mutant of interest is given by the strategy that specifies one of these other best replies for the row role and the existing behavior for the column role. This mutant is not selected against, since its expected payoff is the same as that of the remainder of the population. First note that all agents in their column role still behave the same as before the invasion. Then, in his row role, the mutant is (by assumption) playing an alternative best reply, and in his column role he receives the same payoff as every other column player (since he is playing the same as them). See Figure 7. The idea that evolutionary (or selection) pressures may not be effective against alternative best replies plays an important role in subsequent work on

cheap talk and forward and backward induction.

It is also helpful to consider the behavior of dynamics in a two-population world playing the trader game. There are two populations, owners and visitors. A state now consists of the pair (p,q), where p is the fraction of owners who are bold, while q is the fraction of visitors who are bold.31 There is a separate replicator dynamic for each population, with the payoff to a strategy followed by an owner depending only on q, and not p (this is the observation from the previous paragraph that row players do not interact with row players). While (p*,q*), where p* = q* = 1/2, is still a rest point of the two-dimensional dynamical system describing the evolution of trader behavior, it is no longer asymptotically stable. The phase diagram is illustrated in Figure 8. If there is a perturbation, then owners and visitors move toward one of the two strict pure equilibria. Both of the asymmetric equilibria are asymptotically stable.
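A sketch (added here, not from the article) of the two-population dynamics: from the Figure 6 payoffs, a bold owner earns 3 − 3q against a population with bold-visitor fraction q while a cautious owner earns 2 − q, giving the owner replicator dynamic dp/dt = p(1−p)(1−2q), and symmetrically dq/dt = q(1−q)(1−2p) for visitors. The step size and horizon are arbitrary.

```python
# Two-population replicator dynamic for the owner/visitor trader game
# (Figure 6): p = fraction of bold owners, q = fraction of bold visitors.
def simulate(p, q, dt=0.01, steps=200000):
    for _ in range(steps):
        p, q = (p + dt * p * (1 - p) * (1 - 2 * q),
                q + dt * q * (1 - q) * (1 - 2 * p))
    return p, q

# A small perturbation away from the symmetric rest point (1/2, 1/2)
# is amplified, and play settles on a strict asymmetric equilibrium.
p_star, q_star = simulate(0.51, 0.49)
print(round(p_star, 3), round(q_star, 3))   # -> 1.0 0.0 (bold owners,
                                            #    cautious visitors)
```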

If the game is asymmetric, then we

Figure 7. The asymmetric version of the game in Figure 5. The strategy (a,α) is not an ESS of the symmetrized game, since the payoff earned by the strategy (a,α) when playing against (b,α) equals the payoff earned by the strategy (b,α) when playing against (b,α) (which equals 1/2 π(a,α) + 1/2 π(α,b) = 3/2).

        α      β
  a    2,2    1,2
  b    2,1    0,0

Figure 8. The phase diagram for the two population trader game. (Horizontal axis: proportion of bold owners; vertical axis: proportion of bold visitors.)

31 This is equivalent to considering dynamics in the one-population model where the game is symmetrized and p is the fraction of the population who are bold in the owner role.


have already seen that the only ESS are strict equilibria. There is a similar lack of power in considering asymptotic stability for general dynamics in asymmetric games. In particular, asymptotic stability in asymmetric games implies "almost" strict Nash equilibria, and if the profile is pure, it is strict.32 For the special case of replicator dynamics, a Nash equilibrium is asymptotically stable if and only if it is strict (Klaus Ritzberger and Jörgen Weibull 1995).

5. Which Nash Equilibrium?

Beyond excluding non-Nash behavior as stable outcomes, evolutionary game theory has provided substantial insight into the types of behavior that are consistent with evolutionary models, and the types of behavior that are not.

5.1. Domination

The strongest positive results concern the behavior of the dynamics with respect to strict domination: If a strategy is strictly dominated by another (that is, it yields a lower payoff than the other strategy for all possible choices of the opponents), then over time that strictly dominated strategy will disappear (a smaller fraction of the population will play that strategy). Once that strategy has (effectively) disappeared, any strategy that is now strictly dominated (given the deletion of the original dominated strategy) will also now disappear.
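A sketch of this claim (added here; the game is a hypothetical three-strategy example, not from the article): under the replicator dynamic, the share of a strictly dominated strategy vanishes.

```python
# Hypothetical symmetric 3-strategy game in which strategy C is strictly
# dominated by strategy A (each entry in C's row is 1 less than A's).
PI = [[1.0, 2.0, 3.0],    # A
      [2.0, 1.0, 0.0],    # B
      [0.0, 1.0, 2.0]]    # C: strictly dominated by A

def step(p, dt=0.01):
    """One Euler step of the replicator dynamic (1)."""
    payoffs = [sum(PI[i][j] * p[j] for j in range(3)) for i in range(3)]
    avg = sum(p[i] * payoffs[i] for i in range(3))
    return [p[i] + dt * p[i] * (payoffs[i] - avg) for i in range(3)]

p = [1 / 3, 1 / 3, 1 / 3]
for _ in range(5000):
    p = step(p)
print(round(p[2], 4))    # share of the dominated strategy C -> 0.0
```

Because π(A,j) − π(C,j) = 1 for every j, the ratio p_C/p_A falls at a constant exponential rate regardless of how the rest of the population moves.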

There is an important distinction here between strict and weak domination. It is not true that weakly dominated strategies are similarly eliminated. Consider the game taken from Samuelson and Zhang (1992, p. 382) in Figure 9. It is worth noting that this is the normal form of the extensive form game with perfect information given in Figure 10.

There are two populations, with agents in population 1 playing the role of player 1 and agents in population 2 playing the role of player 2. Any state in which all agents in population 2 play L is Liapunov stable: Suppose the state starts with almost all agents in population 2 playing L. There is then very little incentive for agents in the first population playing B to change behavior, since T is only marginally better (moreover, if the game played is in fact the extensive form of Figure 10, then agents in population 1 only have a choice to make if they are matched with one of the few agents choosing R). So, dynamics will not move the state far from its starting point, if the starting point has mostly L-playing agents in population 2. If we model the dynamics for this game as a two-dimensional con-

32 Samuelson and Zhang (1992, p. 377) has the precise statement. See also Daniel Friedman (1991) and Weibull (1995) for general results on continuous time dynamics.

Figure 9. Any state in which all agents in population 2 play L is a stable rest point.

                       Player 2
                       L      R
  Player 1   T        1,1    1,0
             B        1,1    0,0

Figure 10. An extensive form game with the normal form in Figure 9. (Player 2 moves first, choosing L, which ends the game with payoffs (1,1), or R, after which player 1 chooses T, with payoffs (1,0), or B, with payoffs (0,0).)


tinuous time replicator dynamic, wehave33

dpt

dt = pt × (1 − pt)(1 − qt)

anddqt

dt = qt × (1 − qt),

where pt is the fraction of population 1playing T and qt is the fraction of popula-tion 2 playing L. The adjustment of thefraction of population 2 playing L re-flects the strict dominance of L over R:Since L always does better than R, if qt isinterior (that is, there are both agentsplaying L and R in the population) thefraction playing L increases (dqt⁄dt > 0),independently of the fraction in popula-tion 1 playing T, with the adjustmentonly disappearing as qt approaches 1.The adjustment of the fraction of popu-lation 1 playing T, on the other hand de-pends on the fraction of population 2playing R: if almost all of population 2 isplaying L (qt is close to 1), then the ad-justment in pt is small (in fact, arbitrarilysmall for qt arbitrarily close to 1), nomatter what the value of pt. The phasediagram is illustrated in Figure 11.
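As a rough check on this description, the two differential equations can be integrated numerically. The sketch below is my own illustration, not from the article; the forward-Euler step size and horizon are arbitrary choices.

```python
# Forward-Euler integration of the replicator dynamic for the game in
# Figure 9: p is the fraction of population 1 playing T, q the fraction
# of population 2 playing L.

def simulate(p0, q0, dt=0.01, steps=2000):
    p, q = p0, q0
    for _ in range(steps):
        dp = p * (1 - p) * (1 - q)  # T's advantage over B vanishes as q -> 1
        dq = q * (1 - q)            # L strictly dominates R, so q always rises
        p, q = p + dp * dt, q + dq * dt
    return p, q

p, q = simulate(p0=0.5, q0=0.9)
print(p, q)  # q is driven to (essentially) 1, while p barely moves from 0.5
```

Starting at q_0 = 0.9, the total movement in p stays small because 1 − q_t shrinks exponentially, which is why trajectories stop near the horizontal segment of rest points in Figure 11.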

It is also important to note that no rest point is asymptotically stable. Even the state with all agents in population 1 playing T and all agents in population 2 playing L is not asymptotically stable, because a perturbation that increases the fraction of population 2 playing R while leaving the entire population 1 playing T will not disappear: the system has been perturbed toward another rest point. Nothing prevents the system from “drifting” along the heavy lined horizontal segment in Figure 11. Moreover, each time the system is perturbed from a rest point toward the interior of the state space (so that there is a positive fraction of population 1 playing B and of population 2 playing R), it typically returns to a different rest point (and for many perturbations, to a rest point with a higher fraction of population 2 playing L). This logic is reminiscent of the intuition given earlier for the nonexistence of ESS in asymmetric games. It also suggests that sets of states will have better stability properties than individual states.

Recall that if a single strategy profile is “evolutionarily stable,” then behavior within the population, once near that profile, converges to it and never leaves it. A single strategy profile describes the aggregate pattern of behavior within the population. A set of strategy profiles is a collection of such descriptions. Loosely, we can think of a set of strategy profiles as being “evolutionarily stable” if behavior within the population, once near any profile in the set, converges to some profile within it and never leaves the set. The important feature is that behavior within the population need not settle down to a steady state; rather, it can “drift” between the different patterns within the “evolutionarily stable” set.

33 The two-population version of (1) is

dp_t^i/dt = p_t^i × (π_1(i, q_t) − π_1(p_t, q_t))

and

dq_t^j/dt = q_t^j × (π_2(p_t, j) − π_2(p_t, q_t)),

where π_k(i, j) is player k’s payoff from the strategy pair (i, j) and π_k(p, q) = Σ_{i,j} π_k(i, j) p^i q^j.

Figure 11. The phase diagram for the game with weak dominance. [Horizontal axis: fraction playing T, p_t; vertical axis: fraction playing L, q_t.]

5.2. The Ultimatum Game and Backward Induction

The ultimatum game is a simple game with multiple Nash equilibria and a unique backward induction outcome. There is $1 to divide. The proposer proposes a division to the responder. The responder either accepts the division, in which case it is implemented, or rejects it, in which case both players receive nothing. If the proposer can make any proposal, the only backward induction solution has the responder accepting the proposal in which the proposer receives the entire dollar. If the dollar must be divided into whole pennies, there is another solution in which the responder rejects the proposal in which he receives nothing and accepts the proposal of 99 cents to the proposer and 1 cent to the responder. This prediction is uniformly rejected in experiments!
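The multiplicity of Nash equilibria in the penny-division version can be checked by brute force. In the sketch below (my construction, not the article's), responder strategies are acceptance thresholds and offers are denominated in pennies to the responder.

```python
# Enumerate the accepted-offer Nash equilibrium outcomes of the
# 100-penny ultimatum game, where the responder plays a threshold
# strategy "accept iff offer >= threshold".

def is_nash(offer, threshold):
    # The offer must be accepted on path (offer >= threshold). Accepting
    # any nonnegative amount is optimal for the responder, so only the
    # proposer's incentive binds: the cheapest accepted deviation is an
    # offer of exactly `threshold`, so the proposer must already be there.
    accepted = offer >= threshold
    no_profitable_deviation = (100 - offer) >= (100 - threshold)
    return accepted and no_profitable_deviation

nash_outcomes = sorted({o for o in range(101)
                        for t in range(101) if is_nash(o, t)})
print(nash_outcomes == list(range(101)))  # True: every split is a Nash outcome
```

Backward induction, by contrast, selects only the lowest thresholds, leaving the responder at most a penny; the multiplicity above is what leaves room for learning dynamics to settle elsewhere.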

How do Gale, Binmore, and Samuelson (1995) explain this? The critical issue is the relative speeds of convergence. Both proposers and responders are “learning.” The proposers are learning which offers will be rejected (this is learning as we have been discussing it). In principle, if the environment (terms of the proposal) is sufficiently complicated, responders may also have difficulty evaluating offers. In experiments, responders do reject as much as 30 cents. However, it is difficult to imagine that responders do not understand that 30 cents is better than zero. There are at least two possibilities that still allow the behavior of the responders to be viewed as learning. The first is that responders do not believe the rules of the game as described and take time to learn that the proposer really was making a take-it-or-leave-it offer. As mentioned in footnote 7, most people may not be used to take-it-or-leave-it offers, and it may take time for the responders to properly appreciate what this means. The second is that the monetary reward is only one ingredient in responders’ utility functions, and that responders must learn what the “fair” offer is.

If proposers learn sufficiently fast relative to responders, then there can be convergence to a Nash equilibrium that is not the backward induction solution. In the backward induction solution, the responder gets almost nothing, so that the cost of making an error is low, while the proposer loses significantly if he misjudges the acceptance threshold of the responder. In fact, in a simplified version of the ultimatum game, Nash equilibria in which the responder gets a substantial share are stable (although not asymptotically stable). In a non-backward induction outcome, the proposer never learns that he could have offered less to the responder (since he never observes such behavior). If he is sufficiently pessimistic about the responder’s acceptance threshold, then he will not offer less, since a large share is better than nothing. Consider a simplified ultimatum game that gives the proposer and responder two choices: the proposer either offers even division or a small positive payment, and the responder only responds to the small positive payment (he must accept the equal division). Figure 12 is the phase diagram for this simplified game. While this game has some similarities to that in Figure 9, there is also an important difference. All the rest points inconsistent with the backward induction solution (with the crucial exception of the point labelled A) are Liapunov stable (but not asymptotically so). Moreover, in response to some infrequent perturbations, the system will effectively “move along” these rest points toward A. But once at A (unlike the corresponding stable rest point in Figure 11), the system will move far away. The rest point labeled A is not stable: Perturbations near A can move the system onto a trajectory that converges to the backward induction solution.

While this seems to suggest that non-backward-induction solutions are fragile, such a conclusion is premature. Since any model is necessarily an approximation, it is important to allow for drift. Binmore and Samuelson (1996) use the term drift to refer to unmodeled small changes in behavior. One way of allowing for drift would be to add to the learning dynamic an additional term reflecting a deterministic flow from one strategy to another that is independent of payoffs. In general, this drift would be small and, in the presence of strong incentives to learn, irrelevant. However, if players (such as the responders above) have little incentive to learn, then the drift term becomes more important. In fact, adding an arbitrarily small uniform drift term to the replicator dynamic changes the dynamic properties in a fundamental way. With no drift, the only asymptotically stable rest point is the backward induction solution. With drift, there can be another asymptotically stable rest point near the non-backward-induction Nash equilibria.

The ultimatum game is special in its simplicity. Ideas of backward induction and sequential rationality have been influential in more complicated games (like the centipede game, the repeated prisoner’s dilemma, and alternating offer bargaining games). In general, backward induction has received little support from evolutionary game theory for more complicated games (see, for example, Ross Cressman and Karl Schlag (1995), Georg Nöldeke and Samuelson (1993), and Giovanni Ponti (1996)).

5.3. Forward Induction and Efficiency

In addition to backward induction, the other major refinement idea is forward induction. In general, forward induction receives more support from evolutionary arguments than does backward induction. The best examples of this are in the context of cheap talk (starting with Akihiko Matsui 1991). Some recent papers are V. Bhaskar (1995), Andreas Blume, Yong-Gwan Kim, and Joel Sobel (1993), Kim and Sobel (1995), and Karl Wärneryd (1993); see Sobel (1993) and Samuelson (1993) for a discussion.

Forward induction is the idea that actions can convey information about the future intentions of players even off the equilibrium path. Cheap talk games (signaling games in which the messages are costless) are ideal to illustrate these ideas. Cheap talk games have both revealing equilibria (cheap talk can convey information) and the so-called babbling equilibria (messages do not convey any information because neither the sender nor the receiver expects them to), and forward induction has been used to eliminate some nonrevealing equilibria.

Figure 12. A suggestive phase diagram for a simplified version of the ultimatum game. [Horizontal axis: fraction of proposers offering the even split; vertical axis: fraction of responders rejecting; the unstable rest point is labeled A.]

Consider the following cheap talk game: There are two states of the world, rain and sun. The sender knows the state and announces a message, rain or sun. On the basis of the message, the receiver chooses an action, picnic or movie. Both players receive 1 if the receiver’s action agrees with the state of the world (i.e., picnic if sun, and movie if rain), and 0 otherwise. Thus, the sender’s message is payoff irrelevant and so is “cheap talk.” The obvious pattern of behavior is for the sender to signal the state by making a truthful announcement in each state and for the receiver to then choose the action that agrees with the state. In fact, since the receiver can infer the state from the announcement (if the announcement differs across states), there are two separating equilibrium profiles (truthful announcing, where the sender announces rain if rain and sun if sun; and false announcing, where the sender announces sun if rain and rain if sun).

A challenge for traditional noncooperative theory, however, is that babbling is also an equilibrium: The sender places equal probability on rain and sun, independent of the state. The receiver, learning nothing from the message, places equal probability on picnic and movie. Consider ESS in the symmetrized game (where each player has equal probability of being the sender or receiver). It turns out that only separating equilibria are ESS. The intuition is in two parts. First, the babbling equilibrium is not an ESS: Consider the truthful entrant (who announces truthfully and responds to announcements by choosing the action that agrees with the announcement). The payoff to this entrant is strictly greater than that of the babbling strategy against the perturbed population (both receive the same payoff when matched with the babbling strategy, but the truthful strategy does strictly better when matched with the truthful strategy). Second, the separating equilibria are strict equilibria, and so ESSs. Suppose, for example, all players are playing the truthful strategy. Then any other strategy must yield strictly lower payoffs: Either, as a sender, the strategy specifies an announcement in some state that does not correspond to that state, leading to an incorrect action choice by the truthful receiver, or, as a receiver, the strategy specifies an incorrect action after a truthful announcement.
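The two-part intuition can be verified by direct computation. In the sketch below (my encoding of the example, not the article's), strategies map states to message distributions and messages to action distributions, and the symmetrized payoff averages over the two roles.

```python
# Exact expected payoffs for the truthful and babbling strategies in the
# symmetrized rain/sun cheap talk game.

CORRECT = {"sun": "picnic", "rain": "movie"}  # action that matches each state

def send(kind, state):
    """Distribution over messages."""
    return {state: 1.0} if kind == "truthful" else {"sun": 0.5, "rain": 0.5}

def act(kind, msg):
    """Distribution over actions."""
    if kind == "truthful":
        return {CORRECT[msg]: 1.0}            # act as if the message were true
    return {"picnic": 0.5, "movie": 0.5}      # babbling: ignore the message

def payoff(sender, receiver):
    """Expected common payoff: 1 iff the action matches the state."""
    return sum(
        0.5 * pm * pa                          # the two states equally likely
        for state in ("sun", "rain")
        for msg, pm in send(sender, state).items()
        for action, pa in act(receiver, msg).items()
        if action == CORRECT[state]
    )

def u(me, other):                              # symmetrized: each role w.p. 1/2
    return 0.5 * payoff(me, other) + 0.5 * payoff(other, me)

# Truthful ties babbling against babbling (1/2 each) but earns 1 against
# itself, so it invades: babbling is not an ESS.
print(u("truthful", "babbling"), u("babbling", "babbling"), u("truthful", "truthful"))
```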

This simple example is driven by the crucial assumption that the number of messages equals the number of states. If there are more messages than states, then there are no strict equilibria, and so no ESSs. To obtain similar efficiency results for a larger class of games, we need to use set-valued solution concepts, such as cyclical stability (Itzhak Gilboa and Matsui 1991), used by Matsui (1991), and equilibrium evolutionary stability (Swinkels 1992), used by Blume, Kim, and Sobel (1993). Matsui (1991) and Kim and Sobel (1995) study coordination games with a preplay round of communication. In such games, communication can allow players to coordinate their actions. However, as in the above example, there are also babbling equilibria, so that communication appears not to guarantee coordination. Evolutionary pressures, on the other hand, destabilize the babbling equilibria. Blume, Kim, and Sobel (1993) study cheap talk signaling games like that of the example. Bhaskar (1995) obtains efficiency with noisy preplay communication, and shows that, in his context at least, the relative importance of noise and mutations is irrelevant.

The results in this area strongly suggest that evolutionary pressures can destabilize inefficient outcomes. The key intuition is that suggested by the example above. If an outcome is inefficient, then there is an entrant that is equally successful against the current population, but that can achieve the efficient outcome when playing against a suitable opponent. A crucial aspect is that the model allow the entrant sufficient flexibility to achieve this, and that is the role of cheap talk above.

5.4. Multiple Strict Equilibria

Multiple best replies for a player raise the question of determining which of these best replies are “plausible” or “sensible.” The refinements literature (of which backward and forward induction are a part) attempted to answer this question, and by so doing eliminate some equilibria as being uninteresting. Multiple strict equilibria raise a completely new set of issues. It is also worth recalling that any strict equilibrium is asymptotically stable under any monotonic dynamic. I argued earlier that this led to the desirable feature of history dependence. However, even at an intuitive level, some strict equilibria are more likely than others. For example, in the game described by Figure 13, (H,H) seems more likely than (L,L). There are several ways this can be phrased. Certainly, (H,H) seems more “focal,” and if asked to play this game, I would play H, as would (I suspect) most people. Another way of phrasing this is to compare the basins of attraction of the two equilibria under monotone dynamics (they will all agree in this case, since the game is so simple),34 and observe that the basin of attraction of (H,H) is 100 times the size of that of (L,L). If we imagine that the initial condition is chosen randomly, then the H pattern of behavior is 100 times as likely to arise. I now describe a more recent perspective that makes the last idea more precise by eliminating the need to specify an initial condition.
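For the game of Figure 13, the boundary between the two basins is the mixed equilibrium, so the 100-to-1 comparison is a one-line computation (a sketch, with x denoting the population fraction playing H):

```python
# Mixed equilibrium of the Figure 13 game: H earns 100 against H and 0
# against L; L earns 1 against L and 0 against H.
payoff_H = lambda x: 100 * x
payoff_L = lambda x: 1 * (1 - x)

x_star = 1 / 101          # indifference: 100 x = 1 - x
assert abs(payoff_H(x_star) - payoff_L(x_star)) < 1e-12

basin_H = 1 - x_star      # initial conditions with x > x* converge to (H,H)
basin_L = x_star          # initial conditions with x < x* converge to (L,L)
print(basin_H / basin_L)  # roughly 100: (H,H)'s basin is 100 times larger
```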

The motivation for ESS and (asymptotic) stability was a desire for robustness to a single episode of perturbation. It might seem that if learning operates at a sufficiently higher rate than the rate at which new behavior is introduced into the population, focusing on the dynamic implications of a single perturbation is reasonable. Dean Foster and Young (1990) have argued that the notions of an ESS and attractor of the replicator dynamic do not adequately capture long-run stability when there are continual small stochastic shocks. Young and Foster (1991) describe simulations and discuss this issue in the context of Robert Axelrod’s (1984) computer tournaments.

There is a difficulty that must be confronted when explicitly modeling randomness. As I mentioned above, the standard replicator dynamic story is an idealization for large populations (specifically, a continuum). If the mutation-experimentation occurs at the individual level, there should be no aggregate impact; the resulting evolution of the system is deterministic and there are no “invasion events.” There are two ways to approach this. One is to consider aggregate shocks (this is the approach of Foster and Young 1990 and Fudenberg and Christopher Harris 1992). The other is to consider a finite population and analyze the impact of individual experimentation. Michihiro Kandori, Mailath, and Rafael Rob (1993) study the implications of randomness on the individual level; in addition to emphasizing individual decision making, the paper analyzes the simplest model that illustrates the role of perpetual randomness.

Figure 13. (H,H) seems more likely than (L,L).

                     column player
                     H          L
row player  H     100,100      0,0
            L       0,0        1,1

34 A state is in the basin of attraction of an equilibrium if, starting at that state and applying the dynamic, the system is eventually taken to the state in which all players play the equilibrium strategy.

Consider again the stag hunt game, and suppose each work-team consists of two workers. Suppose, moreover, that the firm has N workers. The relevant state variable is z, the number of workers who choose high effort. Learning implies that if high effort is a better strategy than low effort, then at least some workers currently choosing low effort will switch to high effort (i.e., z_{t+1} > z_t if z_t < N). A similar property holds if low effort is a better strategy than high. The learning or selection dynamics describe, as before, a dynamic on the set of population states, which is now the number of workers choosing high effort. Since we are dealing with a finite set of workers, this can also be thought of as a Markov process on a finite state space. The process is Markov because, by assumption, workers only learn from last period’s experience. Moreover, both the state in which all workers choose high effort and the state in which all workers choose low effort are absorbing states of this Markov process.35 This is just a restatement of the observation that both states correspond to Nash equilibria.

Perpetual randomness is incorporated by assuming that, in each period, each worker independently switches his effort choice (i.e., experiments) with probability ε, where ε > 0 is to be thought of as small. In each period, there are two phases: the learning phase and the experimentation phase. Note that after the experimentation phase (in contrast to the learning phase), with positive probability, fewer workers may be choosing a best reply.

Attention now focuses on the behavior of the Markov chain with the perpetual randomness. Because of the experimentation, every state is reached with positive probability from any other state (including the states in which all workers choose high effort and all workers choose low effort). Thus, the Markov chain is irreducible and aperiodic. It is a standard result that such a Markov chain has a unique stationary distribution. Let µ(ε) denote the stationary distribution. The goal has been to characterize the limit of µ(ε) as ε becomes small. This limit, if it exists, is called the stochastically stable distribution (Foster and Young 1990) or the limit distribution. States in the support of the limit distribution are sometimes called long-run equilibria (Kandori, Mailath, and Rob 1993). The limit distribution is informative about the behavior of the system for positive but very small ε. Thus, for small degrees of experimentation, the system will spend almost all of its time with all players choosing the same action. However, every so often (but infrequently) enough players will switch action, which will then switch play of the population to the other action, until again enough players switch.
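This construction can be made concrete in a stylized sketch. The sketch below is not the model of Kandori, Mailath, and Rob (1993) itself; it assumes N = 6 workers, stag hunt payoffs with V = 8 (high pays V if a randomly drawn partner plays high, and 0 otherwise; low always pays 3), a crude all-or-nothing best-reply learning phase, and mutation probability ε = 0.01, all my choices for illustration.

```python
# Stationary distribution of a small learning-plus-mutation Markov chain
# on z = number of high-effort workers.
import numpy as np
from math import comb

N, V, LOW, eps = 6, 8.0, 3.0, 0.01

def learn(z):
    # Everyone adopts high effort iff high beats low against a random partner.
    return N if (z / N) * V > LOW else 0

# Because the learning phase sends the whole population to one action
# (0 or N highs), the post-mutation count is a binomial flip count.
P = np.zeros((N + 1, N + 1))
for z in range(N + 1):
    b = learn(z)
    for k in range(N + 1):
        flips = abs(k - b)  # number of mutating workers; valid since b is 0 or N
        P[z, k] = comb(N, flips) * eps**flips * (1 - eps)**(N - flips)

# The chain is irreducible and aperiodic, so every row of a high power of
# P is (numerically) the unique stationary distribution.
mu = np.linalg.matrix_power(P, 10**8)[0]
print(mu[N])  # nearly all mass on the all-high (risk dominant) state
```

Escaping the all-high state requires at least four simultaneous mutations, while escaping the all-low state requires only three, so for small ε the stationary distribution concentrates on all-high.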

It is straightforward to show that the stochastically stable distribution exists; characterizing it is more difficult. Kandori, Mailath, and Rob (1993) is mostly concerned with the case of 2×2 symmetric games with two strict symmetric Nash equilibria. Any monotone dynamic divides the state space into the same two basins of attraction of the equilibria. The risk dominant equilibrium (Harsanyi and Selten 1988) is the equilibrium with the larger basin of attraction. The risk dominant equilibrium is “less risky” and may be Pareto dominated by the other equilibrium (the risk dominant equilibrium results when players choose best replies to beliefs that assign equal probability to the two possible actions of the opponent). In the stag hunt example, the risk dominant equilibrium is low effort, and it is Pareto dominated by high effort. In Figure 13, the risk dominant equilibrium is H, and it Pareto dominates L. Kandori, Mailath, and Rob (1993) show that the limit distribution puts probability one on the risk dominant equilibrium. The non-risk dominant equilibrium is upset because the number of simultaneous mutations needed to move society into the basin of attraction of the risk dominant equilibrium is smaller than the number needed to move society out of that basin; the probability of the former is thus of lower order in ε than the probability of the latter.

35 An absorbing state is a state that the process, once in, never leaves.

This style of analysis allows us to formally describe strategic uncertainty. To make this point, consider again the workers involved in team production playing a stag hunt game. For the payoffs as in Figure 1, risk dominance leads to the low effort outcome even if each work team has only two workers. Suppose, though, the payoffs are as in Figure 14, with V being the value of high effort (if reciprocated). For V > 6 and two-worker teams, the principles of risk dominance and payoff dominance agree: high effort. But now suppose each team requires n workers. While this is no longer a two-player game, it is still true that (for large populations) the size of the basin of attraction is the determining feature of the example. For example, if V = 8, high effort has a smaller basin of attraction than low effort for all n ≥ 3. As V increases, the size of the team for which high effort becomes too risky increases.36 This attractively captures the idea that cooperation in large groups can be harder to achieve than cooperation in small groups, just due to the uncertainty that everyone will cooperate. While a small possibility of noncooperation by any one agent is not destabilizing in small groups (since cooperation is a strict equilibrium), it is in large ones.
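The comparative static in team size can be made concrete. The sketch below takes as given the basin condition implied by the coin-flip logic of risk dominance (high effort has the larger basin when (1/2)^(n−1) · V > 3, each of the n − 1 teammates being equally likely to play either action); the specific V values are my choices for illustration.

```python
# Which team sizes n leave high effort with the larger basin of attraction?

def high_has_larger_basin(V, n):
    # High is the best reply against n - 1 "coin-flipping" teammates iff
    # (1/2)**(n-1) * V exceeds the safe payoff 3.
    return 0.5 ** (n - 1) * V > 3

# V = 8: true only for two-worker teams, matching the text's claim that
# high effort has the smaller basin for all n >= 3.
print([n for n in range(2, 8) if high_has_larger_basin(8, n)])     # [2]

# As V grows, the largest team size still sustaining high effort grows:
print([max(n for n in range(2, 50) if high_has_larger_basin(V, n))
       for V in (8, 16, 64)])                                      # [2, 3, 5]
```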

In contrast to both ESS and replicator dynamics, there is a unique outcome. History does not matter. This is the result of taking two limits: first, time is taken to infinity (which justifies looking at the stationary distribution), and then the probability of mutation is taken to zero (looking at small rates). The rate at which the stationary distribution is approached from an arbitrary starting point is decreasing in population size (since the driving force is the probability of a simultaneous mutation by a fraction of the population). Motivated by this observation, Glenn Ellison (1993) studied a model with local interactions that has substantially faster rates of convergence. The key idea is that, rather than playing against a randomly drawn opponent from the entire population, each player plays only against a small number of neighbors. The neighborhoods are overlapping, however, so that a change of behavior in one neighborhood can (eventually) influence the entire population.

Figure 14. A new “stag-hunt” played by workers in a team.

                       minimum of other
                       workers’ efforts
                       high       low
worker’s     high       V          0
effort       low        3          3

36 High effort has the larger basin of attraction if (1/2)^(n−1) > 3/V.


Since it is only for the case of 2×2 symmetric games that the precise modeling of the learning dynamic is irrelevant, extensions to larger games require specific assumptions about the dynamics. Kandori and Rob (1995), Nöldeke and Samuelson (1993), and Young (1993) generalize the best reply dynamic in various directions.

Kandori, Mailath, and Rob (1993), Young (1993), and Kandori and Rob (1995) study games with strict equilibria, and (as the example above illustrates) the relative magnitudes of probabilities of simultaneous mutations are important. In contrast, Samuelson (1994) studies normal form games with nonstrict equilibria, and Nöldeke and Samuelson (1993) study extensive form games. In these cases, since the equilibria are not strict, states that correspond to equilibria can be upset by a single mutation. This leads to the limit distribution having nonsingleton support. This is the reflection, in the context of stochastic dynamics, of the issues illustrated by the discussion of Figures 9, 10, and 12. In general, the support will contain “connected” components, in the sense that there is a sequence of single mutations from one state to another state that will not leave the support. Moreover, each such state will be a rest point of the selection dynamic. The results on extensive forms are particularly suggestive, since different points in a connected component of the support correspond to different specifications of off-the-equilibrium-path behavior.

The introduction of stochastic dynamics does not, by itself, provide a general theory of equilibrium selection. James Bergin and Bart Lipman (1996) show that allowing the limiting behavior of the mutation probabilities to depend on the state gives a general possibility theorem: Any strict equilibrium of any game can be selected by an appropriate choice of state-dependent mutation probabilities. In particular, in 2×2 games, if the risk dominant strategy is “harder” to learn than the other (in the sense that the limiting behavior of the mutation probabilities favors the non-risk dominant strategy), then the risk dominant equilibrium will not be selected. On the other hand, if the state dependence of the mutation probabilities only arises because the probabilities depend on the difference in payoffs from the two strategies, the risk dominant equilibrium is selected (Lawrence Blume 1994). This latter state dependence can be thought of as being strategy neutral, in that it only depends on the payoffs generated by the strategies, and not on the strategies themselves. The state dependence that Bergin and Lipman (1996) need for their general possibility result is perhaps best thought of as strategy dependence, since the selection of some strict equilibria only occurs if players find it easier to switch to (learn) certain strategies (perhaps for complexity reasons).

Binmore, Samuelson, and Richard Vaughan (1995), who study the case in which the selection process itself is the source of the randomness, also obtain different selection results. Finally, the matching process itself can also be an important source of randomness; see Young and Foster (1991) and Robson and Fernando Vega-Redondo (1996).

6. Conclusion

The result that any asymptotically stable rest point of an evolutionary dynamic is a Nash equilibrium is an important one. It shows that there are primitive foundations for equilibrium analysis. However, for asymmetric games, asymptotic stability is effectively equivalent to strict equilibria (which do not exist for many games of interest). To a large extent, this is due to the focus on individual states. If we instead consider sets of states (strategy profiles), as I discussed at the end of Section 5.1, there is hope for more positive results.37

The lack of support for the standard refinement of backward induction is in some ways a success. Backward induction has always been a problematic principle, with some examples (like the centipede game) casting doubt on its universal applicability. The reasons for the lack of support improve our understanding of when backward induction is an appropriate principle to apply.

The ability to discriminate between different strict equilibria and provide a formalization of the intuition of strategic uncertainty is also a major contribution of the area.

I suspect that the current evolutionary modeling is still too stylized to be used directly in applications. Rather, applied researchers need to be aware of what they are implicitly assuming when they do equilibrium analysis.

In many ways, there is an important parallel with the refinements literature. Originally, this literature was driven by the hope that theorists could identify the unique “right” equilibrium. If that original hope had been met, applied researchers would never have needed to worry about a multiplicity problem. Of course, that hope was not met, and we now understand that that hope, in principle, could never be met. The refinements literature still serves the useful role of providing a language to describe the properties of different equilibria. Applied researchers find the refinements literature of value for this reason, even though they cannot rely on it mechanically to eliminate “uninteresting” equilibria. The refinements literature is currently out of fashion because there were too many papers in which one example suggested a minor modification of an existing refinement, and no persuasive general refinement theory emerged.

There is a danger that evolutionary game theory could end up like refinements. It is similar in that there was a lot of early hope and enthusiasm. And, again, there have been many perturbations of models and dynamic processes, not always well motivated. As yet, the overall picture is still somewhat unclear.

However, on the positive side, important insights are still emerging from evolutionary game theory (for example, the improving understanding of when backward induction is appropriate and the formalization of strategic uncertainty). Interesting games have many equilibria, and evolutionary game theory is an important tool in understanding which equilibria are particularly relevant in different environments.

REFERENCES

Aoyagi, Masaki. 1996. “Evolution of Beliefs and the Nash Equilibrium of Normal Form Games,” J. Econ. Theory, 70, pp. 444–69.

Aumann, Robert J. 1985. “On the Non-Transferable Utility Value: A Comment on the Roth–Shafer Examples,” Econometrica, 53, pp. 667–77.

———. 1990. “Nash Equilibria are Not Self-Enforcing,” in Jean Jaskold Gabszewicz, Jean François Richard, and Laurence A. Wolsey, pp. 201–206.

Aumann, Robert J. and Adam Brandenburger. 1995. “Epistemic Conditions for Nash Equilibrium,” Econometrica, 63, pp. 1161–80.

37 Sets of strategy profiles that are asymptotically stable under plausible deterministic dynamics turn out also to have strong Elon Kohlberg and Jean-François Mertens (1986) type stability properties (Swinkels 1993), in particular, the property of robustness to deletion of never weak best replies. This latter property implies many of the refinements that have played an important role in the refinements literature and in signaling games, such as the intuitive criterion, the test of equilibrium domination, and D1 (In-Koo Cho and Kreps 1987). A similar result under different conditions was subsequently proved by Ritzberger and Weibull (1995), who also characterize the sets of profiles that can be asymptotically stable under certain conditions.


Axelrod, Robert. 1984. The Evolution of Cooperation. New York: Basic Books.

Bergin, James and Barton L. Lipman. 1996. “Evolution with State-Dependent Mutations,” Econometrica, 64, pp. 943–56.

Bernheim, B. Douglas. 1984. “Rationalizable Strategic Behavior,” Econometrica, 52, pp. 1007–28.

Bhaskar, V. 1995. “Noisy Communication and the Evolution of Cooperation,” U. St. Andrews.

Binmore, Kenneth G. and Larry Samuelson. 1996. “Evolutionary Drift and Equilibrium Selection,” mimeo, U. Wisconsin.

Binmore, Kenneth G., Larry Samuelson, and Richard Vaughan. 1995. “Musical Chairs: Modeling Noisy Evolution,” Games & Econ. Behavior, 11, pp. 1–35.

Blume, Andreas, Yong-Gwan Kim, and Joel Sobel. 1993. “Evolutionary Stability in Games of Communication,” Games & Econ. Behavior, 5, pp. 547–75.

Blume, Lawrence. 1994. “How Noise Matters,” mimeo, Cornell U.

Börgers, Tilman and Rajiv Sarin. 1997. “Learning Through Reinforcement and Replicator Dynamics,” J. Econ. Theory, 77, pp. 1–14.

Bush, Robert R. and Frederick Mosteller. 1951. “A Mathematical Model for Simple Learning,” Psych. Rev., 58, pp. 313–23.

———. 1955. Stochastic Models of Learning. New York: Wiley.

Cho, In-Koo and David Kreps. 1987. “Signaling Games and Stable Equilibria,” Quart. J. Econ., 102, pp. 179–221.

Creedy, John, Jeff Borland, and Jürgen Eichberger, eds. 1992. Recent Developments in Game Theory. Hants, England: Edward Elgar Publishing Limited.

Cressman, Ross and Karl H. Schlag. 1995. “The Dynamic (In)Stability of Backwards Induction.” Technical report, Wilfred Laurier U. and Bonn U.

Darwin, Charles. 1887. The Life and Letters of Charles Darwin, Including an Autobiographical Chapter. Francis Darwin, ed., second ed., Vol. 1, London: John Murray.

Dekel, Eddie and Faruk Gul. 1997. “Rationality and Knowledge in Game Theory,” in David M. Kreps and Kenneth F. Wallis, pp. 87–172.

Ellison, Glenn. 1993. “Learning, Local Interaction, and Coordination,” Econometrica, 61, pp. 1047–71.

Elster, Jon. 1989. “Social Norms and Economic Theory,” J. Econ. Perspectives, 3, pp. 99–117.

Foster, Dean and H. Peyton Young. 1990. “Stochastic Evolutionary Game Dynamics,” Theor. Population Bio., 38, pp. 219–32.

Friedman, Daniel. 1991. “Evolutionary Games in Economics,” Econometrica, 59, pp. 637–66.

Fudenberg, Drew and Christopher Harris. 1992. “Evolutionary Dynamics with Aggregate Shocks,” J. Econ. Theory, 57, pp. 420–41.

Fudenberg, Drew and David Kreps. 1989. “A Theory of Learning, Experimentation, and Equilibrium,” mimeo, MIT and Stanford.

Fudenberg, Drew and David Levine. 1993. “Steady State Learning and Nash Equilibrium,” Econometrica, 61, pp. 547–73.

Gabszewicz, Jean Jaskold, Jean François Richard, and Laurence A. Wolsey, eds. 1990. Economic Decision-Making: Games, Econometrics, and Optimisation. Contributions in Honour of Jacques H. Drèze. New York: North-Holland.

Gale, John, Kenneth G. Binmore, and Larry Samuelson. 1995. “Learning to be Imperfect: The Ultimatum Game,” Games & Econ. Behavior, 8, pp. 56–90.

Gilboa, Itzhak and Akihiko Matsui. 1991. “Social Stability and Equilibrium,” Econometrica, 59, pp. 859–67.

Gul, Faruk. 1996. “Rationality and Coherent Theories of Strategic Behavior,” J. Econ. Theory, 70, pp. 1–31.

Harsanyi, John C. 1977. Rational Behavior and Bargaining Equilibrium in Games and Social Situations. Cambridge, UK: Cambridge U. Press.

Harsanyi, John C. and Reinhard Selten. 1988. A General Theory of Equilibrium Selection in Games. Cambridge: MIT Press.

Hines, W. G. S. 1980. “Three Characterizations of Population Strategy Stability,” J. Appl. Probability, 17, pp. 333–40.

Hofbauer, Josef and Karl Sigmund. 1988. The Theory of Evolution and Dynamical Systems. Cambridge: Cambridge U. Press.

Kalai, Ehud and Ehud Lehrer. 1993. “Rational Learning Leads to Nash Equilibrium,” Econometrica, 61, pp. 1019–45.

Kandori, Michihiro. 1997. “Evolutionary Game Theory in Economics,” in David M. Kreps and Kenneth F. Wallis, pp. 243–77.

Kandori, Michihiro and Rafael Rob. 1995. “Evolution of Equilibria in the Long Run: A General Theory and Applications,” J. Econ. Theory, 65, pp. 383–414.

Kandori, Michihiro, George J. Mailath, and Rafael Rob. 1993. “Learning, Mutation, and Long Run Equilibria in Games,” Econometrica, 61, pp. 29–56.

Kim, Yong-Gwan and Joel Sobel. 1995. “An Evolutionary Approach to Pre-Play Communication,” Econometrica, 63, pp. 1181–93.

Kohlberg, Elon and Jean-Francois Mertens. 1986. “On the Strategic Stability of Equilibria,” Econometrica, 54, pp. 1003–37.

Kreps, David M. 1990a. Game Theory and Economic Modelling. Oxford: Clarendon Press.

———. 1990b. A Course in Microeconomic Theory. Princeton, NJ: Princeton U. Press.

Kreps, David M. and Kenneth F. Wallis, eds. 1997. Advances in Economics and Econometrics: Theory and Applications–Seventh World Congress of the Econometric Society, Vol. 1. Cambridge: Cambridge U. Press.

Mailath, George J. 1992. “Introduction: Symposium on Evolutionary Game Theory,” J. Econ. Theory, 57, pp. 259–77.


Matsui, Akihiko. 1991. “Cheap-Talk and Cooperation in Society,” J. Econ. Theory, 54, pp. 245–58.

Maynard Smith, John. 1982. Evolution and the Theory of Games. Cambridge: Cambridge U. Press.

Maynard Smith, John and G. R. Price. 1973. “The Logic of Animal Conflict,” Nature, 246, pp. 15–18.

Myerson, Roger B. 1991. Game Theory: Analysis of Conflict. Cambridge, MA: Harvard U. Press.

Nitecki, Z. and C. Robinson, eds. 1980. Global Theory of Dynamical Systems. Vol. 819 of Lecture Notes in Mathematics, Berlin: Springer-Verlag.

Nöldeke, Georg and Larry Samuelson. 1993. “An Evolutionary Analysis of Backward and Forward Induction,” Games & Econ. Behavior, 5, pp. 425–54.

Pearce, David. 1984. “Rationalizable Strategic Behavior and the Problem of Perfection,” Econometrica, 52, pp. 1029–50.

Ponti, Giovanni. 1996. “Cycles of Learning in the Centipede Game,” Technical Report, University College, London.

Ritzberger, Klaus and Jörgen W. Weibull. 1995. “Evolutionary Selection in Normal-Form Games,” Econometrica, 63, pp. 1371–99.

Robson, Arthur J. 1992. “Evolutionary Game Theory,” in John Creedy, Jeff Borland, and Jürgen Eichberger, pp. 165–78.

Robson, Arthur J. and Fernando Vega-Redondo. 1996. “Efficient Equilibrium Selection in Evolutionary Games with Random Matching,” J. Econ. Theory, 70, pp. 65–92.

Rousseau, Jean-Jacques. 1950. “A Discourse on the Origin of Inequality,” in The Social Contract and Discourses. New York: Dutton. Translated by G. D. H. Cole.

Samuelson, Larry. 1993. “Recent Advances in Evolutionary Economics: Comments,” Econ. Letters, 42, pp. 313–19.

———. 1994. “Stochastic Stability in Games with Alternative Best Replies,” J. Econ. Theory, 64, pp. 35–65.

Samuelson, Larry and Jianbo Zhang. 1992. “Evolutionary Stability in Asymmetric Games,” J. Econ. Theory, 57, pp. 363–91.

Schelling, Thomas. 1960. The Strategy of Conflict. Cambridge, MA: Harvard U. Press.

Schlag, Karl H. 1998. “Why Imitate, and If So, How?” J. Econ. Theory, 78, pp. 130–56.

Selten, Reinhard. 1980. “A Note on Evolutionary Stable Strategies in Asymmetric Animal Conflicts,” J. Theor. Bio., 84, pp. 93–101.

Skyrms, Brian. 1996. Evolution of the Social Contract. Cambridge: Cambridge U. Press.

Sobel, Joel. 1993. “Evolutionary Stability and Efficiency,” Econ. Letters, 42, pp. 301–12.

Sonsino, Doron. 1997. “Learning to Learn, Pattern Recognition, and Nash Equilibrium,” Games & Econ. Behavior, 18, pp. 286–331.

Sugden, Robert. 1989. “Spontaneous Order,” J. Econ. Perspectives, 3, pp. 85–97.

Swinkels, Jeroen M. 1992. “Evolutionary Stability with Equilibrium Entrants,” J. Econ. Theory, 57, pp. 306–32.

———. 1993. “Adjustment Dynamics and Rational Play in Games,” Games & Econ. Behavior, 5, pp. 455–84.

Taylor, Peter D. and Leo B. Jonker. 1978. “Evolutionary Stable Strategies and Game Dynamics,” Math. Biosciences, 40, pp. 145–56.

van Damme, Eric. 1987. Stability and Perfection of Nash Equilibria. Berlin: Springer-Verlag.

Wärneryd, Karl. 1993. “Cheap Talk, Coordination, and Evolutionary Stability,” Games & Econ. Behavior, 5, pp. 532–46.

Weibull, Jörgen W. 1995. Evolutionary Game Theory. Cambridge: MIT Press.

Young, H. Peyton. 1993. “The Evolution of Conventions,” Econometrica, 61, pp. 57–84.

———. 1996. “The Economics of Conventions,” J. Econ. Perspectives, 10, pp. 105–22.

Young, H. Peyton and Dean Foster. 1991. “Cooperation in the Short and in the Long Run,” Games & Econ. Behavior, 3, pp. 145–56.

Zeeman, E. 1980. “Population Dynamics from Game Theory,” in Z. Nitecki and C. Robinson, pp. 471–97.

———. 1981. “Dynamics of the Evolution of Animal Conflicts,” J. Theor. Bio., 89, pp. 249–70.

Zermelo, Ernst. 1912. “Über eine Anwendung der Mengenlehre auf die Theorie des Schachspiels,” Proceedings of the Fifth International Congress of Mathematicians, II, pp. 501–504.
