

    Lecture Notes on Game Theory∗

    William H. Sandholm†

    October 21, 2014

    Contents

0 Basic Decision Theory   3
   0.1 Ordinal Utility   3
   0.2 Expected Utility and the von Neumann-Morgenstern Theorem   4
   0.3 Bayesian Rationality   7

1 Normal Form Games   8
   1.1 Basic Concepts   9
      1.1.1 Definition   9
      1.1.2 Randomized strategies and beliefs   9
   1.2 Dominance and Iterated Dominance   15
      1.2.1 Strictly dominant strategies   16
      1.2.2 Strictly dominated strategies   17
      1.2.3 Iterated strict dominance   18
      1.2.4 Weak dominance   20
   1.3 Rationalizability   21
      1.3.1 Definition and examples   21
      1.3.2 The separating hyperplane theorem   26
      1.3.3 A positive characterization   29
   1.4 Nash Equilibrium   31
      1.4.1 Definition   31
      1.4.2 Computing Nash equilibria   33
      1.4.3 Interpretations of Nash equilibrium   38
      1.4.4 Existence of Nash equilibrium and structure of the equilibrium set   41
   1.5 Correlated Equilibrium   42

∗Many thanks to Katsuhiko Aiba, Emin Dokumacı, Danqing Hu, Rui Li, Allen Long, Ignacio Monzón, Michael Rapp, and Ryoji Sawa for creating the initial draft of this document from my handwritten notes and various other primitive sources.

†Department of Economics, University of Wisconsin, 1180 Observatory Drive, Madison, WI 53706, USA. e-mail: [email protected]; website: http://www.ssc.wisc.edu/~whs.


      1.5.1 Definition and examples   42
      1.5.2 Interpretation   46
   1.6 The Minmax Theorem   47

2 Extensive Form Games   56
   2.1 Basic Concepts   56
      2.1.1 Defining extensive form games   56
      2.1.2 Pure strategies in extensive form games   61
      2.1.3 Randomized strategies in extensive form games   62
      2.1.4 Reduced normal form games   66
   2.2 The Principle of Sequential Rationality   67
   2.3 Games of Perfect Information and Backward Induction   68
      2.3.1 Subgame perfect equilibrium, sequential rationality, and backward induction   68
      2.3.2 Epistemic foundations for backward induction   75
      2.3.3 Applications   78
      2.3.4 Subgame perfect equilibrium in more general classes of games   80
   Interlude: Asymmetric Information, Economics, and Game Theory   80
   2.4 Games of Imperfect Information and Sequential Equilibrium   82
      2.4.1 Subgames and subgame perfection in games of imperfect information   82
      2.4.2 Beliefs and sequential rationality   83
      2.4.3 Definition of sequential equilibrium   86
      2.4.4 Computing sequential equilibria   89
      2.4.5 Existence of sequential equilibrium and structure of the equilibrium set   98
   2.5 Invariance and Proper Equilibrium   98
   2.6 Forward Induction   107
      2.6.1 Motivation and discussion   107
      2.6.2 Forward induction in signaling games   109
   2.7 Full Invariance and Kohlberg-Mertens Stability   116
      2.7.1 Fully reduced normal forms and full invariance   116
      2.7.2 KM stability and set-valued solution concepts   118

3 Bayesian Games   121
   3.1 Definition   121
   3.2 Interpretation   125
   3.3 Examples   127

4 Repeated Games   131
   4.1 The Repeated Prisoner’s Dilemma   131
   4.2 Basic Concepts   136
   4.3 The Folk Theorem   138
   4.4 Computing the Set of Subgame Perfect Equilibrium Payoffs   146
      4.4.1 Dynamic programming   146


      4.4.2 Dynamic programs vs. repeated games   147
      4.4.3 Factorization and self-generation   148
   4.5 Simple and Optimal Penal Codes   157

    0. Basic Decision Theory

    0.1 Ordinal Utility

    We consider a decision maker (or agent) who chooses among alternatives (or outcomes) in

    some set Z. To begin we assume that Z is finite.

The primitive description of preferences is in terms of a preference relation ⪰. For any ordered pair of alternatives (x, y) ∈ Z × Z, the agent can tell us whether or not he weakly prefers x to y. If yes, we write x ⪰ y. If no, we write x ⋡ y. We can use these to define

strict preference: a ≻ b means [a ⪰ b and b ⋡ a].

indifference: a ∼ b means [a ⪰ b and b ⪰ a].

We say that the preference relation ⪰ is a weak order if it satisfies the two weak order axioms:

Completeness: For all a, b ∈ Z, either a ⪰ b or b ⪰ a (or both).

Transitivity: For all a, b, c ∈ Z, if a ⪰ b and b ⪰ c, then a ⪰ c.

Completeness says that there are no alternatives that the agent is unwilling or unable to compare. (Consider Z = {do nothing, save five lives by murdering a person chosen at random}.)

    Transitivity rules out preference cycles. (Consider Z = {a scoop of ice-cream, an enormous

    hunk of chocolate cake, a small plain salad}.)

The function u : Z → R is an ordinal utility function that represents ⪰ if

u(a) ≥ u(b) if and only if a ⪰ b.

Theorem 0.1. Let Z be finite and let ⪰ be a preference relation. Then there is an ordinal utility function u : Z → R that represents ⪰ if and only if ⪰ is complete and transitive.

Moreover, the function u is unique up to increasing transformations: v : Z → R also represents ⪰ if and only if v = f ◦ u for some increasing function f : R → R.


    In the first part of the theorem, the “only if” direction follows immediately from the

    fact that the real numbers are ordered. For the “if” direction, assign the elements of  Z

    utility values sequentially; the weak order axioms ensure that this can be done without

    contradiction.
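This construction can be sketched in code. The counting rule below (assign each alternative the number of alternatives it weakly beats) is our own illustration of one valid construction, not the notes' own procedure; the snack preferences are hypothetical.

```python
# Sketch of the "if" direction of Theorem 0.1: given a complete, transitive
# preference on a finite Z, build a representing utility function by counting
# how many alternatives each element weakly beats.

def ordinal_utility(Z, weakly_prefers):
    """Return u : Z -> R representing the preference.

    weakly_prefers(a, b) should return True iff a is weakly preferred to b.
    Completeness and transitivity guarantee u(a) >= u(b) iff a ⪰ b.
    """
    Z = list(Z)
    return {a: sum(1 for b in Z if weakly_prefers(a, b)) for a in Z}

# Hypothetical example: preferences over snacks induced by an underlying ranking.
rank = {"salad": 0, "ice cream": 1, "cake": 2}
u = ordinal_utility(rank, lambda a, b: rank[a] >= rank[b])
assert u["cake"] > u["ice cream"] > u["salad"]
```

Note that indifferent alternatives weakly beat exactly the same set of alternatives (by transitivity), so they receive equal utility, as required.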

“Ordinal” refers to the fact that only the order of the values of the utility function has meaning. Neither the values nor differences between them convey information about intensity of preferences. This is captured by the second part of the theorem, which says that utility functions are only unique up to increasing transformations.

    If  Z  is (uncountably) infinite, weak order is not enough to ensure that there is an ordinal

    utility representation:

Example 0.2. Lexicographic preferences. Let Z = R2, and suppose that a ⪰ b ⇔ a1 > b1 or [a1 = b1 and a2 ≥ b2]. In other words, the agent’s first priority is the first component of the prize; he only uses the second component to break ties. While ⪰ satisfies the weak order axioms, it can be shown that there is no ordinal utility function that represents ⪰. In essence, there are too many levels of preference to fit them all into the real line.   ♦

    There are various additional assumptions that rule out such examples. One is

Continuity: Z ⊆ Rn, and for every a ∈ Z, the sets {b : b ⪰ a} and {b : a ⪰ b} are closed.

    Notice that Example 0.2 violates this axiom.

Theorem 0.3. Let Z ⊆ Rn and let ⪰ be a preference relation. Then there is a continuous ordinal utility function u : Z → R that represents ⪰ if and only if ⪰ is complete, transitive, and continuous.

    In the next section we consider preferences over lotteries—probability distributions over

    a finite set of prizes. Theorem 0.3 ensures that if preferences satisfy the weak order and

    continuity axioms, then they can be represented by a continuous ordinal utility function.

    By introducing an additional axiom, one can obtain a more discriminating representation.

    0.2 Expected Utility and the von Neumann-Morgenstern Theorem

Now we consider preferences in settings with uncertainty: an agent chooses among “lotteries” in which different alternatives in Z have different probabilities of being realized.

Example 0.4. Suppose you are offered a choice between


lottery 1: $1M for sure

lottery 2: $2M with probability 1/2, $0 with probability 1/2.

One tempting possibility is to look at expected values: the weighted averages of the possible values, with weights given by probabilities.

lottery 1: $1M × 1 = $1M

lottery 2: $2M × 1/2 + $0M × 1/2 = $1M

    But most people strictly prefer lottery 1.

    The lesson: if outcomes are in dollars, ranking outcomes in terms of expected numbers of 

    dollars may not capture preferences.   ♦
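The expected value comparison above can be checked mechanically. A minimal sketch, with dollar amounts in millions:

```python
# Expected value of a lottery: a probability-weighted average of outcomes.
# Lotteries are dicts mapping dollar outcomes (in $M) to probabilities.
def expected_value(lottery):
    return sum(prob * amount for amount, prob in lottery.items())

lottery1 = {1.0: 1.0}            # $1M for sure
lottery2 = {2.0: 0.5, 0.0: 0.5}  # $2M or $0, each with probability 1/2

# Both lotteries have the same expected value, yet most people strictly
# prefer lottery 1: expected dollars need not capture preferences.
assert expected_value(lottery1) == expected_value(lottery2) == 1.0
```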

If Z is a finite set, then we let ∆Z represent the set of probability distributions over Z: ∆Z = {p : Z → R+ | Σa∈Z p(a) = 1}.

The objects a player must evaluate, p ∈ ∆Z, are distributions over an outcome set Z.

We can imagine he has preferences ⪰, where “p ⪰ q” means that he likes p at least as much as q. When can these preferences be represented using numerical assessments of each p? When can these assessments take the form of expected utilities?

Let Z be a finite set of alternatives, so that ∆Z is the set of lotteries over alternatives.

Example 0.5. Z = {$0, $10, $100}, ∆Z = {p = (p($0), p($10), p($100))}.

[Figure: three lotteries drawn as probability trees over the prizes $0, $10, and $100.]

p = (.2, .8, 0)   q = (.9, 0, .1)   r = (0, 0, 1)   ♦

Let ⪰ be a preference relation on ∆Z, where p ⪰ q means that lottery p ∈ ∆Z is weakly preferred to lottery q ∈ ∆Z.

If p and q are lotteries and α ∈ [0, 1] is a scalar, then the compound lottery c = αp + (1 − α)q is the lottery defined by c(z) = αp(z) + (1 − α)q(z) for all z ∈ Z.


Example 0.6. c = .7p + .3q = .7(.2, .8, 0) + .3(.9, 0, .1) = (.41, .56, .03)

[Figure: the two-stage lottery that plays p with probability .7 and q with probability .3, and the equivalent one-stage lottery (.41, .56, .03) over $0, $10, $100.]
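The compound lottery of Example 0.6 is just a componentwise convex combination, which a short sketch can verify:

```python
# Compound lottery c = αp + (1−α)q, with lotteries represented as tuples of
# probabilities over Z = {$0, $10, $100}.
def compound(alpha, p, q):
    return tuple(alpha * pz + (1 - alpha) * qz for pz, qz in zip(p, q))

p = (.2, .8, 0)
q = (.9, 0, .1)
c = compound(.7, p, q)
# Mathematically, c = (.7(.2) + .3(.9), .7(.8) + .3(0), .7(0) + .3(.1))
#                   = (.41, .56, .03), as in Example 0.6.
```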

Preference axioms:

(NM1) Weak order: ⪰ is complete and transitive.

(NM2) Continuity: For all p, q, and r such that p ≻ q ≻ r, there exist δ, ε ∈ (0, 1) such that

(1 − δ)p + δr ≻ q ≻ (1 − ε)r + εp.

    Example 0.7. p  = win Nobel prize,  q  = nothing, r  = get hit by a bus. (Since  p, q  and  r  are

    supposed to be lotteries, we should really write  p  =  win Nobel prize with probability 1,

    etc.)   ♦

(NM3) Independence: For all p, q, and r and all α ∈ (0, 1),

p ⪰ q ⇔ αp + (1 − α)r ⪰ αq + (1 − α)r.

Example 0.8. p = (.2, .8, 0),  q = (.9, 0, .1),  r = (0, 0, 1)

[Figure: the mixtures of p and q with r, drawn as two-stage lotteries over $0, $10, $100.]

p̂ = .5p + .5r = (.1, .4, .5)   q̂ = .5q + .5r = (.45, 0, .55)   ♦

We say that u : Z → R provides an expected utility representation for the preference relation ⪰ on ∆Z if

(1)   p ⪰ q ⇔ Σz∈Z u(z) p(z) ≥ Σz∈Z u(z) q(z).

The function u is then called a von Neumann-Morgenstern (or NM) utility function.


Theorem 0.9 (von Neumann and Morgenstern (1944)). Let Z be a finite set, and let ⪰ be a preference relation on ∆Z. Then there is an NM utility function u : Z → R that provides an expected utility representation for ⪰ if and only if ⪰ satisfies (NM1), (NM2), and (NM3).

Moreover, the function u is unique up to positive affine transformations. That is, v also satisfies (1) if and only if v ≡ au + b for some a > 0 and b ∈ R.

If outcomes are money, u(x) need not equal x. If it does, we say that utility is linear in money, or that the player is risk neutral.

Discussion of Theorem 0.9

(i) The theorem tells us that as long as (NM1)–(NM3) hold, there is some way of assigning numbers to the alternatives such that taking expected values of these numbers is the right way to evaluate lotteries over alternatives.

(ii) The values of an NM utility function are sometimes called cardinal utilities (as opposed to ordinal utilities). What more-than-ordinal information do cardinal utilities provide? The nature of this information can be deduced from the fact that a NM utility function is unique up to positive affine transformations.

Example 0.10. Let a, b, c ∈ Z, and suppose that ua > uc > ub. Let λ = (uc − ub)/(ua − ub). This quantity is not affected by positive affine transformations. Indeed, if v = αu + β, then

(vc − vb)/(va − vb) = ((αuc + β) − (αub + β)) / ((αua + β) − (αub + β)) = α(uc − ub) / α(ua − ub) = λ.

    To interpret λ, rearrange its definition to obtain

    uc  = λua + (1 − λ)ub.

    This says that  λ  is the probability on  a  in a lottery over  a  and b that makes this lottery

    exactly as good as getting c for sure.   ♦
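A quick numerical check of this invariance, with made-up utility values satisfying ua > uc > ub:

```python
# Verify that λ = (u_c − u_b)/(u_a − u_b) survives a positive affine
# transformation v = αu + β (the utility values are hypothetical).
u = {"a": 10.0, "b": 2.0, "c": 6.0}
lam = (u["c"] - u["b"]) / (u["a"] - u["b"])

alpha, beta = 3.0, -7.0
v = {z: alpha * uz + beta for z, uz in u.items()}
lam_v = (v["c"] - v["b"]) / (v["a"] - v["b"])

assert abs(lam - lam_v) < 1e-12
# And u_c = λ u_a + (1 − λ) u_b: c is exactly as good as the a-vs-b lottery
# that puts probability λ on a.
assert abs(u["c"] - (lam * u["a"] + (1 - lam) * u["b"])) < 1e-12
```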

    0.3 Bayesian Rationality

    In settings with uncertainty, where all relevant probabilities are objective and known, we

    call an agent NM rational if he acts as if he is maximizing a NM expected utility function.


    What if the probabilities are not given? We call an agent  Bayesian rational (or say that he

    has subjective expected utility preferences) if 

    (i) In settings with uncertainty, he forms beliefs describing the probabilities of all

    relevant events.

    (ii) When making decisions, he acts to maximize his expected utility given his beliefs.

    (iii) After receiving new information, he updates his beliefs by taking conditional prob-

    abilities whenever possible.
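Item (iii) amounts to updating by Bayes’ rule. A minimal sketch; the states and probabilities below are a hypothetical illustration, not from the notes:

```python
# Updating beliefs by conditional probability (Bayes' rule).
def bayes_update(prior, likelihood):
    """prior: {state: P(state)}; likelihood: {state: P(evidence | state)}."""
    joint = {s: prior[s] * likelihood[s] for s in prior}
    total = sum(joint.values())  # P(evidence); must be positive to condition
    return {s: p / total for s, p in joint.items()}

# Hypothetical: beliefs about an opponent, updated after seeing a surprising move.
prior = {"opponent rational": 0.9, "opponent irrational": 0.1}
likelihood = {"opponent rational": 0.05, "opponent irrational": 0.8}
posterior = bayes_update(prior, likelihood)
# The surprising move shifts weight toward the irrational state.
```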

    In game theory, it is standard to begin analyses with the assumption that players are

    Bayesian rational.

Foundations for subjective expected utility preferences are obtained from state-space models of uncertainty. These models begin with a set of possible states whose probabilities are not given, and consider preferences over maps from states to outcomes. Savage (1954) provides an axiomatization of subjective expected utility preferences in this framework. Both the utility function and the assignment of probabilities to states are determined as part of the representation. Anscombe and Aumann (1963) consider a state-space model in which preferences are not over state-contingent alternatives, but over maps from states to lotteries à la von Neumann-Morgenstern. This formulation allows for a much simpler derivation of subjective expected utility preferences, and fits very naturally into game-theoretic models. See Gilboa (2009) for a textbook treatment of these and more general models of decision under uncertainty.

    1. Normal Form Games

    Game theory models situations in which multiple players make strategically interdepen-

    dent decisions.  Strategic interdependence means that your outcomes depend both on what

    you do and on what others do.

This course focuses on noncooperative game theory, which works from the hypothesis that agents act independently, each in his own self-interest. Cooperative game theory studies situations in which subsets of the agents can make binding agreements.

    We study some basic varieties of games and the connections among them:

    1.  Normal form games: moves are simultaneous

    2.  Extensive form games: moves take place over time

    3.  Bayesian games: players receive private information before play begins

    4.  Repeated games: a normal form game is played repeatedly, with all previous moves

     being observed before each round of play


    1.1 Basic Concepts

    Example 1.1. Prisoner’s Dilemma.

Story 1: Two bankers are each asked to report on excessive risk taking by the other. If neither reports such activity, both get a $2M bonus. If only one reports such activity, he gets a $3M bonus and the other gets nothing. If both report such activity, then each gets a $1M bonus.

    Story 2: There are two players and a pile of money. Each player can either let the opponent

    take $2, or take $1 for himself.

    2

    c d

    1  C   2, 2 0, 3

    D   3, 0 1, 1

    (i) Players: P   = {1, 2}

    (ii) Pure strategy sets S1  = {C, D}, S2  = {c, d}

    Set of pure strategy profiles:  S =  S1 × S2. For example: (C, d) ∈  S

    (iii) Utility functions ui  :  S  → R. For example:  u1(C, d) =  0, u2(C, d) =  3.   ♦

    1.1.1 Definition

A normal form game G = {P, {Si}i∈P, {ui}i∈P} consists of:

(i) a finite set of players P = {1, . . . , n},

(ii) a finite set of pure strategies Si for each player,

(iii) a von Neumann-Morgenstern (NM) utility function ui : S → R for each player, where S = ∏i∈P Si is the set of pure strategy profiles (lists of strategies, one for each player).

    If each player chooses some  si   ∈   Si, the strategy profile is  s   =   (s1, . . . , sn) and player   j’s

    payoff is u j(s).

    1.1.2 Randomized strategies and beliefs

    In our description of a game above, players each choose a particular pure strategy si  ∈ Si.

    But it is often worth considering the possibility that each player makes a randomized

    choice.


     Mixed strategies and mixed strategy profiles

If A is a finite set, then we let ∆A represent the set of probability distributions over A: that is, ∆A = {p : A → R+ | Σa∈A p(a) = 1}.

Then σi ∈ ∆Si is a mixed strategy for i, while σ = (σ1, . . . , σn) ∈ ∏i∈P ∆Si is a mixed strategy profile.

    Under a mixed strategy profile, players are assumed to randomize  independently: for

    instance, learning that 1 played C provides no information about what 2 did.

    In other words, the distribution on the set  S  of pure strategy profiles created by  σ   is a

     product distribution.

Example 1.2. Battle of the Sexes.

2
a b
1  A   3, 1 0, 0
B   0, 0 1, 3

Suppose that 1 plays A with probability 3/4, and 2 plays a with probability 1/4. Then

σ = (σ1, σ2) = ((σ1(A), σ1(B)), (σ2(a), σ2(b))) = ((3/4, 1/4), (1/4, 3/4)).

The pure strategy profile (A, a) is played with probability σ1(A) · σ2(a) = 3/4 · 1/4 = 3/16. The complete product distribution is presented in the matrix below.

2
a (1/4)   b (3/4)
1  A (3/4)   3/16   9/16
B (1/4)   1/16   3/16   ♦
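The product distribution in Example 1.2 can be computed as an outer product of the two mixed strategies:

```python
# Product distribution over pure strategy profiles induced by independent
# randomization, for the mixed strategy profile σ of Example 1.2.
sigma1 = {"A": 3/4, "B": 1/4}
sigma2 = {"a": 1/4, "b": 3/4}

joint = {(s1, s2): sigma1[s1] * sigma2[s2] for s1 in sigma1 for s2 in sigma2}

assert joint[("A", "a")] == 3/16
assert abs(sum(joint.values()) - 1) < 1e-12  # a valid distribution on S
```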

When player i has two strategies, his set of mixed strategies ∆Si is the simplex in R2, which is an interval.

[Figure: the simplex in R2, drawn as a segment with endpoints A and B.]

When player i has three strategies, his set of mixed strategies ∆Si is the simplex in R3, which is a triangle.

[Figure: the simplex in R3, drawn as a triangle with vertices A, B, and C.]

When player i has four strategies, his set of mixed strategies ∆Si is the simplex in R4, which is a pyramid.

[Figure: the simplex in R4, drawn (use imagination here) as a pyramid with vertices A, B, C, and D.]

    Correlated strategies

    In some circumstances we need to consider the possibility that players all have access to

    the same randomizing device, and so are able to correlate their behavior. This is not as

    strange as it may seem, since any uncertain event that is commonly observed can serve to

    correlate behavior.

    Example 1.3. Battle of the Sexes revisited.

    Suppose that the players observe a toss of a fair coin. If the outcome is Heads, they play

    ( A, a); if it is Tails, they play (B, b).

A formal description of their behavior specifies the probability of each pure strategy profile: ρ = (ρ(A, a), ρ(A, b), ρ(B, a), ρ(B, b)) = (1/2, 0, 0, 1/2).

2
a b
1  A   1/2   0
B   0   1/2


This behavior cannot be achieved using a mixed strategy profile, since it requires correlation: any mixed strategy profile putting weight on (A, a) and (B, b) would also put weight on (A, b) and (B, a):

2
a (y > 0)   b ((1 − y) > 0)
1  A (x > 0)   xy   x(1 − y)
B ((1 − x) > 0)   (1 − x)y   (1 − x)(1 − y)

all marginal probabilities > 0  ⇒  all joint probabilities > 0   ♦
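The argument above can be turned into a mechanical independence test: a joint distribution is a product distribution exactly when every joint probability equals the product of its marginals. A sketch (the function name is ours):

```python
# Test whether a joint distribution rho on S1 × S2 is a product distribution.
def is_product(rho, S1, S2, tol=1e-9):
    marg1 = {s1: sum(rho[(s1, s2)] for s2 in S2) for s1 in S1}
    marg2 = {s2: sum(rho[(s1, s2)] for s1 in S1) for s2 in S2}
    return all(abs(rho[(s1, s2)] - marg1[s1] * marg2[s2]) <= tol
               for s1 in S1 for s2 in S2)

# The coin-flip correlated strategy of Example 1.3 is not a product
# distribution, while uniform independent mixing is.
coin_flip = {("A", "a"): 1/2, ("A", "b"): 0, ("B", "a"): 0, ("B", "b"): 1/2}
uniform = {(s1, s2): 1/4 for s1 in "AB" for s2 in "ab"}

assert not is_product(coin_flip, ["A", "B"], ["a", "b"])
assert is_product(uniform, ["A", "B"], ["a", "b"])
```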

We call ρ ∈ ∆(∏i∈P Si) = ∆S a correlated strategy. It is an arbitrary joint distribution on ∏i∈P Si.

Example 1.4. Suppose that P = {1, 2, 3} and Si = {1, . . . , ki}. Then a mixed strategy profile σ = (σ1, σ2, σ3) ∈ ∏i∈P ∆Si consists of three probability vectors of lengths k1, k2, and k3, while a correlated strategy ρ ∈ ∆(∏i∈P Si) is a single probability vector of length k1 · k2 · k3.

Because players randomize independently in mixed strategy profiles, mixed strategy profiles generate the product distributions on ∏i∈P Si. Thus:

mixed strategy profiles = ∏i∈P ∆Si  “⊂”  ∆(∏i∈P Si) = correlated strategies.

We write “⊂” because the items on each side are not the same kinds of mathematical objects (i.e., they live in different spaces).

Example 1.5. If S1 = {A, B} and S2 = {a, b}, then the set of mixed strategies ∆{A, B} × ∆{a, b} is the product of two intervals, and hence a square. The set of correlated strategies ∆({A, B} × {a, b}) is a pyramid.

[Figure: the square of mixed strategy profiles with sides labeled by A, B and a, b, and the pyramid of correlated strategies with vertices Aa, Ab, Ba, and Bb.]


    The set of correlated strategies that correspond to mixed strategy profiles—in other words,

    the product distributions on { A, B} × {a, b}—form a surface in the pyramid.

    Beliefs

    One can divide traditional game-theoretic analyses into two classes: equilibrium and

    non-equilibrium. In equilibrium analyses (e.g., using Nash equilibrium), one assumes

    that players correctly anticipate how opponents will act. In this case, Bayesian rational

    players will maximize their expected utility with respect to correct predictions about how

    opponents will act. In nonequilibrium analyses (e.g., dominance arguments) this is not

    assumed. Instead, Bayesian rationality requires players to form beliefs about how their

opponents will act, and to maximize their expected payoffs given their beliefs. In some

    cases knowledge of opponents’ rationality leads to restrictions on plausible beliefs, and

    hence on our predictions of play.

Let us consider beliefs in a two-player game. Suppose for now that player i expects

    his opponent to play a pure strategy, but that  i may not be certain of which strategy  j will

    play. Then player i  should form beliefs µi   ∈  ∆S j  about what his opponent will do. Note

    that in this case, player  i’s beliefs about player   j  are the same sort of object as a mixed

    strategy of player  j.

    Remarks:

(i) If player i thinks that player j might randomize, then i’s beliefs µi would need to be a probability measure on ∆Sj (so that, loosely speaking, µi ∈ ∆(∆Sj)). Such beliefs can be reduced to a probability measure on Sj by taking expectations. Specifically, let µ̄i = Eµi σj be the mean of a random variable that takes values in ∆Sj and whose distribution is µi. Then µ̄i ∈ ∆Sj, and µ̄i(sj) represents the probability that i assigns to the realization of j’s mixed strategy being the pure strategy sj. In the end, these probabilities are all that matter for player i’s expected utility calculations. Thus in nonequilibrium analyses, there is no loss in restricting attention to beliefs that only put weight on opponents’ pure strategies. We do just this in Sections 1.2 and 1.3.


On the other hand, if player j plays mixed strategy σj, then player i’s beliefs are only correct if µi(σj) = 1. But when we consider solution concepts that require correct beliefs (especially Nash equilibrium—see Section 1.4), there will be no need to refer to beliefs explicitly in the formal definitions, since the definitions will implicitly assume that beliefs are correct.

(ii) When player i’s beliefs µi assign probability 1 to player j choosing a pure strategy but put weight on multiple pure strategies, these beliefs are formally identical to a mixed strategy σj of player j. Therefore, the optimization problem player i faces when he is uncertain and holds beliefs µi is equivalent to the optimization problem he faces when he knows player j will play the mixed strategy σj (see below). The implications of this point will be explored in Section 1.3.

Now we consider beliefs in games with many players. Suppose again that each player expects his opponents to play pure strategies, although he is not sure which pure strategies they will choose. In this case, player i’s beliefs µi are an element of ∆(∏j≠i Sj), and so are equivalent to a correlated strategy among player i’s opponents. Remark (i) above applies here as well: in nonequilibrium analyses, defining beliefs as just described is without loss of generality.

(It may be preferable in some applications to restrict a player’s beliefs about different opponents’ choices to be independent, in which case beliefs are described by elements of ∏j≠i ∆Sj, the set of opponents’ mixed strategy profiles. We do not do so here, but we discuss this point further in Section 1.3.)

    In all cases, we assume that if a player chooses a mixed strategy, learning which of his

    pure strategies is realized does not alter his beliefs about his opponents.

    Expected utility

    To compute a numerical assessment of a correlated strategy or mixed strategy profile, a

    player takes the weighted average of the utility of each pure strategy profile, with the

    weights given by the probabilities that each pure strategy profile occurs. This is called the

    expected utility associated with σ. See Section 0.2.

Example 1.6. Battle of the Sexes once more.

payoffs:
2
a b
1  A   3, 1 0, 0
B   0, 0 1, 3

probabilities:
2
a (1/4)   b (3/4)
1  A (3/4)   3/16   9/16
B (1/4)   1/16   3/16

Suppose σ = (σ1, σ2) = ((σ1(A), σ1(B)), (σ2(a), σ2(b))) = ((3/4, 1/4), (1/4, 3/4)) is played. Then

u1(σ) = 3 · 3/16 + 0 · 9/16 + 0 · 1/16 + 1 · 3/16 = 3/4,
u2(σ) = 1 · 3/16 + 0 · 9/16 + 0 · 1/16 + 3 · 3/16 = 3/4.   ♦
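These expected utilities can be computed directly by weighting each payoff by the probability of its pure strategy profile, as a short sketch shows:

```python
# Expected utilities in Example 1.6: weight each payoff pair by the
# probability σ1(s1)·σ2(s2) of the corresponding pure strategy profile.
payoffs = {("A", "a"): (3, 1), ("A", "b"): (0, 0),
           ("B", "a"): (0, 0), ("B", "b"): (1, 3)}
sigma1 = {"A": 3/4, "B": 1/4}
sigma2 = {"a": 1/4, "b": 3/4}

u1 = sum(payoffs[s][0] * sigma1[s[0]] * sigma2[s[1]] for s in payoffs)
u2 = sum(payoffs[s][1] * sigma1[s[0]] * sigma2[s[1]] for s in payoffs)

assert abs(u1 - 3/4) < 1e-12 and abs(u2 - 3/4) < 1e-12
```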

In general, player i’s expected utility from correlated strategy ρ is

(2)   ui(ρ) = Σs∈S ui(s) · ρ(s).

Player i’s expected utility from mixed strategy profile σ = (σ1, . . . , σn) is

(3)   ui(σ) = Σs∈S ui(s) · (∏j∈P σj(sj)).

In (3), the term in parentheses is the probability that s = (s1, . . . , sn) is played.

We can also write down an agent’s expected utility in a setting in which he is uncertain about his opponents’ strategies. If player i plays mixed strategy σi ∈ ∆Si and his beliefs about his opponents’ behavior are given by µi ∈ ∆(∏j≠i Sj), his expected utility is

(4)   ui(σi, µi) = Σs∈S ui(s) · σi(si) µi(s−i).

There is a standard abuse of notation here. In (2) ui acts on correlated strategies (so that ui : ∆(∏j∈P Sj) → R), in (3) ui acts on mixed strategy profiles (so that ui : ∏j∈P ∆Sj → R), and in (4) ui acts on mixed strategy/beliefs pairs (so that ui : ∆Si × ∆(∏j≠i Sj) → R). Sometimes we even combine mixed strategies with pure strategies, as in ui(si, σ−i). In the end we are always taking the expectation of ui(s) over the relevant distribution on pure strategy profiles s, so there is really no room for confusion.

    1.2 Dominance and Iterated Dominance

    Suppose we are given some normal form game G. How should we expect Bayesian rational

    players (i.e., players who form beliefs about opponents’ strategies and choose optimally

    given their beliefs) playing G to behave? We consider a sequence of increasingly restrictive

    methods for analyzing normal form games. We start by considering the implications of 

    Bayesian rationality and of common knowledge of rationality. After this, we introduce


    equilibrium assumptions.

    We always assume that the structure and payoff s of the game are common knowledge: that

    everyone knows these things, that everyone knows that everyone knows them, and so on.

Notation:

G = {P, {Si}i∈P, {ui}i∈P}   a normal form game
s−i ∈ S−i = ∏j≠i Sj   a profile of pure strategies for i’s opponents
µi ∈ ∆(∏j≠i Sj)   i’s beliefs about his opponents’ strategies (formally equivalent to a correlated strategy for i’s opponents)

    Remember that (i) in a two-player game, player  i’s beliefs µi  are the same kind of object

    as player   j’s mixed strategy σ j, and (ii) in a game with more than two players, player i’s

     beliefs µi can emulate any mixed strategy profile σ−i of  i’s opponents, but in addition can

    allow for correlation.

    1.2.1 Strictly dominant strategies

    “Dominance” concerns strategies whose performance is good (or bad) regardless of how

    opponents behave.

Pure strategy si ∈ Si is strictly dominant if

(5)   ui(si, s−i) > ui(si′, s−i) for all si′ ≠ si and s−i ∈ S−i.

In words: player i prefers si to any alternative si′ regardless of the pure strategy profile played by his opponents.

Example 1.7. Prisoners’ Dilemma revisited.²

              2
           c      d
  1  C   2, 2   0, 3
     D   3, 0   1, 1

Joint payoffs are maximized if both players cooperate. But regardless of what player 2 does, player 1 is better off defecting. The same is true for player 2. In other words, D and d are strictly dominant strategies.

The entries in the payoff bimatrix are the players’ NM utilities. If the game is supposed to represent the banker story from Example 1.1, then having these entries correspond to the dollar amounts in the story is tantamount to assuming that (i) each player is risk neutral,


and (ii) each player cares only about his own dollar payoffs. If other considerations are important—for instance, if the two bankers are friends and care about each others’ fates—then the payoff matrix would need to be changed to reflect this, and the analysis would differ correspondingly. Put differently, the analysis above tells us only that if each banker is rational and cares only about his dollar payoffs, then we should expect to see (D, d).   ♦

    The next observation shows that a Bayesian rational player must play a strictly dominant

    strategy whenever one is available.

Observation 1.8.  Strategy si is strictly dominant if and only if

(6)   ui(si, µi) > ui(si′, µi) for all si′ ≠ si and µi ∈ ∆S−i.

Thus, if strategy si is strictly dominant, then it earns the highest expected utility regardless of player i’s beliefs.

    While condition (6) directly addresses Bayesian rationality, condition (5) is easier to check.

    Why are the conditions equivalent? (⇐) is immediate. (⇒) follows from the fact that the

    inequality in (6) is a weighted average of those in (5).

    Considering player i’s mixed strategies would not allow anything new here: First, a pure

    strategy that strictly dominates all other pure strategies also dominates all other mixed

    strategies. Second, a mixed strategy that puts positive probability on more than one pure

strategy cannot be strictly dominant (since it cannot be the unique best response to any s−i; see Observation 1.14).
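Since condition (5) only quantifies over pure strategies, it can be checked by brute force. A small Python sketch (names are mine, not from the text):

```python
def is_strictly_dominant(payoffs, si):
    """Check condition (5): row si strictly beats every other row of the
    row player's payoff matrix against every column of the opponent."""
    n_rows, n_cols = len(payoffs), len(payoffs[0])
    for other in range(n_rows):
        if other == si:
            continue
        for col in range(n_cols):
            if not payoffs[si][col] > payoffs[other][col]:
                return False
    return True

# Prisoners' Dilemma, player 1's payoffs (rows C, D; columns c, d):
pd = [[2, 0],
      [3, 1]]
print(is_strictly_dominant(pd, 1))  # D is strictly dominant -> True
print(is_strictly_dominant(pd, 0))  # C is not -> False
```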

    1.2.2 Strictly dominated strategies

Most games do not have strictly dominant strategies. How can we get more mileage out of the notion of dominance?

A strategy σi ∈ ∆Si is strictly dominated if there exists a σi′ ∈ ∆Si such that

ui(σi′, s−i) > ui(σi, s−i) for all s−i ∈ S−i.

    Remarks on strictly dominated strategies:

(i)   σi is strictly dominated (by some σi′) if and only if

      ui(σi′, µi) > ui(σi, µi) for all µi ∈ ∆(∏_{j≠i} Sj).


    Thus, Bayesian rational players never choose strictly dominated strategies.

    (ii) A strategy that is not dominated by any pure strategy may be dominated by a

    mixed strategy:

              2
           L      R
     T   3, −   0, −
  1  M   0, −   3, −
     B   1, −   1, −

B is not dominated by T or M, but it is dominated by (1/2)T + (1/2)M. Note how this conclusion depends on taking expected utility seriously: the payoff of 1.5 generated by playing (1/2)T + (1/2)M against L is just as “real” as the payoffs of 3 and 0 obtained by playing T and M against L.

(iii) If a pure strategy si is strictly dominated, then so is any mixed strategy σi with si in its support (i.e., that uses si with positive probability). This is because any weight placed on a strictly dominated strategy can instead be placed on the dominating strategy, which raises player i’s payoffs regardless of how his opponents act.

For instance, in the example from part (ii), (2/3)M + (1/3)B is strictly dominated by (2/3)M + (1/3)((1/2)T + (1/2)M) = (1/6)T + (5/6)M.

(iv) But even if a group of pure strategies are not dominated, mixed strategies that combine them may be:

              2
           L      R
     T   3, −   0, −
  1  M   0, −   3, −
     B   2, −   2, −

[Figure: the simplex ∆S1 with vertices T, M, B; the strategies near the vertices are not dominated, while mixtures placing substantial weight on both T and M are.]

T, M, and B are all best responses to some σ2 ∈ ∆S2, and so are not strictly dominated. But (1/2)T + (1/2)M (which guarantees 3/2) is strictly dominated by B (which guarantees 2). In fact, any mixed strategy with both T and M in its support is strictly dominated.
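Whether one given mixed strategy strictly dominates another can likewise be checked column by column. A Python sketch of this test, applied to the games from remarks (ii) and (iv) (function names are mine):

```python
def payoff_of_mix(payoffs, mix, col):
    """Row player's expected payoff to mixed strategy `mix` against `col`."""
    return sum(p * payoffs[row][col] for row, p in enumerate(mix))

def strictly_dominates(payoffs, mix, other):
    """True if mixed strategy `mix` strictly beats mixed strategy `other`
    against every pure strategy of the opponent."""
    n_cols = len(payoffs[0])
    return all(payoff_of_mix(payoffs, mix, c) > payoff_of_mix(payoffs, other, c)
               for c in range(n_cols))

# Remark (ii) game: rows T, M, B; player 1's payoffs vs columns L, R.
g = [[3, 0], [0, 3], [1, 1]]
print(strictly_dominates(g, [0.5, 0.5, 0.0], [0.0, 0.0, 1.0]))  # 1.5 > 1 -> True

# Remark (iv) game: here B guarantees 2, so B dominates 1/2 T + 1/2 M.
g2 = [[3, 0], [0, 3], [2, 2]]
print(strictly_dominates(g2, [0.0, 0.0, 1.0], [0.5, 0.5, 0.0]))  # 2 > 1.5 -> True
```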

    1.2.3 Iterated strict dominance

    Some games without a strictly dominant strategy can still be solved using the idea of 

    dominance.


Example 1.9.  In the game below, 2 does not have a dominated pure strategy.

              2
           L      C      R
     T   2, 2   6, 1   1, 1
  1  M   1, 3   5, 5   9, 2
     B   0, 0   4, 2   8, 8

    B is dominated for 1 (by  M), so if 1 is rational he will not play B.

    If 2 knows that 1 is rational, she knows that he will not play  B.

    So if 2 is rational, she won’t play  R, which is strictly dominated by L once B is removed.

    Now if 1 knows:

    (i) that 2 knows that 1 is rational

    (ii) that 2 is rational

    then 1 knows that 2 will not play R. Hence, since 1 is rational he will not play M.

    Continuing in a similar vein: 2 will not play  C.

    Therefore, (T , L) solves G by iterated strict dominance.   ♦

    Iterated strict dominance is driven by common knowledge of rationality—by the assumption

    that all statements of the form “i knows that   j knows that . . .   k  is rational” are true—as

    well as by common knowledge of the game itself.

    To see which strategies survive iterated strict dominance it is enough to

    (i) Iteratively remove all dominated pure strategies.

    (ii) When no further pure strategies can be removed, check all remaining mixed strate-

    gies. (We do not have to do this earlier because in the early rounds, we are only

    going to check performance versus pure strategies anyway.)

    A basic fact about iterated strict dominance is:

    Proposition 1.10.   The set of strategies that remains after iteratively removing strictly dominated

    strategies does not depend on the order in which the dominated strategies are removed.

    See Dufwenberg and Stegeman (2002) or Ritzberger (2002) for a proof.

    Often, iterated strict dominance will eliminate a few strategies but not completely solve

    the game.
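Step (i) of this procedure is easy to mechanize for bimatrix games. A Python sketch (checking only pure-by-pure dominance, which suffices for Example 1.9; names are mine):

```python
def iterated_strict_dominance(u1, u2):
    """Iteratively delete pure strategies strictly dominated by a surviving
    pure strategy. u1[r][c] and u2[r][c] are the two players' payoffs."""
    rows = set(range(len(u1)))
    cols = set(range(len(u1[0])))
    changed = True
    while changed:
        changed = False
        for r in list(rows):  # row r dominated by some surviving row r2?
            if any(all(u1[r2][c] > u1[r][c] for c in cols)
                   for r2 in rows if r2 != r):
                rows.discard(r)
                changed = True
        for c in list(cols):  # column c dominated by some surviving c2?
            if any(all(u2[r][c2] > u2[r][c] for r in rows)
                   for c2 in cols if c2 != c):
                cols.discard(c)
                changed = True
    return rows, cols

# Example 1.9: rows T, M, B (0, 1, 2); columns L, C, R (0, 1, 2).
u1 = [[2, 6, 1], [1, 5, 9], [0, 4, 8]]
u2 = [[2, 1, 1], [3, 5, 2], [0, 2, 8]]
print(iterated_strict_dominance(u1, u2))  # only (T, L) survives: ({0}, {0})
```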


    1.2.4 Weak dominance

Strategy σi ∈ ∆Si is weakly dominated by σi′ if

ui(σi′, s−i) ≥ ui(σi, s−i) for all s−i ∈ S−i,   and
ui(σi′, s−i) > ui(σi, s−i) for some s−i ∈ S−i.

Strategy si ∈ Si is weakly dominant if it weakly dominates all other strategies.

Example 1.11.  Weakly dominated strategies are not ruled out by Bayesian rationality alone.

              2
           L      R
  1  T   1, −   0, −
     B   0, −   0, −

Here B is weakly dominated by T, yet B is still a best response to the belief that 2 will play R.   ♦

    While the use of weakly dominated strategies is not ruled out by Bayesian rationality

    alone, the avoidance of such strategies is often taken as a first principle. In decision

    theory, this principle is referred to as  admissibility; see Kohlberg and Mertens (1986) for

    discussion and historical comments. In game theory, admissibility is sometimes deduced

    from the principle of  cautiousness, which requires that players not view any opponents’

     behavior as impossible; see Asheim (2006) for discussion.

    It is natural to contemplate iteratively removing weakly dominated strategies. However,

    iterated removal and cautiousness conflict with one another: removing a strategy means

viewing it as impossible, which contradicts cautiousness. See Samuelson (1992) for discussion and analysis. One consequence is that the order of removal of weakly dominated

    strategies can matter—see Example 1.12 below. (For results on when order of removal

    does not matter, see Marx and Swinkels (1997) and Østerdal (2005).) But versions of 

    iterated weak dominance can be placed on a secure epistemic footing (see Brandenburger

    et al. (2008)), and moreover, iterated weak dominance is a powerful tool for analyzing

    extensive form games (see Section 2.6.1).

Example 1.12. Order of removal matters under IWD.

              2
           L      R
     U   5, 1   4, 0
  1  M   6, 0   3, 1
     D   6, 4   4, 4


In the game above, removing the weakly dominated strategy U first leads to the prediction (D, R), while removing the weakly dominated strategy M first leads to the prediction (D, L).   ♦
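The order dependence in Example 1.12 can be verified mechanically. A Python sketch (the helper name is mine) that computes which column is weakly dominated after each possible first removal:

```python
def weakly_dominated_cols(u2, rows, cols):
    """Columns weakly dominated by another surviving column, given the set
    of surviving rows (u2[row][col] are the column player's payoffs)."""
    out = []
    for c in cols:
        for c2 in cols:
            if c2 != c \
                    and all(u2[r][c2] >= u2[r][c] for r in rows) \
                    and any(u2[r][c2] > u2[r][c] for r in rows):
                out.append(c)
                break
    return out

# Example 1.12: rows U, M, D (0, 1, 2); columns L, R (0, 1).
u2 = [[1, 0], [0, 1], [4, 4]]

after_U_removed = weakly_dominated_cols(u2, [1, 2], [0, 1])
after_M_removed = weakly_dominated_cols(u2, [0, 2], [0, 1])
print(after_U_removed, after_M_removed)  # [0] [1]: L goes in one order, R in the other
```

Removing L leaves prediction (D, R); removing R leaves prediction (D, L), as in the text.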

An intermediate solution concept between ISD and IWD is introduced by Dekel and Fudenberg (1990), who suggest one round of elimination of all weakly dominated strategies,

    followed by iterated elimination of strictly dominated strategies. Since weak dominance

    is not applied iteratively, the tensions described above do not arise. Strategies that survive

    this Dekel-Fudenberg procedure are sometimes called permissible. See Section 2.5 for further

    discussion.

    1.3 Rationalizability

    1.3.1 Definition and examples

    Q: What is the tightest prediction that we can make assuming only common knowledge

    of rationality?

    A: Bayesian rational players not only avoid dominated strategies; they also avoid strate-

    gies that are never a best response. If we apply this idea iteratively, we obtain the

    sets of  rationalizable strategies.

Strategy σi is a best response to beliefs µi (denoted σi ∈ Bi(µi)) if

ui(σi, µi) ≥ ui(σi′, µi) for all σi′ ∈ ∆Si.

(In contrast with dominance, there is no “for all µi”.)

    The set-valued map Bi is called player i’s best response correspondence. As with the notation

    ui(·), we will abuse the notation Bi(·) as necessary, writing both Bi(µi) and Bi(σ−i).

Informally, the rationalizable strategies (Bernheim (1984), Pearce (1984)) are those that remain after we iteratively remove all strategies that cannot be a best response, accounting for each player’s uncertainty about his opponents’ behavior. We provide a definition below, and an alternate characterization in Theorem 1.23.

    Because it only requires CKR, rationalizability is a relatively weak solution concept. Still,

    when rationalizability leads to many rounds of removal, it can result in stark predictions.

Example 1.13. Guessing 3/4 of the average.

There are n players. Each player’s strategy set is Si = {0, 1, . . . , 100}.


The target integer is defined to be 3/4 of the average strategy chosen, rounding down.

    All players choosing the target integer split a prize worth V  >  0 (or, alternatively, each is

    given the prize with equal probability). If no one chooses the target integer, the prize is

    not awarded.

    Which pure strategies are rationalizable in this game?

    To start, we claim that for any pure strategy profile  s−i  of his opponents, player  i  has a

    response ri   ∈  Si  such that the target integer generated by (ri, s−i) is ri. (You are asked to

    prove this on the problem set.) Thus for any beliefs µi about his opponents, player  i  can

    obtain a positive expected payoff  (for instance, by playing a best response to some  s−i  in

    the support of  µi).

So: Since Si = {0, 1, . . . , 100},

⇒  The highest possible average is 100.
⇒  The highest possible target is (3/4) · 100 = 75.
⇒  Strategies in {76, 77, . . . , 100} yield a payoff of 0.
⇒  Since player i has a strategy that earns a positive expected payoff given his beliefs, strategies in {76, 77, . . . , 100} are not best responses.

Thus if players are rational, no player chooses a strategy above 75.

⇒  The highest possible average is 75.
⇒  The highest possible target is ⌊(3/4) · 75⌋ = ⌊56 1/4⌋ = 56.
⇒  Strategies in {57, . . . , 100} yield a payoff of 0.
⇒  Since player i has a strategy that earns a positive expected payoff given his beliefs, strategies in {57, . . . , 100} are not best responses.

Thus if players are rational and know that others are rational, no player chooses a strategy above 56.

Proceeding through the rounds of eliminating strategies that cannot be best responses, we find that no player will choose a strategy higher than

75 . . . 56 . . . 42 . . . 31 . . . 23 . . . 17 . . . 12 . . . 9 . . . 6 . . . 4 . . . 3 . . . 2 . . . 1 . . . 0.

Thus, after 14 rounds of iteratively removing strategies that cannot be best responses, we conclude that each player’s unique rationalizable strategy is 0.   ♦
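The sequence of upper bounds above can be generated mechanically: each round replaces the current bound with the target it generates when everyone plays that bound. A short Python sketch:

```python
# Iterated elimination in the 3/4-of-the-average game: in each round, the
# highest surviving strategy is floor(3/4 * bound), where `bound` is the
# highest strategy surviving the previous round.
bound, rounds, path = 100, 0, []
while bound > 0:
    bound = (3 * bound) // 4   # 3/4 of the highest possible average, rounded down
    rounds += 1
    path.append(bound)

print(path)    # [75, 56, 42, 31, 23, 17, 12, 9, 6, 4, 3, 2, 1, 0]
print(rounds)  # 14 rounds, as in the text
```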

    When applying rationalizability, we may reach a point in our analysis at which a player

    has multiple pure strategies, none of which can be removed (meaning that for each such

    strategy, there are beliefs against which that strategy is optimal). In this case, we should


    consider whether any mixtures of these pure strategies can be removed.

    The following observation provides an easy way of checking whether a mixed strategy is

    a best response.

Observation 1.14. Strategy σi is a best response to µi if and only if every pure strategy si in the support of σi is a best response to µi.

This follows immediately from the fact that the payoff to a mixed strategy is the appropriate weighted average of the payoffs to the pure strategies in its support.

Example 1.15. Determining the rationalizable strategies in a normal form game.

              2
           L      C      R
     T   3, 3   0, 0   0, 2
  1  M   0, 0   3, 3   0, 2
     B   2, 2   2, 2   2, 0

To find B1 : ∆S2 ⇒ ∆S1, write player 1’s beliefs as µ1 = (l, c, r):

u1(T, µ1) ≥ u1(M, µ1)   ⇔   3l ≥ 3c          ⇔   l ≥ c
u1(T, µ1) ≥ u1(B, µ1)   ⇔   3l ≥ 2           ⇔   l ≥ 2/3

To find B2 : ∆S1 ⇒ ∆S2, write player 2’s beliefs as µ2 = (t, m, b):

u2(µ2, L) ≥ u2(µ2, C)   ⇔   3t + 2b ≥ 3m + 2b   ⇔   t ≥ m
u2(µ2, L) ≥ u2(µ2, R)   ⇔   3t + 2b ≥ 2t + 2m   ⇔   t + 2b ≥ 2m


[Figures: the best response correspondences. At left, B1 : ∆S2 ⇒ ∆S1, drawn on the simplex with vertices L, C, R, showing the regions of beliefs where each of T, M, and B is a best response for player 1. At right, B2 : ∆S1 ⇒ ∆S2, drawn on the simplex with vertices T, M, B: everything is a best response for player 2.]

    Q: No mixtures of  T  and  M  are a best response for 1. Since 2 knows this, can it be a best

    response for her to play R?

    A: R is not a best response to any point on the dark lines  TB  and  BM, which represent

    mixtures between strategies T  and B and between B and M.

    Since player 2 is uncertain about which best response player 1 will play, Bayesian ratio-

    nality requires her to form beliefs about this. These beliefs µ2  are a probability measure

    on the set of player 1’s best responses.

If 2’s beliefs about 1’s behavior are µ2(T) = µ2(M) = 1/2, then it is as if 2 knows that 1 will play (1/2)T + (1/2)M, and R is a best response to these beliefs.

In fact, if µ2(T) = µ2(M) = 2/5 and µ2(B) = 1/5, then it is as if 2 knows that 1 will play (2/5)T + (2/5)M + (1/5)B, so all of 2’s mixed strategies are possible best responses.

Thus, player 1’s set of rationalizable strategies is R*1 = {σ1 ∈ ∆S1 : σ1(T) = 0 or σ1(M) = 0}, and player 2’s set of rationalizable strategies is simply R*2 = ∆S2.   ♦
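The belief cutoffs derived above can be spot-checked numerically. A small Python sketch for the game of this example (the helper name u1_of is mine):

```python
def u1_of(row, belief):
    """Player 1's expected payoff in Example 1.15; belief = (l, c, r)."""
    payoffs = {'T': (3, 0, 0), 'M': (0, 3, 0), 'B': (2, 2, 2)}
    return sum(p * q for p, q in zip(payoffs[row], belief))

# With l >= c and l >= 2/3, T is the best response:
vals = {s: u1_of(s, (0.7, 0.2, 0.1)) for s in 'TMB'}
best_at_L_heavy = max(vals, key=vals.get)

# Against l = c = 1/2 (where T and M each earn only 1.5), B is the unique
# best response, earning 2:
vals = {s: u1_of(s, (0.5, 0.5, 0.0)) for s in 'TMB'}
best_at_LC_split = max(vals, key=vals.get)
print(best_at_L_heavy, best_at_LC_split)  # T B
```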

    When we compute the rationalizable strategies, we must account for each player’s un-

    certainty about his opponent’s strategies. Thus, during each iteration we must leave in

    all of his best responses to any  mixture of the opponent’s surviving pure strategies, even

mixtures that are never a best response. Put differently, strategic uncertainty leads us to


    include the convex hull of the surviving mixed strategies at each intermediate stage of the

    elimination process.

    Iterative definition of (and procedure to compute) rationalizable strategies:

(i) Iteratively remove pure strategies that are never a best response (to any allowable beliefs).

(ii) When no further pure strategies can be removed, remove mixed strategies that are never a best response.

    The mixed strategies that remain are the rationalizable strategies.

    There are refinements of rationalizability based on assumptions beyond CKR that generate

    tighter predictions in some games, while still avoiding the use of equilibrium knowledge

    assumptions—see Section 2.5.

    Rationalizability and iterated strict dominance in two-player games

    It is obvious that

    Observation 1.16.   If  σi is strictly dominated, then  σi is never a best response.

    In two-player games, the converse statement is not obvious, but is nevertheless true:

    Proposition 1.17.  In a two-player game, any strategy that is never a BR is strictly dominated.

    The proof is based on the separating hyperplane theorem: see Section 1.3.2.

    So “never a best response” and “strictly dominated” are equivalent in two-player games.

    Iterating yields

    Theorem 1.18.   In a two-player game, a strategy is rationalizable if and only if it satisfies iterated

    strict dominance.

    Rationalizability and iterated strict dominance in games with three or more players

    For games with three or more players, there are two definitions of rationalizability in use.

    The original one (sometimes called  independent rationalizability) computes best responses

under the assumption that a player’s beliefs about different opponents’ choices are independent, so that these beliefs are formally equivalent to an opponents’ mixed strategy profile. The alternative (sometimes called correlated rationalizability) allows correlation in a player’s beliefs about different opponents’ choices. This agrees with the way we defined

     beliefs in Section 1.1.2. In either case, [σi strictly dominated] ⇒ [σi never a best response],

    so all rationalizable strategies survive iterated strict dominance. But the analogues of 

    Proposition 1.17 and Theorem 1.18 are only true under correlated rationalizability.


    While opinion is not completely uniform, most game theorists would choose correlated

    rationalizability as the more basic of the two concepts. See Hillas and Kohlberg (2002) for

    a compelling defense of this point of view. We take “rationalizability” to mean “correlated

    rationalizability” unless otherwise noted.

Example 1.19.  Consider the following three-player game in which only player 3’s payoffs are shown. Player 1 chooses a row, player 2 chooses a column, and player 3 chooses a matrix.

3: A                     3: B                     3: C
        L        R               L        R               L        R
  T  −, −, 5  −, −, 2      T  −, −, 4  −, −, 0      T  −, −, 1  −, −, 2
  B  −, −, 2  −, −, 1      B  −, −, 0  −, −, 4      B  −, −, 2  −, −, 5

Strategy B is not strictly dominated, since a dominating mixture of A and C would need to put at least probability 3/4 on both A (in case 1 and 2 play (T, L)) and C (in case 1 and 2 play (B, R)).

If player 3’s beliefs about player 1’s choices and player 2’s choices are independent, B is not a best response: Independence implies that for some t, l ∈ [0, 1], we can write µ3(T, L) = tl, µ3(T, R) = t(1 − l), µ3(B, L) = (1 − t)l, and µ3(B, R) = (1 − t)(1 − l). Then

u3(C, µ3) > u3(B, µ3)
⇔   tl + 2t(1 − l) + 2(1 − t)l + 5(1 − t)(1 − l) > 4tl + 4(1 − t)(1 − l)
⇔   1 + t + l > 6tl,

which is true whenever t + l ≤ 1 (why?); symmetrically, u3(A, µ3) > u3(B, µ3) whenever t + l ≥ 1. But B is a best response to the correlated beliefs µ3(T, L) = µ3(B, R) = 1/2.   ♦
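Both claims can be checked numerically. A Python sketch (names are mine; the grid search is a crude stand-in for the algebraic argument above):

```python
# Player 3's payoffs from Example 1.19, indexed by matrix choice and (row, col).
u3 = {
    'A': {('T','L'): 5, ('T','R'): 2, ('B','L'): 2, ('B','R'): 1},
    'B': {('T','L'): 4, ('T','R'): 0, ('B','L'): 0, ('B','R'): 4},
    'C': {('T','L'): 1, ('T','R'): 2, ('B','L'): 2, ('B','R'): 5},
}

def eu(a, mu):
    """Player 3's expected utility from matrix `a` under beliefs `mu`."""
    return sum(mu[s] * u3[a][s] for s in mu)

# Under the correlated beliefs mu(T,L) = mu(B,R) = 1/2, B is the best response:
mu_corr = {('T','L'): 0.5, ('T','R'): 0.0, ('B','L'): 0.0, ('B','R'): 0.5}
print({a: eu(a, mu_corr) for a in 'ABC'})  # B earns 4; A and C earn 3

# Under independent beliefs built from any (t, l), B never strictly beats
# both A and C (checked on a grid):
never_strict_br = True
for ti in range(101):
    for li in range(101):
        t, l = ti / 100, li / 100
        mu = {('T','L'): t * l, ('T','R'): t * (1 - l),
              ('B','L'): (1 - t) * l, ('B','R'): (1 - t) * (1 - l)}
        if eu('B', mu) > max(eu('A', mu), eu('C', mu)):
            never_strict_br = False
print(never_strict_br)  # True
```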

    1.3.2 The separating hyperplane theorem

A hyperplane is a set of points in Rⁿ that satisfy a scalar linear equality. More specifically, the hyperplane H_{p,c} = {x ∈ Rⁿ : p · x = c} is identified by some normal vector p ∈ Rⁿ − {0} and intercept c ∈ R. Since the hyperplane is an (n − 1)-dimensional affine subset of Rⁿ, its normal vector is unique up to a multiplicative constant.

A half space is a set {x ∈ Rⁿ : p · x ≤ c}.

Example 1.20.  In R², a hyperplane is a line: x2 = ax1 + b ⇒ (−a, 1) · x = b, so p = (−a, 1). The figure below displays cases in which a = −1/2, so that p = (1/2, 1).


[Figure: the parallel hyperplanes p · x = 0, p · x = 2, and p · x = 4 in R², with the normal vector p pointing toward the hyperplanes with larger intercepts.]

Interpreting the figure:

H_{p,0} = {x : p · x = 0} is the hyperplane through the origin containing all vectors orthogonal to p.

H_{p,c} is a hyperplane parallel to H_{p,0}. (Why? If y, ŷ ∈ H_{p,c} = {x : p · x = c}, then the tangent vector ŷ − y is orthogonal to p: that is, p · (ŷ − y) = 0, or equivalently, ŷ − y ∈ H_{p,0}.)

The normal vector p points towards H_{p,c} with c > 0. (Why? Because x · y = |x||y| cos θ > 0 when the angle θ formed by x and y is acute.)   ♦

Theorem 1.21 (The Separating Hyperplane Theorem).
Let A, B ⊂ Rⁿ be closed convex sets such that A ∩ B ⊆ bd(A) ∩ bd(B). Then there exists a p ∈ Rⁿ − {0} such that p · x ≤ p · y for all x ∈ A and y ∈ B.

[Figure: convex sets A and B separated by the hyperplane p · x = p · z = c, with p · x < c on A’s side and p · x > c on B’s side.]

    In cases where B  consists of a single point on the boundary of  A, the hyperplane whose

    existence is guaranteed by the theorem is often called a  supporting hyperplane.

    For proofs, discussion, examples, etc. see Hiriart-Urruty and Lemaréchal (2001).

    Application: Best responses and dominance in two-player games

    Observation 1.16.   If  σi is strictly dominated, then  σi is never a best response.


Proposition 1.17.  Let G be a two-player game. Then σi ∈ ∆Si is strictly dominated if and only if σi is not a best response to any µi ∈ ∆S−i.

    Theorem 1.18.   In a two-player game, a strategy is rationalizable if and only if it satisfies iterated

    strict dominance.

    We illustrate the idea of the proof of Proposition 1.17 with an example.

Example 1.22.  Our goal is to show that in the two-player game below, [σi ∈ ∆Si is not strictly dominated] implies that [σi is a best response to some µi ∈ ∆S−i].

              2
           L      R
     A   2, −   5, −
  1  B   6, −   3, −
     C   7, −   1, −
     D   3, −   2, −

Let v1(σ1) = (u1(σ1, L), u1(σ1, R)) be the vector payoff induced by σ1. Note that u1(σ1, µ1) = µ1 · v1(σ1).

Let V1 = {v1(σ1) : σ1 ∈ ∆S1} be the set of such vector payoffs. Equivalently, V1 is the convex hull of the vector payoffs to player 1’s pure strategies. It is closed and convex.

Now σ1 ∈ ∆S1 is not strictly dominated if and only if v1(σ1) lies on the northeast boundary of V1. For example, σ̃1 = (1/2)A + (1/2)B is not strictly dominated, with v1(σ̃1) = (4, 4). We want to show that σ̃1 is a best response to some µ̃1 ∈ ∆S2.

[Figure: the set V1, the convex hull of v1(A) = (2, 5), v1(B) = (6, 3), v1(C) = (7, 1), and v1(D) = (3, 2). The point v1(σ̃1) = v1((1/2)A + (1/2)B) = (4, 4) lies on the northeast boundary, on the hyperplane µ̃1 · w1 = 4 with µ̃1 = (1/3, 2/3); the rest of V1 satisfies µ̃1 · w1 < 4.]


A general principle: when you are given a point on the boundary of a convex set, the normal vector at that point often reveals something interesting.

The point v1(σ̃1) lies on the hyperplane µ̃1 · w1 = 4, where µ̃1 = (1/3, 2/3).

This hyperplane separates the point v1(σ̃1) from the set V1, on which µ̃1 · w1 ≤ 4.

Put differently,

µ̃1 · w1 ≤ µ̃1 · v1(σ̃1) for all w1 ∈ V1
⇒   µ̃1 · v1(σ1) ≤ µ̃1 · v1(σ̃1) for all σ1 ∈ ∆S1   (by the definition of V1)
⇒   u1(σ1, µ̃1) ≤ u1(σ̃1, µ̃1) for all σ1 ∈ ∆S1.

Therefore, σ̃1 is a best response to µ̃1.

    The same argument shows that every mixture of  A and B is a best response to µ̃1.
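The computation is easy to verify numerically. A Python sketch (names are mine):

```python
# Example 1.22: player 1's vector payoffs (against L, against R).
v1 = {'A': (2, 5), 'B': (6, 3), 'C': (7, 1), 'D': (3, 2)}
mu = (1 / 3, 2 / 3)  # the separating normal vector, normalized to be beliefs

def eu(vec, mu):
    """Expected utility mu . v1(sigma) against beliefs mu over (L, R)."""
    return mu[0] * vec[0] + mu[1] * vec[1]

payoffs = {s: eu(v, mu) for s, v in v1.items()}
print(payoffs)
# A and B attain the maximum expected utility 4 against these beliefs, so
# every mixture of them (including 1/2 A + 1/2 B) is a best response:
print([s for s in payoffs if abs(payoffs[s] - 4.0) < 1e-9])  # ['A', 'B']
```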

We can repeat this argument for all mixed strategies of player 1 corresponding to points on the northeast frontier of V1, as in the figure below at left. The figure below at right presents player 1’s best response correspondence, drawn beneath graphs of his pure strategy payoff functions. Both figures link player 1’s beliefs and best responses: in the left figure, player 1’s beliefs are the normal vectors, while in the right figure, player 1’s beliefs correspond to different horizontal coordinates.

[Figures: at left, the set V1 with normal vectors µ1 = (2/3, 1/3) and µ̃1 = (1/3, 2/3) at points on its northeast frontier. At right, graphs of u1(A, µ1), u1(B, µ1), u1(C, µ1), and u1(D, µ1) as µ1 ranges from L to R, with player 1’s best response correspondence (C, then B, then A) drawn beneath, switching at µ1 = (2/3)L + (1/3)R and µ1 = (1/3)L + (2/3)R.]

    1.3.3 A positive characterization

    The procedure introduced earlier defines rationalizability in a “negative” fashion, by

    iteratively removing strategies that are not rationalizable. It is good to have a “positive”


    characterization, describing rationalizability in terms of what it requires rather than what

    it rules out.

    One informal way to state a positive characterization says that each rationalizable strategy

    is a best response to beliefs that only put weight on best responses . . . to beliefs that only

put weight on best responses . . . to beliefs that only put weight on best responses . . . In this way, the choice of the strategy is justified by a chain of expectations of rational behavior.

[Diagram: a chain in which best responses are justified by beliefs, which are placed on best responses, which are justified by beliefs, and so on.]

    It is possible to eliminate this infinite regress by introducing a fixed point.

[Diagram: a cycle in which best responses are justified by beliefs, which are placed on those same best responses.]

    The precise version of this fixed point idea is provided by part (i) of Theorem 1.23, which

    later will allow us to relate rationalizability to Nash equilibrium. Part (ii) of the theorem

    provides the new characterization of rationalizability. We state the characterization for

    pure strategies. (To obtain the version for mixed strategies, take Ψi ⊆ ∆Si as the candidate

    set and let Ri  = ∪σi∈Ψi support(σi).)

Theorem 1.23.  Let Ri ⊆ Si for all i ∈ P, and let R−i = ∏_{j≠i} Rj.

(i)   Suppose that for each i ∈ P and each si ∈ Ri, there is a µi ∈ ∆S−i such that

      (a)  si is a best response to µi, and
      (b)  the support of µi is contained in R−i.

      Then for each player i, all strategies in Ri are rationalizable.

(ii)  There is a largest product set ∏_{j∈P} R*j such that the collection R*1, . . . , R*n satisfies (i). Moreover, for each i ∈ P, R*i is player i’s set of rationalizable pure strategies.


    Osborne (2004, p. 380–382) provides a clear discussion of these ideas.

Example 1.24.  Which pure strategies are rationalizable in the following game?

              2
           a      b      c      d
     A   7, 0   2, 5   0, 7   0, 1
  1  B   5, 2   3, 3   5, 2   0, 1
     C   0, 7   2, 5   7, 0   0, 1
     D   0, 0   0, −2  0, 0   9, −1

Notice that d is strictly dominated by (1/2)a + (1/2)c. Once d is removed, D is strictly dominated in the game that remains.

To show that all remaining pure strategies are rationalizable, we apply Theorem 1.23.

Let (R*1, R*2) = ({A, B, C}, {a, b, c}).

Then:
B is optimal for 1 when µ1 = b ∈ R*2, and
b is optimal for 2 when µ2 = B ∈ R*1.

Also:
A is optimal for 1 when µ1 = a ∈ R*2,
a is optimal for 2 when µ2 = C ∈ R*1,
C is optimal for 1 when µ1 = c ∈ R*2, and
c is optimal for 2 when µ2 = A ∈ R*1.

Thus the strategies in (R*1, R*2) are rationalizable.

We can gain further insight by focusing on the collections of smaller sets that satisfy the conditions of Theorem 1.23.

(R1, R2) = ({B}, {b}) satisfies these conditions. When each Ri is a singleton, as in this case, there is no flexibility in choosing the beliefs µi: the beliefs must be correct. Indeed, the strategy profile generated by the Ri is a pure strategy Nash equilibrium.

(R1, R2) = ({A, C}, {a, c}) also satisfies the conditions of Theorem 1.23(i). The strategies in these sets form a best response cycle. In this case, each strategy si ∈ Ri is justified using different beliefs µi. Thus, the fact that rationalizability does not assume that players’ beliefs are correct plays a crucial role here.   ♦
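Both the dominance claim and the best response cycle can be read off the payoff matrices mechanically. A Python sketch (names are mine):

```python
# Example 1.24: u1[row][col], u2[row][col]; rows A, B, C, D; columns a, b, c, d.
rows, cols = 'ABCD', 'abcd'
u1 = [[7, 2, 0, 0], [5, 3, 5, 0], [0, 2, 7, 0], [0, 0, 0, 9]]
u2 = [[0, 5, 7, 1], [2, 3, 2, 1], [7, 5, 0, 1], [0, -2, 0, -1]]

# d is strictly dominated for player 2 by 1/2 a + 1/2 c:
mix_beats_d = all((u2[r][0] + u2[r][2]) / 2 > u2[r][3] for r in range(4))
print(mix_beats_d)  # True

# The best response cycle A -> a -> C -> c -> A: player 1's best row against
# a column, and player 2's best column against a row.
br1 = lambda c: rows[max(range(4), key=lambda r: u1[r][c])]
br2 = lambda r: cols[max(range(4), key=lambda c: u2[r][c])]
print(br1(0), br2(2), br1(2), br2(0))  # A a C c
```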


    1.4 Nash Equilibrium

    Rationalizability only relies on common knowledge of rationality. Unfortunately, it often

    fails to provide tight predictions of play. To obtain tighter predictions, we need to impose

    stronger restrictions on players’ beliefs about their opponents’ behavior. Doing so will

    lead to the central solution concept of noncooperative game theory.

    1.4.1 Definition

To reduce the amount of notation, let Σi = ∆Si denote player i’s set of mixed strategies. Similarly, let Σ = ∏_{j∈P} ∆Sj and Σ−i = ∏_{j≠i} ∆Sj.

Define player i’s best response correspondence Bi : Σ−i ⇒ Σi by

Bi(σ−i) = argmax_{σi∈Σi} ui(σi, σ−i).

Strategy profile σ ∈ Σ is a Nash equilibrium (Nash (1950)) if

σi ∈ Bi(σ−i) for all i ∈ P.

    In words: each player plays a best response to the strategies of his opponents.

    Underlying assumptions:

    (i) Each player has correct beliefs about what opponents will do (vs. rationalizability:

    reasonable beliefs).

    (ii) Each behaves rationally given these beliefs.

Example 1.25. Good Restaurant, Bad Restaurant.

            2
         g       b
1   G   2, 2    0, 0
    B   0, 0    1, 1

Everything is rationalizable.

The Nash equilibria are (G, g), (B, b), and (1/3 G + 2/3 B, 1/3 g + 2/3 b).

Checking the mixed equilibrium:

u2(1/3 G + 2/3 B, g) = 1/3 · 2 + 2/3 · 0 = 2/3


u2(1/3 G + 2/3 B, b) = 1/3 · 0 + 2/3 · 1 = 2/3

    ⇒ All strategies in Σ2 are best responses.

In the mixed equilibrium each player is indifferent between his mixed strategies.

Each chooses the mixture that makes his opponent indifferent.   ♦
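These indifference computations are easy to mechanize. Here is a minimal Python check (the array encoding of the bimatrix and the variable names are mine) that against 1/3 G + 2/3 B, both of player 2’s pure strategies earn 2/3, and symmetrically for player 1:

```python
# Exact check, using rational arithmetic, that the mixture 1/3 G + 2/3 B
# makes the opponent indifferent (payoffs from the bimatrix above).
from fractions import Fraction as F

u1 = [[2, 0], [0, 1]]  # u1[i][j]: row G=0/B=1, column g=0/b=1
u2 = [[2, 0], [0, 1]]

sigma1 = [F(1, 3), F(2, 3)]  # P(G), P(B)
sigma2 = [F(1, 3), F(2, 3)]  # P(g), P(b)

# Player 2's expected payoffs to g and b against sigma1
u2_g = sum(sigma1[i] * u2[i][0] for i in range(2))
u2_b = sum(sigma1[i] * u2[i][1] for i in range(2))

# Player 1's expected payoffs to G and B against sigma2
u1_G = sum(sigma2[j] * u1[0][j] for j in range(2))
u1_B = sum(sigma2[j] * u1[1][j] for j in range(2))

print(u2_g, u2_b, u1_G, u1_B)  # all equal 2/3
```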

We can refine our prediction by applying the notion of strict equilibrium: s∗ is a strict equilibrium if for each i, s∗i is the unique best response to s∗−i. That is, Bi(s∗−i) = {s∗i} for all i.

Strict equilibria seem especially compelling.

But strict equilibria do not exist in all games (unlike Nash equilibria: see Section 1.4.4).

In the previous example, the Nash equilibrium (G, g) maximizes both players’ payoffs.

    One might be tempted to say that a Nash equilibrium with this property is always the one

    to focus on. But this criterion is not always compelling:

Example 1.26. Joint investment.

Each player can make a safe investment that pays 8 for sure, or a risky investment that pays 9 if the other player joins in the investment and 0 otherwise.

            2
         s       r
1   S   8, 8    8, 0
    R   0, 8    9, 9

The Nash equilibria here are (S, s), (R, r), and (1/9 S + 8/9 R, 1/9 s + 8/9 r). Although (R, r) yields both players the highest payoff, each player might be tempted by the sure payoff of 8 that the safe investment guarantees.   ♦
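The 1/9, 8/9 mixture can be recovered from player 2’s indifference condition. A quick sketch (variable names mine):

```python
# If player 1 plays S with probability p, player 2 gets 8 from s regardless,
# and 9(1-p) from r. Setting these equal recovers p (payoffs from the bimatrix above).
from fractions import Fraction as F

p = 1 - F(8, 9)          # solve 8 = 9(1-p)  =>  1-p = 8/9
assert 8 == 9 * (1 - p)  # player 2 is indifferent between s and r
print(p)  # 1/9
```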

    1.4.2 Computing Nash equilibria

    The next proposition provides links between Nash equilibrium and rationalizability.

    Proposition 1.27.   (i)   Any pure strategy used with positive probability in a Nash equilibrium

    is rationalizable.

    (ii)   If each player has a unique rationalizable strategy, the profile of these strategies is a Nash

    equilibrium.

    Proof.  Theorem 1.23 provided conditions under which strategies in the sets  Ri   ⊆   Si  are

    rationalizable: for each i  ∈  P  and each si  ∈ Ri, there is a µi  ∈ ∆S−i such that


    (a)   si is a best response to  µi, and

    (b) the support of  µi is contained in R−i.

To prove part (i) of the proposition, let σ = (σ1, . . . , σn) be a mixed equilibrium, and let Ri be the support of σi. Observation 1.14 tells us that each si ∈ Ri is a best response to σ−i. Thus (a) and (b) hold with µi determined by σ−i.

To prove part (ii) of the proposition, suppose that s = (s1, . . . , sn) is the unique rationalizable strategy profile. Then (a) and (b) say that si is a best response to s−i, and so s is a Nash equilibrium.

    See Osborne (p. 383–384) for further discussion.

Proposition 1.27 provides guidelines for computing the Nash equilibria of a game. First eliminate all non-rationalizable strategies. If this leaves only one pure strategy profile, this profile is a Nash equilibrium.

    Guidelines for computing all Nash equilibria:

    (i) Eliminate pure strategies that are not rationalizable.

    (ii) For each profile of supports, find all equilibria.

    Once the profile of supports is fixed, one identifies all equilibria with this profile of 

    supports by introducing the optimality conditions implied by the supports: namely, that

the pure strategies in the support of a player’s equilibrium strategy receive the same payoff, which is at least as high as the payoffs to strategies outside the support. In this way, each player’s optimality conditions restrict what the other players’ strategies may be.

This approach is simply a convenient way of evaluating every strategy profile. In effect, one finds all equilibria by ruling out all non-equilibria and keeping what remains.

Inevitably, this approach is computationally intensive: if player i has ki strategies, there are ∏i∈P (2^ki − 1) possible profiles of supports, and each can have multiple equilibria. (In practice, one fixes the supports of only n − 1 players’ strategies, and determines the support for the nth player’s strategy using the optimality conditions—see the examples below.)
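The procedure described above can be written down directly for two-player games. The following Python sketch (all function names are mine) enumerates equal-size supports only, which suffices for nondegenerate games; degenerate games, whose equilibria can come in components, would also require handling singular indifference systems.

```python
# Support enumeration for a two-player game with payoff matrices A (player 1)
# and B (player 2), in exact rational arithmetic. A sketch: it searches
# equal-size supports only, so it can miss equilibria of degenerate games.
from fractions import Fraction as F
from itertools import combinations

def solve(M, v):
    """Solve the square system M x = v over the rationals; None if singular."""
    n = len(M)
    M = [[F(x) for x in row] + [F(y)] for row, y in zip(M, v)]
    for col in range(n):
        piv = next((r for r in range(col, n) if M[r][col] != 0), None)
        if piv is None:
            return None
        M[col], M[piv] = M[piv], M[col]
        M[col] = [x / M[col][col] for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                M[r] = [a - M[r][col] * b for a, b in zip(M[r], M[col])]
    return [row[-1] for row in M]

def support_equilibria(A, B):
    m, n = len(A), len(A[0])
    found = []
    for k in range(1, min(m, n) + 1):
        for I in combinations(range(m), k):       # support of player 1
            for J in combinations(range(n), k):   # support of player 2
                # q makes player 1 indifferent over I; p does the same for 2 over J
                sol_q = solve([[A[i][j] for j in J] + [-1] for i in I]
                              + [[1] * k + [0]], [0] * k + [1])
                sol_p = solve([[B[i][j] for i in I] + [-1] for j in J]
                              + [[1] * k + [0]], [0] * k + [1])
                if sol_q is None or sol_p is None:
                    continue
                q, u1 = sol_q[:k], sol_q[k]
                p, u2 = sol_p[:k], sol_p[k]
                if any(x < 0 for x in p + q):
                    continue
                # strategies outside the supports must not earn more
                if any(sum(A[i][j] * qj for j, qj in zip(J, q)) > u1
                       for i in range(m) if i not in I):
                    continue
                if any(sum(B[i][j] * pj for i, pj in zip(I, p)) > u2
                       for j in range(n) if j not in J):
                    continue
                found.append((dict(zip(I, p)), dict(zip(J, q))))
    return found

# Demo on the Good Restaurant, Bad Restaurant game of Example 1.25
A = [[2, 0], [0, 1]]
eqs = support_equilibria(A, A)
print(eqs)  # (G, g), (B, b), and the 1/3, 2/3 mixture
```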


Example 1.28. (Example 1.15 revisited).

            2
         L       C       R
    T   3, 3    0, 0    0, 2
1   M   0, 0    3, 3    0, 2
    B   2, 2    2, 2    2, 0

[Figure: the best response correspondences B1: ∆S2 ⇒ ∆S1 (best responses for player 1) and B2: ∆S1 ⇒ ∆S2 (everything is a best response for player 2), together with the rationalizable sets R∗1 and R∗2.]

R is rationalizable, since it is a best response to some probability distributions over R1, as such distributions can replicate every point in ∆S1.

But since R is not a best response to any σ1 ∈ R∗1, R is never played in a Nash equilibrium. The key point here is that in Nash equilibrium, player 2’s beliefs are correct (i.e., place probability on player 1’s actual strategy).

Thus, we need not consider any support for σ2 that includes R. Three possible supports for σ2 remain:

    {L} ⇒   1’s BR is T  ⇒ 2’s BR is L   ∴ (T , L) is Nash

    {C} ⇒   1’s BR is M ⇒ 2’s BR is C   ∴ ( M, C) is Nash

{L, C} ⇒ (i) u2(σ1, L) = u2(σ1, C) and (ii) u2(σ1, C) ≥ u2(σ1, R): look at B2, or compute as follows:

(i) 3t + 2b = 3m + 2b ⇒ t = m.
(ii) 3m + 2b ≥ 2t + 2m; using t = m and b = 1 − m − t = 1 − 2t, this becomes 3t + 2(1 − 2t) ≥ 4t.
∴ t = m ≤ 2/5.

Looking at B1 (or R∗1), we see that this is only possible if player 1 plays B for sure. Player 1 is willing to do this if

u1(B, σ2) ≥ u1(T, σ2) ⇔ l ≤ 2/3, and


u1(B, σ2) ≥ u1(M, σ2) ⇔ c ≤ 2/3.

Since we know that R is not used in any Nash equilibrium, we conclude that (B, αL + (1 − α)C) is a Nash equilibrium for α ∈ [1/3, 2/3].

Since we have checked all possible supports for σ2, we are done.   ♦
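Since this equilibrium component comes from a degenerate game, it is worth verifying directly. A small Python check of the conditions derived above (the encoding and the grid of test points are mine):

```python
# Check that (B, aL + (1-a)C) is a Nash equilibrium exactly when 1/3 <= a <= 2/3,
# using the payoffs from the bimatrix above.
from fractions import Fraction as F

def is_nash(a):
    # Player 2 vs. B: L and C both give 2, R gives 0, so {L, C} are best responses.
    # Player 1 vs. aL + (1-a)C: B gives 2, T gives 3a, M gives 3(1-a).
    uB, uT, uM = 2, 3 * a, 3 * (1 - a)
    return uB >= uT and uB >= uM

assert not is_nash(F(1, 6))
assert all(is_nash(a) for a in [F(1, 3), F(1, 2), F(2, 3)])
assert not is_nash(F(5, 6))
print("the component is exactly a in [1/3, 2/3]")
```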

Example 1.29. Zeeman’s (1980) game.

            2
         A        B        C
    A   0, 0    6, −3    −4, −1
1   B   −3, 6   0, 0     5, 3
    C   −1, −4  3, 5     0, 0

    Since the game is symmetric, both players have the same incentives as a function of the

    opponent’s behavior.

A ≽ B ⇔ 6b − 4c ≥ −3a + 5c ⇔ a + 2b ≥ 3c;
A ≽ C ⇔ 6b − 4c ≥ −a + 3b ⇔ a + 3b ≥ 4c;
B ≽ C ⇔ −3a + 5c ≥ −a + 3b ⇔ 5c ≥ 2a + 3b.

[Figure: the best response regions in the simplex, with vertices A, B, C and boundary points ⅘A + ⅕C, ⁵⁄₇A + ²⁄₇C, and ⅗B + ⅖C.]

Now consider each possible support of player 1’s equilibrium strategy.

{A}     Implies that 2 plays A, and hence that 1 plays A. Equilibrium.
{B}     Implies that 2 plays A, and hence that 1 plays A.
{C}     Implies that 2 plays B, and hence that 1 plays A.
{A, B}  Implies that 2 plays A, and hence that 1 plays A.
{A, C}  This allows many best responses for player 2, but the only one that makes both A and C a best response for 1 is 4/5 A + 1/5 C, which is only a best response for 2 if 1 plays 4/5 A + 1/5 C himself. Equilibrium.
{B, C}  Implies that 2 plays A, B, or a mixture of the two, and hence that 1 plays A.
all     This is only optimal for 1 if 2 plays 1/3 A + 1/3 B + 1/3 C, which 2 is only willing to do if 1 plays 1/3 A + 1/3 B + 1/3 C. Equilibrium.


∴ There are three Nash equilibria: (A, A), (4/5 A + 1/5 C, 4/5 A + 1/5 C), and (1/3 A + 1/3 B + 1/3 C, 1/3 A + 1/3 B + 1/3 C).   ♦
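By symmetry it suffices to check one player’s incentives at each candidate profile. A Python check (the matrix encoding and function names are mine):

```python
# Verify the three symmetric equilibria of Zeeman's game by computing the
# set of pure best responses to each candidate strategy (payoffs from above).
from fractions import Fraction as F

U = [[0, 6, -4], [-3, 0, 5], [-1, 3, 0]]  # row player's payoffs; order A, B, C

def best_responses(sigma):
    u = [sum(U[i][j] * sigma[j] for j in range(3)) for i in range(3)]
    return {i for i in range(3) if u[i] == max(u)}

assert best_responses([1, 0, 0]) == {0}                          # (A, A): strict
assert best_responses([F(4, 5), 0, F(1, 5)]) == {0, 2}           # A, C tie at -4/5
assert best_responses([F(1, 3), F(1, 3), F(1, 3)]) == {0, 1, 2}  # all tie at 2/3
print("all three candidates are equilibria")
```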

Example 1.30. Selten’s (1975) horse.

[Figure: the extensive form. Player 1 chooses A or D; after A, player 2 chooses a or d; player 3, who cannot distinguish the histories D and (A, d), chooses L or R. The induced normal form is given by the two matrices below, one for each choice of player 3.]

3 plays L:
            2
         a          d
1   A   2, 2, 2   0, 0, 1
    D   0, 0, 3   0, 0, 3

3 plays R:
            2
         a          d
1   A   2, 2, 2   0, 3, 3
    D   1, 0, 2   1, 0, 2

Consider all possible mixed strategy supports for players 1 and 2:

(D, d)      Implies that 3 plays L. Since 1 and 2 are also playing best responses, this is a Nash equilibrium.
(D, a)      Implies that 3 plays L, which implies that 1 prefers to deviate to A.
(D, mix)    Implies that 3 plays L, which with 2 mixing implies that 1 prefers to deviate to A.
(A, d)      Implies that 3 plays R, which implies that 1 prefers to deviate to D.
(A, a)      1 and 2 are willing to do this if σ3(L) ≥ 1/3. Since 3 cannot affect his payoffs given the behavior of 1 and 2, these are Nash equilibria.
(A, mix)    2 only mixes if σ3(L) = 1/3; but if 1 plays A and 2 mixes, 3 strictly prefers R – a contradiction.
(mix, a)    Implies that 3 plays L, which implies that 1 strictly prefers A.
(mix, d)    If 2 plays d, then for 1 to be willing to mix, 3 must play L; this leads 2 to deviate to a.
(mix, mix)  Notice that 2 can only affect her own payoffs when 1 plays A. Hence, for 2 to be indifferent, σ3(L) = 1/3. Given this, 1 is willing to mix if σ2(d) = 2/3. Then for 3 to be indifferent, σ1(D) = 4/7. This is a Nash equilibrium.

∴ There are three components of Nash equilibria: (D, d, L); (A, a, σ3) with σ3(L) ≥ 1/3; and (3/7 A + 4/7 D, 1/3 a + 2/3 d, 1/3 L + 2/3 R).   ♦
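The fully mixed profile can be confirmed by brute force over the eight pure strategy profiles. A Python sketch (the dictionary encoding of the payoffs and the function names are mine):

```python
# At (3/7 A + 4/7 D, 1/3 a + 2/3 d, 1/3 L + 2/3 R), each player should be
# exactly indifferent between his two pure strategies. Payoffs from the
# two matrices above: u[(s1, s2, s3)] = (payoff to 1, to 2, to 3).
from fractions import Fraction as F
from itertools import product

u = {('A', 'a', 'L'): (2, 2, 2), ('A', 'd', 'L'): (0, 0, 1),
     ('D', 'a', 'L'): (0, 0, 3), ('D', 'd', 'L'): (0, 0, 3),
     ('A', 'a', 'R'): (2, 2, 2), ('A', 'd', 'R'): (0, 3, 3),
     ('D', 'a', 'R'): (1, 0, 2), ('D', 'd', 'R'): (1, 0, 2)}

sigma = [{'A': F(3, 7), 'D': F(4, 7)},
         {'a': F(1, 3), 'd': F(2, 3)},
         {'L': F(1, 3), 'R': F(2, 3)}]

def expected(player, pure):
    """Expected payoff to `player` from `pure` when the others follow sigma."""
    total = F(0)
    for profile in product(*(s.keys() for s in sigma)):
        if profile[player] == pure:
            prob = F(1)
            for k, s in enumerate(profile):
                if k != player:
                    prob *= sigma[k][s]
            total += prob * u[profile][player]
    return total

assert expected(0, 'A') == expected(0, 'D') == F(2, 3)
assert expected(1, 'a') == expected(1, 'd') == F(6, 7)
assert expected(2, 'L') == expected(2, 'R') == F(16, 7)
print("every player is indifferent, so the mixed profile is an equilibrium")
```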


    1.4.3 Interpretations of Nash equilibrium

Example 1.31. (The Good Restaurant, Bad Restaurant game (Example 1.25))

            2
         g       b
1   G   2, 2    0, 0
    B   0, 0    1, 1

NE: (G, g), (B, b), and (1/3 G + 2/3 B, 1/3 g + 2/3 b).   ♦

Example 1.32. Matching Pennies.

            2
         h        t
1   H   1, −1   −1, 1
    T   −1, 1   1, −1

Unique NE: (1/2 H + 1/2 T, 1/2 h + 1/2 t).   ♦

Nash equilibrium is a minimal condition for self-enforcing behavior.

    This explains why we should not  expect players to behave in a way that is  not  Nash, but

    not why we should expect players to coordinate on a Nash equilibrium.

     Justifications of equilibrium knowledge: why expect correct beliefs? 

    There is no general justification for assuming equilibrium knowledge. But justifications can be

    found in certain specific instances:

(i) Coordination of play by a mediator. If a mediator proposes a Nash equilibrium, no player can benefit from deviating.

    Of course, this only helps if there actually is a mediator.

    (ii) Pre-play agreement.

    But it may be more appropriate to include the “pre-play” communication explicitly

    in the game. (The result is a model of  cheap talk : see Crawford and Sobel (1982)

    and a large subsequent literature.) This raises two new issues: (i) one now needs

    equilibrium knowledge in a larger game, and (ii) the expanded game typically has

    a “babbling” equilibrium in which all communication is ignored.

    (iii) Focal points (Schelling (1960)). Something about the game makes some Nash

    equilibrium the obvious choice about how to behave.

    ex: meeting in NYC at the information booth at Grand Central Station at noon.

    ex: coordinating on the good restaurant.


    Focal points can also be determined abstractly, using (a)symmetry to single out

    certain distinct strategies: see Alós-Ferrer and Kuzmics (2013).

    (iv) Learning / Evolution: If players repeatedly face the same game, they may find their

    way from arbitrary initial behavior to Nash equilibrium.

    Heuristic learning: Small groups of players, typically employing rules that

    condition on the empirical distribution of past play (Young (2004))

    Evolutionary game theory: Large populations of agents using myopic up-

    dating rules (Sandholm (2010))

    In some classes of games (that include the two examples above), many learning

    and evolutionary processes do converge to Nash equilibrium.

    But there is no general guarantee of convergence:

    Many games lead to cycling or chaotic behavior, and in some games any “reason-

    able” dynamic process fails to converge to equilibrium (Shapley (1964), Hofbauer

    and Swinkels (1996), Hart and Mas-Colell (2003)).

    Some games introduced in applications are known to have poor convergence prop-

    erties (Hopkins and Seymour (2002), Lahkar (2011)).

    In fact, evolutionary game theory models do not even support the elimination of 

    strictly dominated strategies in all games (Hofbauer and Sandholm (2011)).

Interpretation of mixed strategy Nash equilibrium: why mix in precisely the way that makes your opponents indifferent?

In the unique equilibrium of Matching Pennies, player 1 is indifferent among all of his mixed strategies. He chooses (1/2, 1/2) because this makes player 2 indifferent. Why should we expect player 1 to behave in this way?

    (i) Deliberate randomization

    Sometimes it makes sense to expect players to deliberately randomize (ex.: poker).

    In zero-sum games (Section 1.6), randomization can be used to ensure that you obtain

    at least the equilibrium payoff regardless of how opponents behave:

In a mixed equilibrium, you randomize to make your opponent indifferent between her strategies. In a zero-sum game, this implies that you are indifferent between

    your opponent’s strategies. This implies that you do not care if your opponent

    finds out your randomization probabilities in advance, as this does not enable her

    to take advantage of you.

    (ii) Mixed equilibrium as equilibrium in beliefs


    We can interpret σ∗i as describing the beliefs that player i’s opponents have about

    player i’s behavior. The fact that σ∗i  is a mixed strategy then reflects the opponents’

    uncertainty about how i will behave, even if i is not actually planning to randomize.

    But as Rubinstein (1991) observes, this interpretation

    “. . . implies that an equilibrium does not lead to a prediction (statistical or other-wise) of the players’ behavior. Any player  i’s action which is a best response givenhis expectation about the other players’ behavior (the other n − 1 strategies) is con-sistent as a prediction for i’s action (this might include actions which are outside thesupport of the mixed strategy). This renders meaningless any comparative staticsor welfare analysis of the mixed strategy equilibrium and brings into question theenormous economic literature which utilizes mixed strategy equilibrium.”

    (iii) Mixed equilibria as time averages of play: fictitious play (Brown (1951))

    Suppose that the game is played repeatedly, and that in each period, each player

    chooses a best response to the time average of past play.

Then in certain classes of games, the time average of each player’s behavior converges to his part in some Nash equilibrium strategy profile.
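As an illustration, here is a minimal fictitious play simulation for Matching Pennies (the initial histories and the tie-breaking rule are arbitrary choices of mine); in two-player zero-sum games the empirical frequencies are known to converge, here to (1/2, 1/2):

```python
# Fictitious play in Matching Pennies: each period both players best-respond
# to the empirical distribution of the opponent's past play. The pure choices
# cycle, but the time averages approach the mixed equilibrium (1/2, 1/2).
A = [[1, -1], [-1, 1]]  # player 1's payoffs; player 2's are the negatives

counts1, counts2 = [1, 0], [0, 1]  # arbitrary one-observation initial histories
for _ in range(10000):
    br1 = max(range(2), key=lambda i: sum(A[i][j] * counts2[j] for j in range(2)))
    br2 = max(range(2), key=lambda j: sum(-A[i][j] * counts1[i] for i in range(2)))
    counts1[br1] += 1
    counts2[br2] += 1

T = sum(counts1)
print(counts1[0] / T, counts2[0] / T)  # both close to 0.5
```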

    (iv) Mixed equilibria as population equilibria (Nash (1950))

    Suppose that there is one population for the player 1 role and another for the player

    2 role, and that players are randomly matched to play the game.

    If half of the players in each population play Heads, no one has a reason to deviate.

    Hence, the mixed equilibrium describes stationary distributions of  pure strategies in

    each population.

    (v) Purification: mixed equilibria as pure equilibria of games with payoff uncertainty

    (Harsanyi (1973))

Example 1.33. Purification in Matching Pennies. Suppose that while the Matching Pennies payoff bimatrix gives players’ approximate payoffs, players’ actual payoffs also contain small terms εH, εh representing a bias toward playing heads, and that each player only knows his own bias. (The formal framework for modeling this situation is called a Bayesian game—see Section 3.)

            2
         h                    t
1   H   1 + εH, −1 + εh    −1 + εH, 1
    T   −1, 1 + εh          1, −1

Specifically, suppose that εH and εh are independent random variables with P(εH > 0) = P(εH < 0) = P(εh > 0) = P(εh < 0) = 1/2. Then it is an equilibrium of the Bayesian game for each player to follow his bias. From the ex ante point of view, the distribution over actions that

this equilibrium generates in the original normal form game is (1/2 H + 1/2 T, 1/2 h + 1/2 t).

    Harsanyi (1973) shows that any mixed equilibrium can be purified in this way. This

    includes not only “reasonable” mixed equilibria like that in Matching Pennies, but also

    “unreasonable” ones like those in coordination games.   ♦
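A quick simulation conveys the ex ante logic of the construction (the bias distribution below, uniform on a small interval, is an illustrative assumption, not Harsanyi’s general setup):

```python
# Each player privately draws a small bias toward Heads, positive or negative
# with probability 1/2 each, and follows its sign. Ex ante, this pure strategy
# behavior reproduces the mixed equilibrium distribution of Matching Pennies.
import random

random.seed(0)
trials = 100_000
heads1 = sum(random.uniform(-0.01, 0.01) > 0 for _ in range(trials))
heads2 = sum(random.uniform(-0.01, 0.01) > 0 for _ in range(trials))
print(heads1 / trials, heads2 / trials)  # both near 1/2
```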

    1.4.4 Existence of Nash equilibrium and structure of the equilibrium set

    Existence and structure theorems for finite normal form games

    When does Nash equilibrium provide us with at least one prediction of play? Always, at

    least in the context of finite nor