218 1 Lecture 3 Part2-Print (transcript)
Lecture 3
Dynamic games of complete information (Part 2)
Outline (part 2)
Questions/comments/observations are always encouraged, at any point during the lecture!
• Repeated games
  – Finite
  – Infinite
Motivation
• Play the same normal-form game over and over
  – each round is called a "stage game"
• Running example: the Prisoner's Dilemma
Repeated games
• A repeated game is designed to examine the logic of long-term interaction
• It captures the idea that a player will take into account the effect of his current behavior on the other players' future behavior, and aims to explain phenomena like cooperation, revenge, and threats
Finitely repeated games
• Everything is straightforward if we repeat a game a finite number of times
• We can write it as an extensive-form game with imperfect information
  – at each round players don't know what the others have done; afterwards they do
  – the overall payoff function is additive: the sum of payoffs in the stage games
Remarks
• Observe that the strategy space is much richer than it was in the normal-form setting
• Repeating a Nash strategy of the stage game in every round is an equilibrium in behavioral strategies (called a stationary strategy)
• We can apply backward induction in these games when the normal-form game has a dominant strategy
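The backward-induction logic in the last bullet can be sketched in Python. This is only an illustrative sketch: the helper names are my own, and the payoff numbers are the Prisoner's Dilemma values used later in the lecture.

```python
# Stage-game payoffs for one player in a symmetric 2x2 game (assumed
# Prisoner's Dilemma numbers).  Keys are (own_action, opponent_action).
PAYOFF = {("C", "C"): 2, ("C", "D"): 0, ("D", "C"): 3, ("D", "D"): 1}

def dominant_action(actions=("C", "D")):
    """Return the strictly dominant stage-game action, or None."""
    for a in actions:
        others = [o for o in actions if o != a]
        if all(PAYOFF[(a, b)] > PAYOFF[(o, b)] for o in others for b in actions):
            return a
    return None

def backward_induction(rounds):
    """Backward induction in the finitely repeated game.

    In the last round only the stage game matters, so the dominant action
    is played there; given that, play in the second-to-last round cannot
    influence the future either, and the argument unravels to round 1.
    """
    return [dominant_action() for _ in range(rounds)]

print(backward_induction(5))  # ['D', 'D', 'D', 'D', 'D']
```

Because D strictly dominates C in the stage game, the unique backward-induction outcome is defection in every round, however long the (finite) horizon.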
Prisoner’s dilemma as repeated game
Infinitely repeated games
Strategies
Nash equilibria with no discounting
Can we get anything other than repetitions of the stage-game equilibrium?
Important points
What outcomes can be achieved as equilibria?
The folk theorem
Folk theorems
• Repeated games
  – structure of the equilibrium strategies (more useful)
  – determine the payoffs that can be sustained by equilibria → conditions under which this set consists of nearly all reasonable payoff profiles (just existence of equilibria)
• "Folk theorems"
  – the focus of most of the formal development in repeated games
  – socially desirable outcomes that cannot be sustained if players are myopic can be sustained if players are foresighted (i.e. have long-term objectives)
Folk Theorems
• When players are patient, repeated play allows virtually any payoff to be an equilibrium outcome
• The set of Nash equilibrium outcomes includes outcomes that are not repetitions of the constituent game's equilibrium
• To support such an outcome, each player must be deterred from deviating by being "punished"
• Punishment may take many forms
  – one possibility is a "trigger strategy": any deviation causes the opponent(s) to carry out a punitive action that lasts forever
Repeated games – some preliminary conclusions
• Repeated games may introduce new equilibria and stimulate cooperation
  – Finitely repeated games (finite horizon, T finite): solved by backward induction; players have incentives to cheat
  – Infinitely repeated games (infinite horizon T)
• Infinite horizon: describes a game where players think the game extends one more period with high probability
• Finite horizon: the terminal date of the game is known
Nash equilibria with discounting
The folk theorem
The folk theorem with discounting
What about subgame perfect NE?
Strategy representation in repeated games: automata

The following game is infinitely repeated with discount factor δ.

|       | C    | D    |
| ----- | ---- | ---- |
| **C** | 2, 2 | 0, 3 |
| **D** | 3, 0 | 1, 1 |
Grim Trigger Strategy
Consider the repeated Prisoner's Dilemma. The strategy prescribes that the player initially cooperates, and continues to do so if both players cooperated at all previous times.

s_i(a^1, . . . , a^T) = C if a^t = (C, C) for all t = 1, . . . , T
s_i(a^1, . . . , a^T) = D otherwise

Note that a player defects if either she or her opponent defected in the past.
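The definition above translates directly into a few lines of Python. A minimal sketch (the function name and the list-of-profiles encoding of histories are my own choices):

```python
def grim_trigger(history):
    """Grim-trigger action given the history observed so far.

    `history` is a list of past action profiles (a_i, a_j).  Cooperate
    initially and as long as every past profile was (C, C); otherwise
    defect, regardless of who deviated.
    """
    if all(profile == ("C", "C") for profile in history):
        return "C"
    return "D"

print(grim_trigger([]))                        # C  (cooperate at the start)
print(grim_trigger([("C", "C"), ("C", "C")]))  # C
print(grim_trigger([("C", "C"), ("D", "C")]))  # D  (own deviation also triggers)
```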
Automaton of Grim Trigger Strategy
• There are two states: C, in which C is chosen, and D, in which D is chosen.
• The initial state (*) is C.
• If the play is not (C,C) in any period, then the state changes to D.
• If the automaton is in state D, it remains there forever.

[Automaton diagram: the initial state (*) C loops on (C,C); the transitions labeled (C,D), (D,C) and (D,D) lead to the absorbing state D.]
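The four bullets map one-to-one onto a tiny state machine. A possible Python rendering (class and method names are illustrative, not from the slides):

```python
class GrimTriggerAutomaton:
    """Two-state automaton for the grim-trigger strategy."""

    def __init__(self):
        self.state = "C"  # the initial state (*) is C

    def action(self):
        # In state C the automaton chooses C; in state D it chooses D.
        return self.state

    def observe(self, profile):
        # Any play other than (C, C) moves the automaton to state D,
        # which is absorbing: once there, it stays forever.
        if profile != ("C", "C"):
            self.state = "D"

m = GrimTriggerAutomaton()
m.observe(("C", "C"))
print(m.action())  # C
m.observe(("C", "D"))
print(m.action())  # D
m.observe(("C", "C"))
print(m.action())  # D  (state D is absorbing)
```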
More formalism!
One-step deviation principle
Central questions
1. If players are patient, can we get cooperative outcomes, i.e. outcomes better than the stage-game NE for all players?
2. If players are patient, what else can we get?
Grim-Trigger is SGPE
Suppose that both players adopt the grim-trigger strategy.

There are two sets of histories: those for which the grim-trigger strategy prescribes that the players play (C,C), and those for which it prescribes that they play (D,D).

In the first set of histories, if player i plays grim-trigger, then the outcome is (C,C) in every period with payoffs (2, 2, . . .), whose discounted average is 2.

If i deviates only once, she plays D. Then she reverts to the grim-trigger strategy, which prescribes playing D at all subsequent periods.
• The opponent, playing the grim-trigger strategy, plays D forever as a consequence of i's one-shot deviation.
• The one-shot deviation yields the stream of payoffs (3, 1, 1, . . .) with discounted average
  (1 − δ)[3 + δ + δ² + δ³ + · · ·] = 3(1 − δ) + δ.
• Thus player i cannot increase her payoff by deviating if and only if 2 ≥ 3(1 − δ) + δ, i.e. δ ≥ 1/2.
• In the second set of histories, if player i plays grim-trigger, then the outcome is (D,D) in every period with payoffs (1, 1, . . .), whose discounted average is 1.
• If i deviates only once, she plays C. Then she reverts to the grim-trigger strategy, which prescribes playing D at all subsequent periods.
• The opponent, playing the grim-trigger strategy, plays D forever as a consequence of i's one-shot deviation.
• The one-shot deviation yields the stream of payoffs (0, 1, 1, . . .) with discounted average
  (1 − δ)[0 + δ + δ² + δ³ + · · ·] = δ.
• Player i cannot increase her payoff by deviating: 1 ≥ δ always holds.
• We conclude that if δ ≥ 1/2, then the strategy pair in which each player plays the grim-trigger strategy is a subgame-perfect equilibrium of the infinitely repeated Prisoner's Dilemma.
Tit-for-Tat
• The player initially cooperates.
• At subsequent rounds, she plays the action played by the opponent at the previous round.

s_i(a^1, . . . , a^T) = C if a^T_j = C
s_i(a^1, . . . , a^T) = D if a^T_j = D

(Here a^T_j denotes the opponent's action in the previous round.)

[Automaton diagram: the initial state (*) is C; the state moves to D after any profile in which the opponent plays D, labeled (·,D), and back to C after any profile in which the opponent plays C, labeled (·,C).]
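Two tit-for-tat players can be simulated from an arbitrary first-round profile, which makes the off-path alternation visible. A sketch under the usual PD payoffs (the simulation harness and names are my own):

```python
def tit_for_tat(own_history, opp_history):
    """Cooperate in the first round; afterwards copy the opponent's
    previous action."""
    return "C" if not opp_history else opp_history[-1]

def play(first_profile, rounds=6):
    """Simulate two tit-for-tat players, forcing the first-round profile
    so that off-path histories can be explored."""
    h1, h2 = [first_profile[0]], [first_profile[1]]
    for _ in range(rounds - 1):
        a1 = tit_for_tat(h1, h2)
        a2 = tit_for_tat(h2, h1)
        h1.append(a1)
        h2.append(a2)
    return list(zip(h1, h2))

print(play(("C", "C")))  # (C, C) repeats forever
print(play(("D", "C")))  # alternates (D,C), (C,D), (D,C), ...
```

The alternating (D,C)/(C,D) cycle after a single defection is exactly the payoff stream (3, 0, 3, 0, . . .) that appears in the equilibrium analysis.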
Tit for Tat is SGPE
• Suppose that both players adopt the tit-for-tat strategy.
• There are four sets of histories. They prescribe, respectively, (C,C), (C,D), (D,C) and (D,D).
• In the first set of histories, if player i plays tit-for-tat, then the outcome is (C,C) in every period with payoffs (2, 2, . . .), whose discounted average is 2.
• If i deviates only once, she plays D. Then she reverts to tit-for-tat. Given that the opponent plays tit-for-tat, the induced play is {(D,C),(C,D),(D,C),(C,D), . . .}, with payoffs (3, 0, 3, 0, . . .).
• Hence player i does not deviate if:
  2 ≥ (1−δ)[3 + 0·δ + 3δ² + 0·δ³ + · · ·] = 3(1−δ)/(1−δ²) = 3/(1+δ),
  that is to say: δ ≥ 1/2.
• In the set of histories prescribing (C,D), if players play tit-for-tat, then the outcome is {(C,D),(D,C), . . .}, which yields player i payoffs (0, 3, 0, 3, . . .).
• If i deviates only once, she plays D. Then she reverts to tit-for-tat. Given that the opponent plays tit-for-tat, the induced play is (D,D) forever with payoff 1.
• Hence player i does not deviate if:
  (1−δ)[0 + 3δ + 0·δ² + 3δ³ + · · ·] = 3δ(1−δ)/(1−δ²) = 3δ/(1+δ) ≥ 1,
  that is to say: δ ≥ 1/2.
• In the set of histories prescribing (D,C), if players play tit-for-tat, then the outcome is {(D,C),(C,D), . . .}, which yields player i payoffs (3, 0, 3, 0, . . .). If i deviates only once, the induced play is (C,C) forever with payoff 2. Player i does not deviate if:
  (1−δ)[3 + 0·δ + 3δ² + 0·δ³ + · · ·] = 3(1−δ)/(1−δ²) = 3/(1+δ) ≥ 2,
  that is to say: δ ≤ 1/2.
• In the set of histories prescribing (D,D), if players play tit-for-tat, then the outcome is (D,D) forever, with payoff 1.
• If i deviates only once, the induced play is {(C,D),(D,C), . . .}, which yields player i payoffs (0, 3, 0, 3, . . .).
• Player i does not deviate if:
  1 ≥ (1−δ)[0 + 3δ + 0·δ² + 3δ³ + · · ·] = 3δ(1−δ)/(1−δ²) = 3δ/(1+δ),
  that is to say: δ ≤ 1/2.
• We conclude that the strategy pair in which each player plays the tit-for-tat strategy is a subgame-perfect equilibrium of the infinitely repeated Prisoner's Dilemma if and only if δ = 1/2.
• This underlines the inherent fragility of tit-for-tat: it works only in a knife-edge case.
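The four one-shot-deviation conditions can also be verified numerically. A sketch (names mine, infinite sums truncated) that confirms all four hold simultaneously only at δ = 1/2:

```python
def disc_avg(cycle, delta, horizon=20_000):
    """Discounted average (1 - delta) * sum over t of delta**t * payoff_t
    of a periodic payoff stream, truncated at `horizon` periods."""
    return (1 - delta) * sum((delta ** t) * cycle[t % len(cycle)]
                             for t in range(horizon))

def tft_is_sgpe(delta, eps=1e-9):
    """One condition per class of histories: on-path discounted average
    must weakly exceed the one-shot-deviation discounted average."""
    return (disc_avg([2], delta) >= disc_avg([3, 0], delta) - eps       # at (C,C)
            and disc_avg([0, 3], delta) >= disc_avg([1], delta) - eps   # at (C,D)
            and disc_avg([3, 0], delta) >= disc_avg([2], delta) - eps   # at (D,C)
            and disc_avg([1], delta) >= disc_avg([0, 3], delta) - eps)  # at (D,D)

print([tft_is_sgpe(d) for d in (0.4, 0.5, 0.6)])
# [False, True, False]: only the knife-edge delta = 1/2 passes
```

The (C,C) and (D,C) conditions pull in opposite directions (δ ≥ 1/2 versus δ ≤ 1/2), which is exactly why only their intersection survives.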
Other strategies for repeated games