vince's game theory course documentationvince’s game theory course documentation strategy...

60
Vince’s Game theory course Documentation Vince Knight Mar 02, 2020

Upload: others

Post on 17-Apr-2020

9 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory courseDocumentation

Vince Knight

Mar 02, 2020

Page 2: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a
Page 3: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Contents

1 Preparation 3

2 Class meetings 52.1 00 Introduction to class . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52.2 01 Strategies and utilities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62.3 02 Rationalisation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82.4 03 Best responses and Nash equilibrium . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102.5 04 Support enumeration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122.6 05 Best response polytopes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162.7 06 Lemke Howson Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212.8 07 Repeated games . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252.9 08 Prisoners Dilemma . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 272.10 09 Evolutionary dynamics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 292.11 10 Evolutionary game theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 312.12 11 Moran Processes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 332.13 12 Contemporary research . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

3 Optional meetings 413.1 AA- Revisiting Polytopes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 413.2 AB - Revisiting Repeated games . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 453.3 AC - Coursework peer review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 463.4 AD - Moran process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 473.5 AE - Support enumeration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 473.6 AF- Revisiting Lemke-Howson . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50

4 About the materials 554.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 554.2 How this is done . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 554.3 Testing all the code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56

i

Page 4: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

ii

Page 5: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

The class notes are available here: vknight.org/gt/.

Contents 1

Page 6: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

2 Contents

Page 7: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

CHAPTER 1

Preparation

Update hackmd link if necessary.

3

Page 8: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

4 Chapter 1. Preparation

Page 9: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

CHAPTER 2

Class meetings

Here are my notes for each class meeting.

2.1 00 Introduction to class

2.1.1 Corresponding chapters

• Introduction to the course

• Normal form games

Duration: 50 minutes

2.1.2 Objectives

• Outline timetable and approach to class.

• Discuss feedback from previous years.

• Make students aware of learning resources available to them.

• Make students aware of assessment criteria

• Make students aware of coding aspect of course (both in assessment and in content)

• Introduce games.

2.1.3 Notes

Timetable and approach

Discuss timetable

5

Page 10: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

I will not lecture

This course is taught in a flipped learning environment. All notes, homework, exercises are available to you.

In class we will work in parallel to the notes but not directly following them. You will be expected to play an activerole in your learning:

• Activities that demonstrate concepts;

• Discussions

Feedback

Show feedback from previous year.

Learning and assessment

All resources can be found at http://vknight.org/gt/

Discuss the markdown version of notes (available on hackmd.io).

Seeking assistance

Details on website.

Programming

The course notes will include code throughout, this is both to show you how to use code to study mathematics (anexemplar of modern mathematics) but also to illustrate the ideas.

Details about the programming involved can be found at http://vknight.org/gt/other/programming/

Introduction to games

Using http://vknight.org/school-outreach/assets/playing-games/tex/form.pdf

1. Play two thirds of the average game

2. Rationalise

3. Play two thirds of the average game

Discuss notes on Normal form games.

2.2 01 Strategies and utilities

2.2.1 Corresponding chapters

• Calculating utilities of strategies

Duration: 50 minutes

6 Chapter 2. Class meetings

Page 11: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

2.2.2 Objectives

• Define mixed strategies

• Understand utility calculation for mixed strategies

• Understand utility calculation as a linear algebraic construct

2.2.3 Notes

Utility calculations

Use matching pennies form have students play in pairs. Following each game:

• Ask how many people won?

• Ask why they won?

Mixed strategies

Look at definition for mixed strategies.

Consider:

𝐴 =

(︂2 −2−1 1

)︂𝐵 =

(︂−2 21 −1

)︂Let us assume we have 𝜎𝑟 = (.3, .7) and 𝜎𝑐 = (.1, .9):

𝑢𝑟(𝜎𝑟, 𝜎𝑐) = 0.3× 0.1× 2 + 0.3× 0.9× (−2) + 0.7× 0.1× (−1) + 0.7× 0.9× 1 = 0.08

because the game is zero sum we immediately know:

𝑢𝑐(𝜎𝑟, 𝜎𝑐) = −0.08

This corresponds to the linear algebraic multiplication:

𝑢𝑟(𝜎𝑟, 𝜎𝑐) = 𝜎𝑟𝐴𝜎𝑇𝑐

𝑢𝑐(𝜎𝑟, 𝜎𝑐) = 𝜎𝑟𝐵𝜎𝑇𝑐

(Go through this on the board, make sure students are comfortable.)

This can be done straightforwardly using numpy:

>>> import numpy as np>>> A = np.array([[2, -2], [-1, 1]])>>> B = np.array([[-2, 2], [1, -1]])>>> sigma_r = np.array([.3, .7])>>> sigma_c = np.array([.1, .9])>>> np.dot(sigma_r, np.dot(A, sigma_c)), np.dot(sigma_r, np.dot(B, sigma_c))(0.079..., -0.079...)

2.2. 01 Strategies and utilities 7

Page 12: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

Strategy profiles as coordinates on a game

One way to thing of any game (𝐴,𝐵) ∈ R𝑚×𝑛2 is as a mapping from the set of strategies [0, 1]𝑚R × [0, 1]𝑛R to R2: theutility space.

Equivalently, if 𝑆𝑟, 𝑆𝑐 are the strategy spaces of the row/column player:

(𝐴,𝐵) : 𝑆𝑟 × 𝑆𝑐 → R2

We can use games defined in nashpy in that way:

>>> import nashpy as nash>>> game = nash.Game(A, B)>>> game[sigma_r, sigma_c]array([ 0.08, -0.08])

2.3 02 Rationalisation

2.3.1 Corresponding chapters

• Rationalisation

Duration: 50 minutes

2.3.2 Objectives

• Be able to identify dominated strategies

• Understand notion of Common knowledge of rationality

2.3.3 Notes

Rationalisation of strategies

Identify two volunteers and play a sequence of zero sum games where they play as a team against me. The group isthe row player.

𝐴 =

(︂1 −1−1 2

)︂(No dominated strategy)

𝐴 =

(︂1 2−1 2

)︂(First column weakly dominates second column)

𝐴 =

(︂1 −1−1 −3

)︂• First row strictly dominates second row

• Second column strictly dominates first column

8 Chapter 2. Class meetings

Page 13: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

𝐴 =

(︂2 2−1 2

)︂• First row weakly dominates second row

• First column weakly dominates second column

𝐴 =

⎛⎝−1 2 1−2 −2 11 1 −1

⎞⎠• First row dominates second row

• First column dominates second column

Now pit the two players against each other, the utilities represent the share of the total amount of chocolates/sweetsgathered so far:

𝐴 =

(︂1 01.5 .5

)︂𝐵 =

(︂1 1.50 .5

)︂Capture all of the above (on the white board) and discuss each action and why they were taken.

Ask students to repeat the game against each other in groups of two (use “days of doing the dishes perhaps?”).

Iterated elimination of dominated strategies

As a class work through the following example.

𝐴 =

⎛⎝2 51 27 3

⎞⎠ 𝐵 =

⎛⎝0 36 10 1

⎞⎠1. First row dominates second row

𝐴 =

(︂2 57 3

)︂𝐵 =

(︂0 30 1

)︂

2. Second column dominates first column

𝐴 =

(︂27

)︂𝐵 =

(︂31

)︂

3. First row dominates third row. Thus the rationalised behaviour is (𝑟1, 𝑐2).

Now return to the last example played as a pair:

𝐴 =

⎛⎝−1 2 1−2 −2 11 1 −1

⎞⎠ 𝐵 =

⎛⎝ 1 −2 −12 2 −1−1 −1 1

⎞⎠1. The first row/column weakly dominate the second row/column:

𝐴 =

(︂−1 11 −1

)︂𝐵 =

(︂1 −1−1 1

)︂

There is nothing further that we can do here.

2.3. 02 Rationalisation 9

Page 14: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

2.4 03 Best responses and Nash equilibrium

2.4.1 Corresponding chapters

• Best responses

Duration: 100 minutes

2.4.2 Objectives

• Define best responses

• Identify best responses in pure strategies

• Identify best responses against mixed strategies

• Theorem: best response condition

• Definition of Nash equilibria

2.4.3 Notes

Discuss best response in pure strategies.

Best response against mixed strategies

Use best responses against mixed strategies have students play against a mixed strategy:

>>> import random>>> random.seed(0) # Don't seed in class>>> ["D", "C"][random.random() < 0.3] # 30 chance of Cooperating'D'

Discuss the definition of a best response. Identify best responses for the game considered:

𝐴 =

(︂2 −2−1 1

)︂𝐵 =

(︂−2 21 −1

)︂Consider the best responses against a mixed strategy:

• Assume 𝜎𝑟 = (𝑥, 1− 𝑥)

• Assume 𝜎𝑐 = (𝑦, 1− 𝑦)

We have:

𝐴𝜎𝑇𝑐 =

(︂4𝑦 − 21− 2𝑦

)︂𝜎𝑟𝐵 =

(︀1− 3𝑥 3𝑥− 1

)︀Here is the code to do this calculation with sympy:

>>> import sympy as sym>>> import numpy as np>>> x, y = sym.symbols('x, y')>>> A = sym.Matrix([[2, -2], [-1, 1]])>>> B = - A

(continues on next page)

10 Chapter 2. Class meetings

Page 15: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

(continued from previous page)

>>> sigma_r = sym.Matrix([[x, 1-x]])>>> sigma_c = sym.Matrix([y, 1-y])>>> A * sigma_c, sigma_r * B(Matrix([[ 4*y - 2],[-2*y + 1]]), Matrix([[-3*x + 1, 3*x - 1]]))

Plot these two things:

>>> import matplotlib.pyplot as plt>>> ys = [0, 1]>>> row_us = [[(A * sigma_c)[i].subs({y: val}) for val in ys] for i in range(2)]>>> plt.plot(ys, row_us[0], label="$(A\sigma_c^T)_1$")[<matplotlib...>]>>> plt.plot(ys, row_us[1], label="$(A\sigma_c^T)_2$")[<matplotlib...>]>>> plt.xlabel("$\sigma_c=(y, 1-y)$")>>> plt.title("Utility to player 1")>>> plt.legend();

>>> xs = [0, 1]>>> row_us = [[(sigma_r * B)[j].subs({x: val}) for val in xs] for j in range(2)]>>> plt.plot(ys, row_us[0], label="$(\sigma_rB)_1$")[<matplotlib...>]>>> plt.plot(ys, row_us[1], label="$(\sigma_rB)_2$")[<matplotlib...>]>>> plt.xlabel("$\sigma_r=(x, 1-x)$")>>> plt.title("Utility to column player")>>> plt.legend();

Conclude:

𝜎*𝑟 =

⎧⎪⎨⎪⎩(1, 0), if 𝑦 > 1/2

(0, 1), if 𝑦 < 1/2

indifferent, if 𝑦 = 1/2

𝜎*𝑐 =

⎧⎪⎨⎪⎩(0, 1), if 𝑥 > 1/3

(1, 0), if 𝑥 < 1/3

indifferent, if 𝑥 = 1/3

Some examples:

• If 𝜎𝑟 = (2/3, 1/3) then 𝜎*𝑐 = (0, 1).

• If 𝜎𝑟 = (1/3, 2/3) then any strategy is a best response.

Discuss best response condition theorem and proof.

This gives a finite condition that needs to be checked. To find the best response against 𝜎𝑐 we potentially would needto check all infinite possibilities alternatives to 𝜎*

𝑟 . Now we simply need to check the values of the pure strategiesagainst 𝜎𝑐:

• Either there will be a single pure best response;

• There will be multiple pure strategies for which the row player is indifferent.

Return to previous example:if 𝜎𝑟 = (1/3, 2/3) then (𝜎𝑟𝐵) = (0, 0) thus (𝜎𝑟𝐵)𝑗 = 0 for all 𝑗.

(𝜎𝑟, 𝜎𝑐) = ((1/3, 1/2), (1/2, 1/2)) is a pair of best responses.

Discuss definition of Nash equilibria.

Explain how the best response condition theorem can be used to find NE.

2.4. 03 Best responses and Nash equilibrium 11

Page 16: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

• All possible supports (strategies that are played with positive probabilities) can be checked.

• All pure strategies must have maximum and equal payoff.

2.5 04 Support enumeration

2.5.1 Corresponding chapters

• Support enumeration

Duration: 100 minutes

2.5.2 Objectives

• Define of support of a strategy

• Define nondegenerate games

• Describe and use suport enumeration

2.5.3 Notes

Play a knockout RPSLS tournament

Tell students to create groups of 4 or 8 and play a knockout RPSLS tournament.

Explain rules and show:

12 Chapter 2. Class meetings

Page 17: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

Following the tournaments: have discussion about how winner won etc. . .

Explain that we will study this game using Game Theory:

𝐴 =

⎛⎜⎜⎜⎜⎝0 −1 1 1 −11 0 −1 −1 1−1 1 0 1 −1−1 1 −1 0 11 −1 1 −1 0

⎞⎟⎟⎟⎟⎠Then go over chapter 5:

• Define support of a strategy (relate to tournament previously played). For example the strategy (0, 1/2, 0, 0, 1/2)has support {2, 5}:

>>> import numpy as np>>> sigma = np.array([0, 1/2, 0, 0, 1/2])>>> np.where(sigma > 0)(array([1, 4]),)

• Discuss non degenerate games and show how RPSLS is in fact a degenerate game:

>>> A = np.array([[0, -1, 1, 1, -1],... [1, 0, -1, -1, 1],... [-1, 1, 0, 1, -1],... [-1, 1, -1, 0, 1],... [1, -1, 1, -1, 0]])>>> sigma_c = np.array([1, 0, 0, 0, 0])>>> np.dot(A, sigma_c) # Two element give maxarray([ 0, 1, -1, -1, 1])

• Discuss support enumeration algorithm. Theoretically could be used for RPSLS but would need to considersupports of unequal size. For the purpose of the illustration consider RPS:

𝐴 =

⎛⎝ 0 −1 11 0 −1−1 1 0

⎞⎠Apply the algorithm.

1. We see there are no pairs of best response, so consider 𝑘 ∈ {2, 3}.

2. We have the following set of utilities to consider:

1. 𝐼 = {1, 2}

1. 𝐽 = {1, 2}

2. 𝐽 = {1, 3}

3. 𝐽 = {2, 3}

2. 𝐼 = {1, 3}

1. 𝐽 = {1, 2}

2. 𝐽 = {1, 3}

3. 𝐽 = {2, 3}

3. 𝐼 = {2, 3}

1. 𝐽 = {1, 2}

2.5. 04 Support enumeration 13

Page 18: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

2. 𝐽 = {1, 3}

3. 𝐽 = {2, 3}

4. 𝐼 = 𝐽 = {1, 2, 3}

3. Now we consider (some not all as they are mainly the same) of the corresponding linear equations.

1. 𝐼 = {1, 2}

1. 𝐽 = {1, 2}

0𝜎𝑟1 − 𝜎𝑟2 = 𝜎𝑟1 + 0𝜎𝑟2

−𝜎𝑟2 = 𝜎𝑟1

0𝜎𝑐1 − 𝜎𝑐2 = 𝜎𝑐1 + 0𝜎𝑐2

𝜎𝑐1 = −𝜎𝑐2

2. 𝐽 = {1, 3}

0𝜎𝑟1 − 𝜎𝑟2 = −𝜎𝑟1 + 𝜎𝑟2

𝜎𝑟1 = 2𝜎𝑟2

0𝜎𝑐1 + 𝜎𝑐3 = 𝜎𝑐1 − 𝜎𝑐3

𝜎𝑐1 = 2𝜎𝑐3

2. 𝐽 = {2, 3}

𝜎𝑟1 + 0𝜎𝑟2 = −𝜎𝑟1 + 𝜎𝑟2

2𝜎𝑟1 = 𝜎𝑟2

−𝜎𝑐2 + 𝜎𝑐3 = 0𝜎𝑐2 − 𝜎𝑐3

𝜎𝑐2 = 2𝜎𝑐3

2. 𝐼 = {1, 3} Similarly.

3. 𝐼 = {2, 3} Similarly.

4. 𝐼 = 𝐽 = {1, 2, 3}

In this case we have:

−𝜎𝑟2 + 𝜎𝑟3 = 𝜎𝑟1 − 𝜎𝑟3

𝜎𝑟1 − 𝜎𝑟3 = −𝜎𝑟1 + 𝜎𝑟2

which has solution:

𝜎𝑟1 = 𝜎𝑟2 = 𝜎𝑟3

Similarly:

−𝜎𝑐2 + 𝜎𝑐3 = 𝜎𝑐1 − 𝜎𝑐3

𝜎𝑐1 − 𝜎𝑐3 = −𝜎𝑐1 + 𝜎𝑐2

which has solution:

𝜎𝑐1 = 𝜎𝑐2 = 𝜎𝑐3

14 Chapter 2. Class meetings

Page 19: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

4. Now we consider which of those supports give valid mixed strategies:

1. 𝐼 = {1, 2}

1. 𝐽 = {1, 2}

𝜎𝑟 = (𝑘,−𝑘, 0)

which is not possible

2. 𝐽 = {1, 3}

𝜎𝑟 = (2/3, 1/3, 0)

𝜎𝑐 = (2/3, 0, 1/3)

3. 𝐽 = {2, 3}

𝜎𝑟 = (1/3, 2/3, 0)

𝜎𝑐 = (0, 2/3, 1/3)

2. 𝐼 = {1, 3} Similarly.

3. 𝐼 = {2, 3} Similarly.

4. 𝐼 = 𝐽 = {1, 2, 3}

𝜎𝑟 = (1/3, 1/3, 1/3)

𝜎𝑐 = (1/3, 1/3, 1/3)

5. The final step is to check the best response condition:

1. 𝐼 = {1, 2}

2. 𝐽 = {1, 3}

𝐴𝜎𝑇𝑐 =

⎛⎝ 1/31/3−2/3

⎞⎠Thus 𝜎𝑟 is a best response to 𝜎𝑐.

𝜎𝑟𝐵 = (−1/3, 2/3,−1/3)

Thus 𝜎𝑐 is not a best response to 𝜎𝑟.

3. 𝐽 = {2, 3}

𝐴𝜎𝑇𝑐 =

⎛⎝−1/3−1/32/3

⎞⎠Thus 𝜎𝑟 is not a best response to 𝜎𝑐.

𝜎𝑟𝐵 = (−2/3, 1/3, 1/3)

Thus 𝜎𝑐 is a best response to 𝜎𝑟.

2. 𝐼 = {1, 3} Similarly.

2.5. 04 Support enumeration 15

Page 20: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

3. 𝐼 = {2, 3} Similarly.

4. 𝐼 = 𝐽 = {1, 2, 3}

𝐴𝜎𝑇𝑐 =

⎛⎝000

⎞⎠Thus 𝜎𝑟 is a best response to 𝜎𝑐.

𝜎𝑟𝐵 = (0, 0, 0)

Thus 𝜎𝑐 is a best response to 𝜎𝑟.

We can confirm all of this using nashpy:

>>> import nashpy as nash>>> A = np.array([[0, -1, 1],... [1, 0, -1],... [-1, 1, 0]])>>> rps = nash.Game(A)>>> list(rps.support_enumeration())[(array([0.333..., 0.333..., 0.333...]), array([0.333..., 0.333..., 0.333...]))]

Note that it can be computationally expensive to find all equilibria however nashpy can be used to find a Nashequilibrium by finding the first one:

>>> next(rps.support_enumeration())(array([0.333..., 0.333..., 0.333...]), array([0.333..., 0.333..., 0.333...]))

Discuss Nash’s theorem briefly. Highlight how that can seem contradictory for the output of nashpy (using supportenumeration) for the degenerate game of the notes. However, that won’t always be the case:

>>> A = np.array([[0, -1, 1, 1, -1],... [1, 0, -1, -1, 1],... [-1, 1, 0, 1, -1],... [-1, 1, -1, 0, 1],... [1, -1, 1, -1, 0]])>>> rpsls = nash.Game(A)>>> list(rpsls.support_enumeration())[(array([0.2, 0.2, 0.2, 0.2, 0.2]), array([0.2, 0.2, 0.2, 0.2, 0.2]))]

Some details about the proof:

• Proved in a 19 page thesis! (2 pages of appendices)

• Noble prize for economics

• Watch a beautiful mind

2.6 05 Best response polytopes

2.6.1 Corresponding chapters

• Best response polytopes

Duration: 100 minutes

16 Chapter 2. Class meetings

Page 21: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

2.6.2 Objectives

• Define the best response polytopes (using both definitions: convex hull and halfspaces).

• Labelling of vertices

• Equivalence between equilibrium and fully labeled pair

• Vertex enumeration

2.6.3 Notes

Defining best response polytopes

Discuss what an average is. Place this in the context of the line between two points. Weighted averages of two pointsare just a line. So an average is just a probability distribution. Expand this notion to the average over two points. Howwould you take the weighted average between three points 𝑥, 𝑦, 𝑧:

𝜆1𝑥+ 𝜆2𝑦 + 𝜆3𝑧

This corresponds to something like:

2.6. 05 Best response polytopes 17

Page 22: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

Explain how another definition for that space would be to draw inequalities on our variables. Give definition of bestresponse polytopes for a game. Illustrate this with the battle of the sexes game (scaled):

𝐴 =

(︂3 10 2

)︂𝐵 =

(︂2 10 3

)︂Go over the definition of the best response polytopes.

The row best response polytope is given by:

𝒫 = {𝑥 ∈ R𝑚 | 𝑥 ≥ 0;𝑥𝐵 ≤ 1}

This corresponds to the following inequalities:

• 𝑥1 ≥ 0

• 𝑥2 ≥ 0

• 2𝑥1 ≤ 1

• 𝑥1 + 3𝑥2 ≤ 1

From this we can identify the vertices:

• (𝑥1, 𝑥2) = (0, 0)

• (𝑥1, 𝑥2) = (1/2, 0)

• (𝑥1, 𝑥2) = (0, 1/3)

• (𝑥1, 𝑥2) = (1/2, 1/6)

This can be confirmed using nashpy:

>>> import numpy as np>>> import nashpy as nash>>> B = np.array([[2, 1], [0, 3]])>>> halfspaces = nash.polytope.build_halfspaces(B.transpose())>>> for v, l in nash.polytope.non_trivial_vertices(halfspaces):... print(v)[0.5 0. ][0. 0.333...][0.5 0.166...]

The column best response polytope is given by:

𝒬 = {𝑦 ∈ R𝑚 | 𝐴𝑦 ≤ 1; 𝑦 ≥ 0}

This corresponds to the following inequalities:

• 3𝑦1 + 𝑦2 ≤ 1

• 2𝑦2 ≤ 1

• 𝑦1 ≥ 0

• 𝑦2 ≥ 0

From this we can identify the vertices:

• (𝑦1, 𝑦2) = (0, 0)

• (𝑦1, 𝑦2) = (1/3, 0)

• (𝑦1, 𝑦2) = (0, 1/2)

18 Chapter 2. Class meetings

Page 23: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

• (𝑦1, 𝑦2) = (1/6, 1/2)

Confirmed:

>>> import numpy as np>>> A = np.array([[3, 1], [0, 2]])>>> halfspaces = nash.polytope.build_halfspaces(A)>>> for v, l in nash.polytope.non_trivial_vertices(halfspaces):... print(v)[0.333... 0. ][0. 0.5][0.1666... 0.5 ]

Pair activity

Ask everyone to draw these two polytopes.

Now describe how we label the vertices: using the same ordering as the inequalities (starting at 0), a vertex has thelabel corresponding to that inequality if it is a strict equality.

𝒫:

𝒬:

2.6. 05 Best response polytopes 19

Page 24: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

Explain that what these polytopes represent is the scaled strategies when players maximum utilities are 1. So given,the action of an opponent, if the players’ utility is 1 they are playing a best response.

Ask each person in a pair to be the row or the column player.

Have the row player pick a vertex, then identify the strategy and it’s best response and then pick the correspondingvertex.

For example:

Row player picks (0, 1/3) which has labels (0, 3) and corresponds to the strategy (0, 1). The best response to thesecond row is the first column which corresponds to vertex (0, 1/2) which has labels (1, 2).

Remind ourselves why these are the labels? (Refer to the defining properties of the polytopes).

Repeat with other vertices.

Aim to identify strategies that are best responses to each other:

• 𝜎𝑟 = (0, 1) and 𝜎𝑟 = (0, 1)

• 𝜎𝑟 = (3/4, 1/4) and 𝜎𝑟 = (1/4, 3/4)

• 𝜎𝑟 = (1, 0) and 𝜎𝑟 = (1, 0)

Discuss how this relates to the labels again.

Finally show how this is implemented in nashpy:

20 Chapter 2. Class meetings

Page 25: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

>>> A = np.array([[1, -1], [-1, 1]])>>> matching_pennies = nash.Game(A)>>> for eq in matching_pennies.vertex_enumeration():... print(eq)(array([0.5, 0.5]), array([0.5, 0.5]))

If there is time use support enumeration to compare.

2.7 06 Lemke Howson Algorithm

2.7.1 Corresponding chapters

• The Lemke Howson algorithm

2.7.2 Objectives

• Describe the Lemke Howson algorithm graphically

• Describe integer pivoting

Duration: 50 minutes

2.7.3 Notes

Describe the Lemke Howson algorithm

Using the polytopes for the matching pennies game:

𝒫:

𝒬:

2.7. 06 Lemke Howson Algorithm 21

Page 26: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

illustrate the Lemke Howson algorithm:

• ((0, 0), (0, 0)) have labels {0, 1}, {2, 3}. Drop 0 in 𝒫 .

• → ((1/2, 0), (0, 0)) have labels {1, 2}, {2, 3}. Drop 2 in 𝒬.

• → ((1/2, 0), (1/3, 0)) have labels {1, 2}, {0, 3}. Have an equilibria.

Try varying other initial labels. Is it possible to arrive at the mixed nash equilibrium?

Work through integer pivoting

Consider Rock Paper Scissors:

𝐴 =

⎛⎝ 0 −1 11 0 −1−1 1 0

⎞⎠ 𝐵 =

⎛⎝ 0 1 −1−1 0 11 −1 0

⎞⎠We see that drawing this would be a pain. Let us build the tableaux, but let us first scale our payoffs so that all variablesare positive (let us add 2 to all the utilities):

𝐴 =

⎛⎝2 1 33 2 11 3 2

⎞⎠ 𝐵 =

⎛⎝2 3 11 2 33 1 2

⎞⎠

22 Chapter 2. Class meetings

Page 27: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

This gives the following tableaux:

𝑇𝑟 =

⎛⎝2 1 3 1 0 0 13 2 1 0 1 0 11 3 2 0 0 1 1

⎞⎠Following along in numpy:

>>> import numpy as np>>> row_tableau = np.array([[2, 1, 3, 1, 0, 0, 1],... [3, 2, 1, 0, 1, 0, 1],... [1, 3, 2, 0, 0, 1, 1]])

and:

𝑇𝑐 =

⎛⎝1 0 0 2 1 3 10 1 0 3 2 1 10 0 1 1 3 2 1

⎞⎠Following along in numpy:

>>> col_tableau = np.array([[1, 0, 0, 2, 1, 3, 1],... [0, 1, 0, 3, 2, 1, 1],... [0, 0, 1, 1, 3, 2, 1]])

The labels of 𝑇𝑟 are {0, 1, 2} and the labels of 𝑇𝑐 are {3, 4, 5}.

Let us drop label 0, so we pivot the first column of 𝑇𝑟. The minimum ratio test gives: 1/3 < 1/2 < 1/1 so we pivoton the second row giving:

𝑇𝑟 =

⎛⎝0 −1 7 3 −2 0 13 2 1 0 1 0 10 7 5 0 −1 3 2

⎞⎠Code:

>>> import nashpy as nash>>> dropped_label = nash.integer_pivoting.pivot_tableau(row_tableau,... column_index=0)>>> row_tableauarray([[ 0, -1, 7, 3, -2, 0, 1],

[ 3, 2, 1, 0, 1, 0, 1],[ 0, 7, 5, 0, -1, 3, 2]])

This has labels {1, 2, 4} so we need to drop label 4 by pivoting the fifth column of 𝑇𝑐. The minimum ratio test impliesthat we pivot on the third row.

𝑇𝑐 =

⎛⎝3 0 −1 5 0 7 20 3 −2 7 0 −1 10 0 1 1 3 2 1

⎞⎠Code:

>>> dropped_label = nash.integer_pivoting.pivot_tableau(col_tableau,... column_index=4)>>> col_tableauarray([[ 3, 0, -1, 5, 0, 7, 2],

[ 0, 3, -2, 7, 0, -1, 1],[ 0, 0, 1, 1, 3, 2, 1]])

2.7. 06 Lemke Howson Algorithm 23

Page 28: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

This has labels {2, 3, 5} so we need to drop 2 by pivoting the third row of 𝑇𝑟. The minimum ratio test implies that wepivot on the first row:

𝑇𝑟 =

⎛⎝ 0 −1 7 3 −2 0 121 15 0 −3 9 0 60 54 0 −15 3 21 9

⎞⎠Code:

>>> dropped_label = nash.integer_pivoting.pivot_tableau(row_tableau,... column_index=2)>>> row_tableauarray([[ 0, -1, 7, 3, -2, 0, 1],

[ 21, 15, 0, -3, 9, 0, 6],[ 0, 54, 0, -15, 3, 21, 9]])

This has labels {1, 3, 4} so we need to drop 3 by pivoting the fourth column of 𝑇𝑐. The minimum ratio test impliesthat we pivot on the second row:

𝑇𝑐 =

⎛⎝21 −15 3 0 0 54 90 3 −2 7 0 −1 10 −3 9 0 21 15 6

⎞⎠Code:

>>> dropped_label = nash.integer_pivoting.pivot_tableau(col_tableau,... column_index=3)>>> col_tableauarray([[ 21, -15, 3, 0, 0, 54, 9],

[ 0, 3, -2, 7, 0, -1, 1],[ 0, -3, 9, 0, 21, 15, 6]])

This has labels {1, 2, 5} so we need to drop 1 by pivoting the second column of 𝑇𝑟. The minimum ratio test impliesthat we pivot on the third row:

𝑇𝑟 =

⎛⎝ 0 0 378 147 −105 21 631134 0 0 63 441 −315 1890 54 0 −15 3 21 9

⎞⎠Code:

>>> dropped_label = nash.integer_pivoting.pivot_tableau(row_tableau,... column_index=1)>>> row_tableauarray([[ 0, 0, 378, 147, -105, 21, 63],

[1134, 0, 0, 63, 441, -315, 189],[ 0, 54, 0, -15, 3, 21, 9]])

This has labels {3, 4, 5} so we need to drop 5 by pivoting the sixth column of 𝑇𝑐. The minimum ratio test implies thatwe pivot on the first row:

𝑇𝑐 =

⎛⎝ 21 −15 3 0 0 54 921 147 −105 378 0 0 63

−315 63 441 0 1134 0 189

⎞⎠Code:

24 Chapter 2. Class meetings

Page 29: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

>>> dropped_label = nash.integer_pivoting.pivot_tableau(col_tableau,... column_index=5)>>> col_tableauarray([[ 21, -15, 3, 0, 0, 54, 9],

[ 21, 147, -105, 378, 0, 0, 63],[-315, 63, 441, 0, 1134, 0, 189]])

This has labels {0, 1, 2} so we have a Nash equilibrium with vertices:

((189/1134, 9/54, 63/378), (63/378, 189/1134, 9/54)) = ((1/6, 1/6, 1/6), (1/6, 1/6, 1/6))

Which in turn corresponds to the expected equilibrium:

((1/3, 1/3, 1/3, (1/3, 1/3, 1/3))

• Mention how different pivoting methods are fine, doing it this way ensures everything is an integer which is alsomore efficient for a computer.

2.8 07 Repeated games

2.8.1 Corresponding chapters

• Repeated games

2.8.2 Objectives

• Define repeated games and strategies in repeated games

• Understand proof of theorem for sequence of stage Nash

• Understand that other equilibria exist

2.8.3 Notes

Playing repeated games in pairs

• Explain that we are about to play a game twice.

• Explain that this has to be done SILENTLY

In groups we are going to play:

𝐴 =

⎛⎝12 60 2412 23

⎞⎠ 𝐵 =

⎛⎝2 15 40 0

⎞⎠In pairs:

• Decide on row/column player (recall you don’t care about your opponents reward).

• We are going to play the game TWICE and write down both players cumulative scores.

• Define a strategy and ask players to write down a strategy that must describe what they do in both stages byanswering the following question: - What should the player do in the first stage? - What should the player do inthe second stage given knowledge of what both

2.8. 07 Repeated games 25

Page 30: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

players did in the first period?

• SILENTLY, after having written down a strategy: show each other your strategies and SILENTLY agree onthe pair of utilities. If you are unable to agree on a utility this indicates that the strategies were not descriptiveenough. SILENTLY start again :)

As a challenge: repeat this (so repeatedly play a repeated game, repeatedly write down a new strategy) and make anote when you arrive at an equilibria (where no one has a reason to write a different strategy down)

If anyone arrives at an equilibria where the row player scores more than 24 and the column player more than4 stand up as a pair.

Following this, assuming a pair has arrived at such an equilibrium discuss this.

Then work through the notes:

• Definition of a repeated game;

• Definition of a strategy;

• Theorem of sequence of stage Nash (relate this back to the utilities of our game):

– (𝑟1𝑟1, 𝑐1𝑐1) with utility: (24, 4).

– (𝑟1𝑟3, 𝑐1𝑐1) with utility: (24, 2).

– (𝑟3𝑟1, 𝑐1𝑐1) with utility: (24, 2).

– (𝑟3𝑟3, 𝑐1𝑐1) with utility: (24, 0).

Now discuss the potential of a different equilibrium:

1. For the row player:

(∅, ∅) → 𝑟2

(𝑟2, 𝑐1) → 𝑟3

(𝑟2, 𝑐2) → 𝑟1

2. For the column player:

(∅, ∅) → 𝑐2

(𝑟1, 𝑐2) → 𝑐1

(𝑟2, 𝑐2) → 𝑐1

(𝑟3, 𝑐2) → 𝑐1

This corresponds to the following scenario:

> Play (𝑟2, 𝑐2) in first stage and (𝑟1, 𝑐1) in second stage unless the column player does not cooperate in which caseplay (𝑟3, 𝑐1).

This gives a utility of (36, 6). Is this an equilibrium?

1. If the row player deviates, they would do so in the first round and gain no utility.

2. If the column player deviates, they would only be rational to do so in the first stage, if they did they would gain1 but lose 2 in the second round.

Thus this is Nash equilibrium.

• Discuss how to identify such an equilibria: which player has an incentive to build a reputation? (The columnplayer want to prove to be trustworthy to gain 2 in the final round).

• Mention how this shows how game theory studies the emergence of unexpected behaviour.

26 Chapter 2. Class meetings

Page 31: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

2.9 08 Prisoners Dilemma

2.9.1 Corresponding chapters

• Prisoners Dilemma

2.9.2 Objectives

• Explore the Iterated Prisoners Dilemma

2.9.3 Notes

Playing team based Iterated Prisoner’s Dilemma

Show this video: https://www.youtube.com/watch?v=p3Uos2fzIJ0

Ask class to form in to groups of approximately 8: 4 teams of 2.

Write the following on the board:

Name Score vs 1st opp∑︀

score vs 2nd opp∑︀

score vs 3rd opp

Explain that teams will play the iterated Prisoners dilemma:

𝐴 =

(︂3 05 1

)︂𝐵 =

(︂3 50 1

)︂Discuss NE of stage game.

Discuss “Cooperation” versus “Defection” and explain that goal is to maximise overall score (not necessarily beatdirect opponent).

For every dual write the following on the board (assuming 8 turns):

Name Score∑︀

score∑︀

score∑︀

score∑︀

score∑︀

score∑︀

score∑︀

score

Instructions for a dual:

• Give all teams copies of a document to help record.

• Think about strategy.

• Before every stage invite both individuals to talk to each other.

• Get teams to “face away”, after a count down “show” (either a C or a D).

After the tournament:

• Discuss winning strategy and other interesting strategies. Discuss potential coalitions that arose.

2.9. 08 Prisoners Dilemma 27

Page 32: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

• Discuss how teams had more information that usual.

• Invite the possibility of modifications (prob end, noise and lack of information).

Consider Reactive strategies

Go over theory.

In the case of (𝑝1, 𝑞1) = (1/4, 4/5) and (𝑝2, 𝑞2) = (2/5, 1/3) we have:

𝑟1 =11

20𝑟2 =

1

15

𝑠1 =185

311𝑠2 =

116

311

The steady state probabilities are given by:

𝜋 = (21460/96721, 36075/96721, 14616/96721, 24570/96721)

Here is some sympy code to illustrate this:

>>> import sympy as sym>>> p_1, p_2 = sym.S(1) / sym.S(4), sym.S(4) / sym.S(5)>>> q_1, q_2 = sym.S(2) / sym.S(5), sym.S(1) / sym.S(3)

(continues on next page)

28 Chapter 2. Class meetings

Page 33: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

(continued from previous page)

>>> r_1 = p_1 - p_2>>> r_2 = q_1 - q_2>>> r_1, r_2(-11/20, 1/15)>>> s_1 = (q_2 * r_1 + p_2) / (1 - r_1 * r_2)>>> s_2 = (p_2 * r_2 + q_2) / (1 - r_1 * r_2)>>> s_1, s_2(185/311, 116/311)>>> pi = sym.Matrix([[s_1 * s_2, s_1 * (1 - s_2), (1 - s_1) * s_2, (1 - s_1) * (1 - s_→˓2)]])>>> piMatrix([[21460/96721, 36075/96721, 14616/96721, 24570/96721]])

We can verify that this is a steady state vector:

𝑀 =

⎛⎜⎜⎝1/10 3/20 3/10 9/208/25 12/25 2/25 3/251/12 1/6 1/4 1/24/15 8/15 1/15 2/15

⎞⎟⎟⎠𝜋𝑀 = (21460/96721, 36075/96721, 14616/96721, 24570/96721)

Sympy code:

>>> M = sym.Matrix([[p_1*q_1, p_1*(1-q_1), (1-p_1)*q_1, (1-p_1)*(1-q_1)],... [p_2 * q_1, p_2 * (1-q_1), (1-p_2) * q_1, (1-p_2) * (1-q_1)],... [p_1 * q_2, p_1 * (1-q_2), (1-p_1) * q_2, (1-p_1) * (1-q_2)],... [p_2 * q_2, p_2 * (1-q_2), (1-p_2) * q_2, (1-p_2)*(1-q_2)]])>>> MMatrix([[1/10, 3/20, 3/10, 9/20],[8/25, 12/25, 2/25, 3/25],[1/12, 1/6, 1/4, 1/2],[4/15, 8/15, 1/15, 2/15]])>>> pi * MMatrix([[21460/96721, 36075/96721, 14616/96721, 24570/96721]])>>> pi * M == piTrue

The utility is then given by:

3𝑠1𝑠2 + 0𝑠1(1− 𝑠2) + 5(1− 𝑠1)𝑠2 + (1− 𝑠1)(1− 𝑠2) = 162030/96721 ≈ 1.675

Sympy code:

>>> rstp = sym.Matrix([[sym.S(3), sym.S(0), sym.S(5), sym.S(1)]])>>> score = pi.dot(rstp)>>> score, float(score)(162030/96721, 1.675...)

2.10 09 Evolutionary dynamics

2.10.1 Corresponding chapters

• Evolutionary dynamics

2.10. 09 Evolutionary dynamics 29

Page 34: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

2.10.2 Objectives

• Go over basic differential models for population dynamics.

2.10.3 Notes

Explain difference from games so far:

• Players were discrete entities (and we only consider 2 player games);

• Now there are no players: just strategies in a population.

Our population will be stand ers or sit ers.

Say we are going to use the following differential equation to denote the rate at which people stand:

𝑑𝑥

𝑑𝑡= 2𝑥

Use seconds (artificially) as a time unit, 𝑑𝑥𝑑𝑡 corresponds to the speed with which people stand up. We will instead

consider it as the rate at which people become standing up. Using a continuous population these two things areequivalent.

Get one student to stand up.

The speed at which other individuals are becoming standing up is at that exact point 2. Get someone to slowly standup . . . until they become stood up etc. . .

Note how we will run out of people very quickly as the solution to our equation is:

𝑥(𝑡) = 𝑥0𝑒2𝑡

Show how we can use Python to solve this differential equation numerically:

>>> import numpy as np>>> from scipy.integrate import odeint>>> t = np.linspace(0, 10, 100) # Obtain 100 time points>>> def dx(x, t, a):... """Define the derivate of x"""... return a * x>>> a = 10>>> xs = odeint(func=dx, y0=1, t=t, args=(a,))>>> xsarray([[1.000...

Now consider population of two individuals (see notes) (𝑥: standers and 𝑦 sitting on desk). What happens over time:

𝑑𝑥

𝑑𝑡= 2𝑥

𝑑𝑦

𝑑𝑡= 3𝑦

This means that the we would very quickly run out of sitters and desk sitters but what would happen to the limit of theratio? (Go over mathematics in notes)

Now, finally consider a constant population:

Consider .5 population standing and .5 sitting.

What must we have for our rates?

One population increases at a rate of 2 and the other increases at a rate of 3.

So our population source also, somehow decrease by 5.

30 Chapter 2. Class meetings

Page 35: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

Ask students to suggest how this can be done. Have a discussion.

Solution: share this between both populations:

• One populations “increases” at a rate of 2 - 2.5 = -0.5

• One populations “increases” at a rate of 3 - 2.5 = 0.5

Go over notes.

Discuss what happens at following starting conditions:

• x=1, y=0

• x=0, y=1

• x=0.5, y=0.5 but changes rates to be equal.

Show how can obtain this numerically:

>>> def dxy(xy, t, a, b):... """... Define the derivate of x and y.... It takes `xy` as a vector... """... x, y = xy... phi = a * x + b * y... return x * (a - phi), y * (b - phi)>>> a, b = 10, 5>>> xys = odeint(func=dxy, y0=[.5, .5], t=t, args=(a, b))>>> xysarray([[ 5.00000000e-01, 5.000...

Discuss how this basic notion of fitness will now be extended to consider game theoretic models where the fitnesscomes out game theoretic models.

2.11 10 Evolutionary game theory

2.11.1 Corresponding chapters

• Evolutionary game theory

2.11.2 Objectives

• Go over the logical progression from dynamics to games;

• Explain ESS

2.11.3 Notes

Recall that there are no players: just strategies in a population.

Go over the notes, explaining the mathematics.

Now consider the Matching pennies game:

𝐴 =

(︂1 −1−1 1

)︂

2.11. 10 Evolutionary game theory 31

Page 36: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

This gives:

𝑓1 = 𝑥1 − 𝑥2 𝑓2 = −𝑥1 + 𝑥2

which is equivalent to:

𝑓 = 𝐴𝑥 =

(︂𝑥1 − 𝑥2

−𝑥1 + 𝑥2

)︂and so:

𝑝ℎ𝑖 = 𝑓𝑥 = 𝑥1(𝑥1 − 𝑥2) + 𝑥2(𝑥2 − 𝑥1) = (𝑥1 − 𝑥2)2

thus:

𝑑𝑥

𝑑𝑡=

(︂𝑥1((𝑥1 − 𝑥2)− (𝑥1 − 𝑥2)

2)𝑥2((𝑥2 − 𝑥1)− (𝑥1 − 𝑥2)

2)

)︂𝑑𝑥

𝑑𝑡=

(︂𝑥1(𝑥1 − 𝑥2)(1− (𝑥1 − 𝑥2))𝑥2(𝑥1 − 𝑥2)(−1− (𝑥1 − 𝑥2))

)︂however recall that 𝑥1 + 𝑥2 = 1:

𝑑𝑥1

𝑑𝑡= 𝑥1(2𝑥1 − 1)2(1− 𝑥1))

Verification of above calculations:

>>> import sympy as sym>>> x_1, x_2 = sym.symbols("x_1, x_2")>>> f_1 = x_1 - x_2>>> f_2 = x_2 - x_1>>> phi = x_1 * f_1 + x_2 * f_2>>> dx_1 = x_1 * (f_1 - phi)>>> sym.factor(dx_1.subs({x_1: 1 - x_2}))2*x_2*(x_2 - 1)*(2*x_2 - 1)

We have the stable points as expected:

1. 𝑥1 = 0

2. 𝑥1 = 1

3. 𝑥1 = 1/2

Relate this to the notes and discuss notion of mutated population and stable ESS.

Show output of the following:

>>> import numpy as np>>> import matplotlib.pyplot as plt>>> from scipy.integrate import odeint>>> t = np.linspace(0, 10, 100) # Obtain 100 time points>>> def dx(x, t, A):... """... Define the derivate of x.... """... f = np.dot(A, x)... phi = np.dot(f, x)... return x * (f - phi)

Slight deviation from 𝑥1 = 0:

32 Chapter 2. Class meetings

Page 37: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

>>> A = np.array([[1, -1], [-1, 1]])>>> epsilon = 10 ** -1>>> xs = odeint(func=dx, y0=[epsilon, 1 - epsilon], t=t, args=(A,))>>> plt.plot(xs);[...

Slight deviation from 𝑥1 = 1:

>>> xs = odeint(func=dx, y0=[1 - epsilon, epsilon], t=t, args=(A,))>>> plt.plot(xs);[...

Slight deviation from 𝑥1 = 1/2:

>>> xs = odeint(func=dx, y0=[1 / 2 - epsilon, 1 / 2 + epsilon], t=t, args=(A,))>>> plt.plot(xs);[...

Look at theorem and discuss proof.

Carry out Nash equilibria computation (which corresponds to the above case):

>>> import nashpy as nash>>> game = nash.Game(A, A.transpose())>>> list(game.support_enumeration())[(array([1., 0.]), array([1., 0.])), (array([0., 1.]), array([0., 1.])), (array([0.5,→˓0.5]), array([0.5, 0.5]))]

Now look at cases:

1. 𝑥1 = 0: we are in the first case of the theorem so we have an ESS.

2. 𝑥1 = 1: we are in the first case of the theorem so we have an ESS.

3. 𝑥1 = 1/2: we now have the second case of the theorem 𝑢(𝑥, 𝑦) = 𝑢(𝑥, 𝑥) (indeed the row strategy aims tomake the column strategy indifferent - according to the best response condition).

To deal with this case we need to look at the next part of the second condition:

𝑢(𝑥, 𝑦) = 0

𝑢(𝑦, 𝑦) = 𝑦21 + (1− 𝑦1)2 − 2(1− 𝑦1)𝑦1 = (𝑦1 − (1− 𝑦1))

2 = (2𝑦1 − 1)2

And as long as 𝑦1 ̸= 𝑥1 = 1/2 then 𝑢(𝑥, 𝑦) < 𝑢(𝑦, 𝑦) thus this is not an ESS:

>>> A = sym.Matrix(A)>>> y_1, y_2 = sym.symbols("y_1, y_2")>>> y = sym.Matrix([y_1, y_2])>>> sym.factor((y.transpose() * A * y)[0].subs({y_2: 1 - y_1}))1.0*(2.0*y_1 - 1.0)**2

Discuss the work John Maynard Smith in 1973 who formalised this work. Mention that he was not actually aware ofNash equilibria at the time.

2.12 11 Moran Processes

Duration: 140 minutes

2.12. 11 Moran Processes 33

Page 38: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

2.12.1 Corresponding chapters

• Moran Processes

2.12.2 Objectives

• Play a class activity of a Moran process;

• Define a Moran process;

• Prove theorem for formula of fixation probabilities;

• Numeric calculations.

2.12.3 Activity

Use moran process form have students play in pairs.

Also requires use of dice, allowing for multiples of 4 and 6. For example:

• 1 D12 (12 sided dice) or;

• 1 D6 and 1 D4 (the example used in this page);

• 1 D6 and 1 D8.

Explain that we will aim to reproduce a Moran process with 𝑁 = 3.

We will do this with the Hawk-Dove game:

𝐴 =

(︂0 31 2

)︂Recall, this corresponds to sharing of 4 resources:

• Two hawks both get nothing;

• A hawk meeting a dove gets 3 out of 4.

• A dove meeting a hawk gets 1 out of 4.

• A dove meeting a dove gets 2 out of 4.

Give students 5 minutes to write out the fitnesses of each type in each possible situation.

Confirm:

𝑓(Hawk) 𝑓(Dove)1 Hawk, 2 Doves 0× 0 + 2× 3 = 6 1× 1 + 1× 2 = 32 Hawks, 1 Dove 1× 0 + 1× 3 = 3 2× 1 + 0× 2 = 2

Give students 5 minutes to write out the probabilities of selection of each type in each possible situation.

Confirm:

34 Chapter 2. Class meetings

Page 39: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

Select Selection: Birth Selection: Death1 Hawk, 2 Doves Hawk 6

6+2×313

1 Hawk, 2 Doves Dove 2×36+2×3

23

2 Hawks, 1 Dove Hawk 2×32×3+2

23

2 Hawks, 1 Dove Dove 22×3+2

13

Give students 5 minutes to identify what sided dice and what values will allow them to simulate the process of selectingbirth/death individuals.

Confirm:

State Birth: dice used Select Hawk values Death: dice used Select Hawk values1 Hawk 6 {1, 2, 3} 6 {1, 2}2 Hawks 4 {1, 2, 3} 6 {1, 2, 3, 4}

Hand out a dice to each group (ideally a D6 and a D4 to each group but perhaps they can identify other ways).

Let students simulate

Count from groups to obtain mean fixation rate of a Hawk.

Show the following code that allows us to simulate the simulation undertaken by the students (this is not the samecode as in the notes):

>>> import collections>>> import matplotlib.pyplot as plt>>> import numpy as np>>> import tqdm

>>> def roll_n_sided_dice(n=6):... """... Roll a dice with n sides.... """... return np.random.randint(1, n + 1)

>>> class MoranProcess:... """... A class for a moran process with a population of... size N=3 using the standard Hawk-Dove Game:...... A =... [0, 3]... [1, 2]...... Note that this is a simulation corresponding to an... in class activity where students roll dice.... """... def __init__(self, number_of_hawks=1, seed=None):...... if seed is not None:... np.random.seed(seed)...... self.number_of_hawks = number_of_hawks... self.number_of_doves = 3 - number_of_hawks...... self.dice_and_values_for_hawk_birth = {1: (6, {1, 2, 3}), 2: (4, {1, 2, 3}→˓)}

(continues on next page)

2.12. 11 Moran Processes 35

Page 40: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

(continued from previous page)

... self.dice_and_values_for_hawk_death = {1: (6, {1, 2}), 2: (6, {1, 2, 3, 4}→˓)}...... self.history = [(self.number_of_hawks, self.number_of_doves)]...... def step(self):... """... Select a hawk or a dove for birth.... Select a hawk or a dove for death....... Update history and states.... """... birth_dice, birth_values = self.dice_and_values_for_hawk_birth[self.→˓number_of_hawks]... death_dice, death_values = self.dice_and_values_for_hawk_death[self.→˓number_of_hawks]...... select_hawk_for_birth = self.roll_dice_for_selection(dice=birth_dice,→˓values=birth_values)... select_hawk_for_death = self.roll_dice_for_selection(dice=death_dice,→˓values=death_values)...... if select_hawk_for_birth:... self.number_of_hawks += 1... else:... self.number_of_doves += 1...... if select_hawk_for_death:... self.number_of_hawks -= 1... else:... self.number_of_doves -= 1...... self.history.append((self.number_of_hawks, self.number_of_doves))...... def roll_dice_for_selection(self, dice, values):... """... Given a dice and values return if the random roll is in the values.... """... return roll_n_sided_dice(n=dice) in values...... def simulate(self):... """... Run the entire simulation: repeatedly step through... until the number of hawks is either 0 or 3.... """... while self.number_of_hawks in [1, 2]:... self.step()... return self.number_of_hawks...... def __len__(self):... return len(self.history)

This carries out the simulations:

>>> repetitions = 10 ** 5>>> end_states = []>>> path_lengths = []

(continues on next page)

36 Chapter 2. Class meetings

Page 41: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

(continued from previous page)

>>> for seed in range(repetitions):... mp = MoranProcess(seed=seed)... end_states.append(mp.simulate())... path_lengths.append(len(mp))>>> counts = collections.Counter(end_states)>>> counts[3] / repetitions0.54666

Discuss obtaining theoretic probabilities of changing state:

𝑝10 =6

12

1

3=

1

6𝑝12 =

6

12

2

3=

1

3𝑝21 =

2

8

2

3=

1

6𝑝23 =

6

8

1

3=

1

4

Now work through the notes: culminating in the proof of the theorem for the absorption probabilities of a birthdeath process.

Discuss and use code from chapter to show the fixation with the Hawk Dove game:

>>> A = np.array([[0, 3], [1, 2]])

Calculate theoretic value using formula from theorem:

𝑓1𝑖 =3(𝑁 − 𝑖)

𝑁 − 1= 3

𝑁 − 𝑖

𝑁 − 1

𝑓2𝑖 =𝑖+ 2(𝑁 − 𝑖− 1)

𝑁 − 1=

2𝑁 − 2− 𝑖

𝑁 − 1

This gives (for 𝑁 = 3):

𝑖 = 1 𝑖 = 2𝑓1𝑖 3 3/2𝑓2𝑖 3/2 1𝛾𝑖 1/2 2/3

Thus:

𝑥1 =1

1 + 1/2 + 1/2× 2/3=

1

11/6≈ .545455

• Discuss work of Maynard smith but that this actually used Hawk Dove game in infinite population games.

• Discussion possibility for using a utility model on top of fitness.

• A lot of current work looks at Moran processes: a good model of invasion of a specifies etc. . .

• The Prisoners dilemma can also be included, there is documentation about simulating this with Axelrod is here:http://axelrod.readthedocs.io/en/stable/tutorials/getting_started/moran.html

2.13 12 Contemporary research

2.13.1 Corresponding chapters

• Contemporary research

2.13. 12 Contemporary research 37

Page 42: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

2.13.2 Objectives

• Discuss different papers that will be looked at.

• Answer queries about assessment of these papers.

Explain that in future sessions we will discuss the papers, students will be expected to have read them. This will allowme to answer any questions they might have.

2.13.3 Studying the emergence of invasiveness in tumours using game theory

https://link.springer.com/article/10.1140/epjb/e2008-00249-y

• A paper that looks at the GT of cancer: it reads like a proof of concept sitting in literature that already exists onthe subject.

• Main result is confirming intuitive understanding of proliferation/motility. Increased nutrients implies less motil-ity.

• Use two different approaches: describe an analytic evo game. Note that there is a typo in the game:(︂𝑏/2 𝑏− 𝑐𝑏 𝑏− 𝑐/2

)︂Also note that this is showing the utility of the column player so that’s slightly different from the framework inthe course.

The solution comes from:

>>> import sympy as sym>>> p, b, c = sym.symbols("p, b, c")>>> sym.solveset(p * (b - c / 2) + (1 - p) * (b - c) - p * (b) - (1 - p) * (b /→˓2), p){(b - 2*c)/(b - c)}

They also describe how a mixed stable solution will also exist: What does that correspond to in notes?

And the theorem about evolutionary stability: what does that correspond to?

• They implement a Cellular automaton model (why?): does this compare to the evo game or to a Moran process?

• Improvements could include:

– more complex dynamics;

– using more phenotypes (strategies);

– finite population analytical model;

Here is a blog post on the EGT blog about this paper: https://egtheory.wordpress.com/2013/07/05/motility/

2.13.4 Iterated Prisoner’s Dilemma contains strategies that dominate any evolution-ary opponent

http://www.pnas.org/content/109/26/10409.abstract

• A paper that looks at memory one strategies in the IPD.

38 Chapter 2. Class meetings

Page 43: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

• The paper makes use of a clever mathematical theorem (Cramer’s theorem) to show that there is a linear rela-tionship between the utility of two memory 1 players (eqn 6). This is because the linear relationship:

𝛼𝑆𝑋 + 𝛽𝑆𝑌 + 𝛾

is shown to be the ratio two determinants (essentially one determinant with a scaling factor). The columns ofthe corresponding matrix however are independent across both players (one has only 𝑝 variables and another,only 𝑞 variables). Thus, choosing a particular set of values for the 𝑝 can set the linear relationship above to be 0.

• This shows that it’s possible for one player to set the score of the other. It is however not possible for a playerto set it’s own score..

• The paper then goes on to show that it is possible to set a value where the strategy’s payoff is a factor of 𝜒 morethan the opponent. This linkage (it is now a win-win situation) of the strategies implies that both strategiesmaximise their payoffs when the opponent cooperates.

• The paper also describes playing against an “evolutionary” player which has a “theory of mind”, ie whichadapts. They make a claim (the title of the article) that numerical evidence indicates the such a player wouldevolve towards cooperation.

• The paper has two appendices which provide further proofs, one of the theorems for example is that shortmemory strategies are sufficient. This seems limited as it only considers a match and not a sufficient evolutionarysetting (such as a Moran process or a tournament). The other proves that the evolutionary process will stabilise(ultimately all results are being calculated at steady state).

• A deeper literature review placing this paper in it’s proper context would be beneficial.

• Perhaps some more numerical examples would have been helpful to highlight the process they mention. Thereis one graphic of the evolutionary process and note that’s claimed to be for 𝜒 = 5, we can read off the graph thatthe final utilty of the opponent is 1.6, so 𝑆𝑦 − 𝑃 = 1.6− 1 = .6 and .6× 5 = 3 whereas we see that the utilityof the other player is indeed 3 + 1.

• What occurs if two ZD strategies attempt to play each other?

2.13.5 Measuring the price of anarchy in critical care unit interactions

https://link.springer.com/article/10.1057/s41274-016-0100-8

• A piece of work that applies Nash equilibria to the interaction of two health providers. It assumes that a hospital’sstrategy is a threshold at which patients are deviated to other hospitals.

• This is one of a few papers on health care logistics that uses Game Theory.

• To calculate the utilities a Markov model (just like the model of Reactive strategies) is used. For every strategypair that Markov model is used to get the utilities.

• The main theorem in the paper is one that allows for an easily calculation of the Nash equilibria by showing thatthe best response functions are mutually increasing which implies that there will be a single intersection of thebest response functions.

• The above theorem is proven for some conditions which correspond to two scenarios.

• The paper then uses this to compute the PoA: price of anarchy looking at a comparison between the optimalcoordinated behaviour and the Nash equilibria. Similarly to a Prisoners dilemma.

• This is used to find a target that aligns selfish behaviour with coordinated behaviour.

• Things that could be improved: more players, non stationarity and different models of behaviour.

2.13. 12 Contemporary research 39

Page 44: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

2.13.6 Evolution Reinforces Cooperation with the Emergence of Self-RecognitionMechanisms: an empirical study of the Moran process for the iterated Pris-oner’s dilemma

• This is a paper that studies the Prisoner’s dilemma in a Moran process.

• It uses all the strategies from the Axelrod library but also includes 3 strategies that were trained using a geneticalgorithm specifically for the Paper. These are trained using a genetic algorithm.

• These make use of Finite State Machines: a structure that maps states and actions to another state and action.

• The paper starts by showing why the numerical simulation is necessary with how the theoretic values do notmatch up with the simulated ones for stochastic strategies.

• Then it goes on to examine two main scenarios: resistance and invasion.

• One observation is that the results for N=2 (just 2 strategies, 1 of each type) differ quite a lot from N>3. Thisimplies that a lot of theoretic research isn’t quite right.

• The other observation is that the trained strategies do well:

– Strategies with handshakes (self recognition mechanisms) do well in resistance.

– Strategies trained for scores against opponents do well at invasion.

• Some discussion is given to the placing of a particular type of strategy that is of interest in the literature (thePress and Dyson paper) but does not do well here.

Here is are two blog posts on this subject:

• http://vknight.org/unpeudemath/math/2017/07/28/sophisticated-ipd-strategies-beat-simple-ones.html

• http://marcharper.codes/2017-07-31/axelrod.html

Here is a video about a talk I gave on this subject:

• (https://www.youtube.com/watch?v=p4t9dMKxZAo)

40 Chapter 2. Class meetings

Page 45: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

CHAPTER 3

Optional meetings

Notes for specific optional meetings that take place reacting to needs ot students.

3.1 AA- Revisiting Polytopes

3.1.1 Corresponding chapters

• Best response polytopes

Duration: 50 minutes

3.1.2 Objectives

• Re visit best response polytopes using a game with a dominated strategy.

3.1.3 Notes

Consider the (𝐴,𝐵) ∈ R2×22 game:

𝐴 =

(︂1 4−3 2

)︂𝐵 =

(︂5 2−1 9

)︂Ask students in pairs to construct the column player best response polytope

First we add 4 to all elements to ensure they are positive:

𝐴 =

(︂5 81 6

)︂𝐵 =

(︂9 63 13

)︂By definition (https://vknight.org/gt/chapters/06/#Definition-of-best-response-polytopes), the column player best re-sponse polytope 𝒬 is given by:

𝒬 = {𝑦 ∈ R𝑛 | 𝐴𝑦 ≤ 1 𝑦 ≥ 0}

41

Page 46: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

So the inequalities are:

5𝑦1 + 8𝑦2 ≤ 1 label: 0𝑦1 + 6𝑦2 ≤ 1 label: 1

𝑦1 ≥ 0 label: 2𝑦2 ≥ 0 label: 3

The first two inequalities come from 𝐴𝑦 ≤ 1, however, 𝐴𝑦 corresponds to the row players utility against vertex 𝑦.

Note that everything is scaled so that the best response of the row player is 1 (the scaling takes place in the strate-gies/vertices).

Thus if a vertex has label 0 or 1 that implies that the first/second row player strategy is a best response to 𝑦.

We will return to this idea but first let us plot our 𝑄. We start by rearranging our inequalities:

𝑦2 ≤ 1/8− 5/8𝑦1 label: 0𝑦2 ≤ 1/6− 𝑦1/6 label: 1𝑦1 ≥ 0 label: 2𝑦2 ≥ 0 label: 3

Now let us sketch these:

import matplotlib.pyplot as plt%matplotlib inliney1s = [0, 1]first_line = [1 / 8 - 5 / 8 * y for y in y1s]second_line = [1 / 6 - y / 6 for y in y1s]plt.figure()plt.plot(first_line, label="$1/8-5/8y_1$ (label: 0)")plt.plot(second_line, label="$1/6-y_1/6$ (label: 1)")plt.legend();

This gives:

42 Chapter 3. Optional meetings

Page 47: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

We can see that our polytope is a straightforward triangle and that no vertex will ever have label 1.

Discuss if this is OK?

Returning to the game:

𝐴 =

(︂5 81 6

)︂We see that the first strategy dominates the second: so the second strategy is never a best response.

Our vertices with labels are:

(0, 0) labels {2, 3}(0, 1/8) labels {0, 2}(1/5, 0) labels {0, 3}

Exercise:

Obtain the best response polytope for the row player:

By definition (https://vknight.org/gt/chapters/06/#Definition-of-best-response-polytopes), the row player best re-sponse polytope 𝒫 is given by:

𝒬 = {𝑥 ∈ R𝑚 | 𝑥 ≥ 0 𝑥𝐵 ≤ 1}

So the inequalities are:

𝑥1 ≥ 0 label: 0𝑥2 ≥ 0 label: 1

9𝑥1 + 3𝑥2 ≤ 1 label: 22𝑥1 + 9𝑥2 ≤ 1 label: 3

Rearranging:

𝑥1 ≥ 0 label: 0𝑥2 ≥ 0 label: 1𝑥2 ≤ 1/3− 3𝑥1 label: 2𝑥2 ≤ 1/9− 2/9𝑥1 label: 3

Now let us sketch these:

x1s = [0, 1]first_line = [1 / 3 - 3 * x for x in x1s]second_line = [1 / 9 - 2 / 9 * x for x in x1s]plt.figure()plt.plot(first_line, label="$1/3-3x_1$ (label: 2)")plt.plot(second_line, label="$1/9-2/9x_1/6$ (label: 3)")plt.legend();

This gives:

3.1. AA- Revisiting Polytopes 43

Page 48: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

We see that there our polytope has 4 vertices:

(0, 0) labels {0, 1}(0, 1/9) labels {0, 3}(1/9, 0) labels {1, 2}

(2/25, 7/75) labels {2, 3}

Here is some sympy code to verify the intersection of both boundaries:

>>> import sympy as sym>>> x = sym.Symbol("x")>>> sym.solveset(sym.S(1) / 3-3 * x - sym.S(1) / 9 + sym.S(2) / 9 * x, x){2/25}

We can now identify a fully labelled vertex pair (we already know that label 𝑐𝑜𝑑𝑒 must come from a vertex in 𝒫):

((1/9, 0), (1/5, 0))

Which once normalised gives: ((1, 0), (0, 1)) which is in fact readily readable using pure best responses:

𝐴 =

(︂1 4−3 2

)︂𝐵 =

(︂5 2−1 9

)︂Checking using nashpy:

>>> import nashpy as nash>>> A = [[1, 4], [-3, 2]]>>> B = [[5, 2], [-1, 9]]>>> game = nash.Game(A, B)>>> list(game.vertex_enumeration())[(array([1., 0.]), array([1., 0.]))]

44 Chapter 3. Optional meetings

Page 49: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

3.2 AB - Revisiting Repeated games

3.2.1 Corresponding chapters

• Repeated games

Duration: 50 minutes

3.2.2 Objectives

• Re visit repeated games and ensure able to compute Nash equilibria for games that are not stage nash.

3.2.3 Notes

Consider the (𝐴,𝐵) ∈ R2×22 game with 𝑇 = 2:

𝐴 =

(︂200 4 31 0 2

)︂𝐵 =

(︂0 −3 0

−10 0 −10

)︂Ask students in pairs to:

• Identify pure Nash equilibria of stage game:

𝐴 =

(︂200 4 31 0 2

)︂𝐵 =

(︂0 −3 0

−10 0 −10

)︂• Identify repeated game Nash equilibria:

– (𝑟1𝑟1, 𝑐1𝑐1) with utility: (400, 0).

– (𝑟1𝑟1, 𝑐1𝑐3) with utility: (203, 0).

– (𝑟1𝑟1, 𝑐3𝑐1) with utility: (203, 0).

– (𝑟1𝑟1, 𝑐3𝑐3) with utility: (6, 0).

• Identify an equilibria that is not stage Nash:

1. For the row player:

(∅, ∅) → 𝑟2

(𝑟2, 𝑐1) → 𝑟1

(𝑟2, 𝑐2) → 𝑟1

(𝑟2, 𝑐3) → 𝑟1

2. For the column player:

(∅, ∅) → 𝑐2

(𝑟2, 𝑐2) → 𝑐1

(𝑟1, 𝑐2) → 𝑐3

3.2. AB - Revisiting Repeated games 45

Page 50: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

This corresponds to the following scenario:

> Play (𝑟2, 𝑐2) in first stage and (𝑟1, 𝑐1) in second stage unless the row player does not cooperate in which caseplay (𝑟1, 𝑐3).

This gives a utility of (200, 0). Is this an equilibrium?

1. If the row player deviares, they would only be rational to do so in the first state, if they did theywould gain 4 but lose 197.

2. If the column player deviates they would do so in the first round and gain no utility.

Potentially discuss how this repeated game framework can/does correspond to a normal normal form game.

• Writing all strategies down;

• Obtaining large matrix (very large)

3.3 AC - Coursework peer review

3.3.1 Corresponding assessment

• Individual coursework

Duration: 50 minutes

3.3.2 Objectives

• Use peer review to discuss and give feedback on individual coursework.

3.3.3 Notes

5 mins: Discuss marking criteria:

• Understanding of subject (scope): the specific modelling scenario chosen is not terribly important but the com-munication of understanding is. Does the work use something specific to highlight high understanding?

• Presentation: how does work align with guidelines for writing mathematics:

– Inclusion of graphics?

– Spelling/grammar?

– Clarity

• Code: how has the code been used, is it the correct code, does it help with the scope? A simple use of somethingfrom the notes is what is expected but is it done appropriately (for example using the lemke-howson algorithmwhen it’s not appropriate, plotting the incorrect thing, etc. . . ).

5 mins: Exchange coursework with someone else.

15 mins: Read and write feedback (Encourage students to write positively).

10 mins: Pair and discuss feedback.

10 mins: Group discussion.

46 Chapter 3. Optional meetings

Page 51: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

3.4 AD - Moran process

Duration: 25 minutes

3.4.1 Objectives

• Compute the probability of fixation in the Hawk Dove game using the formula.

3.4.2 Notes

For:

𝐴 =

(︂0 31 2

)︂Ask students to work in pairs, using the formula to compute 𝑥1.

This is given by:

𝑓1𝑖 =3(𝑁 − 𝑖)

𝑁 − 1= 3

𝑁 − 𝑖

𝑁 − 1(3.1)

𝑓2𝑖 =𝑖+ 2(𝑁 − 𝑖− 1)

𝑁 − 1=

2𝑁 − 2− 𝑖

𝑁 − 1(3.2)

(3.3)

This gives (for 𝑁 = 4):

𝑖 = 1 𝑖 = 2 𝑖 = 3𝑓1𝑖 3 2 1𝑓2𝑖 5/3 4/3 1𝛾𝑖 5/9 2/3 1

Thus:

𝑥1 =1

1 + 5/9 + 5/9× 2/3 + 5/9× 2/3× 1=

1

62/27=

27

62≈ .44

3.5 AE - Support enumeration

3.5.1 Corresponding chapters

• Support enumeration

Duration: 50 minutes

3.5.2 Objectives

• Obtain the Nash equilibria of a game using support enumeration.

3.4. AD - Moran process 47

Page 52: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

3.5.3 Notes

Play a knockout RPSLS tournament

Tell students to works in pairs.

Invite students to obtain the Nash equilibria, using support enumeration for the following game:

𝐴 =

(︂1 −1−2 2

)︂Ask students to apply the support enumeration algorithm.

For supports of size one: there are no pairs of best response so now we only consider 𝑘 = 2. This implies the followingpair of linear equations:

−𝜎𝑟1 + 2𝜎𝑟2 = 𝜎𝑟1 − 2𝜎𝑟2

𝜎𝑟1 =1

2𝜎𝑟2

𝜎𝑐1 − 𝜎𝑐2 = −2𝜎𝑐1 + 2𝜎𝑐2

𝜎𝑐1 = 𝜎𝑐2

Setting this to be probability distributions gives the final result:

𝜎𝑟 = (1/3, 2/3)

𝜎𝑐 = (1/2, 1/2)

We don’t need to really check the best response condition as the players have no where to deviate to but we can:

𝐴𝜎𝑇𝑐 =

(︂00

)︂Thus 𝜎𝑟 is a best response to 𝜎𝑐.

𝜎𝑟𝐵 = 𝜎𝑟(−𝐴) = (0, 0)

Thus 𝜎𝑐 is a best response to 𝜎𝑟.

We can confirm all of this using nashpy:

>>> import nashpy as nash>>> import numpy as np>>> A = np.array([[1, -1],... [-2, 2]])>>> game = nash.Game(A)>>> list(game.support_enumeration())[(array([0.666..., 0.333...]), array([0.5, 0.5]))]

As a second example consider:

𝐴 =

⎛⎝ 0 −2 21 0 −1−1 1 0

⎞⎠1. We see there are no pairs of best response, so consider 𝑘 ∈ {2, 3}.

48 Chapter 3. Optional meetings

Page 53: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

2. We have the following set of utilities to consider:

1. 𝐼 = {1, 2}

1. 𝐽 = {1, 2}

2. 𝐽 = {1, 3}

3. 𝐽 = {2, 3}

2. 𝐼 = {1, 3}

1. 𝐽 = {1, 2}

2. 𝐽 = {1, 3}

3. 𝐽 = {2, 3}

3. 𝐼 = {2, 3}

1. 𝐽 = {1, 2}

2. 𝐽 = {1, 3}

3. 𝐽 = {2, 3}

4. 𝐼 = 𝐽 = {1, 2, 3}

The case for k=2 leads to contradictions (let students identify some of these).

4. 𝐼 = 𝐽 = {1, 2, 3}

In this case we have:

−𝜎𝑟2 + 𝜎𝑟3 = 2𝜎𝑟1 − 𝜎𝑟3

2𝜎𝑟1 − 𝜎𝑟3 = −2𝜎𝑟1 + 𝜎𝑟2

which has solution:

2𝜎𝑟1 = 𝜎𝑟2 = 𝜎𝑟3

Similarly:

−2𝜎𝑐2 + 2𝜎𝑐3 = 𝜎𝑐1 − 𝜎𝑐3

𝜎𝑐1 − 𝜎𝑐3 = −𝜎𝑐1 + 𝜎𝑐2

which has solution:

𝜎𝑐1 = 𝜎𝑐2 = 𝜎𝑐3

4. Now we consider which of those supports give valid mixed strategies:

4. 𝐼 = 𝐽 = {1, 2, 3}

𝜎𝑟 = (1/5, 2/5, 2/5)

𝜎𝑐 = (1/3, 1/3, 1/3)

5. The final step is to check the best response condition:

4. 𝐼 = 𝐽 = {1, 2, 3}

3.5. AE - Support enumeration 49

Page 54: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

𝐴𝜎𝑇𝑐 =

⎛⎝000

⎞⎠Thus 𝜎𝑟 is a best response to 𝜎𝑐.

𝜎𝑟𝐵 = (0, 0, 0)

Thus 𝜎𝑐 is a best response to 𝜎𝑟.

We can confirm all of this using nashpy:

>>> import nashpy as nash>>> A = np.array([[0, -2, 2],... [1, 0, -1],... [-1, 1, 0]])>>> rps = nash.Game(A)>>> list(rps.support_enumeration())[(array([0.2..., 0.4..., 0.4...]), array([0.333..., 0.333..., 0.333...]))]

Discuss with students about what happens when we have a 3 by 2 game?

3.6 AF- Revisiting Lemke-Howson

3.6.1 Corresponding chapters

• Lemke Howson algorithm

Duration: 50 minutes

3.6.2 Objectives

• Re visit Lemke Howson Algorithm

3.6.3 Notes

Consider the (𝐴,𝐵) ∈ R2×22 game:

𝐴 =

(︂1 4−3 2

)︂𝐵 =

(︂5 2−1 9

)︂Ask students in pairs to construct the column player best response polytope

First we add 4 to all elements to ensure they are positive:

𝐴 =

(︂5 81 6

)︂𝐵 =

(︂9 63 13

)︂By definition (https://vknight.org/gt/chapters/06/#Definition-of-best-response-polytopes), the column player best re-sponse polytope 𝒬 is given by:

𝒬 = {𝑦 ∈ R𝑛 | 𝐴𝑦 ≤ 1 𝑦 ≥ 0}

50 Chapter 3. Optional meetings

Page 55: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

So the inequalities are:

5𝑦1 + 8𝑦2 ≤ 1 label: 0𝑦1 + 6𝑦2 ≤ 1 label: 1

𝑦1 ≥ 0 label: 2𝑦2 ≥ 0 label: 3

The first two inequalities come from 𝐴𝑦 ≤ 1, however, 𝐴𝑦 corresponds to the row players utility against vertex 𝑦.

Note that everything is scaled so that the best response of the row player is 1 (the scaling takes place in the strate-gies/vertices).

Thus if a vertex has label 0 or 1 that implies that the first/second row player strategy is a best response to 𝑦.

We will return to this idea but first let us plot our 𝑄. We start by rearranging our inequalities:

𝑦2 ≤ 1/8− 5/8𝑦1 label: 0𝑦2 ≤ 1/6− 𝑦1/6 label: 1𝑦1 ≥ 0 label: 2𝑦2 ≥ 0 label: 3

Now let us sketch these:

import matplotlib.pyplot as plt%matplotlib inliney1s = [0, 1]first_line = [1 / 8 - 5 / 8 * y for y in y1s]second_line = [1 / 6 - y / 6 for y in y1s]plt.figure()plt.plot(first_line, label="$1/8-5/8y_1$ (label: 0)")plt.plot(second_line, label="$1/6-y_1/6$ (label: 1)")plt.legend();

This gives:

3.6. AF- Revisiting Lemke-Howson 51

Page 56: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

We can see that our polytope is a straightforward triangle and that no vertex will ever have label 1.

Discuss if this is OK?

Returning to the game:

𝐴 =

(︂5 81 6

)︂We see that the first strategy dominates the second: so the second strategy is never a best response.

Our vertices with labels are:

(0, 0) labels {2, 3}(0, 1/8) labels {0, 2}(1/5, 0) labels {0, 3}

Exercise:

Obtain the best response polytope for the row player:

By definition (https://vknight.org/gt/chapters/06/#Definition-of-best-response-polytopes), the row player best re-sponse polytope 𝒫 is given by:

𝒬 = {𝑥 ∈ R𝑚 | 𝑥 ≥ 0 𝑥𝐵 ≤ 1}

So the inequalities are:

𝑥1 ≥ 0 label: 0𝑥2 ≥ 0 label: 1

9𝑥1 + 3𝑥2 ≤ 1 label: 22𝑥1 + 9𝑥2 ≤ 1 label: 3

Rearranging:

𝑥1 ≥ 0 label: 0𝑥2 ≥ 0 label: 1𝑥2 ≤ 1/3− 3𝑥1 label: 2𝑥2 ≤ 1/9− 2/9𝑥1 label: 3

Now let us sketch these:

x1s = [0, 1]first_line = [1 / 3 - 3 * x for x in x1s]second_line = [1 / 9 - 2 / 9 * x for x in x1s]plt.figure()plt.plot(first_line, label="$1/3-3x_1$ (label: 2)")plt.plot(second_line, label="$1/9-2/9x_1/6$ (label: 3)")plt.legend();

This gives:

52 Chapter 3. Optional meetings

Page 57: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

We see that there our polytope has 4 vertices:

(0, 0) labels {0, 1}(0, 1/9) labels {0, 3}(1/9, 0) labels {1, 2}

(2/25, 7/75) labels {2, 3}

Here is some sympy code to verify the intersection of both boundaries:

>>> import sympy as sym>>> x = sym.Symbol("x")>>> sym.solveset(sym.S(1) / 3-3 * x - sym.S(1) / 9 + sym.S(2) / 9 * x, x){2/25}

We now carry out the Lemke-Howson algorithm.

• Start at ((0, 0), (0, 0)), have labels {0, 1, 2, 3}. Drop 3 in 𝒬: move to ((0, 0), (0, 1/8)) and pick up 0.

• At ((0, 0), (0, 1/8)), have labels {0, 1, 2}. Drop 0 in 𝒫: move to ((1/9, 0), (0, 1/8)) and pick up 2.

• At ((1/9, 0), (0, 1/8)), have labels {0, 1, 2}. Drop 2 in 𝒬: move to ((1/9, 0), (1/5, 0)) and pick up 3.

• At ((1/9, 0), (1/5, 0)), have labels {0, 1, 2, 3}: stop.

Normalise to get:

((1, 0), (1, 0))

Now, ask students to carry this out using tableaux:

Apply definitions to get:

𝑇𝑟 =

(︂9 3 1 0 16 13 0 1 1

)︂

3.6. AF- Revisiting Lemke-Howson 53

Page 58: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

and:

𝑇𝑐 =

(︂1 0 5 8 10 1 1 6 1

)︂• 𝑇𝑟 has labels {0, 1}.

• 𝑇𝑐 has labels {2, 3}.

Now let us drop 3 from 𝑇𝑐. The minimum ratio test: 1/8 < 1/6 so pivot on 1st row.

𝑇𝑐 =

(︂1 0 5 8 1−6 8 −22 0 2

)︂which has labels {0, 2} thus we need to drop 0 in 𝒫 . The minimum ratio test: 1/9 < 1/6 so pivot on 1st row:

𝑇𝑟 =

(︂9 3 1 0 10 99 −6 9 3

)︂which has labels {1, 2} thus we need to drop 2 in 𝒬. The minimum ratio test: there is only 1 positive ratio so we pivoton 1st row.

𝑇𝑐 =

(︂1 0 5 8 1−8 40 0 176 32

)︂which has labels {0, 3} thus we have a Nash equilibria.

Recall that the variable for 𝑇𝑟 in order are:

𝑥1, 𝑥2, 𝑠1, 𝑠2

Recall that the variable for 𝑇𝑟 in order are:

𝑠1, 𝑠2, 𝑦1, 𝑦2

Thus, 5𝑦1 = 1 and 9𝑥1 = 1 which gives:

𝑥 = (1/9, 0) 𝑦 = (1/5, 0)

after normalising we get the required result:

𝜎𝑟 = (1, 0) 𝜎𝑐 = (1, 0)

54 Chapter 3. Optional meetings

Page 59: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

CHAPTER 4

About the materials

4.1 Overview

The course content is all written using Jupyter notebooks which you can find here: https://github.com/drvinceknight/gt/tree/master/nbs.

This is done using a combination of Python and Markdown (as well as LaTeX for the mathematics).

The Jupyter notebooks are then converted in to html which is served using Github pages and can be seen here: https://vknight.org/gt/.

4.2 How this is done

Using git you can clone the repository locally:

git clone https://github.com/drvinceknight/gt.git

This will create a copy of all the source files on to a directory on your computer called gt.

To view the notebooks locally you will need to have Python and Jupyter installed on your computer. I recommendusing the Anaconda distribution of Python which not only includes Jupyter but also a number of tools to ensureportability and reproducibility of scientific computing.

In the gt directory you will see an environment.yml file. This is used to define a specific set of Python libraries.To create this environment on your computer you will need to use a command line application:

• On Mac OSX or linux: search for terminal.

• On Windows, assuming you have installed the Anaconda distribution: search for Anaconda Prompt.

Open this and navigate to the gt directory, this is done using the cd command. For more information about using thecommand line, take a look at https://vknight.org/rsd/chapters/01/.

Once there, type:

55

Page 60: Vince's Game theory course DocumentationVince’s Game theory course Documentation Strategy profiles as coordinates on a game One way to thing of any game ( , ) ∈R × 2 is as a

Vince’s Game theory course Documentation

conda env create -f environment.yml

This will create a Python environment on your machine with all the required software called gt. To activate thisenvironment, type:

source activate gt

You can now start a Jupyter server by typing (still in the command line application):

jupyter notebook

This will open a tab in your browser (note that you do not need to be online for this) that you can browse and usethe notebooks with.

Finally if you want to convert the notebooks in to the html. In your command line application, making sure you haveactivated the gt environment type:

inv main

4.3 Testing all the code

One important advantage of writing the course notes in this way is that all code snippets (which are often used toconfirm calculations carried out) can be tested. This helps ensure that there are no errors in the notes.

To test the code, in your command line application type:

inv test

56 Chapter 4. About the materials