
Vince’s Game theory course Documentation

Vince Knight

Mar 02, 2020

Contents

1 Preparation

2 Class meetings
2.1 00 Introduction to class
2.2 01 Strategies and utilities
2.3 02 Rationalisation
2.4 03 Best responses and Nash equilibrium
2.5 04 Support enumeration
2.6 05 Best response polytopes
2.7 06 Lemke Howson Algorithm
2.8 07 Repeated games
2.9 08 Prisoners Dilemma
2.10 09 Evolutionary dynamics
2.11 10 Evolutionary game theory
2.12 11 Moran Processes
2.13 12 Contemporary research

3 Optional meetings
3.1 AA - Revisiting Polytopes
3.2 AB - Revisiting Repeated games
3.3 AC - Coursework peer review
3.4 AD - Moran process
3.5 AE - Support enumeration
3.6 AF - Revisiting Lemke-Howson

4 About the materials
4.1 Overview
4.2 How this is done
4.3 Testing all the code


The class notes are available here: vknight.org/gt/.


CHAPTER 1

Preparation

Update hackmd link if necessary.


CHAPTER 2

Class meetings

Here are my notes for each class meeting.

2.1 00 Introduction to class

2.1.1 Corresponding chapters

• Introduction to the course

• Normal form games

Duration: 50 minutes

2.1.2 Objectives

• Outline timetable and approach to class.

• Discuss feedback from previous years.

• Make students aware of learning resources available to them.

• Make students aware of assessment criteria

• Make students aware of coding aspect of course (both in assessment and in content)

• Introduce games.

2.1.3 Notes

Timetable and approach

Discuss timetable




I will not lecture

This course is taught in a flipped learning environment. All notes, homework, exercises are available to you.

In class we will work in parallel to the notes but not directly following them. You will be expected to play an active role in your learning:

• Activities that demonstrate concepts;

• Discussions

Feedback

Show feedback from previous year.

Learning and assessment

All resources can be found at http://vknight.org/gt/

Discuss the markdown version of notes (available on hackmd.io).

Seeking assistance

Details on website.

Programming

The course notes include code throughout. This is both to show you how to use code to study mathematics (an exemplar of modern mathematics) and to illustrate the ideas.

Details about the programming involved can be found at http://vknight.org/gt/other/programming/

Introduction to games

Using http://vknight.org/school-outreach/assets/playing-games/tex/form.pdf

1. Play two thirds of the average game

2. Rationalise

3. Play two thirds of the average game

Discuss notes on Normal form games.

2.2 01 Strategies and utilities

2.2.1 Corresponding chapters


• Calculating utilities of strategies





2.2.2 Objectives

• Define mixed strategies

• Understand utility calculation for mixed strategies

• Understand utility calculation as a linear algebraic construct

2.2.3 Notes

Utility calculations

Use the matching pennies form and have students play in pairs. Following each game:

• Ask how many people won?

• Ask why they won?

Mixed strategies

Look at definition for mixed strategies.

Consider:

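$$A = \begin{pmatrix} 2 & -2 \\ -1 & 1 \end{pmatrix} \qquad B = \begin{pmatrix} -2 & 2 \\ 1 & -1 \end{pmatrix}$$

Let us assume we have $\sigma_r = (0.3, 0.7)$ and $\sigma_c = (0.1, 0.9)$:

$$u_r(\sigma_r, \sigma_c) = 0.3 \times 0.1 \times 2 + 0.3 \times 0.9 \times (-2) + 0.7 \times 0.1 \times (-1) + 0.7 \times 0.9 \times 1 = 0.08$$

Because the game is zero sum we immediately know:

$$u_c(\sigma_r, \sigma_c) = -0.08$$

This corresponds to the linear algebraic multiplication:

$$u_r(\sigma_r, \sigma_c) = \sigma_r A \sigma_c^T$$

$$u_c(\sigma_r, \sigma_c) = \sigma_r B \sigma_c^T$$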

(Go through this on the board, make sure students are comfortable.)

This can be done straightforwardly using numpy:

>>> import numpy as np
>>> A = np.array([[2, -2], [-1, 1]])
>>> B = np.array([[-2, 2], [1, -1]])
>>> sigma_r = np.array([.3, .7])
>>> sigma_c = np.array([.1, .9])
>>> np.dot(sigma_r, np.dot(A, sigma_c)), np.dot(sigma_r, np.dot(B, sigma_c))
(0.079..., -0.079...)



Strategy profiles as coordinates on a game

One way to think of any game $(A, B) \in {\mathbb{R}^{m \times n}}^2$ is as a mapping from the set of strategies $[0, 1]_{\mathbb{R}}^m \times [0, 1]_{\mathbb{R}}^n$ to $\mathbb{R}^2$: the utility space.

Equivalently, if $S_r, S_c$ are the strategy spaces of the row/column player:

$$(A, B) : S_r \times S_c \to \mathbb{R}^2$$

We can use games defined in nashpy in that way:

>>> import nashpy as nash
>>> game = nash.Game(A, B)
>>> game[sigma_r, sigma_c]
array([ 0.08, -0.08])

2.3 02 Rationalisation

2.3.1 Corresponding chapters


• Rationalisation


2.3.2 Objectives

• Be able to identify dominated strategies

• Understand notion of Common knowledge of rationality

2.3.3 Notes

Rationalisation of strategies

Identify two volunteers and play a sequence of zero sum games where they play as a team against me. The group is the row player.

$$A = \begin{pmatrix} 1 & -1 \\ -1 & 2 \end{pmatrix}$$

(No dominated strategy)

$$A = \begin{pmatrix} 1 & 2 \\ -1 & 2 \end{pmatrix}$$

(First column weakly dominates second column)

$$A = \begin{pmatrix} 1 & -1 \\ -1 & -3 \end{pmatrix}$$

• First row strictly dominates second row

• Second column strictly dominates first column

$$A = \begin{pmatrix} 2 & 2 \\ -1 & 2 \end{pmatrix}$$

• First row weakly dominates second row

• First column weakly dominates second column

$$A = \begin{pmatrix} -1 & 2 & 1 \\ -2 & -2 & 1 \\ 1 & 1 & -1 \end{pmatrix}$$

• First row dominates second row

• First column dominates second column

Now pit the two players against each other; the utilities represent the share of the total amount of chocolates/sweets gathered so far:

$$A = \begin{pmatrix} 1 & 0 \\ 1.5 & 0.5 \end{pmatrix} \qquad B = \begin{pmatrix} 1 & 1.5 \\ 0 & 0.5 \end{pmatrix}$$

Capture all of the above (on the white board) and discuss each action and why they were taken.

Ask students to repeat the game against each other in groups of two (use “days of doing the dishes perhaps?”).

Iterated elimination of dominated strategies

As a class work through the following example.

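$$A = \begin{pmatrix} 2 & 5 \\ 1 & 2 \\ 7 & 3 \end{pmatrix} \qquad B = \begin{pmatrix} 0 & 3 \\ 6 & 1 \\ 0 & 1 \end{pmatrix}$$

1. First row dominates second row

$$A = \begin{pmatrix} 2 & 5 \\ 7 & 3 \end{pmatrix} \qquad B = \begin{pmatrix} 0 & 3 \\ 0 & 1 \end{pmatrix}$$

2. Second column dominates first column

$$A = \begin{pmatrix} 5 \\ 3 \end{pmatrix} \qquad B = \begin{pmatrix} 3 \\ 1 \end{pmatrix}$$

3. First row dominates third row. Thus the rationalised behaviour is $(r_1, c_2)$.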
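
The elimination steps above can be sketched in a few lines of numpy. This is only an illustrative sketch and not part of the course code: the function name `eliminate_strictly_dominated` is hypothetical, it only looks for strict domination by pure strategies, and it uses 0-based indices.

```python
import numpy as np


def eliminate_strictly_dominated(A, B):
    """Iteratively remove pure strategies strictly dominated by another pure strategy.

    Returns the indices (0-based) of the surviving rows and columns.
    """
    A, B = np.asarray(A), np.asarray(B)
    rows = list(range(A.shape[0]))
    cols = list(range(A.shape[1]))
    changed = True
    while changed:
        changed = False
        # Row player: compare rows using A on the surviving columns
        for r in rows:
            if any(all(A[other, c] > A[r, c] for c in cols)
                   for other in rows if other != r):
                rows.remove(r)
                changed = True
                break
        # Column player: compare columns using B on the surviving rows
        for c in cols:
            if any(all(B[r, other] > B[r, c] for r in rows)
                   for other in cols if other != c):
                cols.remove(c)
                changed = True
                break
    return rows, cols


A = np.array([[2, 5], [1, 2], [7, 3]])
B = np.array([[0, 3], [6, 1], [0, 1]])
surviving = eliminate_strictly_dominated(A, B)
```

For the example worked through as a class, `surviving` is `([0], [1])`, which is the rationalised behaviour $(r_1, c_2)$.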

Now return to the last example played as a pair:

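$$A = \begin{pmatrix} -1 & 2 & 1 \\ -2 & -2 & 1 \\ 1 & 1 & -1 \end{pmatrix} \qquad B = \begin{pmatrix} 1 & -2 & -1 \\ 2 & 2 & -1 \\ -1 & -1 & 1 \end{pmatrix}$$

1. The first row/column weakly dominate the second row/column:

$$A = \begin{pmatrix} -1 & 1 \\ 1 & -1 \end{pmatrix} \qquad B = \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}$$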

There is nothing further that we can do here.



2.4 03 Best responses and Nash equilibrium

2.4.1 Corresponding chapters


• Best responses


2.4.2 Objectives

• Define best responses

• Identify best responses in pure strategies

• Identify best responses against mixed strategies

• Theorem: best response condition

• Definition of Nash equilibria

2.4.3 Notes

Discuss best response in pure strategies.

Best response against mixed strategies

To introduce best responses against mixed strategies, have students play against a mixed strategy:

>>> import random
>>> random.seed(0)  # Don't seed in class
>>> ["D", "C"][random.random() < 0.3]  # 30% chance of cooperating
'D'

Discuss the definition of a best response. Identify best responses for the game considered:

$$A = \begin{pmatrix} 2 & -2 \\ -1 & 1 \end{pmatrix} \qquad B = \begin{pmatrix} -2 & 2 \\ 1 & -1 \end{pmatrix}$$

Consider the best responses against a mixed strategy:

• Assume $\sigma_r = (x, 1 - x)$

• Assume $\sigma_c = (y, 1 - y)$

We have:

$$A\sigma_c^T = \begin{pmatrix} 4y - 2 \\ 1 - 2y \end{pmatrix} \qquad \sigma_r B = \begin{pmatrix} 1 - 3x & 3x - 1 \end{pmatrix}$$

Here is the code to do this calculation with sympy:

>>> import sympy as sym
>>> import numpy as np
>>> x, y = sym.symbols('x, y')
>>> A = sym.Matrix([[2, -2], [-1, 1]])
>>> B = - A
>>> sigma_r = sym.Matrix([[x, 1-x]])
>>> sigma_c = sym.Matrix([y, 1-y])
>>> A * sigma_c, sigma_r * B
(Matrix([
[ 4*y - 2],
[-2*y + 1]]), Matrix([[-3*x + 1, 3*x - 1]]))

Plot these two things:

>>> import matplotlib.pyplot as plt
>>> ys = [0, 1]
>>> row_us = [[(A * sigma_c)[i].subs({y: val}) for val in ys] for i in range(2)]
>>> plt.plot(ys, row_us[0], label=r"$(A\sigma_c^T)_1$")
[<matplotlib...>]
>>> plt.plot(ys, row_us[1], label=r"$(A\sigma_c^T)_2$")
[<matplotlib...>]
>>> plt.xlabel(r"$\sigma_c=(y, 1-y)$")
>>> plt.title("Utility to player 1")
>>> plt.legend();

>>> xs = [0, 1]
>>> col_us = [[(sigma_r * B)[j].subs({x: val}) for val in xs] for j in range(2)]
>>> plt.plot(xs, col_us[0], label=r"$(\sigma_rB)_1$")
[<matplotlib...>]
>>> plt.plot(xs, col_us[1], label=r"$(\sigma_rB)_2$")
[<matplotlib...>]
>>> plt.xlabel(r"$\sigma_r=(x, 1-x)$")
>>> plt.title("Utility to column player")
>>> plt.legend();

Conclude:

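$$\sigma_r^* = \begin{cases} (1, 0), & \text{if } y > 1/2 \\ (0, 1), & \text{if } y < 1/2 \\ \text{indifferent}, & \text{if } y = 1/2 \end{cases}$$

$$\sigma_c^* = \begin{cases} (0, 1), & \text{if } x > 1/3 \\ (1, 0), & \text{if } x < 1/3 \\ \text{indifferent}, & \text{if } x = 1/3 \end{cases}$$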

Some examples:

• If 𝜎𝑟 = (2/3, 1/3) then 𝜎*𝑐 = (0, 1).

• If 𝜎𝑟 = (1/3, 2/3) then any strategy is a best response.

Discuss best response condition theorem and proof.

This gives a finite condition that needs to be checked. To find the best response against $\sigma_c$ we would potentially need to check infinitely many alternatives to $\sigma_r^*$. Now we simply need to check the values of the pure strategies against $\sigma_c$:

• Either there will be a single pure best response;

• There will be multiple pure strategies for which the row player is indifferent.

Return to the previous example: if $\sigma_r = (1/3, 2/3)$ then $\sigma_r B = (0, 0)$, thus $(\sigma_r B)_j = 0$ for all $j$.

$(\sigma_r, \sigma_c) = ((1/3, 2/3), (1/2, 1/2))$ is a pair of best responses.

Discuss definition of Nash equilibria.

Explain how the best response condition theorem can be used to find NE.



• All possible supports (strategies that are played with positive probabilities) can be checked.

• All pure strategies must have maximum and equal payoff.
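
The best response condition can be checked numerically. The following is a minimal sketch, assuming the matrices of the example above; the helper name `is_best_response` and the tolerance `tol` are hypothetical and not part of nashpy.

```python
import numpy as np


def is_best_response(payoffs, strategy, tol=1e-9):
    """Best response condition: every pure strategy played with positive
    probability must achieve the maximum payoff against the opponent."""
    payoffs = np.asarray(payoffs, dtype=float)
    support = np.asarray(strategy) > tol
    return bool(np.all(payoffs[support] >= payoffs.max() - tol))


A = np.array([[2, -2], [-1, 1]])
B = -A
sigma_r, sigma_c = np.array([1 / 3, 2 / 3]), np.array([1 / 2, 1 / 2])

# Payoffs of each pure strategy against the opponent's mixed strategy
row_ok = is_best_response(A @ sigma_c, sigma_r)
col_ok = is_best_response(sigma_r @ B, sigma_c)
```

Both checks hold for this pair, so under these assumptions the pair is a Nash equilibrium. Only the pure strategy payoffs are inspected, which is the finite check the theorem provides.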

2.5 04 Support enumeration

2.5.1 Corresponding chapters


• Support enumeration


2.5.2 Objectives

• Define the support of a strategy

• Define nondegenerate games

• Describe and use support enumeration

2.5.3 Notes

Play a knockout RPSLS tournament

Tell students to create groups of 4 or 8 and play a knockout RPSLS tournament.

Explain rules and show:




Following the tournaments, have a discussion about how the winners won, etc.

Explain that we will study this game using Game Theory:

$$A = \begin{pmatrix} 0 & -1 & 1 & 1 & -1 \\ 1 & 0 & -1 & -1 & 1 \\ -1 & 1 & 0 & 1 & -1 \\ -1 & 1 & -1 & 0 & 1 \\ 1 & -1 & 1 & -1 & 0 \end{pmatrix}$$

Then go over chapter 5:

• Define support of a strategy (relate to tournament previously played). For example the strategy $(0, 1/2, 0, 0, 1/2)$ has support $\{2, 5\}$:

>>> import numpy as np
>>> sigma = np.array([0, 1/2, 0, 0, 1/2])
>>> np.where(sigma > 0)
(array([1, 4]),)

• Discuss non degenerate games and show how RPSLS is in fact a degenerate game:

>>> A = np.array([[0, -1, 1, 1, -1],
...               [1, 0, -1, -1, 1],
...               [-1, 1, 0, 1, -1],
...               [-1, 1, -1, 0, 1],
...               [1, -1, 1, -1, 0]])
>>> sigma_c = np.array([1, 0, 0, 0, 0])
>>> np.dot(A, sigma_c)  # Two elements give the max
array([ 0,  1, -1, -1,  1])

• Discuss support enumeration algorithm. Theoretically could be used for RPSLS but would need to consider supports of unequal size. For the purpose of the illustration consider RPS:

$$A = \begin{pmatrix} 0 & -1 & 1 \\ 1 & 0 & -1 \\ -1 & 1 & 0 \end{pmatrix}$$

Apply the algorithm.

1. We see there are no pairs of pure strategies that are best responses to each other, so consider $k \in \{2, 3\}$.

2. We have the following set of utilities to consider:

   1. $I = \{1, 2\}$
      1. $J = \{1, 2\}$
      2. $J = \{1, 3\}$
      3. $J = \{2, 3\}$
   2. $I = \{1, 3\}$
      1. $J = \{1, 2\}$
      2. $J = \{1, 3\}$
      3. $J = \{2, 3\}$
   3. $I = \{2, 3\}$
      1. $J = \{1, 2\}$
      2. $J = \{1, 3\}$
      3. $J = \{2, 3\}$
   4. $I = J = \{1, 2, 3\}$

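3. Now we consider some (not all, as they are mainly the same) of the corresponding linear equations.

   1. $I = \{1, 2\}$

      1. $J = \{1, 2\}$

         $$0\sigma_{r1} - \sigma_{r2} = \sigma_{r1} + 0\sigma_{r2}$$

         $$-\sigma_{r2} = \sigma_{r1}$$

         $$0\sigma_{c1} - \sigma_{c2} = \sigma_{c1} + 0\sigma_{c2}$$

         $$\sigma_{c1} = -\sigma_{c2}$$

      2. $J = \{1, 3\}$

         $$0\sigma_{r1} - \sigma_{r2} = -\sigma_{r1} + \sigma_{r2}$$

         $$\sigma_{r1} = 2\sigma_{r2}$$

         $$0\sigma_{c1} + \sigma_{c3} = \sigma_{c1} - \sigma_{c3}$$

         $$\sigma_{c1} = 2\sigma_{c3}$$

      3. $J = \{2, 3\}$

         $$\sigma_{r1} + 0\sigma_{r2} = -\sigma_{r1} + \sigma_{r2}$$

         $$2\sigma_{r1} = \sigma_{r2}$$

         $$-\sigma_{c2} + \sigma_{c3} = 0\sigma_{c2} - \sigma_{c3}$$

         $$\sigma_{c2} = 2\sigma_{c3}$$

   2. $I = \{1, 3\}$ Similarly.

   4. $I = J = \{1, 2, 3\}$

      In this case we have:

      $$-\sigma_{r2} + \sigma_{r3} = \sigma_{r1} - \sigma_{r3}$$

      $$\sigma_{r1} - \sigma_{r3} = -\sigma_{r1} + \sigma_{r2}$$

      which has solution:

      $$\sigma_{r1} = \sigma_{r2} = \sigma_{r3}$$

      Similarly:

      $$-\sigma_{c2} + \sigma_{c3} = \sigma_{c1} - \sigma_{c3}$$

      $$\sigma_{c1} - \sigma_{c3} = -\sigma_{c1} + \sigma_{c2}$$

      which has solution:

      $$\sigma_{c1} = \sigma_{c2} = \sigma_{c3}$$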



4. Now we consider which of those supports give valid mixed strategies:

   1. $I = \{1, 2\}$

      1. $J = \{1, 2\}$

         $$\sigma_r = (k, -k, 0)$$

         which is not possible

      2. $J = \{1, 3\}$

         $$\sigma_r = (2/3, 1/3, 0)$$

         $$\sigma_c = (2/3, 0, 1/3)$$

      3. $J = \{2, 3\}$

         $$\sigma_r = (1/3, 2/3, 0)$$

         $$\sigma_c = (0, 2/3, 1/3)$$

   4. $I = J = \{1, 2, 3\}$

      $$\sigma_r = (1/3, 1/3, 1/3)$$

      $$\sigma_c = (1/3, 1/3, 1/3)$$

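5. The final step is to check the best response condition:

   1. $I = \{1, 2\}$

      2. $J = \{1, 3\}$

         $$A\sigma_c^T = \begin{pmatrix} 1/3 \\ 1/3 \\ -2/3 \end{pmatrix}$$

         Thus $\sigma_r$ is a best response to $\sigma_c$.

         $$\sigma_r B = (-1/3, 2/3, -1/3)$$

         Thus $\sigma_c$ is not a best response to $\sigma_r$.

      3. $J = \{2, 3\}$

         $$A\sigma_c^T = \begin{pmatrix} -1/3 \\ -1/3 \\ 2/3 \end{pmatrix}$$

         Thus $\sigma_r$ is not a best response to $\sigma_c$.

         $$\sigma_r B = (-2/3, 1/3, 1/3)$$

         Thus $\sigma_c$ is a best response to $\sigma_r$.

   4. $I = J = \{1, 2, 3\}$

      $$A\sigma_c^T = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$$

      $$\sigma_r B = (0, 0, 0)$$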
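
The indifference equations for a single pair of supports can also be solved numerically. This is a rough sketch rather than how nashpy implements support enumeration: the helper `solve_indifference` is hypothetical, it assumes both supports have the same size (so the linear system is square), and indices are 0-based.

```python
import numpy as np

A = np.array([[0, -1, 1], [1, 0, -1], [-1, 1, 0]])  # Rock Paper Scissors
B = -A


def solve_indifference(M, support, opponent_support):
    """Probabilities on `support` (rows of M) that make the opponent
    indifferent between the actions in `opponent_support` (columns of M).

    Returns None when no valid probability vector exists.
    """
    sub = M[np.ix_(support, opponent_support)].T
    # Equal payoff for all opponent actions, plus probabilities summing to 1
    equations = np.vstack([sub[0] - sub[1:], np.ones(len(support))])
    rhs = np.zeros(len(support))
    rhs[-1] = 1
    try:
        probs = np.linalg.solve(equations, rhs)
    except np.linalg.LinAlgError:
        return None
    if np.any(probs < -1e-12):
        return None
    sigma = np.zeros(M.shape[0])
    sigma[support] = probs
    return sigma


I, J = [0, 1], [0, 2]  # the pair I = {1, 2}, J = {1, 3}
sigma_r = solve_indifference(B, I, J)
sigma_c = solve_indifference(A.T, J, I)
row_payoffs, col_payoffs = A @ sigma_c, sigma_r @ B
```

Here `sigma_r` and `sigma_c` agree with step 4, and `row_payoffs` and `col_payoffs` are the vectors checked in step 5. For $I = J = \{1, 2\}$ the system is singular, so the sketch returns `None`.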


We can confirm all of this using nashpy:

>>> import nashpy as nash
>>> A = np.array([[0, -1, 1],
...               [1, 0, -1],
...               [-1, 1, 0]])
>>> rps = nash.Game(A)
>>> list(rps.support_enumeration())
[(array([0.333..., 0.333..., 0.333...]), array([0.333..., 0.333..., 0.333...]))]

Note that it can be computationally expensive to find all equilibria; however, nashpy can be used to find a single Nash equilibrium by taking the first one:

>>> next(rps.support_enumeration())
(array([0.333..., 0.333..., 0.333...]), array([0.333..., 0.333..., 0.333...]))

Discuss Nash’s theorem briefly. Highlight how that can seem contradictory for the output of nashpy (using support enumeration) for the degenerate game of the notes. However, that won’t always be the case:

>>> A = np.array([[0, -1, 1, 1, -1],
...               [1, 0, -1, -1, 1],
...               [-1, 1, 0, 1, -1],
...               [-1, 1, -1, 0, 1],
...               [1, -1, 1, -1, 0]])
>>> rpsls = nash.Game(A)
>>> list(rpsls.support_enumeration())
[(array([0.2, 0.2, 0.2, 0.2, 0.2]), array([0.2, 0.2, 0.2, 0.2, 0.2]))]

Some details about the proof:

• Proved in a 19 page thesis! (2 pages of appendices)

• Nobel prize for economics

• Watch A Beautiful Mind

2.6 05 Best response polytopes

2.6.1 Corresponding chapters


• Best response polytopes





2.6.2 Objectives

• Define the best response polytopes (using both definitions: convex hull and halfspaces).

• Labelling of vertices

• Equivalence between equilibrium and fully labeled pair

• Vertex enumeration

2.6.3 Notes

Defining best response polytopes

Discuss what an average is. Place this in the context of the line between two points. Weighted averages of two points are just a line. So an average is just a probability distribution. Expand this notion to the average over two points. How would you take the weighted average between three points $x, y, z$:

𝜆1𝑥+ 𝜆2𝑦 + 𝜆3𝑧

This corresponds to something like:



Explain how another definition for that space would be to draw inequalities on our variables. Give definition of best response polytopes for a game. Illustrate this with the battle of the sexes game (scaled):

$$A = \begin{pmatrix} 3 & 1 \\ 0 & 2 \end{pmatrix} \qquad B = \begin{pmatrix} 2 & 1 \\ 0 & 3 \end{pmatrix}$$

Go over the definition of the best response polytopes.

The row best response polytope is given by:

$$\mathcal{P} = \{x \in \mathbb{R}^m \mid x \geq 0; xB \leq 1\}$$

This corresponds to the following inequalities:

• 𝑥1 ≥ 0

• 𝑥2 ≥ 0

• 2𝑥1 ≤ 1

• 𝑥1 + 3𝑥2 ≤ 1

From this we can identify the vertices:

• (𝑥1, 𝑥2) = (0, 0)

• (𝑥1, 𝑥2) = (1/2, 0)

• (𝑥1, 𝑥2) = (0, 1/3)

• (𝑥1, 𝑥2) = (1/2, 1/6)

This can be confirmed using nashpy:

>>> import numpy as np
>>> import nashpy as nash
>>> B = np.array([[2, 1], [0, 3]])
>>> halfspaces = nash.polytope.build_halfspaces(B.transpose())
>>> for v, l in nash.polytope.non_trivial_vertices(halfspaces):
...     print(v)
[0.5 0. ]
[0. 0.333...]
[0.5 0.166...]

The column best response polytope is given by:

$$\mathcal{Q} = \{y \in \mathbb{R}^n \mid Ay \leq 1; y \geq 0\}$$

This corresponds to the following inequalities:

• 3𝑦1 + 𝑦2 ≤ 1

• 2𝑦2 ≤ 1

• 𝑦1 ≥ 0

• 𝑦2 ≥ 0

From this we can identify the vertices:

• (𝑦1, 𝑦2) = (0, 0)

• (𝑦1, 𝑦2) = (1/3, 0)

• (𝑦1, 𝑦2) = (0, 1/2)



• (𝑦1, 𝑦2) = (1/6, 1/2)

Confirmed:

>>> import numpy as np
>>> A = np.array([[3, 1], [0, 2]])
>>> halfspaces = nash.polytope.build_halfspaces(A)
>>> for v, l in nash.polytope.non_trivial_vertices(halfspaces):
...     print(v)
[0.333... 0. ]
[0. 0.5]
[0.1666... 0.5 ]

Pair activity

Ask everyone to draw these two polytopes.

Now describe how we label the vertices: using the same ordering as the inequalities (starting at 0), a vertex has the label corresponding to an inequality if that inequality is binding (holds with equality) at the vertex.
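
As a rough illustration of this labelling, the sketch below writes the four inequalities of the row polytope in the form $Mx \leq b$, in the order they are listed above, and reports which ones are binding at a vertex. The names `M`, `b` and `labels` are hypothetical and this is not how nashpy computes labels.

```python
import numpy as np

# Inequalities of the row best response polytope, written as M x <= b
M = np.array([[-1, 0],   # label 0: x_1 >= 0
              [0, -1],   # label 1: x_2 >= 0
              [2, 0],    # label 2: 2 x_1 <= 1
              [1, 3]])   # label 3: x_1 + 3 x_2 <= 1
b = np.array([0, 0, 1, 1])


def labels(vertex, M=M, b=b, tol=1e-9):
    """Return the set of labels (binding inequalities) of a vertex."""
    slack = b - M @ np.asarray(vertex, dtype=float)
    return {int(i) for i in np.flatnonzero(np.abs(slack) < tol)}


vertex_labels = {v: labels(v) for v in [(0, 0), (1 / 2, 0), (0, 1 / 3), (1 / 2, 1 / 6)]}
```

The same function works for the column polytope once `M` and `b` are replaced by its inequalities, again in the order listed.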

𝒫:

𝒬:



Explain that what these polytopes represent is the scaled strategies when players’ maximum utilities are 1. So, given the action of an opponent, if the player’s utility is 1 they are playing a best response.

Ask each person in a pair to be the row or the column player.

Have the row player pick a vertex, then identify the strategy and its best response and then pick the corresponding vertex.

For example:

Row player picks $(0, 1/3)$ which has labels $\{0, 3\}$ and corresponds to the strategy $(0, 1)$. The best response to the second row is the second column which corresponds to vertex $(0, 1/2)$ which has labels $\{1, 2\}$.

Remind ourselves why these are the labels? (Refer to the defining properties of the polytopes).

Repeat with other vertices.

Aim to identify strategies that are best responses to each other:

• $\sigma_r = (0, 1)$ and $\sigma_c = (0, 1)$

• $\sigma_r = (3/4, 1/4)$ and $\sigma_c = (1/4, 3/4)$

• $\sigma_r = (1, 0)$ and $\sigma_c = (1, 0)$

Discuss how this relates to the labels again.

Finally show how this is implemented in nashpy:

>>> A = np.array([[1, -1], [-1, 1]])
>>> matching_pennies = nash.Game(A)
>>> for eq in matching_pennies.vertex_enumeration():
...     print(eq)
(array([0.5, 0.5]), array([0.5, 0.5]))

If there is time use support enumeration to compare.

2.7 06 Lemke Howson Algorithm

2.7.1 Corresponding chapters


• The Lemke Howson algorithm

2.7.2 Objectives

• Describe the Lemke Howson algorithm graphically

• Describe integer pivoting


2.7.3 Notes

Describe the Lemke Howson algorithm

Using the polytopes for the battle of the sexes game (from the best response polytopes meeting):

𝒫:

𝒬:




illustrate the Lemke Howson algorithm:

• ((0, 0), (0, 0)) have labels {0, 1}, {2, 3}. Drop 0 in 𝒫 .

• → ((1/2, 0), (0, 0)) have labels {1, 2}, {2, 3}. Drop 2 in 𝒬.

• → ((1/2, 0), (1/3, 0)) have labels {1, 2}, {0, 3}. We have an equilibrium.

Try varying other initial labels. Is it possible to arrive at the mixed Nash equilibrium?

Work through integer pivoting

Consider Rock Paper Scissors:

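$$A = \begin{pmatrix} 0 & -1 & 1 \\ 1 & 0 & -1 \\ -1 & 1 & 0 \end{pmatrix} \qquad B = \begin{pmatrix} 0 & 1 & -1 \\ -1 & 0 & 1 \\ 1 & -1 & 0 \end{pmatrix}$$

We see that drawing this would be a pain. Let us build the tableaux, but let us first scale our payoffs so that all variables are positive (let us add 2 to all the utilities):

$$A = \begin{pmatrix} 2 & 1 & 3 \\ 3 & 2 & 1 \\ 1 & 3 & 2 \end{pmatrix} \qquad B = \begin{pmatrix} 2 & 3 & 1 \\ 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix}$$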



This gives the following tableaux:

$$T_r = \begin{pmatrix} 2 & 1 & 3 & 1 & 0 & 0 & 1 \\ 3 & 2 & 1 & 0 & 1 & 0 & 1 \\ 1 & 3 & 2 & 0 & 0 & 1 & 1 \end{pmatrix}$$

Following along in numpy:

>>> import numpy as np
>>> row_tableau = np.array([[2, 1, 3, 1, 0, 0, 1],
...                         [3, 2, 1, 0, 1, 0, 1],
...                         [1, 3, 2, 0, 0, 1, 1]])

and:

$$T_c = \begin{pmatrix} 1 & 0 & 0 & 2 & 1 & 3 & 1 \\ 0 & 1 & 0 & 3 & 2 & 1 & 1 \\ 0 & 0 & 1 & 1 & 3 & 2 & 1 \end{pmatrix}$$

Following along in numpy:

>>> col_tableau = np.array([[1, 0, 0, 2, 1, 3, 1],
...                         [0, 1, 0, 3, 2, 1, 1],
...                         [0, 0, 1, 1, 3, 2, 1]])

The labels of 𝑇𝑟 are {0, 1, 2} and the labels of 𝑇𝑐 are {3, 4, 5}.

Let us drop label 0, so we pivot the first column of $T_r$. The minimum ratio test gives: 1/3 < 1/2 < 1/1 so we pivot on the second row giving:

$$T_r = \begin{pmatrix} 0 & -1 & 7 & 3 & -2 & 0 & 1 \\ 3 & 2 & 1 & 0 & 1 & 0 & 1 \\ 0 & 7 & 5 & 0 & -1 & 3 & 2 \end{pmatrix}$$

Code:

>>> import nashpy as nash
>>> dropped_label = nash.integer_pivoting.pivot_tableau(row_tableau,
...                                                     column_index=0)
>>> row_tableau
array([[ 0, -1,  7,  3, -2,  0,  1],
       [ 3,  2,  1,  0,  1,  0,  1],
       [ 0,  7,  5,  0, -1,  3,  2]])

This has labels {1, 2, 4} so we need to drop label 4 by pivoting the fifth column of $T_c$. The minimum ratio test implies that we pivot on the third row.

$$T_c = \begin{pmatrix} 3 & 0 & -1 & 5 & 0 & 7 & 2 \\ 0 & 3 & -2 & 7 & 0 & -1 & 1 \\ 0 & 0 & 1 & 1 & 3 & 2 & 1 \end{pmatrix}$$

Code:

>>> dropped_label = nash.integer_pivoting.pivot_tableau(col_tableau,
...                                                     column_index=4)
>>> col_tableau
array([[ 3,  0, -1,  5,  0,  7,  2],
       [ 0,  3, -2,  7,  0, -1,  1],
       [ 0,  0,  1,  1,  3,  2,  1]])



This has labels {2, 3, 5} so we need to drop 2 by pivoting the third column of $T_r$. The minimum ratio test implies that we pivot on the first row:

$$T_r = \begin{pmatrix} 0 & -1 & 7 & 3 & -2 & 0 & 1 \\ 21 & 15 & 0 & -3 & 9 & 0 & 6 \\ 0 & 54 & 0 & -15 & 3 & 21 & 9 \end{pmatrix}$$

Code:

>>> dropped_label = nash.integer_pivoting.pivot_tableau(row_tableau,
...                                                     column_index=2)
>>> row_tableau
array([[  0,  -1,   7,   3,  -2,   0,   1],
       [ 21,  15,   0,  -3,   9,   0,   6],
       [  0,  54,   0, -15,   3,  21,   9]])

This has labels {1, 3, 4} so we need to drop 3 by pivoting the fourth column of $T_c$. The minimum ratio test implies that we pivot on the second row:

$$T_c = \begin{pmatrix} 21 & -15 & 3 & 0 & 0 & 54 & 9 \\ 0 & 3 & -2 & 7 & 0 & -1 & 1 \\ 0 & -3 & 9 & 0 & 21 & 15 & 6 \end{pmatrix}$$

Code:

>>> dropped_label = nash.integer_pivoting.pivot_tableau(col_tableau,
...                                                     column_index=3)
>>> col_tableau
array([[ 21, -15,   3,   0,   0,  54,   9],
       [  0,   3,  -2,   7,   0,  -1,   1],
       [  0,  -3,   9,   0,  21,  15,   6]])

This has labels {1, 2, 5} so we need to drop 1 by pivoting the second column of $T_r$. The minimum ratio test implies that we pivot on the third row:

$$T_r = \begin{pmatrix} 0 & 0 & 378 & 147 & -105 & 21 & 63 \\ 1134 & 0 & 0 & 63 & 441 & -315 & 189 \\ 0 & 54 & 0 & -15 & 3 & 21 & 9 \end{pmatrix}$$

Code:

>>> dropped_label = nash.integer_pivoting.pivot_tableau(row_tableau,
...                                                     column_index=1)
>>> row_tableau
array([[   0,    0,  378,  147, -105,   21,   63],
       [1134,    0,    0,   63,  441, -315,  189],
       [   0,   54,    0,  -15,    3,   21,    9]])

This has labels {3, 4, 5} so we need to drop 5 by pivoting the sixth column of $T_c$. The minimum ratio test implies that we pivot on the first row:

$$T_c = \begin{pmatrix} 21 & -15 & 3 & 0 & 0 & 54 & 9 \\ 21 & 147 & -105 & 378 & 0 & 0 & 63 \\ -315 & 63 & 441 & 0 & 1134 & 0 & 189 \end{pmatrix}$$

Code:

>>> dropped_label = nash.integer_pivoting.pivot_tableau(col_tableau,
...                                                     column_index=5)
>>> col_tableau
array([[  21,  -15,    3,    0,    0,   54,    9],
       [  21,  147, -105,  378,    0,    0,   63],
       [-315,   63,  441,    0, 1134,    0,  189]])

This has labels {0, 1, 2} so we have a Nash equilibrium with vertices:

((189/1134, 9/54, 63/378), (63/378, 189/1134, 9/54)) = ((1/6, 1/6, 1/6), (1/6, 1/6, 1/6))

Which in turn corresponds to the expected equilibrium:

((1/3, 1/3, 1/3), (1/3, 1/3, 1/3))

• Mention how different pivoting methods are fine; doing it this way ensures everything is an integer, which is also more efficient for a computer.
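
To make the arithmetic of a single pivot explicit, here is a plain numpy sketch. It is an illustration of the row operations used in the tableaux above, not nashpy's implementation; the function name `integer_pivot` is hypothetical and, unlike the nashpy call, it returns a new tableau instead of modifying its argument.

```python
import numpy as np


def integer_pivot(tableau, column_index):
    """Return a new tableau after an integer pivot on the given column."""
    tableau = np.array(tableau, dtype=int)
    column = tableau[:, column_index]
    rhs = tableau[:, -1]
    # Minimum ratio test over rows with a positive entry in the pivot column
    candidates = [i for i in range(len(column)) if column[i] > 0]
    pivot_row = min(candidates, key=lambda i: rhs[i] / column[i])
    pivot = tableau[pivot_row, column_index]
    # Clear the pivot column in every other row using integers only
    for i in range(tableau.shape[0]):
        if i != pivot_row:
            tableau[i] = (pivot * tableau[i]
                          - tableau[i, column_index] * tableau[pivot_row])
    return tableau


row_tableau = [[2, 1, 3, 1, 0, 0, 1],
               [3, 2, 1, 0, 1, 0, 1],
               [1, 3, 2, 0, 0, 1, 1]]
first = integer_pivot(row_tableau, column_index=0)
second = integer_pivot(first, column_index=2)
```

Under these assumptions `first` and `second` reproduce the first two row tableaux obtained above. Multiplying rows by the pivot, rather than dividing, is what keeps every entry an integer.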

2.8 07 Repeated games

2.8.1 Corresponding chapters


• Repeated games

2.8.2 Objectives

• Define repeated games and strategies in repeated games

• Understand proof of theorem for sequence of stage Nash

• Understand that other equilibria exist

2.8.3 Notes

Playing repeated games in pairs

• Explain that we are about to play a game twice.

• Explain that this has to be done SILENTLY

In groups we are going to play:

$$A = \begin{pmatrix} 12 & 6 \\ 0 & 24 \\ 12 & 23 \end{pmatrix} \qquad B = \begin{pmatrix} 2 & 1 \\ 5 & 4 \\ 0 & 0 \end{pmatrix}$$

In pairs:

• Decide on row/column player (recall you don’t care about your opponent’s reward).

• We are going to play the game TWICE and write down both players’ cumulative scores.

• Define a strategy and ask players to write down a strategy that must describe what they do in both stages by answering the following questions:

  – What should the player do in the first stage?

  – What should the player do in the second stage given knowledge of what both players did in the first period?

• SILENTLY, after having written down a strategy: show each other your strategies and SILENTLY agree on the pair of utilities. If you are unable to agree on a utility this indicates that the strategies were not descriptive enough. SILENTLY start again :)

As a challenge: repeat this (so repeatedly play a repeated game, repeatedly write down a new strategy) and make a note when you arrive at an equilibrium (where no one has a reason to write a different strategy down).

If anyone arrives at an equilibrium where the row player scores more than 24 and the column player more than 4, stand up as a pair.

Following this, assuming a pair has arrived at such an equilibrium discuss this.

Then work through the notes:

• Definition of a repeated game;

• Definition of a strategy;

• Theorem of sequence of stage Nash (relate this back to the utilities of our game):

– (𝑟1𝑟1, 𝑐1𝑐1) with utility: (24, 4).




Now discuss the potential of a different equilibrium:

1. For the row player:

(∅, ∅) → 𝑟2

(𝑟2, 𝑐1) → 𝑟3

(𝑟2, 𝑐2) → 𝑟1

2. For the column player:

(∅, ∅) → 𝑐2

(𝑟1, 𝑐2) → 𝑐1

(𝑟2, 𝑐2) → 𝑐1

(𝑟3, 𝑐2) → 𝑐1

This corresponds to the following scenario:

> Play $(r_2, c_2)$ in first stage and $(r_1, c_1)$ in second stage unless the column player does not cooperate in which case play $(r_3, c_1)$.

This gives a utility of (36, 6). Is this an equilibrium?

1. If the row player deviates, they would do so in the first round and gain no utility.

2. If the column player deviates, they would only be rational to do so in the first stage; if they did they would gain 1 but lose 2 in the second round.

Thus this is a Nash equilibrium.

• Discuss how to identify such an equilibrium: which player has an incentive to build a reputation? (The column player wants to prove to be trustworthy to gain 2 in the final round.)

• Mention how this shows how game theory studies the emergence of unexpected behaviour.
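
The deviation argument can be checked by brute force. The sketch below is only illustrative: the function names are hypothetical, indices are 0-based, and the row player's reaction to histories the strategy above does not list is assumed to be $r_3$ (this assumption is never used on the paths that are checked).

```python
import numpy as np

A = np.array([[12, 6], [0, 24], [12, 23]])  # row player's stage utilities
B = np.array([[2, 1], [5, 4], [0, 0]])      # column player's stage utilities


def row_second(first_row, first_col):
    """Row player's second stage action: r1 after (r2, c2), otherwise r3."""
    return 0 if (first_row, first_col) == (1, 1) else 2


def col_second(first_row, first_col):
    """Column player's second stage action: always c1."""
    return 0


first = (1, 1)  # (r2, c2) in the first stage
second = (row_second(*first), col_second(*first))
utilities = (int(A[first] + A[second]), int(B[first] + B[second]))

# Best total utility each player can reach by deviating unilaterally
best_row = max(int(A[a, 1] + A[a2, col_second(a, 1)])
               for a in range(3) for a2 in range(3))
best_col = max(int(B[1, b] + B[row_second(1, b), b2])
               for b in range(2) for b2 in range(2))
```

Neither `best_row` nor `best_col` exceeds the corresponding entry of `utilities`, which is the statement that no player gains by deviating.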



2.9 08 Prisoners Dilemma

2.9.1 Corresponding chapters


• Prisoners Dilemma

2.9.2 Objectives

• Explore the Iterated Prisoners Dilemma

2.9.3 Notes

Playing team based Iterated Prisoner’s Dilemma

Show this video: https://www.youtube.com/watch?v=p3Uos2fzIJ0

Ask the class to form into groups of approximately 8: 4 teams of 2.

Write the following on the board:

Name | Score vs 1st opp | ∑ score vs 2nd opp | ∑ score vs 3rd opp

Explain that teams will play the iterated Prisoners dilemma:

$$A = \begin{pmatrix} 3 & 0 \\ 5 & 1 \end{pmatrix} \qquad B = \begin{pmatrix} 3 & 5 \\ 0 & 1 \end{pmatrix}$$

Discuss NE of stage game.

Discuss “Cooperation” versus “Defection” and explain that goal is to maximise overall score (not necessarily beat direct opponent).
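
When discussing the NE of the stage game, a quick numerical check can help. This is a minimal sketch that only searches for pure equilibria; the function name `pure_nash_equilibria` is hypothetical, and the ordering of actions (index 0 for cooperation, index 1 for defection) is an assumption based on the usual Prisoner's Dilemma payoffs.

```python
import numpy as np

A = np.array([[3, 0], [5, 1]])  # row player's utilities
B = np.array([[3, 5], [0, 1]])  # column player's utilities


def pure_nash_equilibria(A, B):
    """Return all pairs of pure strategies that are mutual best responses."""
    row_best = A == A.max(axis=0)                 # best rows against each column
    col_best = B == B.max(axis=1, keepdims=True)  # best columns against each row
    return [(int(r), int(c)) for r, c in zip(*np.nonzero(row_best & col_best))]


stage_equilibria = pure_nash_equilibria(A, B)
```

Under the assumed ordering, the only pair found is mutual defection, even though mutual cooperation gives both players a higher utility.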

For every duel write the following on the board (assuming 8 turns):

Name | Score | ∑ score | ∑ score | ∑ score | ∑ score | ∑ score | ∑ score | ∑ score

Instructions for a duel:

• Give all teams copies of a document to help record.

• Think about strategy.

• Before every stage invite both individuals to talk to each other.

• Get teams to “face away”, after a count down “show” (either a C or a D).

After the tournament:

• Discuss winning strategy and other interesting strategies. Discuss potential coalitions that arose.



• Discuss how teams had more information than usual.

• Invite the possibility of modifications (probabilistic ending, noise and lack of information).

Consider Reactive strategies

Go over theory.

In the case of $(p_1, p_2) = (1/4, 4/5)$ and $(q_1, q_2) = (2/5, 1/3)$ we have:

$$r_1 = -\frac{11}{20} \qquad r_2 = \frac{1}{15}$$

$$s_1 = \frac{185}{311} \qquad s_2 = \frac{116}{311}$$

The steady state probabilities are given by:

𝜋 = (21460/96721, 36075/96721, 14616/96721, 24570/96721)

Here is some sympy code to illustrate this:

>>> import sympy as sym
>>> p_1, p_2 = sym.S(1) / sym.S(4), sym.S(4) / sym.S(5)
>>> q_1, q_2 = sym.S(2) / sym.S(5), sym.S(1) / sym.S(3)
>>> r_1 = p_1 - p_2
>>> r_2 = q_1 - q_2
>>> r_1, r_2
(-11/20, 1/15)
>>> s_1 = (q_2 * r_1 + p_2) / (1 - r_1 * r_2)
>>> s_2 = (p_2 * r_2 + q_2) / (1 - r_1 * r_2)
>>> s_1, s_2
(185/311, 116/311)
>>> pi = sym.Matrix([[s_1 * s_2, s_1 * (1 - s_2), (1 - s_1) * s_2, (1 - s_1) * (1 - s_2)]])
>>> pi
Matrix([[21460/96721, 36075/96721, 14616/96721, 24570/96721]])

We can verify that this is a steady state vector:

$$M = \begin{pmatrix} 1/10 & 3/20 & 3/10 & 9/20 \\ 8/25 & 12/25 & 2/25 & 3/25 \\ 1/12 & 1/6 & 1/4 & 1/2 \\ 4/15 & 8/15 & 1/15 & 2/15 \end{pmatrix}$$

$$\pi M = (21460/96721, 36075/96721, 14616/96721, 24570/96721)$$

Sympy code:

>>> M = sym.Matrix([[p_1*q_1, p_1*(1-q_1), (1-p_1)*q_1, (1-p_1)*(1-q_1)],
...                 [p_2 * q_1, p_2 * (1-q_1), (1-p_2) * q_1, (1-p_2) * (1-q_1)],
...                 [p_1 * q_2, p_1 * (1-q_2), (1-p_1) * q_2, (1-p_1) * (1-q_2)],
...                 [p_2 * q_2, p_2 * (1-q_2), (1-p_2) * q_2, (1-p_2)*(1-q_2)]])
>>> M
Matrix([
[1/10, 3/20, 3/10, 9/20],
[8/25, 12/25, 2/25, 3/25],
[1/12, 1/6, 1/4, 1/2],
[4/15, 8/15, 1/15, 2/15]])
>>> pi * M
Matrix([[21460/96721, 36075/96721, 14616/96721, 24570/96721]])
>>> pi * M == pi
True

The utility is then given by:

3𝑠1𝑠2 + 0𝑠1(1− 𝑠2) + 5(1− 𝑠1)𝑠2 + (1− 𝑠1)(1− 𝑠2) = 162030/96721 ≈ 1.675

Sympy code:

>>> rstp = sym.Matrix([[sym.S(3), sym.S(0), sym.S(5), sym.S(1)]])
>>> score = pi.dot(rstp)
>>> score, float(score)
(162030/96721, 1.675...)

2.10 09 Evolutionary dynamics

2.10.1 Corresponding chapters


• Evolutionary dynamics




2.10.2 Objectives

• Go over basic differential models for population dynamics.

2.10.3 Notes

Explain difference from games so far:

• Players were discrete entities (and we only consider 2 player games);

• Now there are no players: just strategies in a population.

Our population will be standers or sitters.

Say we are going to use the following differential equation to denote the rate at which people stand:

$$\frac{dx}{dt} = 2x$$

Use seconds (artificially) as a time unit, $\frac{dx}{dt}$ corresponds to the speed with which people stand up. We will instead consider it as the rate at which people become standing up. Using a continuous population these two things are equivalent.

Get one student to stand up.

The speed at which other individuals are becoming standing up is at that exact point 2. Get someone to slowly stand up... until they become stood up, etc.

Note how we will run out of people very quickly as the solution to our equation is:

$$x(t) = x_0 e^{2t}$$

Show how we can use Python to solve this differential equation numerically:

>>> import numpy as np
>>> from scipy.integrate import odeint
>>> t = np.linspace(0, 10, 100)  # Obtain 100 time points
>>> def dx(x, t, a):
...     """Define the derivative of x"""
...     return a * x
>>> a = 10
>>> xs = odeint(func=dx, y0=1, t=t, args=(a,))
>>> xs
array([[1.000...

Now consider a population of two types of individuals (see notes) ($x$: standers and $y$: sitting on desk). What happens over time:

$$\frac{dx}{dt} = 2x$$

$$\frac{dy}{dt} = 3y$$

This means that we would very quickly run out of sitters and desk sitters, but what would happen to the limit of the ratio? (Go over mathematics in notes)
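
As a short sketch of that limit (the notes remain the reference; $x_0$ and $y_0$ are assumed positive initial sizes of the two populations), each equation can be solved separately, giving $x(t) = x_0 e^{2t}$ and $y(t) = y_0 e^{3t}$, so the share of standers is:

```latex
\frac{x(t)}{x(t) + y(t)}
  = \frac{x_0 e^{2t}}{x_0 e^{2t} + y_0 e^{3t}}
  = \frac{x_0}{x_0 + y_0 e^{t}}
  \xrightarrow[t \to \infty]{} 0
```

Both populations grow without bound, but the faster growing type eventually makes up the whole population.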

Now, finally consider a constant population:

Consider .5 population standing and .5 sitting.

What must we have for our rates?

One population increases at a rate of 2 and the other increases at a rate of 3.

So our population must also, somehow, decrease by 5.

Ask students to suggest how this can be done. Have a discussion.

Solution: share this between both populations:

• One population “increases” at a rate of 2 - 2.5 = -0.5

• One population “increases” at a rate of 3 - 2.5 = 0.5

Go over notes.

Discuss what happens at following starting conditions:

• x=1, y=0

• x=0, y=1

• x=0.5, y=0.5 but change the rates to be equal.

Show how we can obtain this numerically:

>>> def dxy(xy, t, a, b):
...     """
...     Define the derivative of x and y.
...     It takes `xy` as a vector
...     """
...     x, y = xy
...     phi = a * x + b * y
...     return x * (a - phi), y * (b - phi)
>>> a, b = 10, 5
>>> xys = odeint(func=dxy, y0=[.5, .5], t=t, args=(a, b))
>>> xys
array([[ 5.00000000e-01, 5.000...

Discuss how this basic notion of fitness will now be extended to models where the fitness comes out of a game.

2.11 10 Evolutionary game theory


• Evolutionary game theory

2.11.2 Objectives

• Go over the logical progression from dynamics to games;

• Explain ESS

2.11.3 Notes

Recall that there are no players: just strategies in a population.

Go over the notes, explaining the mathematics.

Now consider the Matching pennies game:

$$A = \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}$$




This gives:

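$$f_1 = x_1 - x_2 \qquad f_2 = -x_1 + x_2$$

which is equivalent to:

$$f = Ax = \begin{pmatrix} x_1 - x_2 \\ -x_1 + x_2 \end{pmatrix}$$

and so:

$$\phi = fx = x_1(x_1 - x_2) + x_2(x_2 - x_1) = (x_1 - x_2)^2$$

thus:

$$\frac{dx}{dt} = \begin{pmatrix} x_1((x_1 - x_2) - (x_1 - x_2)^2) \\ x_2((x_2 - x_1) - (x_1 - x_2)^2) \end{pmatrix}$$

$$\frac{dx}{dt} = \begin{pmatrix} x_1(x_1 - x_2)(1 - (x_1 - x_2)) \\ x_2(x_1 - x_2)(-1 - (x_1 - x_2)) \end{pmatrix}$$

however recall that $x_1 + x_2 = 1$:

$$\frac{dx_1}{dt} = x_1(2x_1 - 1) \cdot 2(1 - x_1)$$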

Verification of above calculations:

>>> import sympy as sym
>>> x_1, x_2 = sym.symbols("x_1, x_2")
>>> f_1 = x_1 - x_2
>>> f_2 = x_2 - x_1
>>> phi = x_1 * f_1 + x_2 * f_2
>>> dx_1 = x_1 * (f_1 - phi)
>>> sym.factor(dx_1.subs({x_1: 1 - x_2}))
2*x_2*(x_2 - 1)*(2*x_2 - 1)

We have the stationary points as expected:

1. 𝑥1 = 0

2. 𝑥1 = 1

3. 𝑥1 = 1/2
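A quick check of these three points, written as a minimal sketch with exact fractions: the helper name `replicator_rhs` is hypothetical, and it simply evaluates $x_1(f_1 - \phi)$ with $x_2 = 1 - x_1$ for the Matching pennies matrix above.

```python
from fractions import Fraction


def replicator_rhs(x_1):
    """Right hand side of dx_1/dt for Matching pennies, with x_2 = 1 - x_1."""
    x_2 = 1 - x_1
    f_1 = x_1 - x_2
    f_2 = x_2 - x_1
    phi = x_1 * f_1 + x_2 * f_2
    return x_1 * (f_1 - phi)


candidates = [Fraction(0), Fraction(1, 2), Fraction(1)]
derivatives = [replicator_rhs(x) for x in candidates]

# Compare with the factorised form at a point that is not stationary
x = Fraction(1, 4)
factorised = 2 * x * (2 * x - 1) * (1 - x)
```

The derivative vanishes at all three candidates, and away from them it agrees with the factorised expression.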

Relate this to the notes and discuss notion of mutated population and stable ESS.

Show output of the following:

>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> from scipy.integrate import odeint
>>> t = np.linspace(0, 10, 100)  # Obtain 100 time points
>>> def dx(x, t, A):
...     """
...     Define the derivative of x.
...     """
...     f = np.dot(A, x)
...     phi = np.dot(f, x)
...     return x * (f - phi)

Slight deviation from 𝑥1 = 0:



>>> A = np.array([[1, -1], [-1, 1]])
>>> epsilon = 10 ** -1
>>> xs = odeint(func=dx, y0=[epsilon, 1 - epsilon], t=t, args=(A,))
>>> plt.plot(xs);
[...

Slight deviation from 𝑥1 = 1:

>>> xs = odeint(func=dx, y0=[1 - epsilon, epsilon], t=t, args=(A,))
>>> plt.plot(xs);
[...

Slight deviation from 𝑥1 = 1/2:

>>> xs = odeint(func=dx, y0=[1 / 2 - epsilon, 1 / 2 + epsilon], t=t, args=(A,))
>>> plt.plot(xs);
[...

Look at theorem and discuss proof.

Carry out Nash equilibria computation (which corresponds to the above case):

>>> import nashpy as nash
>>> game = nash.Game(A, A.transpose())
>>> list(game.support_enumeration())
[(array([1., 0.]), array([1., 0.])), (array([0., 1.]), array([0., 1.])), (array([0.5, 0.5]), array([0.5, 0.5]))]

Now look at cases:

1. 𝑥1 = 0: we are in the first case of the theorem so we have an ESS.

2. 𝑥1 = 1: we are in the first case of the theorem so we have an ESS.

3. $x_1 = 1/2$: we now have the second case of the theorem $u(x, y) = u(x, x)$ (indeed the row strategy aims to make the column strategy indifferent, according to the best response condition).

To deal with this case we need to look at the next part of the second condition:

$$u(x, y) = 0$$

$$u(y, y) = y_1^2 + (1 - y_1)^2 - 2(1 - y_1)y_1 = (y_1 - (1 - y_1))^2 = (2y_1 - 1)^2$$

And as long as $y_1 \ne x_1 = 1/2$ then $u(x, y) < u(y, y)$ thus this is not an ESS:

>>> A = sym.Matrix(A)
>>> y_1, y_2 = sym.symbols("y_1, y_2")
>>> y = sym.Matrix([y_1, y_2])
>>> sym.factor((y.transpose() * A * y)[0].subs({y_2: 1 - y_1}))
1.0*(2.0*y_1 - 1.0)**2
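The same comparison can be made numerically for a handful of mutants. The following is a sketch only: the helper `u` and the grid of mutant strategies are illustrative assumptions, and exact fractions are used so that the equality $u(y, y) = (2y_1 - 1)^2$ can be checked without rounding.

```python
from fractions import Fraction

A = [[1, -1], [-1, 1]]


def u(x, y):
    """Utility x A y of strategy x in a population playing y."""
    return sum(x[i] * A[i][j] * y[j] for i in range(2) for j in range(2))


x = (Fraction(1, 2), Fraction(1, 2))
mutants = [(Fraction(k, 10), 1 - Fraction(k, 10)) for k in range(11) if k != 5]

# For each mutant y != x, compare u(x, y) with u(y, y)
comparisons = [(u(x, y), u(y, y)) for y in mutants]
```

Every mutant does strictly better against itself than $x = (1/2, 1/2)$ does against it, which is why this point fails the second condition.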

Discuss the work of John Maynard Smith in 1973 who formalised this work. Mention that he was not actually aware of Nash equilibria at the time.

2.12 11 Moran Processes





• Moran Processes

2.12.2 Objectives

• Play a class activity of a Moran process;

• Define a Moran process;

• Prove theorem for formula of fixation probabilities;

• Numeric calculations.

2.12.3 Activity

Use the Moran process form to have students play in pairs.

Also requires use of dice, allowing for multiples of 4 and 6. For example:

• 1 D12 (12 sided dice) or;

• 1 D6 and 1 D4 (the example used in this page);

• 1 D6 and 1 D8.

Explain that we will aim to reproduce a Moran process with 𝑁 = 3.

We will do this with the Hawk-Dove game:

$$A = \begin{pmatrix} 0 & 3 \\ 1 & 2 \end{pmatrix}$$

Recall, this corresponds to sharing of 4 resources:

• Two hawks both get nothing;

• A hawk meeting a dove gets 3 out of 4.

• A dove meeting a hawk gets 1 out of 4.

• A dove meeting a dove gets 2 out of 4.

Give students 5 minutes to write out the fitnesses of each type in each possible situation.

Confirm:

State              f(Hawk)              f(Dove)
1 Hawk, 2 Doves    0 × 0 + 2 × 3 = 6    1 × 1 + 1 × 2 = 3
2 Hawks, 1 Dove    1 × 0 + 1 × 3 = 3    2 × 1 + 0 × 2 = 2
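The fitness calculation can be sketched in a few lines. This assumes, as in the table, that fitness is the total payoff an individual collects against every other member of the population; the function name `fitnesses` is illustrative.

```python
A = [[0, 3], [1, 2]]  # Hawk-Dove payoffs: row/column 0 is Hawk, 1 is Dove
N = 3


def fitnesses(number_of_hawks):
    """Return (f(Hawk), f(Dove)) as total payoffs against all other individuals."""
    hawks, doves = number_of_hawks, N - number_of_hawks
    hawk_fitness = (hawks - 1) * A[0][0] + doves * A[0][1]
    dove_fitness = hawks * A[1][0] + (doves - 1) * A[1][1]
    return hawk_fitness, dove_fitness


table = {hawks: fitnesses(hawks) for hawks in (1, 2)}
```

The `- 1` terms reflect that an individual does not play against itself.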

Give students 5 minutes to write out the probabilities of selection of each type in each possible situation.

Confirm:




State              Select    Selection: Birth         Selection: Death
1 Hawk, 2 Doves    Hawk      6 / (6 + 2 × 3)          1/3
1 Hawk, 2 Doves    Dove      (2 × 3) / (6 + 2 × 3)    2/3
2 Hawks, 1 Dove    Hawk      (2 × 3) / (2 × 3 + 2)    2/3
2 Hawks, 1 Dove    Dove      2 / (2 × 3 + 2)          1/3

Give students 5 minutes to identify what sided dice and what values will allow them to simulate the process of selecting birth/death individuals.

Confirm:

State      Birth: dice used    Select Hawk values    Death: dice used    Select Hawk values
1 Hawk     6                   {1, 2, 3}             6                   {1, 2}
2 Hawks    4                   {1, 2, 3}             6                   {1, 2, 3, 4}
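To confirm that these dice reproduce the selection probabilities, here is a short sketch with exact fractions. The fitness values are taken from the earlier table; the helper names are hypothetical.

```python
from fractions import Fraction

N = 3
fitness = {1: (6, 3), 2: (3, 2)}  # number of hawks -> (f(Hawk), f(Dove))
birth_dice = {1: (6, {1, 2, 3}), 2: (4, {1, 2, 3})}
death_dice = {1: (6, {1, 2}), 2: (6, {1, 2, 3, 4})}


def hawk_birth_probability(hawks):
    """Probability a hawk is selected for birth (proportional to fitness)."""
    f_hawk, f_dove = fitness[hawks]
    total = hawks * f_hawk + (N - hawks) * f_dove
    return Fraction(hawks * f_hawk, total)


def hawk_death_probability(hawks):
    """Probability a hawk is selected for death (uniform over individuals)."""
    return Fraction(hawks, N)


def dice_probability(sides, values):
    """Probability that a fair dice with `sides` sides lands in `values`."""
    return Fraction(len(values), sides)
```

For example, with 2 hawks the birth probability is 3/4, which is why a D4 with values {1, 2, 3} is used.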

Hand out a dice to each group (ideally a D6 and a D4 to each group but perhaps they can identify other ways).

Let students simulate.

Count from groups to obtain mean fixation rate of a Hawk.

Show the following code that allows us to simulate the simulation undertaken by the students (this is not the same code as in the notes):

>>> import collections
>>> import matplotlib.pyplot as plt
>>> import numpy as np
>>> import tqdm

>>> def roll_n_sided_dice(n=6):
...     """
...     Roll a dice with n sides.
...     """
...     return np.random.randint(1, n + 1)

>>> class MoranProcess:
...     """
...     A class for a moran process with a population of
...     size N=3 using the standard Hawk-Dove Game:
...
...     A =
...        [0, 3]
...        [1, 2]
...
...     Note that this is a simulation corresponding to an
...     in class activity where students roll dice.
...     """
...     def __init__(self, number_of_hawks=1, seed=None):
...
...         if seed is not None:
...             np.random.seed(seed)
...
...         self.number_of_hawks = number_of_hawks
...         self.number_of_doves = 3 - number_of_hawks
...
...         self.dice_and_values_for_hawk_birth = {1: (6, {1, 2, 3}), 2: (4, {1, 2, 3})}
...         self.dice_and_values_for_hawk_death = {1: (6, {1, 2}), 2: (6, {1, 2, 3, 4})}
...
...         self.history = [(self.number_of_hawks, self.number_of_doves)]
...
...     def step(self):
...         """
...         Select a hawk or a dove for birth.
...         Select a hawk or a dove for death.
...
...         Update history and states.
...         """
...         birth_dice, birth_values = self.dice_and_values_for_hawk_birth[self.number_of_hawks]
...         death_dice, death_values = self.dice_and_values_for_hawk_death[self.number_of_hawks]
...
...         select_hawk_for_birth = self.roll_dice_for_selection(dice=birth_dice, values=birth_values)
...         select_hawk_for_death = self.roll_dice_for_selection(dice=death_dice, values=death_values)
...
...         if select_hawk_for_birth:
...             self.number_of_hawks += 1
...         else:
...             self.number_of_doves += 1
...
...         if select_hawk_for_death:
...             self.number_of_hawks -= 1
...         else:
...             self.number_of_doves -= 1
...
...         self.history.append((self.number_of_hawks, self.number_of_doves))
...
...     def roll_dice_for_selection(self, dice, values):
...         """
...         Given a dice and values return if the random roll is in the values.
...         """
...         return roll_n_sided_dice(n=dice) in values
...
...     def simulate(self):
...         """
...         Run the entire simulation: repeatedly step through
...         until the number of hawks is either 0 or 3.
...         """
...         while self.number_of_hawks in [1, 2]:
...             self.step()
...         return self.number_of_hawks
...
...     def __len__(self):
...         return len(self.history)

This carries out the simulations:

>>> repetitions = 10 ** 5
>>> end_states = []
>>> path_lengths = []
>>> for seed in range(repetitions):
...     mp = MoranProcess(seed=seed)
...     end_states.append(mp.simulate())
...     path_lengths.append(len(mp))
>>> counts = collections.Counter(end_states)
>>> counts[3] / repetitions
0.54666

Discuss obtaining theoretic probabilities of changing state:

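$$p_{10} = \frac{6}{12} \cdot \frac{1}{3} = \frac{1}{6} \qquad p_{12} = \frac{6}{12} \cdot \frac{2}{3} = \frac{1}{3}$$

$$p_{21} = \frac{2}{8} \cdot \frac{2}{3} = \frac{1}{6} \qquad p_{23} = \frac{6}{8} \cdot \frac{1}{3} = \frac{1}{4}$$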

Now work through the notes: culminating in the proof of the theorem for the absorption probabilities of a birth death process.

Discuss and use code from chapter to show the fixation with the Hawk Dove game:

>>> A = np.array([[0, 3], [1, 2]])

Calculate theoretic value using formula from theorem:

$$f_{1i} = \frac{3(N - i)}{N - 1} = 3\frac{N - i}{N - 1}$$

$$f_{2i} = \frac{i + 2(N - i - 1)}{N - 1} = \frac{2N - 2 - i}{N - 1}$$

This gives (for 𝑁 = 3):

            $i = 1$    $i = 2$
$f_{1i}$    3          3/2
$f_{2i}$    3/2        1
$\gamma_i$  1/2        2/3

Thus:

$$x_1 = \frac{1}{1 + 1/2 + 1/2 \times 2/3} = \frac{1}{11/6} \approx .545455$$
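The calculation above generalises to any population size. Here is a minimal sketch of the formula $x_1 = 1 / (1 + \sum_{j=1}^{N-1} \prod_{k=1}^{j} \gamma_k)$ with $\gamma_i = f_{2i} / f_{1i}$; the function name `fixation_probability` is an assumption for illustration and is not the code from the chapter.

```python
from fractions import Fraction


def fixation_probability(A, N):
    """
    Probability that a single individual of the first type takes over a
    population of size N in a Moran process with payoff matrix A.
    """
    (a, b), (c, d) = A
    total, product = Fraction(1), Fraction(1)
    for i in range(1, N):
        f_1 = Fraction(a * (i - 1) + b * (N - i), N - 1)
        f_2 = Fraction(c * i + d * (N - i - 1), N - 1)
        product *= f_2 / f_1  # running product gamma_1 * ... * gamma_i
        total += product
    return 1 / total


x_1 = fixation_probability(A=[[0, 3], [1, 2]], N=3)
```

Calling the same function with `N=4` gives 27/62, the value worked out by hand in the optional Moran process meeting.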

• Discuss work of Maynard Smith but that this actually used the Hawk Dove game in infinite population games.

• Discuss the possibility of using a utility model on top of fitness.

• A lot of current work looks at Moran processes: a good model of invasion of a species etc. . .

• The Prisoners dilemma can also be included; documentation about simulating this with Axelrod is here: http://axelrod.readthedocs.io/en/stable/tutorials/getting_started/moran.html

2.13 12 Contemporary research


• Contemporary research




2.13.2 Objectives

• Discuss different papers that will be looked at.

• Answer queries about assessment of these papers.

Explain that in future sessions we will discuss the papers; students will be expected to have read them. This will allow me to answer any questions they might have.

2.13.3 Studying the emergence of invasiveness in tumours using game theory

https://link.springer.com/article/10.1140/epjb/e2008-00249-y

• A paper that looks at the GT of cancer: it reads like a proof of concept sitting in literature that already exists on the subject.

• Main result is confirming intuitive understanding of proliferation/motility. Increased nutrients implies less motility.

• Use two different approaches: describe an analytic evo game. Note that there is a typo in the game:

$$\begin{pmatrix} b/2 & b - c \\ b & b - c/2 \end{pmatrix}$$

Also note that this is showing the utility of the column player so that’s slightly different from the framework in the course.

The solution comes from:

>>> import sympy as sym
>>> p, b, c = sym.symbols("p, b, c")
>>> sym.solveset(p * (b - c / 2) + (1 - p) * (b - c) - p * (b) - (1 - p) * (b / 2), p)
{(b - 2*c)/(b - c)}

They also describe how a mixed stable solution will also exist: What does that correspond to in notes?

And the theorem about evolutionary stability: what does that correspond to?

• They implement a Cellular automaton model (why?): does this compare to the evo game or to a Moran process?

• Improvements could include:

– more complex dynamics;

– using more phenotypes (strategies);

– finite population analytical model;

Here is a blog post on the EGT blog about this paper: https://egtheory.wordpress.com/2013/07/05/motility/

2.13.4 Iterated Prisoner’s Dilemma contains strategies that dominate any evolutionary opponent

http://www.pnas.org/content/109/26/10409.abstract

• A paper that looks at memory one strategies in the IPD.




• The paper makes use of a clever mathematical result (Cramer’s rule) to show that there is a linear relationship between the utilities of two memory one players (eqn 6). This is because the linear relationship:

$$\alpha S_X + \beta S_Y + \gamma$$

is shown to be the ratio of two determinants (essentially one determinant with a scaling factor). The columns of the corresponding matrix however are independent across both players (one has only $p$ variables and another, only $q$ variables). Thus, choosing a particular set of values for the $p$ can set the linear relationship above to be 0.

• This shows that it’s possible for one player to set the score of the other. It is however not possible for a player to set its own score.

• The paper then goes on to show that it is possible to set a value where the strategy’s payoff is a factor of $\chi$ more than the opponent. This linkage (it is now a win-win situation) of the strategies implies that both strategies maximise their payoffs when the opponent cooperates.

• The paper also describes playing against an “evolutionary” player which has a “theory of mind”, i.e. which adapts. They make a claim (the title of the article) that numerical evidence indicates that such a player would evolve towards cooperation.

• The paper has two appendices which provide further proofs; one of the theorems for example is that short memory strategies are sufficient. This seems limited as it only considers a match and not a sufficient evolutionary setting (such as a Moran process or a tournament). The other proves that the evolutionary process will stabilise (ultimately all results are being calculated at steady state).

• A deeper literature review placing this paper in its proper context would be beneficial.

• Perhaps some more numerical examples would have been helpful to highlight the process they mention. There is one graphic of the evolutionary process; note that it is claimed to be for $\chi = 5$. We can read off the graph that the final utility of the opponent is 1.6, so $S_Y - P = 1.6 - 1 = .6$ and $.6 \times 5 = 3$, and we see that the utility of the other player is indeed $3 + 1$.

• What occurs if two ZD strategies attempt to play each other?

2.13.5 Measuring the price of anarchy in critical care unit interactions

https://link.springer.com/article/10.1057/s41274-016-0100-8

• A piece of work that applies Nash equilibria to the interaction of two health providers. It assumes that a hospital’s strategy is a threshold at which patients are diverted to other hospitals.

• This is one of a few papers on health care logistics that uses Game Theory.

• To calculate the utilities a Markov model (just like the model of Reactive strategies) is used. For every strategy pair that Markov model is used to get the utilities.

• The main theorem in the paper is one that allows for an easy calculation of the Nash equilibria by showing that the best response functions are mutually increasing, which implies that there will be a single intersection of the best response functions.

• The above theorem is proven for some conditions which correspond to two scenarios.

• The paper then uses this to compute the PoA: the price of anarchy compares the optimal coordinated behaviour with the Nash equilibria, similarly to a Prisoner’s dilemma.

• This is used to find a target that aligns selfish behaviour with coordinated behaviour.

• Things that could be improved: more players, non stationarity and different models of behaviour.



2.13.6 Evolution Reinforces Cooperation with the Emergence of Self-Recognition Mechanisms: an empirical study of the Moran process for the iterated Prisoner’s dilemma

• This is a paper that studies the Prisoner’s dilemma in a Moran process.

• It uses all the strategies from the Axelrod library but also includes 3 strategies that were trained using a genetic algorithm specifically for the paper.

• These make use of Finite State Machines: a structure that maps states and actions to another state and action.

• The paper starts by showing why the numerical simulation is necessary: the theoretic values do not match up with the simulated ones for stochastic strategies.

• Then it goes on to examine two main scenarios: resistance and invasion.

• One observation is that the results for N=2 (just 2 strategies, 1 of each type) differ quite a lot from N>3. This suggests that theoretic results obtained for N=2 do not necessarily carry over to larger populations.

• The other observation is that the trained strategies do well:

– Strategies with handshakes (self recognition mechanisms) do well in resistance.

– Strategies trained for scores against opponents do well at invasion.

• Some discussion is given to the placing of a particular type of strategy that is of interest in the literature (the Press and Dyson paper) but does not do well here.

Here are two blog posts on this subject:

• http://vknight.org/unpeudemath/math/2017/07/28/sophisticated-ipd-strategies-beat-simple-ones.html

• http://marcharper.codes/2017-07-31/axelrod.html

Here is a video about a talk I gave on this subject:

• (https://www.youtube.com/watch?v=p4t9dMKxZAo)



CHAPTER 3

Optional meetings

Notes for specific optional meetings that take place in reaction to the needs of students.

3.1 AA- Revisiting Polytopes


• Best response polytopes


3.1.2 Objectives

• Revisit best response polytopes using a game with a dominated strategy.

3.1.3 Notes

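Consider the $(A, B) \in {\mathbb{R}^{2 \times 2}}^2$ game:

$$A = \begin{pmatrix} 1 & 4 \\ -3 & 2 \end{pmatrix} \qquad B = \begin{pmatrix} 5 & 2 \\ -1 & 9 \end{pmatrix}$$

Ask students in pairs to construct the column player best response polytope.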

First we add 4 to all elements to ensure they are positive:

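$$A = \begin{pmatrix} 5 & 8 \\ 1 & 6 \end{pmatrix} \qquad B = \begin{pmatrix} 9 & 6 \\ 3 & 13 \end{pmatrix}$$

By definition (https://vknight.org/gt/chapters/06/#Definition-of-best-response-polytopes), the column player best response polytope $\mathcal{Q}$ is given by:

$$\mathcal{Q} = \{y \in \mathbb{R}^n \mid Ay \leq 1,\ y \geq 0\}$$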

https://vknight.org/gt/chapters/06/


So the inequalities are:

$$\begin{aligned}
5y_1 + 8y_2 &\leq 1 && \text{label: } 0 \\
y_1 + 6y_2 &\leq 1 && \text{label: } 1 \\
y_1 &\geq 0 && \text{label: } 2 \\
y_2 &\geq 0 && \text{label: } 3
\end{aligned}$$

The first two inequalities come from $Ay \leq 1$, however, $Ay$ corresponds to the row player’s utility against vertex $y$.

Note that everything is scaled so that the best response of the row player is 1 (the scaling takes place in the strategies/vertices).

Thus if a vertex has label 0 or 1 that implies that the first/second row player strategy is a best response to $y$.

We will return to this idea but first let us plot our $\mathcal{Q}$. We start by rearranging our inequalities:

$$\begin{aligned}
y_2 &\leq 1/8 - 5/8 y_1 && \text{label: } 0 \\
y_2 &\leq 1/6 - y_1/6 && \text{label: } 1 \\
y_1 &\geq 0 && \text{label: } 2 \\
y_2 &\geq 0 && \text{label: } 3
\end{aligned}$$

Now let us sketch these:

import matplotlib.pyplot as plt
%matplotlib inline
y1s = [0, 1]
first_line = [1 / 8 - 5 / 8 * y for y in y1s]
second_line = [1 / 6 - y / 6 for y in y1s]
plt.figure()
plt.plot(first_line, label="$1/8-5/8y_1$ (label: 0)")
plt.plot(second_line, label="$1/6-y_1/6$ (label: 1)")
plt.legend();

This gives:



We can see that our polytope is a straightforward triangle and that no vertex will ever have label 1.

Discuss if this is OK?

Returning to the game:

$$A = \begin{pmatrix} 5 & 8 \\ 1 & 6 \end{pmatrix}$$

We see that the first strategy dominates the second: so the second strategy is never a best response.

Our vertices with labels are:

(0, 0) labels {2, 3}

(0, 1/8) labels {0, 2}

(1/5, 0) labels {0, 3}
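The labels of each vertex can be recovered by checking which inequalities hold with equality. The following is a small sketch of that idea using exact fractions; the function name `labels` is an illustrative assumption.

```python
from fractions import Fraction

A = [[5, 8], [1, 6]]  # row player matrix after adding 4 to every entry


def labels(y):
    """Labels of a point y of Q: the indices of its tight inequalities."""
    found = set()
    for i, row in enumerate(A):  # labels 0 and 1 come from Ay <= 1
        if sum(a * y_j for a, y_j in zip(row, y)) == 1:
            found.add(i)
    for j, y_j in enumerate(y):  # labels 2 and 3 come from y >= 0
        if y_j == 0:
            found.add(len(A) + j)
    return found


vertices = [
    (Fraction(0), Fraction(0)),
    (Fraction(0), Fraction(1, 8)),
    (Fraction(1, 5), Fraction(0)),
]
vertex_labels = [labels(y) for y in vertices]
```

No vertex picks up label 1, which is the polytope's way of saying the second row strategy is never a best response.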

Exercise:

Obtain the best response polytope for the row player:

By definition (https://vknight.org/gt/chapters/06/#Definition-of-best-response-polytopes), the row player best response polytope $\mathcal{P}$ is given by:

$$\mathcal{P} = \{x \in \mathbb{R}^m \mid x \geq 0,\ xB \leq 1\}$$

So the inequalities are:

$$\begin{aligned}
x_1 &\geq 0 && \text{label: } 0 \\
x_2 &\geq 0 && \text{label: } 1 \\
9x_1 + 3x_2 &\leq 1 && \text{label: } 2 \\
6x_1 + 13x_2 &\leq 1 && \text{label: } 3
\end{aligned}$$

Rearranging:

$$\begin{aligned}
x_1 &\geq 0 && \text{label: } 0 \\
x_2 &\geq 0 && \text{label: } 1 \\
x_2 &\leq 1/3 - 3x_1 && \text{label: } 2 \\
x_2 &\leq 1/13 - 6/13 x_1 && \text{label: } 3
\end{aligned}$$

x1s = [0, 1]
first_line = [1 / 3 - 3 * x for x in x1s]
second_line = [1 / 13 - 6 / 13 * x for x in x1s]
plt.figure()
plt.plot(first_line, label="$1/3-3x_1$ (label: 2)")
plt.plot(second_line, label="$1/13-6/13x_1$ (label: 3)")
plt.legend();

This gives:

We see that our polytope has 4 vertices:

(0, 0) labels {0, 1}

(1/9, 0) labels {1, 2}

(0, 1/13) labels {0, 3}

(10/99, 1/33) labels {2, 3}

Here is some sympy code to verify the intersection of both boundaries:

>>> import sympy as sym
>>> x = sym.Symbol("x")
>>> sym.solveset(sym.S(1) / 3 - 3 * x - sym.S(1) / 13 + sym.S(6) / 13 * x, x)
{10/99}

We can now identify a fully labelled vertex pair (we already know that label 1 must come from a vertex in $\mathcal{P}$):

((1/9, 0), (1/5, 0))

Which once normalised gives: ((1, 0), (1, 0)) which is in fact readily readable using pure best responses:

$$A = \begin{pmatrix} 1 & 4 \\ -3 & 2 \end{pmatrix} \qquad B = \begin{pmatrix} 5 & 2 \\ -1 & 9 \end{pmatrix}$$

Checking using nashpy:

>>> import nashpy as nash
>>> A = [[1, 4], [-3, 2]]
>>> B = [[5, 2], [-1, 9]]
>>> game = nash.Game(A, B)
>>> list(game.vertex_enumeration())
[(array([1., 0.]), array([1., 0.]))]
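The normalisation step that turns the fully labelled vertex pair ((1/9, 0), (1/5, 0)) into strategies can be sketched as follows. This is an illustration only; `normalise` is a hypothetical helper, and the last two lines check the pure best responses directly on the original (unshifted) matrices.

```python
from fractions import Fraction


def normalise(vertex):
    """Scale a non zero vertex so that its entries sum to 1."""
    total = sum(vertex)
    return tuple(entry / total for entry in vertex)


x = (Fraction(1, 9), Fraction(0))  # vertex of P
y = (Fraction(1, 5), Fraction(0))  # vertex of Q
sigma_r, sigma_c = normalise(x), normalise(y)

A = [[1, 4], [-3, 2]]
B = [[5, 2], [-1, 9]]
row_best_response = max(range(2), key=lambda i: A[i][0])  # against column 1
col_best_response = max(range(2), key=lambda j: B[0][j])  # against row 1
```

Both players' first strategies are best responses to each other, in agreement with the output of vertex enumeration.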



3.2 AB - Revisiting Repeated games


• Repeated games


3.2.2 Objectives

• Revisit repeated games and ensure students are able to compute Nash equilibria for games that are not stage Nash.

3.2.3 Notes

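Consider the $(A, B) \in {\mathbb{R}^{2 \times 3}}^2$ game with $T = 2$:

$$A = \begin{pmatrix} 200 & 4 & 3 \\ 1 & 0 & 2 \end{pmatrix} \qquad B = \begin{pmatrix} 0 & -3 & 0 \\ -10 & 0 & -10 \end{pmatrix}$$

Ask students in pairs to:

• Identify pure Nash equilibria of stage game:

$$A = \begin{pmatrix} 200 & 4 & 3 \\ 1 & 0 & 2 \end{pmatrix} \qquad B = \begin{pmatrix} 0 & -3 & 0 \\ -10 & 0 & -10 \end{pmatrix}$$

• Identify repeated game Nash equilibria: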





• Identify an equilibrium that is not stage Nash:

1. For the row player:

(∅, ∅) → 𝑟2

(𝑟2, 𝑐1) → 𝑟1

(𝑟2, 𝑐2) → 𝑟1

(𝑟2, 𝑐3) → 𝑟1

2. For the column player:

(∅, ∅) → 𝑐2

(𝑟2, 𝑐2) → 𝑐1

(𝑟1, 𝑐2) → 𝑐3




This corresponds to the following scenario:

> Play (𝑟2, 𝑐2) in first stage and (𝑟1, 𝑐1) in second stage unless the row player does not cooperate in which case play (𝑟1, 𝑐3).

This gives a utility of (200, 0). Is this an equilibrium?

1. If the row player deviates, they would only be rational to do so in the first stage; if they did they would gain 4 but lose 197.

2. If the column player deviates they would do so in the first round and gain no utility.
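These two checks can be sketched numerically. The code below is a minimal illustration, assuming that the second stage always follows the strategies listed above and that the column player's second stage action depends only on what the row player did in the first stage; the function name `play` is hypothetical.

```python
A = [[200, 4, 3], [1, 0, 2]]     # row player stage utilities
B = [[0, -3, 0], [-10, 0, -10]]  # column player stage utilities
r1, r2 = 0, 1
c1, c2, c3 = 0, 1, 2


def play(first_row, first_col):
    """Total utilities when the second stage follows the stated strategies."""
    second_row = r1
    # Reward r2 in the first stage with c1, otherwise punish with c3
    second_col = c1 if first_row == r2 else c3
    row_total = A[first_row][first_col] + A[second_row][second_col]
    col_total = B[first_row][first_col] + B[second_row][second_col]
    return row_total, col_total


equilibrium_path = play(r2, c2)
row_deviation = play(r1, c2)
col_deviations = [play(r2, c) for c in (c1, c3)]
```

The row player drops from 200 to 7 by deviating in the first stage, and either first stage deviation leaves the column player with less than 0.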

Potentially discuss how this repeated game framework can/does correspond to a normal form game.

• Writing all strategies down;

• Obtaining large matrix (very large)

3.3 AC - Coursework peer review

3.3.1 Corresponding assessment

• Individual coursework


3.3.2 Objectives

• Use peer review to discuss and give feedback on individual coursework.

3.3.3 Notes

5 mins: Discuss marking criteria:

• Understanding of subject (scope): the specific modelling scenario chosen is not terribly important but the communication of understanding is. Does the work use something specific to highlight high understanding?

• Presentation: how does work align with guidelines for writing mathematics:

– Inclusion of graphics?

– Spelling/grammar?

– Clarity

• Code: how has the code been used, is it the correct code, does it help with the scope? A simple use of something from the notes is what is expected but is it done appropriately (for example using the lemke-howson algorithm when it’s not appropriate, plotting the incorrect thing, etc. . . ).

5 mins: Exchange coursework with someone else.

15 mins: Read and write feedback (Encourage students to write positively).

10 mins: Pair and discuss feedback.

10 mins: Group discussion.


https://vknight.org/gt/other/assessment/#Individual-coursework


3.4 AD - Moran process


3.4.1 Objectives

• Compute the probability of fixation in the Hawk Dove game using the formula.

3.4.2 Notes

For:

$$A = \begin{pmatrix} 0 & 3 \\ 1 & 2 \end{pmatrix}$$

Ask students to work in pairs, using the formula to compute $x_1$.

This is given by:

$$f_{1i} = \frac{3(N - i)}{N - 1} = 3\frac{N - i}{N - 1} \tag{3.1}$$

$$f_{2i} = \frac{i + 2(N - i - 1)}{N - 1} = \frac{2N - 2 - i}{N - 1} \tag{3.2}$$

This gives (for 𝑁 = 4):

            $i = 1$    $i = 2$    $i = 3$
$f_{1i}$    3          2          1
$f_{2i}$    5/3        4/3        1
$\gamma_i$  5/9        2/3        1

Thus:

$$x_1 = \frac{1}{1 + 5/9 + 5/9 \times 2/3 + 5/9 \times 2/3 \times 1} = \frac{1}{62/27} = \frac{27}{62} \approx .44$$

3.5 AE - Support enumeration


• Support enumeration


3.5.2 Objectives

• Obtain the Nash equilibria of a game using support enumeration.




3.5.3 Notes

Play a knockout RPSLS tournament

Tell students to work in pairs.

Invite students to obtain the Nash equilibria, using support enumeration for the following game:

$$A = \begin{pmatrix} 1 & -1 \\ -2 & 2 \end{pmatrix}$$

Ask students to apply the support enumeration algorithm.

For supports of size one: there are no pairs of best response so now we only consider $k = 2$. This implies the following pair of linear equations:

$$-\sigma_{r1} + 2\sigma_{r2} = \sigma_{r1} - 2\sigma_{r2}$$

$$\sigma_{r1} = 2\sigma_{r2}$$

$$\sigma_{c1} - \sigma_{c2} = -2\sigma_{c1} + 2\sigma_{c2}$$

$$\sigma_{c1} = \sigma_{c2}$$

Setting this to be probability distributions gives the final result:

$$\sigma_r = (2/3, 1/3)$$

$$\sigma_c = (1/2, 1/2)$$

We don’t really need to check the best response condition as the players have nowhere to deviate to but we can:

$$A\sigma_c^T = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$$

Thus $\sigma_r$ is a best response to $\sigma_c$.

$$\sigma_r B = \sigma_r(-A) = (0, 0)$$



>>> import nashpy as nash
>>> import numpy as np
>>> A = np.array([[1, -1],
...               [-2, 2]])
>>> game = nash.Game(A)
>>> list(game.support_enumeration())
[(array([0.666..., 0.333...]), array([0.5, 0.5]))]
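The indifference conditions behind the equilibrium reported by nashpy can be checked by hand with exact fractions. This is a sketch for illustration; it assumes, as for `nash.Game(A)`, that the game is zero sum so that $B = -A$.

```python
from fractions import Fraction

A = [[1, -1], [-2, 2]]
B = [[-a for a in row] for row in A]  # zero sum game: B = -A

sigma_r = (Fraction(2, 3), Fraction(1, 3))
sigma_c = (Fraction(1, 2), Fraction(1, 2))

# Utility of each pure row strategy against sigma_c (the vector A sigma_c^T)
row_utilities = [sum(a * s for a, s in zip(row, sigma_c)) for row in A]

# Utility of each pure column strategy against sigma_r (the vector sigma_r B)
col_utilities = [sum(sigma_r[i] * B[i][j] for i in range(2)) for j in range(2)]
```

Both vectors are constant, so neither player can gain by moving probability between the strategies in their support.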

As a second example consider:

$$A = \begin{pmatrix} 0 & -2 & 2 \\ 1 & 0 & -1 \\ -1 & 1 & 0 \end{pmatrix}$$

1. We see there are no pairs of best response, so consider $k \in \{2, 3\}$.



2. We have the following set of supports to consider:

1. 𝐼 = {1, 2}

1. 𝐽 = {1, 2}

2. 𝐽 = {1, 3}

3. 𝐽 = {2, 3}

2. 𝐼 = {1, 3}

1. 𝐽 = {1, 2}

2. 𝐽 = {1, 3}

3. 𝐽 = {2, 3}

3. 𝐼 = {2, 3}

1. 𝐽 = {1, 2}

2. 𝐽 = {1, 3}

3. 𝐽 = {2, 3}

4. 𝐼 = 𝐽 = {1, 2, 3}

The case for k=2 leads to contradictions (let students identify some of these).

4. 𝐼 = 𝐽 = {1, 2, 3}

In this case we have:

−𝜎𝑟2 + 𝜎𝑟3 = 2𝜎𝑟1 − 𝜎𝑟3

2𝜎𝑟1 − 𝜎𝑟3 = −2𝜎𝑟1 + 𝜎𝑟2

which has solution:

2𝜎𝑟1 = 𝜎𝑟2 = 𝜎𝑟3

Similarly:

−2𝜎𝑐2 + 2𝜎𝑐3 = 𝜎𝑐1 − 𝜎𝑐3

𝜎𝑐1 − 𝜎𝑐3 = −𝜎𝑐1 + 𝜎𝑐2

which has solution:

𝜎𝑐1 = 𝜎𝑐2 = 𝜎𝑐3

4. Now we consider which of those supports give valid mixed strategies:

4. 𝐼 = 𝐽 = {1, 2, 3}

𝜎𝑟 = (1/5, 2/5, 2/5)

𝜎𝑐 = (1/3, 1/3, 1/3)

5. The final step is to check the best response condition:

4. 𝐼 = 𝐽 = {1, 2, 3}



$$A\sigma_c^T = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$$

$$\sigma_r B = (0, 0, 0)$$



>>> import nashpy as nash
>>> A = np.array([[0, -2, 2],
...               [1, 0, -1],
...               [-1, 1, 0]])
>>> rps = nash.Game(A)
>>> list(rps.support_enumeration())
[(array([0.2..., 0.4..., 0.4...]), array([0.333..., 0.333..., 0.333...]))]

Discuss with students about what happens when we have a 3 by 2 game?

3.6 AF- Revisiting Lemke-Howson


• Lemke Howson algorithm


3.6.2 Objectives

• Revisit the Lemke Howson Algorithm

3.6.3 Notes

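Consider the $(A, B) \in {\mathbb{R}^{2 \times 2}}^2$ game:

$$A = \begin{pmatrix} 1 & 4 \\ -3 & 2 \end{pmatrix} \qquad B = \begin{pmatrix} 5 & 2 \\ -1 & 9 \end{pmatrix}$$

Ask students in pairs to construct the column player best response polytope.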

First we add 4 to all elements to ensure they are positive:

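$$A = \begin{pmatrix} 5 & 8 \\ 1 & 6 \end{pmatrix} \qquad B = \begin{pmatrix} 9 & 6 \\ 3 & 13 \end{pmatrix}$$

By definition (https://vknight.org/gt/chapters/06/#Definition-of-best-response-polytopes), the column player best response polytope $\mathcal{Q}$ is given by:

$$\mathcal{Q} = \{y \in \mathbb{R}^n \mid Ay \leq 1,\ y \geq 0\}$$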






$$\begin{aligned}
5y_1 + 8y_2 &\leq 1 && \text{label: } 0 \\
y_1 + 6y_2 &\leq 1 && \text{label: } 1 \\
y_1 &\geq 0 && \text{label: } 2 \\
y_2 &\geq 0 && \text{label: } 3
\end{aligned}$$

The first two inequalities come from $Ay \leq 1$, however, $Ay$ corresponds to the row player’s utility against vertex $y$.

Note that everything is scaled so that the best response of the row player is 1 (the scaling takes place in the strategies/vertices).

Thus if a vertex has label 0 or 1 that implies that the first/second row player strategy is a best response to $y$.

We will return to this idea but first let us plot our $\mathcal{Q}$. We start by rearranging our inequalities:

$$\begin{aligned}
y_2 &\leq 1/8 - 5/8 y_1 && \text{label: } 0 \\
y_2 &\leq 1/6 - y_1/6 && \text{label: } 1 \\
y_1 &\geq 0 && \text{label: } 2 \\
y_2 &\geq 0 && \text{label: } 3
\end{aligned}$$


import matplotlib.pyplot as plt
%matplotlib inline
y1s = [0, 1]
first_line = [1 / 8 - 5 / 8 * y for y in y1s]
second_line = [1 / 6 - y / 6 for y in y1s]
plt.figure()
plt.plot(first_line, label="$1/8-5/8y_1$ (label: 0)")
plt.plot(second_line, label="$1/6-y_1/6$ (label: 1)")
plt.legend();

This gives:



We can see that our polytope is a straightforward triangle and that no vertex will ever have label 1.

Discuss if this is OK?

Returning to the game:

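$$A = \begin{pmatrix} 5 & 8 \\ 1 & 6 \end{pmatrix}$$

We see that the first strategy dominates the second: so the second strategy is never a best response.

Our vertices with labels are:

(0, 0) labels {2, 3}

(0, 1/8) labels {0, 2}

(1/5, 0) labels {0, 3}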


Exercise:

Obtain the best response polytope for the row player:

By definition (https://vknight.org/gt/chapters/06/#Definition-of-best-response-polytopes), the row player best response polytope $\mathcal{P}$ is given by:

$$\mathcal{P} = \{x \in \mathbb{R}^m \mid x \geq 0,\ xB \leq 1\}$$

So the inequalities are:

$$\begin{aligned}
x_1 &\geq 0 && \text{label: } 0 \\
x_2 &\geq 0 && \text{label: } 1 \\
9x_1 + 3x_2 &\leq 1 && \text{label: } 2 \\
6x_1 + 13x_2 &\leq 1 && \text{label: } 3
\end{aligned}$$

Rearranging:

$$\begin{aligned}
x_1 &\geq 0 && \text{label: } 0 \\
x_2 &\geq 0 && \text{label: } 1 \\
x_2 &\leq 1/3 - 3x_1 && \text{label: } 2 \\
x_2 &\leq 1/13 - 6/13 x_1 && \text{label: } 3
\end{aligned}$$

x1s = [0, 1]
first_line = [1 / 3 - 3 * x for x in x1s]
second_line = [1 / 13 - 6 / 13 * x for x in x1s]
plt.figure()
plt.plot(first_line, label="$1/3-3x_1$ (label: 2)")
plt.plot(second_line, label="$1/13-6/13x_1$ (label: 3)")
plt.legend();

This gives:

We see that our polytope has 4 vertices:

(0, 0) labels {0, 1}

(1/9, 0) labels {1, 2}

(0, 1/13) labels {0, 3}

(10/99, 1/33) labels {2, 3}

Here is some sympy code to verify the intersection of both boundaries:

>>> import sympy as sym
>>> x = sym.Symbol("x")
>>> sym.solveset(sym.S(1) / 3 - 3 * x - sym.S(1) / 13 + sym.S(6) / 13 * x, x)
{10/99}

We now carry out the Lemke-Howson algorithm.

• Start at ((0, 0), (0, 0)), have labels {0, 1, 2, 3}. Drop 3 in 𝒬: move to ((0, 0), (0, 1/8)) and pick up 0.

• At ((0, 0), (0, 1/8)), have labels {0, 1, 2}. Drop 0 in 𝒫: move to ((1/9, 0), (0, 1/8)) and pick up 2.

• At ((1/9, 0), (0, 1/8)), have labels {0, 1, 2}. Drop 2 in 𝒬: move to ((1/9, 0), (1/5, 0)) and pick up 3.

• At ((1/9, 0), (1/5, 0)), have labels {0, 1, 2, 3}: stop.

Normalise to get:

((1, 0), (1, 0))

Now, ask students to carry this out using tableaux:

Apply definitions to get:

$$T_r = \begin{pmatrix} 9 & 3 & 1 & 0 & 1 \\ 6 & 13 & 0 & 1 & 1 \end{pmatrix}$$

and:

$$T_c = \begin{pmatrix} 1 & 0 & 5 & 8 & 1 \\ 0 & 1 & 1 & 6 & 1 \end{pmatrix}$$

• $T_r$ has labels {0, 1}.

• $T_c$ has labels {2, 3}.

Now let us drop 3 from 𝑇𝑐. The minimum ratio test: 1/8 < 1/6 so pivot on 1st row.

$$T_c = \begin{pmatrix} 1 & 0 & 5 & 8 & 1 \\ -6 & 8 & -22 & 0 & 2 \end{pmatrix}$$

which has labels {0, 2} thus we need to drop 0 in $\mathcal{P}$. The minimum ratio test: 1/9 < 1/6 so pivot on 1st row:

$$T_r = \begin{pmatrix} 9 & 3 & 1 & 0 & 1 \\ 0 & 99 & -6 & 9 & 3 \end{pmatrix}$$

which has labels {1, 2} thus we need to drop 2 in $\mathcal{Q}$. The minimum ratio test: there is only 1 positive ratio so we pivot on 1st row.

$$T_c = \begin{pmatrix} 1 & 0 & 5 & 8 & 1 \\ -8 & 40 & 0 & 176 & 32 \end{pmatrix}$$

which has labels {0, 3} thus we have a Nash equilibrium.

Recall that the variables for $T_r$ in order are:

$$x_1, x_2, s_1, s_2$$

Recall that the variables for $T_c$ in order are:

$$s_1, s_2, y_1, y_2$$

Thus, $5y_1 = 1$ and $9x_1 = 1$ which gives:

$$x = (1/9, 0) \qquad y = (1/5, 0)$$

After normalising we get the required result:

$$\sigma_r = (1, 0) \qquad \sigma_c = (1, 0)$$


CHAPTER 4

About the materials

4.1 Overview

The course content is all written using Jupyter notebooks which you can find here: https://github.com/drvinceknight/gt/tree/master/nbs.

This is done using a combination of Python and Markdown (as well as LaTeX for the mathematics).

The Jupyter notebooks are then converted into html which is served using Github pages and can be seen here: https://vknight.org/gt/.

4.2 How this is done

Using git you can clone the repository locally:

git clone https://github.com/drvinceknight/gt.git

This will create a copy of all the source files in a directory on your computer called gt.

To view the notebooks locally you will need to have Python and Jupyter installed on your computer. I recommend using the Anaconda distribution of Python which not only includes Jupyter but also a number of tools to ensure portability and reproducibility of scientific computing.

In the gt directory you will see an environment.yml file. This is used to define a specific set of Python libraries. To create this environment on your computer you will need to use a command line application:

• On Mac OSX or linux: search for terminal.

• On Windows, assuming you have installed the Anaconda distribution: search for Anaconda Prompt.

Open this and navigate to the gt directory; this is done using the cd command. For more information about using the command line, take a look at https://vknight.org/rsd/chapters/01/.

Once there, type:

http://jupyter.org/

https://www.python.org/

https://en.wikipedia.org/wiki/Markdown

https://www.latex-project.org/

https://pages.github.com/

https://git-scm.com/

https://anaconda.org/


conda env create -f environment.yml

This will create a Python environment on your machine, called gt, with all the required software. To activate this environment, type:

source activate gt

You can now start a Jupyter server by typing (still in the command line application):

jupyter notebook

This will open a tab in your browser (note that you do not need to be online for this) in which you can browse and use the notebooks.

Finally, if you want to convert the notebooks into html: in your command line application, making sure you have activated the gt environment, type:

inv main

4.3 Testing all the code

One important advantage of writing the course notes in this way is that all code snippets (which are often used to confirm calculations carried out) can be tested. This helps ensure that there are no errors in the notes.

To test the code, in your command line application type:

inv test
