IE675 Game Theory
Lecture Note Set 1
Wayne F. Bialas¹
Friday, March 30, 2001
1 INTRODUCTION
1.1 What is game theory?
Game theory is the study of problems of conflict and cooperation among independent decision makers.

Game theory deals with games of strategy rather than games of chance.
The ingredients of a game theory problem include
• players (decisionmakers)
• choices (feasible actions)
• payoffs (benefits, prizes, or awards)
• preferences to payoffs (objectives)
We need to know when one choice is better than another for a particular player.
1.2 Classification of game theory problems
Problems in game theory can be classified in a number of ways.
1.2.1 Static vs. dynamic games
In dynamic games, the order of decisions is important.
Question 1.1. Is it ever really possible to implement static decisionmaking in practice?
¹ Department of Industrial Engineering, University at Buffalo, 342 Bell Hall, Box 602050, Buffalo, NY 14260-2050 USA; Email: [email protected]; Web: http://www.acsu.buffalo.edu/˜bialas. Copyright © MMI Wayne F. Bialas. All Rights Reserved. Duplication of this work is prohibited without written permission. This document produced March 30, 2001 at 2:31 pm.
1.2.2 Cooperative vs. noncooperative
In a noncooperative game, each player pursues his/her own interests. In a cooperative game, players are allowed to form coalitions and combine their decision-making problems.
                Noncooperative                Cooperative
    Static      Math programming /            Cooperative game
                noncooperative game theory    theory
    Dynamic     Control theory                Dynamic cooperative
                                              games
Note 1.1. This area of study is distinct from multicriteria decision making.
Flow of information is an important element in game theory problems, but it is sometimes explicitly missing.
• noisy information
• deception
1.2.3 Related areas
• differential games
• optimal control theory
• mathematical economics
1.2.4 Application areas
• corporate decision making
• defense strategy
• market modelling
• public policy analysis
• environmental systems
• distributed computing
• telecommunications networks
1.2.5 Theory vs. simulation
The mathematical theory of games provides the fundamental laws and problem structure. Games can also be simulated to assess complex economic systems.
1.3 Solution concepts
The notion of a “solution” is more tenuous in game theory than in other fields.
Definition 1.1. A solution is a systematic description of the outcomes that may emerge from the decision problem.
• optimality (for whom??)
• feasibility
• equilibria
1.4 Games in extensive form
1.4.1 Example: Matching Pennies
• Player 1: Choose H or T
• Player 2: Choose H or T (not knowing Player 1’s choice)
• If the coins are alike, Player 2 wins 1 cent from Player 1
• If the coins are different, Player 1 wins 1 cent from Player 2
Written in extensive form, the game appears as a tree: Player 1 moves first (H or T); Player 2 then moves (H or T) from one of two nodes joined in a single information set S₁²; the terminal payoffs, left to right, are (−1, 1), (1, −1), (1, −1), (−1, 1).

[Figure: game tree for Matching Pennies, showing Player 1's root node, Player 2's two nodes joined in the information set S₁², and the four payoff vectors.]
In order to deal with the issue of Player 2's knowledge about the game, we introduce the concept of an information set. When the game's progress reaches Player 2's time to move, Player 2 is not supposed to know Player 1's choice. The set of nodes, S₁², is an information set for Player 2.

A player only knows the possible options emanating from an information set. A player does not know which node within the information set is the actual node at which the progress of play resides.
There are some obvious rules about information sets that we will formally describe later.
For example, the following cannot occur. . .

[Figure: a game tree in which Player 2's information set S₁² crosses nodes with different available moves; caption: "This cannot happen!!!"]
Definition 1.2. A player is said to have perfect information if all of his/her information sets are singletons.
[Figure: a game tree for Players 1 and 2 with perfect information; every information set is a singleton.]
Definition 1.3. A strategy for Player i is a function which assigns to each of Player i's information sets one of the branches which follows a representative node from that set.

Example 1.1. A strategy Ψ which has Ψ(S₁²) = H would tell Player 2 to select Heads when he/she is in information set S₁².
1.5 Games in strategic form
1.5.1 Example: Matching Pennies
Consider the same game as above and the following matrix of payoffs:
                 Player 2
                 H         T
    Player 1 H   (−1, 1)   (1, −1)
             T   (1, −1)   (−1, 1)

The rows represent the strategies of Player 1. The columns represent the strategies of Player 2.
Distinguish actions from strategies.
The matching pennies game is an example of a noncooperative game.
1.6 Cooperative games
Cooperative games allow players to form coalitions to share decisions, information, and payoffs.
For example, if we have player set
N = {1, 2, 3, 4, 5, 6}
A possible coalition structure would be
{{1, 2, 3}, {4, 6}, {5}}
Often these games are described in characteristic function form. A characteristic function is a mapping

    v : 2^N → ℝ

and v(S) (where S ⊆ N) is the "worth" of the coalition S.
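A characteristic function over a small player set can be sketched directly in code as a mapping from coalitions to worths. The player set and the worth rule len(S)² below are made-up numbers for illustration (one simple superadditive choice), not from the notes.

```python
from itertools import combinations

# A characteristic function v : 2^N -> R stored as a dict keyed by frozensets.
# The worth len(S)**2 is a hypothetical (superadditive) example.
N = (1, 2, 3)
v = {frozenset(S): len(S) ** 2
     for k in range(len(N) + 1)
     for S in combinations(N, k)}

print(v[frozenset({1, 2})])  # worth of the coalition {1, 2} -> 4
print(v[frozenset(N)])       # worth of the grand coalition -> 9
```

A coalition structure such as {{1, 2, 3}, {4, 6}, {5}} would then just be a partition of N into such frozensets.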
1.7 Dynamic games
Dynamic games can be cooperative or noncooperative in form.
One class of cooperative dynamic games is the hierarchical (organizational) game.
Consider a hierarchy of players, for example,
    President
    Vice-President
    ...
    Workers

Each level has control over a subset of all decision variables. Each may have an objective function that depends on the decision variables of other levels. Suppose that the top level makes his/her decision first, and then passes that information to the next level. The next level then makes his/her decision, and the play progresses down through the hierarchy.
The key ingredient in these problems is preemption, i.e., the “friction of space and time.”
Even when everything is linear, such systems can produce stable, inadmissible solutions (Chew).
stable: No one wants to unilaterally move away from the given solution point.

inadmissible: There are feasible solution points that produce better payoffs for all players than the given solution point.
IE675 Game Theory
Lecture Note Set 2
Wayne F. Bialas
Friday, March 30, 2001
2 TWO-PERSON GAMES

2.1 Two-Person Zero-Sum Games
2.1.1 Basic ideas
Definition 2.1. A game (in extensive form) is said to be zero-sum if and only if, at each terminal vertex, the payoff vector (p₁, . . . , p_n) satisfies

    ∑_{i=1}^n p_i = 0.
Two-person zero-sum games in normal form. Here's an example. . .

    A = [ −1 −3 −3 −2
           0  1 −2 −1
           2 −2  0  1 ]

The rows represent the strategies of Player 1. The columns represent the strategies of Player 2. The entries a_ij represent the payoff vector (a_ij, −a_ij). That is, if Player 1 chooses row i and Player 2 chooses column j, then Player 1 wins a_ij and Player 2 loses a_ij. If a_ij < 0, then Player 1 pays Player 2 |a_ij|.
Note 2.1. We are using the term strategy rather than action to describe the player's options. The reasons for this will become evident in the next chapter when we use this formulation to analyze games in extensive form.

Note 2.2. Some authors (in particular, those in the field of control theory) prefer to represent the outcome of a game in terms of losses rather than profits. During the semester, we will use both conventions.
How should each player behave? Player 1, for example, might want to place a bound on his profits. Player 1 could ask "For each of my possible strategies, what is the least desirable thing that Player 2 could do to minimize my profits?" For each of Player 1's strategies i, compute

    α_i = min_j a_ij

and then choose that i which produces max_i α_i. Suppose this maximum is achieved for i = i*. In other words, Player 1 is guaranteed to get at least

    V̲(A) = min_j a_{i*j} ≥ min_j a_ij    i = 1, . . . , m

The value V̲(A) is called the gain floor for the game A.

In this case V̲(A) = −2 with i* ∈ {2, 3}.
Player 2 could perform a similar analysis and find that j* which yields

    V̄(A) = max_i a_{ij*} ≤ max_i a_ij    j = 1, . . . , n

The value V̄(A) is called the loss ceiling for the game A.

In this case V̄(A) = 0 with j* = 3.
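The gain floor and loss ceiling of the example can be computed straight from their definitions. The sketch below uses 0-based indexing, so i* ∈ {2, 3} shows up as rows [1, 2] and j* = 3 as column 2.

```python
# Gain floor V_(A) = max_i min_j a_ij and loss ceiling V^(A) = min_j max_i a_ij
# for the 3x4 example game (0-based indices).
A = [[-1, -3, -3, -2],
     [ 0,  1, -2, -1],
     [ 2, -2,  0,  1]]

row_mins = [min(row) for row in A]        # worst case for each of Player 1's rows
col_maxs = [max(col) for col in zip(*A)]  # worst case for each of Player 2's columns

gain_floor = max(row_mins)                # V_(A)
loss_ceiling = min(col_maxs)              # V^(A)
i_star = [i for i, m in enumerate(row_mins) if m == gain_floor]
j_star = [j for j, m in enumerate(col_maxs) if m == loss_ceiling]

print(gain_floor, i_star)    # -2 [1, 2]
print(loss_ceiling, j_star)  # 0 [2]
```

Note that gain_floor ≤ loss_ceiling here, as Theorem 2.1 below guarantees in general.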
Now, consider the joint strategies (i*, j*). We immediately get the following:

Theorem 2.1. For every (finite) matrix game A = [a_ij]:

1. The values V̲(A) and V̄(A) are unique.

2. There exists at least one security strategy for each player, given by (i*, j*).

3. min_j a_{i*j} = V̲(A) ≤ V̄(A) = max_i a_{ij*}

Proof: (1) and (2) are easy. To prove (3), note that for any k and ℓ,

    min_j a_kj ≤ a_kℓ ≤ max_i a_iℓ

and the result follows.
2.1.2 Discussion
Let's examine the decision-making philosophy that underlies the choice of (i*, j*). For instance, Player 1 appears to be acting as if Player 2 is trying to do as much harm to him as possible. This seems reasonable since this is a zero-sum game. Whatever Player 1 wins, Player 2 loses.

As we proceed through this presentation, note that this same reasoning is also used in the field of statistical decision theory, where Player 1 is the statistician and Player 2 is "nature." Is it reasonable to assume that "nature" is a malevolent opponent?
2.1.3 Stability
Consider another example

    A = [ −4  0  1
           0  1 −3
          −1 −2 −1 ]

Player 1 should consider i* = 3 (V̲ = −2) and Player 2 should consider j* = 1 (V̄ = 0).
However, Player 2 can continue his analysis as follows
• Player 2 will choose strategy 1
• So Player 1 should choose strategy 2 rather than strategy 3
• But Player 2 would predict that and then prefer strategy 3
and so on.
Question 2.1. When do we have a stable choice of strategies?
The answer to the above question gives rise to some of the really important earlyresults in game theory and mathematical programming.
We can see that if V̲(A) = V̄(A), then both players will settle on (i*, j*) with

    min_j a_{i*j} = V̲(A) = V̄(A) = max_i a_{ij*}
Theorem 2.2. If V̲(A) = V̄(A) then

1. A has a saddle point.

2. The saddle point corresponds to the security strategies for each player.

3. The value for the game is V = V̲(A) = V̄(A).
Question 2.2. Suppose V̲(A) < V̄(A). What can we do? Can we establish a "spy-proof" mechanism to implement a strategy?

Question 2.3. Is it ever sensible to use expected loss (or profit) as a performance criterion in determining strategies for "one-shot" (nonrepeated) decision problems?
2.1.4 Developing Mixed Strategies
Consider the following matrix game. . .

    A = [ 3 −1
          0  1 ]

For Player 1, we have V̲(A) = 0 and i* = 2. For Player 2, we have V̄(A) = 1 and j* = 2. This game does not have a saddle point.
Let's try to create a "spy-proof" strategy. Let Player 1 randomize over his two pure strategies. That is, Player 1 will pick the vector of probabilities x = (x₁, x₂) where ∑_i x_i = 1 and x_i ≥ 0 for all i. He will then select strategy i with probability x_i.

Note 2.3. When we formalize this, we will call the probability vector x a mixed strategy.

To determine the "best" choice of x, Player 1 analyzes the problem as follows. . .
[Figure: Player 1's expected payoff against each of Player 2's pure strategies, plotted over mixed strategies from (x₁, x₂) = (0, 1) to (1, 0); the payoff axis runs from −1 to 3, the two lines cross at x₁ = 1/5, and the guaranteed payoff there is 3/5.]
Player 2 might do the same thing using probability vector y = (y₁, y₂) where ∑_i y_i = 1 and y_i ≥ 0 for all i.

[Figure: Player 2's expected loss against each of Player 1's pure strategies, plotted over mixed strategies from (y₁, y₂) = (0, 1) to (1, 0); the two lines cross at y₁ = 2/5, where the guaranteed loss is 3/5.]
If Player 1 adopts mixed strategy (x₁, x₂) and Player 2 adopts mixed strategy (y₁, y₂), we obtain an expected payoff of

    V = 3x₁y₁ + 0(1 − x₁)y₁ − x₁(1 − y₁) + (1 − x₁)(1 − y₁)
      = 5x₁y₁ − y₁ − 2x₁ + 1

Suppose Player 1 uses x₁* = 1/5; then

    V = 5(1/5)y₁ − y₁ − 2(1/5) + 1 = 3/5

which doesn't depend on y! Similarly, suppose Player 2 uses y₁* = 2/5; then

    V = 5x₁(2/5) − (2/5) − 2x₁ + 1 = 3/5

which doesn't depend on x!
Each player is solving a constrained optimization problem. For Player 1 the problem is

    max v
    st:  3x₁ + 0x₂ ≥ v
        −1x₁ + 1x₂ ≥ v
         x₁ + x₂ = 1
         x_i ≥ 0 ∀ i
which can be illustrated as follows:
[Figure: the constraint lines 3x₁ + 0x₂ and −x₁ + x₂ plotted over mixed strategies from (x₁, x₂) = (0, 1) to (1, 0); v is the lower envelope of the two lines, maximized where they cross.]
This problem is equivalent to

    max_x min{ 3x₁ + 0x₂, −x₁ + x₂ }
For Player 2 the problem is

    min v
    st:  3y₁ − 1y₂ ≤ v
         0y₁ + 1y₂ ≤ v
         y₁ + y₂ = 1
         y_j ≥ 0 ∀ j

which is equivalent to

    min_y max{ 3y₁ − y₂, 0y₁ + y₂ }
We recognize these as dual linear programming problems.
Question 2.4. We now have a way to compute a "spy-proof" mixed strategy for each player. Modify these two mathematical programming problems to produce the pure security strategy for each player.
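Player 1's equivalent max-min problem above can also be approximated by brute force: scan x₁ over a grid and keep the best worst-case payoff. The grid resolution below is an arbitrary choice for this sketch.

```python
# Approximate max_x min{3*x1 + 0*x2, -x1 + x2} for A = [[3, -1], [0, 1]]
# by a grid search over x1; the optimum is x1 = 1/5 with value 3/5.
A = [[3, -1], [0, 1]]
steps = 10_000

best_v, best_x1 = float("-inf"), None
for k in range(steps + 1):
    x1 = k / steps
    x2 = 1.0 - x1
    # Worst case over Player 2's pure strategies (columns).
    worst = min(A[0][j] * x1 + A[1][j] * x2 for j in range(2))
    if worst > best_v:
        best_v, best_x1 = worst, x1

print(best_x1, best_v)  # approximately 0.2 and 0.6
```

Since the objective is piecewise linear in x₁, the grid search lands exactly on the crossing point 1/5 whenever 1/5 lies on the grid.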
In general, the players are solving the following pair of dual linear programming problems:

    max v
    st: ∑_i a_ij x_i ≥ v ∀ j
        ∑_i x_i = 1
        x_i ≥ 0 ∀ i

and

    min v
    st: ∑_j a_ij y_j ≤ v ∀ i
        ∑_j y_j = 1
        y_j ≥ 0 ∀ j
Note 2.4. Consider, once again, the example game

    A = [ 3 −1
          0  1 ]

If Player 1 (the maximizer) uses mixed strategy (x₁, (1 − x₁)), and if Player 2 (the minimizer) uses mixed strategy (y₁, (1 − y₁)), we get

    E(x, y) = 5x₁y₁ − y₁ − 2x₁ + 1

and letting x₁* = 1/5 and y₁* = 2/5 we get E(x*, y) = E(x, y*) = 3/5 for any x and y.

These choices for x* and y* make the expected value independent of the opposing strategy. So, if Player 1 becomes a minimizer (or if Player 2 becomes a maximizer) the resulting mixed strategies would be the same!
Note 2.5. Consider the game

    A = [ 1 3
          4 2 ]

By "factoring" the expression for E(x, y), we can write

    E(x, y) = x₁y₁ + 3x₁(1 − y₁) + 4(1 − x₁)y₁ + 2(1 − x₁)(1 − y₁)
            = −4x₁y₁ + x₁ + 2y₁ + 2
            = −4(x₁y₁ − x₁/4 − y₁/2 + 1/8) + 2 + 1/2
            = −4(x₁ − 1/2)(y₁ − 1/4) + 5/2

It's now easy to see that x₁* = 1/2, y₁* = 1/4 and v = 5/2.
2.1.5 A more formal statement of the problem
Suppose we are given a matrix gameA(m×n) ≡[
aij]
. Each row ofA is a purestrategy for Player 1. Each column ofA is a pure strategy for Player 2. The valueof aij is the payoff from Player 1 to Player 2 (it may be negative).
For Player 1 letV (A) = max
imin
jaij
For Player 2 letV (A) = min
jmax
iaij
{Case 1} (Saddle Point Case, where V̲(A) = V̄(A) = V)
Player 1 can assure himself of getting at least V from Player 2 by playing his maximin strategy.

{Case 2} (Mixed Strategy Case, where V̲(A) < V̄(A))
Player 1 uses probability vector

    x = (x₁, . . . , x_m)    ∑_i x_i = 1,  x_i ≥ 0

Player 2 uses probability vector

    y = (y₁, . . . , y_n)    ∑_j y_j = 1,  y_j ≥ 0
If Player 1 uses x and Player 2 uses strategy j, the expected payoff is

    E(x, j) = ∑_i x_i a_ij = xA_j

where A_j is column j from matrix A.

If Player 2 uses y and Player 1 uses strategy i, the expected payoff is

    E(i, y) = ∑_j a_ij y_j = A_i yᵀ

where A_i is row i from matrix A.
Combined, if Player 1 uses x and Player 2 uses y, the expected payoff is

    E(x, y) = ∑_i ∑_j x_i a_ij y_j = xAyᵀ

The players are solving the following pair of dual linear programming problems:

    max v
    st: ∑_i a_ij x_i ≥ v ∀ j
        ∑_i x_i = 1
        x_i ≥ 0 ∀ i

and

    min v
    st: ∑_j a_ij y_j ≤ v ∀ i
        ∑_j y_j = 1
        y_j ≥ 0 ∀ j
The Minimax Theorem (von Neumann, 1928) states that there exist mixed strategies x* and y* for Players 1 and 2 which solve each of the above problems with equal objective function values.
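One way to watch the theorem at work numerically is Brown's fictitious play, which is not part of these notes (its convergence for zero-sum games is Robinson's theorem): each player repeatedly best-responds to the opponent's empirical mixture, and the guarantees induced by the empirical strategies bracket the value from below and above. A minimal sketch on the running 2 × 2 example:

```python
# Fictitious play on A = [[3, -1], [0, 1]] (value 3/5): each player best-responds
# to the opponent's empirical strategy; the empirical mixtures yield payoff
# bounds v_lo <= value <= v_hi that squeeze together as play continues.
A = [[3, -1], [0, 1]]
m, n = 2, 2
row_counts = [0] * m  # how often Player 1 has played each row
col_counts = [0] * n  # how often Player 2 has played each column
i, j = 0, 0           # arbitrary initial pure strategies

for t in range(200_000):
    row_counts[i] += 1
    col_counts[j] += 1
    # Player 1 maximizes against Player 2's empirical mixture, and vice versa.
    i = max(range(m), key=lambda r: sum(A[r][c] * col_counts[c] for c in range(n)))
    j = min(range(n), key=lambda c: sum(A[r][c] * row_counts[r] for r in range(m)))

x = [c / sum(row_counts) for c in row_counts]  # empirical mixed strategies
y = [c / sum(col_counts) for c in col_counts]
v_lo = min(sum(A[r][c] * x[r] for r in range(m)) for c in range(n))
v_hi = max(sum(A[r][c] * y[c] for c in range(n)) for r in range(m))
print(v_lo, v_hi)  # both close to 3/5, with v_lo <= 3/5 <= v_hi
```

The sandwich v_lo ≤ 3/5 ≤ v_hi holds after any number of iterations; that the gap actually closes is exactly what the minimax theorem makes possible.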
2.1.6 Proof of the minimax theorem
Note 2.6. (From Basar and Olsder [1]) The theory of finite zero-sum games dates back to Borel in the early 1920s, whose work on the subject was later translated into English (Borel, 1953). Borel introduced the notion of a conflicting decision situation that involves more than one decision maker, and the concepts of pure and mixed strategies, but he did not really develop a complete theory of zero-sum games. Borel even conjectured that the minimax theorem was false.

It was von Neumann who first came up with a proof of the minimax theorem, and laid down the foundations of game theory as we know it today (von Neumann 1928, 1937).

The following proof of the minimax theorem does not use the powerful tools of duality in linear programming. It is provided here for historical purposes.
Theorem 2.3. (Minimax Theorem) Let A = [a_ij] be an m × n matrix of real numbers. Let Ξ_r denote the set of all r-dimensional probability vectors, that is,

    Ξ_r = { x ∈ ℝʳ | ∑_{i=1}^r x_i = 1 and x_i ≥ 0 }
We sometimes call Ξ_r the probability simplex.

Let x ∈ Ξ_m and y ∈ Ξ_n. Define

    V̲(A) = max_x min_y xAyᵀ
    V̄(A) = min_y max_x xAyᵀ

Then V̲(A) = V̄(A).
Proof: (By finite induction on m + n.)

The result is clearly true when A is a 1 × 1 matrix (i.e., m + n = 2).

Assume that the theorem is true for all p × q matrices such that p + q < m + n (i.e., for any submatrix of an m × n matrix). We will now show the result is true for any m × n matrix.

First note that xAyᵀ, max_x xAyᵀ and min_y xAyᵀ are all continuous functions of (x, y), x and y, respectively. Any continuous, real-valued function on a compact set has an extremum. Therefore, there exist x⁰ and y⁰ such that

    V̲(A) = min_y x⁰Ayᵀ
    V̄(A) = max_x xAy⁰ᵀ

It is clear that

    V̲(A) ≤ x⁰Ay⁰ᵀ ≤ V̄(A)    (1)
So we only need to show that V̲(A) ≥ V̄(A).

Let's assume that V̲(A) < V̄(A). We will show that this produces a contradiction.

From Equation 1, either V̲(A) < x⁰Ay⁰ᵀ or V̄(A) > x⁰Ay⁰ᵀ. We'll assume V̲(A) < x⁰Ay⁰ᵀ (the other half of the proof is similar).

Let Ã be an (m − 1) × n matrix obtained from A by deleting a row. Let Â be an m × (n − 1) matrix obtained from A by deleting a column. We then have, for all x′ ∈ Ξ_{m−1} and y′ ∈ Ξ_{n−1},

    V̲(Ã) = max_{x′} min_y x′Ãyᵀ ≤ V̲(A)
    V̄(Ã) = min_y max_{x′} x′Ãyᵀ ≤ V̄(A)
    V̲(Â) = max_x min_{y′} xÂy′ᵀ ≥ V̲(A)
    V̄(Â) = min_{y′} max_x xÂy′ᵀ ≥ V̄(A)
We know that V̲(Ã) = V̄(Ã) and V̲(Â) = V̄(Â), by the induction hypothesis. Thus

    V̄(Ã) = V̲(Ã) ≤ V̲(A)
    V̄(Ã) ≤ V̲(A) < V̄(A)

and

    V̲(Â) = V̄(Â) ≥ V̄(A)
    V̲(Â) ≥ V̄(A) > V̲(A)

So, if it is really true that V̲(A) < V̄(A) then, for all Ã and Â, we have

    V̄(A) > V̄(Ã)
    V̲(Â) > V̲(A)
Now, since V̲(Â) > V̲(A) for every column-deleted matrix Â, we will show that we can construct a vector Δx such that

    min_y (x⁰ + εΔx)Ayᵀ > V̲(A) = max_x min_y xAyᵀ

for some ε > 0. This would be clearly false and would yield our contradiction.
To construct Δx, recall that we assumed

    x⁰Ay⁰ᵀ > V̲(A)    (2)

For some k there is a column A_k of A such that

    x⁰A_k > V̲(A)

This must be true because if x⁰A_j ≤ V̲(A) for all columns A_j, then x⁰Ay⁰ᵀ ≤ V̲(A), violating Equation 2.

For the above choice of k, let Â_k denote the m × (n − 1) matrix obtained by deleting the kth column from A. From calculus, there must exist x′ ∈ Ξ_m such that

    V̲(Â_k) = max_x min_{y′} xÂ_k y′ᵀ = min_{y′} x′Â_k y′ᵀ

where y′ ∈ Ξ_n but with y′_k = 0 (because we deleted the kth column from A).

Now, let Δx = x′ − x⁰ ≠ 0 to get

    V̲(Â_k) = min_{y′} (x⁰ + Δx)Ay′ᵀ > V̲(A)
To summarize, by assuming Equation 2, we now have the following:

    V̲(A) = min_y x⁰Ayᵀ
    V̲(Â_k) = min_{y′} (x⁰ + Δx)Ay′ᵀ > V̲(A)

Therefore,

    min_{y′} (x⁰ + εΔx)Ay′ᵀ > V̲(A) = min_y x⁰Ayᵀ

for all 0 < ε ≤ 1.

In addition, xA_k is continuous, and linear in x, so

    x⁰A_k > V̲(A)

implies

    (x⁰ + εΔx)A_k > V̲(A)

for some small ε > 0. Let x⋆ = x⁰ + εΔx for that small ε. This produces

    min_y x⋆Ayᵀ = min_{0 ≤ y_k ≤ 1} { (1 − y_k)(min_{y′} x⋆Ay′ᵀ) + y_k (x⋆A_k) }
                = min { min_{y′} x⋆Ay′ᵀ, x⋆A_k }
                > V̲(A)

which is a contradiction.
2.1.7 The minimax theorem and duality
The following theorem provides a modern proof of the minimax theorem, using duality:¹

Theorem 2.4. Consider the matrix game A with mixed strategies x and y for Player 1 and Player 2, respectively. Then

1. minimax statement

    max_x min_y E(x, y) = min_y max_x E(x, y)

¹ This theorem and proof is from my own notebook from a Game Theory course taught at Cornell in the summer of 1972. The course was taught by Professors William Lucas and Louis Billera. I believe, but I cannot be sure, that this particular proof is from Professor Billera.
2. saddle point statement (mixed strategies): There exist x* and y* such that

    E(x, y*) ≤ E(x*, y*) ≤ E(x*, y)

for all x and y.

2a. saddle point statement (pure strategies): Let E(i, y) denote the expected value for the game if Player 1 uses pure strategy i and Player 2 uses mixed strategy y. Let E(x, j) denote the expected value for the game if Player 1 uses mixed strategy x and Player 2 uses pure strategy j. There exist x* and y* such that

    E(i, y*) ≤ E(x*, y*) ≤ E(x*, j)

for all i and j.
3. LP feasibility statement: There exist x*, y*, and v′ = v″ such that

    ∑_i a_ij x_i* ≥ v′ ∀ j,   ∑_i x_i* = 1,   x_i* ≥ 0 ∀ i

    ∑_j a_ij y_j* ≤ v″ ∀ i,   ∑_j y_j* = 1,   y_j* ≥ 0 ∀ j

4. LP duality statement: The objective function values are the same for the following two linear programming problems:

    max v
    st: ∑_i a_ij x_i ≥ v ∀ j
        ∑_i x_i = 1
        x_i ≥ 0 ∀ i

    min v
    st: ∑_j a_ij y_j ≤ v ∀ i
        ∑_j y_j = 1
        y_j ≥ 0 ∀ j
Proof: We will sketch the proof for the above results by showing that

    (4) ⇒ (3) ⇒ (2) ⇒ (1) ⇒ (3) ⇒ (4)

and

    (2) ⇔ (2a).

{(4) ⇒ (3)}: (3) is just a special case of (4).
{(3) ⇒ (2)}: Let 1_n denote a column vector of n ones. Then (3) implies that there exist x*, y*, and v′ = v″ such that

    x*A ≥ v′1_nᵀ
    x*Ayᵀ ≥ v′(1_nᵀyᵀ) = v′ ∀ y

and

    Ay*ᵀ ≤ v″1_m
    xAy*ᵀ ≤ v″(x1_m) = v″ ∀ x

Hence,

    E(x*, y) ≥ v′ = v″ ≥ E(x, y*) ∀ x, y

and E(x*, y*) = v′ = v″.

{(2) ⇒ (2a)}: (2a) is just a special case of (2) using mixed strategies x with x_i = 1 and x_k = 0 for k ≠ i.

{(2a) ⇒ (2)}: For each i, consider all convex combinations of vectors x with x_i = 1 and x_k = 0 for k ≠ i. Since E(i, y*) ≤ v for every i, we must have E(x, y*) ≤ v for every mixed strategy x.
{(2) ⇒ (1)}:

• {Case ≥}

    E(x, y*) ≤ E(x*, y) ∀ x, y
    max_x E(x, y*) ≤ E(x*, y) ∀ y
    max_x E(x, y*) ≤ min_y E(x*, y)
    min_y max_x E(x, y) ≤ max_x E(x, y*) ≤ min_y E(x*, y) ≤ max_x min_y E(x, y)

• {Case ≤}

    min_y E(x, y) ≤ E(x, y) ∀ x, y
    max_x [min_y E(x, y)] ≤ max_x E(x, y) ∀ y
    max_x [min_y E(x, y)] ≤ min_y [max_x E(x, y)]
{(1) ⇒ (3)}:

    max_x [min_y E(x, y)] = min_y [max_x E(x, y)]

Let f(x) = min_y E(x, y). From calculus, there exists x* such that f(x) attains its maximum value at x*. Hence

    min_y E(x*, y) = max_x [min_y E(x, y)]

{(3) ⇒ (4)}: This is direct from the duality theorem of LP. (See Chapter 13 of Dantzig's text.)
Question 2.5. Can the LP problem in section (4) of Theorem 2.4 have alternate optimal solutions? If so, how does that affect the choice of (x*, y*)?²
2.2 Two-Person General-Sum Games

2.2.1 Basic ideas

Two-person general-sum games (sometimes called "bimatrix games") can be represented by two (m × n) matrices A = [a_ij] and B = [b_ij], where a_ij is the "payoff" to Player 1 and b_ij is the "payoff" to Player 2. If A = −B then we get a two-person zero-sum game, A.
Note 2.7. These are noncooperative games with no side payments.
Definition 2.2. The (pure) strategy (i*, j*) is a Nash equilibrium solution to the game (A, B) if

    a_{i*,j*} ≥ a_{i,j*} ∀ i
    b_{i*,j*} ≥ b_{i*,j} ∀ j

Note 2.8. If both players are placed on their respective Nash equilibrium strategies (i*, j*), then neither player can unilaterally move away from that strategy and improve his payoff.
² Thanks to Esra E. Aleisa for this question.
Question 2.6. Show that if A = −B (the zero-sum case), the above definition of a Nash solution corresponds to our previous definition of a saddle point.
Note 2.9. Not every game has a Nash solution using pure strategies.
Note 2.10. A Nash solution need not be the best solution, or even a reasonable solution for a game. It's merely a stable solution against unilateral moves by a single player. For example, consider the game

    (A, B) = [ (4, 0)  (4, 1)
               (5, 3)  (3, 2) ]

This game has two Nash equilibrium strategies, with payoffs (4, 1) and (5, 3). Note that both players prefer (5, 3) when compared with (4, 1).
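Pure Nash equilibria can be enumerated by checking the conditions of Definition 2.2 at every cell: (i*, j*) is Nash iff a_{i*,j*} is a column maximum of A and b_{i*,j*} is a row maximum of B. A sketch on the example above (0-based indices):

```python
# Enumerate the pure Nash equilibria of a bimatrix game (A, B).
A = [[4, 4], [5, 3]]
B = [[0, 1], [3, 2]]
m, n = len(A), len(A[0])

nash = [(i, j)
        for i in range(m) for j in range(n)
        if A[i][j] >= max(A[k][j] for k in range(m))    # Player 1 can't improve
        and B[i][j] >= max(B[i][k] for k in range(n))]  # Player 2 can't improve

print(nash)                                   # [(0, 1), (1, 0)]
print([(A[i][j], B[i][j]) for i, j in nash])  # [(4, 1), (5, 3)]
```

Running the same check on the Prisoner's Dilemma below would return its single equilibrium, where both players confess.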
Question 2.7. What is the solution to the following simple modification of the above game:³

    (A, B) = [ (4, 0)  (4, 1)
               (4, 2)  (3, 2) ]
Example 2.1. (Prisoner's Dilemma) Two suspects in a crime have been picked up by police and placed in separate rooms. If both confess (C), each will be sentenced to 3 years in prison. If only one confesses, he will be set free and the other (who didn't confess (NC)) will be sent to prison for 4 years. If neither confesses, they will both go to prison for 1 year.

This game can be represented in strategic form, as follows:

             C           NC
    C    (−3, −3)    (0, −4)
    NC   (−4, 0)     (−1, −1)

This game has one Nash equilibrium strategy, with payoffs (−3, −3). When compared with the other solutions, note that it represents one of the worst outcomes for both players.
2.2.2 Properties of Nash strategies
³ Thanks to Esra E. Aleisa for this question.
Definition 2.3. The pure strategy pair (i₁, j₁) weakly dominates (i₂, j₂) if and only if

    a_{i₁,j₁} ≥ a_{i₂,j₂}
    b_{i₁,j₁} ≥ b_{i₂,j₂}

and one of the above inequalities is strict.

Definition 2.4. The pure strategy pair (i₁, j₁) strongly dominates (i₂, j₂) if and only if

    a_{i₁,j₁} > a_{i₂,j₂}
    b_{i₁,j₁} > b_{i₂,j₂}

Definition 2.5. (Weiss [3]) The pure strategy pair (i, j) is inadmissible if there exists some strategy pair (i′, j′) that weakly dominates (i, j).

Definition 2.6. (Weiss [3]) The pure strategy pair (i, j) is admissible if it is not inadmissible.
Example 2.2. Consider again the game

    (A, B) = [ (4, 0)  (4, 1)
               (5, 3)  (3, 2) ]

with Nash equilibrium payoffs (4, 1) and (5, 3). Only (5, 3) is admissible.

Note 2.11. If there exist multiple admissible Nash equilibria, then side payments (with collusion) may yield a "better" solution for all players.
Definition 2.7. Two bimatrix games (A, B) and (C, D) are strategically equivalent if there exist α₁ > 0, α₂ > 0 and scalars β₁, β₂ such that

    a_ij = α₁c_ij + β₁ ∀ i, j
    b_ij = α₂d_ij + β₂ ∀ i, j

Theorem 2.5. If bimatrix games (A, B) and (C, D) are strategically equivalent and (i*, j*) is a Nash strategy for (A, B), then (i*, j*) is also a Nash strategy for (C, D).

Note 2.12. This was used to modify the original matrices for the Prisoners' Dilemma problem in Example 2.1.
2.2.3 Nash equilibria using mixed strategies

Sometimes the bimatrix game (A, B) does not have a Nash strategy using pure strategies. As before, we can use mixed strategies for such games.

Definition 2.8. The (mixed) strategy (x*, y*) is a Nash equilibrium solution to the game (A, B) if

    x*Ay*ᵀ ≥ xAy*ᵀ ∀ x ∈ Ξ_m
    x*By*ᵀ ≥ x*Byᵀ ∀ y ∈ Ξ_n

where Ξ_r is the r-dimensional probability simplex.
Question 2.8. Consider the game

    (A, B) = [ (−2, −4)  (0, −3)
               (−3, 0)   (1, −1) ]

Can we find mixed strategies (x*, y*) that provide a Nash solution as defined above?
Theorem 2.6. Every bimatrix game has at least one Nash equilibrium solution in mixed strategies.

Proof: (This is the sketch provided by the text for Proposition 33.1; see Chapter 3 for a complete proof for N ≥ 2 players.)

Consider the sets Ξ_m and Ξ_n consisting of the mixed strategies for Player 1 and Player 2, respectively. Note that Ξ_m × Ξ_n is nonempty, convex and compact. Since the expected payoff functions xAyᵀ and xByᵀ are linear in (x, y), the result follows using Brouwer's fixed point theorem.
2.2.4 Finding Nash mixed strategies

Consider again the game

    (A, B) = [ (−2, −4)  (0, −3)
               (−3, 0)   (1, −1) ]

For Player 1,

    xAyᵀ = −2x₁y₁ − 3(1 − x₁)y₁ + (1 − x₁)(1 − y₁)
         = 2x₁y₁ − x₁ − 4y₁ + 1
For Player 2,

    xByᵀ = −4x₁y₁ − 3x₁(1 − y₁) − (1 − x₁)(1 − y₁)
         = −2x₁y₁ − 2x₁ + y₁ − 1
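Both collapsed forms can be checked against the raw bilinear payoffs on a grid; a small sketch:

```python
# Check the expansions xAy^T = 2*x1*y1 - x1 - 4*y1 + 1 and
# xBy^T = -2*x1*y1 - 2*x1 + y1 - 1 for the bimatrix example.
A = [[-2, 0], [-3, 1]]
B = [[-4, -3], [0, -1]]

def bilinear(M, x1, y1):
    """x M y^T with x = (x1, 1-x1), y = (y1, 1-y1)."""
    return (M[0][0] * x1 * y1 + M[0][1] * x1 * (1 - y1)
            + M[1][0] * (1 - x1) * y1 + M[1][1] * (1 - x1) * (1 - y1))

grid = [k / 10 for k in range(11)]
assert all(abs(bilinear(A, x1, y1) - (2 * x1 * y1 - x1 - 4 * y1 + 1)) < 1e-12
           for x1 in grid for y1 in grid)
assert all(abs(bilinear(B, x1, y1) - (-2 * x1 * y1 - 2 * x1 + y1 - 1)) < 1e-12
           for x1 in grid for y1 in grid)
print("expansions verified")
```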
In order for (x*, y*) to be a Nash equilibrium, we must have

    x*Ay*ᵀ ≥ xAy*ᵀ ∀ x ∈ Ξ_m    (3)
    x*By*ᵀ ≥ x*Byᵀ ∀ y ∈ Ξ_n    (4)

For Player 1 this means that we want (x*, y*) so that for all 0 ≤ x₁ ≤ 1

    2x₁*y₁* − x₁* − 4y₁* + 1 ≥ 2x₁y₁* − x₁ − 4y₁* + 1
    2x₁*y₁* − x₁* ≥ 2x₁y₁* − x₁

Let's try y₁* = 1/2. We get

    2x₁*(1/2) − x₁* ≥ 2x₁(1/2) − x₁
    0 ≥ 0

Therefore, if y* = (1/2, 1/2) then any x* can be chosen and condition (3) will be satisfied.
Note that only condition (3) and Player 1's matrix A were used to get Player 2's strategy y*.

For Player 2 the same thing happens if we use x₁* = 1/2 and condition (4). That is, for all 0 ≤ y₁ ≤ 1

    −2x₁*y₁* − 2x₁* + y₁* − 1 ≥ −2x₁*y₁ − 2x₁* + y₁ − 1
    −2x₁*y₁* + y₁* ≥ −2x₁*y₁ + y₁
    −2(1/2)y₁* + y₁* ≥ −2(1/2)y₁ + y₁
    0 ≥ 0
How can we get the values of (x*, y*) that will work? One suggested approach (from Basar and Olsder [1]) uses the following:
Theorem 2.7. Any mixed Nash equilibrium solution (x*, y*) in the interior of Ξ_m × Ξ_n must satisfy

    ∑_{j=1}^n y_j*(a_ij − a_1j) = 0 ∀ i ≠ 1    (5)
    ∑_{i=1}^m x_i*(b_ij − b_i1) = 0 ∀ j ≠ 1    (6)
Proof: Recall that

    E(x, y) = xAyᵀ = ∑_{i=1}^m ∑_{j=1}^n x_i y_j a_ij = ∑_{j=1}^n ∑_{i=1}^m x_i y_j a_ij

Since x₁ = 1 − ∑_{i=2}^m x_i, we have

    xAyᵀ = ∑_{j=1}^n [ ∑_{i=2}^m x_i y_j a_ij + (1 − ∑_{i=2}^m x_i) y_j a_1j ]
         = ∑_{j=1}^n [ y_j a_1j + y_j ∑_{i=2}^m x_i (a_ij − a_1j) ]
         = ∑_{j=1}^n y_j a_1j + ∑_{i=2}^m x_i ∑_{j=1}^n y_j (a_ij − a_1j)

If (x*, y*) is an interior maximum (or minimum) then

    ∂/∂x_i (xAyᵀ) = ∑_{j=1}^n y_j (a_ij − a_1j) = 0    for i = 2, . . . , m

which provides Equations 5.

The derivation of Equations 6 is similar.
Note 2.13. In the proof we have the equation

    xAyᵀ = ∑_{j=1}^n y_j a_1j + ∑_{i=2}^m x_i ∑_{j=1}^n y_j (a_ij − a_1j)
Any Nash solution (x*, y*) in the interior of Ξ_m × Ξ_n has

    ∑_{j=1}^n y_j*(a_ij − a_1j) = 0 ∀ i ≠ 1

So this choice of y* produces

    xAyᵀ = ∑_{j=1}^n y_j a_1j + ∑_{i=2}^m x_i [0]

making this expression independent of x.
Note 2.14. Equations 5 and 6 only provide necessary (not sufficient) conditions, and only characterize solutions in the interior of the probability simplex (i.e., where every component of x and y is strictly positive).
For our example, these equations produce

    y₁*(a₂₁ − a₁₁) + y₂*(a₂₂ − a₁₂) = 0
    x₁*(b₁₂ − b₁₁) + x₂*(b₂₂ − b₂₁) = 0

Since x₂* = 1 − x₁* and y₂* = 1 − y₁*, we get

    y₁*(−3 − (−2)) + (1 − y₁*)(1 − 0) = 0
    −y₁* + (1 − y₁*) = 0
    y₁* = 1/2

    x₁*(−3 − (−4)) + (1 − x₁*)(−1 − 0) = 0
    x₁* − (1 − x₁*) = 0
    x₁* = 1/2

But, in addition, one must check that x₁* = 1/2 and y₁* = 1/2 are actually Nash solutions.
2.2.5 The Lemke-Howson algorithm

Lemke and Howson [2] developed a quadratic programming technique for finding mixed Nash strategies for two-person general-sum games (A, B) in strategic form. Their method is based on the following fact, provided in their paper:

Let e_k denote a column vector of k ones, and let x and y be row vectors of dimension m and n, respectively. Let p and q denote scalars. We will also assume that A and B are matrices, each with m rows and n columns.

A mixed strategy is defined by a pair (x, y) such that

    x e_m = y e_n = 1, and x ≥ 0, y ≥ 0    (7)

with expected payoffs

    xAyᵀ and xByᵀ.    (8)

A Nash equilibrium solution is a pair (x̄, ȳ) satisfying (7) such that for all (x, y) satisfying (7),

    xAȳᵀ ≤ x̄Aȳᵀ and x̄Byᵀ ≤ x̄Bȳᵀ.    (9)

But this implies that

    Aȳᵀ ≤ (x̄Aȳᵀ)e_m and x̄B ≤ (x̄Bȳᵀ)e_nᵀ.    (10)

Conversely, suppose (10) holds for (x̄, ȳ) satisfying (7). Now choose an arbitrary (x, y) satisfying (7). Multiply the first expression in (10) on the left by x and the second expression in (10) on the right by yᵀ to get (9). Hence, (7) and (10) are, together, equivalent to (7) and (9).
This serves as the foundation for the proof of the following theorem:

Theorem 2.8. Any mixed strategy (x*, y*) for bimatrix game (A, B) is a Nash equilibrium solution if and only if x*, y*, p* and q* solve problem (LH):

    (LH): max_{x,y,p,q} { xAyᵀ + xByᵀ − p − q }
    st:   Ayᵀ ≤ p e_m
          Bᵀxᵀ ≤ q e_n
          x_i ≥ 0 ∀ i
          y_j ≥ 0 ∀ j
          ∑_{i=1}^m x_i = 1
          ∑_{j=1}^n y_j = 1
Proof: (⇒)

Every feasible solution (x, y, p, q) to problem (LH) must satisfy the constraints

    Ayᵀ ≤ p e_m
    xB ≤ q e_nᵀ.

Multiply both sides of the first constraint on the left by x and multiply the second constraint on the right by yᵀ. As a result, we see that a feasible (x, y, p, q) must satisfy

    xAyᵀ ≤ p
    xByᵀ ≤ q.

Hence, for any feasible (x, y, p, q), the objective function must satisfy

    xAyᵀ + xByᵀ − p − q ≤ 0.

Suppose (x*, y*) is any Nash solution for (A, B). Let

    p* = x*Ay*ᵀ
    q* = x*By*ᵀ.

Because of (9) and (10), this implies

    Ay*ᵀ ≤ (x*Ay*ᵀ)e_m = p*e_m
    x*B ≤ (x*By*ᵀ)e_nᵀ = q*e_nᵀ.

So this choice of (x*, y*, p*, q*) is feasible, and results in an objective function value of zero. Hence it's an optimal solution to problem (LH).
(⇐)

Suppose (x̄, ȳ, p̄, q̄) solves problem (LH). From Theorem 2.6, there is at least one Nash solution (x*, y*). Using the above argument, (x*, y*) must be an optimal solution to (LH) with an objective function value of zero. Since (x̄, ȳ, p̄, q̄) is an optimal solution to (LH), we must then have

    x̄Aȳᵀ + x̄Bȳᵀ − p̄ − q̄ = 0    (11)

with (x̄, ȳ, p̄, q̄) satisfying the constraints

    Aȳᵀ ≤ p̄ e_m    (12)
    x̄B ≤ q̄ e_nᵀ.    (13)

Now multiply (12) on the left by x̄ and multiply (13) on the right by ȳᵀ to get

    x̄Aȳᵀ ≤ p̄    (14)
    x̄Bȳᵀ ≤ q̄.    (15)

Then (11), (14), and (15) together imply

    x̄Aȳᵀ = p̄
    x̄Bȳᵀ = q̄.

So (12) and (13) can now be rewritten as

    Aȳᵀ ≤ (x̄Aȳᵀ)e_m    (16)
    x̄B ≤ (x̄Bȳᵀ)e_nᵀ.    (17)

Choose an arbitrary (x, y) ∈ Ξ_m × Ξ_n and, this time, multiply (16) on the left by x and multiply (17) on the right by yᵀ to get

    xAȳᵀ ≤ x̄Aȳᵀ    (18)
    x̄Byᵀ ≤ x̄Bȳᵀ    (19)

for all (x, y) ∈ Ξ_m × Ξ_n. Hence (x̄, ȳ) is a Nash equilibrium solution.
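Theorem 2.8 can be sanity-checked on the example of Section 2.2.4, where x* = y* = (1/2, 1/2) is a Nash pair: taking p = x*Ay*ᵀ and q = x*By*ᵀ should give a feasible point of (LH) with objective value exactly zero. A sketch:

```python
# Check that a known Nash pair attains objective 0 in problem (LH)
# for A = [[-2, 0], [-3, 1]], B = [[-4, -3], [0, -1]], x* = y* = (1/2, 1/2).
A = [[-2, 0], [-3, 1]]
B = [[-4, -3], [0, -1]]
x = [0.5, 0.5]
y = [0.5, 0.5]

xAy = sum(x[i] * A[i][j] * y[j] for i in range(2) for j in range(2))
xBy = sum(x[i] * B[i][j] * y[j] for i in range(2) for j in range(2))
p, q = xAy, xBy

# Feasibility: A y^T <= p*e_m (row payoffs) and x B <= q*e_n^T (column payoffs).
Ay = [sum(A[i][j] * y[j] for j in range(2)) for i in range(2)]
xB = [sum(x[i] * B[i][j] for i in range(2)) for j in range(2)]
assert all(v <= p + 1e-12 for v in Ay)
assert all(v <= q + 1e-12 for v in xB)

objective = xAy + xBy - p - q
print(objective)  # 0.0
```

By the theorem, any feasible point has objective ≤ 0, so attaining 0 certifies optimality for (LH), and hence the Nash property.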
2.3 BIBLIOGRAPHY

[1] T. Basar and G. Olsder, Dynamic Noncooperative Game Theory, Academic Press (1982).

[2] C. E. Lemke and J. T. Howson, Jr., "Equilibrium points of bimatrix games," SIAM Journal, Volume 12, Issue 2 (June 1964), pp. 413–423.

[3] L. Weiss, Statistical Decision Theory, McGraw-Hill (1961).
IE675 Game Theory
Lecture Note Set 3
Wayne F. Bialas
Monday, April 16, 2001
3 N-PERSON GAMES

3.1 N-Person Games in Strategic Form
3.1.1 Basic ideas
We can extend many of the results of the previous chapter to games with N > 2 players.

Let M_i = {1, . . . , m_i} denote the set of m_i pure strategies available to Player i.

Let n_i ∈ M_i be the strategy actually selected by Player i, and let a^i_{n_1,n_2,...,n_N} be the payoff to Player i if

    Player 1 chooses strategy n_1
    Player 2 chooses strategy n_2
    ...
    Player N chooses strategy n_N
Definition 3.1. The strategies (n_1*, . . . , n_N*) with n_i* ∈ M_i for all i form a Nash equilibrium solution if

    a^1_{n_1*,n_2*,...,n_N*} ≥ a^1_{n_1,n_2*,...,n_N*} ∀ n_1 ∈ M_1
¹Department of Industrial Engineering, University at Buffalo, 342 Bell Hall, Box 602050, Buffalo, NY 14260-2050 USA; Email: [email protected]; Web: http://www.acsu.buffalo.edu/~bialas. Copyright © MMI Wayne F. Bialas. All Rights Reserved. Duplication of this work is prohibited without written permission. This document produced April 16, 2001 at 1:47 pm.
a^2_{n*_1,n*_2,...,n*_N} ≥ a^2_{n*_1,n_2,...,n*_N}    ∀ n_2 ∈ M_2
...
a^N_{n*_1,n*_2,...,n*_N} ≥ a^N_{n*_1,n*_2,...,n_N}    ∀ n_N ∈ M_N
Definition 3.2. Two N-person games with payoff functions a^i_{n_1,n_2,...,n_N} and b^i_{n_1,n_2,...,n_N} are strategically equivalent if there exist α_i > 0 and scalars β_i for i = 1, ..., N such that

a^i_{n_1,n_2,...,n_N} = α_i b^i_{n_1,n_2,...,n_N} + β_i    ∀ i
3.1.2 Nash solutions with mixed strategies
Definition 3.3. The mixed strategies (y*^1, ..., y*^N) with y*^i ∈ Ξ_{M_i} for all i form a Nash equilibrium solution if

Σ_{n_1} ··· Σ_{n_N} y*^1_{n_1} y*^2_{n_2} ··· y*^N_{n_N} a^1_{n_1,...,n_N} ≥ Σ_{n_1} ··· Σ_{n_N} y^1_{n_1} y*^2_{n_2} ··· y*^N_{n_N} a^1_{n_1,...,n_N}    ∀ y^1 ∈ Ξ_{M_1}

Σ_{n_1} ··· Σ_{n_N} y*^1_{n_1} y*^2_{n_2} ··· y*^N_{n_N} a^2_{n_1,...,n_N} ≥ Σ_{n_1} ··· Σ_{n_N} y*^1_{n_1} y^2_{n_2} ··· y*^N_{n_N} a^2_{n_1,...,n_N}    ∀ y^2 ∈ Ξ_{M_2}

...

Σ_{n_1} ··· Σ_{n_N} y*^1_{n_1} y*^2_{n_2} ··· y*^N_{n_N} a^N_{n_1,...,n_N} ≥ Σ_{n_1} ··· Σ_{n_N} y*^1_{n_1} y*^2_{n_2} ··· y^N_{n_N} a^N_{n_1,...,n_N}    ∀ y^N ∈ Ξ_{M_N}
Note 3.1. Consider the function

ψ^i_{n_i}(y^1, ..., y^N) = Σ_{n_1} ··· Σ_{n_N} y^1_{n_1} y^2_{n_2} ··· y^N_{n_N} a^i_{n_1,...,n_N} − Σ_{n_1} ··· Σ_{n_{i−1}} Σ_{n_{i+1}} ··· Σ_{n_N} y^1_{n_1} ··· y^{i−1}_{n_{i−1}} y^{i+1}_{n_{i+1}} ··· y^N_{n_N} a^i_{n_1,...,n_N}
This represents the difference between the following two quantities:

1. the expected payoff to Player i if all players adopt mixed strategies (y^1, ..., y^N):

Σ_{n_1} ··· Σ_{n_N} y^1_{n_1} y^2_{n_2} ··· y^N_{n_N} a^i_{n_1,...,n_N}

2. the expected payoff to Player i if all players except Player i adopt mixed strategies (y^1, ..., y^N) and Player i uses pure strategy n_i:

Σ_{n_1} ··· Σ_{n_{i−1}} Σ_{n_{i+1}} ··· Σ_{n_N} y^1_{n_1} ··· y^{i−1}_{n_{i−1}} y^{i+1}_{n_{i+1}} ··· y^N_{n_N} a^i_{n_1,...,n_N}
Remember that the mixed strategies include the pure strategies. For example,(0, 1, 0, . . . , 0) is a mixed strategy that implements pure strategy 2.
For example, in a two-player game, for each n_1 ∈ M_1 we have

ψ^1_{n_1}(y^1, y^2) = [y^1_1 y^2_1 a^1_{11} + y^1_1 y^2_2 a^1_{12} + y^1_2 y^2_1 a^1_{21} + y^1_2 y^2_2 a^1_{22}] − [y^2_1 a^1_{n_1 1} + y^2_2 a^1_{n_1 2}]

The first term

y^1_1 y^2_1 a^1_{11} + y^1_1 y^2_2 a^1_{12} + y^1_2 y^2_1 a^1_{21} + y^1_2 y^2_2 a^1_{22}

is the expected value if Player 1 uses mixed strategy y^1. The second term

y^2_1 a^1_{n_1 1} + y^2_2 a^1_{n_1 2}

is the expected value if Player 1 uses pure strategy n_1. Player 2 uses mixed strategy y^2 in both cases.
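The two-player ψ above is easy to compute directly. A small sketch (the payoff matrix a1 is a hypothetical example, not from the notes):

```python
# Compute psi^1_{n1}(y1, y2) for a two-player game: the expected payoff to
# Player 1 under mixed strategies (y1, y2) minus the expected payoff when
# Player 1 switches to pure strategy n1.

a1 = [[3, 0],
      [1, 2]]          # hypothetical payoff matrix for Player 1

def expected(y1, y2):
    return sum(y1[i] * y2[j] * a1[i][j] for i in range(2) for j in range(2))

def psi1(n1, y1, y2):
    pure = sum(y2[j] * a1[n1][j] for j in range(2))   # Player 1 plays n1
    return expected(y1, y2) - pure

y1, y2 = [0.5, 0.5], [0.8, 0.2]
print([round(psi1(n1, y1, y2), 3) for n1 in (0, 1)])   # [-0.6, 0.6]
```

The negative value at n1 = 0 is exactly the situation the next theorem's proof exploits: Player 1 could gain by moving probability toward pure strategy 1.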
The next theorem (Theorem 3.1) will guarantee that every game has at least one Nash equilibrium in mixed strategies. Its proof depends on things that can go wrong when ψ^i_{n_i}(y^1, ..., y^N) < 0. So we will define

c^i_{n_i}(y^1, ..., y^N) = min{ψ^i_{n_i}(y^1, ..., y^N), 0}

The proof of Theorem 3.1 then uses the expression

ỹ^i_{n_i} = (y^i_{n_i} + c^i_{n_i}) / (1 + Σ_{j∈M_i} c^i_j)

Note that the denominator is the sum (taken over n_i) of the terms in the numerator. If all of the c^i_j vanish, we get ỹ^i_{n_i} = y^i_{n_i}.
Theorem 3.1. Every N-person finite game in normal (strategic) form has a Nash equilibrium solution using mixed strategies.
Proof: Define ψ^i_{n_i} and c^i_{n_i} as above. Consider the expression

ỹ^i_{n_i} = (y^i_{n_i} + c^i_{n_i}) / (1 + Σ_{j∈M_i} c^i_j)    (1)

We will try to find solutions y^i_{n_i} to Equation 1 such that

ỹ^i_{n_i} = y^i_{n_i}    ∀ n_i ∈ M_i and ∀ i = 1, ..., N
The Brouwer fixed point theorem1 guarantees that at least one such solution exists.
We will show that every solution to Equation 1 is a Nash equilibrium solution andthat every Nash equilibrium solution is a solution to Equation 1.
To show that every Nash equilibrium solution is a solution to Equation 1, note that if (y*^1, ..., y*^N) is a Nash solution then, from the definition of a Nash solution,

ψ^i_{n_i}(y*^1, ..., y*^N) ≥ 0

which implies

c^i_{n_i}(y*^1, ..., y*^N) = 0

and this holds for all n_i ∈ M_i and all i = 1, ..., N. Hence, (y*^1, ..., y*^N) solves Equation 1.
Remark: We will show that every solution to Equation 1 is a Nash equilibrium solution by contradiction. That is, we will assume that a mixed strategy (y^1, ..., y^N) is a solution to Equation 1 but is not a Nash solution. This will lead us to conclude that (y^1, ..., y^N) is not a solution to Equation 1, a contradiction.
Assume (y^1, ..., y^N) is a solution to Equation 1 but is not a Nash solution. Then there exists an i ∈ {1, ..., N} (say i = 1) with ỹ^1 ∈ Ξ_{M_1} such that

Σ_{n_1} ··· Σ_{n_N} y^1_{n_1} y^2_{n_2} ··· y^N_{n_N} a^1_{n_1,...,n_N} < Σ_{n_1} ··· Σ_{n_N} ỹ^1_{n_1} y^2_{n_2} ··· y^N_{n_N} a^1_{n_1,...,n_N}
¹The Brouwer fixed point theorem states that if S is a compact and convex subset of R^n and f : S → S is continuous, then there exists at least one x ∈ S such that f(x) = x.
Rewriting the right-hand side,

Σ_{n_1} ··· Σ_{n_N} y^1_{n_1} y^2_{n_2} ··· y^N_{n_N} a^1_{n_1,...,n_N} < Σ_{n_1} ỹ^1_{n_1} [ Σ_{n_2} ··· Σ_{n_N} y^2_{n_2} ··· y^N_{n_N} a^1_{n_1,...,n_N} ]
Now the expression

[ Σ_{n_2} ··· Σ_{n_N} y^2_{n_2} ··· y^N_{n_N} a^1_{n_1,...,n_N} ]

is a function of n_1. Suppose this quantity is maximized when n_1 = ñ_1. We then get
Σ_{n_1} ··· Σ_{n_N} y^1_{n_1} y^2_{n_2} ··· y^N_{n_N} a^1_{n_1,...,n_N} < Σ_{n_1} ỹ^1_{n_1} [ Σ_{n_2} ··· Σ_{n_N} y^2_{n_2} ··· y^N_{n_N} a^1_{ñ_1,n_2,...,n_N} ]

which yields

Σ_{n_1} ··· Σ_{n_N} y^1_{n_1} y^2_{n_2} ··· y^N_{n_N} a^1_{n_1,...,n_N}    (2)
< Σ_{n_2} ··· Σ_{n_N} y^2_{n_2} ··· y^N_{n_N} a^1_{ñ_1,n_2,...,n_N}    (3)
Remark: After this point we don't really use ỹ again. It was just a device to obtain ñ_1, which will produce our contradiction. Remember, throughout the rest of the proof, the values (y^1, ..., y^N) claim to be a fixed point of Equation 1. If (y^1, ..., y^N) is, in fact, not Nash (as was assumed), then we have just found a player (whom we are calling Player 1) who has a pure strategy ñ_1 that can beat strategy y^1 when Players 2, ..., N use mixed strategies (y^2, ..., y^N).
Using ñ_1, Player 1 obtains

ψ^1_{ñ_1}(y^1, ..., y^N) < 0

which means that

c^1_{ñ_1}(y^1, ..., y^N) < 0
which implies that

Σ_{j∈M_1} c^1_j < 0

since one of the indices in M_1 is ñ_1 and the rest of the c^1_j cannot be positive.
Remark: Now the values (y^1, ..., y^N) are in trouble. We have determined that their claim of being non-Nash produces a denominator in Equation 1 that is less than 1. All we need to do is find some pure strategy (say n̂_1) for Player 1 with c^1_{n̂_1}(y^1, ..., y^N) = 0. If we can, (y^1, ..., y^N) will fail to be a fixed point of Equation 1, and it will be y^1 that causes the failure. Let's see what happens...
Recall expression 2:

Σ_{n_1} ··· Σ_{n_N} y^1_{n_1} y^2_{n_2} ··· y^N_{n_N} a^1_{n_1,...,n_N}

rewritten as

Σ_{n_1} y^1_{n_1} [ Σ_{n_2} ··· Σ_{n_N} y^2_{n_2} ··· y^N_{n_N} a^1_{n_1,...,n_N} ]

and consider the term

[ Σ_{n_2} ··· Σ_{n_N} y^2_{n_2} ··· y^N_{n_N} a^1_{n_1,...,n_N} ]    (4)
as a function of n_1. There must be some n_1 = n̂_1 that minimizes expression 4, with

Σ_{n_1} ··· Σ_{n_N} y^1_{n_1} y^2_{n_2} ··· y^N_{n_N} a^1_{n_1,...,n_N} ≥ [ Σ_{n_2} ··· Σ_{n_N} y^2_{n_2} ··· y^N_{n_N} a^1_{n̂_1,n_2,...,n_N} ]

(The inequality holds because the left-hand side is a convex combination, with weights y^1_{n_1}, of the bracketed terms; in fact we may choose n̂_1 with y^1_{n̂_1} > 0 and still have this inequality.)
For that particular strategy we have

ψ^1_{n̂_1}(y^1, ..., y^N) ≥ 0

which means that

c^1_{n̂_1}(y^1, ..., y^N) = 0

Therefore, for Player 1, we get

ỹ^1_{n̂_1} = (y^1_{n̂_1} + 0) / (1 + [something < 0]) > y^1_{n̂_1}
Hence, y^1 (which claimed to be a component of the non-Nash solution (y^1, ..., y^N)) fails to solve Equation 1. A contradiction.
The following theorem is an extension of a result for N = 2 given in Chapter 2. It provides necessary conditions for any interior Nash solution of N-person games.
Theorem 3.2. Any mixed Nash equilibrium solution (y*^1, ..., y*^N) in the interior of Ξ_{M_1} × ··· × Ξ_{M_N} must satisfy

Σ_{n_2} Σ_{n_3} ··· Σ_{n_N} y*^2_{n_2} y*^3_{n_3} ··· y*^N_{n_N} (a^1_{n_1,n_2,n_3,...,n_N} − a^1_{1,n_2,n_3,...,n_N}) = 0    ∀ n_1 ∈ M_1 − {1}

Σ_{n_1} Σ_{n_3} ··· Σ_{n_N} y*^1_{n_1} y*^3_{n_3} ··· y*^N_{n_N} (a^2_{n_1,n_2,n_3,...,n_N} − a^2_{n_1,1,n_3,...,n_N}) = 0    ∀ n_2 ∈ M_2 − {1}

...

Σ_{n_1} Σ_{n_2} ··· Σ_{n_{N−1}} y*^1_{n_1} y*^2_{n_2} ··· y*^{N−1}_{n_{N−1}} (a^N_{n_1,n_2,...,n_N} − a^N_{n_1,n_2,...,n_{N−1},1}) = 0    ∀ n_N ∈ M_N − {1}
Proof: Left to the reader.
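For N = 2 the conditions of Theorem 3.2 reduce to linear indifference equations: each player's interior mixture must make the other player indifferent among pure strategies. A sketch for a 2x2 example (the payoff matrices are illustrative, not from the notes):

```python
# For N = 2, the interior conditions of Theorem 3.2 give one linear
# equation per player.  Solving them for a 2x2 game with exact arithmetic.
from fractions import Fraction as F

a1 = [[F(2), F(0)], [F(0), F(1)]]   # Player 1's payoffs (battle of the sexes)
a2 = [[F(1), F(0)], [F(0), F(2)]]   # Player 2's payoffs

# Player 1's condition: sum_{n2} y2_{n2} (a1[1][n2] - a1[0][n2]) = 0.
# With y2 = (t, 1 - t): t*d0 + (1 - t)*d1 = 0.
d0 = a1[1][0] - a1[0][0]
d1 = a1[1][1] - a1[0][1]
t = -d1 / (d0 - d1)          # assumes d0 != d1 (nondegenerate game)
y2 = (t, 1 - t)

# Player 2's condition, symmetrically, pins down y1 = (s, 1 - s):
# s*(a2[0][1] - a2[0][0]) + (1 - s)*(a2[1][1] - a2[1][0]) = 0.
e0 = a2[0][1] - a2[0][0]
e1 = a2[1][1] - a2[1][0]
s = -e1 / (e0 - e1)
y1 = (s, 1 - s)

print(y1, y2)    # y1 = (2/3, 1/3), y2 = (1/3, 2/3)
```

For N = 3 (as in Question 3.1 below) the same conditions become bilinear in the y*'s and generally require a numerical solver.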
Question 3.1. Consider the 3-player game with the following values for (a^1_{n_1,n_2,n_3}, a^2_{n_1,n_2,n_3}, a^3_{n_1,n_2,n_3}):

For n_3 = 1:
            n_2 = 1       n_2 = 2
n_1 = 1   (1, −1, 0)    (0, 1, 0)
n_1 = 2   (2, 0, 0)     (0, 0, 1)

For n_3 = 2:
            n_2 = 1       n_2 = 2
n_1 = 1   (1, 0, 1)     (0, 0, 0)
n_1 = 2   (0, 3, 0)     (−1, 2, 0)

For example, a^2_{212} = 3. Use the above method to find an interior Nash solution.
3.2 N-Person Games in Extensive Form

3.2.1 An introductory example

We will use an example to illustrate some of the issues associated with games in extensive form.
Consider a game with two players described by the following tree diagram:
[Tree diagram: Player 1 moves first at information set η^1_1 with actions L, M, R. Player 2 then moves with actions L, R from information set η^2_1 (after L) or η^2_2 (after M or R). Terminal payoffs, left to right: (0,1), (2,−1), (−3,−2), (0,−3), (−2,−1), (1,0).]
Player 1 goes first and chooses an action among {Left, Middle, Right}. Player 2 then follows by choosing an action among {Left, Right}.

The payoff vectors for each possible combination of actions are shown at each terminating node of the tree. For example, if Player 1 chooses action u^1 = L and Player 2 chooses action u^2 = R then the payoff is (2, −1). So, Player 1 gains 2 while Player 2 loses 1.

Player 2 does not have complete information about the progress of the game. His nodes are partitioned among two information sets {η^2_1, η^2_2}. When Player 2 chooses his action, he only knows which information set he is in, not which node.
Player 1 could analyze the game as follows:

• If Player 1 chooses u^1 = L then Player 2 would respond with u^2 = L, resulting in a payoff of (0, 1).

• If Player 1 chooses u^1 ∈ {M, R} then the players are really playing the following subgame:

[Tree diagram: the subtree of the game above reached when Player 1 chooses M or R, with Player 2 choosing L or R from information set η^2_2.]
which can be expressed in normal form as

         L           R
M    (−3, −2)    (0, −3)
R    (−2, −1)    (1, 0)

in which (R, R) is a Nash equilibrium in pure strategies.
So it seems reasonable for the players to use the following strategies:

• For Player 1
  – If Player 1 is in information set η^1_1, choose R.
• For Player 2
  – If Player 2 is in information set η^2_1, choose L.
  – If Player 2 is in information set η^2_2, choose R.
These strategies can be displayed in our tree diagram as follows:

[Tree diagram as above, with Player 1's branch R and Player 2's branches L (at η^2_1) and R (at η^2_2) marked: a pair of pure strategies for Players 1 and 2.]
For games in strategic form, we denote the set of pure strategies for Player i by M_i = {1, ..., m_i} and let n_i ∈ M_i denote the strategy actually selected by Player i. We will now consider a strategy γ^i as a function whose domain is the set of information sets of Player i and whose range is the collection of possible actions for Player i. For the strategy shown above

γ^1(η^1_1) = R

γ^2(η^2) = { L if η^2 = η^2_1
           { R if η^2 = η^2_2

The players' task is to choose the best strategy from those available. Using the notation from Section 3.1.1, the set M_i = {1, ..., m_i} now represents the indices of the possible strategies, {γ^i_1, ..., γ^i_{m_i}}, for Player i.
Notice that if either player attempts to change his strategy unilaterally, he will notimprove his payoff. The above strategy is, in fact, a Nash equilibrium strategy aswe will formally define in the next section.
There is another Nash equilibrium strategy for this game, namely

[Tree diagram: an alternate strategy pair, with Player 1 choosing L and Player 2 choosing L at both η^2_1 and η^2_2.]

γ^1(η^1_1) = L

γ^2(η^2) = { L if η^2 = η^2_1
           { L if η^2 = η^2_2
This is the strategy pair (γ^1_1, γ^2_1).

[Tree diagram: the strategy pair (γ^1_1, γ^2_1) marked for Players 1 and 2.]

This strategy did not arise from the recursive procedure described in Section 3.2.1. But (γ^1_1, γ^2_1) is, indeed, a Nash equilibrium. Neither player can improve his payoff by a unilateral change in strategy. Oddly, there is no reason for Player 1 to implement this strategy. If Player 1 chooses to go Left, he can only receive 0. But if Player 1 goes Right, Player 2 will go Right, not Left, and Player 1 will receive a payoff of 1. This example shows that games in extensive form can have Nash equilibria that will never be considered for implementation.
3.2.2 Basic ideas

Definition 3.4. An N-player game in extensive form is a directed graph with

1. a specific vertex indicating the starting point of the game.

2. N cost functions, each assigning a real number to each terminating node of the graph. The i-th cost function represents the gain to Player i if that node is reached.

3. a partition of the nodes among the N players.

4. a subpartition of the nodes assigned to Player i into information sets {η^i_k}. The number of branches emanating from each node of a given information set is the same, and no node follows another node in the same information set.
We will use the following notation:

η^i      information sets for Player i.
u^i      actual actions for Player i emanating from information sets.
γ^i(·)   a function whose domain is the set of all information sets {η^i} and whose range is the set of all possible actions {u^i}.

The set of γ^i(·) is the collection of possible (pure) strategies that Player i could use. In the parlance of economic decision theory, the γ^i are decision rules. In game theory, we call them (pure) strategies.
For the game illustrated in Section 3.2.1, we can write down all possible strategy pairs (γ^1, γ^2). The text calls these profiles.

Player 1 has 3 possible pure strategies:

γ^1(η^1_1) = L
γ^1(η^1_1) = M
γ^1(η^1_1) = R

Player 2 has 4 possible pure strategies, which can be listed in tabular form as follows:

          γ^2_1   γ^2_2   γ^2_3   γ^2_4
η^2_1:     L       R       L       R
η^2_2:     L       L       R       R
Each strategy pair (γ^1, γ^2), when implemented, results in payoffs to both players, which we will denote by (J^1(γ^1, γ^2), J^2(γ^1, γ^2)). These payoffs produce a game in strategic (normal) form where the rows and columns correspond to the possible pure strategies of Player 1 and Player 2, respectively.

          γ^2_1       γ^2_2       γ^2_3      γ^2_4
γ^1_1   (0, 1)      (2, −1)     (0, 1)     (2, −1)
γ^1_2   (−3, −2)    (−3, −2)    (0, −3)    (0, −3)
γ^1_3   (−2, −1)    (−2, −1)    (1, 0)     (1, 0)
Using Definition 3.1, we have two Nash equilibria, namely

(γ^1_1, γ^2_1) with J(γ^1_1, γ^2_1) = (0, 1)
(γ^1_3, γ^2_3) with J(γ^1_3, γ^2_3) = (1, 0)
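The two equilibria can be confirmed by brute-force enumeration over the strategic form; a sketch (the payoff entries are transcribed from the game tree above):

```python
# Enumerate pure Nash equilibria of the strategic form built from the
# extensive game: rows are Player 1's strategies, columns Player 2's.

J = [  # (J1, J2) for each (row, col)
    [(0, 1), (2, -1), (0, 1), (2, -1)],      # gamma^1_1 (play L)
    [(-3, -2), (-3, -2), (0, -3), (0, -3)],  # gamma^1_2 (play M)
    [(-2, -1), (-2, -1), (1, 0), (1, 0)],    # gamma^1_3 (play R)
]

def pure_nash(J):
    eq = []
    for r in range(len(J)):
        for c in range(len(J[0])):
            best1 = all(J[r][c][0] >= J[rr][c][0] for rr in range(len(J)))
            best2 = all(J[r][c][1] >= J[r][cc][1] for cc in range(len(J[0])))
            if best1 and best2:
                eq.append((r + 1, c + 1))    # 1-based strategy indices
    return eq

print(pure_nash(J))   # [(1, 1), (3, 3)]
```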
This formulation allows us to
• focus on identifying “good” decision rules even for complicated strategies
• analyze games with different information structures
• analyze multistage games with players taking more than one “turn”
3.2.3 The structure of extensive games
The general definition of games in extensive form can produce a variety of different types of games. This section will discuss some of the approaches to classifying such games. These classification schemes are based on
1. the topology of the directed graph
2. the information structure of the games
3. the sequencing of the players
This section borrows heavily from Basar and Olsder [1]. We will categorize multistage games, that is, games where the players take multiple turns. This classification scheme extends to differential games that are played in continuous time. In this section, however, we will use it to classify multistage games in extensive form.
Define the following terms:

η^i_k   information available to Player i at stage k.
x_k     state of the game at stage k. This completely describes the current status of the game at any point in time.
y^i_k = h^i_k(x_k)   the state measurement equation, where h^i_k(·) is the state measurement function and y^i_k is the observation of Player i at stage k.
u^i_k   decision of Player i at stage k.

The purpose of the function h^i_k is to recognize that the players may not have perfect information regarding the current state of the game. The information available to Player i at stage k is then

η^i_k = {y^1_1, ..., y^1_k; y^2_1, ..., y^2_k; ··· ; y^N_1, ..., y^N_k}
Based on these ideas, games can be classified as

open loop: η^i_k = {x_1} ∀ k ∈ K
closed loop, perfect state: η^i_k = {x_1, ..., x_k} ∀ k ∈ K
closed loop, imperfect state: η^i_k = {y^i_1, ..., y^i_k} ∀ k ∈ K
memoryless, perfect state: η^i_k = {x_1, x_k} ∀ k ∈ K
feedback, perfect state: η^i_k = {x_k} ∀ k ∈ K
feedback, imperfect state: η^i_k = {y^i_k} ∀ k ∈ K
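The six information structures can be sketched as a simple lookup on a state trajectory; the function and names below are illustrative, not from the notes:

```python
# Given a state trajectory x_1..x_k and Player i's observations y^i_1..y^i_k,
# build the information set eta^i_k for each class above.

def info_set(kind, k, x, y):
    """x: states x_1..x_k (1-indexed via position); y: Player i's observations."""
    if kind == "open loop":
        return [x[0]]
    if kind == "closed loop, perfect state":
        return x[:k]
    if kind == "closed loop, imperfect state":
        return y[:k]
    if kind == "memoryless, perfect state":
        return [x[0], x[k - 1]]
    if kind == "feedback, perfect state":
        return [x[k - 1]]
    if kind == "feedback, imperfect state":
        return [y[k - 1]]
    raise ValueError(kind)

x = ["x1", "x2", "x3"]
y = ["y1", "y2", "y3"]
print(info_set("feedback, perfect state", 3, x, y))   # ['x3']
```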
Example 3.1. Princess and the Monster. This game is played in complete darkness. A princess and a monster know their starting positions in a cave. The game ends when they bump into each other. The princess is trying to maximize the time to the final encounter. The monster is trying to minimize the time. (Open loop)

Example 3.2. Lady in the Lake. This game is played in a circular lake. The lady is swimming with maximum speed v_ℓ. A man (who can't swim) runs along the shore of the lake at a maximum speed of v_m. The lady wins if she reaches shore and the man is not there. (Feedback)
3.3 Structure in extensive form games

I am grateful to Pengfei Yi, who provided significant portions of Section 3.3.

The solution of an arbitrary extensive form game may require enumeration. But under some conditions, the structure of some games will permit a recursive solution procedure. Many of these results can be found in Basar and Olsder [1].

Definition 3.5. Player i is said to be a predecessor of Player j if Player i is closer to the initial vertex of the game's tree than Player j.

Definition 3.6. An extensive form game is nested if each player has access to the information of his predecessors.

Definition 3.7. (Basar and Olsder [1]) A nested extensive form game is ladder-nested if the only difference between the information available to any player (say Player i) and his immediate predecessor (Player (i−1)) involves only the actions of Player (i−1), and only at those nodes corresponding to the branches emanating from singleton information sets of Player (i−1).

Note 3.2. Every 2-player nested game is ladder-nested.
The following three figures illustrate the distinguishing characteristics among non-nested, nested, and ladder-nested games.

[Diagram: a game tree with Players 1, 2, and 3 whose information sets are not nested.]

[Diagram: a game tree with Players 1, 2, and 3 whose information sets are nested.]

[Diagram: a game tree with Players 1, 2, and 3 whose information sets are ladder-nested.]
The important feature of ladder-nested games is that the tree can be decomposed into subtrees using the singleton information sets as the starting vertices of the subtrees. Each subtree can then be analyzed as a game in strategic form among those players involved in the subtree.
As an example, consider the following ladder-nested game:

[Tree diagram: Player 1 chooses L or R at η^1_1; Player 2 then moves from information set η^2_1 or η^2_2, and Player 3 from η^3_1 or η^3_2. Terminal payoffs, left to right: (0,−1,−3), (−1,0,−2), (1,−2,0), (0,1,−1), (−1,−1,−1), (0,0,−3), (1,−3,0), (0,−2,−2).]
This game can be decomposed into two bimatrix games involving Player 2 and Player 3. Which of these two games is actually played by Player 2 and Player 3 depends on the action (L or R) of Player 1.

If Player 1 chooses u^1 = L then Player 2 and Player 3 play the game

                     Player 3
                     L           R
Player 2   L     (−1, −3)    (0, −2)
           R     (−2, 0)     (1, −1)

Suppose Player 2 uses a mixed strategy of choosing L with probability 0.5 and R with probability 0.5. Suppose Player 3 also uses a mixed strategy of choosing L with probability 0.5 and R with probability 0.5. Then these mixed strategies are a Nash equilibrium solution for this subgame, with an expected payoff to all three players of (0, −0.5, −1.5).

If Player 1 chooses u^1 = R then Player 2 and Player 3 play the game

                     Player 3
                     L           R
Player 2   L     (−1, −1)    (0, −3)
           R     (−3, 0)     (−2, −2)
This subgame has a Nash equilibrium in pure strategies, with Player 2 and Player 3 both choosing L. The payoff to all three players in this case is (−1, −1, −1).
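A quick numerical check of the two subgame solutions above (a sketch; bimatrix entries are transcribed from the game tree):

```python
# Check the mixed (0.5, 0.5) pair in the L-subgame (each player indifferent)
# and the pure (L, L) pair in the R-subgame, then Player 1's expected payoff.

# (Player 2 payoff, Player 3 payoff); rows = Player 2, cols = Player 3.
L_sub = [[(-1, -3), (0, -2)], [(-2, 0), (1, -1)]]
R_sub = [[(-1, -1), (0, -3)], [(-3, 0), (-2, -2)]]

# Mixed check: with Player 3 at (0.5, 0.5), Player 2's two rows give equal
# payoff, and symmetrically for Player 3, so neither can gain by deviating.
p2_row = [0.5 * L_sub[r][0][0] + 0.5 * L_sub[r][1][0] for r in range(2)]
p3_col = [0.5 * L_sub[0][c][1] + 0.5 * L_sub[1][c][1] for c in range(2)]
print(p2_row, p3_col)       # [-0.5, -0.5] [-1.5, -1.5]

# Pure check for (L, L) in the R-subgame.
assert R_sub[0][0][0] >= R_sub[1][0][0]   # Player 2 cannot gain by R
assert R_sub[0][0][1] >= R_sub[0][1][1]   # Player 3 cannot gain by R

# Player 1's expected payoff in the L-subgame under the 0.5/0.5 mixes.
p1_L = [0, -1, 1, 0]                       # Player 1's payoffs at the L-leaves
print(sum(p1_L) / 4)                       # 0.0
```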
To summarize the solution for all three players, we will introduce the concept of a behavioral strategy:

Definition 3.8. A behavioral strategy (or locally randomized strategy) assigns to each information set a probability vector over the alternatives emanating from that information set.

When using a behavioral strategy, a player simply randomizes over the alternatives from each information set. When using a mixed strategy, a player randomizes his selection from the possible pure strategies for the entire game.

The following behavioral strategy produces a Nash equilibrium for all three players:

γ^1(η^1_1) = L

γ^2(η^2_1) = { L with probability 0.5
             { R with probability 0.5

γ^2(η^2_2) = { L with probability 1
             { R with probability 0

γ^3(η^3_1) = { L with probability 0.5
             { R with probability 0.5

γ^3(η^3_2) = { L with probability 1
             { R with probability 0

with an expected payoff of (0, −0.5, −1.5).
Note 3.3. When using a behavioral strategy, a player, at each information set,must specify a probability distribution over the alternatives for that information set.It is assumed that the choices of alternatives at different information sets are madeindependently. Thus it might be reasonable to call such strategies “uncorrelated”strategies.
Note 3.4. For an arbitrary game, not all mixed strategies can be represented byusing behavioral strategies. Behavioral strategies are easy to find and represent. Wewould like to know when we can use behavioral strategies instead of enumeratingall pure strategies and randomizing among those pure strategies.
Theorem 3.3. Every single-stage, ladder-nested N-person game has at least one Nash equilibrium using behavioral strategies.
3.3.1 An example by Kuhn

One can show that every behavioral strategy can be represented as a mixed strategy. But an important question arises when considering mixed strategies vis-à-vis behavioral strategies: Can a mixed strategy always be represented by a behavioral strategy?

The following example from Kuhn [2] shows a remarkable result involving behavioral strategies. It shows what can happen if the players do not have a property called perfect recall. Moreover, the property of perfect recall alone is a necessary and sufficient condition to obtain a one-to-one mapping between behavioral and mixed strategies for any game.

In a game with perfect recall, each player remembers everything he knew at previous moves and all of his choices at these moves.

A zero-sum game involves two players and a deck of cards. A card is dealt to each player. If the cards are not different, two more cards are dealt, until one player has a higher card than the other.

The holder of the high card receives $1 from his opponent. The player with the high card can then choose to either stop the game or continue.

If the game continues, Player 1 (who forgets whether he has the high or low card) can choose to leave the cards as they are or trade with his opponent. Another $1 is then won by the (possibly different) holder of the high card.
The game can be represented with the following diagram:

[Tree diagram: Chance deals the high card to Player 1 or Player 2, each with probability 1/2. The high-card holder chooses S or C (Player 1 at η^1_1, Player 2 at η^2_1). If the game continues, Player 1 chooses T or K at information set η^1_2, which contains nodes from both chance branches. Terminal payoffs: (1,−1) and (−1,1) after a stop; (0,0), (2,−2), (−2,2), (0,0) after a continue.]

where

S   Stop the game
C   Continue the game
T   Trade cards
K   Keep cards
At information set η^1_1, Player 1 makes the critical decision that causes him to eventually lose perfect recall at η^1_2. Moreover, it is Player 1's own action that causes this loss of information (as opposed to Player 2 causing the loss). This is the reason why behavioral strategies fail for Player 1 in this problem.
Define the following pure strategies for Player 1:

γ^1_1(η^1) = { S if η^1 = η^1_1
             { T if η^1 = η^1_2

γ^1_2(η^1) = { S if η^1 = η^1_1
             { K if η^1 = η^1_2

γ^1_3(η^1) = { C if η^1 = η^1_1
             { T if η^1 = η^1_2

γ^1_4(η^1) = { C if η^1 = η^1_1
             { K if η^1 = η^1_2

and for Player 2:

γ^2_1(η^2_1) = S      γ^2_2(η^2_1) = C
This results in the following strategic (normal) form game:

           γ^2_1            γ^2_2
γ^1_1   (1/2, −1/2)      (0, 0)
γ^1_2   (−1/2, 1/2)      (0, 0)
γ^1_3   (0, 0)           (−1/2, 1/2)
γ^1_4   (0, 0)           (1/2, −1/2)
Question 3.2. Show that the mixed strategy for Player 1

(1/2, 0, 0, 1/2)

and the mixed strategy for Player 2

(1/2, 1/2)

result in a Nash equilibrium with expected payoff (1/4, −1/4).
Question 3.3. Suppose that Player 1 uses a behavioral strategy (x, y) defined as follows: Let x ∈ [0, 1] be the probability Player 1 chooses S when he is in information set η^1_1, and let y ∈ [0, 1] be the probability Player 1 chooses T when he is in information set η^1_2.

Also suppose that Player 2 uses a behavioral strategy (z), where z ∈ [0, 1] is the probability Player 2 chooses S when he is in information set η^2_1.

Let E^i((x, y), z) denote the expected payoff to Player i = 1, 2 when using behavioral strategies (x, y) and (z). Show that

E^1((x, y), z) = (x − z)(y − 1/2)

and E^1((x, y), z) = −E^2((x, y), z) for any x, y, and z.

Furthermore, consider

max_{x,y} min_z (x − z)(y − 1/2)

and show that every equilibrium solution in behavioral strategies must have y = 1/2, where E^1((x, 1/2), z) = −E^2((x, 1/2), z) = 0.
Therefore, using only behavioral strategies, the expected payoff will be (0, 0). If Player 1 is restricted to using only behavioral strategies, he can guarantee, at most, an expected gain of 0. But if he randomizes over all of his pure strategies and stays with that strategy throughout the game, Player 1 can get an expected payoff of 1/4.
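The identity in Question 3.3 can be checked numerically. The sketch below computes E^1 directly from the tree; the leaf-payoff assignment is as read from the diagram above, which is an assumption of this sketch:

```python
# Compute E1((x, y), z) for Kuhn's card game directly from the tree and
# compare with the closed form (x - z)(y - 1/2).
import random

def E1(x, y, z):
    # Player 1 holds the high card (prob 1/2):
    #   stop (prob x) -> +1 ; continue -> trade (prob y) gives 0, keep gives +2.
    high = x * 1 + (1 - x) * (y * 0 + (1 - y) * 2)
    # Player 2 holds the high card (prob 1/2):
    #   P2 stops (prob z) -> -1 ; P2 continues -> trade gives 0, keep gives -2.
    low = z * (-1) + (1 - z) * (y * 0 + (1 - y) * (-2))
    return 0.5 * (high + low)

random.seed(0)
for _ in range(1000):
    x, y, z = (random.random() for _ in range(3))
    assert abs(E1(x, y, z) - (x - z) * (y - 0.5)) < 1e-12
print("E1 == (x - z)(y - 1/2) on 1000 random points")
```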
Any behavioral strategy can be expressed as a mixed strategy. But, without perfectrecall, not all mixed strategies can be implemented using behavioral strategies.
Theorem 3.4. (Kuhn [2]) Perfect recall is a necessary and sufficient condition for all mixed strategies to be induced by behavioral strategies.
A formal proof of this theorem is in [2]. Here is a brief sketch: We would like to know under what circumstances there is a one-to-one correspondence between behavioral and mixed strategies. Suppose a mixed strategy consists of the following mixture of three pure strategies:

choose γ_a with probability 1/2
choose γ_b with probability 1/3
choose γ_c with probability 1/6

Suppose that strategies γ_b and γ_c lead the game to information set η, and suppose that strategy γ_a does not go to η. If a player is told he is in information set η, he can use perfect recall to backtrack completely through the game to learn whether strategy γ_b or γ_c was used. Suppose γ_b(η) = u_b and γ_c(η) = u_c. Then if the player is in η, he can implement the mixed strategy with the following behavioral strategy:

choose u_b with probability 2/3
choose u_c with probability 1/3
A game may not have perfect recall, but some strategies could take the game along paths that, as subtrees, have the property of perfect recall. Kuhn [2] and Thompson [4] employ the concept of signaling information sets. In essence, a signaling information set is the point in the game where a decision by a player could cause him to lose the property of perfect recall.
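The renormalization used in the sketch above is a one-line computation:

```python
# With perfect recall, a mixed strategy induces a behavioral strategy by
# conditioning: at an information set eta reached only by gamma_b and
# gamma_c, the local action probabilities are the renormalized mixture weights.
from fractions import Fraction as F

mix = {"a": F(1, 2), "b": F(1, 3), "c": F(1, 6)}   # mixture over pure strategies
reaching = {"b": "ub", "c": "uc"}                  # strategies that reach eta

total = sum(mix[g] for g in reaching)
behavioral = {u: mix[g] / total for g, u in reaching.items()}
print(behavioral)          # {'ub': Fraction(2, 3), 'uc': Fraction(1, 3)}
```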
In the following three games, the signaling information sets are marked with (*):

[Three tree diagrams (involving Chance, Player 1, and Player 2) with the signaling information sets marked (*).]
3.4 Stackelberg solutions

3.4.1 Basic ideas

This early idea in game theory is due to Stackelberg [3]. Its features include:

• hierarchical ordering of the players
• strategy decisions are made and announced sequentially
• one player has the ability to enforce his strategy on others

This approach introduces the notion of a rational reaction of one player to another's choice of strategy.
Example 3.3. Consider the bimatrix game

           γ^2_1         γ^2_2          γ^2_3
γ^1_1   (0, 1)        (−2, −1)      (−3/2, −2/3)
γ^1_2   (−1, −2)      (−1, 0)       (−3, −1)
γ^1_3   (1, 0)        (−2, −1)      (−2, 1/2)

Note that (γ^1_2, γ^2_2) is a Nash solution with value (−1, 0).
Suppose that Player 1 must "lead" by announcing his strategy first. Is this an advantage or a disadvantage? Note that

If Player 1 chooses γ^1_1, Player 2 will respond with γ^2_1.
If Player 1 chooses γ^1_2, Player 2 will respond with γ^2_2.
If Player 1 chooses γ^1_3, Player 2 will respond with γ^2_3.

The best choice for Player 1 is γ^1_1, which will yield a value of (0, 1). For this game, the Stackelberg solution is an improvement over the Nash solution for both players.
If we let

γ^1_1 = L,  γ^1_2 = M,  γ^1_3 = R
γ^2_1 = L,  γ^2_2 = M,  γ^2_3 = R

we can implement the Stackelberg strategy by playing the following game in extensive form:

[Tree diagram (Stackelberg): Player 1 moves first (L, M, R); Player 2 observes Player 1's action (three singleton information sets) and then chooses L, M, or R. Payoffs: after L: (0,1), (−2,−1), (−3/2,−2/3); after M: (−1,−2), (−1,0), (−3,−1); after R: (1,0), (−2,−1), (−2,1/2).]
The Nash solution can be obtained by playing the following game:

[Tree diagram (Nash): the same tree, but with Player 2's three nodes joined into a single information set, so Player 2 cannot observe Player 1's action.]
There may not be a unique response to the leader's strategy. Consider the following example:

           γ^2_1       γ^2_2       γ^2_3
γ^1_1   (0, 0)      (−1, 0)     (−3, −1)
γ^1_2   (−2, 1)     (−2, 0)     (1, 1)

In this case,

If Player 1 chooses γ^1_1, Player 2 will respond with γ^2_1 or γ^2_2.
If Player 1 chooses γ^1_2, Player 2 will respond with γ^2_1 or γ^2_3.
One solution approach uses a minimax philosophy. That is, Player 1 should secure his profits against the alternative rational reactions of Player 2. If Player 1 chooses γ^1_1, the least he will obtain is −1, and if he chooses γ^1_2, the least he will obtain is −2. So his (minimax) Stackelberg strategy is γ^1_1.
Question 3.4. In this situation, one might consider mixed Stackelberg strategies.How could such strategies be defined, when would they be useful, and how wouldthey be implemented?
Note 3.5. When the follower's response is not unique, a natural solution approach would be to use side-payments. In other words, Player 1 could provide an incentive to Player 2 to choose an action in Player 1's best interest. Let ε > 0 be a small side-payment. Then the players would be playing the Stackelberg game

           γ^2_1         γ^2_2       γ^2_3
γ^1_1   (−ε, ε)       (−1, 0)     (−3, −1)
γ^1_2   (−2, 1)       (−2, 0)     (1−ε, 1+ε)
3.4.2 The formalities

Let Γ^1 and Γ^2 denote the sets of pure strategies for Player 1 and Player 2, respectively. Let J^i(γ^1, γ^2) denote the payoff to Player i if Player 1 chooses strategy γ^1 ∈ Γ^1 and Player 2 chooses strategy γ^2 ∈ Γ^2. Let

R^2(γ^1) ≡ {ξ ∈ Γ^2 | J^2(γ^1, ξ) ≥ J^2(γ^1, γ^2) ∀ γ^2 ∈ Γ^2}

Note that R^2(γ^1) ⊆ Γ^2, and we call R^2(γ^1) the rational reaction of Player 2 to Player 1's choice of γ^1. A Stackelberg strategy can be formally defined as the γ̂^1 that solves

min_{γ^2 ∈ R^2(γ̂^1)} J^1(γ̂^1, γ^2) = max_{γ^1 ∈ Γ^1} min_{γ^2 ∈ R^2(γ^1)} J^1(γ^1, γ^2) = J^1*

Note 3.6. If R^2(γ^1) is a singleton for all γ^1 ∈ Γ^1 then there exists a mapping

ψ^2 : Γ^1 → Γ^2

such that R^2(γ^1) = {γ^2} implies γ^2 = ψ^2(γ^1). In this case, the definition of a Stackelberg solution can be simplified to the γ̂^1 that solves

J^1(γ̂^1, ψ^2(γ̂^1)) = max_{γ^1 ∈ Γ^1} J^1(γ^1, ψ^2(γ^1))
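The definition above can be implemented by direct enumeration. The sketch below computes R^2(γ^1) and the (minimax) Stackelberg strategy, applied to the two examples of Section 3.4.1:

```python
# Stackelberg solution by enumeration: R2(g1) is the set of follower best
# responses; the leader maximizes his *worst* payoff over R2(g1).

def stackelberg(J1, J2):
    m, n = len(J1), len(J1[0])
    best_val, best_g1 = None, None
    for g1 in range(m):
        R2 = [g2 for g2 in range(n) if J2[g1][g2] == max(J2[g1])]
        secured = min(J1[g1][g2] for g2 in R2)      # minimax over ties
        if best_val is None or secured > best_val:
            best_val, best_g1 = secured, g1
    return best_g1 + 1, best_val                    # 1-based index, value J1*

# Example 3.3 (unique follower responses):
J1 = [[0, -2, -1.5], [-1, -1, -3], [1, -2, -2]]
J2 = [[1, -1, -2/3], [-2, 0, -1], [0, -1, 0.5]]
print(stackelberg(J1, J2))     # (1, 0): leader plays gamma^1_1, value 0

# The second example (ties in the follower's reaction):
K1 = [[0, -1, -3], [-2, -2, 1]]
K2 = [[0, 0, -1], [1, 0, 1]]
print(stackelberg(K1, K2))     # (1, -1): minimax Stackelberg strategy gamma^1_1
```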
It is easy to prove the following:
Theorem 3.5. Every two-person finite game has a Stackelberg solution for the leader.
Note 3.7. From the follower’s point of view, his choice of strategy in a Stackelberggame is always optimal (i.e., the best he can do).
Question 3.5. Let J^1* (defined as above) denote the Stackelberg value for the leader, Player 1, and let J^1_N denote any Nash equilibrium solution value for the same player. What is the relationship (bigger, smaller, etc.) between J^1* and J^1_N? What additional conditions (if any) do you need to place on the game to guarantee that relationship?
3.5 BIBLIOGRAPHY

[1] T. Basar and G. Olsder, Dynamic Noncooperative Game Theory, Academic Press (1982).

[2] H. W. Kuhn, "Extensive games and the problem of information," Annals of Mathematics Studies, 28, 1953.

[3] H. von Stackelberg, Marktform und Gleichgewicht, Springer, Vienna, 1934.

[4] G. L. Thompson, "Signaling strategies in n-person games," Annals of Mathematics Studies, 28, 1953.
IE675 Game Theory
Lecture Note Set 4
Wayne F. Bialas¹
Friday, March 30, 2001
4 UTILITY THEORY
4.1 Introduction
This section is really independent of the field of game theory, and it introduces concepts that pervade a variety of academic fields. It addresses the issue of quantifying the seemingly nonquantifiable. These include attributes such as quality of life and aesthetics. Much of this discussion has been borrowed from Keeney and Raiffa [1]. Other important references include Luce and Raiffa [2], Savage [4], and von Neumann and Morgenstern [5].

The basic problem of assessing value can be posed as follows: A decision maker must choose among several alternatives, say W_1, W_2, ..., W_n, where each will result in a consequence discernible in terms of a single attribute, say X. The decision maker does not know with certainty which consequence will result from each of the variety of alternatives. We would like to be able to quantify (in some way) our preferences for each alternative.
The literature on utility theory is extensive, both theoretical and experimental. Ithas been the subject of significant criticism and refinement. We will only presentthe fundamental ideas here.
4.2 The basic theory
Definition 4.1. Given any two outcomes A and B, we write A ≻ B if A is preferable to B. We will write A ∼ B if A ⊁ B and B ⊁ A.
¹Department of Industrial Engineering, University at Buffalo, 342 Bell Hall, Box 602050, Buffalo, NY 14260-2050 USA; Email: [email protected]; Web: http://www.acsu.buffalo.edu/~bialas. Copyright © MMI Wayne F. Bialas. All Rights Reserved. Duplication of this work is prohibited without written permission. This document produced March 30, 2001 at 2:31 pm.
4.2.1 Axioms

The relations ≻ and ∼ must satisfy the following axioms:

1. Given any two outcomes A and B, exactly one of the following must hold:
   (a) A ≻ B
   (b) B ≻ A
   (c) A ∼ B
2. A ∼ A for all A
3. A ∼ B implies B ∼ A
4. A ∼ B and B ∼ C implies A ∼ C
5. A ≻ B and B ≻ C implies A ≻ C
6. A ≻ B and B ∼ C implies A ≻ C
7. A ∼ B and B ≻ C implies A ≻ C
4.2.2 What results from the axioms
The axioms provide that ∼ is an equivalence relation and ≻ produces a weak partial ordering of the outcomes.
Now assume that A1 ≺ A2 ≺ · · · ≺ An.
Suppose that the decision maker is indifferent between the following two possibilities:

Certainty option: Receive Ai with probability 1

Risky option: Receive An with probability πi, or receive A1 with probability (1 − πi)

If the decision maker is consistent, then πn = 1 and π1 = 0, and furthermore

π1 < π2 < · · · < πn

Hence, the π's provide a numerical ranking for the A's.
42
Suppose that the decision maker is asked to express his preference for probability distributions over the Ai. That is, consider mixtures, p′ and p′′, of the Ai where

p′i ≥ 0 and ∑i p′i = 1
p′′i ≥ 0 and ∑i p′′i = 1
Using the π's, we can consider the question of which is better, p′ or p′′, by computing the following "scores":

π′ = ∑i p′i πi
π′′ = ∑i p′′i πi

We claim that the choice of p′ versus p′′ should be based on the relative magnitudes of π′ and π′′.
Note 4.1. Suppose we have two outcomes A and B with the probability of getting each equal to p and (1 − p), respectively. Denote the lottery between A and B by

A p ⊕ B (1 − p)

Note that this is not an expected value, since A and B are not real numbers.
Suppose we choose p′. This implies that we obtain Ai with probability p′i, and this is indifferent to obtaining

An πi ⊕ A1 (1 − πi)

with probability p′i. Now, sum over all i and consider the quantities

An ∑i πi p′i ⊕ A1 ∑i (1 − πi) p′i
∼ An ∑i πi p′i ⊕ A1 (1 − ∑i πi p′i)
∼ An π′ ⊕ A1 (1 − π′)

So if π′ > π′′ then

An π′ ⊕ A1 (1 − π′) ≻ An π′′ ⊕ A1 (1 − π′′)
This leads directly to the following. . .
Theorem 4.1. If A ≻ C ≻ B and

p A ⊕ (1 − p) B ∼ C

then 0 < p < 1 and p is unique.
Proof: See Owen [3].
Theorem 4.2. There exists a real-valued function u(·) such that

1. (Monotonicity) u(A) > u(B) if and only if A ≻ B.

2. (Consistency) u(p A ⊕ (1 − p) B) = p u(A) + (1 − p) u(B)

3. The function u(·) is unique up to a positive linear transformation. In other words, if u and v are utility functions for the same outcomes then v(A) = α u(A) + β for some α > 0 and β.
Proof: See Owen [3].
Consider a lottery (L) which yields outcomes {Ai} with probabilities {pi} for i = 1, . . . , n. Then let

Ã = A1 p1 ⊕ A2 p2 ⊕ · · · ⊕ An pn

Because of the properties of utility functions, we have

E[u(Ã)] = ∑i pi u(Ai)

Consider

u^{-1}(E[u(Ã)])

This is an outcome that represents lottery (L).
Suppose we have two utility functions u1 and u2 with the property that

u1^{-1}(E[u1(Ã)]) ∼ u2^{-1}(E[u2(Ã)]) ∀ Ã

Then u1 and u2 will imply the same preference rankings for any outcomes. If this is true, we write u1 ∼ u2. Note that some texts (such as [1]) say that u1 and u2 are strategically equivalent. We won't use that definition here, because this term has been used for another property of strategic games.
4.3 Certainty equivalents
Definition 4.2. A certainty equivalent of lottery (L) is an outcome Â such that the decision maker is indifferent between (L) and the certain outcome Â.

In other words, if Â is a certainty equivalent of (L), then

u(Â) = E[u(Ã)]
Â ∼ u^{-1}(E[u(Ã)])
You will also see the terms cash equivalent and lottery selling price in the literature.
Example 4.1. Suppose outcomes are measured in terms of real numbers, say A = x. For any a and b > 0,

u(x) = a + bx ∼ x

Suppose the decision maker has a lottery x̃ described by the probability density f(x). Then

E[x̃] = ∫ x f(x) dx

Note that

u(x̂) = E[u(x̃)] = E[a + b x̃] = a + b E[x̃]

Taking u^{-1} of both sides shows that x̂ = E[x̃].

Hence, if the utility function is linear, the certainty equivalent for any lottery is the expected consequence of that lottery.
Question 4.1. Suppose u(x) = a − b e^{-cx} ∼ −e^{-cx} where b > 0. Suppose the decision maker is considering a 50-50 lottery yielding either x1 or x2. So

E[x̃] = (x1 + x2)/2

Find the solution to u(x̂) = E[u(x̃)] to obtain the certainty equivalent for this lottery. In other words, solve

−e^{-c x̂} = −(e^{-c x1} + e^{-c x2})/2
Question 4.2. This is a continuation of Question 4.1. If u(x) = −e^{-cx} and x̂ is the certainty equivalent for the lottery x̃, show that x̂ + x0 is the certainty equivalent for the lottery x̃ + x0.
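One way to check the answers to Questions 4.1 and 4.2 numerically is to apply u and u^{-1} directly. The function name and the sample numbers below are mine, chosen only for illustration:

```python
import math

def certainty_equivalent(c, x1, x2):
    """CE of a 50-50 lottery between x1 and x2 under u(x) = -exp(-c*x)."""
    expected_utility = -(math.exp(-c * x1) + math.exp(-c * x2)) / 2.0
    # Invert u: u(x) = -exp(-c*x)  =>  x = -ln(-u)/c
    return -math.log(-expected_utility) / c

c = 1.0
ce = certainty_equivalent(c, 0.0, 10.0)
# The CE is well below the expected value (x1 + x2)/2 = 5: risk aversion.
assert ce < 5.0

# Question 4.2: shifting the lottery by x0 shifts the CE by x0.
x0 = 3.0
shifted = certainty_equivalent(c, 0.0 + x0, 10.0 + x0)
assert abs(shifted - (ce + x0)) < 1e-9
```

The closed form being evaluated is x̂ = −(1/c) ln[(e^{-c x1} + e^{-c x2})/2], the solution of the equation in Question 4.1.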
4.4 BIBLIOGRAPHY
[1] R.L. Keeney and H. Raiffa, Decisions with multiple objectives, Wiley (1976).

[2] R.D. Luce and H. Raiffa, Games and decisions, Wiley (1957).

[3] G. Owen, Game theory, Academic Press (1982).

[4] L.J. Savage, The foundations of statistics, Dover (1972).

[5] J. von Neumann and O. Morgenstern, Theory of games and economic behavior, Princeton Univ. Press (1947).
IE675 Game Theory
Lecture Note Set 5, Wayne F. Bialas
Friday, March 30, 2001
5 STATIC COOPERATIVE GAMES
5.1 Some introductory examples
Consider a game with three players 1, 2 and 3. Let N = {1, 2, 3}. Suppose that the players can freely form coalitions. In this case, the possible coalition structures would be

{{1}, {2}, {3}}   {{1, 2, 3}}
{{1, 2}, {3}}   {{1, 3}, {2}}   {{2, 3}, {1}}
Once the players form their coalition(s), they inform a referee who pays each coalition an amount depending on its membership. To do this, the referee uses the function v : 2^N → R. Coalition S receives v(S). This is a game in characteristic function form and v is called the characteristic function.

For simple games, we often specify the characteristic function without using brackets and commas. For example,

v(12) ≡ v({1, 2}) = 100
The function v may actually be based on another game or an underlying decision-making problem.
Much of the material for this section has been cultivated from the lecture notes of Louis J. Billera and William F. Lucas. The errors and omissions are mine.
An important issue is the division of the game's proceeds among the players. We call the vector (x1, x2, . . . , xn) of these payoffs an imputation. In many situations, the outcome of the game can be expressed solely in terms of the resulting imputation.
Example 5.1. Here is a three-person, constant sum game:

v(123) = 100
v(12) = v(13) = v(23) = 100
v(1) = v(2) = v(3) = 0

How much will be given to each player? Consider solutions such as

(x1, x2, x3) = (100/3, 100/3, 100/3)
(x1, x2, x3) = (50, 50, 0)
Example 5.2. This game is similar to Example 5.1.

v(123) = 100
v(12) = v(13) = 100
v(23) = v(1) = v(2) = v(3) = 0

Player 1 has veto power, but if Player 2 and Player 3 form a coalition, they can force Player 1 to get nothing from the game. Consider this imputation as a solution:

(x1, x2, x3) = (200/3, 50/3, 50/3)
5.2 Cooperative games with transferable utility
Cooperative TU (transferable utility) games have the following ingredients:

1. a characteristic function v(S) that gives a value to each subset S ⊂ N of players

2. payoff vectors called imputations, of the form (x1, x2, . . . , xn), each representing a realizable distribution of wealth

3. a preference relation over the set of imputations

4. solution concepts

Global: stable sets
Solutions outside of the stable set can be blocked by some coalition, and nothing in the stable set can be blocked by another member of the stable set.

Local: bargaining sets
Any objection to an element of a bargaining set has a counterobjection.

Single point: Shapley value
Definition 5.1. A TU game in characteristic function form is a pair (N, v) where N = {1, . . . , n} is the set of players and v : 2^N → R is the characteristic function.
Note 5.1. We often assume either that the game is
superadditive: v(S ∪ T) ≥ v(S) + v(T) for all S, T ⊆ N such that S ∩ T = Ø
or that the game is
cohesive: v(N) ≥ v(S) for all S ⊆ N
We define the set of imputations as

A(v) = {x | ∑i xi = v(N) and xi ≥ v({i}) ∀ i ∈ N} ⊂ R^n
If S ⊆ N, S ≠ Ø and x, y ∈ A(v), then we say that x dominates y via S (written x domS y) if and only if

1. xi > yi for all i ∈ S
2. ∑i∈S xi ≤ v(S)

If x domS y for some S ⊆ N, then we say that x dominates y and write x dom y.

For x ∈ A(v), we define the dominion of x via S as

DomS x ≡ {y ∈ A(v) | x domS y}
For any B ⊆ A(v) we define

DomS B ≡ ∪y∈B DomS y

and

Dom B ≡ ∪T⊆N DomT B

We say that K ⊂ A(v) is a stable set if

1. K ∩ Dom K = Ø
2. K ∪ Dom K = A(v)

In other words, K = A(v) − Dom K.

The core is defined as

C ≡ {x ∈ A(v) | ∑i∈S xi ≥ v(S) ∀ S ⊂ N}
Note 5.2. If the game is cohesive, the core is the set of undominated imputations.
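Both definitions can be checked mechanically for a given payoff vector. The sketch below is my own (helper names mine; the characteristic function is stored as a dict keyed by frozensets), using the three-person constant sum game and the veto game that appear in these notes:

```python
from itertools import combinations

def coalitions(n):
    """All nonempty coalitions of players 1..n."""
    players = range(1, n + 1)
    for size in range(1, n + 1):
        for S in combinations(players, size):
            yield frozenset(S)

def is_imputation(x, v, n):
    grand = frozenset(range(1, n + 1))
    return (abs(sum(x) - v[grand]) < 1e-9 and
            all(x[i - 1] >= v[frozenset({i})] - 1e-9 for i in range(1, n + 1)))

def in_core(x, v, n):
    """x is in the core iff it is an imputation and no coalition can block it."""
    return (is_imputation(x, v, n) and
            all(sum(x[i - 1] for i in S) >= v[S] - 1e-9 for S in coalitions(n)))

# Three-person constant sum game: v(123) = v(12) = v(13) = v(23) = 1, v(i) = 0.
v = {frozenset(s): 1.0 for s in [(1, 2), (1, 3), (2, 3), (1, 2, 3)]}
v.update({frozenset({i}): 0.0 for i in (1, 2, 3)})

# (1/3, 1/3, 1/3) is an imputation, but every two-player coalition blocks it.
assert is_imputation((1/3, 1/3, 1/3), v, 3)
assert not in_core((1/3, 1/3, 1/3), v, 3)

# Veto game: v(123) = v(12) = v(13) = 1, v(23) = v(i) = 0; (1, 0, 0) is in the core.
w = {frozenset(s): 1.0 for s in [(1, 2), (1, 3), (1, 2, 3)]}
w.update({frozenset(s): 0.0 for s in [(2, 3), (1,), (2,), (3,)]})
assert in_core((1.0, 0.0, 0.0), w, 3)
```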
Theorem 5.1. The core of a cooperative TU game (N, v) has the following properties:

1. The core C is an intersection of half spaces.
2. If stable sets Kα exist, then C ⊂ ∩α Kα.
3. (∩α Kα) ∩ Dom C = Ø
Note 5.3. For some games (e.g., constant sum games) the core is empty.
As an example, consider the following constant sum game with n = 3:

v(123) = 1
v(12) = v(13) = v(23) = 1
v(1) = v(2) = v(3) = 0

The set of imputations is

A(v) = {x = (x1, x2, x3) | x1 + x2 + x3 = 1 and xi ≥ 0 for i = 1, 2, 3}
This set can be illustrated as a subset in R^3 as follows:

[Figure: the set of imputations, drawn as the triangle in R^3 with vertices (1, 0, 0), (0, 1, 0) and (0, 0, 1).]
or alternatively, using barycentric coordinates:

[Figure: the set of imputations in barycentric coordinates. The edges of the triangle are the lines x1 = 0, x2 = 0 and x3 = 0; the opposite vertices are x1 = 1, x2 = 1 and x3 = 1. A point x = (x1, x2, x3) is plotted inside the triangle.]
For an interior point x we get

Dom{1,2} x = A(v) ∩ {y | y1 < x1 and y2 < x2}

[Figure: the set Dom{1,2} x shown in both the R^3 picture and the barycentric picture, as the region of imputations dominated by x via {1, 2}.]
And for all two-player coalitions we obtain

[Figure: the barycentric picture showing the three regions Dom{1,2} x, Dom{1,3} x and Dom{2,3} x around the point x.]
Question 5.1. Prove that

DomN A(v) = Ø    (1)
Dom{i} A(v) = Ø ∀ i    (2)
C = Ø    (3)

Note that (1) and (2) are general statements, while (3) is true for this particular game.
Now consider the set

K = {(1/2, 1/2, 0), (1/2, 0, 1/2), (0, 1/2, 1/2)}
and note that the sets Dom{1,2} (1/2, 1/2, 0), Dom{1,3} (1/2, 0, 1/2), and Dom{2,3} (0, 1/2, 1/2) can be illustrated as follows:

[Figures: three barycentric pictures, one for each of the sets Dom{1,2} (1/2, 1/2, 0), Dom{1,3} (1/2, 0, 1/2), and Dom{2,3} (0, 1/2, 1/2).]
We will let you verify that
1. K ∩ Dom K = Ø
2. K ∪ Dom K = A(v)

so that K is a stable set.
Question 5.2. There are more stable sets (an uncountable collection). Find them, and show that, for this example,

∩α Kα = Ø
∪α Kα = A(v)
Now, let’s look at the veto game:
v(123) = 1
v(12) = v(13) = 1
v(23) = v(1) = v(2) = v(3) = 0
This game has a core at (1, 0, 0) as shown in the following diagram:

[Figure: the barycentric picture for the veto game. The core is the single point (1, 0, 0); the regions Dom{1,2} x and Dom{1,3} x for a point x are also shown.]
Question 5.3. Verify that any continuous curve from C to the surface x2 + x3 = 1 with a Lipschitz condition of 30° or less is a stable set.
[Figure: a stable set, drawn as a curve from the core (1, 0, 0) to the opposite edge of the triangle.]
Note that

∩α Kα = C
∪α Kα = A(v)
5.3 Nomenclature
Much of this section is from Willick [13].
5.3.1 Coalition structures
A coalition structure is any partition of the player set into coalitions. Let N = {1, 2, . . . , n} denote the set of n players.
Definition 5.2. A coalition structure, P, is a partition of N into nonempty sets such that P = {R1, R2, . . . , RM} where Ri ∩ Rj = Ø for all i ≠ j and ∪i=1..M Ri = N.
5.3.2 Partition function form
Let P0 ≡ {{1}, {2}, . . . , {n}} denote the singleton coalition structure. The coalition containing all players, N, is called the grand coalition. The coalition structure PN ≡ {N} is called the grand coalition structure.

In partition function form games, the value of a coalition S can depend on the coalition arrangement of the players in N − S (see Lucas and Thrall [11]).

Definition 5.3. The game (N, v) is an n-person game in partition function form if v(S, P) is a real-valued function which assigns a number to each coalition S ∈ P for every coalition structure P.
5.3.3 Superadditivity
A game is superadditive if v(S ∪ T) ≥ v(S) + v(T) for all S, T ⊆ N such that S ∩ T = Ø.
Most nonsuperadditive games can be mapped into superadditive games. The following reason is often given: Suppose there exist disjoint coalitions S and T such that

v(S ∪ T) < v(S) + v(T)

Then S and T could secretly form the coalition S ∪ T and collect the value v(S) + v(T). The coalition S ∪ T would then divide the amount among its total membership.
Definition 5.4. The game v is said to be the superadditive cover of the game u if for all P ⊆ N,

v(P) = maxP*_P ∑R∈P*_P u(R)

where P*_P ranges over the partitions of P.

Note 5.4. P*_P is a coalition structure restricted to the members of P.
Note 5.5. A problem with using a superadditive cover is that it requires the ingredient of secrecy. Yet all of the players are assumed to have perfect information. It also requires a dynamic implementation process. The players need to first decide on their secret alliance, then collect the payoffs as S and T individually, and finally divide the proceeds as S ∪ T. But characteristic function form games are assumed to be static.
Example 5.3. Consider this three-person game:

u(123) = 1
u(12) = u(13) = u(23) = 1
u(2) = u(3) = 0
u(1) = 5

Note that (N, u) is not superadditive. The superadditive cover of (N, u) is

v(123) = 6
v(12) = 5
v(13) = 5
v(23) = 1
v(2) = v(3) = 0
v(1) = 5
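The superadditive cover of Definition 5.4 can be computed by brute force over set partitions. This sketch is mine (helper names mine), and it reproduces the numbers in Example 5.3:

```python
from itertools import combinations

def partitions(items):
    """Yield all partitions of a list of items into nonempty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for size in range(len(rest) + 1):
        for others in combinations(rest, size):
            block = frozenset((first,) + others)
            remaining = [i for i in rest if i not in others]
            for p in partitions(remaining):
                yield [block] + p

def superadditive_cover(u, n):
    """v(P) = max over partitions P* of P of the sum of u(R) for R in P*."""
    v = {}
    players = list(range(1, n + 1))
    for size in range(1, n + 1):
        for S in combinations(players, size):
            v[frozenset(S)] = max(sum(u[b] for b in p)
                                  for p in partitions(list(S)))
    return v

u = {frozenset({1}): 5, frozenset({2}): 0, frozenset({3}): 0,
     frozenset({1, 2}): 1, frozenset({1, 3}): 1, frozenset({2, 3}): 1,
     frozenset({1, 2, 3}): 1}
v = superadditive_cover(u, 3)
assert v[frozenset({1, 2, 3})] == 6       # achieved by the partition {1}, {23}
assert v[frozenset({1, 2})] == 5          # achieved by the partition {1}, {2}
assert v[frozenset({2, 3})] == 1
```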
We can often relax the requirement of superadditivity and assume only that the grand coalition obtains a value at least as great as the sum of the values of any partition of the grand coalition. Such games are called cohesive.
Definition 5.5. A characteristic function game is said to be cohesive if

v(N) = maxP ∑P∈P v(P)
There are important examples of cohesive games. For instance, we will see later that some models of hierarchical organizations produce cohesive games that are not superadditive.
5.3.4 Essential games
Definition 5.6. A game is essential if

∑i∈N v({i}) < v(N)

A game is inessential if

∑i∈N v({i}) ≥ v(N)

Note 5.6. If ∑i∈N v({i}) > v(N), then A(v) = Ø. If ∑i∈N v({i}) = v(N), then A(v) = {(v({1}), v({2}), . . . , v({n}))}.
5.3.5 Constant sum games
Definition 5.7. A game is a constant sum game if

v(S) + v(N − S) = v(N) ∀ S ⊂ N
5.3.6 Strategic equivalence
Definition 5.8. Two games (N, v1) and (N, v2) are strategically equivalent if and only if there exist c > 0 and scalars a1, . . . , an such that

v1(S) = c v2(S) + ∑i∈S ai ∀ S ⊆ N
Properties of strategic equivalence:

1. It's a linear transformation.

2. It's an equivalence relation:

• reflexive
• symmetric
• transitive

Hence it partitions the set of games into equivalence classes.

3. It's an isomorphism with respect to dom on A(v2) → A(v1). So, strategic equivalence preserves important solution concepts.
5.3.7 Normalization
Definition 5.9. A game (N, v) is in (0, 1) normal form if

v(N) = 1
v({i}) = 0 ∀ i ∈ N
The set A(v) for a game in (0, 1) normal form is a "probability simplex."

Suppose a game is in (0, 1) normal form and superadditive; then 0 ≤ v(S) ≤ 1 for all S ⊆ N.
An essential game (N, u) can be converted to (0, 1) normal form by using

v(S) = (u(S) − ∑i∈S u({i})) / (u(N) − ∑i∈N u({i}))

Note that the denominator must be positive for any essential game (N, u).
Note 5.7. For n = 3, a game in (0, 1) normal form can be completely defined by specifying (v(12), v(13), v(23)).

Question 5.4. Show that C ≠ Ø for any three-person (0, 1) normal form game with

v(12) + v(13) + v(23) < 2
Here's an example:

[Figure: the barycentric picture for a game with v(12) + v(13) + v(23) < 2. The core C is the central region bounded by the lines x1 + x2 = v(12), x1 + x3 = v(13) and x2 + x3 = v(23); the regions Dom{1,2} C, Dom{1,3} C and Dom{2,3} C lie outside it.]
Show that stable sets are of the following form:

[Figure: the same picture for a game with v(12) + v(13) + v(23) < 2, with a stable set drawn as a curve passing through the core region.]
(My thanks to Ling Wang for her suggestions on this section.)
Produce similar diagrams for the case v(12) + v(13) + v(23) > 2.
Is C = Ø for v(12) + v(13) + v(23) = 2?
5.4 Garbage game
There are n players. Each player produces one bag of garbage and dumps it in another's yard. The payoff for any player is

−1 × (the number of bags in his yard)

We get

v(N) = −n
v(M) = |M| − n for |M| < n
We have C = Ø when n > 2. To show this, note that x ∈ C implies

∑i∈N−{j} xi ≥ v(N − {j}) = −1 ∀ j ∈ N

Summing over all j ∈ N,

(n − 1) ∑i∈N xi ≥ −n
(n − 1) v(N) ≥ −n
(n − 1)(−n) ≥ −n
n ≤ 2
5.5 Pollution game
There are n factories around a lake.
Input water is free, but if the lake is dirty, a factory may need to pay to clean the water. If k factories pollute the lake, the cost to a factory to clean the incoming water is kc.
Output water is dirty, but a factory might pay to treat the effluent at a cost ofb.
Assume 0 < c < b < nc.

If a coalition M forms, all of its members could agree to pollute, with a payoff of |M|(−nc). Or, all of its members could agree to clean the water, with a payoff of |M|(−(n − |M|)c) − |M|b. Hence,

v(M) = max{|M|(−nc), |M|(−(n − |M|)c) − |M|b} for M ⊂ N
v(N) = max{−n²c, −nb}
Question 5.5. Show that C ≠ Ø and x = (−b, . . . , −b) ∈ C.
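A numeric sanity check of Question 5.5, with sample parameters of my own choosing that satisfy 0 < c < b < nc:

```python
from itertools import combinations

n, c, b = 3, 1.0, 2.0          # sample values with 0 < c < b < n*c
assert 0 < c < b < n * c

def v(size):
    """Characteristic function of the pollution game by coalition size."""
    if size == n:
        return max(-n * n * c, -n * b)
    return max(size * (-n * c), size * (-(n - size) * c) - size * b)

x = [-b] * n                    # the candidate imputation (-b, ..., -b)
assert abs(sum(x) - v(n)) < 1e-9   # x distributes exactly v(N) = -nb

# Every coalition M receives at least v(|M|), so x is in the core.
for size in range(1, n):
    for M in combinations(range(n), size):
        assert sum(x[i] for i in M) >= v(size) - 1e-9
```

With these numbers, v(N) = max(−9, −6) = −6 and each singleton can guarantee only −3, so no coalition can block (−2, −2, −2).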
5.6 Balanced sets and the core
The presentation in this section is based on Owen [9].

The core C can be defined as the set of all (x1, . . . , xn) ∈ A(v) ⊂ R^n such that

∑i∈N xi ≡ x(N) = v(N)

and

∑i∈S xi ≡ x(S) ≥ v(S) ∀ S ∈ 2^N
If we further define an additive set function x(·) as any function x : 2^N → R such that

x(S) = ∑i∈S x({i})

we get the following, equivalent, definition of the core:

Definition 5.10. The core C of a game (N, v) is the set of additive a : 2^N → R such that

a(N) = v(N)
a ≥ v

The second condition means that a(S) ≥ v(S) for all S ⊂ N.

We would like to characterize those characteristic functions v for which the core is nonempty.
Note that C ≠ Ø if and only if the linear programming problem

min z = ∑i xi
st: ∑i∈S xi ≥ v(S) ∀ S ⊂ N    (4)

has a minimum z* ≤ v(N).

Consider the dual to the above linear programming problem (4):

max q = ∑S⊂N yS v(S)
st: ∑S∋i yS = 1 ∀ i ∈ N
    yS ≥ 0 ∀ S ⊂ N    (5)

Both the linear program (4) and its dual (5) are always feasible. So

min z = max q

by the duality theorem. Hence, the core is nonempty if and only if

max q ≤ v(N)
This leads to the following:
Theorem 5.2. A necessary and sufficient condition for the game (N, v) to have C ≠ Ø is that for every nonnegative vector (yS)S⊂N satisfying

∑S∋i yS = 1 ∀ i

we have

∑S⊂N yS v(S) ≤ v(N)
To make this more useful, we introduce the concept of a balanced collection of coalitions.
Definition 5.11. B ⊂ 2^N is balanced if there exist yS ∈ R with yS > 0 for all S ∈ B such that

∑S∋i yS = 1 ∀ i ∈ N

y is called the balancing vector (or weight vector) for B. The individual yS's are called balancing coefficients.
Example 5.4. Suppose N = {1, 2, 3}.

B = {{1}, {2}, {3}} is a balanced collection with y{1} = 1, y{2} = 1, and y{3} = 1.

B = {{1, 2}, {3}} is a balanced collection with y{1,2} = 1 and y{3} = 1.

B = {{1, 2}, {1, 3}, {2, 3}} is a balanced collection with y{1,2} = 1/2, y{1,3} = 1/2, and y{2,3} = 1/2.
Theorem 5.3. The union of balanced collections is balanced.
Lemma 5.1. Let B1 and B2 be balanced collections such that B1 ⊂ B2 but B1 ≠ B2. Then there exists a balanced collection B3 ≠ B2 such that B3 ∪ B1 = B2.
The above lemma leads us to define the following:
Definition 5.12. A minimal balanced collection is a balanced collection for which no proper subcollection is balanced.
Theorem 5.4. Any balanced collection can be written as the union of minimal balanced collections.
Theorem 5.5. Any balanced collection has a unique balancing vector if and only if it is a minimal balanced collection.
Theorem 5.6. Each extreme point of the polyhedron for the dual linear programming problem (5) is the balancing vector of a minimal balanced collection.
Corollary 5.1. A minimal balanced collection has at most n sets.
The result is the following theorem:
Theorem 5.7. (Shapley-Bondareva) The core is nonempty if and only if for every minimal balanced collection B with balancing coefficients (yS)S∈B we have

v(N) ≥ ∑S∈B yS v(S)
Example 5.5. Let N = {1, 2, 3}. Besides the partitions, such as {{1, 2}, {3}}, there is only one other minimal balanced collection, namely,

B = {{1, 2}, {1, 3}, {2, 3}}

with

y = (1/2, 1/2, 1/2)

Therefore a three-person game (N, v) has a nonempty core if and only if

(1/2) v({1, 2}) + (1/2) v({1, 3}) + (1/2) v({2, 3}) ≤ v(N)

that is,

v({1, 2}) + v({1, 3}) + v({2, 3}) ≤ 2 v(N)
Question 5.6. Use the above result to reconsider Question 5.4.
Question 5.7. Suppose we are given v(S) for all S ≠ N. What is the smallest value of v(N) such that C ≠ Ø?
5.7 The Shapley value
Much of this section is from Yang [14].
Definition 5.13. A carrier for a game (N, v) is a coalition T ⊆ N such that v(S) ≤ v(S ∩ T) for any S ⊆ N.

This definition is slightly different from the one given by Shapley [10]. Shapley uses v(S) = v(S ∩ T) instead of v(S) ≤ v(S ∩ T). However, when the game (N, v) is superadditive, the definitions are equivalent.

A carrier is a group of players with the ability to benefit the coalitions they join. A coalition can remove any of its members who do not belong to the carrier and get the same, or greater, value.
Let Π(N) denote the set of all permutations on N, that is, the set of all one-to-one mappings from N onto itself.

Definition 5.14. (Owen [9]) Let (N, v) be an n-person game, and let π ∈ Π(N). Then the game (N, πv) is defined as the game (N, u) such that

u({π(i1), π(i2), . . . , π(i|S|)}) = v(S)

for any coalition S = {i1, i2, . . . , i|S|}.
Definition 5.15. (Friedman [3]) Let (N, v) be an n-person game. The marginal value, cS(v), for coalition S ⊆ N is given by

c{i}(v) ≡ v({i})

for all i ∈ N, and

cS(v) ≡ v(S) − ∑L⊂S cL(v)

for all S ⊆ N with |S| ≥ 2.

The marginal value of S can also be computed by using the formula

cS(v) = ∑L⊆S (−1)^{|S|−|L|} v(L)
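The recursion and the alternating-sum formula can be checked against each other by brute force. This sketch is mine (function names mine); the sample game is the three-person game that appears later in Example 5.8:

```python
from itertools import combinations

def subsets(S):
    """All nonempty subsets of the frozenset S."""
    items = sorted(S)
    for size in range(1, len(items) + 1):
        for T in combinations(items, size):
            yield frozenset(T)

def marginal_recursive(S, v, memo=None):
    """c_S(v) = v(S) - sum of c_L(v) over proper nonempty subsets L of S."""
    if memo is None:
        memo = {}
    if S not in memo:
        memo[S] = v[S] - sum(marginal_recursive(L, v, memo)
                             for L in subsets(S) if L != S)
    return memo[S]

def marginal_moebius(S, v):
    """c_S(v) = sum over nonempty L subset of S of (-1)^(|S|-|L|) v(L)."""
    return sum((-1) ** (len(S) - len(L)) * v[L] for L in subsets(S))

v = {frozenset({1}): 0.0, frozenset({2}): 0.0, frozenset({3}): 1.0,
     frozenset({1, 2}): 3.5, frozenset({1, 3}): 0.0, frozenset({2, 3}): 0.0,
     frozenset({1, 2, 3}): 5.0}

# The two formulas agree on every coalition.
for S in subsets(frozenset({1, 2, 3})):
    assert abs(marginal_recursive(S, v) - marginal_moebius(S, v)) < 1e-9

# Formula (6) of the next subsection: phi_i = sum of c_S/|S| over S containing i.
phi1 = sum(marginal_recursive(S, v) / len(S)
           for S in subsets(frozenset({1, 2, 3})) if 1 in S)
assert abs(phi1 - 25/12) < 1e-9
```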
5.7.1 The Shapley axioms
Let φ(v) = (φ1(v), φ2(v), . . . , φn(v)) be an n-dimensional vector satisfying the following three axioms:

Axiom S 1. (Symmetry) For each π ∈ Π(N), φπ(i)(πv) = φi(v).

Axiom S 2. (Rationing) For each carrier C of (N, v),

∑i∈C φi(v) = v(C)

Axiom S 3. (Law of Aggregation) For any two games (N, v) and (N, w),

φ(v + w) = φ(v) + φ(w)

Theorem 5.8. (Shapley [10]) For any superadditive game (N, v) there is a unique vector of values φ(v) = (φ1(v), . . . , φn(v)) satisfying the above three axioms. Moreover, for each player i this value is given by

φi(v) = ∑S⊆N, S∋i cS(v)/|S|    (6)
Note 5.8. The Shapley value can be equivalently written [9] as

φi(v) = ∑T⊆N, T∋i [ (|T| − 1)! (n − |T|)! / n! ] [v(T) − v(T − {i})]    (7)

This formula can be interpreted as follows: Suppose n players arrive one after the other into a room that will eventually contain the grand coalition. Consider all possible sequencing arrangements of the n players. Suppose that any sequence can occur with probability 1/n!. If Player i arrives and finds coalition T − {i} already in the room, his contribution to the coalition is v(T) − v(T − {i}). The Shapley value is the expected value of the contribution of Player i.
5.8 A generalization of the Shapley value
Suppose we introduce the concept of taxation (or resource redistribution) and relax just one of the axioms. Yang [14] has shown that the Shapley value and the egalitarian value

φ0i(v) = v(N)/n ∀ i ∈ N

are then the extremes of an entire family of values for all cohesive (not necessarily superadditive) games.
Axiom Y 1. (Symmetry) For each π ∈ Π(N), ψπ(i)(πv) = ψi(v).

Axiom Y 2. (Rationing) For each carrier C of (N, v),

∑i∈C ψi(v) = g(C) v(C) with |C|/n ≤ g(C) ≤ 1

Axiom Y 3. (Law of Aggregation) For any two games (N, v) and (N, w),

ψ(v + w) = ψ(v) + ψ(w)
Note that Yang only modifies the second axiom. The function g(C) is called the rationing function. It can be any real-valued function defined on attributes of the carrier C with range [|C|/n, 1]. If the game (N, v) is superadditive, then g(C) = 1 yields Shapley's original axioms.
A particular choice of the rationing function g(C) produces a convex combination between the egalitarian value and the Shapley value. Let N = {1, . . . , n} and let c ≡ |C| for C ⊆ N. Given the value of the parameter r ∈ [1/n, 1], consider the real-valued function

g(C) ≡ g(c, r) = [(n − c) r + (c − 1)] / (n − 1)

The function g(C) specifies the distribution of revenue among the players of a game.

Note that this function can be rewritten as

g(c, r) = 1 − (1 − r) (n − c)/(n − 1)
For games with a large number of players,

lim n→∞ g(c, r) = r ∈ (0, 1]

so that (1 − r) can be regarded as a "tax rate" on carriers.
Using this form of the rationing function results in the following:

Theorem 5.9. Let (N, v) be a cohesive n-person cooperative transferable utility game. For each r ∈ [1/n, 1], there exists a unique value, ψi,r(v), for each Player i satisfying the three axioms. Moreover, this unique value is given by

ψi,r(v) = (1 − p) φi(v) + p v(N)/n ∀ i ∈ N    (8)

where p = (n − nr)/(n − 1) ∈ [0, 1].
Note that the rationing function can be written in terms of p as

g(c, p) = (1 − p) + p (c/n)
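Equation (8) makes the family easy to compute once the Shapley value is in hand. A self-contained sketch (names mine), checked against the two-person game of Example 5.6 below:

```python
from itertools import permutations

def shapley(v, players):
    """Shapley value via the random arrival-order formula."""
    phi = {i: 0.0 for i in players}
    orders = list(permutations(players))
    for order in orders:
        room = frozenset()
        for i in order:
            phi[i] += v[room | {i}] - (v[room] if room else 0.0)
            room = room | {i}
    return {i: phi[i] / len(orders) for i in players}

def yang_value(v, players, r):
    """psi_{i,r} = (1 - p) phi_i + p v(N)/n with p = (n - n r)/(n - 1)."""
    n = len(players)
    p = (n - n * r) / (n - 1)
    phi = shapley(v, players)
    vN = v[frozenset(players)]
    return {i: (1 - p) * phi[i] + p * vN / n for i in players}

# Two-person game of Example 5.6: v(1) = 1, v(2) = 0, v(12) = 2.
v = {frozenset({1}): 1.0, frozenset({2}): 0.0, frozenset({1, 2}): 2.0}
psi = yang_value(v, (1, 2), r=0.75)          # family: (1/2 + r, 3/2 - r)
assert abs(psi[1] - 1.25) < 1e-9
assert abs(psi[2] - 0.75) < 1e-9
assert yang_value(v, (1, 2), r=1.0) == shapley(v, (1, 2))   # r = 1 is Shapley
```

At r = 1/n the same code returns the egalitarian value v(N)/n for every player.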
Example 5.6. Consider a two-person game with

v({1}) = 1, v({2}) = 0, v({1, 2}) = 2

Player 2 can contribute 1 to a coalition with Player 1. But Player 1 can get 1 on his own, leaving Player 2 with nothing.

The family of values is

ψr(v) = (1/2 + r, 3/2 − r)

for 1/2 ≤ r ≤ 1. The Shapley value (with r = 1) is (3/2, 1/2).
(We are indebted to an anonymous reviewer for the simplified version of Theorem 5.9 and for the observation about g(c, p).)
Example 5.7. Consider a modification of the game in Example 5.6 with

v({1}) = 1, v({2}) = 0, v({1, 2}) = 1

In this case, Player 2 is a dummy player.

The family of values is

ψr(v) = (r, 1 − r)

for 1/2 ≤ r ≤ 1. The Shapley value (with r = 1) is (1, 0).
Example 5.8. This solution approach can be applied to a problem suggested by Nowak and Radzik [8]. Consider a three-person game where

v({1}) = v({2}) = 0, v({3}) = 1,
v({1, 2}) = 3.5, v({1, 3}) = v({2, 3}) = 0,
v({1, 2, 3}) = 5.

The Shapley value for this game is

φ(v) = (25/12, 25/12, 10/12)
Note that the Shapley value will not necessarily satisfy the condition of individual rationality

φi(v) ≥ v({i})

when the characteristic function v is not superadditive. That is the case here, since φ3(v) < v({3}).
The solidarity value (Nowak and Radzik [8]) ξ(v) of this game is

ξ(v) = (16/9, 16/9, 13/9)

and is in the core of (N, v).
For every r ∈ [1/n, 1], the general form of the family of values is

ψr(v) = ((35 + 15r)/24, (35 + 15r)/24, (50 − 30r)/24)
The diagram in the following figure shows the relationship between the family of values and the core.

[Figure: the imputation triangle with vertices (5, 0, 0), (0, 5, 0) and (0, 0, 5). The core is a region inside the triangle, and the family of values is a line segment from point A to point B.]

Note that, in the diagram,

A = (25/12, 25/12, 10/12) (the Shapley value)
B = (5/3, 5/3, 5/3) (the egalitarian value)
Neither of these extreme values of the family is in the core for this game. However, those solutions for 7/15 ≤ r ≤ 13/15 are elements of the core.
Example 5.9. Nowak and Radzik [8] offer the following example related to social welfare and income redistribution: Players 1, 2, and 3 are brothers living together. Players 1 and 2 can make a profit of one unit, that is, v({1, 2}) = 1. Player 3 is a disabled person and can contribute nothing to any coalition. Therefore, v({1, 2, 3}) = 1. Also, v({1, 3}) = v({2, 3}) = 0 and v({i}) = 0 for every Player i.
The Shapley value of this game is

φ(v) = (1/2, 1/2, 0)

and for the family of values, we get

ψr(v) = ((1 + r)/4, (1 + r)/4, (1 − r)/2)

for r ∈ [1/3, 1]. Every r yields a solution satisfying individual rationality, but, in this case, ψr(v) belongs to the core only when it equals the Shapley value (r = 1).
For this particular game, the solidarity value is a member of the family when r = 5/9. Nowak and Radzik propose this single value as a "better" solution for the game (N, v) than its Shapley value. They suggest that it could be used to include subjective social or psychological aspects in a cooperative game.
Question 5.8. Suppose the game (N, v) has core C ≠ Ø. Let

F ≡ {ψr(v) | 1/n ≤ r ≤ 1}

denote the set of Yang's values when using rationing function g(c, r). Under what conditions will C ∩ F ≠ Ø?
5.9 BIBLIOGRAPHY
[1] R.J. Aumann, The core of a cooperative game without side payments, Transactions of the American Mathematical Society, Vol. 98, No. 3 (Mar. 1961), pp. 539–552. (JSTOR)

[2] L.J. Billera, Some theorems on the core of an n-person game without side-payments, SIAM Journal on Applied Mathematics, Vol. 18, No. 3 (May 1970), pp. 567–579. (JSTOR)

[3] J.W. Friedman, Oligopoly and the theory of games, North-Holland Publishing Co., New York (1977).

[4] J.W. Friedman, Game theory with applications to economics, Oxford University Press, New York (1986).

[5] E. Iñarra and J.M. Usategui, The Shapley value and average convex games, International Journal of Game Theory, Vol. 22 (1993), pp. 13–29.

[6] M. Maschler, The power of a coalition, Management Science, Vol. 10 (1963), pp. 8–29.

[7] D. Monderer, D. Samet and L.S. Shapley, Weighted values and the core, International Journal of Game Theory, Vol. 21 (1992), pp. 27–39.

[8] A.S. Nowak and T. Radzik, A solidarity value for n-person transferable utility games, International Journal of Game Theory, Vol. 23 (1994), pp. 43–48.

[9] G. Owen, Game theory, Academic Press Inc., San Diego (1982).

[10] L.S. Shapley, A value for n-person games, Annals of Mathematics Studies, Vol. 28 (1951), pp. 307–317.

[11] W.F. Lucas and R.M. Thrall, n-person games in partition form, Naval Research Logistics Quarterly, Vol. 10 (1963), pp. 281–298.

[12] J. von Neumann and O. Morgenstern, Theory of games and economic behavior, Princeton Univ. Press (1947).

[13] W. Willick, A power index for cooperative games with applications to hierarchical organizations, Ph.D. Thesis, SUNY at Buffalo (1995).

[14] C.H. Yang, A family of values for n-person cooperative transferable utility games, M.S. Thesis, University at Buffalo (1997).
IE675 Game Theory
Lecture Note Set 6, Wayne F. Bialas
Tuesday, April 3, 2001
6 DYNAMIC COOPERATIVE GAMES
6.1 Some introductory examples
Consider the following hierarchical game:
[Figure: a dynamic cooperative game with three levels of government: Federal (F) at the top, State (S) in the middle, and Local (L) at the bottom.]
In this particular example,
1. The system has interacting players within a hierarchical structure.

2. Each player executes his policies after, and with full knowledge of, the decisions of his predecessors.

3. Players might form coalitions in order to improve their payoffs.
What do we mean by (3)?
For examples (without coalitions) see Cassidy, et al. [12] and Charnes, et al. [13].
Without coalitions:

Payoff to Federal government = gF(F, S, L)
Payoff to State government = gS(F, S, L)
Payoff to Local government = gL(F, S, L)

A coalition structure of {{F, S}, {L}} would result in the players maximizing the following objective functions:

Payoff to Federal government = gF(F, S, L) + gS(F, S, L)
Payoff to State government = gF(F, S, L) + gS(F, S, L)
Payoff to Local government = gL(F, S, L)
The order of the play remains the same. Only the objectives change.
Here is a two-player game of the same type, but written in extensive form:

[Figure: an example with two players. Player F moves first, choosing f or p; Player S then chooses a or b. The payoffs (to F, S) at the leaves are (f, a) → (3, 2), (f, b) → (2, 1), (p, a) → (7, 0), and (p, b) → (2, 1).]

where

f: full funding
p: partial funding
a: Project a
b: Project b
The Stackelberg solution to this game is (f, a) with a payoff of (3, 2). However, if the players cooperated, and utility was transferable, they could get 7 with strategy (p, a).
The key element causing this effect is preemption. A dynamic, cooperative modelis needed.
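The Stackelberg outcome above can be reproduced by backward induction. The sketch below is mine, and it assumes the leaf payoffs of the tree are listed as (to F, to S):

```python
# Payoffs (to F, to S), assuming the leaves of the tree are listed as (F, S).
payoff = {('f', 'a'): (3, 2), ('f', 'b'): (2, 1),
          ('p', 'a'): (7, 0), ('p', 'b'): (2, 1)}

def stackelberg(payoff):
    """F moves first; S best-responds; F anticipates S's reaction."""
    best = None
    for f_move in ('f', 'p'):
        # S's rational reaction to f_move:
        s_move = max(('a', 'b'), key=lambda s: payoff[(f_move, s)][1])
        outcome = (f_move, s_move)
        if best is None or payoff[outcome][0] > payoff[best][0]:
            best = outcome
    return best

assert stackelberg(payoff) == ('f', 'a')        # payoff (3, 2)

# With cooperation and transferable utility, the players maximize the total:
coop = max(payoff, key=lambda m: sum(payoff[m]))
assert coop == ('p', 'a') and sum(payoff[coop]) == 7
```

The gap between the noncooperative total of 5 and the cooperative total of 7 is the loss caused by preemption.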
Chew [14] showed that even linear models can exhibit this behavior, and he developed a dynamic cooperative game model.
[Figures: three sketches from Chew's model in the (x1, x2) plane. The first shows the level-one problem: maximize c1 · x over the region S1. The second shows the rational reactions x̂2 = ψ(x̂1) and the resulting solution x*. The third shows the solution properties of x* relative to the combined objective c1 + c2.]
6.1.1 Issues
See Bialas and Karwan [4] for details.
1. alternate optimal solutions
2. nonconvex feasible region
Note 6.1. The cause of inadmissible solutions is not the fault of the optimizers, but, rather, the sequential and preemptive nature of the decision process (i.e., the "friction of space and time").
6.2 Multilevel mathematical programming
The noncooperative model in this section will serve as the foundation for our cooperative dynamic model. See also Bialas and Karwan [6].
Note 6.2. Some history: Sequential optimization problems arise frequently in many fields, including economics, operations research, statistics and control theory. The origin of this class of problems is difficult to trace since it is woven into the fabric of many scientific disciplines.

For the field of operations research, this topic arose as an extension to linear programming (see, for example, Bracken and McGill [8], Cassidy, et al. [12], and Charnes, et al. [13]).

In particular, Bracken, et al. [8, 7, 9] define a two-level problem where the constraints contain an optimization problem. However, the feasible region of the lower-level planner does not depend on the decision variables of the upper-level planner. Removing this restriction, Candler and Norton [10] named this class of problems "multilevel programming." A number of researchers mathematically characterized the geometry of this problem and developed solution algorithms (see, for example, [1, 4, 5, 15]).
For a more complete bibliography, see Vicente and Calamai [18].
Let the decision variable space (Euclidean n-space), Rn ∋ x = (x1, x2, . . ., xn), be partitioned among r levels,

    Rnk ∋ xk = (xk1, xk2, . . ., xknk)   for k = 1, . . ., r,

where Σ_{k=1}^{r} nk = n. Denote the maximization of a function f(x) over Rn by varying only xk ∈ Rnk, given fixed xk+1, xk+2, . . ., xr in Rnk+1 × Rnk+2 × · · · × Rnr, by

    max{f(x) : (xk | xk+1, xk+2, . . ., xr)}.   (1)

Note 6.3. The value of expression (1) is a function of x1, x2, . . ., xk−1.
Let the full set of system constraints for all levels be denoted by S. Then the problem at the lowest level of the hierarchy, level one, is given by

    (P1)   max {f1(x) : (x1 | x2, . . ., xr)}
           st:  x ∈ S1 = S

Note 6.4. The problem for the level-one decision maker, P1, is simply a (traditional) mathematical programming problem dependent on the given values of x2, . . ., xr. That is, P1 is a parametric programming problem.
The feasible region, S = S1, is defined as the level-one feasible region. The solutions to P1 in Rn1 for each fixed x2, x3, . . ., xr form a set,

    S2 = {x ∈ S1 : f1(x) = max{f1(x) : (x1 | x2, x3, . . ., xr)}},

called the level-two feasible region, over which f2(x) is then maximized by varying x2 for fixed x3, x4, . . ., xr.
Thus the problem at level two is given by

    (P2)   max {f2(x) : (x2 | x3, x4, . . ., xr)}
           st:  x ∈ S2

In general, the level-k feasible region is defined as

    Sk = {x ∈ Sk−1 : fk−1(x) = max{fk−1(x) : (xk−1 | xk, . . ., xr)}}.
Note that xk−1 is a function of xk, . . ., xr. Furthermore, the problem at each level can be written as

    (Pk)   max {fk(x) : (xk | xk+1, . . ., xr)}
           st:  x ∈ Sk

which is a function of xk+1, . . ., xr, and

    (Pr):  max {fr(x) : x ∈ Sr}

defines the entire problem. This establishes a collection of nested mathematical programming problems {P1, . . ., Pr}.
Question 6.1. Pk depends on given xk+1, . . ., xr, and only xk is varied. But fk(x) is defined over all x1, . . ., xr. Where are the variables x1, . . ., xk−1 in problem Pk?

Note that the objective at level k, fk(x), is defined over the decision space of all levels. Thus, the level-k planner may have his objective function determined, in part, by variables controlled at other levels. However, by controlling xk after decisions from levels k + 1 to r have been made, level k may influence the policies at level k − 1, and hence all lower levels, to improve his own objective function.
6.2.1 A more general definition
See also Bialas and Karwan [5].
Let the vector x ∈ RN be partitioned as (xa, xb). Then we can define the following set function over the collection of closed and bounded regions S ⊂ RN:

    Ψf(S) = {x ∈ S : f(x) = max{f(x) : (xa | xb)}}

as the set of rational reactions of f over S. This set is also sometimes called the inducible region. If for a fixed xb there exists a unique x̂a which maximizes f(xa, xb) over all (xa, xb) ∈ S, then there is induced a mapping

    xa = ψf(xb)

which provides the rational reaction for each xb, and we can then write

    Ψf(S) = S ∩ {(xa, xb) : xa = ψf(xb)}
So if S = S1 is the level-one feasible region, the level-two feasible region is

    S2 = Ψf1(S1)

and the level-k feasible region is

    Sk = Ψfk−1(Sk−1)

Note 6.5. Even if S1 is convex, the sets Sk = Ψfk−1(Sk−1) for k ≥ 2 are typically nonconvex.
6.2.2 The twolevel linear resource control problem
The two-level linear resource control problem is the multilevel programming problem of the form

    max  c2x
    st:  x ∈ S2

where

    S2 = {x ∈ S1 : c1x = max{c1x : (x1 | x2)}}

and

    S1 = S = {x : A1x1 + A2x2 ≤ b, x ≥ 0}.

Here, level 2 controls x2 which, in turn, varies the resource space of level one by restricting A1x1 ≤ b − A2x2.
The nested optimization problem can be written as:

    (P2)   max {c2x = c21x1 + c22x2 : (x2)}
           where x1 solves
    (P1)   max {c1x = c11x1 + c12x2 : (x1 | x2)}
           st:  A1x1 + A2x2 ≤ b
                x ≥ 0
Question 6.2. Suppose someone gives you a proposed solution x∗ to problem P2. Develop an "easy" way to test whether x∗ is, in fact, the solution to P2.
Question 6.3. What is the solution to P2 if c1 = c2? What happens if c1 is replaced by αc1 + (1 − α)c2 for some 0 ≤ α ≤ 1?
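To make the structure of the resource control problem concrete, here is a minimal pure-Python sketch that solves a tiny two-level instance by brute force. The instance (level-one objective c1 = (1, 0), level-two objective c2 = (−1, 2), constraints x1 + x2 ≤ 2, 0 ≤ x2 ≤ 1, x ≥ 0) and the grid step are assumptions chosen for illustration, not data from these notes; Corollary 6.1 below justifies searching only extreme points, but a coarse grid keeps the sketch short.

```python
# Grid-search sketch of a two-level linear resource control problem.
# Hypothetical instance: level 1 (the reacting level) controls x1 and
# maximizes c1.x = x1; level 2 (the leader) controls x2 and maximizes
# c2.x = -x1 + 2*x2, over S1 = {(x1, x2) : x1 + x2 <= 2, 0 <= x2 <= 1, x1 >= 0}.

def feasible(x1, x2):
    return x1 >= 0 and 0 <= x2 <= 1 and x1 + x2 <= 2 + 1e-9

def level1_reaction(x2, grid):
    # Rational reaction of level 1: maximize c1.x = x1 for the fixed x2.
    best = None
    for x1 in grid:
        if feasible(x1, x2) and (best is None or x1 > best):
            best = x1
    return best

grid = [i / 100 for i in range(0, 201)]          # step 0.01 on [0, 2]
best_val, best_x = float("-inf"), None
for x2 in [i / 100 for i in range(0, 101)]:      # leader's choices on [0, 1]
    x1 = level1_reaction(x2, grid)               # level 1 reacts rationally
    val = -x1 + 2 * x2                           # leader's objective c2.x
    if val > best_val:
        best_val, best_x = val, (x1, x2)

print(best_x, best_val)   # leader settles on x2 = 1; level 1 then takes x1 = 1
```

On this instance the leader gives up one unit of resource to level 1 (x1 = 2 − x2) no matter what, so it pushes x2 to its upper bound; an exact LP solver could replace the inner grid search.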
6.2.3 The twolevel linear price control problem
The two-level linear price control problem is another special case of the general multilevel programming problem. In this problem, level two controls the cost coefficients of level one:

    (P2)   max {c2x = c21x1 + c22x2 : (x2)}
           st:  A2x2 ≤ b2
           where x1 solves
    (P1)   max {(x2)t x1 : (x1 | x2)}
           st:  A1x1 ≤ b1
                x1 ≥ 0
6.3 Properties of S2
Theorem 6.1. Suppose S1 = {x : Ax = b, x ≥ 0} is bounded. Let

    S2 = {x = (x1, x2) ∈ S1 : c1x1 = max{c1x1 : (x1 | x2)}}.

Then the following hold:

(i) S2 ⊆ S1

(ii) Let {yt}, t = 1, . . ., ℓ, be any points of S1 such that x = Σ_{t=1}^{ℓ} λt yt ∈ S2 with λt ≥ 0 and Σ_{t} λt = 1. Then λt > 0 implies yt ∈ S2.
Proof: See Bialas and Karwan [4].
Note 6.6. The following results are due to Wen [19] (Chapter 2):

• A set S2 with the above property is called a shaving of S1.

• Shavings of shavings are shavings.

• Shavings can be decomposed into convex sets that are shavings.

• A convex set is always a shaving of itself.

• There is a relationship between shavings and the Kuhn-Tucker conditions for linear programming problems.
Definition 6.1. Let S ⊆ Rn. A set σ(S) ⊆ S is a shaving of S if and only if for any y1, y2, . . ., yℓ ∈ S and λ1 ≥ 0, λ2 ≥ 0, . . ., λℓ ≥ 0 such that Σ_{t=1}^{ℓ} λt = 1 and Σ_{t=1}^{ℓ} λt yt = x ∈ σ(S), the statement λi > 0 implies yi ∈ σ(S).
The following figures illustrate the notion of a shaving.
[Figure A: σ(S) is a shaving of S; x is a convex combination of y1 and y2.]

[Figure B: τ(T) is not a shaving; x ∈ τ(T) is a convex combination of y1 and y2.]
The red region, σ(S), in Figure A is a shaving of the set S. However, in Figure B the point x = λ1y1 + λ2y2 ∈ τ(T) with λ1 + λ2 = 1, λ1 > 0, λ2 > 0, but y1 and y2 do not belong to τ(T). Hence τ(T) is not a shaving.
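Definition 6.1 can be checked numerically on a concrete instance of the two figures. Below, S is the unit square (with made-up coordinates; the example is mine, not from the notes), σ(S) is its top edge (a face, hence a shaving), and τ is a horizontal segment through the interior, which fails the defining implication exactly as in Figure B.

```python
# Numeric illustration of Definition 6.1 on the unit square S = conv(V).
# sigma: the top edge {x : x2 = 1} (a face of S) -- a shaving.
# tau:   the horizontal segment {x : x2 = 0.5}   -- not a shaving.
# Coordinates are hypothetical, chosen to mimic Figures A and B.

V = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]

def combo(lams):
    # convex combination sum_t lam_t * y_t over the vertices V
    x1 = sum(l * v[0] for l, v in zip(lams, V))
    x2 = sum(l * v[1] for l, v in zip(lams, V))
    return (x1, x2)

in_sigma = lambda x: abs(x[1] - 1.0) < 1e-9
in_tau = lambda x: abs(x[1] - 0.5) < 1e-9

# x = 0.5*(1,1) + 0.5*(0,1) lies in sigma, and both vertices carrying
# positive weight do too: consistent with sigma being a shaving.
lams = (0.0, 0.0, 0.5, 0.5)
x = combo(lams)
print(in_sigma(x), all(in_sigma(V[i]) for i in range(4) if lams[i] > 0))

# x = 0.5*(0,0) + 0.5*(1,1) lies in tau, but the vertices carrying
# positive weight do not: the defining implication fails, so tau is
# not a shaving (this is exactly the situation of Figure B).
lams = (0.5, 0.0, 0.5, 0.0)
x = combo(lams)
print(in_tau(x), all(in_tau(V[i]) for i in range(4) if lams[i] > 0))
```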
Theorem 6.2. Suppose T = σ(S) is a shaving of S and τ(T) is a shaving of T. Let τ ◦ σ denote the composition of the functions τ and σ. Then τ ◦ σ(S) is a shaving of S.

Proof: Let y1, y2, . . ., yℓ ∈ S and λ1 ≥ 0, λ2 ≥ 0, . . ., λℓ ≥ 0 be such that Σ_{t=1}^{ℓ} λt = 1 and Σ_{t=1}^{ℓ} λt yt = x ∈ τ(σ(S)) ⊆ σ(S) = T.

Suppose λi > 0. Since σ(S) is a shaving of S, yi ∈ σ(S) = T; indeed, every yt with λt > 0 lies in T, so x is a convex combination of points of T. Since τ(T) is a shaving of T, x ∈ τ(T), and λi > 0, it follows that yi ∈ τ(T). Therefore yi ∈ τ(σ(S)), so τ ◦ σ(S) is a shaving of S.
It is easy to prove the following theorem:

Theorem 6.3. If S is a convex set, then σ(S) = S is a shaving of S.
Theorem 6.4. Let S ⊆ RN. Let σ(S) be a shaving of S. If x is an extreme point of σ(S), then x is an extreme point of S.
Proof: See Bialas and Karwan [4].
Corollary 6.1. An optimal solution to the two-level linear resource control problem (if one exists) occurs at an extreme point of the constraint set of all variables (S1).
Proof: See Bialas and Karwan [4].
These results were generalized to n levels by Wen [19]. Using Theorems 6.2 and 6.4, if each fk is linear and S1 is a bounded convex polyhedron, then the extreme points of

    Sk = Ψk−1 Ψk−2 · · · Ψ2 Ψ1(S1)

are extreme points of S1. This justifies the use of extreme point search procedures for finding the solution to the n-level linear resource control problem.
6.4 Cooperative Stackelberg games
This section is based on Chew [14], Bialas and Chew [3], and Bialas [2].
6.4.1 An Illustration
Consider a game with three players, named 1, 2 and 3, each of whom controls an unlimited quantity of a commodity, with a different commodity for each player. Their task is to fill a container of unit capacity with amounts of their respective commodities, never exceeding the capacity of the container. The task of filling will be performed in a sequential fashion, with player 3 (the player at the "top" of the hierarchy) taking his turn first. A player cannot remove a commodity placed in the container by a previous player.

At the end of the sequence, a referee pays each player one dollar (or fraction thereof) for each unit of his respective commodity which has been placed in the container. It is easy to see that, since player 3 has preemptive control over the container, he will fill it completely with his commodity, and collect one dollar.

Suppose, however, that the rules are slightly changed so that, in addition, player 3 could collect five dollars for each unit of player one's commodity which is placed in the container. Since player 2 does not receive any benefit from player one's commodity, player 2 would fill the container with his own commodity on his turn, if given the opportunity. This is the rational reaction of player 2. For this reason, player 3 has no choice but to fill the container with his commodity and collect only one dollar.
6.4.2 Coalition Formation
In the previous example, there are six dollars available to the three players. Divided equally, each of the three players could improve his payoff. However, because of the sequential and independent nature of the decisions, such a solution cannot be attained.

The solution to the above problem is thus not Pareto optimal (see Chew [14]). However, as suggested by the example, the formation of coalitions among subsets of the players could provide a means to achieve Pareto optimality. The members of each coalition act for the benefit of the coalition as a whole. The questions immediately raised are:
• which coalitions will tend to form,
• are the coalitions enforceable, and
• what will be the resulting distribution of wealth to each of the players?
The game in partition function form (see Lucas and Thrall [16] and Shenoy [17])provides a framework for answering these questions in this Stackelberg setting.
Definition 6.2. An abstract game is a pair (X, dom) where X is a set whose members are called outcomes and dom is a binary relation on X called domination.

Let G = {1, 2, . . ., n} denote the set of n players. Let P = {R1, R2, . . ., RM} denote a coalition structure, or partition of G into nonempty coalitions, where Ri ∩ Rj = Ø for all i ≠ j and ∪_{i=1}^{M} Ri = G.

Let P0 ≡ {{1}, {2}, . . ., {n}} denote the coalition structure where no coalitions have formed, and let PG ≡ {G} denote the grand coalition.
Consider P = {R1, R2, . . ., RM}, an arbitrary coalition structure. Assume that utility is additive and transferable. As a result of the coalition formation, the objective function of each player in coalition Rj becomes

    f′Rj(x) = Σ_{i∈Rj} fi(x).

Although the sequence of the players' decisions has not changed, their objective functions have. Let R(i) denote the unique coalition Rj ∈ P such that player i ∈ Rj. Instead of maximizing fi(x), player i will now be maximizing f′R(i)(x). Let x(P) denote the solution to the resulting n-level optimization problem.
Definition 6.3. Suppose that S1 is compact and x(P) is unique. The value of (or payoff to) coalition Rj ∈ P, denoted by v(Rj, P), is given by

    v(Rj, P) ≡ Σ_{i∈Rj} fi(x(P)).

Note 6.7. The function v need not be superadditive. Hence, one must be careful when applying some of the traditional game theory results which require superadditivity to this class of problems.
Definition 6.4. A solution configuration is a pair (r, P), where r is an n-dimensional vector (called an imputation) whose elements ri (i = 1, . . ., n) represent the payoff to each player i under coalition structure P.

Definition 6.5. A solution configuration (r, P) is a feasible solution configuration if and only if

    Σ_{i∈R} ri ≤ v(R, P)   for all R ∈ P.
Let Θ denote the set of all solution configurations which are feasible for the hierarchical decision-making problem under consideration. We can then define the binary relation dom as follows:
Definition 6.6. Let (r, Pr), (s, Ps) ∈ Θ. Then (r, Pr) dominates (s, Ps), denoted by (r, Pr) dom (s, Ps), if and only if there exists a nonempty R ∈ Pr such that

    ri > si for all i ∈ R, and   (2)

    Σ_{i∈R} ri ≤ v(R, Pr).   (3)

Condition (2) implies that each decision maker in R prefers coalition structure Pr to coalition structure Ps. Condition (3) ensures that R is a feasible coalition in Pr. That is, R must not demand more in the imputation r than its value v(R, Pr).
Definition 6.7. The core, C, of an abstract game is the set of undominated,feasible solution configurations.
When the core is nonempty, each of its elements represents an enforceable solutionconfiguration within the hierarchy.
6.4.3 Results
We have now defined a model of the formation of coalitions among players in a Stackelberg game. Perfect information is assumed among the players, and coalitions are allowed to form freely. No matter which coalitions form, the order of the players' actions remains the same. Each coalition earns the combined proceeds that each individual coalition member would have received in the original Stackelberg game. Therefore, a player's rational decision may now be altered because he is acting for the joint benefit of the members of his coalition.

Using the above model, several results can be obtained regarding the formation of coalitions among the players. First, the total distribution of wealth under any feasible solution configuration cannot exceed the value of the grand coalition. This is provided by the following lemma:
Lemma 6.1. If solution configuration (z, P) ∈ Θ, then

    Σ_{i=1}^{n} zi ≤ Σ_{i=1}^{n} fi(x(PG)) = v(G, PG) ≡ V∗.

Theorem 6.5. If (z, P) ∈ C ≠ Ø, then Σ_{i=1}^{n} zi = V∗.
It is also possible to construct a simple sufficient condition for the core to be empty. This is provided in Theorem 6.6.
Theorem 6.6. The abstract game (Θ, dom) has C = Ø if there exist coalition structures P1, P2, . . ., Pm and coalitions Rj ∈ Pj (j = 1, . . ., m) with Rj ∩ Rk = Ø for all j ≠ k such that

    Σ_{j=1}^{m} v(Rj, Pj) > V∗.   (4)

Finally, we can easily show that, in any 2-person game of this type, the core is always nonempty.

Theorem 6.7. If n = 2 then C ≠ Ø.
6.4.4 Examples and Computations
We will expand on the illustration given in Section 6.4.1. Let cij represent the reward to player i if the commodity controlled by player j is placed in the container. Let C represent the matrix [cij] and let x be an n-dimensional vector with xj representing the amount of commodity j placed in the container. Note that Σ_{j=1}^{n} xj ≤ 1 and xj ≥ 0 for j = 1, . . ., n. For the illustration provided in Section 6.4.1,

    C = [ 1 0 0
          0 1 0
          5 0 1 ].
Note that CxT is a vector whose components represent the earnings to each player.

Chew [14] provides a simple procedure to solve this game. The algorithm requires c11 > 0.

Step 0: Initialize i = 1 and j = 1. Go to Step 1.

Step 1: If i = n, stop. The solution is xj = 1 and xk = 0 for k ≠ j. If i ≠ n, then go to Step 2.

Step 2: Set i = i + 1. If cii > cij, then set j = i. Go to Step 1.

If no ties occur in Step 2 (i.e., cii ≠ cij), then it can be shown that the above algorithm solves the problem (see Chew [14]).
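The steps above translate directly into code. Here is a sketch (the function name and the 0-based indexing are mine; the logic is a transcription of Steps 0-2):

```python
# Chew's procedure for the container game, transcribed from the steps
# above with 0-based indices.  C[i][j] is the reward to player i+1 when
# the commodity of player j+1 fills the container; the procedure assumes
# c11 > 0 and that no ties occur in Step 2.

def chew(C):
    n = len(C)
    i, j = 0, 0                       # Step 0: initialize i and j
    while i != n - 1:                 # Step 1: stop when i = n
        i += 1                        # Step 2: advance i ...
        if C[i][i] > C[i][j]:
            j = i                     # ... and update j if c_ii > c_ij
    x = [0] * n
    x[j] = 1                          # solution: x_j = 1, all others 0
    return x

# Example 6.1 without coalitions: player 2 fills the container.
print(chew([[10, 4, 0], [0, 1, 1], [1, 4, 3]]))      # [0, 1, 0]
# Example 6.1 under P = {{1,2},{3}}: player 3 fills it himself.
print(chew([[10, 5, 1], [10, 5, 1], [1, 4, 3]]))     # [0, 0, 1]
```

Both calls reproduce the solutions reported in Example 6.1 below.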
Example 6.1. Consider the three-player game of this form with

    C = CP0 = [ 10 4 0
                 0 1 1
                 1 4 3 ].
With coalition structure P0 = {{1}, {2}, {3}}, the solution is (x1, x2, x3) = (0, 1, 0) and the coalition values are v({1}, P0) = 4, v({2}, P0) = 1 and v({3}, P0) = 4.

Consider coalition structure P = {{1, 2}, {3}}. The payoff matrix becomes

    CP = [ 10 5 1
           10 5 1
            1 4 3 ]

with a solution of (0, 0, 1). The values of the coalitions in this case are v({1, 2}, P) = 1 and v({3}, P) = 3.
Note that v is not superadditive here, since

    v({1}, P0) + v({2}, P0) > v({1, 2}, P).
When Players 1 and 2 do not cooperate, Player 2 fills the container with a benefit of 4 to Player 3. Suppose the bottom two players form coalition {1, 2}. Then if Player 2 is given an empty container, the coalition will have Player 1 fill it with his commodity, earning 10 for the coalition. So, if Player 3 does not fill the container, the formation of coalition {1, 2} reduces Player 3's benefit from 4 to 1. As a result, Player 3 fills the container himself, and earns 3. The coalition {1, 2} only earns 1 (not 10).

Remember that Chew's model assumes that all players have full knowledge of the coalition structure that has formed. Obvious natural extensions of this simple model would incorporate secret coalitions and delayed coalition formation (i.e., changes in the coalition structure while the container is being passed).
Example 6.2. Consider the three-player game of this form with

    C = CP0 = [ 4 1 4
                1 0 3
                2 5 1 ].

With coalition structure P0 = {{1}, {2}, {3}}, the solution is (x1, x2, x3) = (1, 0, 0) and the coalition values are v({1}, P0) = 4, v({2}, P0) = 1 and v({3}, P0) = 2.
Under the formation of coalition structure P = {{1}, {2, 3}}, the resources of players 2 and 3 are combined. This yields a payoff matrix of

    CP = [ 4 1 4
           3 5 4
           3 5 4 ]

and a solution of (0, 1, 0). The values of the coalitions in this case are v({1}, P) = 1 and v({2, 3}, P) = 5.
Finally, if all of the players join to form the grand coalition, PG, the payoff matrix becomes

    CPG = [ 7 6 8
            7 6 8
            7 6 8 ]

with a solution of (0, 0, 1) and v({1, 2, 3}, PG) = 8. Note that

    v({1}, P0) + v({2, 3}, P) > v({1, 2, 3}, PG).
From Theorem 6.6, we know that the core for this game is empty.
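The emptiness condition can be verified numerically. The sketch below recomputes the coalition values of Example 6.2 from the earnings vector CxT, using the solutions x(P) stated in the example; the helper names are mine, not from the notes.

```python
# Verifying inequality (4) of Theorem 6.6 for Example 6.2.  Payoffs are
# computed as C x^T using the solutions x(P) stated in the text.

def payoffs(C, x):
    # earnings vector C x^T: component i is player i's earnings
    return [sum(C[i][j] * x[j] for j in range(len(x))) for i in range(len(C))]

def v(R, C, x):
    # value of coalition R (0-based player indices) under solution x
    p = payoffs(C, x)
    return sum(p[i] for i in R)

C0 = [[4, 1, 4], [1, 0, 3], [2, 5, 1]]     # original rewards, Example 6.2

# v({1}, P0): no coalitions, solution x(P0) = (1, 0, 0)
v1 = v([0], C0, [1, 0, 0])                 # = 4
# v({2,3}, P): P = {{1},{2,3}}, solution x(P) = (0, 1, 0)
v23 = v([1, 2], C0, [0, 1, 0])             # = 0 + 5 = 5
# V* = v(G, PG): grand coalition, solution x(PG) = (0, 0, 1)
Vstar = v([0, 1, 2], C0, [0, 0, 1])        # = 4 + 3 + 1 = 8

print(v1, v23, Vstar, v1 + v23 > Vstar)    # 4 5 8 True -> core is empty
```

Since {1} and {2, 3} are disjoint and v({1}, P0) + v({2, 3}, P) = 9 > V∗ = 8, inequality (4) holds and the core is empty, as claimed.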
6.5 BIBLIOGRAPHY
[1] J. F. Bard and J. E. Falk, "An explicit solution to the multilevel programming problem," Computers and Operations Research, Vol. 9, No. 1 (1982), pp. 77–100.

[2] W. F. Bialas, Cooperative n-person Stackelberg games. Working paper, SUNY at Buffalo (1998).

[3] W. F. Bialas and M. N. Chew, A linear model of coalition formation in n-person Stackelberg games. Proceedings of the 21st IEEE Conference on Decision and Control (1982), pp. 669–672.

[4] W. F. Bialas and M. H. Karwan, Mathematical methods for multilevel planning. Research Report 792, SUNY at Buffalo (February 1979).

[5] W. F. Bialas and M. H. Karwan, On two-level optimization. IEEE Transactions on Automatic Control, Vol. AC-27, No. 1 (February 1982), pp. 211–214.

[6] W. F. Bialas and M. H. Karwan, Two-level linear programming. Management Science, Vol. 30, No. 8 (1984), pp. 1004–1020.

[7] J. Bracken, J. Falk and J. McGill, Equivalence of two mathematical programs with optimization problems in the constraints. Operations Research, Vol. 22 (1974), pp. 1102–1104.

[8] J. Bracken and J. McGill, Mathematical programs with optimization problems in the constraints. Operations Research, Vol. 21 (1973), pp. 37–44.

[9] J. Bracken and J. McGill, Defense applications of mathematical programs with optimization problems in the constraints. Operations Research, Vol. 22 (1974), pp. 1086–1096.

[10] W. Candler and R. Norton, Multilevel Programming. Unpublished research memorandum, DRC, World Bank, Washington, D.C., August 1976.

[11] W. Candler and R. Townsley, A linear two-level programming problem. Computers and Operations Research, Vol. 9, No. 1 (1982), pp. 59–76.

[12] R. Cassidy, M. Kirby and W. Raike, Efficient distribution of resources through three levels of government. Management Science, Vol. 17 (1971), pp. 462–473.

[13] A. Charnes, R. W. Clower and K. O. Kortanek, Effective control through coherent decentralization with preemptive goals. Econometrica, Vol. 35, No. 2 (1967), pp. 294–319.

[14] M. N. Chew, A game theoretic approach to coalition formation in multilevel decision making organizations. M.S. Thesis, SUNY at Buffalo (1981).

[15] J. Fortuny and B. McCarl, "A representation and economic interpretation of a two-level programming problem," Journal of the Operations Research Society, Vol. 32, No. 9 (1981), pp. 738–792.

[16] W. F. Lucas and R. M. Thrall, n-person games in partition form. Naval Research Logistics Quarterly, Vol. 10 (1963), pp. 281–298.

[17] P. Shenoy, On coalition formation: a game theoretic approach. Intl. Jour. of Game Theory (May 1978).

[18] L. N. Vicente and P. H. Calamai, Bilevel and multilevel programming: a bibliography review. Technical Report, University of Waterloo (1997). Available at: ftp://dial.uwaterloo.ca/pub/phcalamai/bilevelreview/bilevelreview.ps.

[19] U. P. Wen, Mathematical methods for multilevel programming. Ph.D. Thesis, SUNY at Buffalo (September 1981).
IE675 Game Theory Spring 2001
Homework 1 DUE February 1, 2001
1. Obtain the optimal mixed strategies for the following matrix games:

    [ 0 4          [ 3 0
      4 2            1 5 ]
      1 3 ]
2. Show that, if we treat an m × n matrix game as a point in mn-dimensional Euclidean space, the value of the game is a continuous function of the game. (See below.)

3. Revise the dual linear programming problems for determining optimal mixed strategies so that they can find the optimal pure strategy. Show that your formulation works with an example.
Notes:
For question (2), recall the following from real analysis. . .
Definition 1.1 A metric space is a set E, together with a rule which associates with each pair x, y ∈ E a real number d(x, y) such that
a. d(x, y) ≥ 0 for all x, y ∈ E
b. d(x, y) = 0 if and only if x = y
c. d(x, y) = d(y, x) for all x, y ∈ E
d. d(x, z) ≤ d(x, y) + d(y, z) for all x, y, z ∈ E
Definition 1.2 Let E and E′ be metric spaces, with distances denoted d and d′, respectively, let f : E → E′, and let x0 ∈ E. Then f is said to be continuous at x0 if, given any real number ε > 0, there exists a real number δ > 0 such that if x ∈ E and d(x, x0) < δ, then d′(f(x), f(x0)) < ε.

Definition 1.3 If E and E′ are metric spaces and f : E → E′ is a function, then f is said to be continuous on E, or, more briefly, continuous, if f is continuous at all points of E.
IE675 Game Theory Spring 2001
Homework 2 DUE February 8, 2001
1. Obtain the optimal mixed strategies for the following matrix games:
    [  1  3 −1  2
      −3 −2  2  1
       0  2 −2  1 ]

    [ −1 −3  1 −2
       3  2 −2 −1
       0 −2  2 −1 ]
2. Let A be a matrix game and let V = x0 A y0^T denote the expected value of the game when using the mixed saddle-point strategies x0 and y0. Consider a revised game where Player 1 must announce his choice of row first, and then (knowing Player 1's choice) Player 2 announces his choice of column. Let VS denote the (expected) value of the game under these rules. What can one say (if anything) about the relationship between V and VS?
Hint: We say that VS is the value from a Stackelberg strategy.
IE675 Game Theory Spring 2001
Homework 3 DUE February 15, 2001
1. Theorem 2.8 in the lecture notes states that the Lemke-Howson quadratic programming problem can be used to find Nash equilibrium solutions for a general-sum strategic form game. Although it's the correct approach, the proof in the lecture notes is, to say the least, rather sloppy.

Using the proof of Theorem 2.8 provided in the lecture notes as a starting point, develop an improved version of the proof. Try to make your proof clear and concise.