Review: Game theory - Computer Science lazebnik/fall11/lec10_game_theory2.pdf


Upload: duongquynh

Post on 31-Jan-2018


Review: Game theory

                 Alice: Testify   Alice: Refuse
Bob: Testify     -5,-5            -10,0
Bob: Refuse      0,-10            -1,-1

• Dominant strategy
• Nash equilibrium
• Pareto optimal outcome


Game of Chicken

                          Player 1: Straight (S)   Player 1: Chicken (C)
Player 2: Straight (S)    -10, -10                 -1, 1
Player 2: Chicken (C)     1, -1                    0, 0

• Is there a dominant strategy for either player?
• Is there a Nash equilibrium?
  – (Straight, chicken) or (chicken, straight)
• Anti-coordination game: it is mutually beneficial for the two players to choose different strategies
  – Model of escalated conflict in humans and animals (hawk-dove game)

http://en.wikipedia.org/wiki/Game_of_chicken
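The two questions on this slide can be checked by brute force. The sketch below is only an illustration: the names `CHICKEN`, `pure_nash_equilibria`, and `dominant_moves` are made up for this example, and the payoffs are read from the table as (Player 1, Player 2).

```python
from itertools import product

# Payoffs from the slide, keyed by (Player 1 move, Player 2 move);
# each value is (Player 1 payoff, Player 2 payoff).
CHICKEN = {
    ("S", "S"): (-10, -10),
    ("S", "C"): (1, -1),
    ("C", "S"): (-1, 1),
    ("C", "C"): (0, 0),
}
MOVES = ("S", "C")


def pure_nash_equilibria(payoffs, moves=MOVES):
    """Return all move pairs where neither player gains by deviating alone."""
    equilibria = []
    for m1, m2 in product(moves, repeat=2):
        u1, u2 = payoffs[(m1, m2)]
        p1_ok = all(payoffs[(alt, m2)][0] <= u1 for alt in moves)
        p2_ok = all(payoffs[(m1, alt)][1] <= u2 for alt in moves)
        if p1_ok and p2_ok:
            equilibria.append((m1, m2))
    return equilibria


def dominant_moves(payoffs, player, moves=MOVES):
    """Moves at least as good as every alternative against every opposing move."""
    def u(mine, other):
        key = (mine, other) if player == 0 else (other, mine)
        return payoffs[key][player]

    return [m for m in moves
            if all(u(m, o) >= u(alt, o) for o in moves for alt in moves)]


print(pure_nash_equilibria(CHICKEN))  # the two anti-coordinated outcomes
print(dominant_moves(CHICKEN, 0), dominant_moves(CHICKEN, 1))  # no dominant move
```

With these payoffs, neither player has a dominant move, and the only pure equilibria are the two outcomes in which the players choose differently.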


Mixed strategy equilibria

                          Player 1: Straight (S)   Player 1: Chicken (C)
Player 2: Straight (S)    -10, -10                 -1, 1
Player 2: Chicken (C)     1, -1                    0, 0

• Mixed strategy: a player chooses between the moves according to a probability distribution
• Suppose each player chooses S with probability 1/10. Is that a Nash equilibrium?
• Consider payoffs to P1 while keeping P2’s strategy fixed
  – The payoff of P1 choosing S is (1/10)(–10) + (9/10)1 = –1/10
  – The payoff of P1 choosing C is (1/10)(–1) + (9/10)0 = –1/10
  – Is there a different strategy that can improve P1’s payoff?
  – Similar reasoning applies to P2


Finding mixed strategy equilibria

                                  P1: Choose S      P1: Choose C
                                  with prob. p      with prob. 1-p
P2: Choose S with prob. q         -10, -10          -1, 1
P2: Choose C with prob. 1-q       1, -1             0, 0

• Expected payoffs for P1 given P2’s strategy:
    P1 chooses S: q(–10) + (1–q)1 = –11q + 1
    P1 chooses C: q(–1) + (1–q)0 = –q
• In order for P2’s strategy to be part of a Nash equilibrium, P1 has to be indifferent between its two actions:
    –11q + 1 = –q or q = 1/10
    Similarly, p = 1/10
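The indifference condition is a single linear equation in q, so it can be solved in closed form for any 2x2 game. The function below is a sketch with a hypothetical name (`indifference_probability`); it uses exact fractions and does not check that the result lies between 0 and 1, which a valid mixed strategy requires.

```python
from fractions import Fraction


def indifference_probability(a, b, c, d):
    """Probability q that the opponent plays their first move, chosen so that
    we are indifferent between our two moves.

    a, b: our payoffs for our first move vs. the opponent's first / second move
    c, d: our payoffs for our second move vs. the opponent's first / second move
    Solves q*a + (1 - q)*b == q*c + (1 - q)*d for q.
    """
    denom = (a - b) - (c - d)
    if denom == 0:
        raise ValueError("no unique indifference point")
    return Fraction(d - b, denom)


# Chicken, from P1's point of view: S vs S, S vs C, C vs S, C vs C.
q = indifference_probability(-10, 1, -1, 0)
print(q)                   # probability with which P2 must play S
print(-11 * q + 1, -q)     # P1's two expected payoffs, which should match
```

For the Chicken payoffs this gives q = 1/10, and both of P1's expected payoffs come out to –1/10, matching the slide.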


Mixed strategy equilibria: Another example

                      Wife: Ballet   Wife: Football
Husband: Ballet       3, 2           0, 0
Husband: Football     0, 0           2, 3

• Pure strategy equilibria:
  – (Ballet, ballet) or (football, football)


Mixed strategy equilibria: Another example

                                   Wife: Ballet      Wife: Football
                                   w/ prob. p        w/ prob. 1-p
Husband: Ballet w/ prob. q         3, 2              0, 0
Husband: Football w/ prob. 1-q     0, 0              2, 3

• Payoff to wife assuming she chooses ballet: 3q
• Payoff to wife assuming she chooses football: 2(1–q)
• 3q = 2(1–q) or q = 2/5
• Payoff to husband assuming he chooses ballet: 2p
• Payoff to husband assuming he chooses football: 3(1–p)
• 2p = 3(1–p) or p = 3/5
• Mixed strategy equilibrium: wife picks ballet w/ probability 3/5 and husband picks football with probability 3/5


Mixed strategy equilibria: Another example

                                   Wife: Ballet      Wife: Football
                                   w/ prob. p        w/ prob. 1-p
Husband: Ballet w/ prob. q         3, 2              0, 0
Husband: Football w/ prob. 1-q     0, 0              2, 3

• Mixed strategy equilibrium: wife picks ballet w/ probability 3/5 and husband picks football with probability 3/5
• How often do they end up in different places?
    P(wife = ballet, husband = football or wife = football, husband = ballet)
    = 3/5 * 3/5 + 2/5 * 2/5 = 13/25
• What is the expected payoff for each?
    3/5 * 2/5 * 3 + 3/5 * 2/5 * 2 = 6/5
• What is the payoff for always going to the other’s preferred event?
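The arithmetic on this slide can be reproduced by enumerating the four joint outcomes. This is a small sketch; the dictionary names are illustrative, and payoffs are read from the table as (wife, husband).

```python
from fractions import Fraction

p = Fraction(3, 5)  # wife picks ballet
q = Fraction(2, 5)  # husband picks ballet (so football with probability 3/5)

# (wife payoff, husband payoff), keyed by (wife's choice, husband's choice)
PAYOFF = {
    ("ballet", "ballet"): (3, 2),
    ("ballet", "football"): (0, 0),
    ("football", "ballet"): (0, 0),
    ("football", "football"): (2, 3),
}
prob_wife = {"ballet": p, "football": 1 - p}
prob_husband = {"ballet": q, "football": 1 - q}

# Joint distribution over outcomes, assuming the two choices are independent.
joint = {(w, h): prob_wife[w] * prob_husband[h]
         for w in prob_wife for h in prob_husband}

p_apart = sum(pr for (w, h), pr in joint.items() if w != h)
wife_value = sum(pr * PAYOFF[k][0] for k, pr in joint.items())
husband_value = sum(pr * PAYOFF[k][1] for k, pr in joint.items())

print(p_apart, wife_value, husband_value)
```

Both expected payoffs come out to 6/5. For comparison, the table gives a payoff of 2 to a spouse who attends the other's preferred event together with them.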


Back to rock-paper-scissors

• Zero-sum game: want to find minimax solution for P1

                 P1: Rock   P1: Paper   P1: Scissors
P2: Rock         0,0        1,-1        -1,1
P2: Paper        -1,1       0,0         1,-1
P2: Scissors     1,-1       -1,1        0,0


Back to rock-paper-scissors

• Zero-sum game: want to find minimax solution for P1
• Let r, p, s = probability of P1 playing rock, paper, scissors

  Payoffs to P1:
                 P1: Rock   P1: Paper   P1: Scissors
P2: Rock         0          1           -1
P2: Paper        -1         0           1
P2: Scissors     1          -1          0

• Let’s find the expected payoffs for P1 given different actions of P2:
    P2 plays rock: u = r(0) + p(1) + s(–1) = p – s
    P2 plays paper: u = r(–1) + p(0) + s(1) = s – r
    P2 plays scissors: u = r(1) + p(–1) + s(0) = r – p
• P2 is trying to minimize P1’s utility, so we have
    u ≤ p – s; u ≤ s – r; u ≤ r – p
• To find r, p, s, maximize u subject to the above constraints and r + p + s = 1
  – Linear programming problem
  – Solution is (1/3, 1/3, 1/3)
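The stated solution can be sanity-checked without an LP solver. The sketch below computes the worst-case payoff u for a given mix and then searches a coarse grid of mixes exhaustively. The grid search is a stand-in for the linear program, not the method the slide has in mind, and the function names are illustrative.

```python
from fractions import Fraction

# P1's payoff, indexed [P2's move][P1's move]; moves ordered rock, paper, scissors.
P1_PAYOFF = [
    [0, 1, -1],   # P2 plays rock
    [-1, 0, 1],   # P2 plays paper
    [1, -1, 0],   # P2 plays scissors
]


def guaranteed_value(mix):
    """Worst-case expected payoff for P1 when P2 best-responds to mix = (r, p, s)."""
    assert sum(mix) == 1
    return min(sum(prob * pay for prob, pay in zip(mix, row)) for row in P1_PAYOFF)


def best_on_grid(n):
    """Search all mixes whose probabilities are multiples of 1/n (not an LP solver)."""
    mixes = [(Fraction(i, n), Fraction(j, n), Fraction(n - i - j, n))
             for i in range(n + 1) for j in range(n + 1 - i)]
    return max(mixes, key=guaranteed_value)


uniform = (Fraction(1, 3),) * 3
rock_heavy = (Fraction(1, 2), Fraction(1, 4), Fraction(1, 4))

print(guaranteed_value(uniform))     # value guaranteed by the uniform mix
print(guaranteed_value(rock_heavy))  # a lopsided mix can be exploited
print(best_on_grid(12))
```

Adding the three constraints gives 3u ≤ 0, so no mix can guarantee more than 0, and the uniform mix achieves exactly that.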


Computing Nash equilibria

• Any game with a finite number of players and a finite set of actions has at least one Nash equilibrium (possibly in mixed strategies)
• If a player has a dominant strategy, there exists a Nash equilibrium in which the player plays that strategy and the other player plays the best response to that strategy
• If both players have strictly dominant strategies, there exists a Nash equilibrium in which they play those strategies


Computing Nash equilibria

• For a two-player zero-sum game, simple linear programming problem
• For non-zero-sum games, the algorithm has worst-case running time that is exponential in the number of actions
• For more than two players, and for sequential games, things get pretty hairy


Nash equilibria and rational decisions

• If a game has a unique Nash equilibrium, it will be adopted if each player
  – is rational and the payoff matrix is accurate
  – doesn’t make mistakes in execution
  – is capable of computing the Nash equilibrium
  – believes that a deviation in strategy on their part will not cause the other players to deviate
  – there is common knowledge that all players meet these conditions

http://en.wikipedia.org/wiki/Nash_equilibrium


Nash equilibria and rational decisions

Do you have a dominant strategy?
  yes: Play dominant strategy
  no: Do you know what the opponent will do?
    yes: Maximize utility
    no: Is opponent rational?
      yes: Can agree on a Nash equilibrium?
        yes: Play the equilibrium strategy
        no: Maximize worst-case outcome
      no: Maximize worst-case outcome


More fun stuff: Ultimatum game

• Alice and Bob are given a sum of money S to divide
  – Alice picks A, the amount she wants to keep for herself
  – Bob picks B, the smallest amount of money he is willing to accept
  – If S – A ≥ B, Alice gets A and Bob gets S – A
  – If S – A < B, both players get nothing
• What is the Nash equilibrium?
  – Alice offers Bob the smallest amount of money he will accept: S – A = B
  – Alice offers Bob nothing and Bob is not willing to accept anything less than the full amount: A = S, B = S (both players get nothing)
• How would humans behave in this game?
  – If Bob perceives Alice’s offer as unfair, Bob will be likely to refuse
  – Is this rational?
    • Maybe Bob gets some positive utility for “punishing” Alice?
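The equilibria listed on this slide can be verified by enumeration. The sketch below assumes that amounts are whole units from 0 to S and uses S = 10. Both are assumptions made for illustration, and `payoffs` and `is_nash` are hypothetical helper names.

```python
def payoffs(S, A, B):
    """(Alice's payoff, Bob's payoff) under the rules on the slide."""
    return (A, S - A) if S - A >= B else (0, 0)


def is_nash(S, A, B):
    """True if neither player can do strictly better by changing only their own number.

    Amounts are restricted to whole units 0..S (an assumption for this sketch).
    """
    alice, bob = payoffs(S, A, B)
    alice_ok = all(payoffs(S, a, B)[0] <= alice for a in range(S + 1))
    bob_ok = all(payoffs(S, A, b)[1] <= bob for b in range(S + 1))
    return alice_ok and bob_ok


S = 10
equilibria = [(A, B) for A in range(S + 1) for B in range(S + 1) if is_nash(S, A, B)]
print(equilibria)
```

Under these assumptions the list contains every pair with S – A = B, plus the pair A = S, B = S in which both players get nothing.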


Repeated games

• What if the Prisoner’s Dilemma is played for many rounds and the players remember what happened in the previous rounds?

              Cooperate   Defect
Cooperate     -1,-1       0,-10
Defect        -10,0       -5,-5

• What if the number of games is fixed and known in advance to both players?
  – Then the equilibrium is still to defect
• If the number of games is random or unknown, cooperation may become an equilibrium strategy
  – Perpetual punishment: cooperate unless the other player has ever defected
  – Tit for tat: start by cooperating, repeat the other player’s previous move for all subsequent rounds
• In order for these strategies to work, the players must know that they have both adopted them
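The two strategies named above are simple enough to simulate. The sketch below plays a fixed number of rounds purely for illustration; it does not model a random or unknown horizon. The function names are hypothetical, and the payoffs are taken from the table as "my payoff given (my move, the other player's move)".

```python
# My payoff for one round, keyed by (my move, other player's move).
PAYOFF = {("C", "C"): -1, ("C", "D"): -10, ("D", "C"): 0, ("D", "D"): -5}


def tit_for_tat(my_history, their_history):
    """Start by cooperating, then repeat the other player's previous move."""
    return their_history[-1] if their_history else "C"


def perpetual_punishment(my_history, their_history):
    """Cooperate unless the other player has ever defected."""
    return "D" if "D" in their_history else "C"


def always_defect(my_history, their_history):
    return "D"


def play(strategy_a, strategy_b, rounds):
    """Total payoffs (a, b) after the given number of rounds."""
    hist_a, hist_b = [], []
    total_a = total_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        total_a += PAYOFF[(move_a, move_b)]
        total_b += PAYOFF[(move_b, move_a)]
        hist_a.append(move_a)
        hist_b.append(move_b)
    return total_a, total_b


print(play(tit_for_tat, tit_for_tat, 10))
print(play(tit_for_tat, always_defect, 10))
print(play(perpetual_punishment, always_defect, 10))
```

Over 10 rounds, two tit-for-tat players lose 10 each, while a defector facing either retaliating strategy gains only in the first round and ends far worse off than under mutual cooperation.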


Some multi-player games

• The diner’s dilemma
  – A group of people go out to eat and agree to split the bill equally. Each has a choice of ordering a cheap dish or an expensive dish (the utility of the expensive dish is higher than that of the cheap dish, but not enough for you to want to pay the difference)
  – Nash equilibrium is for everybody to get the expensive dish
• El Farol bar problem (W. Brian Arthur)
  – If less than 60% of the town’s population go to the bar, they will have a better time than if they stayed at home. If more than 60% of the people go to the bar, the bar will be too crowded and they will have a worse time than if they stayed at home
  – Nash equilibrium must be a mixed strategy


Mechanism design (inverse game theory)

• Assuming that agents pick rational strategies, how should we design the game to achieve a socially desirable outcome?

• We have multiple agents and a center that collects their choices and determines the outcome


Auctions

• Goals
  – Maximize revenue to the seller
  – Efficiency: make sure the buyer who values the goods the most gets them
  – Minimize transaction costs for buyer and sellers


Ascending-bid auction

• What’s the optimal strategy for a buyer?
  – Bid until the current bid value exceeds your private value
• Usually revenue-maximizing and efficient, unless the reserve price is set too low or too high
• Disadvantages
  – Collusion
  – Lack of competition
  – Has high communication costs


Sealed-bid auction

• Each buyer makes a single bid and communicates it to the auctioneer, but not to the other bidders
  – Simpler communication
  – More complicated decision-making: the strategy of a buyer depends on what they believe about the other buyers
  – Not necessarily efficient
• Sealed-bid second-price auction: the winner pays the price of the second-highest bid
  – Let V be your private value and B be the highest bid by any other buyer
  – If V > B, your optimal strategy is to bid above B: in particular, bid V
  – If V < B, your optimal strategy is to bid below B: in particular, bid V
  – Therefore, your dominant strategy is to bid V
  – This is a truth revealing mechanism
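The argument for bidding V can be checked exhaustively on a small grid of whole-number amounts. In this sketch a tie counts as a loss, which is an assumption since the slide does not specify a tie-breaking rule. The function names are illustrative.

```python
def utility(bid, V, B):
    """Payoff in a sealed-bid second-price auction with private value V when the
    highest competing bid is B. Ties count as a loss (assumption)."""
    return V - B if bid > B else 0


def truthful_is_weakly_dominant(V, max_amount=20):
    """Check on a grid that bidding V is never worse than any other bid, whatever B is."""
    amounts = range(max_amount + 1)
    return all(utility(V, V, B) >= utility(bid, V, B)
               for B in amounts for bid in amounts)


print(all(truthful_is_weakly_dominant(V) for V in range(21)))

# Two ways a non-truthful bid can hurt, with V = 10:
print(utility(12, 10, 11), utility(10, 10, 11))  # overbidding wins at a loss
print(utility(5, 10, 7), utility(10, 10, 7))     # underbidding forfeits a gain
```

Overbidding only changes the outcome when B lies between V and the bid, where winning costs more than V. Underbidding only changes it when B lies between the bid and V, where a profitable win is given up.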


Dollar auction

• A dollar bill is being auctioned off. It goes to the highest bidder, but the second-highest bidder also has to pay
  – Player 1 bids 1 cent
  – Player 2 bids 2 cents
  – …
  – Player 2 bids 98 cents
  – Player 1 bids 99 cents
    • If Player 2 passes, he loses 98 cents; if he bids $1, he might still come out even
  – So Player 2 bids $1
    • Now, if Player 1 passes, he loses 99 cents; if he bids $1.01, he only loses 1 cent
  – …
• What went wrong?
  – When figuring out the expected utility of a bid, a rational player should take into account the future course of the game
• How about Player 1 starts by bidding 99 cents?
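The escalation can be reproduced with a one-step-lookahead bidder. The sketch assumes one-cent raises and a bidder who compares only "raise and win now" against "pass and pay". This is exactly the short-sighted reasoning the slide criticizes, not a recommended strategy. All names are hypothetical.

```python
PRIZE = 100  # value of the dollar bill, in cents


def myopic_gain_from_raising(my_bid, leading_bid):
    """Raise payoff minus pass payoff for the trailing bidder, ignoring future rounds."""
    raise_payoff = PRIZE - (leading_bid + 1)  # outbid by one cent and win (for now)
    pass_payoff = -my_bid                     # pay the losing bid and get nothing
    return raise_payoff - pass_payoff


def escalate(max_bid):
    """Alternate one-cent raises while the myopic comparison favors raising."""
    trailing, leading = 0, 1  # Player 1 opens with 1 cent
    while leading < max_bid and myopic_gain_from_raising(trailing, leading) > 0:
        trailing, leading = leading, leading + 1
    return trailing, leading


print(myopic_gain_from_raising(98, 99))  # Player 2 deciding whether to bid $1
print(myopic_gain_from_raising(99, 100)) # Player 1 deciding whether to bid $1.01
print(escalate(500))                     # stops only at the artificial cap
```

The myopic advantage of raising stays the same at every step, so the loop never stops on its own. Only the artificial `max_bid` cap ends the bidding.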


Tragedy of the commons

• States want to set their policies for controlling emissions
  – Each state can reduce their emissions at a cost of -10 or continue to pollute at a cost of -5
  – If a state decides to pollute, -1 is added to the utility of every other state
• What is the dominant strategy for each state?
  – Continue to pollute
  – Each state incurs cost of -5-49 = -54 (with 50 states, each of the other 49 polluters adds -1)
  – If they all decided to deal with emissions, they would incur a cost of only -10 each
• Mechanism for fixing the problem:
  – Tax each state by the total amount by which they reduce the global utility (externality cost)
  – This way, continuing to pollute would now cost -54
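The numbers on this slide follow from a one-line utility function. The sketch assumes 50 states, which is inferred from the "-5-49" on the slide rather than stated there. The function and parameter names are illustrative.

```python
N_STATES = 50  # assumption inferred from the slide's "-5-49": 49 other states


def utility(pollutes, other_polluters, tax_externality=False):
    """Utility of one state given its own choice and how many other states pollute."""
    own = -5 if pollutes else -10
    harm_from_others = -1 * other_polluters
    # The tax charges a polluter the -1 it imposes on each of the other states.
    tax = -(N_STATES - 1) if (pollutes and tax_externality) else 0
    return own + harm_from_others + tax


others = range(N_STATES)  # possible numbers of other polluters: 0..49

# Without the tax, polluting is better whatever the others do (dominant strategy).
print(all(utility(True, k) > utility(False, k) for k in others))
print(utility(True, 49), utility(False, 0))  # everyone pollutes vs. everyone reduces

# With the tax, reducing emissions is better whatever the others do.
print(all(utility(False, k, True) > utility(True, k, True) for k in others))
print(utility(True, 0, tax_externality=True))
```

The tax leaves the harm caused by other states unchanged. It only makes each state's own decision reflect the full cost of polluting, which is what flips the dominant strategy.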


Game theory issues

• Is it applicable to real life?
  – Humans are not always rational
  – Utilities may not always be known
  – Other assumptions made by the game-theoretic model may not hold
  – Political difficulties may prevent theoretically optimal mechanisms from being implemented
• Could it be more applicable to AI than to real life?
  – Computing equilibria in complicated games is difficult
  – Relationship between Nash equilibrium and rational decision making is subtle