On the Emergence of Social Conventions: modeling, analysis and simulations
Yoav Shoham & Moshe Tennenholtz
Journal of Artificial Intelligence 94(1-2), pp. 139-166, July 1997.
CSRG, presented by Souvik Das, 11/02/05



Authors

• Yoav Shoham
  – Professor of Computer Science, Stanford University
  – AI, MAS, Game Theory, e-commerce
  – http://ai.stanford.edu/~shoham/ , email: [email protected]

• Moshe Tennenholtz
  – Professor of Industrial Engineering and Management at the Technion – Israel Institute of Technology
  – AI, MAS, Protocol evolution
  – http://iew3.technion.ac.il/Home/Users/Moshet.phtml , email: [email protected]


Definition

• Social Convention
  – Limiting agents’ choices to induce subgames
  – Such restrictions are social constraints, in cooperative games
  – When restrictions leave only one strategy for all agents it is a social convention


Three basic concepts

• Maximin
  – Guarantees highest minimal payoff
  – Rationality of other players or common knowledge need not be assumed
• Nash Equilibrium
  – No player deviates unilaterally from the equilibrium solution without hurting his/her payoff
  – Common knowledge and rationality assumed
• Pareto Optimality
  – A joint action is Pareto optimal if increasing one agent's payoff makes another agent's payoff suffer


Coordination and Cooperation games

• Coordination
  – M =
       1,1    -1,-1
      -1,-1    1,1
  – Maximin gives –1 while the other two concepts give 1 as payoff

• Cooperation
  – M =
       1,1    -3,3
       3,-3   -2,-2
  – Maximin and Nash give –2 but this is Pareto dominated
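The payoffs quoted on this slide can be checked mechanically. The following is a minimal sketch, assuming the two matrices are read as row-player/column-player payoff pairs with strategy 0 listed first; the function names are illustrative and not taken from the paper.

```python
from itertools import product

# Payoff matrices as read from the slide: game[a][b] = (row payoff, column payoff).
COORDINATION = [[(1, 1), (-1, -1)], [(-1, -1), (1, 1)]]
COOPERATION = [[(1, 1), (-3, 3)], [(3, -3), (-2, -2)]]


def maximin_value(game):
    """Best worst-case payoff the row player can guarantee."""
    return max(min(cell[0] for cell in row) for row in game)


def pure_nash(game):
    """Joint actions from which neither player gains by deviating alone."""
    equilibria = []
    for a, b in product(range(2), repeat=2):
        row_ok = all(game[a][b][0] >= game[a2][b][0] for a2 in range(2))
        col_ok = all(game[a][b][1] >= game[a][b2][1] for b2 in range(2))
        if row_ok and col_ok:
            equilibria.append((a, b))
    return equilibria


def pareto_optimal(game):
    """Joint actions whose payoff pair is not dominated by another cell."""
    cells = list(product(range(2), repeat=2))

    def dominated(cell):
        p = game[cell[0]][cell[1]]
        others = (game[a][b] for a, b in cells)
        return any(q != p and q[0] >= p[0] and q[1] >= p[1] for q in others)

    return [cell for cell in cells if not dominated(cell)]


print(maximin_value(COORDINATION), pure_nash(COORDINATION))
print(maximin_value(COOPERATION), pure_nash(COOPERATION), pareto_optimal(COOPERATION))
```

In the cooperation game the only pure Nash equilibrium is the joint action (1, 1), which does not appear among the Pareto optimal cells, matching the remark that the –2 outcome is Pareto dominated.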


Motivation

• Under what conditions do conventions eventually emerge?
• How efficiently are they achieved?
• What are the different parameters affecting speed of convergence?


Game Model

• Symmetric
• Population size N >= 4
• Each game is 2-player, 2-choice
• Typical coordination and cooperation games
• Payoff matrix M of each game g
      M =  x,x   u,v
           v,u   y,y


Game model cont.

• Social law sl induces subgame g_sl, where g is the unrestricted game
• Rationality test of sl
  – Let V be the game variable used for determining rationality
  – Let V(g) denote the value of that variable in game g
• A social law sl is rational with respect to g if
  – V(g) < V(g_sl)

Note: Rationality here does not imply optimality


Example

• In the coordination game, there are two possible rational social conventions with respect to maximin
  – Restriction to either one of the strategies
• In the cooperation game, there is only one possible rational social convention with respect to maximin
  – Cooperate
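The example can be reproduced with the rationality test V(g) < V(g_sl), taking V to be the maximin value. This is a sketch under assumptions: the payoff numbers are those of the coordination and cooperation slide, strategy 0 stands for "cooperate" in the cooperation game, and the helper names are hypothetical.

```python
# Payoffs assumed from the coordination/cooperation slide: game[a][b] = (row, column).
COORDINATION = [[(1, 1), (-1, -1)], [(-1, -1), (1, 1)]]
COOPERATION = [[(1, 1), (-3, 3)], [(3, -3), (-2, -2)]]


def maximin(game, allowed):
    """Row player's maximin value when both players may only use `allowed` strategies."""
    return max(min(game[a][b][0] for b in allowed) for a in allowed)


def rational_conventions(game):
    """Single-strategy restrictions sl that satisfy V(g) < V(g_sl) for V = maximin."""
    unrestricted = maximin(game, [0, 1])
    return [s for s in (0, 1) if maximin(game, [s]) > unrestricted]


print(rational_conventions(COORDINATION))  # both restrictions pass the test
print(rational_conventions(COOPERATION))   # only strategy 0 (cooperate) passes
```

Restricting the cooperation game to strategy 1 leaves the maximin value at –2, so it fails the strict inequality and is not a rational convention.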


The Game Dynamics

• N-k-g stochastic social game
  – Unbounded sequence of ordered tuples of k agents selected at random from the given N agents
  – Random k agents meet repeatedly and play game g
  – In each iteration, action selection by agents is synchronous


Action Selection

• An agent switches to a new action iff the total payoff obtained from that action in the last m >= N >= 4 iterations is greater than the total payoff obtained from its present action over the same period

• This action update rule is called HCR or Highest Cumulative Reward

• Complicated weighted HCR rules based on simple HCR possible

• m puts finite bound on history
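The update rule and the random pairing can be put together in a small simulation. This is only a sketch under stated assumptions: the memory holds the last m games in which the agent itself took part, the game is the coordination game with payoffs +1 and –1, and the names `hcr_action` and `simulate` are illustrative rather than the authors' code. The run stops the first time all agents play the same action and does not check that the state is stable afterwards.

```python
import random
from collections import deque


def hcr_action(current, memory, n_actions=2):
    """Highest Cumulative Reward: switch only to an action whose remembered
    total payoff is strictly greater than that of the current action."""
    totals = [0] * n_actions
    for action, payoff in memory:
        totals[action] += payoff
    best = max(range(n_actions), key=lambda a: totals[a])
    return best if totals[best] > totals[current] else current


def simulate(n_agents=10, m=10, max_iters=20000, seed=0):
    """N-2-g game with coordination payoffs; returns (iterations, final actions)."""
    rng = random.Random(seed)
    actions = [rng.randrange(2) for _ in range(n_agents)]
    memories = [deque(maxlen=m) for _ in range(n_agents)]  # bounded history
    for t in range(max_iters):
        if len(set(actions)) == 1:
            return t, actions
        i, j = rng.sample(range(n_agents), 2)  # random ordered pair of agents
        payoff = 1 if actions[i] == actions[j] else -1
        memories[i].append((actions[i], payoff))
        memories[j].append((actions[j], payoff))
        # Both players update from their own memory before either change is applied.
        new_i = hcr_action(actions[i], memories[i])
        new_j = hcr_action(actions[j], memories[j])
        actions[i], actions[j] = new_i, new_j
    return None, actions


steps, final = simulate()
print(steps, final)
```

A tie between cumulative payoffs leaves the agent on its present action, which follows the "more than" wording of the rule.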


Theorem 1

• Given an N-2-g stochastic social agreement game
  – For every ε > 0, there exists a bounded number Λ such that if the system runs for Λ iterations, the probability that a social convention is reached is at least 1-ε
  – Once the convention is reached, it is never left
  – Reaching the convention guarantees each agent a payoff no less than the maximin value initially guaranteed
  – If a social convention exists for g that is rational w.r.t. the maximin value, then the social convention reached will be rational w.r.t. maximin
• Corollary
  – The HCR rule guarantees eventual convergence to a rational convention for coordination and cooperation social games


Theorem 2

• Efficiency measured in terms of number of iterations T(N) required to get desired behavior

• T(N) = Ω ( N log N ) for any update rule R which guarantees convergence


Proof: Theorem 1

• Case I:
  – Coordination games ( y > 0, u < 0, v < 0 )
    • A rational social convention will restrict all agents to the same strategy
    • A pair of agents (i,j) with the same strategy meet together till all other agents forget their past
    • i meets x (not equal to j) and then meets j. This step continues in a loop till i meets all agents.
    • If Λ = k g(N) f(N), then the probability that the convention is not reached is e^{-k}
    • f(N) and g(N) are bounded by an exponent of the form N^s, where s is a polynomial in m and N
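The link between this bound and the ε of Theorem 1 takes one step. A short worked inequality, reading the stated failure probability e^{-k} as an upper bound (an assumption about how the slide is meant):

```latex
\Pr\bigl[\text{no convention after } \Lambda = k\, g(N)\, f(N) \text{ iterations}\bigr] \le e^{-k},
\qquad
e^{-k} \le \varepsilon \iff k \ge \ln\frac{1}{\varepsilon}.
```

So for a target failure probability ε it suffices to take k = ⌈ln(1/ε)⌉, which gives the bounded Λ of Theorem 1 as ⌈ln(1/ε)⌉ g(N) f(N).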


Proof: Theorem 1

• Case II:
  – Cooperation ( y < 0, u < 0, v > 0 )
    • Similar structure of proof as Case I
    • The major change is in the creation of a pair of cooperative agents
    • Achieved by meeting a pair of agents till a pair of non-cooperative agents forget their past
    • These historyless non-cooperative agents meet till all other non-cooperative agents forget their history
    • Then they meet sequentially and the convention is reached in a similar way as in the coordination game


Proof: Theorem 2

• Total number of permutations possible for choosing two players from N is ^N P_2 = N(N-1)
• Ways in which a particular player is chosen is N
• Probability of it not being chosen as player 1 or player 2 in a 2-person game in one iteration is (1 - 1/(N-1))^2
• Probability of a player not being chosen for a stretch of T(N) = (N-1) f(N) games is (1 - 1/(N-1))^{2(N-1)f(N)}, which converges to e^{-2f(N)}


Proof: Theorem 2 cont.

• Consider the random variable Y_N(i), the number of agents that did not participate in any of the first i iterations
• E[Y_N(T(N))] going to 0 implies that a convention is established
• If e^{-2f(N)} > 1/N, then E[Y_N(T(N))] > 1, implying no convergence
• Therefore, for convergence, e^{-2f(N)} < 1/N
• Taking natural log, f(N) > 0.5 log N
• Thus, T(N) = Ω( N log N )
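The threshold f(N) = 0.5 log N can be checked numerically. The sketch below uses the per-iteration probability exactly as given in the proof and the identity E[Y_N] = N × (probability that one agent is never chosen), which follows from linearity of expectation; the function names are illustrative.

```python
import math


def expected_unselected(n_agents, f):
    """Estimate of E[Y_N(T(N))] for T(N) = (N-1) * f, using the slide's formula."""
    iterations = (n_agents - 1) * f
    p_never_chosen = (1 - 1 / (n_agents - 1)) ** (2 * iterations)
    return n_agents * p_never_chosen


def threshold_f(n_agents):
    """Value of f(N) at which e^{-2 f(N)} equals 1/N."""
    return 0.5 * math.log(n_agents)


n = 100
for scale in (0.5, 1.0, 2.0):
    f = scale * threshold_f(n)
    print(scale, round(expected_unselected(n, f), 4))
```

Below the threshold the expected number of agents that never played stays above 1, at the threshold it is close to 1, and above it the expectation drops well below 1, which is the behaviour the N log N lower bound relies on.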


Evolution of coordination: Experimental Results

• Coordination games achieve conventions rapidly with the HCR rule while cooperation games do not
• Parameters considered are
  – Update frequency
    • How frequently an agent uses its action update rule HCR
  – Memory restarts
    • Previous history forgotten, but current action retained
  – Memory window
    • Previous m iterations in which the agent participated versus previous m iterations regardless of whether the agent participated in those


Update frequency

The efficiency of convention formation decreases as the delay in update increases


Memory Restarts

With decreasing memory restart distance, convention evolution efficiency decreases


Memory Window

Increasing memory size indefinitely is not helpful
Old information is not as relevant as new information


Co-varying memory size and update frequency

• When update frequency drops below 100, it becomes better to use statistics of only last window than entire history

• When agents have update delays, they rely on old information

• Systems with large update delays should have frequent memory restarts


Convention Evolution Dynamics

• As the number of players remaining to conform to convention decreases, the rate of convergence slows down


Extended Coordination Game

• Symmetric 2-person-s-choice game where the payoff for both agents is x > 0 iff they perform the same action, and –x otherwise

• New update rule used in this case is External Majority or EM rule

• EM rule
  – Strategy i is adopted if it was observed in other agents more often than any other strategy
  – Reduces to HCR rule for s=2
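The EM rule as stated above can be written in a few lines. This is a minimal sketch, assuming the agent keeps a list of the strategies it observed its partners play and that a tie for the most frequent strategy leaves the agent on its current strategy (the slide does not say how ties are handled); `em_action` is a hypothetical name.

```python
from collections import Counter


def em_action(current, observed):
    """External Majority: adopt the strategy observed in other agents strictly
    more often than any other strategy; otherwise keep the current one."""
    counts = Counter(observed)
    if not counts:
        return current  # nothing observed yet
    ranked = counts.most_common(2)
    top_strategy, top_count = ranked[0]
    if len(ranked) == 2 and ranked[1][1] == top_count:
        return current  # no strict majority among observed strategies
    return top_strategy


print(em_action(0, [1, 1, 2]))
print(em_action(0, [1, 2]))
```

The function works for any number of strategies s, since it only counts the labels it is given.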


Experimental results

• Addition of more potential conventions decreases the efficiency of convention formation in a less than logarithmic fashion


General Comments

• These conventions are not necessarily Nash Equilibria

• Constraints are viewed as regulations laid down by central authority such as government

• If central authority present and is able to enforce certain rules, then they may as well enforce the efficient convention

• In proofs of theorems, statements are made without validation


Comments on Selection Rule

• HCR rule replaces the Best Response or BR rule used in evolutionary stable strategies and stochastically stable strategies
• Two important criteria for selection function are obliviousness and locality
  – Selection function is independent of identity of players
  – Selection function is purely a function of player’s personal history
• Obliviousness is similar to Young’s approach
• Young* uses BR which is global
• Rationale for using local update is that individual decision making usually happens in absence of global information
• Is HCR really local?

*The Evolution of Conventions, H P Young, Econometrica, Vol 61, No. 1, (Jan 1993), 57-84


Comments on the Experiment

• It is not clear
  – How many agents play games in each iteration and how they are chosen
  – How one ensures that a particular pair of agents play and the rest forget their play history in instances where the memory window is based upon the last m iterations in which the agents participated


Comparison with Young’s Work

• Model differences
  – BR vs HCR
  – Anonymity of history
  – Incompleteness of information measured by k/m ratio
  – A convention defined as state h consisting of m repetitions of a pure strategy which is an absorbing state
  – No central authority to dictate restrictions
  – Mistakes (deviation from rational behavior) assumed
  – Adaptive play’s incomplete sampling helps it to break out from sub-optimal cycles
  – As long as m/k and k are large, for 2x2 games, the stochastically stable equilibrium is independent of m and k


Questions?