Summer School July 2010


Page 1: Summer School July 2010

8/8/2019 Summer School July 2010

http://slidepdf.com/reader/full/summer-school-july-2010 1/164

1

Automated negotiations: Agents interacting with other automated agents and with humans

Sarit Kraus
Department of Computer Science
Bar-Ilan University
University of Maryland

[email protected]

http://www.cs.biu.ac.il/~sarit/

Page 2: Summer School July 2010

Negotiations

“A discussion in which interested parties exchange information and come to an agreement.” (Davis and Smith, 1977)

Page 3: Summer School July 2010

Negotiations

Negotiation is an interpersonal decision-making process necessary whenever we cannot achieve our objectives single-handedly.

Page 4: Summer School July 2010

Agent environments

• Teams of agents that need to coordinate joint activities; problems: distributed information, distributed decision solving, local conflicts.
• Open agent environments, with agents acting in the same environment; problems: need motivation to cooperate, conflict resolution, trust, distributed and hidden information.

Page 5: Summer School July 2010

Open Agent Environments

• Consist of:
  ◦ Automated agents developed by or serving different people or organizations.
  ◦ People with a variety of interests and institutional affiliations.
• The computer agents are “self-interested”; they may cooperate to further their interests.
• The set of agents is not fixed.

Page 6: Summer School July 2010

Open Agent Environments (examples)

• Agents support people
  ◦ Collaborative interfaces
  ◦ CSCW: Computer Supported Cooperative Work systems
  ◦ Cooperative learning systems
  ◦ Military-support systems
• Agents act as proxies for people
  ◦ Coordinating schedules
  ◦ Patient care-delivery systems
  ◦ Online auctions
• Groups of agents act autonomously alongside people
  ◦ Simulation systems for education and training
  ◦ Computer games and other forms of entertainment
  ◦ Robots in rescue operations
  ◦ Software personal assistants


Page 8: Summer School July 2010

Examples

• Monitoring electricity networks (Jennings)
• Distributed design and engineering (Petrie et al.)
• Distributed meeting scheduling (Sen & Durfee)
• Teams of robotic systems acting in hostile environments (Balch & Arkin, Tambe)
• Collaborative Internet-agents (Etzioni & Weld, Weiss)
• Collaborative interfaces (Grosz & Ortiz, Andre)
• Information agent on the Internet (Klusch)
• Cooperative transportation scheduling (Fischer)
• Supporting hospital patient scheduling (Decker & Jin)
• Intelligent Agents for Command and Control (Sycara)

Page 9: Summer School July 2010

Types of agents

• Fully rational agents
• Bounded rational agents

Page 10: Summer School July 2010

Using other disciplines’ results

• No need to start from scratch!
• Modification and adjustment are required; AI gives insights and complementary methods.
• Is it worth it to use formal methods for multi-agent systems?

Page 11: Summer School July 2010

Negotiating with rational agents

• Quantitative decision making
  ◦ Maximizing expected utility
  ◦ Nash equilibrium, Bayesian Nash equilibrium
• Automated negotiator
  ◦ Model the scenario as a game
  ◦ The agent computes (if complexity allows) the equilibrium strategy, and acts accordingly.

(Kraus, Strategic Negotiation in Multiagent Environments, MIT Press 2001).

Page 12: Summer School July 2010

Short introduction to game theory

Game Theory studies situations of strategic interaction in which each decision maker's plan of action depends on the plans of the other decision makers.

Page 13: Summer School July 2010

Decision Theory (reminder)
(How to make decisions)

Decision Theory = Probability Theory (deals with chance) + Utility Theory (deals with outcomes)

• Fundamental idea
  ◦ The MEU (maximum expected utility) principle
  ◦ Weigh the utility of each outcome by the probability that it occurs

Page 14: Summer School July 2010

Basic Principle

Given the probabilities $P(out_1 \mid A_i), P(out_2 \mid A_i), \dots$ and the utilities $U(out_1), U(out_2), \dots$

Expected utility of an action $A_i$:

$$EU(A_i) = \sum_{out_j \in OUT} U(out_j)\, P(out_j \mid A_i)$$

Choose the action $A_i$ that maximizes $EU$:

$$MEU = \arg\max_{A_i \in Ac} \sum_{out_j \in OUT} U(out_j)\, P(out_j \mid A_i)$$
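The MEU rule can be illustrated with a minimal Python sketch. The action names, outcome names, probabilities and utilities below are invented for illustration and do not come from the slides; only the formula EU(A_i) = Σ U(out_j)·P(out_j | A_i) is taken from the text.

```python
def expected_utility(outcome_probs, utility):
    """EU(A_i) = sum over outcomes of U(out_j) * P(out_j | A_i)."""
    return sum(utility[out] * p for out, p in outcome_probs.items())


def meu_action(actions, utility):
    """Return the action with maximum expected utility, and that utility."""
    best = max(actions, key=lambda a: expected_utility(actions[a], utility))
    return best, expected_utility(actions[best], utility)


# Hypothetical utilities of three outcomes (illustrative numbers only).
utility = {"deal": 10.0, "no_deal": 0.0, "bad_deal": -5.0}

# Hypothetical conditional distributions P(outcome | action).
actions = {
    "tough_offer": {"deal": 0.3, "no_deal": 0.7},
    "soft_offer": {"deal": 0.6, "bad_deal": 0.4},
}

best_action, best_eu = meu_action(actions, utility)
print(best_action, best_eu)
```

With these made-up numbers the tough offer has expected utility 3 and the soft offer 4, so the MEU principle selects the soft offer.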

Page 15: Summer School July 2010

Risk Averse, Risk Neutral, Risk Seeking

[Figure: three plots of utility (vertical axis) against money (horizontal axis, in millions), one per panel: RISK AVERSE, RISK NEUTRAL, RISK SEEKER.]

Page 16: Summer School July 2010

Game Description

• Players
  ◦ Who participates in the game?
• Actions / Strategies
  ◦ What can each player do?
  ◦ In what order do the players act?
• Outcomes / Payoffs
  ◦ What is the outcome of the game?
  ◦ What are the players' preferences over the possible outcomes?

Page 17: Summer School July 2010

Game Description (cont)

• Information
  ◦ What do the players know about the parameters of the environment or about one another?
  ◦ Can they observe the actions of the other players?
• Beliefs
  ◦ What do the players believe about the unknown parameters of the environment or about one another?
  ◦ What can they infer from observing the actions of the other players?

Page 18: Summer School July 2010

Strategies and Equilibrium

• Strategy
  ◦ Complete plan, describing an action for every contingency
• Nash Equilibrium
  ◦ Each player's strategy is a best response to the strategies of the other players
  ◦ Equivalently: No player can improve his payoffs by changing his strategy alone
  ◦ Self-enforcing agreement. No need for formal contracting
• Other equilibrium concepts also exist

Page 19: Summer School July 2010

Classification of Games

• Depending on the timing of moves
  ◦ Games with simultaneous moves
  ◦ Games with sequential moves
• Depending on the information available to the players
  ◦ Games with perfect information
  ◦ Games with imperfect (or incomplete) information
• We concentrate on non-cooperative games
  ◦ Groups of players cannot deviate jointly
  ◦ Players cannot make binding agreements

Page 20: Summer School July 2010

Games with Simultaneous Moves and Perfect Information

• All players choose their actions simultaneously or just independently of one another
• There is no private information
• All aspects of the game are known to the players
• Representation by game matrices
• Often called normal form games or strategic form games

Page 21: Summer School July 2010

Matching Pennies

Example of a zero-sum game. Strategic issue of competition.

Page 22: Summer School July 2010

Prisoner’s Dilemma

Each player can cooperate or defect

                      Column
                  cooperate   defect
Row   cooperate    -1,-1      -10,0
      defect        0,-10     -8,-8

Main issue: Tension between social optimality and individual incentives.
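The tension named above can be checked mechanically. The following Python sketch uses the payoffs from the slide's matrix, written as (row, column) pairs, and verifies that defect is a best response to every opponent action while mutual cooperation has the highest total payoff. The helper names are illustrative choices, not part of the original material.

```python
C, D = "cooperate", "defect"

# Payoffs (row player, column player) taken from the matrix on the slide.
payoff = {
    (C, C): (-1, -1), (C, D): (-10, 0),
    (D, C): (0, -10), (D, D): (-8, -8),
}


def row_best_response(col_action):
    """Row's payoff-maximizing action against a fixed column action."""
    return max((C, D), key=lambda r: payoff[(r, col_action)][0])


def col_best_response(row_action):
    """Column's payoff-maximizing action against a fixed row action."""
    return max((C, D), key=lambda c: payoff[(row_action, c)][1])


# Individual incentives: defect is the best response to anything.
defect_dominates = (
    all(row_best_response(c) == D for c in (C, D))
    and all(col_best_response(r) == D for r in (C, D))
)

# Social optimality: total payoff of each profile.
social_welfare = {profile: sum(p) for profile, p in payoff.items()}
best_social = max(social_welfare, key=social_welfare.get)

print(defect_dominates, best_social, social_welfare[(D, D)])
```

Both players defecting yields a total of -16, whereas mutual cooperation would yield -2, which is the gap between individual incentives and social optimality.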

Page 23: Summer School July 2010

Coordination Games

A supplier and a buyer need to decide whether to adopt a new purchasing system.

                   Buyer
                 new      old
Supplier   new   20,20    0,0
           old   0,0      5,5

Page 24: Summer School July 2010

Battle of sexes

                         Wife
                    football   shopping
Husband   football    2,1        0,0
          shopping    0,0        1,2

The game involves both the issues of coordination and competition

Page 25: Summer School July 2010

Definition of Nash Equilibrium

• A game has n players.
• Each player i has a strategy set $S_i$
  ◦ These are his possible actions
• Each player has a payoff function
  ◦ $p_i : S \to \mathbb{R}$, where $S = S_1 \times \dots \times S_n$ is the set of strategy profiles
• A strategy $t_i$ in $S_i$ is a best response if there is no other strategy in $S_i$ that produces a higher payoff, given the opponents' strategies

Page 26: Summer School July 2010

Definition of Nash Equilibrium

• A strategy profile is a list $(s_1, s_2, \dots, s_n)$ of the strategies each player is using
• If each strategy is a best response given the other strategies in the profile, the profile is a Nash equilibrium
• Why is this important?
  ◦ If we assume players are rational, they will play Nash strategies
  ◦ Even less-than-rational play will often converge to Nash in repeated settings

Page 27: Summer School July 2010

An Example of a Nash Equilibrium

              Column
             a      b
Row   a     1,2    0,1
      b     2,1    1,0

(b,a) is a Nash equilibrium:
• Given that column is playing a, row’s best response is b
• Given that row is playing b, column’s best response is a
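The best-response check can be written as a short exhaustive search over pure strategy profiles. This is a minimal sketch for two-player matrix games; the function name and the dictionary encoding of the payoff matrix are assumptions made for illustration. The payoffs are those of the example above and of the Battle of sexes game, both read as (row, column).

```python
from itertools import product


def pure_nash_equilibria(payoffs, row_actions, col_actions):
    """Return all profiles where neither player gains by deviating alone."""
    equilibria = []
    for r, c in product(row_actions, col_actions):
        row_ok = all(payoffs[(r, c)][0] >= payoffs[(r2, c)][0]
                     for r2 in row_actions)
        col_ok = all(payoffs[(r, c)][1] >= payoffs[(r, c2)][1]
                     for c2 in col_actions)
        if row_ok and col_ok:
            equilibria.append((r, c))
    return equilibria


# The 2x2 example: keys are (row action, column action).
example = {
    ("a", "a"): (1, 2), ("a", "b"): (0, 1),
    ("b", "a"): (2, 1), ("b", "b"): (1, 0),
}
example_ne = pure_nash_equilibria(example, ["a", "b"], ["a", "b"])

# Battle of sexes: (husband, wife).
bos = {
    ("football", "football"): (2, 1), ("football", "shopping"): (0, 0),
    ("shopping", "football"): (0, 0), ("shopping", "shopping"): (1, 2),
}
bos_ne = pure_nash_equilibria(bos, ["football", "shopping"],
                              ["football", "shopping"])

print(example_ne, bos_ne)
```

The search confirms that (b, a) is the only pure equilibrium of the example, while Battle of sexes has two, one at each coordinated outcome.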

Page 28: Summer School July 2010

Mixed strategies

• Unfortunately, not every game has a pure strategy equilibrium.
  ◦ Rock-paper-scissors
• However, every finite game has a mixed strategy Nash equilibrium
• Each action is assigned a probability of play
• Player is indifferent between actions, given these probabilities

Page 29: Summer School July 2010

Mixed Strategies

                         Wife
                    football   shopping
Husband   football    2,1        0,0
          shopping    0,0        1,2

Page 30: Summer School July 2010

Mixed strategy

• Instead, each player selects a probability associated with each action
  ◦ Goal: utility of each action is equal
  ◦ Players are indifferent to choices at this probability
• a = probability husband chooses football
• b = probability wife chooses shopping
• Since payoffs must be equal, for husband:
  ◦ b*1 = (1-b)*2, so b = 2/3
• For wife:
  ◦ a*1 = (1-a)*2, so a = 2/3
• In each case, expected payoff is 2/3
  ◦ 2/9 of time go to football, 2/9 shopping, 5/9 miscoordinate
• If they could synchronize ahead of time they could do better.
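The indifference conditions and the resulting probabilities can be verified with exact rational arithmetic. The sketch below assumes the Battle of sexes payoffs read as (husband, wife), with (football, football) = (2, 1) and (shopping, shopping) = (1, 2); variable names are illustrative.

```python
from fractions import Fraction

F, S = "football", "shopping"

# Payoffs (husband, wife); keys are (husband action, wife action).
payoff = {(F, F): (2, 1), (F, S): (0, 0), (S, F): (0, 0), (S, S): (1, 2)}

a = Fraction(2, 3)  # probability husband chooses football
b = Fraction(2, 3)  # probability wife chooses shopping

# Husband's expected payoff from each pure action, given the wife's mix.
husband_football = (1 - b) * payoff[(F, F)][0] + b * payoff[(F, S)][0]
husband_shopping = (1 - b) * payoff[(S, F)][0] + b * payoff[(S, S)][0]

# Wife's expected payoff from each pure action, given the husband's mix.
wife_football = a * payoff[(F, F)][1] + (1 - a) * payoff[(S, F)][1]
wife_shopping = a * payoff[(F, S)][1] + (1 - a) * payoff[(S, S)][1]

# Outcome probabilities under independent mixing.
p_both_football = a * (1 - b)
p_both_shopping = (1 - a) * b
p_miscoordinate = 1 - p_both_football - p_both_shopping

print(husband_football, husband_shopping, wife_football, wife_shopping)
print(p_both_football, p_both_shopping, p_miscoordinate)
```

Each player earns 2/3 in expectation, less than the payoff of 1 that even the less preferred coordinated outcome would give, which is why synchronizing ahead of time helps.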

Page 31: Summer School July 2010

Rock paper scissors

                       Column
                 rock     paper    scissors
Row   rock       0,0      -1,1     1,-1
      paper      1,-1     0,0      -1,1
      scissors   -1,1     1,-1     0,0

Page 32: Summer School July 2010

Setup

• Player 1 plays rock with probability $p_r$, scissors with probability $p_s$, paper with probability $1 - p_r - p_s$
• $Utility_2(rock) = 0 \cdot p_r + 1 \cdot p_s - 1 \cdot (1 - p_r - p_s) = 2p_s + p_r - 1$
• $Utility_2(scissors) = 0 \cdot p_s + 1 \cdot (1 - p_r - p_s) - 1 \cdot p_r = 1 - 2p_r - p_s$
• $Utility_2(paper) = 0 \cdot (1 - p_r - p_s) + 1 \cdot p_r - 1 \cdot p_s = p_r - p_s$

Player 2 wants to choose a probability for each action so that the expected payoff for each action is the same.

Page 33: Summer School July 2010

Setup

$$q_r (2p_s + p_r - 1) = q_s (1 - 2p_r - p_s) = (1 - q_r - q_s)(p_r - p_s)$$

• It turns out (after some algebra) that the optimal mixed strategy is to play each action 1/3 of the time
• Intuition: What if you played rock half the time? Your opponent would then play paper half the time, and you’d lose more often than you won
• So you’d decrease the fraction of times you played rock, until your opponent had no ‘edge’ in guessing what you’ll do
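The intuition can be tested numerically. In this sketch the uniform mix comes from the slide, while the "rock-heavy" mix (rock 1/2, paper 1/4, scissors 1/4) is an assumed example of over-playing rock; the helper names are illustrative.

```python
from fractions import Fraction

ACTIONS = ("rock", "paper", "scissors")
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def payoff(mine, theirs):
    """+1 for a win, -1 for a loss, 0 for a tie."""
    if mine == theirs:
        return 0
    return 1 if BEATS[mine] == theirs else -1


def expected_payoffs(opponent_mix):
    """Expected payoff of each pure action against a mixed opponent."""
    return {
        mine: sum(p * payoff(mine, theirs) for theirs, p in opponent_mix.items())
        for mine in ACTIONS
    }


# Against the uniform mix every action earns 0: no edge to exploit.
uniform = {action: Fraction(1, 3) for action in ACTIONS}
vs_uniform = expected_payoffs(uniform)

# Assumed deviation: rock half the time, the rest split evenly.
rock_heavy = {"rock": Fraction(1, 2), "paper": Fraction(1, 4),
              "scissors": Fraction(1, 4)}
vs_rock_heavy = expected_payoffs(rock_heavy)
best_reply = max(vs_rock_heavy, key=vs_rock_heavy.get)

print(vs_uniform, vs_rock_heavy, best_reply)
```

Against the rock-heavy mix, paper earns a positive expected payoff, so an opponent who notices the bias gains an edge; against the uniform mix no reply does better than zero.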

Page 34: Summer School July 2010

Extensive Form Games

[Figure: game tree with moves H and T at each node; leaf payoffs (1,2), (4,0), (2,1), (2,1).]

Any finite game of perfect information has a pure strategy Nash equilibrium. It can be found by backward induction.

Chess is a finite game of perfect information. Therefore it is a “trivial” game from a game theoretic point of view.

Page 35: Summer School July 2010

Extensive Form Games - Intro

• A game can have complex temporal structure
• Information
  ◦ set of players
  ◦ who moves when and under what circumstances
  ◦ what actions are available when called upon to move
  ◦ what is known when called upon to move
  ◦ what payoffs each player receives
• Foundation is a game tree

Page 36: Summer School July 2010

Example: Cuban Missile Crisis

[Figure: game tree. Khrushchev moves first and chooses Arm or Retract. Retract ends the game with payoffs (-1, 1). After Arm, Kennedy chooses Fold, with payoffs (10, -10), or Nuke, with payoffs (-100, -100). Payoffs are listed as (Khrushchev, Kennedy).]

Pure strategy Nash equilibria: (Arm, Fold) and (Retract, Nuke)

Page 37: Summer School July 2010

Subgame perfect equilibrium & credible threats

• Proper subgame = subtree (of the game tree) whose root is alone in its information set
• Subgame perfect equilibrium
  ◦ Strategy profile that is in Nash equilibrium in every proper subgame (including the root), whether or not that subgame is reached along the equilibrium path of play

Page 38: Summer School July 2010

Example: Cuban Missile Crisis

[Figure: the Khrushchev/Kennedy game tree again. Retract gives (-1, 1); Arm followed by Fold gives (10, -10); Arm followed by Nuke gives (-100, -100). Payoffs are listed as (Khrushchev, Kennedy).]

Pure strategy Nash equilibria: (Arm, Fold) and (Retract, Nuke)

Pure strategy subgame perfect equilibria: (Arm, Fold)

Conclusion: Kennedy’s Nuke threat was not credible.
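Backward induction on this small tree can be sketched in a few lines of Python. The tree encoding is an assumption made for illustration: a leaf is a payoff pair (Khrushchev, Kennedy), and an internal node is a pair (index of the player to move, mapping from action to subtree). The plan is stored per player, which is a simplification that only works because each player moves at most once here.

```python
PLAYERS = ("Khrushchev", "Kennedy")

# Khrushchev (player 0) moves first; after Arm, Kennedy (player 1) replies.
game = (0, {
    "Retract": (-1, 1),
    "Arm": (1, {"Fold": (10, -10), "Nuke": (-100, -100)}),
})


def backward_induction(node):
    """Return (payoffs, plan), where plan maps each mover to its action."""
    if not isinstance(node[1], dict):
        return node, {}  # leaf: node is already the payoff pair
    player, children = node
    best_action, best_payoffs, plan = None, None, {}
    for action, child in children.items():
        payoffs, sub_plan = backward_induction(child)
        plan.update(sub_plan)
        # The mover keeps the action that maximizes its own payoff.
        if best_payoffs is None or payoffs[player] > best_payoffs[player]:
            best_action, best_payoffs = action, payoffs
    plan[PLAYERS[player]] = best_action
    return best_payoffs, plan


spe_payoffs, spe_plan = backward_induction(game)
print(spe_payoffs, spe_plan)
```

Solving Kennedy's subgame first shows that Fold beats Nuke for him, so Khrushchev anticipates Fold and chooses Arm. The profile (Retract, Nuke) never appears because it relies on a choice Kennedy would not make once his node is reached.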

Page 39: Summer School July 2010

Type of games

Diplomacy

Page 40: Summer School July 2010

Take it or leave it deals

The rules of the game:
1. You will be randomly paired up with someone in the other section; this pairing will remain completely anonymous.
2. One of you will be chosen (by coin flip) to be either the Proposer or the Responder in this experiment.
3. The Proposer gets to make an offer to split $100 in some proportion with the Responder. So the Proposer can offer $x to the Responder, proposing to keep $100-x for themselves.
4. The Responder must decide what is the lowest amount offered by the Proposer that he/she will accept; i.e. “I will accept any offer which is greater than or equal to $y.”
5. If the Responder accepts the offer made by the Proposer, they split the sum according to the proposal. If the Responder rejects, both parties lose their shares.
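The payoff rule of this take-it-or-leave-it game is simple enough to state as a function. The sketch below follows rules 3 to 5; the function name and the argument names `offer_x` and `threshold_y` are illustrative choices.

```python
def ultimatum_payoffs(offer_x, threshold_y, pie=100):
    """Return (proposer payoff, responder payoff) for one round.

    offer_x is the amount offered to the Responder; threshold_y is the
    lowest offer the Responder has committed to accept.
    """
    if not 0 <= offer_x <= pie:
        raise ValueError("offer must lie between 0 and the pie size")
    if offer_x >= threshold_y:
        return pie - offer_x, offer_x  # accepted: split as proposed
    return 0, 0  # rejected: both parties lose their shares


accepted = ultimatum_payoffs(offer_x=40, threshold_y=30)
rejected = ultimatum_payoffs(offer_x=10, threshold_y=30)
print(accepted, rejected)
```

An offer of $40 against a threshold of $30 is accepted and leaves the Proposer $60, while an offer of $10 against the same threshold is rejected and leaves both with nothing.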

Page 41: Summer School July 2010

An example of buyer/seller negotiation

Page 42: Summer School July 2010

Bargaining

[Figure: price line showing the seller's RP s, the final price x and the buyer's RP b, with the ZOPA (zone of possible agreement) between s and b.]

• Seller's RP: seller wants s or more
• Buyer's RP: buyer wants b or less
• Seller's surplus and buyer's surplus lie on either side of the final price x

Page 43: Summer School July 2010

Bargaining

• If b < s: negative bargaining zone, no possible agreements
• If b > s: positive bargaining zone, agreement possible
• (x - s): seller's surplus; (b - x): buyer's surplus
• The total surplus to divide, (x - s) + (b - x) = b - s, is independent of x: a constant-sum game!
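The constant-sum observation follows from simple arithmetic, which the sketch below makes explicit. The reservation prices 100 and 150 and the candidate final prices are invented numbers used only for illustration.

```python
def bargaining_zone(s, b):
    """Return the (low, high) price range of possible agreements, or None.

    s is the seller's reservation price, b is the buyer's.
    """
    return (s, b) if b >= s else None


def surpluses(s, b, x):
    """Return (seller surplus, buyer surplus) for an agreed price x."""
    if not s <= x <= b:
        raise ValueError("price lies outside the bargaining zone")
    return x - s, b - x


# Hypothetical reservation prices.
s, b = 100, 150
zone = bargaining_zone(s, b)

# Whatever price is agreed, the two surpluses add up to b - s.
totals = {x: sum(surpluses(s, b, x)) for x in (100, 120, 135, 150)}

no_zone = bargaining_zone(s=150, b=100)
print(zone, totals, no_zone)
```

Moving the price x only shifts surplus from one side to the other; the total stays at b - s = 50 in this example.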

Page 44: Summer School July 2010

Positive bargaining zone

[Figure: price axis marked with the buyer's target point, the buyer's reservation point, the seller's reservation point and the seller's target point. The seller's bargaining range and the buyer's bargaining range overlap; the overlap is the POSITIVE bargaining zone.]

Page 45: Summer School July 2010

Negative bargaining zone

[Figure: price axis marked with the buyer's target point, the buyer's reservation point, the seller's reservation point and the seller's target point. The seller's bargaining range and the buyer's bargaining range do not overlap; the gap is the NEGATIVE bargaining zone.]

Page 46: Summer School July 2010

Single issue negotiation

• Agents a and b negotiate over a pie of size 1
• Offer: (x, y), x + y = 1
• Deadline: n and discount factor: δ
• Utility:

$$U_a((x,y),t) = \begin{cases} x\,\delta^{t-1} & \text{if } t \le n \\ 0 & \text{otherwise} \end{cases} \qquad U_b((x,y),t) = \begin{cases} y\,\delta^{t-1} & \text{if } t \le n \\ 0 & \text{otherwise} \end{cases}$$

• The agents negotiate using Rubinstein’s alternating offers protocol

Page 47: Summer School July 2010

Alternating offers protocol

Time   Offer          Respond
1      a: (x1, y1)    b: accept/reject
2      b: (x2, y2)    a: accept/reject
...
n

Page 48: Summer School July 2010

How much should an agent offer if there is only one time period?

Let n = 1 and a be the first mover

Equilibrium strategies

Agent a’s offer: propose to keep the whole pie (1, 0); agent b will accept this

Page 49: Summer School July 2010

Equilibrium strategies for n = 2

δ = 1/4, first mover: a

Offer: (x, y), x: a’s share; y: b’s share
Optimal offers obtained using backward induction

Time   Offering agent   Offer         Utility
1      a → b            (3/4, 1/4)    3/4; 1/4    (agreement)
2      b → a            (0, 1)        0; 1/4

The offer (3/4, 1/4) forms a P.E. Nash equilibrium
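The backward-induction argument generalizes to any deadline n. The following Python sketch assumes the utility U = share · δ^(t-1) from the single-issue slide and that agent a proposes at odd time steps; the function name and return format are illustrative. At the deadline the proposer keeps the whole pie; at any earlier step the proposer offers the responder exactly δ times what the responder would keep as proposer in the next step, which makes the responder indifferent between accepting and waiting.

```python
from fractions import Fraction


def equilibrium_offers(n, delta):
    """Return [(t, proposer, (a_share, b_share)), ...] for t = 1..n."""
    # share[t] is the fraction of the pie the proposer at time t keeps.
    share = {n: Fraction(1)}
    for t in range(n - 1, 0, -1):
        # The responder accepts y if y * delta**(t-1) >= share[t+1] * delta**t.
        share[t] = 1 - delta * share[t + 1]
    rows = []
    for t in range(1, n + 1):
        proposer = "a" if t % 2 == 1 else "b"
        keep, give = share[t], 1 - share[t]
        offer = (keep, give) if proposer == "a" else (give, keep)
        rows.append((t, proposer, offer))
    return rows


two_period = equilibrium_offers(n=2, delta=Fraction(1, 4))
one_period = equilibrium_offers(n=1, delta=Fraction(1, 4))
print(two_period)
```

For n = 2 and δ = 1/4 this reproduces the table: a offers (3/4, 1/4) at time 1, and b would offer (0, 1) at time 2. Calling the function with other values of n and δ shows how the first mover's share changes with the deadline and the discount factor.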

Page 50: Summer School July 2010

Effect of discount factor and deadline on the equilibrium outcome

• What happens to first mover’s share as δ increases?
• What happens to second mover’s share as δ increases?
• As deadline increases, what happens to first mover’s share?
• Likewise for second mover?

Page 51: Summer School July 2010


Effect of δ and deadline on the agents’ shares

Page 52: Summer School July 2010

Multiple issues

• Set of issues: S = {1, 2, …, m}. Each issue is a pie of size 1
• The issues are divisible
• Deadline: n (for all the issues)
• Discount factor: $\delta_c$ for issue c
• Utility: $U(x, t) = \sum_c U(x_c, t)$

Page 53: Summer School July 2010

Multi-issue procedures

• Package deal procedure: The issues are bundled and discussed together as a package
• Simultaneous procedure: The issues are negotiated in parallel but independently of each other
• Sequential procedure: The issues are negotiated sequentially one after another

Page 54: Summer School July 2010

Package deal procedure

• Issues negotiated using the alternating offers protocol
• An offer specifies a division for each of the m issues
• The agents are allowed to accept/reject a complete offer
• The agents may have different preferences over the issues
• The agents can make tradeoffs across the issues to maximize their utility; this leads to a Pareto optimal outcome

Page 55: Summer School July 2010

Utility for two issues

$U_a = 2X + Y$, $U_b = X + 2Y$

Page 56: Summer School July 2010

Making tradeoffs

$U_b = 2$

What is a’s utility for $U_b = 2$?

Page 57: Summer School July 2010

Example for two issues

DEADLINE: n = 2
DISCOUNT FACTORS: $\delta_1 = \delta_2 = 1/2$
UTILITIES: $U_a = (1/2)^{t-1}(x_1 + 2x_2)$; $U_b = (1/2)^{t-1}(2y_1 + y_2)$

Time   Offering agent   Package offer
1      a → b            [(1/4, 3/4); (1, 0)] OR [(3/4, 1/4); (0, 1)]    (agreement)
2      b → a            [(0, 1); (0, 1)]                                 $U_b = 1.5$

The outcome is not symmetric

Page 58: Summer School July 2010

P.E. Nash equilibrium strategies

For t = n:
• The offering agent takes 100 percent of all the issues
• The receiving agent accepts

For t < n (for agent a):
• OFFER [x, y] s.t. $U_b(y, t) = EQUB(t+1)$. If there is more than one such [x, y], perform trade-offs across issues to find the best offer
• RECEIVE [x, y]: if $U_a(x, t) \ge EQUA(t+1)$, ACCEPT, else REJECT

EQUA(t+1) is a’s equilibrium utility for t+1
EQUB(t+1) is b’s equilibrium utility for t+1

Page 59: Summer School July 2010

Making trade-offs – divisible issues

Agent a’s trade-off problem at time t:

TR: Find a package [x, y] to

$$\text{Maximize } \sum_{c=1}^{m} k^a_c x_c \quad \text{subject to } \sum_{c=1}^{m} k^b_c y_c \ge EQUB(t+1), \quad 0 \le x_c \le 1,\; 0 \le y_c \le 1$$

This is the fractional knapsack problem

Page 60: Summer School July 2010

Making trade-offs – divisible issues

Agent a’s perspective (time t):
• Agent a considers the m issues in increasing order of $k^a_c / k^b_c$ and assigns to b the maximum possible share for each of them until b’s cumulative utility equals EQUB(t+1)
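The greedy rule for divisible issues can be sketched as follows. This is a minimal illustration, assuming that every weight in `k_b` is strictly positive, that x_c + y_c = 1 on each issue, and that `target_b` stands for EQUB(t+1) expressed in the same units as the time-t weights; the function name is hypothetical. The test data reuse the two-issue example with weights (1, 2) for a and (2, 1) for b, where b must receive utility 1.5 at time 1.

```python
from fractions import Fraction


def package_offer(k_a, k_b, target_b):
    """Greedy trade-off: return (x, y), the per-issue shares of a and b."""
    m = len(k_a)
    y = [Fraction(0)] * m
    remaining = Fraction(target_b)
    # Give away first the issues that are cheap for a relative to b's gain.
    order = sorted(range(m), key=lambda c: Fraction(k_a[c]) / Fraction(k_b[c]))
    for c in order:
        if remaining <= 0:
            break
        y[c] = min(Fraction(1), remaining / Fraction(k_b[c]))
        remaining -= y[c] * k_b[c]
    if remaining > 0:
        raise ValueError("b's target utility cannot be reached")
    x = [1 - share for share in y]
    return x, y


x, y = package_offer(k_a=[1, 2], k_b=[2, 1], target_b=Fraction(3, 2))
u_a = sum(k * s for k, s in zip([1, 2], x))
u_b = sum(k * s for k, s in zip([2, 1], y))
print(x, y, u_a, u_b)
```

The result is the package [(1/4, 3/4); (1, 0)]: agent a concedes three quarters of the issue it values less and keeps all of the issue it values more. The alternative package [(3/4, 1/4); (0, 1)] also gives b utility 1.5 but leaves a with 3/4 instead of 9/4.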

Page 61: Summer School July 2010

Equilibrium strategies

For t = n:
• The offering agent takes 100 percent of all the issues
• The receiving agent accepts

For t < n (for agent a):
• OFFER [x, y] s.t. $U_b(y, t) = EQUB(t+1)$. If there is more than one such [x, y], perform trade-offs across issues to find the best offer
• RECEIVE [x, y]: if $U_a(x, t) \ge EQUA(t+1)$, ACCEPT, else REJECT


Page 63: Summer School July 2010

Making trade-offs – indivisible issues

Agent a’s trade-off problem at time t is to find a package [x, y] that

$$\text{Maximize } \sum_{c=1}^{m} k^a_c x_c \quad \text{s.t. } \sum_{c=1}^{m} k^b_c y_c \ge EQUB(t+1), \quad x_c \in \{0, 1\},\; y_c \in \{0, 1\}$$

For indivisible issues, this is the integer knapsack problem
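For a handful of indivisible issues the trade-off problem can be solved by exhaustive search, as in the sketch below. This is not the approximation scheme mentioned in the key points; it is a brute-force illustration that takes time exponential in m. It assumes each issue goes entirely to one agent (x_c + y_c = 1), and the weights and target in the example are invented.

```python
from itertools import product


def best_indivisible_package(k_a, k_b, target_b):
    """Return (a's utility, x, y) for the best feasible package, or None."""
    best = None
    for x in product((0, 1), repeat=len(k_a)):
        y = tuple(1 - xc for xc in x)  # each issue goes wholly to a or to b
        if sum(kb * yc for kb, yc in zip(k_b, y)) < target_b:
            continue  # b would reject: utility below its equilibrium level
        value = sum(ka * xc for ka, xc in zip(k_a, x))
        if best is None or value > best[0]:
            best = (value, x, y)
    return best


# Hypothetical weights for three issues and a hypothetical target for b.
result = best_indivisible_package(k_a=[1, 2, 3], k_b=[3, 2, 1], target_b=4)
print(result)
```

With these numbers agent a keeps only the third issue, which it values most and b values least, and hands the first two issues to b.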

Page 64: Summer School July 2010

Key points

• Single issue:
  ◦ Time to compute equilibrium is O(n)
  ◦ The equilibrium is not unique, it is not symmetric
• Multiple divisible issues (exact solution):
  ◦ Time to compute equilibrium for t = 1 is O(mn)
  ◦ The equilibrium is Pareto optimal, it is not unique, it is not symmetric
• Multiple indivisible issues (approx. solution):
  ◦ There is an FPTAS to compute approximate equilibrium
  ◦ The equilibrium is Pareto optimal, it is not unique, it is not symmetric

Page 65: Summer School July 2010

Negotiation on data allocation in multi-server environment

R. Azulay-Schwartz and S. Kraus. Negotiation On Data Allocation in Multi-Agent Environments. Autonomous Agents and Multi-Agent Systems journal 5(2):123-172, 2002.

Page 66: Summer School July 2010

Cooperative Web Servers

• The Data and Information System component of the Earth Observing System (EOSDIS) of NASA is a distributed knowledge system which supports archival and distribution of data at multiple and independent servers.

Page 67: Summer School July 2010

Cooperative Web Servers - cont.

• Each data collection, or file, is called a dataset. The datasets are huge, so each dataset has only one copy.
• The current policy for data allocation in NASA is static: old datasets are not reallocated; each new dataset is located at the server with the nearest topics (defined according to the topics of the datasets stored by this server).

Page 68: Summer School July 2010

Related Work - File Allocation Problem

The original problem:
How to distribute files among computers, in order to optimize the system performance.

Our problem:
How can self-motivated servers decide about distribution of files, when each server has its own objectives.

Page 69: Summer School July 2010

Environment Description

• There are several information servers. Each server is located at a different geographical area.
• Each server receives queries from the clients in its area, and sends documents as responses to queries. These documents can be stored locally, or in another server.

Page 70: Summer School July 2010

Environment Description

[Figure: a client in area i sends a query to server i and receives document(s); server i can pass the query to server j in area j, at some distance, which returns the document(s).]

Page 71: Summer School July 2010

Basic Definitions

• SERVERS: the set of the servers.
• DATASETS: the set of datasets (files) to be allocated.
• Allocation: a mapping of each dataset to one of the servers. The set of all possible allocations is denoted by Allocs.
• U: the utility function of each server.


Page 73: Summer School July 2010

Utility Function

• $U_{server}(alloc, t)$ specifies the utility of server from alloc ∈ Allocs at time t.
• It consists of
  ◦ The utility from the assignment of each dataset.
  ◦ The cost of negotiation delay.

$$U_{server}(alloc, 0) = \sum_{x \in DATASETS} V_{server}(x, alloc(x))$$

Page 74: Summer School July 2010

Parameters of utility

• query price: payment for retrieved documents.
• usage(ds,s): the expected number of documents of dataset ds requested by clients in the area of server s.
• storage costs, retrieve costs, answer costs.

Page 75: Summer School July 2010

8/8/2019 Summer School July 2010

http://slidepdf.com/reader/full/summer-school-july-2010 75/164

75

Cost over timeCost over time

•Cost of communication and computation time of the negotiation.

•Loss of unused information: new documents cannot be used until the negotiation ends.

•Dataset usage and storage cost are assumed to decrease over time, with the same discount ratio (p-1).

•Thus, there is a constant discount ratio of the utility from an allocation:

$U_{server}(alloc, t) = \delta^{t} \cdot U_{server}(alloc, 0) - t \cdot C$
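A small numeric sketch of the time-dependent utility may help. The values of δ, C and the initial utility below are assumptions chosen only for illustration.

```python
def utility_over_time(u0: float, t: int, delta: float, cost_per_period: float) -> float:
    """U_server(alloc, t) = delta**t * U_server(alloc, 0) - t * C."""
    return (delta ** t) * u0 - t * cost_per_period


# Hypothetical numbers: U(alloc, 0) = 100, delta = 0.9, C = 2.
values = [utility_over_time(100.0, t, delta=0.9, cost_per_period=2.0) for t in range(4)]
```

With a discount ratio below 1 and a positive per-period cost, every period of delay lowers the utility of the same allocation.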

Assumptions

•Each server prefers any agreement over continuation of the negotiation indefinitely.

•The utility of each server from the conflict allocation is always greater than or equal to 0.

•OFFERS: the set of allocations that are preferred by all the agents over opting out.

Negotiation Analysis - Simultaneous Responses

•Simultaneous responses: A server, when responding, is not informed of the other responses.

•Theorem: For each offer x ∈ OFFERS, there is a subgame-perfect equilibrium of the bargaining game, with the outcome x offered and unanimously accepted in period 0.

Choosing the Allocation

•The designers of the servers can agree in advance on a joint technique for choosing x:
◦ giving each server its conflict utility
◦ maximizing a social welfare criterion:
◦ the sum of the servers’ utilities,
◦ or the generalized Nash product of the servers’ utilities: $\prod_{s} (U_s(x) - U_s(conflict))$
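One way to picture such a joint technique is a brute-force search over Allocs. The sketch below is an assumption-laden illustration, not the method used in the work: `toy_utility`, `LOCAL_VALUE` and the zero conflict utilities are hypothetical, and exhaustive enumeration is only feasible for tiny instances.

```python
from itertools import product
from math import prod
from typing import Callable, Dict, Iterator, Optional, Sequence

Allocation = Dict[str, str]  # dataset -> server that stores it


def all_allocations(datasets: Sequence[str], servers: Sequence[str]) -> Iterator[Allocation]:
    """Enumerate Allocs: every mapping of datasets to servers."""
    for holders in product(servers, repeat=len(datasets)):
        yield dict(zip(datasets, holders))


def choose_allocation(datasets: Sequence[str], servers: Sequence[str],
                      utility: Callable[[str, Allocation], float],
                      conflict: Dict[str, float],
                      criterion: str = "nash") -> Optional[Allocation]:
    """Pick the allocation maximizing the Nash product or the sum of utilities."""
    best, best_score = None, None
    for alloc in all_allocations(datasets, servers):
        gains = [utility(s, alloc) - conflict[s] for s in servers]
        if any(g < 0 for g in gains):
            continue  # some server would do better under the conflict allocation
        if criterion == "nash":
            score = prod(gains)
        else:
            score = sum(utility(s, alloc) for s in servers)
        if best_score is None or score > best_score:
            best, best_score = alloc, score
    return best


# Hypothetical utilities: a per-dataset value when stored locally, 1 when remote.
LOCAL_VALUE = {"s1": {"ds1": 4.0, "ds2": 1.0}, "s2": {"ds1": 1.0, "ds2": 3.0}}


def toy_utility(server: str, alloc: Allocation) -> float:
    return sum(LOCAL_VALUE[server][ds] if holder == server else 1.0
               for ds, holder in alloc.items())


conflict = {"s1": 0.0, "s2": 0.0}
by_nash = choose_allocation(["ds1", "ds2"], ["s1", "s2"], toy_utility, conflict, "nash")
by_sum = choose_allocation(["ds1", "ds2"], ["s1", "s2"], toy_utility, conflict, "sum")
```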

Experimental Evaluation

•How do the parameters influence the results of the negotiation?

•vcost(alloc): the variable costs due to an allocation (excludes storage_cost and the gains due to queries).

•vcost_ratio: the ratio of the vcosts when using negotiation to the vcosts of the static allocation.

Effect of Parameters on the Results

•As the number of servers grows, vcost_ratio increases (more complex computations) ☹.

•As the number of datasets grows, vcost_ratio decreases (negotiation is more beneficial) ☺.

•Changing the mean usage did not influence vcost_ratio significantly 😐, but vcost_ratio decreases as the standard deviation of the usage increases ☺.

Influence of Parameters - cont.

•When the standard deviation of the distances between servers increases, vcost_ratio decreases ☺.

•When the distance between servers increases, vcost_ratio decreases ☺.

•In the domains tested:
◦ answer_cost: vcost_ratio ☹.
◦ storage_cost: vcost_ratio ☹.
◦ retrieve_cost: vcost_ratio ☺.
◦ query_price: vcost_ratio ☺.

Incomplete Information

•Each server knows:
◦ The usage frequency of all datasets, by clients from its area
◦ The usage frequency of datasets stored in it, by all clients

BARGAINING

[Figure: the zone of possible agreement (ZOPA) on a price line with marks sL, sH, bL and bH. Seller's RP: the seller wants s or more. Buyer's RP: the buyer wants b or less. A final price x inside the ZOPA divides it into the seller's surplus and the buyer's surplus.]
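The ZOPA picture can be restated as a short sketch, assuming the usual reading that the seller's surplus is x - s and the buyer's surplus is b - x. The function names and the numbers are illustrative only.

```python
from typing import Optional, Tuple


def zopa(seller_rp: float, buyer_rp: float) -> Optional[Tuple[float, float]]:
    """Return the zone of possible agreement [s, b], or None when s > b."""
    return (seller_rp, buyer_rp) if seller_rp <= buyer_rp else None


def surpluses(price: float, seller_rp: float, buyer_rp: float) -> Tuple[float, float]:
    """Seller's surplus is x - s; buyer's surplus is b - x."""
    return price - seller_rp, buyer_rp - price


zone = zopa(seller_rp=60.0, buyer_rp=100.0)
split = surpluses(price=75.0, seller_rp=60.0, buyer_rp=100.0)
no_zone = zopa(seller_rp=100.0, buyer_rp=60.0)
```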

Definition of a Bayesian game

•N is the set of players.
•Ω is the set of the states of nature.
•A_i is the set of actions for player i. A = A_1 × A_2 × … × A_n.
•T_i is the type set of player i. For each state of nature, the game will have different types of players (one type per player).
•u_i : Ω × A → R is the payoff function for player i.
•p_i is the probability distribution over Ω for each player i; that is, each player has a different view of the probability distribution over the states of nature. During the game, the players never know the exact state of nature.

Solution concepts for Bayesian games

A (Bayesian) Nash equilibrium is a strategy profile, together with beliefs specified for each player about the types of the other players, that maximizes the expected utility of each player given their beliefs about the other players' types and given the strategies played by the other players.
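The verbal definition can be written as a condition on strategies. The notation below is an assumption added for illustration: σ_i : T_i → A_i is player i's strategy and τ_j : Ω → T_j maps a state of nature to player j's type; neither symbol appears in the slides.

```latex
\sigma_i^*(t_i) \in \arg\max_{a_i \in A_i}
  \sum_{\omega \in \Omega} p_i(\omega \mid t_i)\,
  u_i\bigl(\omega,\; a_i,\; \sigma_{-i}^*(\tau_{-i}(\omega))\bigr)
\qquad \text{for every } i \in N \text{ and every } t_i \in T_i .
```

In words: given its own type, each player picks an action that is a best response in expectation to the type-dependent actions of the others.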

Incomplete Information - cont.

•A revelation mechanism:

•First, all the servers simultaneously report all their private information:
◦ for each dataset, the past usage of the dataset by this server.
◦ for each server, the past usage of each local dataset by this server.

•Then, the negotiation proceeds as in the complete information case.

Incomplete Information - cont.

•Lemma: There is a Nash equilibrium where each server tells the truth about its past usage of remote datasets, and the other servers' usage of its local datasets.

•Lies concerning details about local usage of local datasets are intractable.

Summary: negotiation on data allocation

•We have considered the data allocation problem in a distributed environment.

•We have presented the utility function of the servers, which expresses their preferences.

•We have proposed using a negotiation protocol for solving the problem.

•For incomplete information situations, a revelation process was added to the protocol.

Agent-Human Negotiation

Computers interacting with people

•Computer has the control
•Computer persuades human
•Human has the control

Culture sensitive agents

The development of a standardized agent to be used in the collection of data for studies on culture and negotiation

r agents negotiate well across cultures

Semi-autonomous cars

Medical applications

Gertner Institute for Epidemiology and Health Policy Research

Automated care-taker

I will be too tired in the afternoon!!!

I scheduled an appointment for you at the physiotherapis

Try to reschedule and fail

The physiotherapist has no other available appoi

How about resting before the appointment?

Security applications

Collect, Update, Analyze, Prioritize

People often follow suboptimal decision strategies

Irrationalities attributed to
◦ sensitivity to context
◦ lack of knowledge of own preferences
◦ the effects of complexity
◦ the interplay between emotion and cognition
◦ the problem of self control
◦ bounded rationality

Agents that play repeatedly with the same person

AutONA [BY03]

•Buyers and sellers
•Using data from previous experiments
•Belief function to model opponent
•Implemented several tactics and heuristics
◦ including a concession mechanism

A. Byde, M. Yearworth, K.-Y. Chen, and C. Bartolini. AutONA: A system for automated multiple 1-1 negotiation. In CEC, pages 59–67, 2003

Cliff-Edge

•Virtual learning and reinforcement learning
•Using data from previous interactions
•Implemented several tactics and heuristics
◦ qualitative in nature
•Non-deterministic behavior, via means of randomization

R. Katz and S. Kraus. Efficient agents for cliff edge environments with a large set of decision options. In AAMAS, pages 697–704, 2006

Agents that play with the same person only once

General opponent* modeling

Challenges of human opponent modeling

•Small number of examples
◦ difficult to collect data on people

•Noisy data
◦ people are inconsistent (the same person may act differently)
◦ people are diverse

Guessing Heuristic

•Multi-issue, multi-attribute, with incomplete information
•Domain independent
•Implemented several tactics and heuristics
◦ including a concession mechanism

C. M. Jonker, V. Robu, and J. Treur. An agent architecture for multi-attribute negotiation using incomplete preference information. JAAMAS, 15(2):221–252, 2007

PURB Agent

•Building blocks: Personality model, Utility function, Rules for guiding choice.
•Key idea: models personality traits of its negotiation partners over time.
•Uses decision theory to decide how to negotiate, with a utility function that depends on models and other environmental features.
•Pre-defined rules facilitate computation.

Plays as well as people; adapts to c

QOAgent [LIN08]

Played at least as well as p

•Multi-issue, multi-attribute, with incomplete information
•Domain independent
•Implemented several tactics and heuristics
◦ qualitative in nature
•Non-deterministic behavior, also via means of randomization

R. Lin, S. Kraus, J. Wilkenfeld, and J. Barry. Negotiating with bounded rational agents in environments with incomplete information using an automated agent. Artificial Intelligence, 172(6-7):823–851, 2008

Is it possible to improve the QO

Yes, if you have data

KBAgent

•Multi-issue, multi-attribute, with incomplete information
•Domain independent
•Implemented several tactics and heuristics
◦ qualitative in nature
•Non-deterministic behavior, also via means of randomization
•Using data from previous interactions

Y. Oshrat, R. Lin, and S. Kraus. Facing the challenge of human-agent negotiations via effective general opponent modeling. In AAMAS, 2009

General opponent modeling

•Challenge: sparse data from past negotiation sessions between people

•Technique: Kernel Density Estimation
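As a rough illustration of the technique named above, a one-dimensional Gaussian kernel density estimate can be built from a handful of observations. The data, the bandwidth and the choice of a Gaussian kernel are assumptions for this sketch and are not taken from the KBAgent work.

```python
import math
from typing import Callable, Sequence


def gaussian_kde(samples: Sequence[float], bandwidth: float) -> Callable[[float], float]:
    """Return a density estimate built from Gaussian kernels centred on the samples."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x: float) -> float:
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return density


# Hypothetical data: utility values of offers that people accepted in past sessions.
accepted = [310.0, 350.0, 400.0, 420.0, 480.0]
density = gaussian_kde(accepted, bandwidth=40.0)
```

Smoothing each sparse observation into a kernel is what lets a small database still give a non-zero estimate for offers that were never seen exactly.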

General opponent modeling

•Estimate, for the other party:
◦ the likelihood that it accepts an offer
◦ the likelihood that it makes an offer
◦ its expected average utility

•The estimation is done separately for each possible agent type:
◦ The type of a negotiator is determined using a simple Bayes' classifier

•Use estimation for decision making
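The type-classification step can be sketched with Bayes' rule. The type names, priors and likelihoods below are hypothetical; in practice the likelihoods would come from the per-type estimates described above.

```python
from typing import Dict


def posterior_over_types(prior: Dict[str, float], likelihood: Dict[str, float]) -> Dict[str, float]:
    """Bayes' rule: P(type | observation) is proportional to P(observation | type) * P(type)."""
    unnormalised = {t: prior[t] * likelihood[t] for t in prior}
    total = sum(unnormalised.values())
    return {t: v / total for t, v in unnormalised.items()}


# Hypothetical types and numbers, for illustration only.
prior = {"type_a": 0.5, "type_b": 0.5}
likelihood = {"type_a": 0.2, "type_b": 0.6}  # P(observed offer | type)
post = posterior_over_types(prior, likelihood)
most_likely = max(post, key=post.get)
```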

KBAgent as the job candidate

Best result: 20,000, Project manager, with leased car, 20% pension funds, fast promotion, 8 hours

Offers shown (KBAgent and Human):
•20,000, Team Manager, with leased car, pension: 20%, slow promotion, 9 hours
•12,000, Programmer, without leased car, pension: 10%, fast promotion, 10 hours
•20,000, Project manager, without leased car, pension: 20%, slow promotion, 9 hours

KBAgent as the job candidate

Best agreement: 20,000, Project manager, with leased car, 20% pension funds, fast promotion, 8 hours

Offers shown (KBAgent and Human):
•20,000, Programmer, with leased car, pension: 10%, slow promotion, 9 hours
•12,000, Programmer, without leased car, pension: 10%, fast promotion, 10 hours
•20,000, Team Manager, with leased car, pension: 20%, slow promotion, 9 hours

Round 7

Experiments

•172 grad and undergrad students in Computer Science
•People were told they may be playing a computer agent or a person.
•Scenarios:
◦ Employer-Employee
◦ Tobacco Convention: England vs. Zimbabwe
•Learned from 20 human-human games

Results:

Comparing KBAgent to others

Player                Type            Average Utility Value (std)
KBAgent vs. people    Employer        468.9 (37.0)
QOAgent vs. people    Employer        417.4 (135.9)
People vs. People     Employer        408.9 (106.7)
People vs. QOAgent    Employer        431.8 (80.8)
People vs. KBAgent    Employer        380.4 (48.5)
KBAgent               Job Candidate   482.7 (57.5)
QOAgent               Job Candidate   397.8 (86.0)
People vs. People     Job Candidate   310.3 (143.6)
People vs. QOAgent    Job Candidate   320.5 (112.7)
People vs. KBAgent    Job Candidate   370.5 (58.9)

Main results

•In comparison to the QOAgent:
◦ The KBAgent achieved higher utility values than the QOAgent
◦ More agreements were accepted by people
◦ The sum of utility values (social welfare) was higher when the KBAgent was involved
•The KBAgent achieved significantly higher utility values than people
•Results demonstrate the negotiation proficiency of the KBAgent

ponent* modeling improves agent ba

Automated care-taker

I will be too tired in the afternoon!!!

I arrange for you to go to the physiotherapist in the

How can I convince him? What argument should I give?

Security applications

How should I convince him to provide me with information?

Should I tell him that we are running out of antibiotics?

Argumentation

Which information to reveal?

Should I tell him that I will lose a project if I don’t hire t

Should I tell him I was fired from my last job?

Should I tell her that my leg hurts?

Build a game that combines information revelation and bargaining

Automated care-taker

I will be too tired in the afternoon!!!

I arrange for you to go to the physiotherapist in the

How can I convince him? What argument should I give?

Security applications

How should I convince him to provide me with information?

Colored Trails (CT)

•An infrastructure for agent design, implementation and evaluation for open environments
•Designed with Barbara Grosz (AAMAS 2004)
•Implemented by Harvard team and BIU team

An experimental test-bed

•Interesting for people to play
◦ analogous to task settings;
◦ vivid representation of strategy space (not just a list of outcomes).
•Possible for computers to play
•Can vary in complexity
◦ repeated vs. one-shot setting;
◦ availability of information;
◦ communication protocol.

Social Preference Agent

•Learns the extent to which people are affected by social preferences such as social welfare and competitiveness.
•Designed for one-shot take-it-or-leave-it scenarios.
•Does not reason about the future ramifications of its actions.

Y. Gal and A. Pfeffer. Predicting people's bidding behavior in negotiation. AAMAS 2006: 370-376

Agents for Revelation Games

Peled Noam, Gal Kobi, Kraus Sarit

Introduction - Revelation games

•Combine two types of interaction
•Signaling games (Spence 1974)
◦ Players choose whether to convey private information to each other
•Bargaining games (Osborne and Rubinstein 1999)
◦ Players engage in multiple negotiation rounds
•Example: Job interview

Colored Trails (CT)

[Figure: two game boards, labeled Asymmetric and Symmetric]

Why not equilibrium agents?

•Results from the social sciences suggest people do not follow equilibrium strategies:
◦ Equilibrium based agents played against people failed.
•People rarely design agents to follow equilibrium strategies (Sarne et al. AAMAS 2008).
•Equilibrium strategies are usually not cooperative – all lose.

PE agent – Phase one

•First proposal round (generous):
◦ First proposer: proposes the opponent’s counter-proposal.
◦ First responder: accepts any proposal that gives it the same or higher benefit than its counter-proposal.

•Revelation phase - revelation vs. non-revelation:
◦ In both boards, the PE with goal revelation yields lower or equal expected utility than the non-revelation PE

Benefits Diversity

[Figure: average proposed benefit to players from first and second rounds]

Performance of PEQ agent

Revelation Effect

•Only 35% of the games played by humans included revelation
•Revelation had a significant effect on human performance but not on agent performance
•Revelation didn't help the agent
•People were deterred by the strategic machine-generated proposals

SIGAL agent

Agent based on general opponent modeling:

Genetic algorithm, Logistic Regression

SIGAL Agent

•Learns from previous games.
•Predicts the acceptance probability for each proposal using Logistic Regression.
•Models humans as using a weighted utility function of:
◦ Human's benefit
◦ Benefits difference
◦ Revelation decision
◦ Benefits in previous round

Logistic Regression using a Genetic Algorithm

Expected benefit maximization

Maximization – round 2

Strategy Comparison

Strategies for the asymmetric board, where none of the players has revealed, the human lacks 2 chips for reaching the goal, and the agent lacks 1:

* In the first round the agent was proposed a benefit of 90

Heuristics

•Tit for Tat
•Never give more than you ask for in the counter-proposal
•Risk averseness
◦ Isoelastic utility:
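For orientation, the standard textbook form of isoelastic (constant relative risk aversion) utility is sketched here. It is offered as an assumption about what the heuristic refers to; the parameter name `eta` and the normalisation are not taken from the slides.

```python
import math


def isoelastic_utility(benefit: float, eta: float) -> float:
    """CRRA utility: (c**(1 - eta) - 1) / (1 - eta), or ln(c) when eta == 1."""
    if benefit <= 0:
        raise ValueError("benefit must be positive")
    if math.isclose(eta, 1.0):
        return math.log(benefit)
    return (benefit ** (1.0 - eta) - 1.0) / (1.0 - eta)
```

Larger `eta` makes the curve more concave, so a sure moderate benefit is valued more highly relative to a gamble with the same expected benefit.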

Learned Coefficients

•Responder benefit: (0.96)
•Benefits difference: (-0.79)
•Responder revelation: (0.26)
•Proposer revelation: (0.03)
•Responder benefit in first round: (0.45)
•Proposer benefit in first round: (0.33)
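To see how coefficients like these could enter a logistic acceptance model, consider the sketch below. The weights are the ones listed above, but the feature names, the feature scaling, the zero intercept and the assumption that a rejected proposal yields no benefit are all illustrative choices, not details confirmed by the slides.

```python
import math
from typing import Dict

# Coefficients as listed above; the keys are hypothetical feature names.
WEIGHTS: Dict[str, float] = {
    "responder_benefit": 0.96,
    "benefits_difference": -0.79,
    "responder_revelation": 0.26,
    "proposer_revelation": 0.03,
    "responder_benefit_first_round": 0.45,
    "proposer_benefit_first_round": 0.33,
}


def acceptance_probability(features: Dict[str, float], bias: float = 0.0) -> float:
    """Logistic model: sigmoid of the weighted sum of (assumed normalised) features."""
    z = bias + sum(WEIGHTS[name] * features[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def expected_benefit(proposer_benefit: float, features: Dict[str, float]) -> float:
    """Expected benefit of a proposal, assuming a rejection yields nothing."""
    return acceptance_probability(features) * proposer_benefit
```

An agent in this style would score each candidate proposal with `expected_benefit` and offer the one with the highest score.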

Methodology

•Cross validation: 10-fold
•Over-fitting removal: stop learning at the minimum of the generalization error
•Error calculation on a held-out test set, using new human-human games
•Performance prediction criteria.
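The 10-fold split can be sketched without any library. The round-robin assignment of games to folds is an assumption made for brevity; a real experiment would typically shuffle the games first.

```python
from typing import List, Tuple


def k_fold_indices(n_items: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Return k (train, held_out) index splits; each item is held out exactly once."""
    folds = [list(range(i, n_items, k)) for i in range(k)]
    splits = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(n_items) if i not in held]
        splits.append((train, held_out))
    return splits


splits = k_fold_indices(n_items=25, k=10)
```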

Performance

General opponent* modeling in Maximization problems

AAT agent

Agent based on general* opponent modeling

Decision Tree / Naïve Bayes, AAT

Aspiration Adaptation Theory (AAT)

•Economic theory of people’s behavior (Selten)
•No utility function exists for decisions (!)
•Relative decisions used instead
•Retreat and urgency used for goal variables

Avi Rosenfeld and Sarit Kraus. Modeling Agents through Bounded Rationality Theories. Proc. of IJCAI 2009; JAAMAS, 2010.

Commodity search

[Figure: prices seen at successive stores: 1000, 900, 950]

If price < 800 buy; otherwise visit 5 stores and buy in the cheapest.
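The rule can be read in more than one way; the sketch below assumes the threshold is checked at every store and that 5 is the total number of stores visited. Both readings, and the function name, are assumptions.

```python
from typing import Iterable, List, Optional


def search_and_buy(prices: Iterable[float], threshold: float = 800.0,
                   max_stores: int = 5) -> Optional[float]:
    """Buy at the first price below the threshold; otherwise buy at the cheapest
    store once max_stores stores have been visited (or the stores run out)."""
    seen: List[float] = []
    for price in prices:
        if price < threshold:
            return price
        seen.append(price)
        if len(seen) == max_stores:
            return min(seen)
    return min(seen) if seen else None
```

For the three prices shown (1000, 900, 950) no store is below 800, so under this reading the buyer keeps searching and, if no more stores are available, settles for 900.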

Results

General opponent* modeling in cooperative environments

Coordination with limited communication

•Communication is not always possible:
◦ High communication costs
◦ Need to act undetected
◦ Damaged communication devices
◦ Language incompatibilities
◦ Goal: Limited interruption of human activities

Zuckerman, S. Kraus and J. S. Rosenschein. Using Focal Points Learning to Improve Human-Machine Tacit Coordination, JAAMAS, 2010.

Focal Points (Examples)

Divide £100 into two piles; if your piles are identical to your coordination partner's, you get the £100. Otherwise, you get nothing.

101 equilibria
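The count of 101 can be reproduced by enumeration, assuming piles are whole pounds and the two piles are ordered (first pile, second pile). Those two assumptions are what make the number come out as stated.

```python
from typing import List, Tuple

Split = Tuple[int, int]


def pile_splits(total: int = 100) -> List[Split]:
    """All ways to divide `total` whole pounds into an ordered pair of piles."""
    return [(a, total - a) for a in range(total + 1)]


def payoff(mine: Split, partner: Split, total: int = 100) -> int:
    """Both players win the full amount only when their splits are identical."""
    return total if mine == partner else 0


# Every profile in which both players pick the same split is an equilibrium.
equilibria = [(s, s) for s in pile_splits()]
```

Nothing in the payoffs distinguishes these equilibria from one another, which is why a prominent choice such as (50, 50) matters for coordination.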

Focal points (Examples)

[Figure: two coordination games, one with 9 equilibria and one with 16 equilibria]

Focal Points

•Thomas Schelling (63)
•Focal Points = Prominent solutions to tacit coordination games

Prior work: Focal Points Based Coordination for closed environments

•Domain-independent rules that could be used by automated agents to identify focal points:
◦ Properties: Centrality, Firstness, Extremeness, Singularity.
◦ Logic based model
◦ Decision theory based model
•Algorithms for agents coordination

Kraus and Rosenschein, MAAMA 1992; Fenster et al., ICMAS 1995; Annals of Mathematics and Artificial Intelligence, 2000

FPL agent

Agent based on general* opponent modeling

Decision Tree / neural network, Focal Point

FPL agent

Agent based on general opponent modeling:

Decision Tree / neural network; raw data vector; FP vector

Focal Point Learning

3 experimental domains:

Results – cont’

General opponent* modeling improves agent coordination

“very similar domain” (VSD) vs “similar domain” (SD) of the “pick the pile” game.


eriments with people is a costly proc

Evaluation of agents (EDA)

•Peer Designed Agents (PDA): computer agents developed by humans
•Experiment: 300 human subjects, 50 PDAs, 3 EDA
•Results:
◦ EDA outperformed PDAs in the same situations in which they outperformed people,
◦ on average, EDA exhibited the same measure of generosity

R. Lin, S. Kraus, Y. Oshrat and Y. Gal. Facilitating the Evaluation of Automated Negotiators using Peer Designed Agents, in AAAI 2010.

Negotiation and argumentation with people are required for many applications

General* opponent modeling is beneficial

◦ Machine learning

◦ Behavioral model

◦ Challenge: how to integrate machine learning and behavioral model

References

1. S.S. Fatima, M. Wooldridge, and N.R. Jennings, Multi-issue negotiation with deadlines, Jnl of AI Research, 21: 381-471, 2006.

2. R. Keeney and H. Raiffa, Decisions with multiple objectives: Preferences and value trade-offs, John Wiley, 1976.

3. S. Kraus, Strategic negotiation in multiagent environments, The MIT press, 2001.

4. S. Kraus and D. Lehmann. Designing and Building a Negotiating Automated Agent, Computational Intelligence, 11(1):132-171, 1995.

5. S. Kraus, K. Sycara and A. Evenchik. Reaching agreements through argumentation: a logical model and implementation. Artificial Intelligence journal, 104(1-2):1-69, 1998.

6. R. Lin and S. Kraus. Can Automated Agents Proficiently Negotiate With Humans? Communications of the ACM, Vol. 53, No. 1, pages 78-88, January 2010.

7. R. Lin, S. Kraus, Y. Oshrat and Y. Gal. Facilitating the Evaluation of Automated Negotiators using Peer Designed Agents, in AAAI 2010.

References contd.

8. R. Lin, S. Kraus, J. Wilkenfeld, and J. Barry. Negotiating with bounded rational agents in environments with incomplete information using an automated agent. Artificial Intelligence, 172(6-7):823-851, 2008.

9. A. Lomuscio, M. Wooldridge, and N.R. Jennings, A classification scheme for negotiation in electronic commerce, Int. Jnl. of Group Decision and Negotiation, 12(1), 31-56, 2003.

10. M.J. Osborne and A. Rubinstein, A course in game theory, The MIT press, 1994.

11. M.J. Osborne and A. Rubinstein, Bargaining and Markets, Academic Press, 1990.

12. Y. Oshrat, R. Lin, and S. Kraus. Facing the challenge of human-agent negotiations via effective general opponent modeling. In AAMAS, 2009.

13. H. Raiffa, The Art and Science of Negotiation, Harvard University Press, 1982.

14. J.S. Rosenschein and G. Zlotkin, Rules of encounter, The MIT press, 1994.

15. I. Stahl, Bargaining Theory, Economics Research Institute, Stockholm School of Economics, 1972.

16. I. Zuckerman, S. Kraus and J. S. Rosenschein. Using Focal Points Learning to Improve Human-Machine Tactic Coordination, JAAMAS, 2010.

Tournament

2nd annual competition of state-of-the-art negotiating agents, to be held at AAMAS’11

Do you want to participate?

At least $2,000 for the winner!