general opponent* modeling for improving agent-human interaction
DESCRIPTION
General Opponent* Modeling for Improving Agent-Human Interaction. Sarit Kraus Dept . of Computer Science Bar Ilan University AMEC May 2010. Motivation. Negotiation is an extremely important form of people interaction. Computers interacting with people. Computer has the control. - PowerPoint PPT PresentationTRANSCRIPT
![Page 1: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/1.jpg)
1
General Opponent* Modeling for Improving Agent-Human
Interaction
Sarit Kraus Dept. of Computer Science Bar Ilan University
AMECMay 2010
![Page 2: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/2.jpg)
Negotiation is an extremely important form of people interaction
2
Motivation
2
![Page 3: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/3.jpg)
Computers interacting with people
Computer persuadeshuman
Computer has the control
Human has the control
3
![Page 4: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/4.jpg)
44
![Page 5: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/5.jpg)
Culture sensitive agents
The development of standardized agent to be used in the collection of data for studies on culture and negotiation
Buyer/Seller agents negotiate well across cultures PURB
agent
5
![Page 7: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/7.jpg)
7
Medical applications
Gertner Institute for Epidemiology and Health Policy Research
7
![Page 8: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/8.jpg)
Automated care-taker
8
I will be too tired in the afternoon!!!
I scheduled an appointment for you at the physiotherapist this afternoon
Try to reschedule and fail
The physiotherapist has no other available
appointments this week .How about resting before
the appointment?
![Page 9: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/9.jpg)
Security applications
9
•Collect•Update•Analyze•Prioritize
![Page 10: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/10.jpg)
Irrationalities attributed to– sensitivity to context– lack of knowledge of own preferences– the effects of complexity– the interplay between emotion and cognition– the problem of self control – bounded rationality in the bullet
10
People often follow suboptimal decision strategies
10
General opponent* modeling
![Page 11: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/11.jpg)
Small number of examples – difficult to collect data on people
Noisy data – people are inconsistent (the same person may act
differently)– people are diverse
Challenges of human opponent* modeling
11
![Page 12: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/12.jpg)
Multi-attribute multi-round bargaining– KBAgent
Revelation + bargaining – SIGAL
Optimization problems– AAT based learning
Coordination with people: – Focal point based learning
Agenda
12
![Page 13: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/13.jpg)
QOAgent [LIN08]
Multi-issue, multi-attribute, with incomplete information
Domain independent Implemented several tactics and heuristics
– qualitative in nature Non-deterministic behavior, also via means of
randomization
R. Lin, S. Kraus, J. Wilkenfeld, and J. Barry. Negotiating with bounded rational agents in environments with incomplete information using an automated agent. Artificial Intelligence, 172(6-7):823–851, 2008
Played at least as well as people
Is it possible to improve the QOAgent?
Yes, if you have data
13
![Page 14: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/14.jpg)
KBAgent [OS09]
Y. Oshrat, R. Lin, and S. Kraus. Facing the challenge of human-agent negotiations via effective general opponent modeling. In AAMAS, 2009
Multi-issue, multi-attribute, with incomplete information
Domain independent Implemented several tactics and heuristics
– qualitative in nature Non-deterministic behavior, also via means
of randomization Using data from previous interactions
14
![Page 15: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/15.jpg)
Example scenario
Employer and job candidate– Objective: reach an
agreement over hiring terms after successful interview
15
![Page 16: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/16.jpg)
General opponent modeling
• Challenge: sparse data of past negotiation sessions of people negotiation
• Technique: Kernel Density Estimation
16
![Page 17: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/17.jpg)
Estimate likelihood of other party:– accept an offer– make an offer– its expected average utility
The estimation is done separately for each possible agent type:– The type of a negotiator is determined using a simple
Bayes' classifier
Use estimation for decision making
General opponent modeling
17
![Page 18: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/18.jpg)
KBAgent as the job candidate
Best result: 20,000, Project manager, With leased car; 20% pension funds, fast promotion, 8 hours
20,000Team ManagerWith leased carPension: 20%Slow promotion9 hours
12,000ProgrammerWithout leased carPension: 10%Fast promotion10 hours
20,000Project managerWithout leased carPension: 20%Slow promotion9 hours
KBAgentHuman
18
![Page 19: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/19.jpg)
KBAgent as the job candidate Best agreement: 20,000, Project manager, With leased car; 20%
pension funds, fast promotion, 8 hours
KBAgentHuman
20,000ProgrammerWith leased carPension: 10%Slow promotion9 hours
Round 7
12,000ProgrammerWithout leased carPension: 10%Fast promotion10 hours
20,000Team ManagerWith leased carPension: 20%Slow promotion9 hours
19
![Page 20: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/20.jpg)
20
Experiments
172 grad and undergrad students in Computer Science
People were told they may be playing a computer agent or a person.
Scenarios: – Employer-Employee– Tobacco Convention: England vs. Zimbabwe
Learned from 20 games of human-human
![Page 21: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/21.jpg)
21
Results: Comparing KBAgent to others
Player Type Average Utility Value (std)
KBAgent vs people
Employer
468.9 (37.0)
QOAgent vs peoples 417.4 (135.9)
People vs. People 408.9 (106.7)
People vs. QOAgent 431.8 (80.8)
People vs. KBAgent 380. 4 (48.5)
KBAgent 482.7 (57.5)
QOAgent
Job Candidate
397.8 (86.0)
People vs. People 310.3 (143.6)
People vs. QOAgent 320.5 (112.7)
People vs. KBAgent 370.5 (58.9)
![Page 22: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/22.jpg)
22
Main results
In comparison to the QOAgent– The KBAgent achieved higher utility values than
QOAgent– More agreements were accepted by people– The sum of utility values (social welfare) were higher
when the KBAgent was involved The KBAgent achieved significantly higher utility
values than people Results demonstrate the proficiency negotiation
done by the KBAgent
General opponent modeling improves agent negotiations
General opponent* modeling improves agent bargaining
![Page 23: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/23.jpg)
Automated care-taker
23
I will be too tired in the afternoon!!!
I arrange for you to go to the physiotherapist in the afternoon
How can I convince him? What argument should I give?
![Page 24: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/24.jpg)
Security applications
24
How should I convince him to provide me with information?
![Page 25: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/25.jpg)
Which information to reveal?
25
Argumentation
Should I tell him that I will lose a project if I don’t hire today?
Should I tell him I was fired from my last job?
Should I tell her that my leg hurts?
Should I tell him that we are running out of antibiotics?
Build a game that combines information revelation and bargaining
25
![Page 26: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/26.jpg)
Color Trails (CT)
An infrastructure for agent design, implementation and evaluation for open environments
Designed with Barbara Grosz (AAMAS 2004)
Implemented by Harvard team and BIU team
26
![Page 27: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/27.jpg)
An experimental test-ted
Interesting for people to play– analogous to task settings;– vivid representation of strategy
space (not just a list of outcomes).
Possible for computers to playCan vary in complexity
– repeated vs. one-shot setting;– availability of information; – communication protocol.
27
![Page 28: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/28.jpg)
Game description
The game is built from phases:– Revelation phase– First proposal phase– Counter-proposal phase
Joint work with Kobi Gal and Noam Peled28
![Page 29: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/29.jpg)
Two boards
29
![Page 30: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/30.jpg)
30
Why not equilibrium agents?
Results from the social sciences suggest people do not follow equilibrium strategies: Equilibrium based agents played against
people failed. People rarely design agents to follow equilibrium
strategies (Sarne et al AAMAS 2008).
Equilibrium strategies are usually not cooperative –
all lose.
30
![Page 31: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/31.jpg)
Perfect Equilibrium agent
Solved using Backward induction; no strategic signaling
Phase two:– Second proposer: Find the most beneficial proposal
while the responder benefit remains positive.– Second responder: Accepts any proposal which
gives it a positive benefit.
31
![Page 32: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/32.jpg)
Perfect Equilibrium agent
Phase one:– First proposer: propose the opponent’s counter-
proposal– First responder: Accepts any proposals which gives it
the same or higher benefit from its counter-proposal. In both boards, the PE with goal revelation
yields lower or equal expected utility than non-revelation PE
Revelation: Reveals in half of the games
32
![Page 33: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/33.jpg)
Asymmetric game
33
![Page 34: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/34.jpg)
Performance
34
140 students
![Page 35: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/35.jpg)
Benefits diversity
Average proposed benefit to players from first and second rounds
35
![Page 36: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/36.jpg)
Revelation affect
The effect of revelation on performance:
Only 35% of the games played by humans included revelation
Revelation had a significant effect on human performance but not on agent performance
People were deterred by the strategic machine-generated proposals, which heavily depended on the role of the proposer and the responder.
36
![Page 37: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/37.jpg)
37
SIGAL agent
Agent based on general opponent modeling:
Genetic algorithm
Logistic Regression
![Page 38: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/38.jpg)
SIGAL Agent: Acceptance
Learns from previous games Predict the acceptance probability for each
proposal using Logistic regression Features (for both players) relating to
proposals:– Benefit.– Goal revelations.– Players types– Benefit difference
between rounds 2 and 1.
38
![Page 39: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/39.jpg)
SIGAL Agent: counter proposals
Model the way humans make counter-proposals
39
![Page 40: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/40.jpg)
SIGAL Agent
Maximizes expected benefit given any state in the game– Round– Player revelation– Behavior in round 1
40
![Page 41: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/41.jpg)
Agent strategies comparison
Round 1 Round 2
Agent Send Receive Send Receive
EQ Green:10 Gray:11
Purple:2 Green:2 Purple:10 Gray:11
SIGAL Green:2 Purple:9 Green:2 Putple:5
41
![Page 42: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/42.jpg)
SIGAL agent: performance
42
![Page 43: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/43.jpg)
Agents performance comparison
Equilibrium Agent SIGAL Agent
43
General opponent* modeling improves agent negotiations
![Page 44: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/44.jpg)
GENERAL OPPONENT* MODELING IN MAXIMIZATION PROBLEMS
4444
![Page 45: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/45.jpg)
45
AAT agent
Agent based on general* opponent modeling
Decision Tree/ Naïve Byes
AAT
45
![Page 46: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/46.jpg)
Aspiration Adaptation Theory (AAT)
Economic theory of people’s behavior (Selten)– No utility function exists for decisions (!)
Relative decisions used instead Retreat and urgency used for goal variables
46
Avi Rosenfeld and Sarit Kraus. Modeling Agents through Bounded Rationality Theories. Proc. of IJCAI 2009., JAAMAS, 2010.
![Page 47: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/47.jpg)
47
Commodity search
1000
47
![Page 48: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/48.jpg)
48
Commodity search
1000
900
![Page 49: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/49.jpg)
49
Commodity search
1000
900
950
If price < 800 buy; otherwise visit 5 stores and buy in the cheapest.
49
![Page 50: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/50.jpg)
50
Results
Behavioral models used in General opponent* modeling is beneficial
50Sparse Naïve Learning Sparse AAT
65
67
69
71
73
75
77
79
81
83
Using AAT to Quickly Learn
Co
rre
ct
Cla
ss
ific
ati
on
%
![Page 51: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/51.jpg)
General opponent* modeling in cooperative environments
51
![Page 52: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/52.jpg)
Coordination with limited communication
Communication is not always possible:– High communication costs– Need to act undetected– Damaged communication devices– Language incompatibilities– Goal: Limited interruption of human
activities
I. Zuckerman, S. Kraus and J. S. Rosenschein. Using Focal Points Learning to Improve Human-Machine Tactic Coordination, JAAMAS, 2010.
52
![Page 53: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/53.jpg)
Focal Points (Examples)
Divide £100 into two piles, if your piles are identical to your coordination partner, you get the £100. Otherwise, you get nothing.
101 equilibria53
![Page 54: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/54.jpg)
Focal points (Examples)
9 equilibria16 equilibria
54
![Page 55: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/55.jpg)
Focal Points
Thomas Schelling (63):
Focal Points = Prominent solutions to tactic coordination games.
55
![Page 56: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/56.jpg)
Prior work: Focal Points Based Coordination for closed environments
Domain-independent rules that could be used by automated agents to identify focal points:
Properties: Centrality, Firstness, Extremeness, Singularity. – Logic based model– Decision theory based model
Algorithms for agents coordination.
Kraus and Rosenchein MAAMA 1992Fenster et al ICMAS 1995Annals of Mathematics and Artificial Intelligence 200056
![Page 57: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/57.jpg)
57
FPL agent
Agent based on general* opponent modeling
Decision Tree/ neural network
Focal Point
57
![Page 58: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/58.jpg)
58
FPL agent
Agent based on general opponent modeling:
Decision Tree/ neural network
raw data vector
FP vector58
![Page 59: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/59.jpg)
Focal Point Learning
3 experimental domains:
59
![Page 60: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/60.jpg)
Results – cont’
“very similar domain” (VSD) vs “similar domain” (SD) of the “pick the pile” game.
General opponent* modeling improves agent coordination
60
![Page 61: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/61.jpg)
Evaluation of agents (EDA)
Peer Designed Agents (PDA): computer agents developed by humans
Experiment: 300 human subjects, 50 PDAs, 3 EDA
Results: – EDA outperformed PDAs in the same situations in
which they outperformed people, – on average, EDA exhibited the same measure of
generosity
61
Experiments with people is a costly process
R. Lin, S. Kraus, Y. Oshrat and Y. Gal. Facilitating the Evaluation of Automated Negotiators using Peer Designed Agents, in AAAI 2010.
![Page 62: General Opponent* Modeling for Improving Agent-Human Interaction](https://reader030.vdocuments.net/reader030/viewer/2022013101/56814482550346895db11978/html5/thumbnails/62.jpg)
Negotiation and argumentation with people is required for many applications
General* opponent modeling is beneficial– Machine learning– Behavioral model – Challenge: how to integrate machine learning and
behavioral model
62
Conclusions
62