Ad Hoc Autonomous Agent Teams: Collaboration without Pre-Coordination
Peter Stone
Director, Learning Agents Research Group
Department of Computer Science
The University of Texas at Austin
Joint work with
Gal A. Kaminka, Sarit Kraus, Bar Ilan University
Jeffrey S. Rosenschein, Hebrew University
© 2010 Peter Stone
Teamwork
• Typical scenario: pre-coordination
− People practice together
− Robots given coordination languages, protocols
− “Locker room agreement” [Stone & Veloso, ’99]
Ad Hoc Teams
• Ad hoc team player is an individual
− Unknown teammates (programmed by others)
• May or may not be able to communicate
• Teammates likely sub-optimal: no control
Challenge: Create a good team player
Illustration
An Individual
With Teammates
Made by Others
Heterogeneous
May not Communicate
May Have Different Capabilities
And/Or Maneuverability
May be a Previously Unknown Type
Human Ad Hoc Teams
• Military and industrial settings
− Outsourcing
• Agents support human ad hoc team formation [Just et al., 2004; Kildare, 2004]
• Autonomous agents (robots) deployed for short times
− Teams developed as cohesive groups
− Tuned to interact well together
Challenge Statement
Create an autonomous agent that is able to efficiently and robustly collaborate with previously unknown teammates on tasks to which they are all individually capable of contributing as team members.
• Aspects can be approached theoretically
• Ultimately an empirical challenge
Empirical Evaluation
Evaluation: A Metric (comparing agents a0 and a1)
• Most meaningful when a0 and a1 have similar individual competencies
Evaluation: Domain Consisting of Tasks (D)
Evaluation: Set of Possible Teammates (A)
Evaluation: Draw a Random Task
Evaluation: Random Team, Check Competence
Evaluation: Replace Random Teammate with a0
Evaluation: Then a1, Evaluate Difference
Evaluation: Repeat
Evaluate(a0, a1, A, D)
• Initialize performance (reward) counters r0 and r1 for agents a0 and a1 respectively to r0 = r1 = 0.
• Repeat:
– Sample a task d from D.
– Randomly draw a subset of agents B, |B| ≥ 2, from A such that E[s(B, d)] ≥ s_min.
– Randomly select one agent b ∈ B to remove from the team to create the team B−.
– Increment r0 by s({a0} ∪ B−, d).
– Increment r1 by s({a1} ∪ B−, d).
• If r0 > r1 then we conclude that a0 is a better ad-hoc team player than a1 in domain D over the set of possible teammates A.
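The Evaluate procedure can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation. The names `score`, `n_trials`, `max_draws` and `seed` are assumptions, the single call `score(team, task)` stands in for the expectation E[s(B, d)], and the "Repeat" loop is cut off after a fixed number of trials.

```python
import random


def evaluate(a0, a1, teammates, tasks, score, s_min,
             n_trials=1000, max_draws=1000, seed=0):
    """Compare a0 and a1 as ad hoc team players; returns (r0, r1).

    teammates: list of candidate agents (the set A), at least two.
    tasks:     list of tasks (the domain D).
    score:     hypothetical function score(team, task) -> float, used here
               as a stand-in for the expected team score E[s(B, d)].
    """
    rng = random.Random(seed)
    r0 = r1 = 0.0
    for _ in range(n_trials):
        d = rng.choice(tasks)  # sample a task d from D
        # Rejection-sample a team B with |B| >= 2 that meets the threshold.
        for _ in range(max_draws):
            size = rng.randint(2, len(teammates))
            team = rng.sample(teammates, size)
            if score(team, d) >= s_min:
                break
        else:
            continue  # no competent team found for this task; skip the trial
        # Remove one randomly chosen member b to form B-.
        team_minus = list(team)
        team_minus.pop(rng.randrange(len(team_minus)))
        # Substitute each candidate into the vacated slot and accumulate.
        r0 += score([a0] + team_minus, d)
        r1 += score([a1] + team_minus, d)
    return r0, r1
```

If the returned r0 exceeds r1, the procedure concludes that a0 is the better ad hoc team player for this D and A. In a stochastic domain the threshold check would presumably average several calls to `score` rather than use one.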
Technical Requirements
• Assess capabilities of other agents (teammate modeling)
• Assess the other agents’ knowledge states
• Estimate effects of actions on teammates
• Be prepared to interact with many types of teammates:
− May or may not be able to communicate
− May be more or less mobile
− May be better or worse at sensing
A good team player’s best actions will differ depending on its teammates’ characteristics.
Preliminary Theoretical Progress
• Aspects can be approached theoretically
• Ultimately an empirical challenge
Be prepared to interact with many types of teammates
• Minimal representative scenarios
− One teammate, no communication
− Fixed and known behavior
Scenarios
• Cooperative iterated normal form game [w/ Kaminka & Rosenschein, AMEC’09]
M1   b0   b1   b2
a0   25    1    0
a1   10   30   10
a2    0   33   40
• Cooperative k-armed bandit [w/ Kraus, AAMAS’10]
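To make the iterated normal form scenario concrete, the sketch below replays the payoff matrix M1 against one possible "fixed and known" teammate. The best-response behavior is an assumption made purely for illustration (the slides do not state the teammate's policy): the teammate picks the column that maximizes the joint payoff against the ad hoc agent's previous row. The function names are hypothetical.

```python
# Joint payoff matrix M1: rows are the ad hoc agent's actions a0..a2,
# columns are the teammate's actions b0..b2.
M1 = [
    [25, 1, 0],
    [10, 30, 10],
    [0, 33, 40],
]


def best_response(matrix, a):
    """Column index with the highest payoff in row a (assumed teammate rule)."""
    row = matrix[a]
    return max(range(len(row)), key=lambda b: row[b])


def play(matrix, a_actions, a_prev):
    """Payoff per round when the teammate best-responds to the agent's
    previous action; a_prev is the agent's action before round one."""
    payoffs = []
    for a in a_actions:
        b = best_response(matrix, a_prev)
        payoffs.append(matrix[a][b])
        a_prev = a
    return payoffs


print(play(M1, [0, 0, 0], 0))  # stay at (a0, b0):  [25, 25, 25]
print(play(M1, [2, 2, 2], 0))  # jump straight to a2: [0, 40, 40]
print(play(M1, [1, 2, 2], 0))  # lead via a1:        [10, 33, 40]
```

Under this assumed teammate, stepping through a1 earns 83 over three rounds, against 80 for jumping directly to a2 and 75 for staying put, so the best action depends on how the teammate reacts.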
3-armed bandit
• Random value from a distribution
• Expected value µ
3-armed bandit
Arm∗ Arm1 Arm2
µ∗ > µ1 > µ2
• Agent A: teacher
− Knows payoff distributions
− Objective: maximize expected sum of payoffs
− If alone, always Arm∗
• Agent B: learner
− Can only pull Arm1 or Arm2
− Selects arm with highest observed sample average
Assumptions
Arm∗ Arm1 Arm2
µ∗ > µ1 > µ2
• Alternate actions (teacher first)
• Results of all actions fully observable (to both)
• Number of rounds remaining finite, known to teacher
Objective: maximize expected sum of payoffs
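The teacher/learner protocol under these assumptions can be simulated with a short sketch. Several details are assumptions not given on the slides: payoffs are Gaussian with a common standard deviation `sigma`, an arm the learner has never seen counts as having sample average 0, ties go to Arm1, and the teacher follows a fixed arm rather than an optimal policy. All names are hypothetical.

```python
import random


def simulate(means, teacher_arm, rounds, sigma=1.0, seed=0):
    """Alternate teacher and learner pulls; return (total payoff, learner pulls).

    means: dict mapping "*", "1", "2" to the expected payoffs of
           Arm*, Arm1 and Arm2.
    teacher_arm: the arm the teacher pulls every round (a fixed policy,
           chosen only to keep the sketch short).
    """
    rng = random.Random(seed)
    totals = {"1": 0.0, "2": 0.0}  # payoffs observed on the learner's arms
    counts = {"1": 0, "2": 0}
    total_payoff = 0.0
    learner_pulls = []

    def pull(arm):
        nonlocal total_payoff
        x = rng.gauss(means[arm], sigma)
        total_payoff += x
        if arm in totals:  # every result is observed; Arm* is never an option
            totals[arm] += x
            counts[arm] += 1

    for _ in range(rounds):
        pull(teacher_arm)  # teacher acts first
        # Learner greedily picks the arm with the highest sample average.
        avg = {a: totals[a] / counts[a] if counts[a] else 0.0 for a in totals}
        arm = max(avg, key=avg.get)
        learner_pulls.append(arm)
        pull(arm)
    return total_payoff, learner_pulls


mu = {"*": 3.0, "1": 2.0, "2": 1.0}
print(simulate(mu, teacher_arm="*", rounds=10))  # teacher always exploits
print(simulate(mu, teacher_arm="1", rounds=10))  # teacher always demonstrates Arm1
```

Comparing the two calls over many seeds gives a rough feel for the trade-off: pulling Arm1 costs the teacher µ∗ − µ1 in expectation but adds a sample that can steer the learner toward the better of its two arms.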
Summary of Findings
Arm∗ Arm1 Arm2
µ∗ > µ1 > µ2
• Arm1 is sometimes optimal
• Arm2 is never optimal
• Optimal solution when arms have discrete distribution
• Interesting patterns in optimal action
• Extensions to more arms
• Exploitation vs. teaching
Challenge Statement
Create an autonomous agent that is able to efficiently and robustly collaborate with previously unknown teammates on tasks to which they are all individually capable of contributing as team members.
Suggested Research Plan
1. Identify the full range of possible teamwork situations that a complete ad hoc team player needs to be capable of addressing (D and A).
2. For each such situation, find theoretically optimal and/or empirically effective algorithms for behavior.
3. Develop methods for identifying which type of teamwork situation the agent is currently in, in an online fashion.
• 2 and 3: the core technical challenges
• 1 and 3: a knob to incrementally increase difficulty
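One possible reading of step 3 is a Bayesian update over a small set of hypothesized teammate types, revised after each observed teammate action. The sketch below is only an illustration of that idea and is not a method proposed in the slides. The type names, the action labels and the function `update_posterior` are all hypothetical.

```python
def update_posterior(prior, likelihoods, observed_action):
    """One Bayes step: P(type | action) is proportional to
    P(action | type) * P(type).

    prior:       dict mapping type name -> probability.
    likelihoods: dict mapping type name -> {action: probability}.
    """
    unnorm = {
        t: p * likelihoods[t].get(observed_action, 0.0)
        for t, p in prior.items()
    }
    z = sum(unnorm.values())
    if z == 0.0:
        return dict(prior)  # action impossible under every type: keep the prior
    return {t: v / z for t, v in unnorm.items()}


# Two hypothetical teammate types and how often each plays b0 or b1.
belief = {"best_responder": 0.5, "uniform_random": 0.5}
models = {
    "best_responder": {"b0": 0.9, "b1": 0.1},
    "uniform_random": {"b0": 0.5, "b1": 0.5},
}
for action in ["b0", "b0", "b1"]:
    belief = update_posterior(belief, models, action)
    print(action, belief)
```

An agent could then act on the most probable type using the algorithm found in step 2 for that situation. How to handle teammates outside the hypothesized set is left open here.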
Related Work
Multiagent learning [Claus & Boutilier, ’98], [Littman, ’01], [Conitzer & Sandholm, ’03], [Powers & Shoham, ’05], [Chakraborty & Stone, ’08]
Opponent Modeling
• Intended plan recognition [Sidner, ’85], [Lochbaum, ’91], [Carberry, ’01]
• SharedPlans [Grosz & Kraus, ’96]
• Recursive Modeling [Vidal & Durfee, ’95]
Human-Robot-Agent Teams
• Overlapping but different challenges, including HRI [Klein, ’04]
• Out of scope
Much more pertaining to specific teammate characteristics
Acknowledgements
• Fulbright and Guggenheim Foundations
• Israel Science Foundation
Ad Hoc Teams
• Ad hoc team player is an individual
− Unknown teammates (programmed by others)
• May or may not be able to communicate
• Teammates likely sub-optimal: no control
Challenge: Create a good team player