
Agents Supporting Cooperative and Self Interested Human Interactions in Open, Dynamic Environments

Katia P. Sycara

School of Computer Science

Carnegie Mellon University

Pittsburgh, PA. 15213 http://www.cs.cmu.edu/~softagents

Copyright Katia Sycara 2002

Talk Outline

• Agents in Open Environments
• Agents Supporting Human Teams
– Information processing (memory intensive) Tasks
– Planning Tasks
• Agents Supporting Organizations
– E-commerce activities (negotiation, coalition formation, auctions)
• Forward to the Past: Agent-Based Web Services


Vision: Agents on the Web

• A Wired/Wireless World populated with interoperating agents not just data


Overall Research Goal

Develop multiagent technology that allows agents (cooperative and self-interested) to coordinate autonomously and also assist individuals and human teams in environments that are:
• time stressed
• distributed
• uncertain
• open (information sources, communication links and agents dynamically appear and disappear)

Team members (humans and agents) are distributed in terms of:
• time and space
• expertise


Reusable Environment for Task Structured Intelligent Networked Agents

• Adaptive, self-organizing infrastructure of Intelligent Agents that interact with humans and each other.
– integrate information management and decision support
– anticipate and satisfy human information processing and problem solving needs
– perform real-time synchronization of actions
– route and present the right information to the right person at the right time
– adapt to user, task and situation
• Develop schemes for autonomous agent coordination
• Multi-agent discovery and interoperation
• Multi-agent adaptivity and learning


Open Environments

• No predefined structure
• Agents leave and join the society dynamically
• Communication is not ensured all the time
• Information sources may appear and disappear


Generic Tasks in Open Environments

Agents must be able to:
• discover each other. We distinguish the notion of agent location from the notion of agent functionality.
– Location is found through Agent Name Services (ANS)
– Functionality/capability is found through Middle Agents
• interact/transact with each other
• compose results of their reasoning
• monitor progress of delegated tasks


The RETSINA Multi-Agent Organization

[Figure: layered organization. Users 1..u send goal and task specifications to Interface Agents 1..i and receive results. Interface Agents pass tasks to Task Agents 1..t and receive solutions. Task Agents send info and service requests to Info Agents 1..n and receive replies, performing information integration and conflict resolution. Info Agents send queries to InfoSources 1..m and receive answers. Agents send advertisements to Middle Agents.]

Distributed adaptive collections of information agents that coordinate to retrieve, filter and fuse information relevant to the user, task and situation, as well as anticipate the user's information needs.


RETSINA Single Agent Architecture


Some RETSINA Applications

• Aiding Human Teams in joint mission planning (using ModSAF as a simulated battlefield)

• Agent-aided aircraft maintenance

• E-commerce in wholesale markets (agent-based auctions and negotiation)

• Agent-based Supply Chain Management

• Robot teams for de-mining

• Team Rescue Scenario (NEO)

• Agent-based financial portfolio management

• Agent-based “on the move” collaboration on mobile devices


Visualization of Agent Interactions


Agent Discovery and Interoperation

• Discovery necessary in open environments
• Interoperation necessary for heterogeneous agents
• Agents advertise their expertise/capabilities to middle agents
• Requester agents ask middle agents for agents with particular capabilities
• Middle agents match requests to advertisements and return results
• Communication protocols include formal semantics and ontologies for interoperation
• The discovery scheme enables system robustness through functional substitutability of agents

Sycara, K., Klusch, M., Widoff, S. and Lu, J. "LARKS: Dynamic Matchmaking among Heterogeneous Agents in Cyberspace", JAAMAS, vol 5, no. 2, July 2002.
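
The advertise / request / match cycle described on this slide can be sketched in a few lines. This is a minimal illustration only: the class and method names (`Matchmaker`, `advertise`, `unadvertise`, `match`) are hypothetical, and exact set containment stands in for the semantic matching that LARKS performs. It is not the RETSINA API.

```python
from dataclasses import dataclass, field


@dataclass
class Advertisement:
    agent_name: str
    capabilities: frozenset


@dataclass
class Matchmaker:
    """Hypothetical middle agent: stores advertisements and matches requests."""
    ads: list = field(default_factory=list)

    def advertise(self, agent_name, capabilities):
        # A new advertisement replaces any earlier one from the same agent.
        self.unadvertise(agent_name)
        self.ads.append(Advertisement(agent_name, frozenset(capabilities)))

    def unadvertise(self, agent_name):
        # Agents may leave the open environment at any time.
        self.ads = [ad for ad in self.ads if ad.agent_name != agent_name]

    def match(self, required):
        # Return every provider whose capabilities cover the request.
        required = frozenset(required)
        return [ad.agent_name for ad in self.ads if required <= ad.capabilities]


mm = Matchmaker()
mm.advertise("weather-agent-1", {"weather", "forecast"})
mm.advertise("weather-agent-2", {"weather"})
providers = mm.match({"weather"})

# If one provider disappears, the requester can fall back to another one
# with the same capability (functional substitutability).
mm.unadvertise("weather-agent-1")
remaining = mm.match({"weather"})
```

Because requesters ask for a capability rather than a named agent, the loss of one provider does not break the requester as long as another matching advertisement exists.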


Types of Interactions

• Providers and requesters interact with each other directly
– a negotiation phase to find out service parameters and preferences (if not taken into account in the locating phase)
– delegation of service
• Providers and requesters interact through middle agents
– middle agent finds provider and delegates
– hybrid protocols
• Reasons for interacting through middle agents
– privacy issues (anonymization of requesters and providers)
– trust issues (enforcement of honesty; not necessarily keep anonymity of principals); e.g. NetBill


Broadcaster

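[Figure: the Requester sends a request for service to the Broadcaster, which broadcasts the service request to Providers 1..n. Providers return offers of service; the Requester delegates the service to a provider and receives the results of the service request.]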


Matchmaker

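[Figure: providers send advertisements of capabilities and parameters to the Matchmaker. The Requester sends a request for service to the Matchmaker, which returns contact information of providers that match the request. The Requester then delegates the service to a provider directly and receives the results of the service request.]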


Broker

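[Figure: providers send advertisements of capabilities and parameters to the Broker. The Requester delegates the service, with preferences, to the Broker, which delegates the service to a provider. Results of the service return from the provider to the Broker and from the Broker to the Requester.]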


Contract Net

[Figure: the Requester acts as manager and broadcasts a request for service, with preferences, to Providers 1..n. Providers reply with offers of service; the manager delegates the service to a provider and receives the results of the service.]
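
The Contract Net exchange (broadcast a request, collect offers, delegate to one provider) can be sketched as follows. This is a toy sketch under stated assumptions: providers are modelled as plain callables that return an offer or `None`, and the requester's preferences are modelled as a scoring function where lower is better. The function name `contract_net` is hypothetical.

```python
def contract_net(task, providers, preference):
    """Broadcast `task`, collect offers, and pick the preferred provider.

    providers:  dict mapping provider name -> callable(task) returning an
                offer, or None if the provider declines (assumed interface).
    preference: callable(offer) -> number; the lowest score wins.
    """
    offers = {}
    for name, make_offer in providers.items():  # broadcast request for service
        offer = make_offer(task)
        if offer is not None:                   # offer of service
            offers[name] = offer
    if not offers:
        return None                             # nobody can perform the task
    winner = min(offers, key=lambda name: preference(offers[name]))
    return winner, offers[winner]               # delegation of service


# Offers here are simply quoted costs; p3 declines the task.
providers = {
    "p1": lambda task: 12.0,
    "p2": lambda task: 9.5,
    "p3": lambda task: None,
}
result = contract_net("deliver-report", providers, preference=lambda cost: cost)
```

Unlike the matchmaker and broker, no middle agent stores advertisements here: the cost of discovery is paid on every request through the broadcast.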


Performance of Match-made System


Performance of Brokered System


Hybrid Human-Agent Teams

Human and software agents working together as a team to perform complex tasks in a distributed environment

Agents providing information access as well as user-centered problem-solving and decision support

Agents monitoring team activity and the environment so that effective assistance can be provided


Human-Agent Teams

Agent Roles

Support for individual team members
– simple reactive agents: manage and present information meaningfully, react to event stimuli
– planning agents: present courses of action based on emerging events

Support for team activity
– situation assessment: provide information to the team on the environment
– facilitate communication within the team
– supportive behaviours: correcting other team members, requesting backup

As an autonomous team member
– cannot use human team member roles directly
– probably feasible for information access, event monitoring, planning of member roles


Agents in Teams: Expected Improvements

• Reduce time for human teams to arrive at a decision

• Allow teams to consider a broader range of alternatives

• Enable teams to flexibly manage contingencies (replan, repair)

• Reduce individual and team errors

• Increase overall team performance


NAWCTSD TeamWork Dimensions

Information Exchange
• Seeking information from all available sources
• Passing information to the appropriate persons before being asked
• Providing “big picture” situation updates

Communication
• Using proper phraseology
• Providing complete internal and external reports
• Avoiding excess chatter
• Ensuring communications are audible and ungarbled

Supporting Behavior
• Correcting team errors
• Providing and requesting backup or assistance when needed

Team Initiative/Leadership
• Providing guidance or suggestions to team members
• Stating clear team and individual priorities.


Aiding & Cognitive Resources

We might improve team performance by:

1. Making individual tasks easier, freeing cognitive resources for team coordination tasks

2. Aiding aspects of individual task exercised in coordination activities

3. Supporting team coordination tasks directly


TANDEM Synthetic Radar Task

• Lab Simulation : moderate fidelity Aegis-based simulation

• Characteristics : Real-time, reactive & inflexible

• Task : Forced Pace, High Workload, Highly Dependent on Cooperation, Shared Information, Individual Action

• Cognitive Demands: High working memory load
– Subjects must access from menus or obtain from teammates five parameter values and their classifications in order to reach each of their individual targeting decisions

• Studies : contrasted agent aiding for reducing memory load with assistance in communication and cooperation


Tandem Experiments

• Three team members (Alpha, Bravo, & Charlie) each responsible for a different decision (type, classify, intent)

• Each team member has 3 menus each accessing 3 parameters

• Each team member has 3 pieces of data for his task, but the remaining two items must be obtained from teammates


[Figure: example parameter readings and what each suggests. Speed: 250 knots (it's an aircraft); Ini Altitude: 0 feet; Signal: Medium (it's surface); Climb rate: 300 ft/sec (aircraft); Comm Time: 10 sec (aircraft).]

User may need information from teammates


Agent Aiding Strategies

Registry
– Supports individual's task: shows who has what data
– Supports team work: facilitates coordination

Persistent Memory
– Supports individual's task: preserves accessed values for own decision
– Supports team work: preserves accessed values for communication to team

Information Push
– Supports individual's task: accumulates values for own task
– Supports team work: pushes accessed values to teammates; reduces verbal communication; reduces communication errors


Experimental Design

Between-subject design with 4 conditions:

• Individual Memory agent

• Team Registry agent

• Team Push agent

• Control (no agent)

Each task is defined by 5 parameter values: a team member can access 3 of them from menus and must obtain the other 2 from teammates

Three teammates (Alpha, Bravo, Charlie), each responsible for a decision (type, intent, classification)


Experimental Design (cont)

• 10 teams of 3 subjects in each condition (120 subjects)

• Each session contained 3 trials, 15 minutes each

• Each trial included 75 targets with 3 levels of target difficulty

• Target difficulty : hard (25 targets), medium (25 targets) & easy (25 targets)


Individual Agent

[Figure: Individual Memory agent display. A radar scope (radius 50 nm, bearings 000/090/180/270, operators A, B, C) with hooked target 35 and time 00:14:25. The agent window lists accessed values grouped by decision: TYPE (Speed: 27, Climb/Dive: -366, Signal), CLASS (Bearing, Origin: Red_Sea, Range: 1.4), INTENT (Countermeasures: None, Electronic Warfare, Missile Lock: Clean). T SCORE: 1200, I SCORE: 1950.]


Team Clipboard Agent

[Figure: Team Push display for Alpha. A radar scope (bearings 000/090/180/270, operators A, B, C) with time 00:10:25. The agent window lists the TYPE parameters: Speed: 120, Climb/Dive: 0, Alt/Depth, Sig Strength: Medium, Comm Time.]



Team Checklist Agent

[Figure: Registry Agent display. A radar scope (bearings 000/090/180/270, operators A, B, C) with time 00:09:25. For each parameter the registry marks which operators hold it:]

TYPE
– Speed: A, B, C
– Alt/Depth: *, B
– Climb/Dive: A, B
– Signal Strength: A, B
– Comm Time: B, C

CLASS
– Intel: B
– Bearing: *A, B
– Range: A, C
– Maneuver: B, C

INTENT
– Countermeasures: A, C
– Electronic War: A
– Missile Lock: A, B
– Response: *, C
– Threat: B, C

Identification of Hard Targets

[Figure: bar chart of hard targets correct (axis 180 to 220) for Agents vs. Control.]


Aiding Teams Helps more than Aiding Individuals for Hard Targets

[Figure: bar chart of hard targets correct (axis 180 to 230) for the Team Registry, Team Push, Individual Memory and Control conditions.]



MokSAF Collaborative Planning Task

• Lab Simulation : MokSAF lightweight agent-based planning environment using ModSAF terrain database and Retsina-like planner
• Characteristics : Deliberative, iterative & multiattribute
• Task : Self-Paced, Complex, Highly Dependent on Cooperation, Shared Information, Team Action
• Cognitive Demands: Complex problem-solving, requires multi-attribute negotiation among subjects
• Studies : Comparisons between autonomous, cooperative, and critiquing route planning agents

Payne, T., Sycara, K. and Lewis, M. “Varying the User Interaction within Multi-Agent Systems”, In Proceedings of the Fourth International Conference on Autonomous Agents, June 3-7, Barcelona, Spain, 2000, pp. 412-418.


Humans & Agents

Agents:

• have access to digital information in the infosphere

• cannot consider intangible objectives which are not part of that digital infosphere

Humans:

• Understand idiosyncratic and situation-specific factors
– local politics, non-quantified information, complex or vaguely specified mission objectives
• Dynamically changing situations
– Information, obstacles, enemy actions

Problem:

• To share and combine human and agent information and resources


MokSAF Display

[Figure: map display with legend entries Soil, River, Forest, Road, Building, Freeway, Start Point, Rendezvous Point, Constraint, Commander's route and Teammate's route.]


Experiments

• Map planning environment

• Teams of three subjects

• Three conditions

– Control (route critic) Agent

– Autonomous Planning Agent

– Cooperative Planning Agent

• Capability to express intangible constraints via physical artifacts on the map


Planning Routes


MokSAF: Autonomous Agent with user supplied constraints


Cooperative Agent/highlighter mode


Sharing Plans

• Subjects create individual routes to rendezvous point by
– drawing them
– asking agent to draw them
• When ready, subjects can share plans with other commanders
– all routes will appear on screen
• Can communicate with each other via typing into a comm program
– messages go to one commander or all commanders
– categorized by subject


Mission Objectives (Performance Measures)

• All platoons arrive at the specified rendezvous point within some agreed time frame
• Create an optimal route in terms of path length
• The route should not violate any physical constraints
• The route should not violate any social constraints (e.g., avoid this area because the roads are under construction)
• The route should pass through areas designated as “go-bys”
• Minimize sharing paths with other teammates
• The team should take the total number and types of units specified by the mission briefing.
– Too few units is worse than too many units.
– An exact match is best.


[Figure: two charts, Path Length and Route Times.]

Path Length, Route Times, and Fuel Usage were uniformly better for Aided Teams


Results: Vehicle Selection & Successful Rendezvous

On the more difficult Session 2 Rendezvous:
• Teams using the Cooperative RPA most closely approximated reference performance
• Teams using the Autonomous RPA made slightly less appropriate decisions
• Teams using the Route Critic Control performed poorly, sometimes failing to rendezvous

For the less difficult Session 3 Rendezvous:
• Performance retains ordering although differences are not significant


Errors in Vehicle Choice session 2

[Figure: bar chart of errors in vehicle choice (axis 6 to 13) for the Cooperative, Autonomous and Control conditions.]

Shuttle Launch

Several distributed range operators must collaborate to achieve a successful launch within the launch window or abort the mission in minimal time

Each range operator is:
• Responsible for monitoring a particular area in the launch zone
• Expected to negotiate with other range operators

Monitoring of several conditions, such as:
• There should be no civilian or military vehicles in the path of the shuttle, in case of falling debris
• The weather conditions need to be such that the exhaust plume does not fall on inhabited areas
• …


Shuttle Launch

Work environment is
• distributed
• time-critical
• information-rich
• communication-intensive

Increasingly, bottleneck on team performance is not availability of information, but limits on human capabilities: perception, cognition, attention


Supporting Human-Agent Teams in Shuttle Mission Launch


Approach

• Develop task models appropriate to the distributed workflow

• Develop cognitive models of key team members

• Develop software agents to support the team members and the team

• Evaluate the approach and resulting system


Evaluation

• Verification of task and cognitive models with human performance data

• Evaluate effectiveness of software agents using models and then through empirical testing in the laboratory and field settings

• Develop evaluation metrics to assess team performance


Range Operations & Space Launch Safety

Space Launch is an inherently risky business

• Many factors exist that could result in accidents

• However, US Ranges have an outstanding safety record due to:
– Safety systems designed to minimize risk and to validate protocol
– Protecting civilians by restricting access to areas of potential risk
– Monitoring environmental factors to determine safe launch parameters

Range Operations & Space Launch Safety

However,

• Existing systems are highly resource & expertise intensive
– Want to improve operations, maintain quality of service, but reduce cost.


Assisting Range Operations

Agents could assist Range Operations Teams

– Monitor team behavior/coordination to highlight emergent risk factors

– Provide assistance during range operations execution

To provide assistance, a model of the team task is required. Team-based Launch Scenario where team members:
– Assume responsibility for different cognitive tasks
– Are responsible for negotiating and managing shared resources
– Have to respond to unexpected events in a dynamic environment

MORSE Simulation Environment

• MORSE is a simulation environment designed to reproduce a time-critical, team-based task that provides a variable cognitive load to a human team

– Simulates the team-based task of launching a space vehicle

– Logs interaction between team members for the duration of the task

– Provides interfaces to set up and run experiments with various scenarios

– Provides interfaces for team members to focus attention on areas relevant to their responsibilities

• Network communication driven architecture can be extended to allow communication with external systems


Simulation Scenario for MORSE

Synopsis

• During the hours leading up to a space launch, three Range operators located at three different monitoring stations have to prepare for the launch. This involves:

– Monitoring environmental conditions such as the weather to determine its effect on the launch and the surrounding inhabited areas (monitor winds to determine plume dispersion)

– Monitoring the area within the anticipated flight path (Impact Lines)


Simulation Scenario for MORSE

– Allocating resources to prohibit incursions into the areas demarked by Impact Lines

– Determining if the launch should be aborted based on conditions at the time of launch

• The Range operators have access to shared, limited resources, and have to negotiate their allocation to maximize utility while minimizing cost


Team Objectives

Maximize safety, guarantee launch, yet minimize redundancy. Launch will be aborted if:
• Weather conditions are severe
• There is insufficient radar coverage of the launch path
• Civilian vehicles (air or water based) are within the IILs or African Gates
• Incursions are expected but interceptors are not in position


MORSE Stations

Three stations (each with a different coverage area):

• Cape Canaveral (area around launch site & coastline)

• Antigua (area around Caribbean and South American Coastline)

• Ascension (area over Atlantic Ocean)

Decision Making

• Each station is responsible for:
– Ensuring complete coverage of its area of responsibility
– Monitoring weather within its domain
– Negotiating with team members to acquire resources
– Communicating with team members to share gathered data in order to reduce mission cost

MORSE Station (Ascension Islands)

The MORSE Station Interface supports communication between team members, resource allocation, planning, etc

This example illustrates the interface (showing the Instantaneous Impact Lines of the current launch) for the Range operator stationed at the Ascension Islands.


Factors affecting the Scenario

• Wind (strength and direction)
– Wind Strength and Direction may change throughout scenario
– Wind Strength and Direction affect the dispersion of the plume.
– High temperatures can be a cause for aborting launch
• Space Launch Vehicle
– Determines the position of the Impact Lines and hence the area that must be covered


Factors affecting the Scenario

• Radar Stations
– Positions of Radar Stations ensure that maximum coverage is obtained by the users.
– Fewer radar stations make the scenario more difficult because incursions are harder to detect
• Incursions
– Initial incursions may be harmless: sea or air traffic that may clear zones by launch time
– Slow incursions on a course that will stay within the IIL zone till the launch will require escorting by interceptor units.
– Probability of incursions between scenarios is variable

Factors affecting the Scenario

• Interceptors
– Positions of available interceptor units
– Speeds of different units are variable and affect the ability of that unit to intercept incursions
– Fewer interceptors will make the mission harder
• Plume Dispersion
– Plume lines demark the anticipated dispersion of the plume and are affected by the wind speed and direction
– Subjects will need to carefully determine expected dispersion of plume by launch time


Units and Deployment

• Units are available at different locations

• Each unit can be deployed from its current position to a new position by a user that controls that unit

• Deployment of a unit entails:
– Acquiring that unit (by request if it is controlled by another user)
– Cost, dependent on unit
– Calculation of time required to reach destination


Units and Deployment

• Units include:
– Weather Balloons: unlimited
– Air Vehicles: several sizes
– Sea Vessels: several sizes
– Radar Stations (stationary): user determines which of these are manned to obtain information


Tasks available to the Subjects (1)

• Deploy Weather Balloons
– Weather balloons return the following information about the sector at which they are deployed
• Temperature
• Pressure
• Wind Speed
• Wind Direction
• Humidity
– Balloons take a finite amount of time to be deployed and hence there is a delay before data is returned
– Weather data returned by a balloon is available for 4 minutes (4 hours) after deployment


Tasks available to the Subjects (2)

• Deploy Interceptors
– A number of different ocean-going vessels and aircraft are available.
• Positions of these are established before the simulation starts
– Parameters of an interceptor include
• Max Speed: Interceptors always travel at this speed
• Position: This is the position of an interceptor and changes as it is deployed
• Range: This is the maximum distance that the vehicle can travel
• Scope: This is the scope of coverage (i.e. sea or air)
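
The four interceptor parameters are enough to sketch the "time required to reach destination" calculation mentioned under unit deployment. This is a minimal sketch, not the MORSE implementation: the planar coordinates, the units of measure, and the names `Interceptor`, `travel_time` and `can_intercept` are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Interceptor:
    name: str
    max_speed: float                 # distance units per hour (assumed unit)
    position: Tuple[float, float]    # planar (x, y) coordinates (assumed)
    max_range: float                 # maximum distance the vehicle can travel
    scope: str                       # "sea" or "air"


def travel_time(unit: Interceptor, target: Tuple[float, float]) -> Optional[float]:
    """Hours needed to reach `target`, or None if it lies beyond the unit's range."""
    distance = math.dist(unit.position, target)
    if distance > unit.max_range:
        return None
    # Interceptors always travel at their maximum speed.
    return distance / unit.max_speed


def can_intercept(unit, target, target_scope, hours_to_launch):
    """True if the unit covers this kind of incursion and arrives before launch."""
    if unit.scope != target_scope:
        return False
    hours = travel_time(unit, target)
    return hours is not None and hours <= hours_to_launch


cutter = Interceptor("cutter-1", max_speed=25.0, position=(0.0, 0.0),
                     max_range=200.0, scope="sea")
hours = travel_time(cutter, (30.0, 40.0))   # distance 50 at speed 25
```

A station could run such a check over every available unit before requesting one from a teammate, since acquiring a unit that cannot arrive in time only adds cost.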


Tasks available to the Subjects (3)

• Control Radar Stations
– Select appropriate radar stations
• If a radar station is located at a non-critical area then there may be no need to activate it
• If a radar is inactive then it may be activated immediately.
• If a radar is in use by another station then it may be requested


Morse Architecture

• Flow of the experiment is controlled by the MORSECommand
• MORSECommand models entire mission and simulation world
• MORSEStation displays a focused subset of the simulation world to each user

[Figure: one Morse-Command connected to three Morse-Stations. Labels: Scenario File, Weather Queries, Timing Synchronization, Incursion Information, Shared Information between Stations.]

Morse Command Station – Team Formation

This is the Initialization Page of the Morse Command Window.

Functions:
• Agent Registration
• Simulation Setup
• Team Setup
• Clock Initialization
• Simulation Control

Morse Command – Experiment Logging

This page in the MORSE Command logs the experiment events as they occur

Functions:
• Logs events such as activation of interceptors, radars, deployment of balloons etc.
• Logs can be saved to a file.


Morse Command – Incursion Generation

Functions:
• Maintains model of incursions in the simulation world
• Maintains current position/status of radars

This is the Incursion Information page.


Morse Command – Weather Modeling

This is the Weather Modeling Page of the Morse Command Window.

Functions:
• Models the Weather as a simple random variance around a pivot (Reference Value)
• Variance is parameterized and pivots can be specified
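
A "random variance around a pivot" weather model can be sketched as below. The slide does not say which distribution MORSE uses, so this sketch assumes a uniform spread of a given half-width around each reference value; the function name `sample_weather`, the quantity names and the numbers are hypothetical.

```python
import random


def sample_weather(pivots, spreads, rng):
    """Draw one weather reading per quantity.

    pivots:  dict of quantity -> reference value (the pivot)
    spreads: dict of quantity -> maximum deviation from the pivot
             (a uniform band is an assumption, not taken from the slides)
    rng:     a random.Random instance, so that runs can be reproduced
    """
    return {
        name: pivot + rng.uniform(-spreads[name], spreads[name])
        for name, pivot in pivots.items()
    }


pivots = {"temperature_c": 28.0, "wind_speed_kts": 12.0, "humidity_pct": 70.0}
spreads = {"temperature_c": 3.0, "wind_speed_kts": 5.0, "humidity_pct": 10.0}

rng = random.Random(2002)   # fixed seed: the same scenario replays identically
reading = sample_weather(pivots, spreads, rng)
```

Seeding the generator matters for experiments of this kind, because every team in a condition should face the same weather sequence.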


MORSE Command – Scenario Editor

The Graphical Scenario Editor can be used to design scenarios. Allows easy placement of units before the simulation starts



Performance Evaluation

Introduce score-keeping mechanism to provide team performance feedback for individual team members and team itself during the simulation

Based on:
• how efficiently resources are being used
• how team members coordinate activities
• how quickly the infeasibility of launch is recognised and the mission aborted


Project Status

Currently:

developing the simulation environment based on task knowledge provided by NASA

Next:

evaluate simulation environment

develop cognitive models

develop agents

study their effectiveness


Conclusions: How Agents might support human teams

• Leverage implementation & testing by supporting domain independent aspects of teamwork in a variety of contexts

• Acting as bridge between stove-piped systems (currently done by humans e.g. Tandem)

• Acting to reduce the friction of HCI (cooperative RPA engaged participants in problem solving in the domain rather than in operating the system as the autonomous RPA did)


Multiagent Negotiation


A General Negotiation Model

• Communicate (offers & counter-offers)

• Compute (based on prior knowledge & negotiation history)

• Repeat / Quit
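
The communicate / compute / repeat-or-quit cycle can be sketched as a loop between a buyer and a seller. This is an illustrative toy only: the fixed-step concession strategy, the midpoint settlement and the function name `negotiate` are assumptions, not a model from the talk.

```python
def negotiate(buyer_rp, seller_rp, buyer_start, seller_start,
              step=1.0, max_rounds=100):
    """Alternate offers until they cross, or until neither side can concede.

    buyer_rp / seller_rp: private reservation prices (buyer's maximum,
    seller's minimum). Returns (agreed price or None, offer history).
    """
    bid, ask = buyer_start, seller_start
    history = []
    for _ in range(max_rounds):
        history.append((bid, ask))            # communicate: exchange offers
        if bid >= ask:                        # offers cross: a deal is possible
            return (bid + ask) / 2, history   # midpoint split (assumed rule)
        # compute: each side concedes one step, never past its reservation price
        new_bid = min(bid + step, buyer_rp)
        new_ask = max(ask - step, seller_rp)
        if new_bid == bid and new_ask == ask:
            return None, history              # quit: nobody can move further
        bid, ask = new_bid, new_ask           # repeat
    return None, history


price, history = negotiate(buyer_rp=80, seller_rp=60,
                           buyer_start=40, seller_start=100, step=5)
no_deal, _ = negotiate(buyer_rp=50, seller_rp=70,
                       buyer_start=40, seller_start=100, step=5)
```

The "compute" step here ignores the negotiation history entirely; the learning models discussed later differ precisely in how that step uses prior knowledge and past offers.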


Literature Review

• Game theory

– Profit dividing model (Rubinstein & Stahl)

• Complete information

• Unique equilibrium

– K-double auction (Chatterjee & Samuelson)

• Incomplete information (buyer and seller know each other’s reservation price distribution)

• Bayesian belief update

• If the buyer’s offer b is greater than or equal to the seller’s offer s, then trade is possible

• But they may not make a deal even if they could


Literature Review

• Most multi-agent negotiation models belong to the K-double auction framework
– Personality model (Bazzan & Bordini)
– Mental emotion model (Sen et al.)
– Bayesian Learning (Zeng & Sycara)
• AI-based models
– Argumentation-based negotiation
– Experience-based negotiation


Desired characteristics of a Negotiation Model

• Support representation of negotiation context

• Be prescriptive

• Incur moderate computational cost

• Model the dynamics of negotiation

• Support learning from feedback in negotiation


The Bazaar Model

• Uses sequential decision making framework
• Players have knowledge about the environment and other players
• History of negotiation is also taken into account
• At each stage in the negotiation and for each non-terminal history, each player has a subjective probability distribution that represents the player’s knowledge at this stage


The Bazaar Model (cont)

In response to the most recent action taken by others, a player will:

1. Update his subjective evaluation of the environment and other players, using Bayesian rules (posterior probability calculation)

2. Select the action that maximizes his expected payoff, given the information available at the current stage


A Simple Example

Suppose that the buyer has two hypotheses about the supplier’s reservation price:

H1 = $100
H2 = $130

Suppose the buyer has no other knowledge about the supplier. Then, P(H1) = 0.5 and P(H2) = 0.5

Suppose the buyer also has domain knowledge that “The suppliers will typically ask a price above their reservation price by 17%”. So, P(e/H1) = 0.95 and P(e/H2) = 0.75, where e denotes the event that the supplier asks $117


A Simple Example

Now, suppose that the supplier does ask 117.00

Then the buyer uses Bayes rule to calculate P(H1/e) = 55.9% and P(H2/e) = 44.1%

Suppose the buyer adopts a simple strategy “Propose a price that is 10% less than the estimated reservation price of the supplier”.

Prior to receiving the supplier’s offer, the buyer's estimate of the supplier's reservation price is 115.00 (the mean of the prior distribution), so the buyer would have offered 103.50.

After receiving the offer and updating his beliefs, the buyer's estimate drops to about 113.25, so the buyer now offers about 101.9.
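
The belief update in this example can be reproduced with a few lines of arithmetic. The sketch below uses only the numbers given on the slides (priors of 0.5, likelihoods of 0.95 and 0.75); the function names `posterior` and `expected_value` are hypothetical helpers, not part of Bazaar.

```python
def posterior(priors, likelihoods):
    """Bayes rule over a finite set of hypotheses.

    priors:      dict hypothesis -> P(H)
    likelihoods: dict hypothesis -> P(e given H) for the observed event e
    """
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    evidence = sum(joint.values())            # P(e)
    return {h: joint[h] / evidence for h in joint}


def expected_value(beliefs):
    """Mean of a distribution whose hypotheses are prices."""
    return sum(price * p for price, p in beliefs.items())


# Hypotheses are the candidate reservation prices of the supplier.
priors = {100.0: 0.5, 130.0: 0.5}
likelihoods = {100.0: 0.95, 130.0: 0.75}      # P(supplier asks $117 given H)

beliefs = posterior(priors, likelihoods)
prior_estimate = expected_value(priors)       # estimate before the offer
updated_estimate = expected_value(beliefs)    # estimate after the offer
```

With unrounded posteriors the updated estimate comes out slightly below the slide's 113.25; the difference is a rounding effect and does not change the direction of the update: seeing the $117 ask shifts belief toward the lower reservation price.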


Experimental Design

• A buyer and a supplier
• Reservation price (RP) is private information
• The agents try to estimate the other player’s RP
• Range of possible actions is the integers within [0, 100]
• Each player’s utility is linear in the final price
• Each agent proposes strictly monotonically.
• Each agent has different initial subjective belief functions


Experimental Design

Three conditions:
• Neither one learns
• Both learn
• Buyer learns, supplier does not (game is symmetric)
• For each condition, we ran 500 randomly generated negotiation scenarios
• Evaluation criterion: the normalized joint Nash solution (max is 0.25)


Average Performance of Three Experimental Configurations in Bazaar

• A non-learning agent makes decisions based solely on his own reservation price

• A learning agent makes decisions based on both the agent's own and the opponent's reservation price

Zeng, D. and Sycara, K. "Bayesian Learning in Negotiation", International Journal of Human-Computer Studies, Vol 48, pp. 125-141, 1998.

Configuration        Joint Utility   Buyer’s Utility   Supplier’s Utility   # of Proposals Exchanged
Both Learn           0.22            0.49              0.51                 24
Neither Learn        0.18            0.49              0.51                 34
Only Buyer Learns    0.15            0.59              0.41                 28


Evaluating Belief Updating Methods

• A variant of K-double auction model

• No Bayesian update

• Take finite bargaining time into consideration

• Provide a set of belief updating methods for the agent’s human master to choose from

• Easy implementation


Evaluating Belief Updating Methods

• Finite bargaining time

• DP like offering strategy

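[Equations: the agent's optimal offer x* is defined as an arg max, and the expected profit f_t(x) as the corresponding max, of an expression combining the belief Pr(a) with the expected profit f.]

x: Agent’s offer

f_t(x): Expected profit at time t if offer x

Pr(a): Agent’s belief that his opponent will offer a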


Belief Updating Methods

• Negotiation range: [P_min, P_max]

• Buyer and seller’s offers at time t: b_t, S_t

• Buyer’s updating method (Seller’s is similar): Uniform, Exp (1) or Exp (2), each applied over either [b_t, S_t] or [P_min, S_t]


Belief Updating Methods

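• Two exponential updating methods (over [b_t, S_t])

[Equations: in both methods Pr(a) is 1/Z times an exponential whose argument involves the times t and T and the difference between a and S_t in method (1), or between a and b_t in method (2).]

T: Finite bargaining time

Z: Normalization factor

Control parameter

[Figure: sketches of Pr(a) between b_t and S_t for methods (1) and (2).]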


Intuition (buyer’s viewpoint)

• Seller’s value is higher than my current offer b_t: update over interval [b_t, S_t]

• I may have over-bid: update belief over interval [P_min, S_t]

[Figure: number line showing P_min, b_t, S_t and P_max, with the intervals [P_min, S_t] and [b_t, S_t] marked.]


Intuition (buyer’s viewpoint)

• I believe the seller is not likely to move back from his current offer S_t: exp method (2)

• There is still enough negotiation space: exp method (1) (does not trust the seller)


Numerical Experiments

• Negotiation range [0, 100]

• Fix buyers’ reservation price to 100

• In different experiments, increase seller’s reservation price from 0 to 100


Some Results (1)


Some Results (2)


Some Results (3)


Some Results (4)


Some Results (5)


Some Results (6)


Some Results (7)


Some Results (8)


Conclusion

• It is hard to interpret your opponent’s behavior in bargaining
– general knowledge about the environment
– specific knowledge about your opponent
• We leave this task to the agent’s human master
• We provide a computational model for humans to control their agents’ negotiation behavior


Work on Coalitions

Yamamoto, J. and Sycara, K. “A Stable and Efficient Buyer Coalition Scheme for e-Marketplaces”, Proceedings of the Fifth International Conference on Autonomous Agents, May 28-June 1, Montreal, CA, 2001.

Li, C. and Sycara, K. “Algorithms for Coalition Formation and Payoff Division in e-Marketplace”, Proceedings of the International Conference on Autonomous Agents and Multiagent Systems, Bologna, Italy, July 15-19, 2002.


Outline of the GroupBuyAuction scheme

[Figure: a camera group. Sellers post a price schedule for camera A and a price schedule for camera B. Buyers place bids such as “I want A for $400 or lower”, “I want B for $700 or lower” and “I want A for $500 or lower, or B for $600 or lower”. Bids are grouped into the camera A coalition and the camera B coalition.]


Coalitions

A buyers’ coalition is a group of buyers that want to buy the same item.
• Buyers in a coalition may pay different prices for the same item, depending on their reservation prices

• Desired goals for the coalition are:
– Increase the number of buyers who can purchase items
– Increase group utility and individual buyers’ utility
– Divide the total utility among buyers in a fair and stable way


An Example

Buyers:
• b0: {(item0, 100), (item2, 70)}
• b1: {(item0, 80), (item1, 95), (item2, 95)}
• b2: {(item1, 95)}
• b3: {(item1, 65)}
• b4: {(item1, 85), (item2, 95)}

Price schedule (for the example, assume all 3 items have the same price schedule): one unit: 100, two units: 95, three units: 90, etc.

Possible coalitions:
• item0: ({b0})
• item1: ({b1,b2}, {b1,b2,b4}, {b1,b2,b3,b4})
• item2: ({b1,b4})

Our scheme derives:
• item0 ({b0}): b0 pays 100
• item1 ({b1,b2,b4}): b1 pays 92.5, b2 pays 92.5, b4 pays 85
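The numbers in this example can be checked with a few lines of Python. This is only an illustrative sketch: the names `buyers`, `unit_price` and `coalition_surplus` are hypothetical, and the slide's "etc." is assumed to mean that the schedule keeps dropping by 5 per additional unit.

```python
# Reservation prices from the example: buyer -> {item: reservation price}.
buyers = {
    "b0": {"item0": 100, "item2": 70},
    "b1": {"item0": 80, "item1": 95, "item2": 95},
    "b2": {"item1": 95},
    "b3": {"item1": 65},
    "b4": {"item1": 85, "item2": 95},
}


def unit_price(n_units):
    """Per-unit price for n_units (assumed schedule: 100, 95, 90, 85, ...)."""
    return 100 - 5 * (n_units - 1)


def coalition_surplus(item, members):
    """Sum of the members' reservation prices minus the total price paid."""
    total_reservation = sum(buyers[b][item] for b in members)
    return total_reservation - len(members) * unit_price(len(members))


for members in (["b1", "b2"], ["b1", "b2", "b4"], ["b1", "b2", "b3", "b4"]):
    print(members, coalition_surplus("item1", members))
```

Under these assumptions only {b1,b2,b4} has a positive surplus for item1, and the payments listed on the slide (92.5 + 92.5 + 85) add up to the coalition's total price of 3 × 90.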


Approach to Coalition Formation

Principle 1: Maximize the utility of the most valuable coalition, then maximize the utility of the second most valuable one, and continue recursively.

Principle 2: Distribute the surplus of each coalition within the coalition in a stable way.

• Our coalition formation algorithm is a variant of the weighted set packing problem: O(2^n), where n is the number of buyers
• If we assume that the number of items in a category is bounded above by an integer K, independent of n, then the complexity is O(n log n)
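Principle 1 can be sketched as a greedy loop. The code below is not the algorithm from the cited paper; it is a simplified illustration that assumes each buyer wants one unit, reuses the data of the earlier example slide, and assumes the price schedule 100, 95, 90, ... continues by steps of 5. All function names are hypothetical.

```python
# Reservation prices from the example slide: buyer -> {item: reservation price}.
BUYERS = {
    "b0": {"item0": 100, "item2": 70},
    "b1": {"item0": 80, "item1": 95, "item2": 95},
    "b2": {"item1": 95},
    "b3": {"item1": 65},
    "b4": {"item1": 85, "item2": 95},
}
ITEMS = ["item0", "item1", "item2"]


def unit_price(n_units):
    """Assumed continuation of the schedule 100, 95, 90, ..."""
    return 100 - 5 * (n_units - 1)


def best_coalition(item, buyers):
    """Return (surplus, members) of the best coalition for one item.

    For each coalition size, the best choice is the buyers with the highest
    reservation prices. Returns (None, []) if nobody wants the item.
    """
    bids = sorted(
        ((prices[item], name) for name, prices in buyers.items() if item in prices),
        reverse=True,
    )
    best_surplus, best_members = None, []
    for size in range(1, len(bids) + 1):
        chosen = bids[:size]
        surplus = sum(price for price, _ in chosen) - size * unit_price(size)
        if best_surplus is None or surplus > best_surplus:
            best_surplus = surplus
            best_members = sorted(name for _, name in chosen)
    return best_surplus, best_members


def greedy_configuration(buyers, items):
    """Pick the most valuable coalition, remove its buyers, and repeat."""
    remaining = dict(buyers)
    configuration = []
    while remaining:
        candidates = []
        for item in items:
            surplus, members = best_coalition(item, remaining)
            if members and surplus >= 0:
                candidates.append((surplus, item, members))
        if not candidates:
            break  # nobody left can buy without exceeding a reservation price
        surplus, item, members = max(candidates, key=lambda c: c[0])
        configuration.append((item, members, surplus))
        for name in members:
            del remaining[name]
    return configuration


print(greedy_configuration(BUYERS, ITEMS))
```

On the example data this sketch selects {b1,b2,b4} for item1 first and then {b0} for item0, which matches the coalitions reported on the example slide.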


Buyers’ Utility

$b_k$: a buyer; $g_i$: an item; $r_{ki}$: the reservation price of buyer $b_k$ for $g_i$; $C_i$: a buyer coalition to purchase $g_i$.

$v_i(k) = r_{ki} - p_k$: the utility of buyer $b_k$ gained from buying $g_i$ at the price $p_k$.

$v_i(C_i) = \sum_{b_k \in C_i} v_i(k)$: the utility of a buyer coalition gained from buying $g_i$.


Coalition Configuration Algorithm

[Figure: the camera group example. Sellers post price schedules for camera A and camera B; buyers bid "I want B for $700 or lower", "I want A for $400 or lower", and "I want A for $500 or lower, or B for $600 or lower"; the bids are grouped into the camera A coalition and the camera B coalition.]

(1) Maximize the utility of the most valuable coalition.

(2) Then maximize the utility of the second most valuable coalition, and continue recursively...


Surplus Sharing Rule in a Coalition

Distribute the surplus of each coalition within the coalition.

Coalition $C_i = \{b_0, b_1, …, b_5\}$

[Figure: for each buyer b0 to b5, the reservation price is split into the price to pay and a share of the surplus; the price axis marks the total price for $C_i$ to pay and the surplus $v_i(C_i)$.]


Stability of the Surplus Sharing Rule

Proposition: For any coalition $C_i$, the surplus distribution is in the core of the coalitional game with transferable payoff $\langle C_i, v_i \rangle$.

No subset of buyers in a coalition can obtain utility that exceeds the sum of the current utility of the members in the subset.
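The core condition can be tested by brute force on a small coalition. The sketch below uses the item1 coalition {b1,b2,b4} from the earlier example; it assumes the price schedule 100, 95, 90, ... and assumes that a subset of buyers may decline to buy, so its value is never negative. The function names are hypothetical.

```python
from itertools import combinations

# item1 reservation prices of the coalition members in the earlier example.
reservation = {"b1": 95, "b2": 95, "b4": 85}


def unit_price(n_units):
    """Assumed price schedule: 100, 95, 90, ..."""
    return 100 - 5 * (n_units - 1)


def value(members):
    """Surplus a group can secure on its own (0 if buying would lose money)."""
    members = list(members)
    if not members:
        return 0
    surplus = sum(reservation[b] for b in members) - len(members) * unit_price(len(members))
    return max(0, surplus)


def in_core(payoff, tol=1e-9):
    """True if the payoff vector splits the full value and no subset can do better."""
    everyone = list(payoff)
    if abs(sum(payoff.values()) - value(everyone)) > tol:
        return False
    for size in range(1, len(everyone)):
        for subset in combinations(everyone, size):
            if sum(payoff[b] for b in subset) < value(subset) - tol:
                return False
    return True


# Utility of each buyer = reservation price minus the price paid on the example slide.
paid = {"b1": 92.5, "b2": 92.5, "b4": 85}
payoff = {b: reservation[b] - paid[b] for b in paid}
print(payoff, in_core(payoff))
```

The check enumerates every subset, so it is exponential in the coalition size and only suitable for small illustrations like this one.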


Effectiveness in Increasing Buyers’ Benefits

- Simulate buyers’ behaviors under several conditions with three group buying schemes:
(1) our scheme,
(2) a traditional scheme,
(3) an optimal scheme.

- Compare the three schemes using the evaluation criteria:
(a) the group’s total utility,
(b) the number of buyers who can obtain items.

- Assume that a buyer randomly selects preferred items and reservation prices; they are not affected by others.


Simulation Results

Summary of Simulation Results

(1) Our scheme performed better than the traditional scheme under most conditions.

(2) Our scheme performed close to the optimal scheme under most conditions that the optimal scheme could handle.


Simulation Results

Examples of simulation results.

[Figure: two plots comparing our scheme, a traditional scheme, and an optimal scheme. Horizontal axis in both: how steeply the volume discount price decreases. Vertical axes: the group’s total utility, and the number of buyers who get items.]

Parameters: The number of items = 3, The number of buyers = 50, …….


Optimal coalition formation in Combinatorial Auctions

• Coalition formation allows buyers to enjoy volume discounts
• In combinatorial auctions buyers place bids for bundles of items.
• How to form an “optimal” combinatorial coalition of buyers?
• What is a “fair” mechanism to distribute the profit among members of the coalition?

Li, C. and Sycara, K. “Algorithms for Coalition Formation and Payoff Division in e-Marketplace”, Proceedings of the International Conference on Autonomous Agents and Multiagent Systems, Bologna, Italy, July 15-19, 2002.


Let’s see an example

Sellers: I’m happy to sell more goods with a lower price.

Price per unit      1 unit    2 units
Cell phone          $500      $405
Service program     $50       $40

Buyer 1: 1 cell phone + 1 service program, bids $450
Buyer 2: 1 cell phone + 1 service program, bids $450

Buyers: Only a cell phone or service program means nothing to me!


Coalition formation

            Cell phone    Service program
Buyer 1     $405          $45
Buyer 2     $400          $50
Sum         $805          $95
            < $405*2      > $40*2

Let’s take advantage of the price discounts…


Combinatorial bidding

Buyer 1 I bid $450 < $550

Buyer 2 I bid $450 < $550

I want all of m units of a and n units of b (and …) for no more than r.


Combinatorial coalition formation

Buyer 1 I bid $450

Buyer 2 I bid $450

Combinatorial bidding + Coalition formation

Sum: $900 > ($405+$40)*2
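The three cases of the cell phone example (per-item coalitions, individual combinatorial bids, and combinatorial coalition formation) reduce to a few comparisons. The sketch below only replays the arithmetic from the slides; the variable and function names are illustrative.

```python
# Per-unit prices from the example, keyed by number of units bought.
PHONE = {1: 500, 2: 405}
SERVICE = {1: 50, 2: 40}


def bundle_cost(n_buyers):
    """Total price of n_buyers bundles (one phone and one service program each)."""
    return n_buyers * (PHONE[n_buyers] + SERVICE[n_buyers])


bid = 450  # each buyer's reservation cost for the whole bundle

# Per-item coalitions: each buyer splits the $450 between the two items.
split = {"Buyer 1": (405, 45), "Buyer 2": (400, 50)}
phone_ok = sum(p for p, _ in split.values()) >= 2 * PHONE[2]      # 805 vs 810
service_ok = sum(s for _, s in split.values()) >= 2 * SERVICE[2]  # 95 vs 80

# Combinatorial bidding alone: one buyer against single-unit prices.
alone_ok = bid >= bundle_cost(1)  # 450 vs 550

# Combinatorial bidding plus coalition formation.
ccf_ok = 2 * bid >= bundle_cost(2)  # 900 vs 890

print(phone_ok, service_ok, alone_ok, ccf_ok)
```

Only the last case succeeds for the full bundle: the phone coalition fails by $5 under the split bids, and a lone bidder is $100 short.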


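Combinatorial Coalition Formation (CCF)

[Figure: Buyer 1, Buyer 2, …, Buyer N, with reservation costs $r_1, r_2, …, r_N$, are linked to Item 1, Item 2, …, Item K, with prices $p_1, p_2, …, p_K$; each link is labeled with a requested quantity $q$ indexed by buyer and item.]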


Questions

• How to form an “optimal” coalition of buyers?

• What is a “fair” mechanism to distribute the profit among members of the coalition?


Literature review

• Economics
– Payoff division of coalitions: Osborne and Rubinstein [94]
– Mechanism design of combinatorial auctions: Bykowsky [95], Rassenti [82]

• Computer science
– Coalition formation: Yamamoto and Sycara [01], Lerman [00], Sen [00], Shehory [99], Sandholm [97]
– Winner determination: Sandholm [99], Fujishima [99], Andersson [00], Wurman [00], etc.


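Problem formulation

• Maximize the value of the coalition:

$C^* = \arg\max_{C \subseteq B} v(C)$, where $v(C) = \sum_{b_n \in C} \bigl( r_n - \sum_{k=1}^{K} q_n^k \, p_k(q_C^k) \bigr)$

Here $q_C^k$ is read as the total quantity of item $k$ requested by the members of $C$.

• Divide the payoff in the core (no members can get a better payoff by deviating from the coalition):

$\sum_{b \in C} x_C(b) = v(C)$ and $v(C') \le x_C(C')$ for all $C' \subseteq C$, where $x_C(C') = \sum_{b \in C'} x_C(b)$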
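For a handful of buyers the maximization can be done by exhaustive search, which makes the definition of $v(C)$ concrete. This is a brute-force illustration, exponential in the number of buyers, and not the algorithm of the cited paper. The data reuse the cell phone example; the function names, and the reading of $q_C^k$ as the coalition's total quantity of item $k$, are assumptions.

```python
from itertools import combinations


def coalition_value(coalition, reservation, quantity, price):
    """v(C): sum of reservation costs minus the cost of all requested units.

    price[k] maps the coalition's total quantity of item k to a per-unit price.
    """
    total_cost = 0
    for item, price_fn in price.items():
        total_qty = sum(quantity[b].get(item, 0) for b in coalition)
        if total_qty > 0:
            total_cost += total_qty * price_fn(total_qty)
    return sum(reservation[b] for b in coalition) - total_cost


def optimal_coalition(buyers, reservation, quantity, price):
    """Try every non-empty subset; the empty coalition (value 0) is the fallback."""
    best, best_value = (), 0
    for size in range(1, len(buyers) + 1):
        for coalition in combinations(buyers, size):
            v = coalition_value(coalition, reservation, quantity, price)
            if v > best_value:
                best, best_value = coalition, v
    return best, best_value


# Cell phone example: each buyer wants one phone and one service program for $450.
reservation = {"b1": 450, "b2": 450}
quantity = {"b1": {"phone": 1, "service": 1}, "b2": {"phone": 1, "service": 1}}
price = {
    "phone": lambda units: {1: 500, 2: 405}[units],
    "service": lambda units: {1: 50, 2: 40}[units],
}

print(optimal_coalition(["b1", "b2"], reservation, quantity, price))
```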


Assumptions

• Items are sold in fixed price schedules

• Buyers tell the truth about their reservation costs

• A partial bundle has value zero

• One-shot winner determination


Main Idea

• Price dominates the decision for coalition formation

• Use divide and conquer to search for the optimal coalition

– For each item, find its optimal sub-coalition

– Apply transfer of reservation cost/price between optimal sub-coalitions to get the optimal coalition

• Approximate algorithm for optimal coalition by considering only greedy transfer of reservation costs


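Some concepts

• Reservation cost division: $r_n = \sum_{k=1}^{K} r_n^k$

            $r_n$    $r_n^1$    $r_n^2$
Buyer 1     450      405        45
Buyer 2     450      400        50

• Subcoalition: $v_k(C) = \sum_{b_n \in C} \bigl( r_n^k - q_n^k \, p_k(q_C^k) \bigr)$

In the example, $C_1^* = \emptyset$ and $C_2^* = \{b_1, b_2\}$.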


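Some concepts (ctd.)

• Compatible

$C_1^* = \{b_1, b_2\}$ and $C_2^* = \{b_1, b_2\}$: YES
$C_1^* = \{b_1\}$ and $C_2^* = \{b_1\}$: YES
$C_1^* = \emptyset$ and $C_2^* = \emptyset$: YES
$C_1^* = \{b_1, b_2\}$ and $C_2^* = \{b_2\}$: NO
$C_1^* = \{b_1\}$ and $C_2^* = \{b_2\}$: NO
$C_1^* = \emptyset$ and $C_2^* = \{b_1, b_2\}$: NO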
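The six cases above are consistent with the following reading of compatibility, which is an assumption rather than a definition quoted from the slides: every buyer is either in the subcoalition of every item in his bundle, or in none of them. A minimal sketch with hypothetical names:

```python
def compatible(subcoalitions, bundles):
    """subcoalitions: item -> set of buyers; bundles: buyer -> set of wanted items.

    Assumed definition: no buyer is a member for some items of his bundle
    and a non-member for others.
    """
    for buyer, items in bundles.items():
        memberships = {buyer in subcoalitions[item] for item in items}
        if len(memberships) > 1:  # both True and False occur
            return False
    return True


# Both buyers want item 1 and item 2, as in the cell phone example.
bundles = {"b1": {1, 2}, "b2": {1, 2}}

cases = [
    ({1: {"b1", "b2"}, 2: {"b1", "b2"}}, True),
    ({1: {"b1"}, 2: {"b1"}}, True),
    ({1: set(), 2: set()}, True),
    ({1: {"b1", "b2"}, 2: {"b2"}}, False),
    ({1: {"b1"}, 2: {"b2"}}, False),
    ({1: set(), 2: {"b1", "b2"}}, False),
]
for subcoalitions, expected in cases:
    print(compatible(subcoalitions, bundles), expected)
```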


Approach

[Flowchart]
1. Reservation cost division
2. Subcoalition formation & payoff division
3. Compatible?
– No: reservation cost transfer, then repeat
– Yes: derive the CCF coalition & payoff division


Why subcoalitions?

Claim 2: If the optimal subcoalitions are compatible, then the derived coalition is optimal.

If each subcoalition distributes the payoff in the core, then the payoff division obtained by summing up each buyer’s payoffs in the subcoalitions is in the core of the derived coalition.


Go back to the example…

            $r_n$    $r_n^1$    $r_n^2$
Buyer 1     450      403        47
Buyer 2     450      411        39

$C_1^* = \{b_1, b_2\}$ and $C_2^* = \{b_1, b_2\}$ are compatible, so $C^* = \{b_1, b_2\}$.

$x_1(C_1^*) = \{0, 4\}$, $x_2(C_2^*) = \{6, 0\}$, so $x(C^*) = \{6, 4\} \in Core(C^*)$
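The payoff vectors in this example can be replayed numerically. The sketch below assumes the two-unit prices of the cell phone example ($405 and $40) and reads the slide's payoffs as (Buyer 1, Buyer 2) pairs: (0, 4) for the phone subcoalition and (6, 0) for the service subcoalition. That pairing is an interpretation of the extracted slide, and the names are illustrative.

```python
# Per-unit prices when two units are bought (from the cell phone example).
unit_price_for_two = {"phone": 405, "service": 40}

# Reservation cost division from the table above.
division = {
    "b1": {"phone": 403, "service": 47},
    "b2": {"phone": 411, "service": 39},
}

# Subcoalition payoffs as read from the slide (assumed order: b1, b2).
sub_payoff = {
    "phone": {"b1": 0, "b2": 4},
    "service": {"b1": 6, "b2": 0},
}


def subcoalition_surplus(item):
    """Divided reservation costs for the item minus the price of two units."""
    return sum(division[b][item] for b in division) - 2 * unit_price_for_two[item]


# Each buyer's payoff in the derived coalition is the sum over subcoalitions.
total_payoff = {b: sum(sub_payoff[item][b] for item in sub_payoff) for b in division}

print(subcoalition_surplus("phone"), subcoalition_surplus("service"), total_payoff)
```

Each subcoalition's payoffs add up to its surplus (4 and 6), and the summed payoffs add up to the derived coalition's value of $900 - $890 = $10.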


Existence of compatible optimal subcoalitions

Linear price function: $p_k(m)$ is linear in the number of units $m$, with coefficients $d_k$ and $a_k$

Claim 3: Suppose the price functions are linear. Then there exists a reservation cost division such that the optimal subcoalitions are compatible.

From now on, the focus is on systems with linear price functions…


Need to solve …

• Q1: How to efficiently form an optimal subcoalition
• Q2: How to distribute the payoff in the core of the subcoalitions
• Q3: How to transfer the virtual reservation cost among items to make the optimal subcoalitions compatible
• Q4: How to construct an approximation algorithm in polynomial time


Q1: Optimal Subcoalition Formation…

• In Yamamoto and Sycara, we presented an efficient and accurate algorithm for coalition formation for single-unit items. This algorithm was extended to coalition formation for multiple units.


Q2: Subcoalition payoff division

Can be realized in $O(K N \log N)$


Q3: Reservation cost transfer scheme

Check the buyers one by one. If a sub-coalition is not compatible with respect to buyer b, then redistribute the reservation cost of b.

Converges to a set of compatible optimal subcoalitions


Q4: Approximation Algorithm

• Use the heuristic: once a buyer has been excluded from all subcoalitions, there is a very small possibility that he will be included in the optimal coalition.

• Therefore, discard the buyer from the buyer set.

• This results in a polynomial time algorithm


Experiment: instance generation

• System scale: number of buyers and items

• System characteristics:
– DS (Discount Slope)
– RBMI (the Ratio of Buyers preferring Multiple Items)
– RBBR (the Ratio of Buyers Bidding at the Retail Prices)


Experiment: numerical result


Research Results

• Developed a polynomial time approximation algorithm for CCF (coalition formation is NP-complete)

• Good ratio to the optimal value by experimental results

• Payoff division scheme in the core of the coalition, guaranteeing coalition stability


Multiunit Double Auctions: Design goals

• Efficient
– Maximizes the collective profit of all the participating agents

• Strategy-proof
– Induces agents to honestly report their private information

• Budget-balanced
– The market does not need to be subsidized by outside sources

• Individually rational
– Agents will voluntarily attend the market because of expected positive profit


Design goals (cont)

• We cannot achieve all four goals at the same time
• For MDAs, we also need to consider the volume issues
• Trade-offs
– Asymptotically efficient
– Strategy-proof in price
– Weakly budget-balanced
– Individually rational
– Hard for sellers to influence market price by misreporting volumes


The Mechanism

• Two-side Vickrey-like auction

• Balance the supply volume and the demand volume

• Main result
– If the buyers’ and sellers’ volumes are public information, the above mechanism is strategy-proof with respect to reservation price, weakly budget-balanced, and individually rational.


Sellers’ volume strategy

• Sellers may drive the market price up by tightening the supply volume

• Though possible, it is hard for sellers to do so because of the information disclosure rule of our market
– Only sellers with index j<L can do so
– Sellers do not know how much to under-report
  • Lack of information about the whole market
  • Gaming between sellers


Efficiency

• Market loss
– Market value lost between buyer K and seller L (part A and part B in the figure)
– Market value lost in order to balance the supply and demand volumes


Efficiency

• Main results
– Given that the number of agents who successfully trade is large, the market is asymptotically efficient
– Under some weak assumptions, given that the number of agents, whether they trade or not, is large, the market is asymptotically efficient


Conclusions

• Agents are becoming a reality
• One of the killer applications is going to be the deployment of agents as the future generation of Web Services
• Remaining open issues
– Scalability of coordination
– Predictability of overall results of a MAS
– Agent trust
– Semantic interoperation
– Human delegation
– Agent customization


Reference slides

Sycara, K., Klusch, M. Widoff, S. and Lu, J. "LARKS: Dynamic Matchmaking among Heterogeneous Agents in Cyberspace", Journal of Autonomous Agents and Multiagent Systems, vol 5, no. 2, July 2002.

Yamamoto, J, and Sycara, K. “A Stable and Efficient Buyer Coalition Scheme for e-Marketplaces” Proceedings of the Fifth International Conference on Autonomous Agents, May 28-June 1, Montreal, CA. 2001.

Li, C. and Sycara, K. “Algorithms for Coalition Formation and Payoff Division in e-Marketplace”, Proceedings of the International Conference on Autonomous Agents and Multiagent Systems, Bologna, Italy, July 15-19, 2002.

Payne, T., Sycara, K. and Lewis, M. “Varying the User Interaction within Multi-Agent Systems” , In Proceedings of the Fourth International Conference on Autonomous Agents, June 3-7, Barcelona, Spain, 2000. pp 412-418


References

Lenox T., Hahn, S., Lewis M., Payne T. and Sycara, K. “Agent Based Aiding for Individual and Team Planning Tasks”, IEA 2000/HFES 2000 Congress.

Paolucci, M., Onn Shehory and Sycara, K., “Interleaving Planning and Execution in a Multiagent Team Planning Environment”. In the Journal of Electronic Transactions of Artificial Intelligence, May 2001.

Decker, K., Sycara, K. and Williamson, M. "Middle-Agents for the Internet", Proceedings of the Fifteenth International Joint Conference on Artificial Intelligence (IJCAI-97), Nagoya, Japan, August 1997 pp. 578-584.

Wong, C. and Sycara, K. “A Taxonomy of Middle Agents for the Internet” In Proceedings of the Fourth International Conference on Multiagent Systems, July 10-12, Boston MA., 2000 pp. 465-466.

Huang, P., Scheller-Wolf, A. and Sycara, K. “Design of a Multi-Unit Double Auction Market”, Computational Intelligence, Vol. 18, No. 4, 2002 (Special issue on Agent Technology for Electronic Commerce) .

Sycara, K. and Lewis, M. “Integrating Agents into Human Teams”, In Salas E. (ed.) Team Cognition, Erlbaum Publishers, 2002.
