control coordination of multiple agents ...control coordination of multiple agents through decision...

187
AFRL-IF-RS-TR-2003-21 Final Technical Report February 2003 CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS Stanford University APPROVED FOR PUBLIC RELEASE; DISTRIBUTION UNLIMITED. AIR FORCE RESEARCH LABORATORY INFORMATION DIRECTORATE ROME RESEARCH SITE ROME, NEW YORK

Upload: others

Post on 25-Sep-2020

7 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

AFRL-IF-RS-TR-2003-21

Final Technical Report February 2003 CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS Stanford University

APPROVED FOR PUBLIC RELEASE; DISTRIBUTION UNLIMITED.

AIR FORCE RESEARCH LABORATORY INFORMATION DIRECTORATE

ROME RESEARCH SITE ROME, NEW YORK

Page 2: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

This report has been reviewed by the Air Force Research Laboratory, Information Directorate, Public Affairs Office (IFOIPA) and is releasable to the National Technical Information Service (NTIS). At NTIS it will be releasable to the general public, including foreign nations. AFRL-IF-RS-TR-2003-21 has been reviewed and is approved for publication.

APPROVED: JOHN F. LEMMER Project Engineer

FOR THE DIRECTOR: JAMES W. CUSACK, Chief Information Systems Division Information Directorate

Page 3: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

REPORT DOCUMENTATION PAGE Form Approved

OMB No. 074-0188 Public reporting burden for this collection of information is estimated to average 1 hour per response, including the time for reviewing instructions, searching existing data sources, gathering and maintaining the data needed, and completing and reviewing this collection of information. Send comments regarding this burden estimate or any other aspect of this collection of information, including suggestions for reducing this burden to Washington Headquarters Services, Directorate for Information Operations and Reports, 1215 Jefferson Davis Highway, Suite 1204, Arlington, VA 22202-4302, and to the Office of Management and Budget, Paperwork Reduction Project (0704-0188), Washington, DC 20503 1. AGENCY USE ONLY (Leave blank)

2. REPORT DATE FEBRUARY 2003

3. REPORT TYPE AND DATES COVERED Final Jun 98 – Jun 00

4. TITLE AND SUBTITLE CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS

6. AUTHOR(S) Yoav Shoham

5. FUNDING NUMBERS C - F30602-98-C-0214 PE - 63760E PR - AGEN TA - T0 WU - 01

7. PERFORMING ORGANIZATION NAME(S) AND ADDRESS(ES) Stanford University Office of Sponsored Research 320 Panama Street Stanford California 94305-4631

8. PERFORMING ORGANIZATION REPORT NUMBER

9. SPONSORING / MONITORING AGENCY NAME(S) AND ADDRESS(ES) Air Force Research Laboratory/IFSF 525 Brooks Road Rome New York 13441-4505

10. SPONSORING / MONITORING AGENCY REPORT NUMBER

AFRL-IF-RS-TR-2003-21

11. SUPPLEMENTARY NOTES AFRL Project Engineer: John F. Lemmer/IFSF/(315) 330-3657/ [email protected]

12a. DISTRIBUTION / AVAILABILITY STATEMENT APPROVED FOR PUBLIC RELEASE; DISTRIBUTION UNLIMITED.

12b. DISTRIBUTION CODE

13. ABSTRACT (Maximum 200 Words) The control and coordination of multi-agent systems is a major scientific and technological challenge. In settings where agents must act in flexible, hostile, distributed environments-such as those faced in military domains-the design of effective techniques for control coordination, competition, and adaptation becomes a task of great importance. There has been growing interest in the application of methods and approaches from economics to these problems; however, traditional economic methods lack many ingredients that are essential to make them useful for large-scale computational multi-agent systems. In our work we tackle some of these basic issues. In particular, we address: 1. the allocation of complementary and substitutable tasks to self-interested agents; 2. adaptation in hostile environments; 3. coordination for the assignment of a task among self interested bidders; 4. computationally-motivated representations of economic interactions; and 5. updating agents' beliefs after receiving new information. Our objective is therefore to introduce economic methods into the context of control and coordination of multi-agent systems, while generalizing and extending these methods to become efficient and effective. An important part of our approach is the identification and management of deep computational problems. We also present new theories which are essential for any flexible and dynamic practical multi-agent system.

15. NUMBER OF PAGES187

14. SUBJECT TERMS Combinatorial Auctions, Reinforcement Learning, Mechanism Design, Belief Revision

16. PRICE CODE

17. SECURITY CLASSIFICATION OF REPORT

UNCLASSIFIED

18. SECURITY CLASSIFICATION OF THIS PAGE

UNCLASSIFIED

19. SECURITY CLASSIFICATION OF ABSTRACT

UNCLASSIFIED

20. LIMITATION OF ABSTRACT

ULNSN 7540-01-280-5500 Standard Form 298 (Rev. 2-89)

Prescribed by ANSI Std. Z39-18 298-102

Page 4: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

TABLE OF CONTENTS Overview..................................................................................................................................................... 1

1. Combinatorial Auctions .......................................................................................................................... 1

2. Adaptation in Multi-Agent Settings........................................................................................................ 2

3. Mechanism Design.................................................................................................................................. 2

4. Representation......................................................................................................................................... 4

5. Belief Revision and Belief Fusion .......................................................................................................... 4

APPENDIX A Taming the Computational Complexity of Combinatorial Auctions: Optimal and

Approximate Approaches ........................................................................................................................... 6

APPENDIX B An Algorithm for Multi-Unit Combinatorial Auctions ................................................. 12

APPENDIX C Towards a Universal Test Suite for combinatorial Auction Algorithms....................... 18

APPENDIX D R-Max – A General Polynominal Time Algorithm for Near-Optimal Reinforcement

Learning .................................................................................................................................................... 29

APPENDIX E Bidding Clubs: Institutionalized Collusion in Auctions............................................... 46

APPENDIX F Bidding Clubs in First-Price Auctions........................................................................... 53

APPENDIX G Mechanism Design with Incomplete Languages .......................................................... 80

APPENDIX H Anonymous Bidding and Revenue Maximization ...................................................... 116

APPENDIX I From Belief Revision to Belief Fusion ......................................................................... 121

APPENDIX J Representing and Aggregating Conflicting Beliefs...................................................... 156

APPENDIX K A Normative Examination of Ensemble Learning Algorithms................................... 168

APPENDIX L Aggregating Learned Probabilistic Beliefs.................................................................. 176

goodelle
i
goodelle
goodelle
goodelle
Page 5: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

Overview

The control and coordination of multi-agent systems is a major scientific and technological challenge. When facing large-scale multi-agent settings where the agents are to act in flexible, hostile and distributed environments-such as those faced in military domains-the design of effective techniques for dealing with control, coordination, competition, and adaptation becomes a task of great importance. In recent years there has been growing interest in the application of methods and approaches from economics, for example the application of classic solutions from the theory of economic mechanism design to task allocation in non-cooperative dynamic environments. However, traditional economic methods lack many ingredients that are essential to make them applicable to large-scale computational multi-agent systems. In our work we tackle some of these basic issues. In particular, we address the allocation of complementary and substitutable tasks to self-interested agents, adaptation in hostile environments, coordination for the assignment of a task among self-interested bidders, computationally- motivated representations of economic interactions, and the updating of agents' beliefs after receiving new information. Our objective is therefore to introduce economic methods into the context of control and coordination of multi-agent systems, while generalizing and extending these methods to become efficient and effective. An important part of our approach is the identification and management of the deep computational problems which frequently arise in the control and coordination of large- scale multi-agent systems. We also present new theories which are essential for any flexible and dynamic practical multi-agent system.

Our work in the COABS project may be seen as addressing five classes of problems:

1. Combinatorial Auctions

One primary economic mechanism upon which we chose to focus is the combinatorial auction. Combinatorial auctions involve the sale of multiple goods in a single auction, in cases where bidders' valuations may exhibit both complementarities (i.e., a bidder's willingness to pay for a bundle may exceed the sum of that bidder's valuation for each individual item in the bundle) and substitutability’s (e.g., a bidder may be willing to win only one of a set of bundles). To allow bidders to express complementarities in their valuations, combinatorial auctions allow bidders to request "all-or-nothing" bundles of goods; bidders may also bid on subsets of these bundles if they are interested. To allow bidders to express substitutability’s in their valuations, combinatorial auctions allow bidders to designate a set of bids as mutually exclusive-i.e., to indicate that only one of these bids is allowed to win, even if the seller would otherwise prefer to select more than one of these bids. Combinatorial auctions can lead to increased social welfare and/or seller revenue, but they come at a computational cost. Determining the set of winning bids in a combinatorial auction is an NP-hard computational problem. Nevertheless, we developed techniques to solve problems of interesting size by using a variety of different optimization techniques; we also investigated the design of test data for benchmarking such optimization algorithms. Our other research on combinatorial auctions included I investigating bidder strategies when goods are allocated through sequential, single-good auctions, and an alternative mechanism that maintains incentive compatibility even though goods are not always allocated to the bidders willing to pay the most for them.

goodelle
1
Page 6: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

All of these papers, and the papers in the following sections, may be found in the appendix:

Sequential Auctions for the Allocation of Resources with Complementarities (C. Boutilier, M. Goldszmidt and B. Sabata): presented at IJCAI-99.

Taming the Computational Complexity of Combinatorial Auctions: Optimal and Approximate Approaches (Fujishima, Leyton-Brown, Shoham): presented at IJCAI- 99.

.Incentive Compatibility in Rapid, Approximately Efficient Combinatorial Auctions (D Lehmann, L. O'Callahan, and Y. Shoham): presented at the First ACM Conference on Electronic Commerce (EC'99)

An Algorithm for Multi-Unit Combinatorial Auctions (K. Leyton-Brown, M. Tennenholtz, Y. Shoham): presented at Games-2000, AAAI-2000 and the International Symposium for Mathematical Programming (ISMP-2000).

Towards a Universal Test Suite for Combinatorial Auctions (Leyton-Brown, Pearson, Shoham): EC-OO.

2. Adaptation in Multi-Agent Settings

Studying adaptation in multi agent settings was an important component of our research agenda. Indeed, the simultaneous adaptation of multiple agents has profound impact on the design of robust command and control methods. The phenomenon of adaptation in multi-agent systems is considerably different from adaptation in the single-agent case. This is true because the fact that multiple agents simultaneously adapt to each other implies that even simple adaptation rules can lead to complex behaviors. In order to tackle this issue we addressed the problem of reinforcement learning in various classes of stochastic games. Stochastic games extend upon and incorporate features of repeated games and Markov Decision Processes (MDPs), and are a very general model of multi- agent interaction. Our work on this topic had two main threads. First, we studied algorithms that could learn bidding policies in complex auction settings, and investigated the behavior of these algorithms. Second, we developed a reinforcement learning algorithm for stochastic games that finds near-optimal policies in polynomial time, and which also introduces a new approach for dealing with the exploration vs. exploitation tradeoff.

Continuous Value Function Approximation for Sequential Bidding Policies (C. Boutilier, M. Goldszmidt and B. Sabata): presented at the Fifteenth Annual Conference on Uncertainty in Artificial Intelligence (UAI-99).

Conditional, Hierarchical Multi-Agent Preferences (?): presented at the Seventh . Conference on Theoretical Aspects of Rationality and Knowledge (TARK VII). l

Sequential Optimality and Coordination in Multi-agent Systems (C. Boutilier): . presented at ?

R -max: A near optimal polynomial time reinforcement learning algorithm, (Ronen Brafman and Moshe Tennenholtz): presented at UCAI'Ol.

3. Mechanism Design

One of the principal techniques for the control of multi-agent systems is the deployment

goodelle
2
Page 7: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

of an economic mechanism which will influence agents' behavior by giving them incentives for taking desirable actions. This mechanism design approach underlies a number of the research projects we undertook as part of our participation in the COABS project; because they are all so diverse we survey. them individually here.

Ascending bid auctions-such as the familiar English-style auction of Sotheby and eBay-suffer the problem of being unpredictably long. This is unacceptable in mission critical, urgent applications, of the sort encountered in the military. The alternative-- running a quick, one-shot sealed-bid auction-has the advantage of being fast, but unfortunately it does not posses the nice optimization properties of ascending-bid auctions in the presence of so-called common values. We were able to devise a novel auction mechanism, which combines the merits of both.

Finding ways of designing smart agents to assist bidders in auctions is fundamental to introducing agents' coordination to the context of economic mechanisms design. Our research emphasized protocols for coordinating groups of bidders through the paradigm of "bidding clubs"-groups of bidders who share information before participating in an auction, in such a way that all the members of a bidding club benefit. In our first paper on this topic we developed basic bidding club protocols for five fundamental auction settings; in our second paper we conducted a more rigorous and general theoretical analysis of bidding clubs in first-price auctions.

"Rational computation" presents a new model of computation based upon principles of rationality, which, we argue, are appropriate in a non-cooperative computing environment such as the Internet. In this work we developed a theory which looks at markets as computing devices and attempts to quantify their computing power.

Although VCG mechanisms have many appealing properties, their essential intractability prevents them from being used for complex problems like combinatorial auctions. We introduced a general way to overcome this intractability and proved its properties.

As we consider the use of auctions for resource allocation we must take into account the possibility-and in some cases virtual certainty-that agents will hide their true identities, so that it becomes impossible not only to know who is behind a given bid but even whether two different bids were submitted by the same bidder. This has profound effect on the outcome of the auction, as the bidders learn to manipulate the auction by I using this anonymity feature. We were able to characterize the equilibria of some auctions in such settings, which provides the first step towards designing auctions that can withstand anonymity.

Speeding Up Ascending-Bid Auctions (Y. Fujisima, D. McAdams, and Y. Shoham): presented at IJCAI-99.

Bidding Clubs: Institutionalized Collusion in Auctions (K. Leyton-Brown, M. Tennenholtz, and Y. Shoham): presented at Games-2000, EC-OO, Brown University. .Rational computation (M. Tennenholtz, and Y. Shoham): published in AIl.

Bidding Clubs for First-Price Auctions (Leyton-Brown, Shoham and Tennenholtz): submitted to GEB.

Mechanism Design With Incomplete Languages (Ronen): presented at EC-OI. Anonymous bidding in auctions (Yossi Feinberg and Moshe Tennenholtz): submitted to

GEB.

goodelle
3
Page 8: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

4. Representation

Bayesian networks-graphical representations of probability distributions that explicitly describe independences inherent in these distributions-revolutionized the field of probabilistic inference. By capturing the underlying structure of distributions, they allowed for algorithms that made inference tractable in practice. We have studied the possibility of finding structured representations for games which give similar tractability benefits. We began by studying possible ways of graphically representing utilities. The idea is that such representations capture structure inherent in the utility functions in the same way that Bayesian networks capture independences in probability distributions. Next, we introduced Game networks (G nets), a novel representation for multi-agent decision problems. Compared to other game-theoretic representations, such as strategic or extensive forms, G nets are more structured and more compact; more fundamentally, G nets constitute a computationally advantageous framework for strategic inference, as both probability and utility independencies are captured in the structure of the network and can be exploited in order to simplify the inference process. An important aspect of multi- agent reasoning is the identification of some or all of the strategic equilibria in a game; we presented original convergence methods for strategic equilibrium which can take advantage of strategic separabilities in the G net structure in order to simplify the computations. We introduced Multi-Agent Influence Diagrams (MAIDs), which generalize the familiar Bayesian Network generalization of (single-agent) influence diagrams to the multi-agent case. Finally, we developed a novel approach to computing all equilibria of a multi agent game, based on homotopy methods and closely related to simulated annealing used in AI.

Expected Utility Networks (P. La Mura and Y. Shoham): presented at UAI'99. Game Networks (P. La Mura): presented at the Sixteenth Conference on Uncertainty in

Artificial Intelligence (UAI'OO). Probabilistic Models for Agents' Beliefs and Decisions (B. Milch and D. Koller): I

presented at UAl'QQ. Simulated Annealing of Game Equilibria: A Simple Adaptive Procedure Leading to ~ Nash Equilibrium (P. La Mura and M. Pearson): presented?

5. Belief Revision and Belief Fusion

Often we want to combine the expertise of multiple experts in hopes of coming up with information that improves on all their individual beliefs. We studied the problem of automating this process. We considered different common representations, both qualitative and quantitative, of sources' beliefs and studied how information about the sources' expertise can be used to combine their beliefs in rigorous, justified ways. Our initial focus in solving this problem was on the situation where agents' belief states are represented as qualitative binary relations over possible worlds. Such representations are common in the belief revision community to represent not only agents' beliefs, but their counterfactual beliefs as well, i.e., not only what they believe at the moment, but what they would believe if the situation were somewhat different.

We introduced a novel belief fusion operator that aggregates the beliefs of two agents, each informed by a subset of sources (strictly) ranked by reliability. In the process we

goodelle
4
Page 9: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

defined pedigreed belief states, which enrich standard belief states with the source of each piece of information. We noted that the fusion operator satisfies the invariants of idempotence, associativity, and commutativity. As a result, it can be iterated without difficulty. We also defined belief diffusion; whereas fusion generally produces a belief state with more information than is possessed by either of its two arguments, diffusion produces a state with less information.

We considered the problem of representing collective beliefs and aggregating these beliefs when there may be conflicting sources of equal rank. We described a way to construct the belief state of an agent informed by a set of sources of varying degrees of reliability, giving a simple set-theory-based operator for combining the information of multiple agents. We also described a computationally effective way of computing the resulting belief state.

Ensemble learning algorithms combine the results of several classifiers to yield an aggregate classification. We presented a normative evaluation of combination methods, applying and extending existing axiomatizations from Social Choice theory and Statistics. For the case of multiple classes, we showed that several seemingly innocuous and desirable properties are mutually satisfied only by a dictatorship. A weaker set of properties admit only the weighted average combination rule. We exemplified these theoretical results with experiments on stock market data, demonstrating how ensembles of classifiers can exhibit canonical voting paradoxes.

Finally, we shifted our attention to the problem of aggregating beliefs when they are represented as probabilistic distributions. We proposed a framework, in which we assumed that nature generates samples from a 'true' distribution and different experts I form their beliefs based on the subsets of the data they have a chance to observe. We showed that the well-known aggregation operator LinOP is ideally suited for use in our ~ framework, and proposed a LinOP-based learning algorithm, inspired by the techniques developed for Bayesian learning, which aggregates the experts' distributions represented as Bayesian networks.

From Belief Revision to Belief Fusion (P. Maynard-Reid II and Y. Shoham): presented at the Third Conference on Logic and the Foundations of Game and Decision Theory (LOFT3).

Belief Fusion: Aggregating Pedigreed Belief States (P. Maynard-Reid II and Y. Shoham): published in the Journal of Logic, Language, and Information.

Representing and Aggregating Conflicting Beliefs (p. Maynard-Reid II and D. Lehmann): presented at the Seventh International Conference on Knowledge Representation and Reasoning (KR '00).

A Normative Examination of Ensemble Learning Algorithms (D. Pennock, P. Maynard-Reid ll, C. L. Giles, and E. Horvitz): presented at the Seventeenth International Conference on Machine Learning (ICML '00).

Aggregating Learned Probabilistic Beliefs (P. Maynard-Reid II and U. Chajewska): presented at UAI'OI.

goodelle
5
Page 10: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

AbstractIn combinatorial auctions, multiple goods are soldsimultaneously and bidders may bid for arbitrarycombinations of goods. Determining the outcomeof such an auction is an optimization problem thatis NP-complete in the general case. We proposetwo methods of overcoming this apparent intrac-tability. The first method, which is guaranteed tobe optimal, reduces running time by structuringthe search space so that a modified depth-firstsearch usually avoids even considering alloca-tions that contain conflicting bids. Caching andpruning are also used to speed searching. Oursecond method is a heuristic, market-based ap-proach. It sets up a virtual multi-round auction inwhich a virtual agent represents each original bidbundle and places bids, according to a fixedstrategy, for each good in that bundle. We showthrough experiments on synthetic data that (a) ourfirst method finds optimal allocations quickly andoffers good anytime performance, and (b) inmany cases our second method, despite lackingguarantees regarding optimality or running time,quickly reaches solutions that are nearly optimal.

1 Combinatorial AuctionsAuction theory has received increasing attention fromcomputer scientists in recent years.1 One reason is theexplosion of internet-based auctions. The use of auctionsin business-to-business trades is also increasing rapidly[Cortese and Stepanek, 1998]. Within AI there is growinginterest in using auction mechanisms to solve distributedresource allocation problems. For example, auctions andother market mechanisms are used in network bandwidthallocation, distributed configuration design, factoryscheduling, and operating system memory allocation

1 This material is based upon work supported by DARPA un-der the CoABS program, contract #F30602-98-C-0214, and bya Stanford Graduate Fellowship.

[Clearwater, 1996]. Market-oriented programming hasbeen particularly influential [Wellman, 1993; Mullen andWellman, 1996].

The value of a good to a potential buyer can depend onwhat other goods s/he wins. We say that there existscomplementarity between goods g and h to bidder b ifub({g,h})> ub({g})+ub({h}), where ub(G) is the utility to bof acquiring the set of goods G. If goods g and h wereauctioned separately, it is likely that neither of the typi-cally desired properties for auctions—efficiency andrevenue maximization—would hold. One way to ac-commodate complementarity in auctions is to allow bidsfor combinations of goods as well as individual goods.Generally, auctions in which multiple goods are auctionedsimultaneously and bidders place as many bids as theywant for different bundles of goods are called combina-torial auctions2.

It is also common for bidders to desire a second goodless if they have already won a first. We say that thereexists substitutability between goods g and h to bidder bwhen ub({g,h}) < ub({g})+ub({h}). A common example ofsubstitutability is for a bidder to be indifferent betweenseveral goods but not to want more than one. In order to beuseful, a combinatorial auction mechanism should providesome way for bidders to indicate that goods are substi-tutable.

Combinatorial auctions are applicable to manyreal-world situations. In an auction for the right to userailroad segments a bidder desires a bundle of segmentsthat connect two particular points; at the same time, theremay be alternate paths between these points and the bidderneeds only one [Brewer and Plott, 1996]. Similarly, in theFCC spectrum auction bidders may desire licenses formultiple geographical regions at the same frequency bandwhile being indifferent to which particular band they re-ceive [Milgrom, 1998]. The same situation also occurs inmilitary operations when multiple units each have severalalternate plans and each plan may require a differentbundle of resources.

2 Auctions in which combinatorial bidding is allowed are al-ternately called combinatorial and combinational.

Taming the Computational Complexity of Combinatorial Auctions:Optimal and Approximate Approaches

Yuzo Fujishima, Kevin Leyton-Brown and Yoav ShohamComputer Science Department, Stanford University, Stanford CA, 94305

[email protected] (visiting from NEC Corporation)[email protected]@cs.stanford.edu

APPENDIX A

6

Page 11: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

While economics and game theory provide many in-sights into the potential use of such auctions, they havelittle to say about computational considerations. In thispaper we address the computational complexity of com-binatorial auctions.

2 The Complexity ProblemThere has been much work in economics and game theoryon designing combinatorial auctions. TheClarke-Groves-Vickrey mechanism (also known as theGeneralized Vickrey Auction, or GVA) has been particu-larly influential [Mas-Colell et al., 1995; Varian, 1995]. Itis beyond the scope of this paper to review such mecha-nisms, but they share a central problem: given a collectionof bids on bundles, finding a set of non-conflicting bidsthat maximizes revenue. (A more precise definition isgiven in Section 3.) This problem is easily shown to beNP-complete3 [Rothkopf et al., 1995].

Several methods have been conceived to cope with thecomputational complexity of combinatorial auctions, mostaiming to ease the difficulty of finding optimal allocations.They can be classified into three categories based on thestrategies they use.

One strategy is to restrict the degree of freedom ofbidding to simplify the task of finding optimal allocations.Rothkopf et al. show that an optimal allocation can befound in polynomial time if (1) each bid contains no morethan two goods; (2) for any two bids, either they are dis-joint or one is a subset of the other; or (3) each bid containsonly consecutive goods given a one-dimensional orderingof goods [Rothkopf et al., 1995].

Another strategy is to shift the burden of finding anoptimal allocation to bidders. [Banks et al., 1989] and[Bykowsky et al., 1995] have reported a mechanism calledAUSM in which non-winning bids are pooled in a stand-byqueue. Bidders can combine their bids with other bidscurrently in the queue to form new allocations. A newallocation is adopted if it generates more revenue than thepreviously best allocation.

A third strategy is to attempt to find an optimal alloca-tion but to be satisfied with a sub-optimal allocation whenthe expenditure of further resources becomes unacceptable.In other words, the optimality of the allocation istraded-off with the resources required, especially time.

In this paper we present two algorithms. The first is ananytime algorithm that attempts to exploit a problem’sparticular bid structure to reduce the size of the search. Italso reduces search time by caching partial results and bypruning the search tree. The second algorithm uses amarket-based approach to determine an acceptable allo-cation, although it is not guaranteed to find an optimal one.We then show results of experiments with synthetic datasuggesting that these methods, though not provided withformal guarantees, appear to have surprisingly good per-

3 The GVA has the additional shortcoming of requiring biddersto submit an unreasonably large number of bids, but we do notaddress this issue here.

formance. Additionally, the market-based approach ap-pears to produce allocations that are always optimal ornearly optimal.4

3 Precise Problem StatementIn this paper we propose two methods for finding desirableallocations based on bids submitted. We start by formallydefining the optimization problem. Denote the set of goodsby G and the set of non-negative real numbers by R+. A bidb=(pb,Gb) is an element of S= R+×(2G-{∅}). Let B be asubset of S. A set F⊆B is said to be feasible if ∀b,c≠b∈FGb∩Gc=∅. Denote the set of all feasible allocations for Bby Φ(B). Further, let G(B)=∪b∈BGb be the set of goodscontained in the bids of B.

[Problem] Find an allocation W∈Φ(B) such that∀F∈Φ(B) ∑b∈Fpb≤∑b∈Wpb. Such an allocation is said to beoptimal or revenue maximizing.

What kind of value interrelation between goods can berepresented by the bids defined above? Clearly, comple-mentary values are easily accommodated. Suppose a bid-der bids $20 for each of {g} and {h}, and $50 for {g,h}. Inthis case any revenue-maximizing algorithm will correctlyselect the {g,h} bid instead of {g} and {h}.

This bid format is also sufficient for representing sub-stitutability through an encoding trick. Suppose a bidder iswilling to pay $20 for {g} and $30 for {h} but only $40 for{g,h}. In this case, bids cannot be submitted as beforesince the revenue-maximizing algorithm would select thepair {g} and {h} over {g,h}, charging the bidder $50 in-stead of $40 for g and h. However, this problem can besolved by the introduction of ‘dummy goods’—virtualgoods that enforce an exclusive-or relationship. (Eachdummy good must appear only in a single bidder’s bids.)In our example, the bidder could submit the followingbids: ($20, {g,d}), ($30, {h,d}), and ($40, {g,h}) where dis a new, unique dummy good. The first two bids are nowmutually exclusive and so will never be allocated together.This technique can lead to a combinatorial explosion in thenumber of bids if many goods are substitutable, but inmany interesting cases this does not arise.

4 CASS AlgorithmWhen the number of goods and bids is small enough, anexhaustive search can be used to determine the optimalallocation. We propose an algorithm, CombinatorialAuction Structured Search (CASS), presented as a naïvebrute-force approach followed by four improvements.CASS considers fewer partial allocations than thebrute-force method because it structures the search spaceto avoid considering allocations containing conflictingbids. It also caches the results of partial searches andprunes the search tree. Finally, it may be used as an any-

4 We do not analyze the impact of the approximation on theequilibrium strategies in auction mechanisms such as GVA; wewill address this issue in a future paper.

7

Page 12: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

time algorithm, as it tends to find good allocations quickly.

4.1 Brute-Force AlgorithmSuppose there are |G| goods 1, 2, ..., |G|, and |B| bids 1, 2,…, |B|. First, bids that will never be part of an optimalallocation are removed. That is, if for bid bk=(pk,Gk) thereexists a bid bl=(pl,Gl) such that pl >pk and Gl⊆Gk, then bk isremoved because it can always be replaced by bl, in-creasing revenue. Then for each good g, if there is no bidb=(x,{g}) a dummy bid b=(0,{g}) is added.

Our brute-force algorithm examines all feasible alloca-tions through a depth-first search. Let x be the first bid andy be the last bid. Our implementation follows:

1. If x does not conflict with the current allocation, addx to the current allocation

2. Increment x3. If more bids can be added to the allocation, go to 2.4. Update best revenue and allocation observed so far.5. If y is contained in the current allocation, remove it,

set x=y+1 and repeat from 2.6. Decrement y.7. If y is not the first bid, go to 5.

4.2 Improvement #1: BinsA great deal of unnecessary computation is avoided in thebrute-force algorithm by checking whether bids conflict withthe current allocation before they are added. However, workis still required to determine that a combination is infeasibleand to move on to the next bid. It would be desirable tostructure the search space to reduce the number of infeasibleallocations that are considered in the first place.

We can reduce the number of infeasible allocations con-sidered by sorting bids into bins, Di, containing all bids bwhere good i ∈ Gb and for all j such that j∈[1, i-1], j ∉ Gb.Rather than always trying to add each bid to our allocation,we add at most one bid from every bin since all bids in agiven bin are mutually exclusive.

In fact, we can often skip bins entirely. While consideringbin Di, if we observe that good j>i is already part of the al-location then we do not need to consider any of the bids in Dj.In general, instead of considering each bin in turn, skip to Dk

where k∉G(F) and ∀i<k, i∈G(F).

4.3 Improvement #2: CachingLet Fi be the partial allocation under consideration when Di isreached during a search. Define Ci ⊆ G(Fi) where ∀j ∈ G(Fi),j>i ↔ j ∈ Ci. Note that there are many different partial al-locations Fi1, Fi2, etc., that share the same Ci, and that ifCi1=Ci2 then the search trees for Fi1 and Fi2 are identicalbeyond Di. It is therefore possible to cache partial searchesbased on Ci. However, caching all possible values of Ci

would require a cache of size 2|G|-(i-1), which would quicklybecome infeasible. Therefore, we only cache when Ci in-cludes no more than k goods, where k is a threshold defined at

runtime for each bin. Di requires a cache of size∑=

−k

j j

iG

0

|| .

4.4 Improvement #3: Pruning #1Performance can be improved by backtracking whenever agiven search path is provably unable to lead to a new bestallocation. We can prune whenever C (Fi1) ⊂ C (Fi2) andp(Fi2) + p(cache (Fi1)) ≤ bestAllocation. In this case, the sumof the revenue from the cached path beyond Fi1 and therevenue leading up to Fi2 is less than the revenue from thebest allocation seen so far. Since Fi1 allocates a superset ofthe goods allocated in Fi2 (thus overestimating revenue), abetter allocation would not be found by expanding Fi2.

4.5 Improvement #4: Pruning #2We can also backtrack when it is provably impossible toadd any bids to the current allocation to generate morerevenue than the current best allocation. Beforestarting the search we calculate an overestimate of therevenue that can be achieved with each good, o(g) =

||/)(max|

bbgb

Gbp∈

. o(g) is the largest average price per bid

of bids containing good g. We backtrack at any pointduring the search with allocation F if p(F) + ∑

∉Fg

go )( ≤

p(best_allocation). This technique is most effective whengood allocations are found quickly. Finding good alloca-tions quickly is also useful if a solution is required beforethe algorithm has completed (i.e., if CASS is used as ananytime algorithm). We have found that good allocationsare found early in the search when the bids in each bin areordered in descending order of average price per good.Similarly, the pruning technique is most effective whenthe unallocated goods are those with the lowest o(g) val-ues. To achieve this, we reorder bins so that for any twobins i and j, o(gi) > o(gj) ↔ i < j.

5 VSA AlgorithmOur second algorithm is called Virtual SimultaneousAuction (VSA). This market-based method was inspiredby market-oriented programming [Wellman, 1993; Mullenand Wellman, 1996] and the simultaneous ascending auc-tion [Milgrom, 1998]. VSA generates a virtual simulta-neous auction from the bids submitted in a real combina-torial auction, then simulates the virtual auction to find agood allocation of goods in the real auction.

5.1 AlgorithmFirst, a virtual simultaneous auction is generated based onthe bids submitted in a real combinatorial auction. Foreach bid b=(pb,Gb) a virtual bidder vb is created. The vir-tual bidders compete in a virtual simultaneous auction thathas multiple rounds. Each virtual bidder vb tries to win allthe goods in Gb for the price pb on an all-or-nothing basis.The virtual auction starts with no goods allocated and the

8

Page 13: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

prices of all goods set to zero. The simultaneous auction isrepeated round by round until either an optimal allocationis found or a pre-set time deadline is reached. In the lattercase the current best allocation is adopted as the finalresult.

Each round of VSA has three phases: the virtual auctionphase, the refinement phase and the update phase. In thevirtual auction phase each virtual bidder bids for the goodsthey want. Each individual good is allocated to the highestbidder. If a bidder succeeds in winning all desired goods,that bidder becomes a temporary winner. Otherwise thebidder becomes a temporary loser and returns all allocatedgoods to the auctioneer. In the refinement phase each ofthe losers is examined in a random order to see whethermaking that agent a temporary winner (and consequentlymaking a different winner into a loser) would increaseglobal revenue. If so, the list of winners is updated. Fi-nally in the update phase the current highest price of eachgood is changed to reflect the price that its current winnerbid. The current highest price for unallocated goods isreset to zero.

Virtual bidders in VSA follow a simple strategy. If abidder was the temporary winner in the previous round, thebidder does not bid in the current round. Otherwise, agentscalculate the sum of the current highest prices of the goodsrequired. If the sum exceeds an agent’s budget, the agentdoes not bid because the agent will not be able to acquireall the goods simultaneously. If the sum is less than thebudget, the agent bids such that the surplus (budget - sum)is equally divided among the goods.

5.2 PropertiesIn certain circumstances, VSA will find an optimal allo-cation. Additionally, it is sometimes possible to detect ifan optimal allocation has been found, allowing the virtualauction to end before the deadline.

[Theorem] If no virtual bidder bids in a round in the vir-tual auction, the current set of winners is optimal.

[Proof] Assume that no agents bid in a given round. De-fine the function that calculates the revenue of an alloca-tion F by r(F)=∑b∈Fpb and let O denote the optimal set ofwinners. Split the current set of winners W into two partsO1 and W2 such that O1=O∩W and W2=W∩¬O1. Also splitO into O1 and O2 such that O1 is defined as before and O2 =O ∩ ¬O1. Further, split G into G1 and G2 such thatG1=∪b∈O1Gb and G2=G∩¬G1. By the assumption, for eachcurrently losing bidder, the sum of the current highestprices of the goods needed exceeds the bidder’s budget.This is especially true for bidders in O2, i.e., ∀b∈O2

pb<∑g∈Gbhg where hg is the current highest price of good g.It follows that r(O2) = ∑b∈O2pb ≤ ∑b∈O2∑g∈Gbhg ≤ ∑g∈G2hg =∑b∈W2∑g∈Gbhg = ∑b∈W2pb = r(W2). (Remember that theminimum price of a good that is not allocated to any agentis zero and agents always bid their entire budgets.) Theinequality means that W is optimal because r(O) =r(O1)+r(O2) ≤ r(O1)+r(W2) = r(W).

However, there is no guarantee that auctions will always

finish, even if an optimal allocation is found.

[Theorem] There exists a set of bids B such that at leastone virtual bidder always bids in every round of the virtualauction no matter what bidding strategy is used.

[Proof] Suppose B={a,b,c} where a={pa, {1,2}}, b={pb,{2,3}}, and c={pc, {3, 1}}. Suppose further that pa < pb +pc, pb < pc + pa, and pc < pa + pb. Because the real bids aremutually exclusive, at most one virtual bidder becomes thetemporary winner. If none is winning, h1=h2=h3=0 and allthe bidders bid in the current round. Assume here thatbidder a is currently winning. Then h1+h2=pa and h3=0.Assume that neither b nor c bids in the current round. Thenfor each of b and c, the sum of the prices of goods neededmust be larger than or equal to the budget, i.e.,h2+h3=h2≥pb and h3+h1=h1≥pc. This means that pa =h1+h2≥pb+pc and contradicts pa<pb+pc. This argumentdoesn't depend on the bidding strategy as long as an agentbids if and only if their budget exceeds the sum of theminimum prices of the goods needed.

It is this property that makes the refinement phase ofVSA important. Consider the case B=B1∪B2∪... where∀i,j G(Bi)∩G(Bj)=∅, |Bi|=3 and each Bi satisfies the con-dition from the proof above. If we omit the refinementphase then the winner in each subset changes every roundexcept the case where there is no winner. Therefore, anoptimal global allocation is examined only when in everysubset the optimal winner is temporarily winning. Suchsynchronization is unlikely to occur unless the number ofsubsets is very small. The refinement phase causes theoptimal winners to become the temporary winners in everyround, leading to an optimal allocation even though it isnot detected as optimal. (In some cases where ∃i,jG(Bi)∩G(Bj)≠∅ or |Bi| > 3 an optimal allocation may beimpossible to achieve regardless of the time limit.)

6 Experimental EvaluationAs we have not yet determined each algorithm’s formalcomplexity characteristics we conducted empirical tests.We evaluated (1) how running time varies with the numberof bids, and (2) how percentage optimality of the bestallocation varies with time, given a particular bid distri-bution and a fixed number of bids and goods.

6.1 Assumptions and ParametersThe space of this problem is large. Roughly speaking it hasthree degrees of freedom: the number of goods, the num-ber of bids and the distribution of bids. Most problematicamong these is the distribution. Precisely because of thecomputational complexity of combinatorial auctions thereis little or no real data available. In the absence of suchdata we tested our algorithms against bids drawn randomlyfrom specific distributions.

Throughout the experiments we used the followingtwo distribution functions to determine how often abid for n goods appears. The first is binomial,fb(n)=pn(1-p)N-nN!/(n!(N-n)!), p=0.2, in which the prob-

9

Page 14: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

ability of each good being included in a given bid is in-dependent of which other goods are included. The seconddistribution is of exponential form, fe(n)=Ce-x/p, p=5,representing the case where a bid for n+1 goods appearse-1/p times less often than a bid for n goods. The prices ofbids for n goods is uniformly distributed between[n(1-d), n(1+d)], d=0.5.

We do not present any experiments varying the numberof goods in this paper because of space constraints. Wefound that for both CASS and VSA running time increasedexponentially with the number of goods.

We ran our experiments on a 450MHz Pentium II with256MB of RAM, running Windows NT 4.0. 30 MB ofRAM was used for the CASS cache. All algorithms wereimplemented in C++.

6.2 ResultsTo answer question (1) we measured the running time ofCASS, VSA and the brute-force algorithm. Since VSA isnot guaranteed to reach the optimal revenue, it was passedthis value—calculated by CASS—and stopped when itfound an allocation with revenue of at least 95% of opti-mal. All the results reported here are averages over 10different runs. Figure 1 shows running time as a functionof the number of bids with a binomial distribution, withthe number of goods fixed at 30. Figure 2 shows the samething for an exponential distribution, without thebrute-force algorithm. To answer question (2), wemeasured the optimality of the output of both VSA andCASS as a function of time. Figure 3 shows both algo-rithms’ performance with 15000 bids for 150 goods with abinomial distribution and Figure 4 shows 4500 bids for 45goods with an exponential distribution.

6.3 DiscussionCASS demonstrates excellent performance both in findingoptimal allocations and as an anytime algorithm. In Fig-ures 1 and 2 CASS remains roughly an order of magnitudefaster than VSA as the number of bids increases. Bothcurves appear to grow sub-linearly on the logarithmicgraph, suggesting polynomial-time performance. As thesize of the problem is increased (Figures 3 and 4) CASSstill performs better than VSA for the binomial distribu-tion, but initially offers worse anytime performance for theexponential distribution. These results—and other ex-periments we have conducted—suggest that VSA is mostlikely to outperform CASS when the number of goods isrelatively large compared to average bid length. (Note thatVSA runs to a time limit, so the point at which VSA’scurve ends is not meaningful.)

CASS’s effectiveness is strongly influenced by thedistribution of bids, particularly as the number of goodsincreases. If bids contain a large number of goods onaverage, improvement #1 will have a substantial effectbecause more bins will be skipped between every pair ofbins that are considered, eliminating the need to indi-vidually examine all the bids in those bins. However, our

0.0 01

0 .01

0.1

1

10

1 00

0 50 0 1 00 0 15 00 20 00 2 50 0 30 00

B id s (a lw ay s 3 0 g o o d s)

V SA (9 5% ) C A S S B rute F o rce

Figure 1: Running Time Comparison (Binom. Dist.)

0.001

0 .01

0.1

1

10

100

0 500 1000 1500 2000 2500 3000

Bids (a lw ays 30 goods)

VS A (95% ) CA S S

Figure 2: Running Time Comparison (Exp. Dist.)

0.6 5

0.7 5

0.8 5

0.9 5

0 5 0 1 00 15 0 2 00

T im e (se c) - 15 0 g o o d s, 15 00 0 b id s, b in o m ia l d is trib .

V SA C A S S

Figure 3: Anytime Behavior (Binom. Dist.)

0 .9

0.9 5

1

0 1 00 2 00 30 0 4 00 50 0 6 00

T im e (s ec ) - 4 5 g o o d s, 45 00 b id s , e xp o n e n tia l d is trib .

VS A C A SS

Figure 4: Anytime Behavior (Exp. Dist.)

caching scheme favors distributions with small bids be-cause they increase the likelihood that partial allocationswill be cacheable. The pruning technique described in 4.4reduces the number of nodes that are cached, lowering

10

Page 15: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

memory consumption and making CASS feasible for lar-ger problems. Our second pruning technique often im-proves performance by two orders of magnitude, though itis most effective when the variance of average price perbid is relatively small. This technique also reduces theoptimal cache size, further reducing memory consumption.As a result of pruning, with pruning the amount of memoryavailable for caching does not seem to be a limiting factorin CASS’s performance.

VSA is interesting for two reasons. Firstly, it appears tooffer good anytime performance in cases with small bidsand many goods. Secondly, it provides a case study in thepower of market-based optimization. Further work isneeded to reach firm conclusions, but it appears that as acentralized optimization method VSA is overshadowed byother techniques. However, other attractions of mar-ket-based optimization—in particular its inherent distrib-uted nature and robustness to change in problem specifi-cation—may make VSA attractive for some domains.

7 Related and Ongoing WorkAs far as we are aware, the work most directly relevant tothe ideas presented here is a paper by Sandholm [1999]that appears in these proceedings. Sandholm’s Bidtreealgorithm appears to be closely related to CASS, but im-portant differences hold. In particular, Bidtree performs asecondary depth-first search to identify non-conflictingbids, whereas CASS’s structured approach allows it toavoid considering most conflicting bids. Bidtree alsoperforms no pruning analogous to our Improvement #3and no caching. On the other hand, Bidtree uses an IDA*search strategy rather than CASS’s branch-and-boundapproach, and does more preprocessing. We intend tocontinue studying the differences between these algo-rithms, including differences in experimental settings.

Our problem can of course be abstracted away from theauction motivation and viewed as a straightforward com-binatorial optimization. This suggests a wealth of litera-ture that could be applied. We are currently implementingsome of these techniques and comparing them to ourpresent results. We are especially interested in compari-sons with mixed-integer programming and greedy meth-ods. In particular, we have been investigating a new al-gorithm5 that orders bids in descending order according toaverage price per good, and does a depth-first search withextensive pruning. This algorithm appears to offer per-formance similar to CASS, and we intend to report on it ina follow-up paper.

8 ConclusionWe have proposed two novel algorithms to mitigate thecomputational complexity of combinatorial auctions.

CASS determines optimal allocations very quickly, andalso provides good anytime performance. In the future we

5 This ongoing work is joined by Liadan O’Callaghan andDaniel Lehmann.

intend to pursue a formal analysis of CASS’s computa-tional complexity, and to test both CASS and VSA withdata collected from real bidders.

VSA can determine near-optimal allocations even incases with hundreds of goods and tens of thousands of bids.Since it has been infeasible to run CASS on much largerproblems we do not yet know how close VSA comes tooptimality in these cases. An investigation of VSA’s limitsremains an area for future work.

References[Banks, et al., 1989] Jeffrey S. Banks, John O. Ledyard, andDavid P. Porter. Allocating uncertain and unresponsive re-sources: an experimental approach. RAND Journal of Economics,20(1): 1-25, 1989.

[Brewer and Plott, 1996] P.J. Brewer and C.R. Plott. A binaryconflict ascending price (BICAP) mechanism for the decentral-ized allocation of the right to use railroad tracks. InternationalJournal of Industrial Organization, 14:857-886, 1996]

[Bykowsky et al., 1995] Mark M. Bykowsky, Robert J. Cull, andJohn O. Ledyard. Mutually destructive bidding: the FCC auctiondesign problem. Social science working paper 916, CaliforniaInstitute of Technology, 1995.

[Clearwater, 1996] Scott H. Clearwater, editor. Market-BasedControl: A Paradigm for Distributed Resource Allocation.World Scientific, 1996.

[Cortese and Stepanek, 1998] Amy E. Cortese and Marcia Ste-panek. Good-bye to fixed pricing? Business Week, pages 71-84,May 4, 1998.

[Mas-Colell et al., 1995] Andreu Mas-Colell, Michael D.Whinston, and Jerry R. Green. Microeconomic Theory. OxfordUniversity Press, New York, 1995.

[Milgrom, 1998] Paul Milgrom. Putting auction theory to work:the simultaneous ascending auction. Working paper 98-002,Dept. Economics, Stanford University, 1998.

[Mullen and Wellman, 1996] Tracey Mullen and Michael P.Wellman. Some issues in the design of the market-orientedagents. In M.J. Wooldridge, J.P. Muller, and M. Tambe, editors,Intelligent Agents II: Agent Theories, Architectures, and Lan-guages. Springer-Verlag, 1996.

[Rothkopf et al., 1995] Michael H. Rothkopf, Aleksandar Pekeč,and Ronald M. Harstad. Computationally manageable combi-natorial auctions. DIMACS Technical Report 95-09, April 1995.

[Sandholm, 1999] Tuomas Sandholm. An algorithm for optimalwinner determination in combinatorial auctions. Proceedings ofthe International Joint Conference on Artificial Intelligence(IJCAI-99), Stockholm, 1999.

[Varian, 1995] Hal R. Varian. Economic mechanism design forcomputerized agents. In Proceedings of the First Usenix Con-ference on Electronic Commerce, New York, July 1995.

[Wellman, 1993] Michael P. Wellman. A market-oriented pro-gramming environment and its application to distributed mul-ticommodity flow problems. Journal of Artificial IntelligenceResearch, 1:1-23, 1993.

11

Page 16: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

An Algorithm for Multi-Unit Combinatorial Auctions

Kevin Leyton-Brown and Yoav Shoham and MosheTennenholtzComputerScienceDepartment

StanfordUniversity, Stanford,CA 94305

Abstract

Wepresentanovel algorithmfor computingtheoptimalwin-ning bidsin a combinatorialauction(CA), that is, anauctionin which biddersbid for bundlesof goods. All previouslypublishedalgorithmsarelimited to single-unitCAs, alreadya hardcomputationalproblem. In contrast,herewe addressthemoregeneralproblemin whicheachgoodmayhavemul-tiple units, andeachbid specifiesan unrestrictednumberofunits desiredfrom eachgood. We prove the correctnessofour branch-and-boundalgorithm,which incorporatesa spe-cializeddynamicprogrammingprocedure.We thenprovidevery encouraginginitial experimentalresultsfrom an imple-mentedversionof thealgorithm.1

Intr oductionAuctions are the most widely studiedmechanismin themechanismdesignliterature in economicsand game the-ory (Fudenberg & Tirole 1991). This is due to the factthat auctionsare basic protocols,serving as the buildingblocks of more elaboratedmechanisms. Given the widepopularity of auctionson the Internetand the emergenceof electroniccommerce,whereauctionsserve as the mostpopulargame-theoreticmechanism,efficient auctiondesignhas becomea subjectof considerableimportancefor re-searchersin multi-agentsystems(e.g.(Wellmanetal. 1998;Monderer& Tennenholtz2000)). Of particularinterestaremulti-objectauctionswherethebidsnamebundlesof goods,calledcombinatorialauctions(CA). For example,imagineanauctionof usedelectronicequipment.A biddermaywishto bid � for a particularTV and � for a particularVCR, but���� ��� � for thepair. In this exampleall thegoodsat auc-tion aredifferent,so we call the auctiona single-unitCA.In contrast,consideranelectronicsmanufacturerauctioning100 identicalTVs and100 identicalVCRs. A retailerwhowantsto buy 70 TVs and30 VCRswould beindifferentbe-tweenall bundleshaving 70TVs and30VCRs.Ratherthan

Copyright c�

2002, American Associationfor Artificial Intelli-gence(www.aaai.org). All rightsreserved.

1This work was partly supportedby DARPA grant numberF30602-98-C-0214-P00005.

having to bid on eachof the �� � ��� distinct bundles,shewould preferto placethesinglebid (price, � 70 TVs, 30VCRs� ). We call anauctionthatallows sucha bid a multi-unit CA.

In a combinatorialauction,a selleris facedwith a setofprice offers for variousbundlesof goods,andhis aim is toallocatethegoodsin awaythatmaximizeshisrevenue.Thisoptimizationproblemis intractablein thegeneralcase,evenwhen eachgood hasonly a single unit (Rothkopf, Pekec,& Harstad1998). Given this computationalobstacle,twoparallel lines of researchhave evolved. The first exposestractablesub-casesof the combinatorialauctionsproblem.Most of this work hasconcentratedon identifying biddingrestrictionsthatentailtractableoptimization;see(Rothkopf,Pekec, & Harstad1998; Nisan 1999; Tennenholtz2000;Vries & Vohra2000). Also, the caseof infinitely divisiblegoodsmaybetractablysolvedby linearprogrammingtech-niques. The other line of researchaddressesgeneralcom-binatorial auctions. Although this is a classof intractableproblems,in practiceit is possibleto addressinterestingly-large datasetswith heuristic methods. It is desirabletodo so becausemany economicsituationsarebestmodeledby a generalCA, andbidders’strategic behavior is highlysensitive both to changesin the auction mechanismandto approximationof the optimal allocation (Nisan & Ro-nen 2000). Previous researchon the optimizationof gen-eral CA problemshasfocusedexclusively on the simplersingle-unitCA (Fujishima,Leyton-Brown,& Shoham1999;Sandholm1999;Lehmann,O’Callaghan,& Shoham1999)).The generalmulti-unit problem has not previously beenstudied,nor have any heuristicsfor its solutionbeenintro-duced.

In this paperwe presenta novel algorithm, termedCA-MUS (CombinatorialAuction Multi-Unit Search),to com-putethewinnersin a general,multi-unit combinatorialauc-tion. A generalizationand extensionof our CASS algo-rithm for winner determinationin single-unit CA’s (Fu-jishima, Leyton-Brown, & Shoham1999),CAMUS intro-ducesanovel branch-and-boundtechniquethatmakesuseofseveral additionalprocedures.A crucial componentof any

APPENDIX B

12

Page 17: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

suchtechniqueis a function for computingupperboundson the� optimal outcome.We presentsuchan upperboundfunction,tailoredspecificallyto themulti-unit combinatorialauctionsproblem. We prove that this functiongivesanup-perboundontheoptimalrevenue,whichenablesusto showthatCAMUS is guaranteedto find optimalallocations.Wealso introducedynamic programmingtechniquesto moreefficiently handlemulti-unit single-goodbids. In addition,we presenttechniquesfor pre-processingandcaching,andheuristicsfor determiningsearchorderings,furthercapital-izing on the inherentstructureof multi-unit combinatorialauctions.

In thenext sectionwe formally definethegeneralmulti-unit combinatorialauctionproblem. In Section3 we de-scribeCAMUS. In Section4 we deal in somemoredetailwith someof CAMUS’s techniques.Due to lack of space,wecannotpresentall theCAMUSproceduresin detail;how-ever, this sectionwill clarify its most fundamentalcompo-nents. In Section5 we presentour experimentalsetupandsomeexperimentalresults.

ProblemDefinitionWe now definethe computationalproblemassociatedwithmulti-unit combinatorialauctions.

Let � � ��� �� ��� ��������� ����� be a set of goods. Let "!$#&%denotethe number of available units of good # . Con-sider a set of bids ' � �)( ���������� (+*,� . Bid (+- is a pair!/.0!1( - % ��2 !1( - %3% where .0!1( - % is the price offer of bid ( - , and

2 !1(+-+% � ! 2 !1(+-+% 4��2 !1(3-�%5� ����������2 !1(3-�%��6% where 2 !1(3-�%�7 is thenumberof requestedunitsof good # in (3- . If thereis no bidrequesting8 unitsof good 9 and0 unitsof all goods# �� 9(for some:<;=9>;@? andsome:<;A8B;C "!$9�% ) then,w.l.o.g,we augment' with a bid of price0 for thatbundle. An al-location DCEF' is a subsetof thebidswhere GIHKJ)L 2 !M(�%N7O; "!/#P% ( :Q;R#S;R? ). A partial allocation DUTKV�W+X - V�Y is an al-locationwhere,for some# , GIHKJZL�[�\1]M^$_`\+a 2 !1(�%N7<bc "!$#&% . A fullallocationisanallocationthatisnotpartial.Let d denotethesetof all allocations.The multi-unit combinatorialauctionproblemis thecomputationof anoptimalallocation,thatis,e�f �g? e�� LhJ�i>GIHKJ)L".0!1(�% . In short,wearesearchingfor asub-setof thebidsthatwill maximizetheseller’s revenuewhileallocatingeachavailableunit atmostonce.

Notethatthedefinitionof theoptimalallocationassumesthat bids areadditive–thatan auctionparticipantwho sub-mits multiple bids may be allocatedany numberof thesebids for a price that equalsthe sumof eachallocatedbid’spriceoffer. In somecases,however, a participantmaywishto submittwo or morebidsbut requirethatat mostonewillbeallocated.We permitsuchadditionalconstraintsthroughtheuseof dummygoods, introducedalreadyin (Fujishima,Leyton-Brown,& Shoham1999).Dummygoodsarenormalsingle-unitgoodswhichdonotcorrespondto actualgoodsintheauction,but serve to enforcemutualexclusionbetweenbids. For example, if bids ( and (1� referring to bundles

2 !1( % and 2 !1(1��% are intendedto be mutually exclusive, weadda dummygood j to eachbid: 2 !1( % becomes2 !1( %Uklj ,and 2 !1(1��% becomes2 !1(1��%Ikmj . Sincethe good j canbe al-locatedonly once,at mostoneof thesebids will be in anyallocation.(Moregenerally, it is possibleto introducen -unitdummygoodsto enforcetheconditionthatno morethan nof a setof bidsmaybeallocated.)While dummygoodsin-creasethe expressive power of the bidding language,theirusehasno impacton theoptimizationalgorithm.Hence,intheremainderof this paperwe do not discriminatebetweendummygoodsandreal goods,andwe assumethat all bidsareadditive.

In thesequel,we will alsomake useof thefollowing no-tation. Given an allocationD anda good 9 , we will denotethetotalnumberof unitsallocatedin D , andthetotalnumberof unitsof good9 allocatedin D , by o&n09qp�r�!/Ds% ando&n09qp�r�-+!/Ds%respectively. In addition o&n,9tp�r)!/p�u�p ewv % will denotethe totalnumberof unitsover all goods.

Algorithm DefinitionBranch-and-BoundSearch

Given a setof bids, CAMUS systematicallycomparestherevenuefrom all full allocationsin order to determinetheoptimal allocation. This comparisonis implementedas adepth-firstsearch:webuild upapartialallocationonebid ata time. Oncewehave constructeda full allocationweback-track,removing themostrecentlyaddedbid from thepartialallocationandaddinga new bid instead.Sometimeswe cansafelyprunethesearchtree,backtrackingbeforea full allo-cationhasbeenconstructed.Everytimeabid is addedto thecurrentallocation,CAMUS computesanestimateof therev-enuethatwill begeneratedby theunallocatedgoodswhichremain.Providedthatthisestimatefunction u"!+% alwayspro-videsan upperboundon the actualrevenue,we canprunewhenever .U!$Dx% � ug!$Dx%�;y.0!/DUHtz3{ X % , where D is the currentallocation,.0!$Dx% � GIHKJ)L".0!1(�% andDUHtz3{ X is thebestallocationobservedsofar.

Bins

Binsarepartitionedsetsof bids. Considersomeorderingofthegoods.Thereis onebin for eachgood,andeachbid be-longsto thebin correspondingto its lowest-ordergood.Dur-ing the searchwe start in the first bin andconsideraddingeachbid in turn. After addinga bid to our partial alloca-tion we move to the bin correspondingto the lowest-ordergoodwith any unallocatedunits. For example,if the firstbid we selectrequestsall unitsof goods1, 2 and4, we nextproceedto bin 3. Besidesmaking it easyto avoid consid-erationof conflicting bids, bins arepowerful becausetheyallow thepruningfunction to considercontext without sig-nificantcomputationalcost.If bidsin (t9tn|- arecurrentlybe-ing consideredthenthepruningfunctionmustonly take intoaccountbids from (t9qn}- ����� (t9tn|� . Becausethe partitioning

13

Page 18: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

of bids into binsdoesnot changeduringthesearchwe maycompute~ the pruninginformationfor eachbin in a prepro-cessingstep.

SubbinsIn the multi-unit setting,we will often needto selectmorethan one bid from a given bin. This leadsto the idea ofsubbins. A subbinis asubsetof thebidsin abin thatis con-structedduringthesearch.Sincesubbinsarecreateddynam-ically they cannotprovideprecomputedcontextual informa-tion; rather, they facilitatetheefficient selectionof multiplebidsfrom agivenbin. Every timeweaddabid to ourpartialallocationwe createa new subbincontainingthenext setofbids to consider. If thesearchmovesto a new bin, thenewsubbinis generatedfrom the new bin by removing all bidsthatconflict with thecurrentpartialallocation.If thesearchremainsin thesamebin, thenew subbinis createdfrom thecurrentsubbinby removing conflicting bids asabove, andadditionally: if (t94j �� (t94j�� ��������� (t94j)- is theorderedsetof ele-mentsin thecurrentsubbinand (t94j)7 is thebid thatwasjustchosen,thenweremoveall (t94j)� � 8�;@# . In thiswaywecon-siderall combinationsof non-conflictingbids in eachbin,ratherthanall permutations.

DominatedBidsSome bids may be removed from considerationin apolynomial-timepreprocessingstep. For eachpair of bids( ( , (M� ) wherebothnamethesamegoodsbut .U!M( %s�S.0!1(M�w%and 2 !1( %N7�; 2 !1(1�w%�7 for every good # , we may remove (M�from thelist of bidsto beconsideredduringthesearch,as (M�is neverpreferableto ( (hencewesaythat ( dominates(1� ).However, it is possiblethat an optimal allocationcontainsboth ( and (1� . For this reasonwe store (1� in a secondarydatastructureassociatedwith ( , andconsideraddingit toanallocationonly afteradding ( .Dynamic ProgrammingSingletonbids (that is, bids that nameunits from only onegood)deserve specialattention. Thesebids will generallybeamongthemostcomputationallyexpensive to consider–thenumberof nodesto searchafteraddinga very shortbidis nearly the sameas the numberof nodesto searchafterskipping the bid, becausea short bid allocatesfew unitsandhenceconflictswith few otherbids. Unfortunately, weexpect that singletonbids will be quite commonin a vari-ety of real-world multi-unit CA’s. CAMUS simplifies theproblemof singletonbids by applying a polynomial-timedynamicprogrammingtechniqueas a preprocessingstep.We constructa vector r+9qn0� v 2 p�uKn|� for eachgood � , whereeachelementof the vector is a setof singletonbids nam-ing only good � . r+9qn0� v 2 p�u�n}��!/#&% evaluatesto the revenue-maximizingsetof singletonbidstotaling # unitsof good � .This freesus from having to considersingletonbids indi-vidually; instead,we consideronly elementsof the single-

ton vector and treat theseelementsas atomic bids duringthe search. Also, thereis never a needto add more thanoneelementfrom eachsingletonvector. To seewhy, imag-ine that we addboth r+9qn0� v 2 p�u�n}��!/#&% and r+9qn0� v 2 p�u�n}��!18&% toour partial allocation. Thesetwo elementsmay have bidsin common,and additionally theremay be singletonbidswith morethan ? e�� !/# � 8&% elementsthatwould not conflictwith our partialallocationbut thatwe have not considered.Clearly, we would be betteroff addingthe singleelementr+9qn0� v 2 p�u�n}��!/# � 8&% .Caching

Considera partial allocation D that is reachedduring thesearchphase.If thesearchproceedsbeyond D then u"!/D %wasnot sufficiently small to allow usto backtrack.Later inthesearchwemayreachanallocationD,� which,by combin-ing differentbids,coversexactly thesamenumberof unitsof thesamegoodsasD . CAMUS incorporatesamechanismfor cachingthe resultsof the searchbeyond D to generatea betterestimatefor the revenuegiven D,� thanis given byu"!/D,�w% . (SinceD and D,� do not differ in theunitsof goodsthatremain,u"!/D % � u"!/D,�w% .) Considerall theallocationsex-tendingD uponconsiderationof which thealgorithmback-tracked, denotedr �� rK� ��������� r�� . When we backtracked ateachr - we did sobecause.0!1r - % � ug!Mr - %>;�.0!/D Htz3{ XK% , asex-plainedabove. It follows that ? e�� -+!/.0!1r4-�% � ug!Mr�-+%+% is anoverestimateof therevenueattainablebeyond D , andthatitis asmalleroverestimatethan u"!/D % (if it werenot,wewouldhave backtracked at D instead).Sincein general.0!/D % ��.0!/D,��% , wecachethevalue? e�� -+!/.0!Mr�-+% � u"!1r4-�%3%��B.0!/D % andbacktrackwhen .0!/D,��% ����e���� 2 !/D,��%>;S.U!$DUHtz3{ X % . Our cacheis implementedasa hashtable,sincecachingis only bene-ficial to theoverall searchif lookuptime is inconsequential.A consequenceof this choiceof datastructureis thatcachedatamaysometimesbeoverwritten;weoverwriteanold en-try in the cachewhen the searchassociatedwith the newentry examinedmore nodes. Even when we do overwriteuseful data the error is not catastrophic,however: in theworst casewe mustsimply searcha subtreethat we mightotherwisehavepruned.

Heuristics

Two orderingheuristicsareusedto improve CAMUS’s per-formance. First, we must determinean ordering of thegoods;thatis,whichgoodcorrespondsto thefirst bin,whichcorrespondsto thesecond,etc.For eachgood9 wecomputer � u f 2 - �

*"�Z�sHt-/�K{4_N� ����-��V�� �K�Z*�- X {K_ , wheren,o&?@(t94j�r4- is the numberofbidsthat requestgood 9 and e�� ��o&n,9tp�r4- is theaveragenum-ber of total units (i.e., not just units of good 9 ) requestedby thesebids. We designatethe lowest-ordergood as thegoodwith thelowestscore,thenwerecalculatethescoreforthe remaininggoodsandrepeat. The intuition behindthisheuristicis asfollows:

14

Page 19: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

� We want to minimize the numberof bids in low-orderbins,� to minimizeearlybranchingandthusto make eachindividualprunemoreeffective.

� Wewantto minimizethenumberof unitsof goodscorre-spondingto low-orderbins,sothatwe will morequicklymove beyond the first few bins. As a result,the pruningfunctionwill beableto take into accountmorecontextualinformation.

� We wantto maximizethetotal numberof unitsrequestedby bids in low-order bins. Taking thesebids moves usmorequickly towardsthe leavesof thesearchtree,againproviding the pruningfunction with morecontextual in-formation.

Our secondheuristicdeterminestheorderingof bidswithinbins. Given current partial allocation D , we sort bidsin a given bin in descendingorder of r � u f 2 !M(+7)% , where

r � u f 2 !1(+7)% �T ��H1�N��Z*�- X {���H1�q� � u"!/D�k�(37)% . Theintuition behindthis

heuristicis that theaveragepriceperunit of (t94j)7 is a mea-sureof how promisingthebid is, while thepruningoveresti-matefor u"!/D6k�(q9�jZ7)% is anestimateof how promisingtheun-allocatedunitsare,giventhepartialallocation.This heuris-tic helpsCAMUS to find goodallocationsquickly, improv-ing anytime performanceandalsoincreasingD0Htz3{ X , makingpruningmoreeffective. Becausethe pruningoverestimatedependson D , thisorderingis performeddynamicallyratherthanasapre-processingstep.

CAMUS OutlineBasedon theabove, it is now possibleto give anoutlineoftheCAMUS algorithm:

� Process dominated bids.� Determine an ordering on the goods,according to the good-ordering heuristic.

� Using the dynamic programming technique,determine the optimal combination ofsingleton bids totaling ���K�K�N���$�"  for eachgood � .

� Partition all non-singleton bids intobins, according to the good ordering.

� Precompute pruning information for eachbin.

� Set ¡}¢=� and £�¢¥¤�¦ .� Recursive entry point:

– For � = 1 ...number of bids in thecurrent subbin of §+¡�¨U© .ª £�¢�£�«¬§�¡�­�® .ª If (��$£| &°B±�²)±)³�´"�/£| |µ�¯w�/£U¶�·M¸N¹q  ) backtrack.ª If (��$£| &°Bºg�/£| |µ�¯w�/£U¶�·M¸N¹q  ) backtrack.ª If (»w¨P¡�¼4½4�$£| ¾¢¿»w¨P¡�¼4½K�/¼qº�¼N²�ÀM  ) record £ if itis the best; backtrack.

ª Set ¡ to the index of the lowest-ordergood in £ where »w¨P¡�¼K½K©N�/£| }ÁÂ���$¡+  . (¡ may ormay not change)ª Construct a new subbin based on theprevious subbin of §�¡�¨�© (which is §�¡�¨�©itself if ¡ changed above):à Include all §�¡�­PÄ from current subbin,where ÅÇÆ�� .à Include all dominated bids associatedwith §+¡�­�® .à Include ½t¡�¨hÈZÀ5´4¼qº�¨�©��/���$¡+ &É�»w¨P¡�¼K½K©��/£| M  .à Sort the subbin according to thesubbin-ordering heuristic.à Recurse to the recursive entry point,above, and search this new subbin.ª £�¢�£�ÉB§+¡�­g® .

– End For� Return the optimal allocation: £ ¶N·t¸N¹ .

CAMUS procedures: a closerlookIn this sectionwe examinetwo of CAMUS’s fundamentalproce-duresmore formally. Additional detailswill be presentedin ourfull paper.

PruningIn this subsectionwe explain the implementationof CAMUS’spruningfunction anddemonstratethat it is guaranteednot to un-derestimatetherevenueattainablegivena partialallocation.Con-sidera point in thesearchwherewe have constructedsomepartialallocation £ . The taskof our pruning function is to give an up-perboundon theoptimal revenueattainablefrom theunallocateditems,usingtheremainingbids(i.e., thebidsthatmaybeencoun-teredduring the remainderof the search). Hence,in the sequelwhenwereferto goods,thenumberof unitsof agoodandbids,wereferto whatremainsatourpoint in thesearch.

First, we provide an intuitive overview. For every (remaining)good � we will calculatea value Ê��/�"  . Simplifying slightly, thisvalueis thelargestaveragepriceperunit of all the(remaining)bidsrequestingunitsof good� thatdonotconflictwith £ , multipliedbythenumberof (remaining)unitsof � . Thesumof Ê��/�"  valuesfor allgoodsis anupperboundon optimalrevenuebecauseit relaxestheconstraintthatthebidsin theoptimalallocationmaynot conflict.

More formally, let Ë@¢�¤3È�Ì+ÍNÈ�Î4ÍK�K�K�KÍNÈ�ÏЦ beasetof goods.Let��Ñ��/�"  denotethe numberof availableunits of good � . Consideraset of bids ÒÓ¢Ô¤3§"Ì+ÍK�K�K�KÍN§�Õw¦ . Bid §�© is associatedwith a pair�/¯w�/§�©N 1ÍN´"�/§�©N t  where��$§�©N  is thepriceoffer of bid §�© , and ´"�/§�©N Ö¢�/´"�$§�©N 3Ì�ÍN´"�/§�©N 1ÎKÍK�K�K�KÍN´"�$§�©N qÏ>  where"�/§�©N N® is therequestednumberof unitsof good� in §�© . For eachbid §�© , let ²P�/§�©N U¢ ×�Ø ¶ _�ÙÚPÛ`Ü � Ü&Ý · Ø ¶ _NÙ��betheaveragepriceperunit of bid §�© . Noticethattheaveragepriceperunit maychangedramaticallyfrom bid to bid, andit is a non-trivial notion; our techniquewill work for any arbitrary averagepriceperunit. Let Þs�/�"  beasortedlist of thebidsthatreferto non-zerounitsof good� ; thelist is sortedin amonotonicallydecreasingmanneraccordingto the ²h© ’s. Let ß Þà�/�" 1ß denotethe numberofelementsin Þs�/�"  , andlet Þs�/�"  Ä denotethe Å -th elementof Þs�/�"  .Êh�$�"  is determinedby thefollowing algorithm:

15

Page 20: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

Let Êh�/�"  :=0;Let ál�/�"  :=0;For ¡}â�¢¥� to ß Þs�$�" 1ß doif ál�/�" �Á���Ñ��/�"  then¤ let ­Óâ�¢ãá�¡�¨|�/´"�/Þà�/�" t©� N®�ÍN���$�" <Éäál�/�" t Måqál�/�" c¢ãál�/�" æ°­håqÊh�/�� �¢�Êh�/�" P°B²P�/Þà�/�" t©N  à ­P¦Theorem1 Let Ò6çÇ¢è¤3§�é Ì ÍN§�éÎ ÍK�K�K�KÍN§�é¸ ¦ be thebids in an optimalallocation.Then,ê�çë¢=ì ¶tí)î,ï ¯w�/§Z �µ=ì Ì1ðg®�ð�Ï Êh�/�"  .Sketch of proof: Consider the bid §)çòñ Ò6ç . Then,¯w�/§)ç� O¢¾ì Ì1ðg®KðwÏ ²P�/§ZçK  à ´��/§)ç� N® . Hence, ê�çl¢¾ìë¶tí)î,ïM¯w�/§� O¢ìó¶qí)î,ï�ì ÌMð�®�ð�Ï ²P�/§�  à ´"�/§� N® . By changingtheorderof summationwe get that ê�笢Aì ÌMð�®�ð�Ï ì ¶tí)î,ï ²P�/§�  à ´"�$§Z N® . Notice that,givena particular � , the contribution of bid § to ìë¶tí)î ï ²P�/§�  à ´"�/§� N® is²P�/§�  à ´"�/§� N® . Recallnow that Êh�/��  hasbeenconstructedfrom thesetof all bids that refer to good � by choosingthemaximalavail-ableunits of good � from the bids in Þs�$�"  , wherethesebids aresortedaccordingto theaveragepriceperunit of good. Hence,weget Êh�/�" �ô=ìë¶tí)î ï ²P�/§�  à ´"�$§Z N® . Giventhattheaboveholdsfor everygood� , this impliesthat ì ÌMð�®�ð�Ï Êh�/�" �ô@ìó¶tíZî0ïM¯w�/§�  , asrequested.

Theabove theoremis thecentraltool for proving thefollowingtheorem:

Theorem2 CAMUSis complete:it is guaranteedto find theopti-malallocationin a multi-unit combinatorialauctionproblem.

Pre-Processingof SingletonsIn this subsectionwe explain the constructionof the ½t¡�¨PÈZÀ/´�¼Nº�¨�õvectordescribedabove,anddemonstratethat ½t¡�¨PÈZÀ/´�¼Nº�¨�õ��/�"  is therevenue-maximizingsetof singletonbidsfor good È thatrequestatotal notexceeding� units.

Let §�Ì�ÍN§ZÎ4ÍK���K�KÍN§Zö be bids for a singlegood È , wherethe totalnumberof availableunits of good È is � . Let ¯w�/§�©�  and ´"�/§�©N  bethepriceoffer andthequantityrequestedby §�© , respectively. Ouraim is to computetheoptimalselectionof §�© ’s in orderto allocateÅ units of good È , for �lµ÷ÅmµA� . Considera two dimensionalgrid of size ø��|�K�K�NÀ$ùûú�ø��|�K�K�N��ù wherethe �$¡�ÍN�"  -th entry, denotedbyü �/¡�ÍN�"  , is the optimal allocationof � units consideringonly bids§"Ì+ÍN§ZÎ4ÍK�K�K�KÍN§�© . Thevalueof

ü �/¡+ÍN�"  , denotedby ý¬�$¡�ÍN�"  , is thesumof the price offers of the bids in

ü �/¡�Í��"  . ü �t�)Í��"  will be §"Ì if §"Ìrequestsnomorethan� units,andotherwisewill betheemptyset.Now wecandefine

ü �/¡�ÍN��  recursively:

1. ´��/§�©� |Æ�� : ü �/¡+ÍN�" �¢ ü �/¡0ÉÂ�ZÍN�"  ;2. ´��/§�©� �¢þ� : if ¯w�/§�©N 6ÆÿýÐ�/¡óÉ��ZÍN�"  then

ü �$¡�ÍN�" �¢c§�© . Elseü �/¡+ÍN�" �¢ ü �/¡0ÉÂ�ZÍN�"  .3. ´��/§�©� sÁ@� : if ýÐ�/¡|É �ZÍN�� sô@¯w�/§�©� ,°@ýÐ�/¡|É �ZÍN�¬É ´"�/§�©N M  thenü �/¡+ÍN�" �¢ ü �/¡PÉ �)ÍN��  . Else

ü �/¡+ÍN�" �¢�§�©"« ü �/¡PÉ �ZÍN�ëÉ ´"�/§�©N M  .This dynamicprogrammingprocedureis polynomial, and yieldsthe desiredresult; the optimal allocationof Å units is given byü �/À1ÍKÅ"  . Set ½t¡�¨PÈZÀ/´�¼Nº�¨Uõ��1Å�  =

ü �/À1ÍKÅ�  , �е=Åǵ�� .Experimental results

Unfortunately, no real-world dataexists to describehow bidderswill behave in generalmulti-unit combinatorialauctions,preciselybecausethe determinationof winnersin suchauctionswasprevi-ouslyunfeasible.WehavethereforetestedCAMUS onsetsof bidsdrawn from a randomdistribution. We createdbids as follows,

varyingtheparametersP»wá õ çMç � ¸ andP»wá ¶ © � ¸ , andfixing thepa-rameters»w¨h¡�¼K½KÏ����Т �

, ²)Ê)ÈZ¯���¡�±�´ ¶ � ¸t· ¢ �, ²)Ê)È)¯���¡�±�´ ����à¢�� � ,

¯���º�§"Ì}¢ ��� , ¯���º�§)ÎI¢ ��� � , ¯���¡�±�´ ����ࢠ� � :

1. Setthenumberof unitsthatexist for eachgood:

(a) For eachgood ¡ , randomlychoose»w¨h¡�¼K½K© from the rangeø����K�K�N»w¨P¡�¼4½�Ï����+ù .

(b) If ìI©�»w¨h¡�¼K½K©��¢ Õ���Ï�� ïtï�� � Ú"!�# _�^ � Ý \$�&% Û ®��Õ�© ¹�¸ Ý \$ (the expectationon

ìI©�»w¨P¡�¼K½K© ) thengoto (a). Thisensuresthateachtrial involvesthesametotal numberof units.

2. Set an average price for each good: ²�ÊZÈ)¯���¡�±�´�© is drawnuniformly randomly from the range ø ²)Ê)È)¯���¡�±�´g¶ � ¸t·þɲ)Ê)È)¯���¡�±�´ ���� �K�K�N²�Ê)ÈZ¯���¡�±�´ ¶ � ¸t· °B²�Ê)ÈZ¯���¡�±�´ ����� ù .

3. Selectthe numberof goodsin the bid. This numberis drawnfrom adecaydistribution:

(a) Randomlychoosea goodthathasnot alreadybeenaddedtothis bid

(b) With probability ¯���º�§"Ì , if moregoodsremainthengo to (a)

4. Selectthe numberof units of eachgood,accordingto anotherdecaydistribution:

(a) Add aunit

(b) With probability ¯���º�§)Î , if moreunitsremainthengo to (a)

5. Set a price for this bid: ¯���¡�±�´è¢'��²)¨P­h�M� É@¯���¡�±�´ ������Í�� °¯���¡�±�´ �����  Ã ì © í)¶ © � �/²�ÊZÈ)¯���¡�±�´�© à »w¨P¡�¼K½K©N This distribution hasthe following characteristicsthat we con-

siderto bereasonable.Bids will tendto requesta smallnumberofgoods,independentof thetotal numberof goods.Suchdatacasesarecomputationallyharderthandrawing a numberof goodsuni-formly from a range,or thanscalingtheaveragenumberof goodsperbid to themaximumnumberof goods.Likewise,bidswill tendto namea smallnumberof unitspergood. Pricestendto increaselinearly in the numberof units, for a fixed setof goods. This isa hardercasefor our pruningtechnique,muchharderthandraw-ing pricesuniformly from arange.In fact,it maybereasonableforpricesto besuperlinearin thenumberof units,asthemotivationforholdinga CA in thefirst placemaybethatbiddersareexpectedtovaluebundlesmorethanindividualgoods.However, thiswouldbeaneasiercasefor ourpruningalgorithm,sowe testedon thelinearcaseinstead.Theconstructionof realistic,harddatadistributionsremainsa topic for furtherresearch.

Our experimentaldatawascollectedon a PentiumIII-733 run-ning Windows 2000,with 25 MB allocatedfor CAMUS’s cache.Our figureNumberof BidsvsTimeshows CAMUS’s performanceon the distribution describedabove, with eachline representingrunswith adifferentnumberof goods.Notethat,for example,CA-MUS solvedproblemswith 35objects(14goods)and2500bidsinabouttwo minutes,andproblemswith 25 objects(10 goods)and1500bids in abouta second.Becausethe lines in this grapharesub-linearonthelogarithmicscale,CAMUS’sperformanceis sub-exponentialin the numberof bids, thoughit remainsexponentialin the numberof goods. Our figure Percentage Optimality showsCAMUS’sanytimeperformance.Eachline onthegraphshowsthetimetakento find solutionswith revenuethatis somepercentageoftheoptimal,calculatedafterthealgorithmterminated.Notethatthetimetakento find theoptimalsolutionis lessthanthetimetakenforthealgorithmto finish,proving thatthis solutionis optimal.These

16

Page 21: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

anytime resultsare very encouraging–notethat CAMUS finds a99%optimal solutionanorderof magnitudemorequickly thanittakes for the algorithm to run to completion. This suggeststhatCAMUS could be usefulon much larger problemsthanwe haveshown hereif anoptimalsolutionwerenot required.

ConclusionsIn this paperwe introducedCAMUS, a novel algorithmfor deter-miningtheoptimalsetof winningbidsin generalmulti-unit combi-natorialauctions.Thealgorithmhasbeentestedonavarietyof datadistributionsandhasbeenfoundto solveproblemsof considerablescalein anefficientmanner. CAMUS extendsourCASSalgorithmfor single-unitcombinatorialauctions,andenablesa wide exten-sion of the classof combinatorialauctionsthat canbe efficientlyimplemented.In our currentresearchwe arestudyingtheadditionof randomnoiseinto our goodandbin orderingheuristics,com-binedwith periodicrestartsandthedeletionof previously-searchedbids, to improve performanceon hard caseswhile still retainingcompleteness.

Number of Bids vs. Time (s)

0.01

0.1

1

10

100

1000

10000

0 500 1000 1500 2000 2500

Number of Bids

Tim

e (s

)av

erag

e ov

er 1

0 ru

ns

25 units (10 goods) 30 units (12 goods) 35 units (14 goods) 40 units (16 goods)

Percentage Optimality

0.01

0.1

1

10

100

0 500 1000 1500 2000 2500

Number of Bids

Tim

e (s

) av

erag

e ov

er 1

0 ru

ns

0.8 0.97 0.98 0.99 1 finished

ReferencesFudenberg, D., andTirole, J. 1991.GameTheory. MIT Press.

Fujishima,Y.; Leyton-Brown, K.; andShoham,Y. 1999.Tamingthecomputationalcomplexity of combinatorialauctions:Optimalandapproximateapproaches.In IJCAI-99.

Lehmann,D.; O’Callaghan,L.; and Shoham,Y. 1999. Truthrevalation in rapid, approximatelyefficient combinatorialauc-tions. In ACM ConferenceonElectronicCommerce.

Monderer, D., and Tennenholtz,M. 2000. Optimal AuctionsRevisited. Artificial Intelligence,forthcoming.

Nisan,N., andRonen,A. 2000. Computationallyfeasiblevcgmechanisms.To appear.

Nisan,N. 1999.Biddingandallocationin combinatorialauctions.Working paper.

Rothkopf, M.; Pekec, A.; andHarstad,R. 1998. Computation-ally manageablecombinatorialauctions. ManagementScience44(8):1131–1147.

Sandholm,T. 1999.An algorithmfor optimalwinnerdetermina-tion in combinatorialauctions.In IJCAI-99.

Tennenholtz,M. 2000. Sometractablecombinatorialauctions.To appearin theproceedingsof AAAI-2000.

Vries, S., andVohra,R. 2000. Combinatorialauctions:A briefsurvey. Unpublishedmanuscript.

Wellman, M.; Wurman,P.; Walsh, W.; and MacKie-Mason,J.1998. Auction protocolsfor distributedscheduling.Working pa-per(to appearin GamesandEconomicBehavior).

17

Page 22: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

Towards a Universal Test Suite forCombinatorial Auction Algorithms

Kevin Leyton-BrownDept. of Computer Science

Stanford UniversityStanford, CA 94305

[email protected]

Mark PearsonDept. of Computer Science

Stanford UniversityStanford, CA 94305

[email protected]

Yoav ShohamDept. of Computer Science

Stanford UniversityStanford, CA 94305

[email protected]

ABSTRACTGeneral combinatorial auctions—auctions in which biddersplace unrestricted bids for bundles of goods—are the sub-ject of increasing study. Much of this work has focused onalgorithms for finding an optimal or approximately optimalset of winning bids. Comparatively little attention has beenpaid to methodical evaluation and comparison of these al-gorithms. In particular, there has not been a systematicdiscussion of appropriate data sets that can serve as uni-versally accepted and well motivated benchmarks. In thispaper we present a suite of distribution families for generat-ing realistic, economically motivated combinatorial bids infive broad real-world domains. We hope that this work willyield many comments, criticisms and extensions, bringingthe community closer to a universal combinatorial auctiontest suite.1

1. INTRODUCTION

1.1 Combinatorial AuctionsAuctions are a popular way to allocate goods when the

amount that bidders are willing to pay is either unknown orunpredictably changeable over time. The rise of electroniccommerce has facilitated the use of increasingly complexauction mechanisms, making it possible for auctions to beapplied to domains for which the more familiar mechanismsare inadequate. One such example is provided by combina-torial auctions (CA’s), multi-object auctions in which bidsname bundles of goods. These auctions are attractive be-cause they allow bidders to express complementarity andsubstitutability relationships in their valuations for sets ofgoods. Because CA’s allow bids for arbitrary bundles ofgoods, an agent may offer a different price for some bundleof goods than he offers for the sum of his bids for its disjoint

1This work was partly supported by DARPA grant numberF30602-98-C-0214-P00005.

Permission to make digital or hard copies of all or part of this work forpersonal or classroom use is granted without fee provided that copies arenot made or distributed for profit or commercial advantage and that copiesbear this notice and the full citation on the first page. To copy otherwise, torepublish, to post on servers or to redistribute to lists, requires prior specificpermission and/or a fee.EC’00, October 17-20, 2000, Minneapolis, Minnesota.Copyright 2000 ACM 1-58113-272-7/00/0010 ..$5.00

subsets; in the extreme case he may bid for a bundle withthe guarantee that he will not receive any of its subsets. Anexample of complementarity is an auction of used electronicequipment, in which a bidder values a particular TV at xand a particular VCR at y but values the pair at z > x + y.An agent with substitutable valuations for two copies of thesame book might value either single copy at x, but valuethe bundle at z < 2x. In the special case where z = x (theagent values a second book at 0, having already bought afirst) the agent may submit the set of bids {bid1 XOR bid2}.By default, we assume that any satisfiable sets of bids thatare not explicitly XOR’ed is a candidate for allocation. Wecall an auction in which all goods are distinguishable fromeach other a single-unit CA. In contrast, in a multi-unit CAsome of the goods are indistinguishable (e.g., many iden-tical TVs and VCRs) and bidders request some number ofgoods from each indistinguishable set. This paper is primar-ily concerned with single-unit CA’s, since most research todate has been focused on this problem. However, when ap-propriate we will discuss ways that our distributions couldbe generalized to apply to multi-unit CA’s.

1.2 The Computational Combinatorial Auc-tion Problem

In a combinatorial auction, a seller is faced with a setof price offers for various bundles of goods, and his aim isto allocate the goods in a way that maximizes his revenue.(For an overview of this problem, see [8].) This optimizationproblem is intractable in the general case, even when eachgood has only a single unit. Because of the intractability ofgeneral CA’s, much research has focused on subcases of theCA problem that are tractable; see [22] and more recently[25]. However, these subcases are very restrictive and there-fore are not applicable to many CA domains. Other researchattempts to define mechanisms within which general CA’swill be tractable (achieved by various trade-offs includingbid withdrawal penalties, activity rules and possible ineffi-ciency). Milgrom [15] defines the Simultaneous AscendingAuction mechanism which has been very influential, partic-ularly in the recent FCC spectrum auctions. However, thisapproach has drawbacks, discussed for example in [6]. Inthe general case there is no substitute for a completely un-restricted CA. Consequently, many researchers have recentlybegun to propose algorithms for determining the winners ofa general CA, with encouraging results. This wave of re-search has given rise to a new problem, however. In orderto test (and thus to improve) such algorithms, it has been

APPENDIX C

18

Page 23: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

necessary to use some sort of test suite. Since general CA’shave never been widely held, there is no data recording thebidding behavior of real bidders upon which such a test suitemay be built. In the absence of such natural data, we areleft only with the option of generating artificial data thatis representative of the sort of scenarios one is likely to en-counter. The goal of this paper is to facilitate the creationof such a test suite.

2. PAST WORK ON TESTING CAALGORITHMS

2.1 Experiments with Human SubjectsOne approach to experimental work on combinatorial auc-

tions uses human subjects. These experiments assign valu-ation functions to subjects, then have them participate inauctions using various mechanisms [3, 12, 7]. Such tests canbe useful for understanding how real people bid under differ-ent auction mechanisms; however, they are less suitable forevaluating the mechanisms’ computational characteristics.In particular, this sort of test is only as good as the sub-jects’ valuation functions, which in the above papers werehand-crafted. As a result, this technique does not easilypermit arbitrary scaling of the problem size, a feature thatis important for characterizing an algorithm’s performance.In addition, this method relies on relatively naive subjectsto behave rationally given their valuation functions, whichmay be unreasonable when subjects are faced with complexand unfamiliar mechanisms.

2.2 Particular ProblemsA parallel line of research has examined particular prob-

lems to which CA’s seem well suited. For example, re-searchers have considered auctions for the right to use rail-road tracks [5], real estate [19], pollution rights [13], airporttime slot allocation [21] and distributed scheduling of ma-chine time [26]. Most of these papers do not suggest holdingan unrestricted general CA, presumably because of the com-putational obstacles. Instead, they tend to discuss alterna-tive mechanisms that are tailored to the particular problem.None of them proposes a method of generating test data,nor does any of them describe how the problem’s difficultyscales with the number of bids and goods. However, theystill remain useful to researchers interested in general CA’sbecause they give specific descriptions of problem domainsto which CA’s may be applied.

2.3 Artificial DistributionsRecently, a number of researchers have proposed algo-

rithms for determining the winners of general CA’s. Inthe absence of test suites, some suggested novel bid gen-eration techniques, parameterized by number of bids andgoods [24, 10, 4, 8]. (Other researchers have used one ormore of these distributions, e.g., [17], while still others haverefrained from testing their algorithms altogether, e.g., [16,14].) Parameterization represents a step forward, making itpossible to describe performance with respect to the prob-lem size. However, there are several ways in which each ofthese bid generation techniques falls short of realism, con-cerning the selection of which goods and how many goods torequest in a bundle, what price to offer for the bundle, andwhich bids to combine in an XOR’ed set. More fundamen-tally, however, all of these approaches suffer from failing to

model bidders explicitly, and from attempting to representan economic situation with an non-economic model.

2.3.1 Which goodsFirst, each of the distributions for generating test data

discussed above has the property that all bundles of thesame size are equally likely to be requested. This assumptionis clearly violated in almost any real-world auction: most ofthe time, certain goods will be more likely to appear togetherthan others. (Continuing our electronics example, TVs andVCRs will be requested together more often than TVs andprinters.)

2.3.2 Number of goodsLikewise, each of the distributions for generating test data

determines the number of goods in a bundle completely in-dependently from determining which goods appear in thebundle. While this assumption appears more reasonable itwill still be violated in many domains, where the expectedlength of a bundle will be related to which goods it contains.(For example, people buying computers will tend to makelong combinatorial bids, requesting monitors, printers, etc.,while people buying refrigerators will tend to make shortbids.)

2.3.3 PriceNext, there are problems with the pricing2 schemes used

by all four techniques. Pricing is especially crucial: if pricesare not chosen carefully then an otherwise hard distributioncan become computationally easy.

In Sandholm [24] prices are drawn randomly from either[0, 1] or from [0, g], where g is the number of goods requested.The first method is clearly unreasonable (and computation-ally trivial) since price is unrelated to the number of goodsin a bid—note that a bid for many goods and for a smallsubset of the same bid will have exactly the same price onexpectation. The second is better, but has the disadvan-tage that average and range are parameterized by the samevariable.

In Boutilier et al.[4] prices of bids are distributed normallywith mean 16 and standard deviation 3, giving rise to thesame problem as the [0, 1] case above.

In Fujishima et al.[10] prices are drawn from [g(1−d), g(1+d)], d = 0.5. While this scheme avoids the problems de-scribed above, prices are simply additive in g and are unre-lated to which goods are requested in a bundle, both unre-alistic assumptions in some domains.

More fundamentally, Andersson et al.[1] note a criticalpricing problem that arises in several of the schemes dis-cussed above. As the number of bids to be generated be-comes large, a given short bid will be drawn much morefrequently than a given long bid. Since the highest-pricedbid for a bundle dominates all other bids for the same bun-dle, short bids end up being much more competitive. In-deed, it is pointed out that for extremely large numbersof bids a good approximation to the optimal solution issimply to take the best singleton bid for each good. Onesolution to this problem is to guarantee that a bid will

2Most of the existing literature on artificial distributionsin combinatorial auctions refers to the monetary amountassociated with a bundle as a “price”. In Section 3 we willadvocate the use of different terminology, but in this sectionwe use the existing term for clarity.

19

Page 24: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

be placed for each bundle at most once (for example, thisapproach is taken by Sandholm[24]). However, this solu-tion has the drawback that it is unrealistic: different realbidders are likely to place bids on some of the same bun-dles.

Another solution to this problem is to make bundle pricessuperadditive in the number of goods they request—an as-sumption that may also be reasonable in many CA domains.A similar approach is taken by deVries and Vohra [8], whomake the price for a bid a quadratic function of the pricesof bids for subsets. For some domains this pricing schememay result in too large an increase in price as a functionof bundle length. The distributions presented in this pa-per will include a pricing scheme that may be configuredto be superadditive or subadditive in bundle length, whereappropriate, parameterized to control how rapidly the priceoffered increases or decreases as a function of bundle length.

2.3.4 XOR bidsFinally, while most of the bid-generation techniques dis-

cussed above permit bidders to submit sets of bids XOR’edtogether, they have no way of generating meaningful sets ofsuch bids. As a consequence the computational impact ofXOR’ed bids has been very difficult to characterize.

3. GENERATING REALISTIC BIDSWhile the lack of standardized, realistic test cases does

not make it impossible to evaluate or compare algorithms,it does make it difficult to know what magnitude of real-world problems each algorithm is capable of solving, or whatfeatures of real-world problems each algorithm is capable ofexploiting. This second ambiguity is particularly troubling:it is likely that algorithms would be designed differently ifthey took the features of more realistic3 bidding into ac-count.

3.1 Prices, price offers and valuationsThe term “price” has traditionally been used by researchers

constructing artificial distributions to describe the amountoffered for a bundle. However, this term really refers to theamount a bidder is made to pay for a bundle, which is ofcourse mechanism-specific and is often not the same as theamount offered. Indeed, it is impossible to model bidders’price offers at all without committing to a particular auctionmechanism. In the distributions described in this paper, wewill assume a sealed-bid incentive-compatible mechanism,where the price offered for a bundle is equal to the bid-der’s valuation. Hence, in the rest of this paper, we will usethe terms price offer and valuation interchangeably. Re-searchers wanting to model bidding behavior in other mech-anisms could transform the valuation generated by our dis-tributions according to bidders’ equilibrium strategies in thenew mechanism.

3.2 The CATS suite

3Previous work characterizes hard cases for weighted setpacking—equivalent to the combinatorial auction problem.Real-world bidding is likely to exhibit various regularities,however, as discussed throughout this paper. A data set de-signed to include the same regularities may be more usefulfor predicting the performance of an algorithm in a real-world auction.

In this paper we present CATS (Combinatorial AuctionTest Suite), a suite of distributions for modeling realisticbidding behavior. This suite is grounded in previous re-search on specific applications of combinatorial auctions, asdescribed in section 2.1 above. At the same time, all ofour distributions are parameterized by number of goods andbids, facilitating the study of algorithm performance. Thissuite represents a move beyond current work on modelingbidding in combinatorial auctions because we provide aneconomic motivation for both the contents and the valuationof a bundle, deriving them from basic bidder preferences. Inparticular, in each of our distributions:

• Certain goods are more likely to appear together thanothers.

• The number of goods appearing in the bundle is oftenrelated to which goods appear in the bundle.

• Valuations are related to which goods appear in thebundle. Where appropriate, valuations can be config-ured to be subadditive, additive or superadditive inthe number of goods requested.

• Sets of XOR’ed bids are constructed in meaningfulways, on a per-bidder basis.

We do not intend for this paper to stand as an isolatedstatement on bidding in combinatorial auctions, but ratheras the beginning of a dialogue. We hope to receive manysuggestions and criticisms from members of the CA com-munity, enabling us both to update the distributions pro-posed here and to include distributions modeling new do-mains. In particular, our distributions include many param-eters, for which we suggest default values. Although thesevalues have evolved somewhat during our development ofthe test suite, it has not yet been possible to understandthe role each parameter plays in the difficulty or realismof the resulting distribution, and our choice may be seenas highly subjective. We hope and expect to receive criti-cisms about these parameter values; for this reason we in-clude a CATS version number with the defaults to differ-entiate them from future defaults. The suite also containsa legacy section including all bid generation techniques de-scribed above, so that new algorithms may easily be com-pared to previously-published results. More information onour test suite, including executable versions of our distri-butions for Solaris, Linux and Windows may be found athttp://robotics.stanford.edu/CATS .

In section 4, below, we present distributions based on fivereal-world situations. For most of our distributions, themechanism for generating bids requires first building a graphrepresenting adjacency relationships between goods. Later,the mechanism uses the graph, generated in an economically-motivated way, to derive complementarity properties be-tween goods and substitutability properties for bids. Of thefive real-world situations we model, the first three concerncomplementarity based on adjacency in (physical or con-ceptual) space, while the final two concern complementaritybased on correlation in time. Our first example (4.1) mod-els shipping, rail and bandwidth auctions. Goods are repre-sented as edges in a nearly planar graph, with agents submit-ting an XOR’ed set of bids for paths connecting two nodes.Our second example (4.2) models an auction of real estate,or more generally of any goods over which two-dimensional

20

Page 25: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

adjacency is the basis of complementarity. Again the rela-tionship between goods is represented by a graph, in thiscase strictly planar. In (4.3) we relax the planarity assump-tion from the previous example in order to model arbitrarycomplementarities between discrete goods such as electron-ics parts or collectables. Our fourth example (4.4) concernsthe matching of time-slots for a fixed number of differentgoods; this case applies to airline take-off and landing rightsauctions. In (4.5) we discuss the generation of bids for adistributed job-shop scheduling domain, and also its appli-cation to power generation auctions. Finally, in (4.6), weprovide a legacy suite of bid generation techniques, includ-ing all those discussed in (2.3) above.

In the description of the distributions that follow, letrand(a, b) represent a real number drawn uniformly from[a, b]. Let rand int(a, b) represent a random integer drawnuniformly from the same interval. With respect to a givengraph, let e(x, y) represent the proposition that an edge ex-ists between nodes x and y. Denote the number of goods ina bundle B as |B|. The statement a good g is in a bundleB means that g ∈ B. All of the distributions presented hereare parameterized by the number of goods (num goods) andnumber of bids (num bids).

4. CATS IN DETAIL

4.1 Paths in SpaceThere are many real-world problems involving bidding on

paths in space. Generally, this class may be characterized asthe problem of purchasing a connection between two points.Examples include truck routes [23], natural gas pipeline net-works [20], network bandwidth allocation, and the right touse railway tracks [5].4 In particular, spatial path problemsconsist of a set of points and accessibility relations betweenthem. Although the distribution we propose may be config-ured to model bidding in any of the above domains, we willuse the railway domain as our motivating example since itis both intuitive and well-understood.

More formally, we will represent this railroad auction bya graph in which each node represents a location on a plane,and an edge represents a connection between locations. Thegoods at auction are therefore the edges of the graph, andbids request a set of edges that form a path between twonodes. We assume that no bidder will desire more than onepath connecting the same two nodes, although the biddermay value each path differently.

4.1.1 Building the GraphThe first step in modeling bidding behavior for this prob-

lem is determining the graph of spatial and connective re-lationships between cities. One approach would be to usean actual railroad map, which has the advantage that theresulting graph would be unarguably realistic. However,

4Electric power distribution is a frequently discussed realworld problem which seems superficially similar to the prob-lems discussed here. However, many of the complementari-ties in this domain arise from physical laws governing powerflow in a network. Consideration of these laws becomes verycomplex in networks of interesting size. Also, because theselaws are taken into account during the construction of powernetworks, the networks themselves are difficult to model us-ing randomly generated graphs. For these reasons, we donot attempt to model this domain.

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1

"cities2""edges2"

Figure 1: Sample Railroad Graph

it would be difficult to find a set of real-world maps thatcould be said to exhibit a similar sort of connectivity andwould encompass substantial variation in the number ofcities. Since scalability of input data is of great importanceto the testing of new CA algorithms, we have chosen topropose generating such graphs randomly. Our techniquefor generating graphs has various parameters that may beadjusted as necessary; in our opinion it produces realisticgraphs with the recommended settings. Figure 1 shows arepresentative example of a graph generated using our tech-nique.

We begin with num cities nodes randomly placed on aplane. We add edges to this graph, G, starting by connectingeach node to a fixed number of its nearest neighbors. Next,we iteratively consider random pairs of nodes and examinethe shortest path connecting them, if any. To compare, wealso compute various alternative paths that would requireone or more edges to be added to the graph, given a penaltyproportional to distance for adding new edges. (We do thisby considering a complete graph C, an augmentation of Gwith new edges weighted to reflect the distance penalty.) Ifthe shortest path involves new edges—despite the penalty—then the new edges (without penalty) are added to G, andreplace the existing edges in C. This process models our sim-plifying assumption that there will exist uniform demand forshipping between any pair of cities, though of course it doesnot mimic the way new links would actually be added toa rail network. Our technique produces slightly non-planargraphs—graphs on a plane in which edges occasionally crossat points other than nodes. We consider this to be reason-able, as the same phenomenon may be observed in real-worldrail lines, highways, network wiring, etc. Determining the“reasonableness” of a graph is of course a subjective taskunless more quantitative metrics are used to assess quality;we see the identification and application of such metrics (forthis and other distributions) as an important topic for futurework.

4.1.2 Generating BidsGiven a map of cities and the connectivity between them,

there is the orthogonal problem of modeling bidding itself.We propose a method which generates a set of substitutablebids from a hypothetical agent’s point of view. We startwith the value to an agent for shipping from one city toanother and with a shipping cost which we make equal to theEuclidean distance between the cities. We then place XORbids on all paths on which the agent would make a profit

21

Page 26: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

Let num cities = f(num goods)Randomly place nodes (cities) on a unit boxConnect each node to its initial connectionsnearest neighbors

For i = 1 to num building paths:C = GFor every pair of nodes n1, n2 ∈ G where¬e(n1, n2):

Add an edge to C of lengthbuilding penalty ·Euclidean distance(n1, n2)

Choose two nodes at random, and find theshortest path between them in C

If shortest path uses edges that do notexist in G:

For every such pair of nodesn1, n2 ∈ G add an edge to G withlength Euclidean distance(n1, n2)

End IfEnd ForIf total number of edges in G �= num goods,restart

Figure 2: Graph-Building Technique

While num generated bids < num bids:

Randomly choose two nodes, n1 and n2

d = rand(1, shipping cost factor)cost = Euclidean distance(city1, city2)value = d · Euclidean distance(city1, city2)Make XOR bids of value − cost on every pathfrom city1 to city2 with cost < value

If there are more than max bid set size suchpaths, bid on the max bid set size pathsthat maximize value − cost.

End While

Figure 3: Bid-Generation Technique

(i.e., those paths where utility−cost > 0). The path’s valueis random, in (parameterized) proportion to the Euclideandistance between the chosen cities. Since the shipping costis the Euclidean distance between two cities, we use this asthe lower bound for value as well, since only bidders withsuch valuations would actually place bids.

Note that this distribution, and indeed all others pre-sented in this paper, may generate slightly more than num bidsbids. In our experience CA optimization algorithms tend notto be highly sensitive in the number of bids, so we judged itmore important to build economically sensible sets of sub-stitutable bids. When generating a precise number of bids isimportant, an appropriate number of bids may be removedafter all bids have been generated so that the total will bemet exactly.

Note that 1 is used as a lower bound for d because any bid-der with d < 1 would find no profitable paths and thereforewould not bid.

This is CATS 1.0 problem 1. CATS default param-eters: initial connections = 2, building penalty =1.7, num building paths = num cities2/4,shipping cost factor = 1.5, max bid set size = 5,and f(num goods) = 0.529689 ∗ NUMGOODS + 3.4329.

4.1.3 Multi-Unit Extensions: Bandwidth Allocation,Commodity Flow

This model may also be used to generate realistic data

Place nodes at integer vertices (i, j) in a

plane, where 1 ≤ i, j ≤ �√(num goods)�For each node n:

If n is on the edge of the mapConnect n to as many hv-neighbors aspossible

ElseIf rand(0, 1) ≤ three prob

Connect n to a random set ofthree of its four hv-neighbors

ElseConnect n to all four of itshv-neighbors

While rand(0, 1) ≤ additional neighbor:

Connect g to one of itsd-neighbors, provided that thenew diagonal edge will notcross another diagonal edge

End WhileEnd For

Figure 4: Graph-Building Technique

for multi-unit CA problems such as network bandwidth al-location and general commodity flow. The graph may becreated as above, but with a number of units (capacity)assigned to each edge. Likewise, the bidding technique re-mains unchanged except for the assignment of a number ofunits to each bid.

4.2 Proximity in SpaceThere is a second broad class of real-world problems in

which complementarity arises from adjacency in two-dimen-sional space. An intuitive example is the sale of adjacentpieces of real estate [19]. Another example is drilling rights,where it is much cheaper for an (e.g.) oil company to drillin adjacent lots than in lots that are far from each other. Inthis section, we first propose a graph-generation mechanismthat builds a model of adjacency between goods, and thendescribe a technique for generating realistic bids on thesegoods. Note that in this section nodes of the graph representthe goods at auction, while edges represent the adjacencyrelationship.

4.2.1 Building the GraphThere are a number of ways we could build an adjacency

graph. The simplest would be to place all the goods (loca-tions, nodes) in a grid, and connect each to its four neigh-bors. We propose a slightly more complex method in orderto permit a variable number of neighbors per node (equiva-lent to non-rectangular pieces of real estate). As above weplace all goods on a grid, but with some probability we omita connection between goods that would otherwise representvertical or horizontal adjacency, and with some probabil-ity we introduce a connection representing diagonal adja-cency. (We call horizontally- or vertically-adjacent nodeshv-neighbors and diagonally-adjacent nodes d-neighbors.)

Figure 5 shows a sample real estate graph, generated bythe technique described in Figure 4. Nodes of the graph areshown with asterisks, while edges are represented by solidlines. The dashed lines show one set of property boundariesthat would be represented by this graph. Note that onenode falls inside each piece of property, and that two piecesof property border each other iff their nodes share an edge.

22

Page 27: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

0

0.5

1

1.5

2

2.5

3

3.5

4

0 0.5 1 1.5 2 2.5 3 3.5 4

"regions""edges"

"boundaries"

Figure 5: Sample Real Estate Graph

4.2.2 Generating BidsTo model realistic bidding behavior, we generate a set of

common values for each good, and private values for eachgood for each bidder. The common value represents theappraised or expected resale value of each individual good.The private value represents how much one particular biddervalues that good, as an offset to the common value (e.g., aprivate value of 0 for a good represents agreement with thecommon value). These private valuations describe a bidder’spreferences, and so they are used to determine both a valuefor a given bid and the likelihood that a bidder will requesta bundle that includes that good. There are two additionalcomponents to each bidder’s preferences: a minimum totalcommon value, and a budget. The former reflects the ideathat a bidder may only wish to acquire goods of a certainrecognized value. The latter reflects the fact that a biddermay not be able to afford every bundle that is of interest tohim.

To generate bids, we first add a random good, weightedby a bidder’s preferences, to the bidder’s bid. Next, wedetermine whether another good should be added by draw-ing a value uniformly from [0,1], and adding another goodif this value is smaller than a threshold. This is equiva-lent to drawing the number of goods in a bid from a de-cay distribution.56 We must now decide which good toadd. First we allow a small chance that a new good willbe added uniformly at random from the set of goods, with-out the requirement that it be adjacent to a good in thecurrent bundle B . (This permits bundles requesting un-connected regions of the graph: for example, a hotel com-pany may only wish to build in a city if it can acquireland for two hotels on opposite sides of the city.) Oth-erwise, we select a good from the set of nodes borderingthe goods in B. The probability that some adjacent good

5We use Sandholm’s [24] term “decay” here, though thedistribution goes by various names—for a description of thedistribution please see Section 4.6.1.6There are two reasons we use a decay distribution here.First, we expect that most bids will request small bundles;a uniform distribution, on the other hand, would be ex-pected to have the same number of bids for bundles of eachcardinality. Also, bids for large bundles will often be com-putationally easier for CA algorithms than bids for smallbundles, because choosing the former more highly restrictsthe future search. Second, we require a distribution wherethe expected bundle size is unaffected by changes in the totalnumber of goods. Some other distributions, such as uniformand binomial, do not have this property.

Routine Add Good to Bundle(bundle B)If rand(0, 1) ≤ jump prob:

Add a good g /∈ b to B, chosenuniformly at random

Else:Compute s =

∑x/∈B,y∈B,e(x,y) pn(x) [pn()

is defined below]Choose a random node x /∈ B from the

distribution∑

y∈B,e(x,y)pn(x)

sAdd x to B

End IfEnd Routine

Figure 6: Add Good to Bundle for Spatial Proxim-ity

n1 will be added depends on how many edges n1 shareswith the current bundle, and on the bidder’s relative pri-vate valuations for n1 and n2. For example, if nodes n1 andn2 are each connected to B by one edge, and the privatevaluation for n1 is twice that for n2 then the probabilityof adding n1 to B, p(n1), is 2p(n2). Further, if n1 has 3edges to nodes in B while n2 is connected to B by only1 edge, and the goods have equivalent private values, thenp(n1) = 3p(n2). Once we have determined all the goodsin a bundle we set the price offered for the bundle, whichdepends on the sum of common and private valuations forthe goods in the bundle, and also includes a function that issuperadditive (with our parameter settings) in the numberof goods.7 Finally, we generate additional bids that are sub-stitutable for the original bid, with the constraint that eachbid in the set requests at least one good from the originalbid.

This is CATS 1.0 problem 2. CATS default param-eters: three prob = 1.0, additional neighbor = 0.2,max good value = 100, max substitutable bids = 5,additional location = 0.9, jump prob = 0.05, additivity =0.2, deviation = 0.5, budget factor = 1.5, resale factor =0.5, and S(n) = n1+additivity. Note that additivity = 0 givesadditive bids, and additivity < 0 gives sub-additive bids.

4.2.3 Spectrum AuctionsA related problem is the auction of radio spectrum, in

which a government sells the right to use specific segmentsof spectrum in different geographical areas[18, 2].8 It is pos-sible to approximate bidding behavior in spectrum auctionsby making the assumption that all complementarity arisesfrom spatial proximity.9 In this case, our spatial proximitymodel can also be used to generate realistic bidding distri-butions for spectrum auctions. The main difference betweenthis problem and the real estate problem is that in a spec-trum auction each good may have multiple units (frequencybands) for sale. It is insufficient to model this as a multi-unit CA problem, however, if bidders have the constraint

7Recall the discussion in Section 2.3.3 motivating the useof superadditive valuations.8Spectrum auctions have not historically been formulatedas general CA’s, but the possibility of doing so is now beingexplored.9This assumption would be violated, for example, if somebidders wanted to secure some spectrum in all metropolitanareas. Clearly the problem of realistic test data for spectrumauctions remains an area for future work.

23

Page 28: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

For all g, c(g) = rand(1, max good value)While num generated bids < num bids:

For each good, resetp(g) = rand(−deviation ·max good value, deviation + max good value)

pn(g) =p(g)+deviation·max good value

2·deviation·max good value

Normalize pn(g) so that∑

g pn(g) = 1

B = {}Choose a node g at random, weighted by

pn(), and add it to BWhile rand(0, 1) ≤ additional location

Add Good to Bundle(B)value(B) =

∑x∈B(c(x) + p(x)) + S(|B|)

If value(B) ≤ 0 on B, restart bundlegeneration for this bidder

Bid value(B) on Bbudget = budget factor · value(B)min resale value = resale factor · ∑x∈B c(x)Construct substitutable bids. For eachgood gi ∈ B:

Initialize a new bundle, Bi = {gi}While |Bi| < |B|:

Add Good to Bundle(Bi)Compute ci =

∑x∈Bi

c(x)End ForMake XOR bids on all Bi where

0 ≤ value(B) ≤ budget andci ≥ min resale value.

If there are more thanmax substitutable bids such bundles, bidon the max substitutable bids bundleshaving the largest value

End While

Figure 7: Bid-Generation Technique

that they want the same frequency in each region.10 In-stead, the problem can be modeled with multiple distinctgoods per node in the graph, and bids constructed so thatall nodes added to a bundle belong to the same ‘frequency’.With this method, it is also easy to incorporate other pref-erences, such as preferences for different types of goods. Forinstance, if two different types of frequency bands are beingsold, one 5 megahertz wide and one 2.5 megahertz wide, anagent only wanting 5 megahertz bands could make substi-tutable bids for each such band in the set of regions desired(generating the bids so that the agent will acquire the samefrequency in all the regions).

The scheme for generating price offers used in our realestate example may be inappropriate for the spectrum auc-tion domain. Research indicates that while price offers willstill tend to be superadditive, this superadditivity may bequadratic in the population of the region rather than ex-ponential in the number of regions [2]. CATS includes aquadratic pricing option that may be used with this prob-lem, in which the common value term above is used as ameasure of population. Please see the CATS documenta-tion for more details.

10To see why this cannot be modeled as a multi-unit CA,consider an auction for three regions with two units each,and three bidders each wanting one unit of two goods. Inthe optimal allocation, b1 gets 1 unit of g1 and 1 unit of g2,b2 gets 1 unit of g2 and 1 unit of g3, and b3 gets 1 unit of g3

and 1 unit of g1. In this example there is no way of assigningfrequencies to the units so that each bidder gets the samefrequency in both regions.

Build a fully-connected graph with one node foreach good

Label each edge from n1 to n2 with a weightd(n1, n2) = rand(0, 1)

Figure 8: Graph-Building Technique

Routine Add Good to Bundle(bundle B)Compute s =

∑x/∈b,y∈B d(x, y) · pn(x)

Choose a random node x /∈ B from the

distribution∑

y∈B d(x, y) · pn(x)s

Add x to BEnd Routine

Figure 9: Routine Add Good to Bundle for Arbi-trary Relationships

4.3 Arbitrary RelationshipsSometimes complementarities between goods will not be

as universal as geographical adjacency, but some kind of reg-ularity in the complementarity relationships between goodswill still exist. Consider an auction of different, indivisi-ble goods, e.g. for semiconductor parts or collectables, orfor distinct multi-unit goods such as the right to emit somequantity of two different pollutants produced by the sameindustrial process. In this section we discuss a general wayof modeling such arbitrary relationships.

4.3.1 Building the GraphWe express the likelihood that a particular pair of goods

will appear together in a bundle as being proportional to theweight of the appropriate edge of a fully-connected graph.That is, the weight of an edge between n1 and n2 is propor-tional to the probability that, having only n1 in our bundle,we will add n2. Weights are only proportional to probabili-ties because we must normalize the sum of all weights froma given good to 1 in order to calculate a probability.

4.3.2 Generating BidsOur technique for modeling bidding is a generalization of

the technique presented in the previous section. We choosea first good and then proceed to add goods one by one, withthe probability of each new good being added dependingon the current bundle. Note that, since in this section thegraph is fully-connected, there is no need for the ‘jumping’mechanism described above. The likelihood of adding a newgood g to bundle B is proportional to

∑y∈B d(x, y) · pi(x).

The first term d(x, y) represents the likelihood (independentof a particular bidder) that goods x and y will appear ina bundle together; the second, pi(x), represents bidder i’sprivate valuation of the good x. We implement this newmechanism by changing the routine Add Good to Bundle().We are thus able to use the same techniques for assigning avalue to a bundle, as well as for determining other bundleswith which it is substitutable.

This is CATS 1.0 problem 3. CATS default param-eters: max good value = 100, additional good = 0.9,max substitutable bids = 5, additivity = 0.2, deviation =0.5, budget factor = 1.5, resale factor = 0.5, and S(n) =n1+additivity.

4.3.3 Multi-Unit Pollution Rights Auctions: FutureWork

24

Page 29: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

Bidding in pollution-rights auctions[18, 13] may be mod-eled through a multi-unit generalization of the techniquepresented in this section. In such auctions, the governmentsells companies the right to generate specific amounts ofsome pollutant. In the United States, though these auc-tions are widely used, sulfur-dioxide is the only chemicalfor which they are the primary method of control. Cur-rent US pollution-rights auctions may therefore be modeledas single good multi-unit auctions. If the government wereto conduct pollution rights auctions for multiple pollutantsin the future, however, bidding would be best-representedas a multi-unit ‘Arbitrary Complementarity’ problem. Theproblem belongs to this class because some sets of pollutantsare more likely to be produced than others, yet the relation-ship between pollutants can not be modeled through anynotion of adjacency. Should such auctions become viable inthe future, we hope that a pollution-rights distribution willbe added to CATS .

4.4 Temporal MatchingWe now consider real-world domains in which complemen-

tarity arises from a temporal relationship between goods. Inthis section we discuss matching problems, in which corre-sponding time slices must be secured on multiple resources.The general form of temporal matching includes m sets ofresources, in which each bidder wants 1 time slice fromeach of j ≤ m sets subject to certain constraints on howthe times may relate to one another (e.g., the time in set2 must be at least two units later than the time in set3). Here we concern ourselves with the problem in whichj = 2, and model the problem of airport take-off and land-ing rights. Rassenti et al. [21] made the first study of auc-tions in this domain. The problem has been the topic formuch other work; in particular [11] includes detailed exper-iments and an excellent characterization of bidder behav-ior.

The airport take-off and landing problem arises becausecertain high-traffic airports require airlines to purchase theright to take off or land during a given time slice. However,if an airline buys the right for a plane to take off at oneairport then it must also purchase the right for the planeto land at its destination an appropriate amount of timelater. Thus, complementarity exists between certain pairsof goods, where goods are the right to use the runway at aparticular airport at a particular time. Substitutable bidsare different departure/arrival packages; therefore bids willonly be substitutable within certain limits.

4.4.1 Building the GraphDeparting from our graph-based approach above, we ground

this example in the real map of high-traffic US airports forwhich the Federal Aviation Administration auctions take-offand landing rights, described in [11]. These are the four bus-iest airports in the United States: La Guardia International,Ronald Reagan Washington National, John F. Kennedy In-ternational, and O’Hare International. This map is shownbelow.

We chose not to use a random graph in this example be-cause the number of bids and goods is dependent on thenumber of bidders and time slices at the given airports; itis not necessary to modify the number of airports in or-der to vary the problem size. Thus, num cities = 4 andnum times = �num goods/num cities�.

38.5

39

39.5

40

40.5

41

41.5

42

42.5

-88 -86 -84 -82 -80 -78 -76 -74 -72

Latit

ude

Longitude

O’Hare

Reagan

Kennedy

LaGuardia

"airports""airways"

Figure 10: Map of Airport Locations

4.4.2 Generating BidsOur bidding mechanism presumes that airlines have a

certain tolerance for when a plane can take off and land(early takeoff deviation, late takeoff deviation,early land deviation, late land deviation), as related totheir most preferred take-off and landing times (start time,start time + min flight length). We generate bids for allbundles that fit these criteria. The value of a bundle is de-rived from a particular agent’s utility function. We define autility umax for an agent, which corresponds to the utilitythe agent receives for flying from city1 to city2 if it receivesthe ideal takeoff and landing times. This utility depends ona common value for a time slot at the given airport, anddeviates by a random amount. Next we construct a util-ity function which reduces umax according to how late theplane will arrive, and how much the flight time deviates fromoptimal.

This is CATS 1.0 problem 4. CATS default parameters:max airport value = 5, longest flight length = 10,deviation = 0.5, early takeoff deviation = 1,late takeoff deviation = 2, early land deviation =1, late land deviation = 2, delay coeff = 0.9, andamount late coeff = 0.75.

4.5 Temporal SchedulingWellman et al. [26] proposed distributed job-shop schedul-

ing with one resource as a CA problem. We provide a dis-tribution that mirrors this problem. While there exist manyalgorithms for solving job-shop scheduling problems, the dis-tributed formulation of this problem places it in an economiccontext. In the problem formulation from Wellman et al., afactory conducts an auction for time-slices on some resource.Each bidder has a job requiring some amount of machinetime, and a deadline by which the job must be completed.Some jobs may have additional, later deadlines which areless desirable to the bidder and so for which the bidder iswilling to pay less.

4.5.1 Generating BidsIn the CA formulation of this problem, each good repre-

sents a specific time-slice. Two bids are substitutable if theyconstitute different possible schedules for the same job. Wedetermine the number of deadlines for a given job accordingto a decay distribution, and then generate a set of substi-tutable bids satisfying the deadline constraints. Specifically,let the set of deadlines of a particular job be d1 < · · · < dn

and the value of a job completed by d1 be v1, superadditivein the job length. We define the value of a job completed bydeadline di as vi = v1 · d1

di, reflecting the intuition that the

25

Page 30: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

Set the average valuation for each city’sairport: cost(city) = rand(0, max airport value)

Let max l = length of longest distance betweenany two cities

While num generated bids < num bids:

Randomly select city1 and city2 wheree(city1, city2)

l = distance(city1, city2)min flight length =

round(longest flight length · 1max l

)start time =

rand int(1, num times − min flight length)dev = rand(1 − deviation, 1 + deviation)Make substitutable (XOR) bids. For

takeoff =max(1, start time − early takeoff deviation)to min(num times, start time +late takeoff deviation):

For land = takeoff + min flight lengthtomin(start time + min flight length +late land deviation, num times):

amount late =min(land − (start time +min flight length), 0)

delay =land−takeoff−min flight length

Bid dev · (cost(city1) + cost(city2)) ·delay coeffdelay ·amount late coeffamount late fortakeoff at time takeoff atcity1 and landing at time landat city2

End ForEnd For

End While

Figure 11: Bid-Generation Technique

decrease in value for a later deadline is proportional to its‘lateness’.

Note that, like Wellman et al., we assume that all jobsare eligible to be started in the first time-slot. Our for-mulation of the problem differs in only one respect—weconsider only allocations in which jobs receive continuousblocks of time. However, this constraint is not restrictivebecause for any arbitrary allocation of time slots to jobsthere exists a new allocation in which each job receives acontinuous block of time and no job finishes later than inthe original allocation. (This may be achieved by num-bering the winning bids in increasing order of scheduledend time, and then allocating continuous time-blocks tojobs in this order. Clearly no job will be rescheduled tofinish later than its original scheduled time.) Note alsothat this problem cannot be translated to a trivial one-goodmulti-unit CA problem because jobs have different dead-lines.

This is CATS 1.0 problem 5. CATS default parame-ters: deviation = 0.5, prob additional deadline = 0.9,additivity = 0.2, and max length = 10. Note that we pro-pose a constant maximum job length, because the lengthof time a job requires should not depend on the amount oftime the auctioneer makes available.

4.5.2 Multi-Unit Power Generation Auctions: FutureWork

While num generated bids < num bids:

l = rand int(1, max length)d1 = rand int(l, num goods)dev = rand(1 − deviation, 1 + deviation)cur max deadline = 0new d = d1

To generate substitutable (XOR) bids. Do:

Make bids with price offered= dev · l1+additivity · d1/new d for allblocks [start, end] where start ≥ 1,end ≤ new d, end > cur max deadline,end − start = l

cur max deadline = new dnew d = rand int(cur max deadline +

1, num goods)While rand(0, 1) ≤ prob additional deadline

End While

Figure 12: Bid-Generation Technique

The problem of scheduling power generation is superfi-cially similar to the job-shop scheduling problem describedabove. In these auctions, electrical power generation com-panies bid to produce a certain quantity of power for eachhour of the day. This new problem differs from job-shopscheduling primarily because different kinds of power plantswill exhibit very different utility functions, considering dif-ferent sorts of goods to be complementary. For example,some plants will want to produce for long blocks of time(because they have startup and shutdown costs), others willprefer certain times of day due to labor costs, and still oth-ers will have neither restriction [9]. Due to the domain-specific complexity of bidder utilities, the construction ofa distribution for this problem remains an area for futurework.

4.6 Legacy DistributionsTo aid researchers designing new CA algorithms by facil-

itating comparison with previous work, CATS includes theability to generate bids according to all previous publishedtest distributions of which we are aware, that are able toscale with the number of goods and bids. Each of thesedistributions may be seen as an answer to three questions:what number of goods to request in a bundle, which goodsto request, and the price offered for a bundle. We begin bydescribing different techniques for answering each of thesethree questions, and then show how they have been com-bined in previously published work.

4.6.1 Number of GoodsUniform: Uniformly distributed on [1, num goods]Normal: Normally distributed with µ = µ goods and σ =σ goodsConstant: Fixed at constant goodsDecay: Starting with 1, repeatedly increment the size ofthe bundle until rand(0, 1) exceeds αBinomial: Request n goods with probabilitypn(1 − p)num goods−n

(num goods

n

)

Exponential: Request n goods with probability C exp−n/q

4.6.2 Which GoodsRandom: Draw n random goods from the set of all goods,

26

Page 31: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

without replacement11

4.6.3 Price OfferFixed Random: Uniform on [low fixed, hi fixed].Linear Random: Uniform on [low linearly·n, hi linearly·n]Normal: Draw from a normal distribution with µ = µ priceand σ = σ priceQuadratic12: For each good k and each bidder i set thevalue vi

k = rand(0, 1). Then i’s price offer for a set of goodsS is

∑k∈S vi

k +∑

k,q vikvi

q.

4.7 Previously Published DistributionsThe following is a list of the distributions used in all pub-

lished tests of which we are aware. In each case we describefirst the method used to choose the number of goods, fol-lowed by the method used to choose the price offer. In allcases the ‘random’ technique was used to determine whichgoods should be requested in a bundle. Each case is labeledwith its corresponding CATS legacy suite number; very sim-ilar distributions are given similar numbers and identicaldistributions are given the same number.[L1] Sandholm: Uniform, fixed random with low fixed = 0,hi fixed = 1[L1a] Andersson et al.: Uniform, fixed random withlow fixed = 0, hi fixed = 1000[L2] Sandholm: Uniform, linearly random withlow linearly = 0, hi linearly = 1[L2a] Andersson et al.: Uniform, linearly random withlow linearly = 500, hi linearly = 1500[L3] Sandholm: Constant with constant goods = 3, fixedrandom with low fixed = 0, hi fixed = 1[L3] deVries and Vohra: Constant with constant goods = 3,fixed random with low fixed = 0, hi fixed = 1[L4] Sandholm: Decay with α = 0.55, linearly random withlow linearly = 0, hi linearly = 1[L4] deVries and Vohra: Decay with α = 0.55, linearlyrandom with low linearly = 0, hi linearly = 1[L4a] Andersson et al.: Decay with α = 0.55, linearlyrandom with low linearly = 1, hi linearly = 1000[L5] Boutilier et al.: Normal with µ goods = 4 andσ goods = 1, normal with µ price = 16 and σ price = 3[L6] Fujishima et al.: Exponential with q = 5, linearlyrandom with low linearly = 0.5, hi linearly = 1.5[L6a] Andersson et al.: Exponential with q = 5, linearlyrandom with low linearly = 500, hi linearly = 1500[L7] Fujishima et al.: Binomial with p = 0.2, linearlyrandom with low linearly = 0.5, hi linearly = 1.5[L7a] Andersson et al.: Binomial with p = 0.2, linearlyrandom with low linearly = 500, hi linearly = 1500[L8] deVries and Vohra: Constant with constant goods = 3,quadratic

Parkes [17] used many of the test sets described above(particularly those described by Sandholm and Boutilier et

11Although in principle the problem of which goods to re-quest could be answered in many ways, all legacy distribu-tions of which we are aware use this technique.

12DeVries and Vohra [8] briefly describe a more general ver-sion of this price offer scheme, but do not describe how to setall the parameters (e.g., defining which goods are comple-mentary); hence we do not include it here. Quadratic priceoffers may be particularly applicable to spectrum auctions;see [2].

al.), but tested with fixed numbers of goods and bids ratherthan scaling these parameters.

5. CONCLUSIONIn this paper we introduced CATS , a test suite for combi-

natorial auction optimization algorithms. The distributionsin CATS represent a step beyond current CA testing tech-niques because they are economically motivated and modelreal-world problems. It is our hope that, with the help ofothers in the CA community, CATS will evolve into a univer-sal test suite that will facilitate the development and evalu-ation of new CA optimization algorithms.

6. REFERENCES[1] A. Andersson, M. Tenhnen, , and F. Ygge. Integer

programming for combinatorial auction winnerdetermination. In ICMAS-00, 2000.

[2] L. Ausubel, P. Cramption, R. McAfee, andJ. McMillan. Synergies in wireless telephony:Evidence from the broadband PCS auctions. Journalof Economics and Management Strategy, 6(3):497–527,Fall 1997.

[3] J. Banks, J. Ledyard, and D. Porter. Allocatinguncertain and unresponsive resources: Anexperimental approach. RAND Journal of Economics,20:1–23, 1989.

[4] C. Boutilier, M. Goldszmidt, and B. Sabata.Sequential auctions for the allocation of resources withcomplementarities. Proceedings of IJCAI-99, 1999.

[5] P. Brewer and C. Plott. A binary conflict ascendingprice (BICAP) mechanism for the decentralizedallocation of the right to use railroad tracks.International Journal of Industrial Organization,14:857–886, 1996.

[6] M. Bykowsky, R. Cull, and J. Ledyard. Mutuallydestructive bidding: The FCC auction design problem.Technical Report Social Science Working Paper 916,California Institute of Technology, Pasadena, 1995.

[7] C. DeMartini, A. Kwasnica, J. Ledyard, andD. Porter. A new and improved design formulti-object iterative auctions. Technical ReportSocial Science Working Paper 1054, CaliforniaInstitute of Technology, Pasadena, November 1998.

[8] S. DeVries and R. Vohra. Combinatorial auctions: Asurvey. 2000.

[9] W. Elmaghraby and S. Oren. The efficiency ofmulti-unit electricity auctions. IAEE, 20(4):89–116,1999.

[10] Y. Fujishima, K. Leyton-Brown, and Y. Shoham.Taming the computational complexity ofcombinatorial auctions: Optimal and approximateapproaches. In IJCAI-99, 1999.

[11] D. Grether, R. Isaac, and C. Plott. The Allocation ofScarce Resources: Experimental Economics and theProblem of Allocating Airport Slots. Westview Press,Boulder, CO, 1989.

[12] J. Ledyard, D. Porter, and A. Rangel. Experimentstesting multiobject allocation mechanisms. Journal ofEconomics & Management Strategy, 6(3):639–675, Fall1997.

[13] J. Ledyard and K. Szakaly. Designing organizations

27

Page 32: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

for trading pollution rights. Journal of EconomicBehavior and Organization, 25:167–196, 1994.

[14] D. Lehmann, L. O’Callaghan, and Y. Shoham. Truthrevalation in rapid, approximately efficientcombinatorial auctions. In ACM Conference onElectronic Commerce, 1999.

[15] P. Milgrom. Putting auction theory to work: Thesimultaneous ascending auction. Technical Report98-0002, Department of Economics, StanfordUniversity, 1998.

[16] N. Nisan. Bidding and allocation in combinatorialauctions. Working paper, 1999.

[17] D. Parkes. ibundle: An efficient ascending pricebundle auction. In ACM Conference on ElectronicCommerce, 1999.

[18] C. Plott and T. Cason. EPA’s new emissions tradingmechanism: A laboratory evaluation. Journal ofEnvironmental Economics and Management,30:133–160, 1996.

[19] D. Quan. Real estate auctions: A survey of theory andpractice. Journal of Real Estate Finance andEconomics, 9:23–49, 1994.

[20] S. Rassenti, S. Reynolds, , and V. Smith. Cotenancyand competition in an experimental auction marketfor natural gas pipeline networks. Economic Theory,4:41–65, 1994.

[21] S. Rassenti, V. Smith, and R. Bulfin. A combinatorialauction mechanism for airport time slot allocation.Bell Journal of Economics, 13:402–417, 1982.

[22] M. Rothkopf, A. Pekec, and R. Harstad.Computationally manageable combinatorial auctions.Management Science, 44(8):1131–1147, 1998.

[23] T. Sandholm. An implementation of the contract netprotocol based on marginal cost calculations. pages256–262. Proceedings of AAAI-93, 1993.

[24] T. Sandholm. An algorithm for optimal winnerdetermination in combinatorial auctions. In IJCAI-99,1999.

[25] M. Tennenholtz. Some tractable combinatorialauctions. To appear in the proceedings of AAAI-2000,2000.

[26] M. Wellman, W. Walsh, P. Wurman, andJ. MacKie-Mason. Auction protocols for decentralizedscheduling. Proceedings of the 18th InternationalConference on Distributed Computing Systems, 1998.

28

Page 33: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

R�max � A General Polynomial Time Algorithm for

Near�Optimal Reinforcement Learning �

Ronen I� Brafman brafman�cs�bgu�ac�il

Computer Science Department

Ben�Gurion University

Beer�Sheva� Israel �����

Moshe Tennenholtz moshe�robotics�stanford�edu

Computer Science Department

Stanford University

Stanford� CA �����

Abstract

R�max is an extremely simple model�based reinforcement learning algorithm whichcan attain near�optimal average reward in polynomial time� In R�max� the agent alwaysmaintains a complete� but possibly inaccurate model of its environment and acts basedon the optimal policy derived from this model� The model is initialized in an optimisticfashion� all actions in all states return the maximal possible reward �hence the name��During execution� it is updated based on the agent�s observations� R�max improves uponseveral previous algorithms� ��� It is simpler and more general than Kearns and Singh�s E�

algorithm� covering zero�sum stochastic games� ��� It has a built�in mechanism for resolvingthe exploration vs� exploitation dilemma� �� It formally justies the �optimism underuncertainty� bias used in many RL algorithms� � � It is simpler� more general� and moree�cient than Brafman and Tennenholtz�s LSG algorithm for learning in single controllerstochastic games� ��� It generalizes the algorithmbyMonderer and Tennenholtz for learningin repeated games� ��� It is the only algorithm for learning in repeated games� to date�which is provably e�cient� considerably improving and simplifying previous algorithms byBanos and by Megiddo�

�� Introduction

Reinforcement learning has attracted the attention of researchers in AI and related �eldsfor quite some time� Many reinforcement learning algorithms exist and for some of themconvergence rates are known� However� Kearns and Singh�s E� algorithm �Kearns � Singh����� was the �rst provably nearoptimal polynomial time algorithm for learning in Markovdecision processes �MDPs� E� was extended later to handle single controller stochasticgames �SCSGs �Brafman � Tennenholtz� ���� as well as structured MDPs �Kearns �Koller� ����� In E� the agent learns by updating a model of its environment using statisticsit collects� This learning process continues as long as it can be done relatively e ciently�Once this is no longer the case� the agent uses its learned model to compute an optimal

�� This paper appeared as Technical Report �������� Department of Computer Science� Ben�Gurion Uni�

versity� The second author permanent address is� Faculty of Industrial Engineering and Management�Technion�Israel Institute of Technology� Haifa ����� Israel� The second author gratefully acknowledges

the support of DARPA grant F�������C���� �P������

APPENDIX D

29

Page 34: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

policy and follows it� The success of this approach rests on two important properties� theagent can determine online whether an e cient learning policy exists� and if such a policydoes not exist� it is guaranteed that the optimal policy with respect to the learned modelwill be approximately optimal with respect to the real world�

The di culty in generalizing E� to adverserial contexts� i�e�� to di�erent classes of games�stems from the adversary�s ability to in�uence the probability of reaching di�erent states�In a game� the agent does not control its adversary�s choices� nor can it predict them withany accuracy� Therefore� it has di culty predicting the outcome of its actions and whetheror not they will lead to new information� Consequently� it is unlikely that an agent canexplicitly choose between an exploration and an exploitation policy� For this reason� theonly extension of E� to adverserial contexts used the restricted SCSG model in which theadversary in�uences the reward of a game only� and not its dynamics�

To overcome this problem� we suggest a di�erent approach in which the agent neverattempts to learn explicitly� Our agent always attempts to optimize its behavior� albeitwith respect to a �ctitious model in which optimal behavior often leads to learning� Thismodel assumes that the reward the agent obtains in any situation it is not too familiarwith� is the maximal possible reward � Rmax� The optimal policy with respect to theagent�s �ctitious model has a very interesting and useful property with respect to the realmodel� it is always either optimal or it leads to e cient learning� The agent does not knowwhether it is optimizing or learning e ciently� but it always does one or the other� Thus� theagent will always either exploit or explore e ciently� without knowing ahead of time whichof the two will occur� Since there is only a polynomial number of parameters to learn� aslong as learning is done e ciently we can ensure that the agent spends a polynomial numberof steps exploring� and the rest of the time will be spent exploiting� Thus� the resultingalgorithm may be said to use an implicit explore or exploit approach� as opposed to Kearnsand Singh�s explicit explore or exploit approach�

This learning algorithm� which we call R�max� is very simple to understand and toimplement� The algorithm converges in polynomialtime to a nearoptimal solution� Moreover� R�max is described in the context of zerosum stochastic game� a model that is moregeneral than Markov Decision Processes� As a consequence� R�max is more general andmore e cient than a number of previous results� It generalizes the results of Kearns andSingh ����� to adverserial contexts and to situations where the agent considers a stochastic model of the environment inappropriate� opting for a nondeterministic model instead�R�max can handle more classes of stochastic games than the LSG algorithm �Brafman �Tennenholtz� ����� In addition� it attains a higher expected average reward than LSG�R�max also improves upon previous algorithms for learning in repeated games �Aumann� Maschler� ����� such as Megiddo�s �Megiddo� ���� and Banos �Banos� ����� It is theonly polynomial time algorithm for this class of games that we know of� and it is much simpler� too� Finally� R�max generalizes the results of Monderer and Tennenholtz �Monderer� Tennenholtz� ���� to handle the general probabilistic maximin �safety level decisioncriterion�

The approach taken by R�max is not new� It has been referred to as the optimism in theface of uncertainty heuristic� and was considered an adhoc� though useful� approach �e�g��see Section ����� in �Kaelbling� Littman� �Moore� ����� where it appears under the heading�AdHoc Techniques� and Section ��� in �Sutton � Barto� ���� where this approach is

30

Page 35: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

called optimistic initial values and is referred to as a �simple trick that can be quite e�ectiveon stationary problems�� This optimistic bias has been used in a number of wellknownreinforcement learning algorithms� e�g� Kaelbling�s interval exploration method �Kaelbling������ the exploration bonus in Dyna �Sutton� ����� the curiositydriven exploration of�Schmidhuber� ����� and the exploration mechanism in prioritized sweeping �Moore �Atkenson� ����� More recently� Tadepalli and Ok �Tadepalli � Ok� ���� presented areinforcement learning algorithm that works in the context of the undiscounted averagereward model used in this paper� In particular� one variant of their algorithm� called AHlearning� is very similar to R�max� However� as we noted above� none of this work providestheoretical justi�cation for this very natural bias� Thus� an additional contribution of thispaper is a formal justi�cation for the optimism under uncertainty bias�

The paper is organized as follows� in Section � we de�ne the learning problem moreprecisely and the relevant parameters� In Section � we describe the R�max algorithm� InSection � we prove that it yields nearoptimal reward in polynomial time� We conclude inSection ��

�� Preliminaries

We present R�max in the context of a model that is called a stochastic game� This modelis more general than a Markov decision process because it does not necessarily assume thatthe environment acts stochastically �although it can� In what follows we de�ne the basicmodel� describe the set of assumptions under which our algorithm operates� and de�ne theparameters in�uencing its running time�

��� Stochastic Games

A game is a model of multiagent interaction� In a game� we have a set of players� eachof whom chooses some action to perform from a given set of actions� As a result of theplayers� combined choices� some outcome is obtained which is described numerically in theform of a payo� vector� i�e�� a vector of values� one for each of the players� We concentrateon twoplayer� �xedsum games �i�e�� games in which the sum of values in the payo� vectoris constant� We refer to the player under our control as the agent� whereas the other playerwill be called the adversary�

A common description of a game is as a matrix� This is called a game in strategic form�The rows of the matrix correspond to the agent�s actions and the columns correspond tothe adversary�s actions� The entry in row i and column j in the game matrix contains therewards obtained by the agent and the adversary if the agent plays his ith action and theadversary plays his jth action� We make the simplifying assumption that the size of theaction set of both the agent and the adversary is identical� However� an extension to setsof di�erent sizes is trivial�

In a stochastic game �SG the players play a �possibly in�nite sequence of standardgames from some given set of games� After playing each game� the players receive theappropriate payo�� as dictated by that game�s matrix� and move to a new game� Theidentity of this new game depends� stochastically� on the previous game and on the players�actions in that previous game� Formally�

31

Page 36: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

De�nition � A �xedsum� two player� stochasticgame �SG� M on states S � f�� � � � � Ng�and actions A � fa�� � � � � akg� consists of�

� Stage Games� each state s � S is associated with a two�player� �xed�sum game instrategic form� where the action set of each player is A� We use Ri to denote thereward matrix associated with stage�game i�

� Probabilistic Transition Function� PM �s� t� a� a� is the probability of a transitionfrom state s to state t given that the �rst player �the agent� plays a and the secondplayer �the adversary� plays a��

An SG is similar to an MDP� In both models� actions lead to transitions between statesof the world� The main di�erence is that in an MDP the transition depends on the actionof a single agent whereas in an SG the transition depends on a jointaction of the agent andthe adversary� In addition� in an SG� the reward obtained by the agent for performing anaction depends on its action and the action of the adversary� To model this� we associate agame with every state� Therefore� we shall use the terms state and game interchangeably�

Stochastic games are useful not only in multiagent contexts� They can be used instead of MDPs when we do not wish to model the environment �or certain aspects of itstochastically� In that case� we can view the environment as an agent that can chooseamong di�erent alternatives� without assuming that its choice is based on some probabilitydistribution� This leads to behavior maximizing the worstcase scenario� In addition� theadversaries that the agent meets in each of the stagegames could be di�erent entities�

R�max is formulated as an algorithm for learning in Stochastic Games� However� it isimmediately applicable to �xedsum repeated games and to MDPs because both of thesemodels are degenerate forms of SGs� A repeated game is an SG with a single state and anMDP is an SG in which the adversary has a single action at each state�

For ease of exposition we normalize both players� payo�s in each stage game to be nonnegative reals between � and some constant Rmax� We also take the number of actions tobe constant� The set of possible histories of length t is �S�A�t�S� and the set of possiblehistories� H � is the union of the sets of possible histories for all t � �� where the set ofpossible histories of length � is S�

Given an SG� a policy for the agent is a mapping fromH to the set of possible probabilitydistributions over A� Hence� a policy determines the probability of choosing each particularaction for each possible history�

We de�ne the value of a policy using the average expected reward criterion as follows�Given an SG M and a natural number T � we denote the expected T step undiscountedaverage reward of a policy � when the adversary follows a policy �� and where both � and� are executed starting from a state s � S� by UM �s� �� �� T �we omit subscripts denotingthe SG when this causes no confusion� Let UM �s� �� T � min� is a policy UM �s� �� �� T denote the value that a policy � can guarantee in T steps starting from s� We de�neUM �s� � � lim infT��UM �s� �� T � Finally� we de�ne UM�� � mins�S UM �s� ���

�� We discuss this choice below�

32

Page 37: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

��� Assumptions� Complexity and Optimality

We make two central assumptions� First� we assume that the agent always recognizesthe identity of the state �or stagegame it reached �but not its associated payo�s andtransition probabilities and that after playing a game� it knows what actions were takenby its adversary and what payo�s were obtained� Second� we assume that the maximalpossible reward Rmax is known ahead of time� This latter assumption can be removed��

Next� we wish to discuss the central parameter in the analysis of the complexity of R�max � the mixing time� �rst identi�ed by Kearns and Singh ������ Kearns and Singh arguethat it is unreasonable to refer to the e ciency of learning algorithms without referring tothe e ciency of convergence to a desired value� They de�ned the �return mixing time ofa policy � to be the smallest value of T after which � guarantees an expected payo� of atleast U�� � �� In our case� we have to take into account the existence of an adversary�Therefore� we adjust this de�nition slightly as follows� a policy � belongs to the set ���� T of policies whose �return mixing time is at most T � if for any starting state s and for anyadversary behavior �� we have that U�s� �� �� T � U��� ��

That is� if a policy pi � ���� T then no matter what the initial state is and what theadversary does� the policy � will yield in any t � T steps an expected average rewardthat is � close to its value� The �return mixing time of a policy � is the smallest T forwhich pi � ���� T � Notice that this means that an agent with perfect information aboutthe nature of the games and the transition function will require at least T steps� on theaverage� to obtain an optimal value using an optimal policy � whose �return mixing timeis T � Clearly� one cannot expect an agent lacking this information to perform better�

We denote by Opt����� T the optimal expected T step undiscounted average returnfrom among the policies in ���� T � When looking for an optimal policy �with respect topolicies that mix at time T � for a given � � �� we will be interested in approaching thisvalue in time polynomial in T � in ���� in ��� �where � and � are the desired error bounds�and in the size of the description of the game�

The reader may have noticed that we de�ned UM �� as mins�S UM �s� �� It may appearthat this choice makes the learning task too easy� For instance� one may ask why shouldn�twe try to attain the maximal value over all possible states� or at least the value of our initialstate� We claim that the above is the only reasonable choice� and that it leads to resultsthat are as strong as previous algorithms�

To understand this point� consider the following situation� we start learning at somestate s in which the optimal action is a� If we do not execute the action a in s� we reachsome state s� that has a very low value� A learning algorithm without any prior knowledgecannot be expected to immediately guess that a should be done in s� In fact� without suchprior knowledge� it cannot conclude that a is a good action unless it tries the other actionsin s and compares their outcome to that of a� Thus� one can expect an agent to learn anearoptimal policy only if the agent can visit state s su ciently many times to learn aboutthe consequences of di�erent options in s� In a �nite SG� there will be some set of statesthat we can sample su ciently many times� and it is for such states that we can learn tobehave�

�� We would need to run the algorithm repeatedly for increasing values of Rmax� The resulting algorithm

remains polynomials in the relevant parameters�

33

Page 38: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

In fact� it probably makes sense to restrict our attention to a subset of the states suchthat from each state in this set it is not too hard to get to any other state� In the context ofMDPs� Kearns and Singh refer to this as the ergodicity assumption� In the context of SGs�Ho�man and Karp ����� refer to this as the irreducibility assumption� An SG is said to beirreducible if the Markovchain obtained by �xing any two �pure stationary strategies foreach of the players is irreducible �i�e�� each state is reachable from each other state� In thespecial case of an MDP� irreducibility is precisely the ergodicity property used by Kearnsand Singh in their analysis of E��

Irreducible SGs have a number of nice properties� as shown by �Ho�man � Karp� �����First� the maximal longterm average reward is independent of the starting state� implyingthat max� mins�S UM�s� � � max� maxs�S UM �s� �� Second� this optimal value can beobtained by a stationary policy �i�e�� one that depends on the current stagegame only�Thus� although we are not restricting ourselves to irreducible games� we believe that ourresults are primarily interesting in this class of games�

�� The R�max algorithm

Recall that we consider a stochastic game M consisting of a set S � fG�� � � � � GNg of stagegames in each of which both the agent and the adversary have a set A � fa�� � � � � akg ofpossible actions� We associate a reward matrix Ri with each game� and use Ri

m�l to denote apair consisting of the reward obtained by the agent and the adversary after playing actionsam and al in game Gi� respectively� In addition� we have a probabilistic transition function�PM � such that PM�s� t� a� a� is the probability of making a transition from Gs to Gt giventhat the agent played a and the adversary played a�� It is convenient to think of PM �i� �� a� a�as a function associated with the entry �a� a� in the stagegame Gi� This way� all modelparameters� both rewards and transitions� are associated with joint actions of a particulargame� Let � � �� For ease of exposition� we assume throughout most of the analysis thatthe �return mixing time of the optimal policy� T � is known� Later� we show how thisassumption can be relaxed�

The R�max algorithm is de�ned as follows�

Initialize� Construct the following modelM � consisting ofN�� stagegames� fG�� G�� � � � � GNg�and k actions� fa�� � � � � akg� Here� G�� � � � � GN correspond to the real games� fa�� � � � � akgcorrespond to the real actions� and G� is an additional �ctitious game� Initialize allgame matrices to have �Rmax� � in all entries�� Initialize PM �Gi� G�� a� a

� � � for alli � �� � � � � N and for all actions a� a��

In addition� maintain the following information for each entry in each gameG�� � � � � GN ��� a boolean value known�unknown� initialized to unknown� �� the states reachedby playing the joint action corresponding to this entry �and how many times� �� thereward obtained �by both players when playing the joint action corresponding to thisentry� Items � and � are initially empty�

Repeat�

� The value � given to the adversary does not play an important role here�

34

Page 39: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

Compute and Act� Compute an optimal T step policy for the current state� andexecute it for T steps or until a new entry becomes known�

Observe and update� Following each joint action do as follows� Let a be the actionyou performed in Gi and let a� be the adversary�s action�

� If the joint aqction �a� a� is performed for the �rst time in Gi� update thereward associated with �a� a� in Gi� as observed�

� Update the set of states reached by playing �a� a� in Gi�

� If at this point your record of states reached from this entry containsK� � max��d�NTRmax

� �e� d��ln�� ��Nk�

e � � elements� mark this entry asknown� and update the transition probabilities for this entry according tothe observed frequencies�

As can be seen� R�max is quite simple� It starts with an initial estimate for the modelparameters that assumes all states and all joint actions yield maximal reward and leadwith probability � to the �ctitious stagegame G�� Based on the current model� an optimalpolicy is computed and followed� Following each joint action the agent arrives at a newstagegame� and this transition is recorded in the appropriate place� Once we have enoughinformation about where some joint action leads to from some stagegame� we update theentries associated with this stagegame and this joint action in our model� After each modelupdate� we recompute an optimal policy and repeat the above steps�

�� Optimality and Convergence

In this section we provide the tools that ultimately lead to the proof of the following theorem�

Theorem � Let M be an SG with N states and k actions� Let � � � � �� and � � � beconstants� Denote the policies for M whose ��return mixing time is T by �M ��� T � anddenote the optimal expected return achievable by such policies by Opt��M��� T � Then�with probability of no less than �� � the R�max algorithm will attain an expected return ofOptM ����� T � �� within a number of steps polynomial in N� k� T� �

�� and �

��

In the main lemma required for proving this theorem we show the following� if the agentfollows a policy that is optimal with respect to the model it maintains for T steps� it willeither attain nearoptimal average reward� as desired� or it will update its statistics forone of the unknown slots with su ciently high probability� This can be called the implicitexplore or exploit property of R�max� The agent does not know ahead of time whether it isexploring or exploiting � this depends in a large part on the adversary�s behavior which itcannot control or predict� However� it knows that it does one or the other� no matter whatthe adversary does� Using this result we can proceed as follows� As we will show� the numberof samples required to mark a slot as known is polynomial in the problem parameters� andso is the total number of entries� Therefore� the number of T step iterations in which nonoptimal reward is obtained is bounded by some polynomial function of the input parameters�say T �� This implies that by performing T step iterations D � T �Rmax� times� we get thatthe loss obtained by nonoptimal execution �where exploration is performed� is boundedby � for any � � � ��

35

Page 40: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

Before proving our main lemma we state and prove an extension of Kearns and Singh�sSimulation Lemma �Kearns � Singh� ���� to the context of SGs with a slightly improvedbound�

De�nition � Let M and M be SGs over the same state and action spaces� We say that M is an �approximation of M if for every state s we have�

�� If PM �s� t� a� a� and P �M �s� t� a� a� are the probabilities of transition from state s tostate t given that the joint action carried out by the agent and the adversary is �a� a��in M and M respectively� then� PM �s� t� a� a�� � P �M �s� t� a� a� � PM�s� t� a� a��

� For every state s� the same stage�game is associated with s in M and in M �

Lemma � Let M and M be SGs over N states� where M is an �NTRmax

�approximation ofM � then for every state s� agent policy �� and adversary policy �� we have that

jU �M�s� �� �� T� UM�s� �� �� T j � ��

Proof� When we �x both players� policies we get� both in MDPs and in general SGs� aprobability distribution over T step paths in the state space� This is not a Markov processbecause the player�s policies can be nonstationary� However� the transition probabilities ateach point depend on the current state and the actions taken and the probability of eachpath is a product of the probability of each of the transitions� This is true whether thepolicies are pure or mixed�

We need to prove that�X

p

jPrM�pUM�p� Pr

�M�pU �M�pj � �

where p is a T step path starting at s� PrM�p �respectively� Pr �M�p is its probability in therandom process induced by M �resp� by M� �� and �� and UM �p� �U �M�p is the averagepayo� along this path� Because the average payo� is bound by Rmax we have�

X

p

jPrM�pUM�p� Pr

�M�pU �M�pj �

X

p

jPrM�p� Pr

�M�pjRmax�

To conclude our proof� it is su cient to show thatX

p

jPrM�p� Pr

�M�pj � ��Rmax

Let hi de�ne the following random processes� start at state s and follow policies � and� � for the �rst i steps� the transition probabilities are identical to the process de�ned aboveon M � and for the rest of the steps its transition probabilities are identical to M � Clearly�when we come to assess the probabilities of T step path� we have that h� is identical tothe original process on M � whereas hT is identical to original process on M � The triangleinequality implies that

X

p

jPrM�p� Pr

�M�pj �

X

p

jPrh��p� Pr

hT�pj �

T��X

i��

X

p

jPrhi�p� Pr

hi���pj

36

Page 41: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

If we show that for any � � i � T we have thatP

p jPrhi�p�Prhi���pj � ��TRmax� it willfollow that

Pp jPrM�p� Pr �M�pj � ��Rmax� which is precisely what we need to show�

We are left with the burden of proving thatP

p jPrhi�p�Prhi���pj � ��TRmax� We cansum over all path p as follows� �rst we sum over the N possible states that can be reachedin i steps� Then we sum over all possible path pre�xes that reach each such state� Next�we sum over all possible states reached after step i� �� and �nally over all possible su xesthat start from each such state� Now� we note that the probability of each particular pathp is the product of the probability of its particular pre�x� the probability of a transitionfrom xi to xi��� and the probability of the su x� We will use xi to denote the state reachedafter i steps� xi�� to denote the state reached after i� � steps� pre�xi to denote the isteppre�xes reaching xi� and suf�xj to denote the su xes starting at xj � Thus�

X

p

jPrhi�p� Pr

hi���pj �

X

xi

X

prexi

X

xi��

X

sufxi��

j�Prhi�pre�xi Pr

hi�xi � xi�� Pr

hi�suf�xi���

� Prhi��

�pre�xi Prhi��

�xi � xi�� Prhi��

�suf�xi��j

However� the pre�x and su x probabilities are identical in hi and hi��� Thus� this sum isequal to

X

xi

X

prexi

X

xi��

X

sufxi��

Prhi�pre�xi Pr

hi�suf�xi��jPr

hi�xi � xi��� Pr

hi���xi � xi��j �

X

xi

X

prexi

Prhi�pre�xi

X

xi��

X

sufxi��

Prhi�suf�xi��jPr

hi�xi � xi��� Pr

hi���xi � xi��j �

�X

xi

X

prexi

Prhi�pre�xi��

X

xi��

X

sufxi��

Prhi�suf�xi����NTRmax�

This last expression is a product of two independent terms� The �rst term is the sumover all possible istep pre�xes �i�e�� overall all pre�xes starting in the given x� and endingin xi� for any xi� Hence� it is equal to �� The second term is a sum over all su xes startingat xi��� for any value of xi��� For any given value of xi�� the probability of any su xstarting at this value is �� Summing over all possible values of xi��� we get a value of N �

Thus�

X

p

jPrhi�p� Pr

hi���pj � � � ��NTRmax �N

This concludes our proof�

Next� we de�ne the notion of an induced SG� The de�nition is similar to the de�nitionof an induced MDP given in �Kearns � Singh� ���� except for the use of R�max� Theinduced SG is the model used by the agent to determine its policy�

De�nition � Let M be an SG� Let L be the set of entries �Gi� a� a� marked unknown�

That is� if �Gi� a� a� � L then the entry corresponding to the joint action �a� a� in the

stage�game Gi is marked as unknown� De�ne ML to be the following SG� ML is identicalto M � except that ML contains an additional state G�� Transitions and rewards associated

37

Page 42: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

with all entries in ML which are not in L are identical to those in M � For any entry in L

or in G�� the transitions are with probability � to G�� and the reward is Rmax for the agentand for the adversary�

Given an SG M with a set L of unknown states� RMLmax denotes the optimal policyfor the induced SG ML� When ML is clear from the context we will simply use the termR�max policy instead of RMLmax policy�

We now state and prove the implicit explore or exploit lemma�

Lemma � Let M be an SG� let L and ML be as above� Let � be an arbitrary policy forthe adversary� let s be some state� and let � � � �� Then either ��� jOpt��M��� T �VR�maxj � � where VR�max is the expected T �step average reward for the RML�max policyon M � or �� An unknown entry will be played in the course of running R�max on M forT steps with a probability of at least �

Rmax�

In practice� we cannot determine �� the adversary�s policy� ahead of time� Thus� wedo not know whether Rmax will attain nearoptimal reward or whether it will reach anunknown entry with su cient probability� The crucial point is that it will do one or theother� no matter what the adversary does�

Proof� First� notice that the value of R�max in ML will be no less than the value ofthe optimal policy in M � This follows from the fact that the reward for the agent is at leastas large as in M � and that the Rmax policy is optimal with respect to ML�

In order to prove the claim� we will show that the di�erence between the reward obtainedby the agent in M and in ML when R�max is played is smaller than the explorationprobability times Rmax� This will imply that if the exploration probability is small� thenR�max will attain nearoptimal payo�� Conversely� if nearoptimal payo� is not attained�the exploration probability will be su ciently large�

For any policy� we may write�

UM�s� �� �� T �X

p

Pr��sM �p�UM�p �X

q

Pr��sM �q�UM�q �X

r

Pr��sM �r�UM�r

where the sums are over� respectively� all T paths p in M � all T paths q in M such thatevery entry visited in q is not in L� and all T path r in M in which at least one entry visitedis in L� Hence�

jUM�s�Rmax� �� T �UML�s�Rmax� �� T j� j

X

p

PrRmax���sM �p�UM�p�

X

p

PrRmax���sML

�p�UML�pj

� jX

q

PrRmax���sM �q�UM�q�

X

r

PrRmax���sM �r�UM�r�

X

q

PrRmax���sML

�q�UML�q�X

r

PrRmax���sML

�r�UML�rj

� jX

q

PrRmax���sM �q�UM�q�

X

q

PrRmax���sML

�q�UML�qj �

jX

r

PrRmax���sM �r�UM�r�

X

r

PrRmax���sML

�r�UML�rj

��

38

Page 43: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

The �rst di�erence�

jX

q

PrRmax���sM �q�UM�q�

X

q

PrRmax���sML

�q�UML�qj

must be �� This follows from the fact that in M and in ML� the rewards obtained in a pathwhich do not visit an unknown entry are identical� The probability of each such path isidentical as well�

Hence� we have�

jUM�s�Rmax� �� T �UML�s�Rmax� �� T j � j

X

r

PrRmax���sM �r�UM�r�

X

r

PrRmax���sML

�r�UML�rj

�X

r

PrRmax���sM �r�Rmax

This last inequality stems from the fact that the average reward in any path is no greaterthan Rmax and no smaller than � and the fact that we can appropriately associate di�erentpaths within these models with equal probabilities�

The last term is the probability of reaching an unknown entry multiplied by Rmax� Ifthis probability is less than �

Rmaxthen

jUM �s�Rmax� �� T � UML�s�Rmax� �� T j �

Denote by �� an optimal T step policy� and let UM�s� �� T be its value� �Note that thisvalue is independent of the adversary strategy �� as �� guarantees at least this value forevery adversary behavior� If UM�s�Rmax� �� T � UM�s� �� T we are done� Suppose thatUM �s�Rmax� �� T � UM�s� �� T � We know that UML

�s�Rmax� �� T � UM�s� �� T is nolesser than the optimal T step average reward for M � Therefore� we have that

jUM�s� �� T � UM �s�Rmax� �� T j� UM�s� �� T � UM�s�Rmax� �� T

� UML�s�Rmax� �� T � UM�s�Rmax� �� T �

We are now ready to prove Theorem �� First� we wish to show that the expected averagereward is as stated� We must consider three models� M � the real model� M �

L the actualmodel used� and M � where M � is an ���NTRmaxapproximation of M such that the SGinduced by M � and L is M �

L� At each T step iteration of our algorithm we can apply theImplicit Explore or Exploit Lemma to M � and M �

L for the set L applicable at that stage�Hence� at each step either the current Rmax policy leads to an average reward that is��� close to optimal with respect to the adversary�s behavior and the model M � or it leadsto an e cient learning policy with respect to the same model� However� because M � isan ���approximation of M � the simulation lemma guarantees that the policy generated iseither � close to optimal or explores e ciently� We know that the number of T step phasesin which we are exploring can be bounded polynomially� This follows from the fact that wehave a polynomial number of parameters to learn �in N and k and that the probabilitythat we obtain a new� useful statistic is polynomial in �� T and N � Thus� if we choose a

��

39

Page 44: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

large enough �but still polynomial number of T step phases� we shall guarantee that ouraverage reward is as close to optimal as we wish�

The above analysis was done assuming we actually obtain the expected value of eachrandom variable� This cannot be guaranteed with probability �� Yet� we can ensure thatthe probability that the algorithm fails to attain the expected value of certain parametersbe small enough by sampling it a larger �though still polynomial number of time� This isbased on the wellknown Cherno� bound� Using this technique one can show that when thevariance of some random variable is bounded� we can ensure that we get near its averagewith probability �� � by using a su ciently large sample that is polynomial in ����

In our algorithm� there are three reasons why the algorithm could fail to provided theagent with near optimal return in polynomial time�

�� First� we have to guarantee that our estimates of the transition probabilities for everyslot are su ciently accurate� Recall that to ensure a loss of no more than ��� ourestimates must be within �

�NTRmaxof the real probabilities�

Consider a set of trials� where the joint action �a� a� is performed in state s� Considerthe probability of moving from state s to state t given the jointaction �a� a� in agiven trial� and denote it by p� Notice that there are Nk� such probabilities �onefor each game and pair of agentadversary actions� Therefore� we would like toshow that the probability of failure in estimating p is less than �

�Nk�� Let Xi be an

indicator random variable� that is � i� we moved to state t when we were in state sand selected an action a in trial i� Let Zi � Xi � p� Then E�Zi � �� and jZij � ��

Then� Cherno� bound implies that �for any K� Prob�!K�

i��Zi � K��

� � e�K�

��

� � This

implies that Prob��K�i��

Xi

K��p � K�

��

� � e�K�

��

� � Similarly� we can de�ne Z�i � p�Xi�

and get by Cherno� bound that Prob�!K�

i��Z�

i � K��

� � e�K�

��

� � This implies that

Prob�p��K�i��

Xi

K�� K�

��

� � e�K�

��

� � Hence� we get that Prob�j�K�i��

Xi

K��pj � K�

��

� �

�e�K�

��

� �

We now choose K� such thatK���

� � ��NTRmax

� and �e�K�

��

� � ��Nk�

� This is obtained

by taking K� � max���NTRmax�

����ln�� ��Nk�

� ��

The above guarantees that if we sample each slot K� times the probability that ourestimate of the transition probability will be outside our desired bound is less than �

� �

Using the pigeonhole principle we know that total number of visits to slots markedunknown is Nk�K�� After at most this number of visits all slots will be markedknown�

�� The Implicit Exploit or Explore Lemma gives a probability of �Rmax

of getting toexplore� We now wish to show that after K� attempts to explore �i�e� when we do notexploit� we obtain the K� required visits� Let Xi be an indicator random variablewhich is � if we reach to the exploration state �G� in Lemma � when we do not exploit�and � otherwise� Let Zi � Xi �

�Rmax

� and let Z�i ��

Rmax� Xi� and apply Cherno�

��

40

Page 45: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

bound on the sum of Zi�s and Z�is as before� We get that Prob�j!K�

i��Xi �K��Rmax

j �

K�

� � �e�K�

��

� � We can now choose K� such that K�

� � K��

Rmax� k�NK� and

�e�K�

��

� � ��k�N to guarantee that we will have a failure probability of less than �

� dueto this reason�

�� When we perform a T step iteration without learning our expected return is Opt��M�T� ���� However� the actual return may be lower� This point is handled by the fact thatafter polynomially many local exploitations are carried out� Opt��M�T� �� �

�� can

be obtained with a probability of failure of at most �� � This is obtained by standard

Cherno� bounds� and makes use of the fact that the standard deviation of the expected reward in a T step policy is bounded because the maximal reward is boundedby Rmax� More speci�cally� consider z � MNT exploitation stages for some M � ��Denote the average return in an exploitation stage by �� and let Xi denote the returnin the ith exploitation stage �� � i � z� Let Yi � ��Xi

Rmax� Notice that jYij � ��

and that E�Yi � �� Cherno� bound implies that� Prob�!zj��Yj � z

� � e�z

��

This implies that the average return along z iterations is at most Rmax

z��

lower than

� with probability of at least e�z

��

� � By choosing M such that z � ��Rmax�

�� and

z � ��ln� ����� we get the desired result� with probability less than �

� the valueobtained will not be more than �

� lower than the expected value�

By making the failure probability less than �� for each of the above stages� we are able

to obtain a total failure probability of no more than ��From the proof� we can also observe the bounds on running times required to obtain

this result� However� notice that in practice� the only bound that we need to consider whenimplementing the algorithm is the sample size K��

To remove the assumptions that the �return mixing time is known� we proceed as in�Kearns � Singh� ����� From the proof of the algorithm we deduced some polynomialP in the problem parameters such that if T is the mixingtime� then after P �T steps weare guaranteed� with probability �� �� the desired return� We repeat the execution of thealgorithm for all values of T � �� �� �� � � �� each time performing P �T steps� Suppose thatT� is the mixing time� then after

PT�i�� P �i � O�P �T�� steps� we will obtain the desired

return�

Notice that the Rmax algorithm does not have a �nal halting time and will be appliedcontinuously as long as the agent is functioning in its environment� The only caveat is thatat some point our current mixing time candidate T will be exponential in the actual mixingtime T�� at which point each step of the algorithm will require an exponential calculation�However� this will occur only after an exponential number of steps� This is true for the E�

algorithm too�Another point worth noting is that the agent may never know the values of some of

the slots in the game because of the adversary�s choices� Consequently� if � is the optimalpolicy given full information about the game� the agent may actually converge to a policy ��

that di�ers from �� but which yields the best return given the adversary�s actual behavior�

��

41

Page 46: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

This return will be no smaller than the return guaranteed by �� The mixing time of �� will�in general� di�er from the mixing time of �� However� we are guaranteed that if T� is the�return mixing time of �� and v is its value� after time polynomial in T�� the agent�s actualreturn will be at least v �subject to the deviations a�orded by the theorem�

��� Repeated Games

A stochastic game in which the set of stage games contains a single game is called a repeatedgame� This is an important model in game theory and a lot of work has been devoted tothe study of learning in repeated games �Fudenberg � Levine� ����� There is large class oflearning problems associated with repeated games� and the problem as a whole is referred toas repeated games with incomplete information �Aumann � Maschler� ����� The particularclass of repeated games with incomplete information we are using �i�e�� where the agent getsto observe the adversary�s actions and the payo�s� and it knows the value of Rmax is knownas an Adaptive Competitive Decision Process and has been studied� e�g�� by Banos �Banos����� and Megiddo �Megiddo� �����

Because a repeated game contains a single stage game� there are no transition probabilities to learn� However� there is still the task of learning to play optimally� In addition�because there is only a single stage game� the mixing time of any policy is � � becausethe agent�s expected reward after playing a single stagegame is identical to the policy�s expected reward� However� the time required to guarantee this expected reward could be muchlarger� This stems from the fact that the optimal policy in a game is often mixed� That is�the agent chooses probabilistically� and not deterministically� among di�erent options�

In repeated games� the Rmax algorithm is slightly modi�ed� as we do not need tomaintain a �ctitious state and we need not maintain statistics on the frequency of varioustransitions� We describe the precise algorithm below�

Initialization Initialize the game model with payo�s of Rmax for every joint action for theagent and � for the adversary� Mark all joint actions as unknown�

Play Repeat the following process�

Policy Computation Compute an optimal policy for the game based on the currentmodel and play it�

Update If the joint action played is marked unknown� update the game matrix withits observed payo�s and mark is known�

Given � � �� and � � � � �� we need to show that after polynomially many iterationsM � where M is polynomial in the number of entries in the game� �

�� and �

�� we obtain a

payo� that is at most � lower than the expected payo� of the optimal strategy in this game�with probability of a least �� ��

First notice that the expected payo� at each stage� when we do not expose the value ofa new entry� is greater of equal to the expected payo� of the optimal strategy� By choosingM � Q� � Q�� where Q� � k�Rmax�M � ��� we get that the loss due to learning of newentries is bounded by �

� � Now� we need to guarantee that after Q� executions �where Q�

is polynomial in the problem parameters of a policy with expected payo� greater or equal

��

42

Page 47: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

to r �where r is the expected payo� of the optimal policy in the original game� our actualpayo� is at least r� ��� with probability of at least �� �� This follows from the argumentspresented in case � of the general proof for SGs�

�� Conclusion

We described Rmax� a simple reinforcement learning algorithm that is guaranteed to lead topolynomial time convergence to nearoptimal average reward in zerosum stochastic games�In fact� Rmax guarantees the safety level �probabilistic maximin value for the agent ingeneral noncooperative stochastic games�

Rmax is an optimistic modelbased algorithm that formally justi�es the optimism inthe face of uncertainty bias� Its analysis is similar� in many respects� to Kearns and Singh�sE� algorithm� However� unlike the E�� the agent does not need to explicitly contemplatewhether to explore or to exploit� In fact� the agent may never learn an optimal policyfor the game�� or it may play an optimal policy without knowing that it is optimal� The�clever� aspect of the agent�s policy is that it �o�ers� a catch to the adversary� if theadversary plays well� and leads the agent to low payo�s� then the agent will� with su cientprobability� learn something that will allow it to improve its policy� Eventually� withouttoo many �unpleasant� learning phases� the agent will have obtained enough informationto generate an optimal policy�

Rmax can be applied to MDPs� repeated games� and SGs� In particular� all singlecontroller stochastic game instances covered in �Brafman � Tennenholtz� ���� fall into thiscategory� and Rmax can be applied to them� However� Rmax is much simpler conceptuallyand easier to implement than the LSG algorithm described there� Moreover� it also attainshigher payo�� In LSG the agent must pay an additional multiplicative factor that doesnot appear in Rmax�

Two other SG learning algorithms appeared in the literature� Littman �Littman� ����describes a variant of Qlearning� called minimax Qlearning� designed for �person zerosum stochastic games� That paper presents experimental results� asymptotic convergenceresults are presented in �Littman � Szepesvri� ����� Hu and Wellman �Hu � Wellman����� consider a more general framework of multiagent generalsum games� This frameworkis more general than the framework treated in this paper which dealt with �xedsum� twoplayer games� Hu and Wellman based their algorithm on Qlearning as well� They provethat their algorithm converges to the optimal value �de�ned� in their case� via the notionof Nash equilibrium� However� convergence is in the limit� i�e�� provided that every stateand every joint action has been visited in�nitely often� Note that an adversary can preventa learning agent from learning certain aspects of the game inde�nitely and that Rmax�spolynomial time convergence to optimal payo� is guaranteed even if certain states and jointactions have never been encountered�

The class of repeated games is another subclass of stochastic games for which Rmaxis applicable� In repeated games� T � �� there are no transition probabilities to learn� andwe need not use a �ctitious stagegame� Therefore� a much simpler version of Rmax canbe used� The resulting algorithm is much simpler and much more e cient than previous

� In a game� an agent need not play optimally to obtain an optimal reward because it may obtain this

reward because of bad choices by the adversary�

��

43

Page 48: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

algorithms by Megiddo �Megiddo� ���� and by Banos �Banos� ����� Moreover� for thesealgorithms� only convergence in the limit is proven� A more recent algorithm by Hart andMasColell�Hart � MasColell� ���� features an algorithm that is much simpler than thealgorithms by Banos and Megiddo� Moreover� this algorithm is Hannan�Consistent whichmeans that it not only guarantees the agent its safety level� but it also guarantees thatthe agent will obtain the maximal average reward given the actual strategy used by theadversary� Hence� if the adversary plays suboptimally� the agent can get an average rewardthat is higher than its safetylevel� However� it is only known that this algorithm convergesalmostsurely� and its convergence rate is unknown� An interesting open problem is whethera polynomial time hannanconsistent nearoptimal algorithm exists for repeated games andfor stochastic games�Acknowledgments� We wish to thank Amos Beimel for his help in improving the errorbound in Lemma � and the anonymous referees for their useful comments� The �rst authoris partially supported by the Paul Ivanier Center for Robotics and Production Management�

References

Aumann� R�� � Maschler� M� ������ Repeated Games with Incomplete Information� MITPress�

Banos� A� ������ On pseudo games� The Annals of Mathematical Statistics� � � ����������

Brafman� R�� � Tennenholtz� M� ������ A nearoptimal polynomial time algorithm forlearning in certain classes of stochastic games� Arti�cial Intelligence� �� ����� ������

Fudenberg� D�� � Levine� D� ������ Selfcon�rming equilibrium� Econometrica� �� �����������

Hart� S�� �MasColell� A� ������ A general class of adaptive strategies� Journal of EconomicTheory� forthcoming�

Ho�man� A�� � Karp� R� ������ On Nonterminating Stochastic Games� ManagementScience� � ��� ��������

Hu� J�� � Wellman� M� ������ Multiagent reinforcement learning� Theoretical frameworkand an algorithms� In Proc� ��th ICML�

Kaelbling� L� P� ������ Learning in Embedded Systems� The MIT Press�

Kaelbling� L� P�� Littman� M� L�� � Moore� A� W� ������ Reinforcement learning� A survey�Journal of AI Research� �� ��������

Kearns� M�� � Koller� D� ������ E cient reinforcement learning in factored mdps� In Proc���th International Joint Conference on Arti�cial Intelligence �IJCAI�� pp� ��������

Kearns� M�� � Singh� S� ������ Nearoptimal reinforcement learning in polynomial time�In Int� Conf� on Machine Learning�

��

44

Page 49: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

Littman� M� L� ������ Markov games as a framework for multiagent reinforcement learning� In Proc� ��th ICML� pp� ��������

Littman� M� L�� � Szepesvri� C� ������ A generalized reinforcementlearning model� Convergence and apllications� In Proc� ��th Intl� Conf� on Machine Learning� pp� ��������

Megiddo� N� ������ On repeated games with incomplete information played by nonbayesian players� Int� J� of Game Theory� � ��������

Monderer� D�� � Tennenholtz� M� ������ Dynamic NonBayesian DecisionMaking� J� ofAI Research� �� ��������

Moore� A� W�� � Atkenson� C� G� ������ Prioratized sweeping� Reinforcement learningwith less data and less real time� Machine Learning� ���

Schmidhuber� J� H� ������ Curious modelbuilding control systems� In Proc� Intl� JointConf� on Neural Networks� pp� ����������

Sutton� R� S� ������ Integrated architectures for learning� planning� and reacting basedon approximating dynamic programming� In Proc� of the �th Intl� Conf� on MachineLearning� Morgan Kaufmann�

Sutton� R� S�� � Barto� A� G� ������ Reinforcement Learning� An Introduction� MIT Press�

Tadepalli� P�� � Ok� D� ������ Modelbased average reward reinforcement learning� Arti��cial Intelligence� �� ��������

��

45

Page 50: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

Bidding Clubs: Institutionalized Collusion in Auctions

Kevin Leyton-BrownDept. of Computer Science

Stanford UniversityStanford, CA 94305

[email protected]

Yoav ShohamDept. of Computer Science

Stanford UniversityStanford, CA 94305

[email protected]

Moshe TennenholtzDept. of Computer Science

Stanford UniversityStanford, CA 94305

[email protected]

ABSTRACTWe introduce a class of mechanisms, called bidding clubs,for agents to coordinate their bidding in auctions. In a bid-ding club agents first conduct a “pre-auction” within theclub; depending on the outcome of the pre-auction somesubset of the members of the club bid in the primary auctionin a prescribed way; and, in some cases, certain monetarytransfers take place after the auction. Bidding clubs haveself-enforcing collusion properties in the context of second-price auctions. We show that this is still true when multipleauctions take place for substitutable goods, as well as forcomplementary goods. We also present a bidding club pro-tocol for first-price auctions. Finally, we show cases wherebidding clubs have self-enforcing cooperation protocols inarbitrary mechanisms.1

1. INTRODUCTIONWith the exploding popularity of auctions on the Inter-

net and elsewhere has come increased interest in systems toassist (software or human) agents bidding in such auctions.Most of these systems have to date done little more than ag-gregate information from multiple auctions and present it tothe user in a convenient fashion (e.g., www.auctionwatch.com).There is now beginning to emerge a second generation of sys-tems which actually provide bidding advice and automationservices to bidders, going beyond the familiar proxy-biddingfeature prevalent in online auctions to the realm of bona-fidedecision support.

This paper looks even beyond such systems, which aregeared towards assisting a single bidder, and presents a classof systems to assist a collection of bidders, “bidding clubs”.The idea is similar to the idea behind “buyer clubs” on theInternet (e.g., www.merkata.com and www.mobshop.com),namely to aggregate the market power of individual bidders.The new twist is that whereas in a buyer club there is a per-fect alignment of the various buyers’ interests (since there

1This work was partly supported by DARPA grant numberF30602-98-C-0214-P00005.

Permission to make digital or hard copies of all or part of this work forpersonal or classroom use is granted without fee provided that copies arenot made or distributed for profit or commercial advantage and that copiesbear this notice and the full citation on the first page. To copy otherwise, torepublish, to post on servers or to redistribute to lists, requires prior specificpermission and/or a fee.EC’00, October 17-20, 2000, Minneapolis, Minnesota.Copyright 2000 ACM 1-58113-272-7/00/0010 ..$5.00

the more buyers join in a purchase the lower the price for ev-eryone), in a bidding club there is a more complex strategicrelationship among them, and the bidding club rules mustbe designed accordingly.

Here’s a simple example. Consider an auction with a sin-gle seller, and six potential buyers. Assume that three ofthe potential buyers – A, B and C, with corresponding (se-cret) valuations v1 > v2 > v3 – attempt to coordinate theirbidding. Assume the auction is a first-price auction. Un-der well known assumptions from the auction literature, itwould be the interest of each bidder to bid exactly 5/6 of histrue value in the auction. Thus A would end up with a sur-plus of v1/6 (if he wins the auction) or 0 (if he doesn’t), andB and C with a surplus of 0. Is there some pre-agreementA, B and C can make that will cause all of them to comeout of the auction at least as well off, and some of themstrictly better off? One could naıvely say that they wouldeach reveal their valuations to one another agreeing thatonly the highest would go on to the auction; A would there-fore be the one going on, and when he bids in the auctionhe would bid lower than 5v1/6 (a bid of 3v1/4 will work,given the above-mentioned assumptions), and thus increasehis expected surplus. The obvious flaw in this mechanismis that A, B and C will have incentive to lie in this initialphase; this could still be true if A were obliged to pay Band C a certain amount if they sat it out and he won theauction.

The above protocol is a simple instance of the class biddingclubs. In general, given some primary mechanism (typically,an auction), a bidding club protocol is as follows:

1. Some set of bidders are invited to join the biddingclub, and informed of its rules. The other bidders arenot made aware of the existence of the bidding club;we assume here that they are not even aware of thepossibility of its existence.

2. The bidders have the freedom to join the club or not.If they do it is assumed that they are guaranteed tofollow its rules.2

3. The bidding-club coordinator (or simply ‘coordinator’)asks the members for certain private information, suchas their valuations for the good that is being sold. No-tice that in general bidders may cheat about their val-uations.

2In practice, we will design bidding clubs in such a way thatany agent who would want to participate in the main auctionwill want to join the bidding club.

Appendix E

46

Page 51: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

4. The coordinator determines, according to pre-specifiedrules, how the members should behave in the primarymechanism based on the information they all supply.

5. The coordinator may also determine (and enforce) ad-ditional monetary transfers of the club members, basedon the results of the main mechanism.

6. The coordinator acts only as a representative of bid-ders.

It may seem natural to ask why a coordinator should bewilling and/or able to function as a trusted third party,without attention having been paid to its own incentives.We believe that it is best not to see the coordinator as aparty (with interests of its own) at all; rather, we conceiveof a coordinator as a software agent which is able to actonly according to its (commonly-known) programming. Itis therefore possible for the coordinator to act reliably—and for agents to be confident that the coordinator will actreliably—even in cases where the coordinator stands to gainnothing through its efforts. We do assume that coordina-tors should not cost money to operate—all of our coordina-tors are budget-balanced except for one that (unavoidably!)makes money. Finally, we have often been asked about thelegal issues surrounding the use of bidding clubs. While thisis an interesting and pertinent question, it exceeds both ourexpertise and the scope of this paper.

It turns out that, while the simple mechanism outlinedearlier fails, a more sophisticated one will ensure that B andC do not participate in the primary auction, and that Ais therefore assured higher expected payoff in the auction.More generally, the contributions of this paper are as follows:

1. We present a protocol for self-enforcing cooperation insecond-price auctions for substitute goods.

2. We present a protocol for self-enforcing cooperation insecond-price auctions for complementary goods.

3. We present a protocol for self-enforcing collusion infirst-price (as well as Dutch) auctions, in which onlysome of the agents coordinate their activities, and whichdoes not make any use of monetary transfers.

4. We present a protocol for self-enforcing cooperationin general auctions and economic mechanisms, whenthe agents’ types (e.g. valuations for goods) are takenfrom a finite set.

2. TECHNICAL BACKGROUNDThe strategic interaction among self-interested agents is

a primary topic of study in microeconomics [4] and gametheory [1]. In particular, the design of protocols for strategicinteractions is the subject of the field termed mechanismdesign [1]. The role of a mechanism (in particular, auction)designer is to define a game whose equilibrium strategiesare desirable in some respect or another. Thus, the designof a bidding club consists of taking a given mechanism – theprimary auction – and turning it into a more elaborate one,namely one with an added first stage in which a subset of theplayers play in some newly-designed game (as well as someadditional rules regarding behavior in the primary auctionand possible side payments after the auction).

Research on strategic aspects of multi-agent activity inArtificial Intelligence has grown rapidly in the recent years.This work has concentrated on the design of protocols foragents’ interaction [7, 3, 9], and shares much in commonwith work on mechanism design in economics. Many princi-ples and ideas grew up from the mechanism design literature,and have been adapted to the AI context.

Although the study of deals among agents has receivedmuch attention in the AI literature (see e.g. [7]), and al-though the study and design of contracts is central to infor-mation economics [4] (and received much attention in therecent AI literature [8]), the literature on cooperation un-der incomplete information in auctions and trades is quitelimited. In particular, the literature on collusion in auctionsis somewhat spotty. It is still too broad to give a completeoverview of it, and the bulk of it is informal. In the formalliterature on the topic, the results are quite specific, andcertainly do not apply in settings of parallel auctions (witheither substitutability or complementarity among goods),first-price auctions without side-payments, and general mech-anisms, which are the focus of our technical results. Theclosest result from the literature of which we are aware isby Graham and Marshall [2], who present a protocol forself-enforcing collusion by a subset of the participants of a(single-good) second-price auction. We discuss this resultbelow. Additional related study of collusion in auctions canbe found in [5].

3. AUCTION PRELIMINARIESWe now present some preliminaries of auction theory, as

well as a description of the classical auction model discussedin the paper and our parallel auction model.

3.1 Single auctionsAn auction procedure for selling a single good to one of

n potential participants, N = {1, 2, . . . , n} is characterizedby 4 parameters, M, g, c, d: M is the set of possible mes-sages a participant may submit; g = (g1, g2, . . . , gn), gi :Mn → [0, 1], is an allocation function, where gi determinesthe probability the winner of the auction will be agent i;c : Mn → R determines the payment by the winner of theauction; d is a participation fee. It is assumed that agentsmay decide not to participate in an auction.

In order to analyze auctions we have to discuss the infor-mation available to the participants. We assume the inde-pendent private values model, with no externalities. Eachagent i is assumed to have a valuation vi selected from theinterval of real numbers [0, 1] or from a finite domain, whichcaptures its maximal willingness to pay for the good. Wefurther assume that this valuation is selected from the uni-form distribution on the interval [0, 1] or on a finite domain.For ease of presentation we will assume the continuous case,excluding the section on general mechanisms, where the as-sumption that the set of possible valuations is finite is re-quired for our result. If agent i obtains the good and is askedto pay p, as well as a participation fee d, then its utility, ui,is given by vi − p − d; otherwise, if it is not assigned anygood then its utility is −d; if the agent does not participatein the auction then its utility is 0.

The above defines a Bayesian game, where a strategy foran agent is a decision about the message to be sent given itsvaluation, and the payoffs are determined as above. The so-lution of this game is given by computing a (Bayesian Nash)

47

Page 52: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

equilibrium of it: a joint strategy of the agents such that itis irrational for each agent to deviate from its strategy, giventhat all of the other agents stick to their strategy. Given anequilibrium strategy b = (b1, b2, . . . , bn), one can computeLi(b), the expected utility of agent i in equilibrium of thecorresponding game. In a case where there is more thanone equilibrium Li(b) is taken as the lowest expected util-ity over all the equilibria. Further discussion of equilibriumuniqueness is omitted from this paper.

One of the best-known auction mechanisms is the second-price auction. In such an auction, each participant submitsa bid in a sealed envelope. The agent with the highest bidwins the good and pays the amount of the second-highestbid, and all other participants pay nothing. In a case of a tie,the winner of the auction is selected randomly, with uniformprobability. If there is no participation fee then participationin second-price auctions is always rational. Truth revealing,i.e. bi(vi) = vi, is an equilibrium of the second-price auc-tion (in fact, it is an equilibrium in dominant strategies).Another popular auction is the first-price auction. Theseauctions are conducted similarly to second-price auctions,except that the winner pays the amount of his own bid. Theequilibrium analysis of first-price auctions is quite standard.For example, if valuations are selected according to the uni-form distribution on [0, 1] and there is no participation fee,then the strategy of agent i in equilibrium is bi(vi) = n−1

nvi.

3.2 Parallel auctionsMore generally, several auctions may be conducted in par-

allel. We first consider the case of two parallel auctions ofsimilar goods. A parallel auction is given in this case bya pair A = (A1, A2), where Ai = 〈N, g, c, d〉, (i = 1, 2) asbefore.

One such problem is a parallel auction for substitute goods,in which the set of possible buyers N is shared among A1 andA2, and each agent’s valuation for the pair of goods {g1, g2}equals its valuation for g1 which equals its valuation for g2.Agent i′s strategy consists of two parts:

1. It selects at most one of the auctions, in which it willparticipate.

2. It submits a bid in the selected auction.

Parallel auctions for substitute goods define a Bayesiangame in a natural way. For example, if the auctions aresecond-price auctions, then an appropriate equilibrium ofthe corresponding parallel auction is as follows: each agentrandomly selects one of the auctions, and sends his actualvaluation as his bid there.

Another type of parallel auction is the parallel auction forcomplementary goods. Here we have two similar auctions,e.g. second-price auctions, for two different goods g1 andg2. The set of agents N = N1 ∪ N2 ∪ Np consists of threeparts:

• N1 are agents that are interested only in g1

• N2 are agents that are interested only in g2

• Np are agents that have valuation 0 for g1 and for g2,but their valuation for the pair {g1, g2} is uniformlydistributed on the interval [0, 2].

For ease of exposition we will assume that we can distin-guish whether an agent is from group N1, N2, or Np, andthat the agents in Np have extremely high negative utilityfor losses. This second assumption means that an agent willnever submit bids in both auctions; notice that we assumedthat an agent who is interested in obtaining a pair of goodshas a valuation of 0 for getting only one of them, and there-fore by bidding in two auctions the agent may end up gettingand paying for only one good. Hence, we will assume thatthe strategies available to the agents are as in the case ofsubstitute goods.

We will rely on the notion of surplus in our evaluation ofcoordinators for parallel auctions. The surplus of an allo-cation is defined as the sum of agents’ valuations for thatallocation. For example, in a parallel auction for substitutegoods the surplus of an allocation that assigns good g1 inauction 1 to agent i, and assigns good g2 in auction 2 toagent j, is v1(g1) + v2(g2) (i.e., the sum of these agents’valuations for the goods they are assigned).

4. COORDINATORS AND BIDDINGCLUBS

Let G ⊂ N , where 1 < |G| < n. W.l.o.g let the ele-ments of G be {1, 2, . . . , |G|}. Given an auction A, denoteby Φi(A)(1 ≤ i ≤ n) the set of strategies available to agenti ∈ N .

Given a set of coordinator messages, Mc, which we takew.l.o.g to be R+, a (bidding club) coordinator is a pair offunctions C(A, G) = (T1(A, G), T2(A, G)), where T1(A, G) :

M|G|c → Φi(A)|G| and T2(A, G) = (tc

1, tc2, . . . , t

c|G|), tc

i : M|G|c ×

Mn → R. Namely, a coordinator is a mechanism that asksthe agents in G for some information and decides on theway they will behave in A; this is determined by the func-tion T1(A, G). In addition, following the decision made byT1(A, G), and given the messages sent in the main auction Aby members of N \G, an additional payment tc

i may be im-posed on agent i. The payment can be negative, positive, orzero. Mc contains the null message e that tells the coordina-tor that the corresponding agent is not willing to participatein the coordination activity. This agent will be free to partic-ipate in the auction by itself, and will not be asked to makeany payments to the coordinator. A key assumption is thatparticipants in N \ G are unaware of even the possibility ofthe existence of a coordinator, and that they act accordingto an equilibrium of A. We denote the game obtained byconcatenating C(A, G) and A, by C(A, G). For every agenti, let Li(A) be the agent’s expected utility in an equilibriumof A, and let Li(C(A, G)) be the agent’s expected utility inan equilibrium of C(A, G).

Definition 1. Given an auction A, and a G ⊂ N as be-fore, we will say that a participation-preserving coordinatorfor G in A exists, if there exists C(A, G), such that everyagent i ∈ G that would have had participated in A will alsoparticipate in C(A, G) (in equilibrium of C(A, G)).

Definition 2. We say that a utility-improving coordi-nator exists if there exists a participation-preserving coordi-nator, and Li(C(A, G)) > Li(A) (i.e. participation in thebidding club is beneficial).

The existence of a utility-improving coordinator for anauction setup implies a self-enforcing cooperative strategyfor a group of agents.

48

Page 53: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

Definition 3. We say that a surplus-improving coor-dinator for G in A exists if there exists a C(A, G) thatis participation-preserving , and the expected surplus of themembers of G in C(A, G) is greater than their expected sur-plus in A.

When dealing with parallel auctions in sections 5.2 and5.3, we will be interested in surplus-improving coordina-tors. Besides the observation that neither concept impliesthe other, the discussion of the connection between utility-improving and surplus-improving coordinators is left to thefull paper.

5. COORDINATION IN SECOND-PRICEAUCTIONS

5.1 Second-price auctions for a single goodThe case of collusion in second-price auctions is discussed

in [2]. The following theorem may be deduced from thiswork; we present the result here for the sake of completeness.Consider a second-price auction. In the case of a second-price auction a group of buyers may wish to avoid paying aparticipation fee, or alternatively bidders who will certainlylose may want to receive advance notice. As it turns out,such behavior can be obtained:

Theorem 1. There exists a utility-improving coordinatorfor second-price auctions.

Sketch of proof:In the case of a second-price auction, no assumptions on

the distribution of the agents’ valuations need to be made.We will assume that there is a participation fee d > 0, andshow a coordination protocol that enables the members ofthe group G who do not have the highest valuation to avoidpaying d. We use the following protocol:

1. The agents in G are asked to submit their valuationsto the coordinator.

2. Let v1 and v2 denote the highest and second highestvaluations, announced by agents 1 and 2, respectively.3

3. Only agent 1 is represented in the main auction, andhis bid there will be v1.

4. If agent 1 wins the main auction, and is asked to payz, and z < v2, then agent 1 will pay v2 − z to thecoordinator.

We show that if the agents participate in the pre-auctionand reveal their true valuations there, then this cooperationwill be beneficial to them. The agent with the highest val-uation cannot lose, because his behavior and expected gainwill be as in the case where there was no coordinator. Theother agents will gain due to the fact they won’t need to paythe participation fee.

Consider now the agent i ∈ G with the highest valu-ation, and assume that the other agents in G are truth-revealing agents. Given that truth-revealing is an equilib-rium of second-price auctions, agents in N \ G are taken to

3Note that, unlike in some of the coordination protocolsthat follow, the coordinator behaves the same regardless ofwhether some bidders decline to participate in the coordi-nation.

be truth-revealing as well. Given that if the agent i winsthe main auction, then he pays exactly the highest valua-tion in N − {i} (because he will pay the maximum of theauction’s second-highest bid and v2). Standard second-priceauction analysis yields that it is irrational for i to deviatefrom truth-revealing to the announcement of a higher valua-tion. If agent i was willing to participate in the main auctionthen clearly he does not wish to lose the pre-auction andtherefore announcing a lower valuation than his actual oneis irrational too. Clearly, every agent j �= i, j ∈ G does nothave any incentive to cheat if the others are truth-revealing.He can only lose if by cheating he will be chosen to partici-pate in the main auction.

It is easy to see that our result holds for Japanese auctionsas well. In a Japanese auction an auctioneer starts with alow asking price, and continuously increments this price aslong as are still multiple agents willing to pay the currentprice. Once only a single agent remains, he will get the goodfor the current asking price. The fact our result holds alsofor Japanese auctions is immediately implied by the factthat in both Japanese auctions and second-price auctionsthe good is sold to the agent with the highest valuation, ata price that equals the second-highest valuation.

5.2 Parallel auctions with substitute goodsIn this section we deal with parallel auctions of substitute

goods. Here the idea of the coordinator is to ensure that thetwo agents with the highest valuations in the group G willcompete for different goods rather than among themselves.This will enable to improve upon the surplus of the membersof G. We can show:

Theorem 2. There exists a surplus-improving coordina-tor for parallel second-price auctions of substitute goods.

Sketch of proof:

1. The agents in G are asked to submit their valuationsto the coordinator.

2. Let v1, v2, and v3 denote the highest, the second high-est, and the third highest valuations which have beenannounced, respectively.4

3. Only the agents with the highest and second highestvaluations will participate in the main auction. Theagents will be randomly assigned to different auctions.

4. If an agent gets the object in auction Ai for the pricey < v3, then he will pay v3 − y to the coordinator.

It is clear that if all agents obey the coordinator’s pro-tocol, and send their actual valuations to the coordinator,then the agents will improve upon their surplus. In equilib-rium agents will want to participate; for example, consideragents 1 and 2, having the two highest bids submitted to thecoordinator. As a result of the coordination the first agentwill have a lower expected payment, since he will always paysome amount less than v2, while the second agent will havea greater chance of winning, since he will never be outbidby agent 1.

4Once again, note that the coordinator behaves the sameregardless of whether some bidders decline to participate.

49

Page 54: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

We now show that truth-revealing is an equilibrium. Con-sider an agent i1, with the highest valuation in G, v1, andassume that the rest of the agents are truth-revealing. Ifagent 1 reports a valuation higher than v1, and obtains as aresult of this a good he could not obtain otherwise, then itmust be the case that his payment is higher than his valua-tion, which makes that deviation irrational. It is clear thatreporting on a valuation lower than v1 does not help agent1.

Consider an agent i2, with the second-highest valuationin G, v2, and assume the other agents are truth-revealing.If the agent reports a higher valuation than v1 then he willbe the highest-ranking bidder in the pre-auction rather thanthe second highest-ranking, but this will not benefit him asthe top two bidders are assigned to auctions randomly. Therest of the analysis is the same as for i1.

Consider an agent i3, with the third-highest valuation inG, v3, and assume the other agents are truth-revealing. Ifthe agent reports a valuation that causes it to gain the pre-auction, then its payment will be at least v2 > v3, whichmakes such deviation irrational. Similar analysis will holdfor agents with lower valuations.

5.3 Parallel auctions with complementary goodsIn this section we deal with parallel auctions for comple-

mentary goods. Our aim is to allow the participants in G toobtain a higher surplus than what they could obtain with-out the coordinator. We assume that in G we have at leasttwo representatives of N1, N2 and Np. We can show:

Theorem 3. There exists a surplus-improving coordina-tor for parallel second-price auctions of complementary goods.

Sketch of proof:Let 0 < k << 1 be a commonly-known constant. We will

use the following coordinator5:

1. The coordinator asks the agents that are interested inthe single goods for their valuations

2. The coordinator selects two agents, s1 and s2, whoreported the highest valuations for goods g1 and g2,v1 and v2 respectively.

3. If any agent from N1

⋃N2 declined to participate, the

coordinator submits bids in the appropriate auctionsfor all agents in N1

⋃N2 who did elect to participate,

with a price offer equal to the agents’ stated valuations,and the protocol is complete. Otherwise, if all agentselected to participate, we proceed to step 4.

4. The coordinator announces v1 and v2 to all of the par-ticipants in G.

5. The coordinator asks the agents that are interested inthe pair of goods for their valuations.

6. The coordinator randomly selects an agent, sp, whoreported a valuation vp for the pair of goods, suchthat v1 + v2 + 2k < vp (if such an agent exists).

5This requires a quite straightforward modification to thedefinition of coordinators, which we skip. Namely, a coor-dinator can run a multi-stage game instead of the functionT1(A, G).

7. The coordinator bids v1 in A1, and v2 in A2.

8. If the coordinator wins both auctions, and an agent sp

exists, then sp will get the pair of goods and pay vsec1+vsec2 to the coordinator, where vseci is the second-highest bid in Ai. Agent sp will also pay agent i (i =1, 2) k + max(0, vi − vseci).

9. If the coordinator only wins auction i, or if the coordi-nator wins both auctions but there does not exist anagent sp, then agent si gets the good and pays vseci

to the coordinator.

Consider an equilibrium of the corresponding C(A, G),and an agent s′i ∈ Ni∩G (i = 1, 2). It is clear that in equilib-rium s′i will participate in C(A, G) and that the submissionof a valuation which is at least as high as s′i’s valuation by s′idominates the submission of a lower valuation. This is dueto the fact that by submitting a valuation that is lower thanhis actual valuation an agent can only lose, given that this isa second-price auction. The agent cannot lose by participat-ing in the pre-auction, since it is guaranteed to get at leastthe difference between its stated valuation and the second-highest bid, if its stated valuation is the highest. Moreover,if agent sp wins the good then s′i may also get a payment ofk > 0. For this reason, and also because vseci may be lessthan the highest rejected bid from Ni

⋂G, truth revelation

will not be in the best interest of agent s′i. Instead, he willsubmit a bid that exceeds his true valuation.

Given the above, an agent sp, who has interest in the pairof goods will be willing to participate in the coordinator’sprotocol if v1 + v2 + 2k < vp. Note that all agents areaware of k before placing their bids. It is easy to checkthat it is irrational for sp to send a message that could winthe pre-auction if its valuation is smaller than v1 + v2 +2k, and likewise it is irrational for sp to falsely submit avaluation smaller than v1 + v2 + 2k. Otherwise the amountsubmitted by sp is irrelevant, as the coordinator choosesrandomly between eligible agents in Np. Thus, expectedsurplus is increased by this protocol.

6. COORDINATION IN FIRST-PRICEAUCTIONS

Theorem 4. There exists a utility-improving coordinatorfor first-price auctions.

Sketch of proof:Recall that we assume that the agents’ valuations are

drawn uniformly from the interval [0, 1]. Our protocol canbe easily modified to deal with other distributions on theagents’ types. Let m be the number of agents who will par-ticipate in the main auction, who are not members of thebidding club (and who are thus assumed not to be awareeven of the possibility of its existence). We use the followingprotocol:

1. Invite the agents in G to submit their valuations tothe coordinator.

2. If any agent declines to participate, submit bids for allagents that did elect to participate, with a price offerof n−1

nvi, and the protocol is complete. Otherwise, if

all agents elected to participate, we proceed to step 3.

50

Page 55: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

3. Let the two agents with the highest reported valuationsbe agents 1 and 2, with reported valuations v1 and v2

respectively.

4. If v1n

n< v2

m · (v1 − v2), submit a bid only for agent 1,with a price offer of v2.

5. Otherwise, submit bids for all agents i ∈ G, with priceoffer n−1

nvi.

First, we show that if the agents reveal their true valu-ations then beneficial cooperation ensues. It is clear thatthe only agent who can gain is the agent with the highestvaluation, v1, while the other agents do not lose. Note thatv1

n

nis the expected utility of agent 1 at the equilibrium in

the original mechanism, while v2m · (v1 − v2) is his expected

utility if he submits a bid of v2 in a modified mechanismwith m + 1 participants. v1 benefits because the protocol istailored specifically to him: the coordinator offers agent 1the choice of participating in the original mechanism at itsequilibrium, or of eliminating some bidders from the auctionand bidding v2. In every situation, the coordinator selectsthe alternative that agent 1 would prefer, given his statedvaluation. (Note that there exists a set with non-zero mea-sure of values of v1 and v2 satisfying the condition in step3 of the protocol; the demonstration of this fact is left tothe full version of the paper.) At the same time, no biddersuffers from being eliminated: each eliminated bidder is as-sured that a bid will be placed in the main auction exceedinghis valuation.

Now we show that the protocol leads the agents to re-veal their true valuations. As a result, participation willbe rational for all agents. To show that truth-revelation isan equilibrium, assume that all but one of the agents sub-mit their true valuations. Notice that since only agent 1can profit from the bidding club, the only reason that anyagent other than agent 1 would lie is to become the agentwith the highest valuation. However, this agent would theneither be represented in the original mechanism above theequilibrium, or be made to bid v1, more than his valuation.Agent 1 has no reason to lie because the mechanism is tai-lored exactly to him, as described above.

Note that, paradoxically, the bidding club can also benefitbidders who don’t even know of its existence! This is dueto the fact that in equilibrium of first-price auctions, bidsare decreasing as a function of the number of participants,and we assume that all agents are made aware of the num-ber of bidders participating in the main auction.6 Bidderswho are unaware of the bidding club will thus submit lowerbids if the bidding club eliminates bidders than if it doesnot. We do not analyze the case where bidders who are un-aware of the bidding club are aware of the total number ofbidders including those eliminated by the coordinator, sincethis knowledge would lead them to knowledge of the biddingclub’s existence (when they observed that a smaller numberof bids were actually entered in the auction), violating a keyassumption of our model.

6We assume that the number of bidders participating in theauction is determined according to the number of distinctbidders wanting to submit bids. Thus if the coordinatorplaces only one bid in the main auction then bidders who areunaware of the bidding club will also be unaware of bidderswho were eliminated in the bidding club’s pre-auction.

It is easy to see that our result holds for Dutch auctions aswell. In a Dutch auction the auctioneer starts with a highasking price, and then continuously decrements this priceuntil an agent claims the good for the current price. Thefact our result holds also for Dutch auctions is immediatelyimplied by the strategic equivalence between first-price auc-tions and Dutch auctions.

7. BIDDING CLUBS FOR GENERALMECHANISMS

The first-price and the second price auctions are two rep-resentative auctions, but many other auctions, as well asother economic mechanisms (various types of trades, nego-tiations, etc.), are also discussed in the literature. In thissection we show that utility-improving coordinators exist formany other related contexts as well.

General mechanisms are usually analyzed using Bayesiangames. In a Bayesian game each agent has a set of possibletypes, and an agent’s strategy is a decision of his action as afunction of his type. The actual type of the agent is knownto him, and is selected from a commonly known distributionfunction. The payoff of each agent is a function of both thejoint strategy of the agents and the particular type of theagent. In the context of auctions, the types of the agentsrefer to their valuations. The definition and analysis of equi-librium strategies for general mechanisms will therefore besimilar to what we described in Section 3 for the case ofauctions.

In order to prove results that are general and hold for anymechanism, researchers have used the following observation,which is a direct implication of the definition of an equilib-rium of a Bayesian game. It turns out that it is enoughto consider only mechanisms such that in the equilibriumof the corresponding Bayesian game the agents will revealtheir true types. According to this observation, termed therevelation principle, it is natural to restrict our attentionto (main) mechanisms which make a decision based on trueinformation supplied by the agents.

This brings us to the following general problem. Assumethat the agents’ types are selected from a finite set, and thatthe agents are about to participate in a given truth revealingmechanism M . Assume that the equilibrium of the game as-sociated with that mechanism leads to a non Pareto-optimaloutcome for at least one tuple of agent types (i.e. for thistuple of types the agents would better perform a joint strat-egy that is different from the equilibrium strategy). Can acoordinator be used in order to make a cooperative (bene-ficial and incentive compatible) deal among the agents? Inthe sequel, we assume that the valuations of the agents aretaken from V = {v1, . . . , vm} where vi < vi+1 for every i.We can show:

Theorem 5. Consider a truth revealing mechanism withunique strict Bayesian equilibrium, that leads to a non Pareto-optimal outcome for at least one tuple of agent types. Then,a utility-improving coordinator exists.

Basic idea behind proof: Each agent will be invited tosend his valuation to the coordinator. The coordinator willcalculate a tuple of other valuations that would benefit theagents (assuming they reported their actual valuations), ifsubmitted to the main mechanism. Notice that while an

51

Page 56: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

agent would lose in equilibrium by deviating from truth-revelation in the original mechanism, sending true valuationsis not necessarily an equilibrium if the coordinator submitsthe new tuple. However, we can show that there exists auseful coordinator which also maintains incentive compati-bility.

1. Invite the agents to submit their valuations to the co-ordinator.

2. If any agent declines to participate, submit the de-clared valuations of all participating agents to the mainmechanism.

3. Otherwise, submit the new tuple of valuations to themain mechanism on behalf of all agents with proba-bility p; with probability 1 − p submit the valuationsreported by the agents.

The probability p is determined as follows. Consider anagent i, who made the announcement vi. First, we can com-pute the maximum expected gain, gi, that i could achieve bysubmitting a valuation v′

i �= vi. Second, we can compute i’ssmallest expected loss in the original mechanism, li, if vi is afalse valuation. Notice that li is positive, given the assump-tion that truth-revelation is a strict Nash equilibrium. Letg = maxi(gi) and l = mini(li). Then we can take p = l

g+l.

The analysis of this protocol is straightforward. Agentsshould want to participate, as their expected utility is in-creased. Incentive compatibility is ensured because the mostan agent can gain by lying is p·g−(1−p)·l = l

g+l·g− g

g+l·l =

0. On expectation agents will lose by lying, since g and lare calculated globally, not individually for each agent.

8. CONCLUSIONIn this paper we have presented the notion of bidding clubs

and its use in obtaining self-enforcing cooperation in classi-cal auction setups. We have presented protocols for parallelsecond-price auctions for substitutable and complimentarygoods, for first-price auctions for single goods, and for gen-eral mechanisms under various assumptions. Our work canbe considered as a first attempt to formalize “strategic buy-ers’ clubs”, where participants may cheat about their valu-ations and so the club’s protocol must be designed carefullyenough to account for this possibility. The study of biddingclubs is complementary to the rich work on efficient marketdesign [4, 1, 6]. Bidding clubs take the agents’ perspectivein improving their situation in existing markets, rather thantaking a center’s perspective on optimal, revenue maximiz-ing market design.

9. REFERENCES[1] D. Fudenberg and J. Tirole. Game Theory. MIT Press,

1991.

[2] D.A. Graham and R.C. Marshall. Collusive bidderbehavior at single-object second-price and englishauctions. Journal of Political Economy, 95:579–599,1987.

[3] S. Kraus. Negotiation and cooperation in multi-agentenvironments. Artificial Intelligence, 94, 1997.

[4] D. Kreps. A Course in Microeconomic Theory.Princeton University Press, 1990.

[5] R.P. McAfee and J. McMillan. Bidding rings. TheAmerican Economic Theory, 82:579–599, 1992.

[6] D. Monderer and M. Tennenholtz. Optimal AuctionsRevisited. Artificial Intelligence, forthcoming, 2000.

[7] Jeffrey S. Rosenschein and Gilad Zlotkin. Rules ofEncounter. MIT Press, 1994.

[8] T. Sandholm and V. Lesser. Leveled commitmentcontracts. Working paper (to appear in Games andEconomic Behavior), 2000.

[9] M. Tennenholtz. Electronic commerce: Fromgame-theoretic and economic models to workingprotocols. In IJCAI-99, 1999.

52

Page 57: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

Bidding Clubs in First-Price Auctions

Kevin Leyton-Brown, Yoav Shoham and Moshe TennenholtzDepartment of Computer Science

Stanford UniversityStanford, CA 94305

Email: {kevinlb;shoham;moshe}@cs.stanford.edu

Version: March 22, 2002

We introduce a class of mechanisms, called bidding clubs, that allow agentsto coordinate their bidding in auctions. Bidding clubs invite a set of agents tojoin, and each invited agent freely chooses whether to accept the invitation orwhether to participate independently in the auction. Agents who join a biddingclub first conduct a “pre-auction” within the club; depending on the outcomeof the pre-auction some subset of the members of the club bid in the primaryauction in a prescribed way. We model this setting as a Bayesian game, includingagents’ choices of whether or not to accept a bidding club’s invitation. Afterdescribing this general setting, we examine the specific case of bidding clubs forfirst-price auctions. We show the existence of a Bayes-Nash equilibrium whereagents choose to participate in bidding clubs when invited and truthfully declaretheir valuations to the coordinator. Furthermore, we show that the existence ofbidding clubs benefits all agents (including both agent inside and outside of abidding club) in several different senses.1

1. INTRODUCTION

The advent of internet markets has spurred new interest in auctions.Most work in both economics and computer science has concentrated onthe design of auction protocols from the seller’s perspective, and in par-ticular on optimal (i.e., revenue maximizing) auction design. In this pa-per we present a class of systems to assist sets of bidders, bidding clubs.The idea is similar to the idea behind “buyer clubs” on the Internet (e.g.,www.mobshop.com): to aggregate the market power of individual bidders.Buyer clubs work when buyers’ interests are perfectly aligned; the morebuyers join in a purchase the lower the price for everyone. In auctions heldon the internet it is relatively easy for multiple agents to cooperate, hidingbehind a single auction participant. Intuitively, these bidders can gain bycausing others to lower their bids in the case of a first-price auction or bypossibly removing the second-highest bidder in the case of a second-price

1This work was partly supported by DARPA grant number F30602-98-C-0214-P00005.

53

APPENDIX F

Page 58: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

auction. However, the situation in auctions is not as simple as in buyerclubs, because while bidders can gain by sharing information, the competi-tive nature of auctions means that bidders’ interests are not aligned. Thusthere is a complex strategic relationship among bidders in a bidding club,and bidding club rules must be designed accordingly.

1.1. Related Work

While there is relative scarcity of previous work on bidder-centric mech-anisms, certainly our work has not been carried out in a vacuum. Below wediscuss the most relevant previous work and its relation to ours. This workall comes under the umbrella of collusion in auctions, a negative term stillreflecting a seller-oriented perspective. We adopt a more neutral stance to-wards such bidder activities and thus use the term bidding clubs rather thanthe terms bidding rings and cartels that have been used in the past. How-ever, the technical development is not impacted by such subtle differencesin moral attitude.

1.1.1. Collusion in Second-Price Auctions

One of the first formal papers to consider collusion in second-price auc-tions was written by Graham and Marshall [Graham and Marshall, 1987].This paper introduces a knockout procedure: agents announce their bids ina pre-auction; only the highest bidder goes to the auction but this biddermust pay a “ring center” the amount of his gain relative to the case wherethere was no collusion. The ring center pays each agent in advance; theamount of this payment is calculated so that the ring center will budget-balance ex-ante, before knowing the agents’ valuations.

Graham and Marshall’s work has been extended to deal with varia-tions in the knockout procedure, differential payments, and relations tothe Shapley value [Graham et al., 1990]. The case where only some ofthe agents are part of the cartel is discussed by Mailath and Zemsky[Mailath and Zemsky, 1991]. Ungern and Sternberg [von Ungern-Sternberg, 1988]discuss collusion in second-price auctions where the designated winner ofa cartel is not the agent with the highest valuation. Finally, although thisfact is not presented in any existing work of which we are aware, it is alsoeasy to extend Graham and Marshall’s protocol to handle an environmentwhere multiple cartels may operate in the same auction alongside indepen-dent bidders.

Overall, a much richer body of work deals with second-price auctionsthan with first-price auctions. This is possibly explained by the fact thatsince second-price auctions give rise to dominant strategies, it is possibleto study collusion in many settings related to these auctions without per-forming strategic equilibrium analysis.

54

Page 59: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

1.1.2. Collusion in First-Price Auctions

The key exception to the scarcity of formal work on first-price auctions isa very influential paper by McAfee and McMillan [McAfee and McMillan, 1992].It is the closest in the literature to our work, and indeed we have borrowedsome modelling elements from it. Several sections of their paper, includingthe discussion of enforcement and the argument for independent privatevalues as a model of agents’ valuations, are directly applicable to our pa-per. However, the setting introduced in their work assumes that a fixednumber of agents participate in the auction and that all agents are part ofa single cartel that coordinates its behavior in the auction. The authorsshow optimal collusion protocols for “weak” cartels (in which transfers be-tween agents are not permitted: all bidders bid the reserve price, using theauctioneer’s tie-breaking rule to randomly select a winner) and for “strong”cartels (the cartel holds a pre-auction, the winner of which bids the reserveprice in the main auction while all other bidders sit out; the winner dis-tributes some of his gains to other cartel members through side payments).A small part of the paper deals with the case where in addition to thesingle cartel there are also additional agents. However, results are shownonly for two cases: (1) when non-cartel members bid without taking theexistence of a cartel into account and (2) when each agent i has valuationvi ∈ {0, 1}. The authors explain that they do not attempt to deal withgeneral strategic behavior in the case where the cartel consists of only asubset of the agents; furthermore, they do not consider the case where mul-tiple cartels can operate in the same auction. Finally, a brief presentationof “cartel-formation games” is related to our discussion of agents’ decisionof whether or not to accept an invitation to join a bidding club.

1.1.3. Other Work on Collusion

Less formal discussion of collusion in auctions can be found in a widevariety of papers. For example, a survey paper that discusses mechanismsthat are likely to facilitate collusion in auctions, as well as methods for thedetection of such schemes, can be found in [Hendricks and Porter, 1989]. Adiscussion and comparison of the stability of rings associated with classicalauctions can be found in [Robinson, 1985]. That paper concentrates on thecase where the valuations of agents in the cartel are honestly reported.

Collusion is also discussed in other settings. For example, the literaturediscusses collusion that aims to influence purchaser behavior in a repeatedprocurement setting (see [Feinstein et al., 1985]), and in the context of gen-eral Bertrand or Cournot competition (see [Cramton and Palfrey, 1990]).

We should also mention that in an earlier paper we have anticipatedsome of the results reported here. Specifically, in [Leyton-Brown et al., 2000]we considered bidding clubs under the assumptions that only a single bid-ding club exists, and that bidders who were not invited to join the club are

55

Page 60: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

not aware of the possibility that a bidding club might exist. The currentpaper is an extension and generalization of that earlier work.

1.2. Distinguishing Features of our Model

Our goal in this work is to study cooperation between self-interested bid-ders in a rich model that captures many of the characteristics of auctions onthe internet. This leads to many differences between our model and mod-els proposed in the work surveyed above (particularly [Graham et al., 1990]and [McAfee and McMillan, 1992]). In particular, we argue that a modelof an internet auction setting that includes bidding clubs should includethe following features:

1. The number of bidders is stochastic.

2. There is no minimum number of bidders in a bidding club (i.e., bid-ding clubs are not required to contain all bidders).2

3. There is no limit to the number of bidding clubs in a single auction.

4. Club members and independent bidders behave strategically, actingaccording to correct beliefs about this complex environment.

The first feature above is crucial. In many real-world internet auctions,bidders are not aware of the number of other agents in the economic en-vironment. A bidding club that drops one or more interested bidders isthus undetectable to other bidders in an internet auction. An economicenvironment with a fixed number of bidders would not model this uncer-tainty, as the number of interested bidders would be common knowledgeamong all bidders regardless of the number of bids received in the auction.For this reason, we consider economic environments where the number ofbidders is chosen at random. We make use of a model of auctions withstochastic numbers of participants which is due to McAfee and McMillan[McAfee and McMillan, 1987]; we also refer to equilibrium analysis of thismodel by Harstad, Kagel and Levin [Harstad et al., 1990].

1.3. Bidding Clubs at a Glance

Roughly speaking, a scenario with bidding clubs has the following struc-ture:

1. Given a primary auction;

2. Given a set of bidders in that auction, drawn randomly from a set ofpotential bidders;

2For technical reasons we will have to assume that there is a finite maximum numberof bidders in each bidding club; however, this maximum may be any integer greater thanor equal to two.

56

Page 61: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

3. Given a partition of bidders into disjoint clubs, each of which can bethe redundant singleton club;

4. Each bidder chooses whether to bid in the primary auction directlyor through his club (it is assumed that this choice is strictly enforce-able). In the latter case, the bidder declares his valuation to the clubcoordinator;

5. Based on the bidders’ choices and declarations each club bids in theprimary auction, as do both the bidders who elected not to join theirrespective clubs and the singleton bidders.

6. Each (non-singleton) club bids according to pre-specified, commonlyknown rules. These rules also specify internal allocations and possiblemonetary transfers among club members upon the conclusion of theprimary auction.

To make bidding clubs a more realistic model of collusion in internetauctions, we restrict bidding club protocols in the following ways:

1. Participation in bidding clubs requires an invitation, but bidders mustbe free to decline this invitation without (direct) penalty. In this waywe include the choice to collude as one of agents’ strategic decisions,rather than starting from the assumption that agents will collude.

2. Bidding club coordinators must make money on expectation, andmust never lose money. This ensures that third-parties have incen-tive to run bidding club coordinators. Note that this requirement isnot satisfied by a [Graham et al., 1990]-type result, in which biddingclubs (or, in their parlance, cartels) are budget balanced ex ante, butmay lose money in individual auctions.

3. The bidding club protocol must give rise to an equilibrium whereall invited agents choose to participate, even when the bidding cluboperates in a single auction as opposed to a sequence of auctions.This means that agents can not be induced to collude in a givenauction by the threat of being denied future opportunities to collude.

1.4. Overview

This paper consists of two parts. First, sections 2 through 4 presentrelevant background that does not directly concern cooperation betweenbidders. In section 2 we give a formal model of an auction with a stochasticnumber of participants based on the model in [McAfee and McMillan, 1987].We set up an economic environment in which a finite number of agents ischosen at random from an infinite set of potential agents. We also give ageneral model of auction mechanisms based on [Monderer and Tennenholtz, 2000],

57

Page 62: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

and define symmetric Bayes-Nash equilibria for the resulting Bayesiangame. In section 3 we consider different variations on the first-price auc-tion mechanism. We begin with classical first-price auctions, in which thenumber of bidders is common knowledge, and then consider first-price auc-tions in the economic environment from section 2, where the number ofbidders is drawn from a known distribution. Combining results from bothauction types, we present first-price auctions with participation revelation:auctions in which the number of bidders is stochastic, but the auction-eer announces the number of participants before taking bids. This is theauction mechanism upon which we will base our bidding club protocol forfirst-price auctions. Finally, section 4 makes use of the revelation princi-ple to show a class of auction mechanisms in which bidders are subjectto different payment rules and may have different private information (inaddition to their valuations), yet all bid truthfully. We think that this re-sult is interesting in its own right, and certainly it is applicable to settingsother than collusion; however, it is also necessary to the proof of the maintheorem in section 6.

The second part of our paper is concerned explicitly with bidding clubs,using material from the first part to present a general model of bidding clubsand then a bidding club protocol for first-price auctions. First, section 5expands the economic environment from section 2 to include the followingnovel features:

• A finite set of bidding clubs is selected from an infinite set of potentialbidding clubs.

• A finite set of agents is selected to participate in the auction, froman infinite set of potential agents. Some agents are associated withbidding clubs, and the whole procedure is carried out in such a waythat no agent can gain information about the total number of agentsin the economic environment from the fact of his own selection.

• The space of agent types is expanded to include both an agent’svaluation, and the number of agents present in that agent’s biddingclub (equal to one if the agent does not belong to a bidding club).

We introduce notation to describe each agent’s beliefs about the num-ber of agents in the economic environment, conditioned on that agent’sprivate information. We also augment the auction mechanism from sec-tion 2 to describe additional strategic choices available to agents invitedto bidding clubs. In section 6 we examine bidding club protocols for first-price auctions. We begin with two assumptions on the distribution of agentvaluations: the first related to continuity of the distribution, and the sec-ond to monotonicity of equilibrium bids. After a technical lemma relatingequilibrium bids in auctions with stochastic numbers of participants un-der different distributions, we give a bidding club protocol for first-priceauctions with participation revelation. Our main technical results follow:

58

Page 63: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

• We show that it is an equilibrium for agents to accept invitations tojoin bidding clubs when invited and to disclose their true valuationsto their bidding club’s coordinator. Under the same equilibrium,singleton agents bid as they would in an auction with a stochasticnumber of participants in an economic environment without biddingclubs, in which the distribution over the number of participants isthe same as in the bidding clubs setting.

• In equilibrium each agent is better off as a result of his own club (thatis, his expected payoff is higher than would have been the case if hisclub never existed, but other clubs—if any—still did exist).

• In equilibrium each club increases all non-members’ expected payoffs,as compared to equilibrium in the case where all club members par-ticipated in the auction as singleton bidders, but all other clubs—ifany—still existed.

• In equilibrium each agent’s expected payoff is identical to the casein which no clubs exist; note that since clubs make money on ex-pectation, if clubs are willing to make money (or break even) onlyon expectation, they could distribute some of their ex ante expectedprofits among the club members, ensuring that all bidders gain onexpectation.

Finally, sections 7 and 8 consist of discussion and conclusions. Wetouch on questions of trustworthiness of coordinators, legality of biddingclubs and steps an auctioneer could take to disrupt the operation of biddingclubs in her auction.

2. AUCTION MODEL

In this section we provide a (non-controversial) auction model, meantto capture an internet auction setting such as eBay. Of course, this modelis applicable to many other auctions as well. Auctions may be seen asconsisting of an economic environment plus an auction mechanism whichtogether define a Bayesian game. First, our economic environment consistsof a stochastic number of agents, each of which has private informationabout the number of participants in the auction and knows the distribu-tion from which others’ types are drawn. This section draws heavily onwork by McAfee and McMillan [McAfee and McMillan, 1987] on auctionswith a stochastic number of participants. Second, the game includes anauction mechanism in which the agents participate; this section is basedon [Monderer and Tennenholtz, 2000]. After defining these elements, wegive a formal definition of the Bayesian game.

59

Page 64: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

2.1. The Economic Environment

An economic environment E consists of a finite set of agents who havenon-negative valuations for a good at auction, and a distinguished agent 0—the seller or center. The set of agents is selected by an exogenous process,and each agent is unaware of the total number of agents participating inthe economic environment. Following [McAfee and McMillan, 1987], letthe set of agents who may participate in the economic environment beA ≡ N. Let βA represent the probability that a finite set A ⊂ A is the setof agents. The probability that n agents3 will participate in the auctionis γA(n) =

∑A,|A|=n βA. All agents know the probability distribution βA.

Once an agent k is selected, he updates his probability of the number ofagents present as:

pkn =

∑A,|A|=n,k∈A βA∑

A,k∈A βA. (1)

We deviate from the model in [McAfee and McMillan, 1987] by addingthe assumption that it is common knowledge that all bidders are equallylikely to be chosen. Hence pk

n is the same for all k; we will hereafter referonly to pn. Finally, we assume that γA(0) = γA(1) = 0; at least two agentswill participate in the auction.

Let T be the set of possible agent types. The type τi ∈ T of agent iis the tuple (vi, si) ∈ V × S. vi denotes an agent’s valuation: his maximalwillingness to pay for the good offered by the center. We assume that vi

represents a purely private valuation for the good, and that vi is selected in-dependently from the other vj ’s of other agents from a known distribution,F , having density function f . By si we denote agent i’s signal: his privateinformation about the number of agents in the auction. In this section wewill consider the simple case where S = {∅}: it is common knowledge thatall agents receive the null signal, and hence gain no additional informationabout the number of agents. Note, however, that the economic environ-ment itself is always common knowledge, and so agents always have someinformation about the number of agents even when they receive the nullsignal. We will consider more complex signals in section 5. We will use thenotation pτi

n to denote the probability that agent i assigns to there beingn agents in the auction, conditioned on his type τi. Throughout the pa-per we will use uppercase P to denote the whole probability distributionas compared to the probability of a particular number of agents which wehave denoted by lowercase p; in this case we denote the whole distributionconditioned on i’s type as P τi .

The utility function of agent i, ui : R → R is linear, normalized withui(0) = 0. The utility of agent i (having valuation vi) when asked to pay

3When we say that n agents participate in the auction we do not count the distin-guished agent 0, who is always present.

60

Page 65: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

t is vi − t if i is allocated a good, and it is 0 otherwise. Thus, we assumethat there are no externalities in agents’ valuations and that agents arerisk-neutral.

2.2. The Auction Mechanism

We denote the possible allocations of the good to the agents by Π. Anauction mechanism is a tuple (M, g, t), where:

• M is the set of possible messages an agent may send.

• g : Mn → ∆(Π) is an allocation function where ∆(Π) is the tuple ofdistribution functions over Π (e.g., the allocation may include randomelements).

• t = (t1, t2, . . . , tn), ti : Mn × Π → R is the (monetary) transferfunction for agent i.

Notice that n is a parameter. Technically, an auction mechanism definesg and t for any number of participants, and can be therefore considered asa set of tuples (one for each number of agents).

Given the above, the dynamics of an auction mechanism can be de-scribed as follows:

• Each agent i sends a message µi to the center. We denote the set ofmessages received by the center as µ.

• The center conducts a lottery according to the distribution g(µ), andselects the allocation π.

• Agent i gets πi, and is required to transfer ti(µ, π) to the center.

• The utility of i is vi − ti(µ, π) if he is assigned a good, and it is−ti(µ, π) otherwise.

2.3. The Bayesian Game

The auction mechanism (M, g, t), in conjunction with the economic en-vironment E, defines a Bayesian game. We will use the following definitionsand notation. A strategy bi : T → M for agent i is a mapping from histype τi to a message µi. This may be the null message, which means thathe has elected not to participate in the auction. Σ denotes the set of possi-ble strategies, i.e., the set of functions from types to messages in M. Eachagent’s type is that agent’s private information, but the whole setting iscommon knowledge.

For notational simplicity we only define symmetric equilibria, whereall agents bid the same function of their type, as this is sufficient for ourpurposes in this paper. A more general definition would proceed along the

61

Page 66: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

same lines. By Li(τi, bi, bj−1) we denote agent i’s ex post expected utility

given that his type is τi, he follows the strategy bi and all other agents usethe strategy b, in the case that there are a total of j agents. The strategyprofile bn ∈ Σn is a symmetric equilibrium if and only if:

∀i ∈ A,∀τi ∈ T , b ∈ argmaxbi∈Σ

∞∑j=2

pτij Li(τi, bi, b

j−1) (2)

3. FIRST-PRICE AUCTIONS

In this section we discuss several different variants of the first-priceauction. First we describe classical first-price auctions, in which a fixednumber of participants belong to the economic environment, and hencethe number of bidders is common knowledge. Next we consider first-priceauctions with a stochastic number of participants, where the number ofbidders in the economic environment is drawn from a known distribution.Using the previous two settings, we present first-price auctions with par-ticipation revelation, where the number of agents is chosen stochastically,but the auctioneer announces the number of agents who have registered inthe auction before taking bids. This last type of first-price auction is theone we will consider in our discussion of bidding clubs in section 6.

3.1. Classical first-price auctions

In a classical first-price auction, each participant submits a bid in asealed envelope. The agent with the highest bid wins the good and paysthe amount of his bid, and all other participants pay nothing. In the caseof a tie, the winner of the auction is selected uniformly at random from thebidders who tied for the highest bid. (Note, however, that when F is con-tinuous and has no atoms the probability of two bidders having the sametype is 0; ties will therefore occur with probability 0 if bidders follow anequilibrium in which they all bid a strictly monotonically-increasing func-tion of their valuations.) The equilibrium analysis of first-price auctions isquite standard:

Proposition 1. If valuations are selected independently according tothe uniform distribution on [0, 1] then it is a symmetric equilibrium for eachagent i to follow the strategy:

b(vi) =n − 1

nvi.

Using classical equilibrium analysis (e.g., following Riley and Samuelson[Riley and Samuelson, 1981]) it is possible to show how classical first-priceauctions can be generalized to an arbitrary continuous distribution F .

62

Page 67: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

Proposition 2. If valuations are selected from a continuous distribu-tion F then it is a symmetric equilibrium for each agent i to follow thestrategy:

b(vi) = vi − F (vi)−(n−1)

∫ vi

0

F (u)n−1du.

.

In both cases, observe that although n is a free variable, n is not aparameter of the strategy; the same is true of the distribution F . Agentsdeduce this information from their full knowledge of the economic environ-ment. It is useful, however, to have notation specifying the amount of theequilibrium bid as a function of both v and n. We write

be(vi, n) = vi − F (vi)−(n−1)

∫ vi

0

F (u)n−1du. (3)

3.2. First-price auctions with a stochastic number of bidders

In the economic environment described in section 2.1 the number ofagents is not a constant; rather, it is chosen stochastically from a knownprobability distribution. An equilibrium for this setting was demonstratedby Harstad, Kagel and Levin [Harstad et al., 1990]:

Proposition 3. If valuations are selected from a continuous distribu-tion F and the number of bidders is selected from the distribution P thenit is a symmetric equilibrium for each agent i to follow the strategy:

b(vi) =∞∑

j=2

pjbe(vi, j)

Observe that be(vi, j) is the amount of the equilibrium bid for a bidderwith valuation vi in a setting with j bidders as described in section 3.1above. P is deduced from the economic environment.4 We overload ourprevious notation for the equilibrium bid, this time as a function of theagent’s valuation and the probability distribution P . Thus we write:

be(vi, P ) =∞∑

j=2

pjbe(vi, j) (4)

We will make frequent use of this function throughout the paper. Animportant note is that it describes the equilibrium bid in the situationwhere the economic environment is such that the number of agents is chosenby P and where all agents receive the null signal.

4Recall that P is a set: pj ∈ P for all j ≥ 0, where pj denotes the probability thatthe economic environment contains exactly j agents.

63

Page 68: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

3.3. First-price auctions with participation revelation

In some first-price auctions (e.g., auctions held on the internet), biddersparticipate in an economic environment where the number of bidders in theauction is not common knowledge. However, this can be helpful informa-tion for bidders. One obvious way of addressing this problem is to intro-duce a two-phase mechanism with revelation of the number of participantsbetween the stages. Specifically, a first-price auction with participationrevelation is as follows:

1. Agents indicate their intention to bid in the auction.

2. The auctioneer announces n, the number of agents who registered inthe first phase.

3. Agents submit bids to the auctioneer. The auctioneer will only acceptbids from agents who registered in the first phase.

4. The agent who submitted the highest bid is awarded the good for theamount of his bid; all other agents are made to pay 0.

It is unsurprising that, although a first-price auction with participationrevelation may have a stochastic number of participants,

Proposition 4. There exists an equilibrium of the first-price auctionwith participation revelation where every agent i indicates the intention toparticipate and bids according to be(vi, n).

Proof. Agents are always better off participating in first-price auctionsas long as there is no participation fee. The only way of participating isto declare the intention to participate in the first phase of the auction.Thus the number of agents announced by the auctioneer is equal to thetotal number of agents in the economic environment. From proposition 2it is best for agent i to bid be(vi, n) when it is common knowledge that thenumber of agents in the economic environment is n. That is exactly thecase under our mechanism.

In section 6 we will be concerned with first-price auctions with infor-mation revelation, but we will show an equilibrium in which the numberof agents registering in the first phase is smaller than the total number ofagents participating in the auction, because some bidders with low valua-tions drop out as part of a collusive agreement. The auctioneer’s declarationacts as a signal about the total number of bidders, but individual agentswill still be uncertain about the total number of opponents they face.

4. TRUTHFUL EQUILIBRIA IN ASYMMETRIC MECHANISMS

In this section we describe a particular class of auction mechanismsthat are asymmetric in the sense that every agent is subject to the same

64

Page 69: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

allocation rule but to a potentially different payment rule, and furthermorethat agents may receive different signals. It will be helpful for the proof ofour main theorem in section 6 to show that a truth-revealing equilibriumexists in such auctions under the following two conditions:

1. The auction allocates the good to the agent who submits the highestbid.

2. Consider the auction Mi in which all agents are subject to agenti’s payment rule and the above allocation rule, and where (hypo-thetically) all agents receive the signal si.5 Truth-revelation is asymmetric equilibrium in Mi.

Observe that the second condition above is less restrictive than it mayappear. From the revelation principle we can see that for every auctionwith a symmetric equilibrium there is a corresponding auction in whichtruth-revealing is an equilibrium that gives rise to the same allocation andthe same payments for all agents. Mi can thus be seen as a revelationmechanism for some other auction that has a symmetric equilibrium.

More formally, given a good g, let M represent a set of auctions {M1, . . . ,Mn} which all allocate the good to the agent who submits the highestbid, and which are all truth-revealing direct mechanisms for n risk-neutralagents with independent private valuations drawn from the same distribu-tion. We now define another auction M :

1. Each agent i sends a message µi to the center.

2. The center allocates the good to the agent i with µi ∈ maxj µj . Ifmultiple agents submit the highest message, the tie is broken in somearbitrary way.

3. Agent i is made to transfer ti(µ, π) to the center.6 The transferfunction ti is taken from Mi ∈ M.

We can now show:

Lemma 1. Truth-revelation is an equilibrium of M .

Proof. The payoff of agent i is uniquely determined by the allocationrule, the transfer function ti, and all agents’ strategies. Assume that theother agents are truth revealing, then the other agents’ behavior, the al-location rule, and agent i’s payment rule are all identical in M and Mi.Since truth-revelation is an equilibrium in Mi, truth-revelation is agent i’sbest response in M .

5That is, for every agent j in the real auction, we create an agent k in the hypotheticalauction Mi having type τk = (vj , si).

6Of course, this transfer can be either positive or negative.

65

Page 70: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

Example. Consider an auction for a single good g, where eight agentsbid for the good. The agents’ valuations are IPV, IID from a known distri-bution F , and the agents are risk-averse. Let M1 be a revelation mechanismfor a first-price auction: i.e., agents declare their valuations, and the win-ner is charged be(v, 8). In an economic environment consisting of eightagents with IPV valuations from F it is an equilibrium of M1 for agentsto truthfully declare their valuations to the center. Let M2 be a second-price auction; truthful declaration is a weakly dominant strategy under thisauction type. Both M1 and M2 allocate the good to the agent with thehighest declaration, and so these auctions meet the conditions given at thebeginning of the section. Now consider an auction M where odd-numberedagents are subject to the payment rule from M1, and even-numbered agentsare subject to the payment rule from M2. By lemma 1, truth-revelation isan equilibrium of M . There are other differences between payment rulesthat can cause agents’ expected utilities to differ: for example, lemma 1would still hold if M2 gave each agent an additional payment of $10 forparticipating in the auction.

The next corollary, which follows directly from the lemma, comparesa single agent’s expected utility under two different auctions M and M ′,which implement different payment rules. We will need this result for ourproof of theorem 1.

Corollary 1. Consider two auctions M and M ′, defined as above,which both implement the same transfer function for agent i. Agent i’sexpected utility is the same in both M and M ′.

Proof. The payoff of agent i is uniquely determined by the allocationrule, its transfer function, and all agents’ strategies. Both M and M ′ havethe same allocation rule. Lemma 1 tells us that truth revelation is a bestresponse for all agents in both M and M ′, so all agents’ strategies areidentical in the two auctions. In general, agents may not receive the sameexpected utility from M and M ′. However, since i has the same transferfunction in both auctions, i’s expected utility in M is equal to his expectedutility in M ′.

5. AUCTION MODEL FOR BIDDING CLUBS

In this section we extend both the economic environment and auctionmechanism from section 2 to include the characteristics necessary for amodel of bidding clubs. Because our aim is not to model a situation whereagents’ decision to collude is exogenous—as this would gloss over the ques-tion of whether the collusion is stable—we include the collusive protocolas part of the model and show that it is individually rational ex post (i.e.,after agents have observed their valuations) for agents to choose to collude.However, we do consider exogenous the selection of the set of agents whoare offered the opportunity to collude. Furthermore, we want to show the

66

Page 71: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

impact of the possibility of collusion upon non-colluding agents; indeed,even colluding agents must take into account the possibility that othergroups of agents in the auction may also be colluding. Once we have de-fined the new economic environment and auction mechanism, a well-definedBayesian game will be specified by every tuple of primary auction type, bid-ding club rules and distributions of agent types, the number of agents andthe number of bidding clubs.

5.1. The Economic Environment

We extend the economic environment E from the previous section toconsist of a set of agents who have non-negative valuations for a good atauction, the distinguished agent 0 and a set of bidding club coordinatorswho may invite agents to participate in a bidding club. Intuitively, weconstruct an environment where an agent’s belief update after observingthe number of agents in his bidding club does not result in any change inthe distribution over the number of other agents in the auction, becausethe number of agents in each bidding club is independent of the number ofagents in every other bidding club.

5.1.1. Coordinators

Coordinators are not free to choose their own strategies; rather, theyact as part of the mechanism for a subset of the agents in the economicenvironment. We select coordinators in a process analogous to our previousapproach for exogenously selecting agents: we draw a finite set of individ-uals from an infinite set of potential coordinators. In this case, however,this finite set is considered “potential coordinators”; in section 5.1.2 we willdescribe which potential coordinators are “actualized”, i.e., correspond toactual coordinators. Possible coordinators that are not actualized will cor-respond to singleton bidders in the auction.

More formally, let C ≡ N (excluding 0) be the set of all coordinators.βC represents the probability that a finite set C ⊂ C is selected to be theset of potential coordinators. We add the restriction that all coordinatorsare equally likely to be chosen. A consequence of this restriction is thatan agent’s knowledge of the coordinator with whom he is associated doesnot give him additional information about what other coordinators mayhave been selected. We denote the probability that an auction will involvenc potential coordinators as γC(nc) =

∑C,|C|=nc

βC . The distribution βC

is common knowledge. We assume that γC(0) = γC(1) = 0: at least twopotential coordinators will be associated with each auction.

5.1.2. Agents

We independently associate a random number of agents with each po-tential coordinator, again drawing a finite set of actual agents from an

67

Page 72: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

infinite set of potential agents. If only one (actual) agent is associatedwith a potential coordinator, the potential coordinator will not be actual-ized and hence the agent will not belong to a bidding club. In this waywe model agents who participate directly in the auction without being as-sociated with a coordinator. If more than one agent is associated witha potential coordinator, the coordinator is actualized and all the agentsreceive an invitation to participate in the bidding club.

More formally, let A ≡ N be the set of all agents, and let κ be themaximum number of agents who may be associated with a single biddingclub. Partition A into subsets, where agent i belongs to the subset A�i/κ�.Let βA be the probability that a finite set A ⊂ Ai is the set of agentsassociated with potential coordinator i; we assume that this distributionis the same for all i. Furthermore, as above, we assume that it is commonknowledge that all agents are equally likely to be chosen. The probabilitythat n agents will be associated with a potential coordinator is denotedγA(n) =

∑A,|A|=n βA. By the definition of κ, ∀j > κ, γA(j) = 0; we

assume that γA(0) = 0 and that γA(1) < 1.

5.1.3. Signals

Each agent receives a signal informing him of the number of agents inhis bidding club; as above we denote this signal as si.7 Of course, if thisnumber is 1 then there is no coordinator for the agent to deal with, andhe will simply participate in the main auction. Note also that agents areneither aware of the number of potential coordinators for their auction northe number of actualized potential coordinators, though they are aware ofboth distributions.

5.1.4. Beliefs

Once an agent is selected, he updates his probability distribution overthe number of actual agents in the economic environment. Not all agentswill have the same beliefs—agents who have been signaled that they be-long to a bidding club will expect a larger number of agents than singletonagents. We denote by pn,k

m the probability that there are a total of m agentsin the auction, given that there are n bidding clubs and that there are kagents in the bidder’s own club; we denote the whole distribution Pn,k.Because the numbers of agents in each bidding club are independent, ob-serve that every agent in the whole auction has the same beliefs about thenumber of other agents in the economic environment, discounting thoseagents in his own bidding club. Hence agent i’s beliefs are described by

7In fact, none of our results require that agents know the number of agents in theirbidding clubs; it would be sufficient that agents know whether they belong to a biddingclub. We consider the setting where agents’ signals are more informative because itsimplifies the exposition of the main theorem.

68

Page 73: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

the distribution Pn,si . It is important to note that Pn,si is simply an-other distribution over the number of agents in the auction. Although thisshorthand makes reference to the bidding club economic environment inorder to describe the construction of the distribution, it makes sense totalk about a classical auction with a stochastic number of bidders (i.e.,section 3.2) where the number of bidders is distributed according to Pn,k

for given values of n and k.

5.2. The Augmented Auction Mechanism

Bidding clubs, in combination with a main auction, induce an aug-mented auction mechanism for their members:

1. A set A of bidders is invited to join the bidding club.

2. Each agent i sends a message µi to the bidding club coordinator.This may be the null message, which indicates that the agent willnot participate in the coordination and will instead participate freelyin the main auction. Otherwise, agent i agrees to be bound by thebidding club rules, and µi is agent i’s declared valuation for the good.Of course, i can lie about his valuation.

3. Based on pre-specified and commonly-known rules, and on the infor-mation all the members supply, the coordinator selects a subset ofthe agents to bid in the main auction. The coordinator may bid onbehalf of these agents (e.g., using their ID’s on the auction web site)or it may instruct agents on how to bid. In either case we assumethat the coordinator can force agents to bid as desired, for exampleby imposing a charge on agents who do not behave as directed.

4. If a bidder represented by the coordinator wins the main auction, heis made to pay the amount required by the auction mechanism to theauctioneer. In addition, he may be required to make an additionalpayment to the coordinator.

Any number of coordinators may participate in an auction. However,we assume that there is only a single coordination protocol, and that thisprotocol is common knowledge.

6. BIDDING CLUBS FOR FIRST-PRICE AUCTIONS

In this section we first give some (mild) assumptions about the distri-bution of agent valuations, then use these assumptions to prove a technicallemma. We then give the bidding club protocol for first-price auctions. Weconsider a first-price auction with participation revelation as described insection 3.3. Bidders indicate their intention to participate, the auction-eer announces the total number of bidders and then bidders place their

69

Page 74: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

bids. The bidding club decides whether to drop bidders before the firstphase; therefore the number announced by the auctioneer does not includedropped bidders. We show an equilibrium of this auction, and demonstratethat agents gain under this equilibrium.

6.1. Assumptions

Our results hold for a broad class of distributions of agent valuations—all distributions for which the following two assumptions are true.

First, we assume that F is continuous and atomless.In order to give our second assumption, we must introduce some nota-

tion. Define:

Px≥i =∞∑

x=i

px. (5)

We now define the relation “<” for probability distributions:

P < P ′ iff ∃l(∀i < l, Px≥i = p′x≥i and ∀i ≥ l, Px≥i < P ′x≥i). (6)

We are now able to state our second assumption:

(P < P ′) implies that ∀v, be(v, P ) < be(v, P ′), (7)

Intuitively, we assume that every agent’s symmetric equilibrium bid ina setting with a stochastic number of participants drawn from P ′ is strictlygreater than that agent’s symmetric equilibrium bid in a setting with astochastic number of participants drawn from P , in the case where P ′

stochastically dominates P .

6.2. A Technical Lemma

Recall from section 5.1.4 that the notation Pn,k may be seen as defininga probability distribution over the number of agents that is independentof the bidding club setting. It is thus possible to discuss equilibrium bidsin the classical stochastic settings where the number of bidders is drawnfrom such a distribution. While it will remain to show why these valuesare meaningful in our setting where (among other differences) agents haveasymmetric information, it will be useful to prove the following lemmaabout the classical stochastic setting:

Lemma 2. ∀k ≥ 2,∀n ≥ 2,∀v, be(v, Pn+k−1,1) > be(v, Pn,k)

Remark. For convenience and to preserve intuition in what followswe will refer to the number of potential coordinators and the number ofagents belonging to a coordinator even though we concern ourselves with

70

Page 75: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

the classical economic environment from section 2.1 where bidding clubsdo not exist. The number of potential coordinators is shorthand for thenumber nc drawn from γC in the first phase of the procedural definitionof the distribution Pn,k. Likewise the number of agents associated with apotential coordinator is shorthand for the number of agents chosen fromone of the nc iterative draws from γA. Intuitively, this lemma asserts thatthe symmetric equilibrium bid is always higher when more agents belongto the main auction as singleton bidders and the total number of agents isheld constant.

Proof. Recall our second assumption from section 6.1. We defined P <P ′ as the proposition that ∃l(∀i < l, Px≥i = P ′

x≥i and ∀i ≥ l, Px≥i < P ′x≥i).

Our second assumption was that (P < P ′) implies that ∀v, be(v, P ) <be(v, P ′). It is thus sufficient to show that Pn+k−1,1 > Pn,k. We will takel = n + k.

First we will show that ∀j < n + k, Pn+k−1,1x≥j = Pn,k

x≥j . The distributionPn+k−1,1 expresses the belief that there are n+k−2 potential coordinators,the membership of which is distributed as described in section 5.1, andone potential coordinator that is known to contain only a single bidder.The distribution Pn,k expresses the belief that there are n − 1 potentialcoordinators, the membership of which is again distributed as described insection 5.1, and one potential coordinator that is known to contain exactlyk bidders. Under both distributions it is certain that there are at leastn + k − 1 agents. Therefore ∀j < n + k, Pn+k−1,1

x≥j = Pn,kx≥j = 1.

Second, ∀j ≥ n + k, Pn+k−1,1x≥j > Pn,k

x≥j . Considering Pn+k−1,1, observethat for n + k − 2 of the potential coordinators the probability that thiscoordinator contains a single agent is less than one and these probabili-ties are all independent; the last potential coordinator contains a singleagent with probability one. Considering Pn,k, there are n − 1 potentialcoordinators where the probability of containing a single agent is less thanone, exactly as above, and k potential coordinators certain to contain ex-actly one agent. Thus the two distributions agree exactly about n − 1 ofthe potential coordinators, which both hold to contain more than a sin-gle agent, and likewise both distributions agree that one of the potentialcoordinators contains exactly one agent. However, there remain k − 1potential coordinators about which the distributions disagree; Pn+k−1,1

always generates a greater or equal number of agents for these potentialcoordinators, as compared to Pn,k. Under the latter distribution all theseagents are singletons with probability one, while under the former thereis positive probability that each of the potential coordinators containsmore than one agent. As long as k ≥ 2, there is at least one poten-tial coordinator for which Pn+k−1,1 stochastically dominates Pn,k. Thus∀k ≥ 2,∀n ≥ 2,∀v Pn+k−1,1 > Pn,k.

71

Page 76: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

6.3. First-Price Auction Bidding Club Protocol

What follows is the protocol of a coordinator who approaches k agents.

1. Each agent i sends a message µi to the coordinator.

2. If at least one agent declines participation then the coordinator regis-ters in the main auction for every agent who accepted the invitationto the bidding club. For each bidder i, the coordinator submits a bidof be(µi, P

n,k), where n is the number of bidders announced by theauctioneer.

3. If all k agents accepted the invitation then the coordinator drops allbidders except the bidder with the highest reported valuation, whowe will denote as bidder h. For this bidder the coordinator will placea bid of be(µh, Pn,1) in the main auction.

4. If bidder h wins in the main auction, he is made to pay be(µh, Pn,1)to the center and be(µh, Pn,k) − be(µh, Pn,1) to the coordinator.

We are now ready to prove the main theorem of the paper:

Theorem 1. It is an equilibrium for all bidding club members to chooseto participate and to truthfully declare their valuations to their respectivebidding club coordinators, and for all non-bidding club members to partici-pate in the main auction with a bid of be(v, Pn,1).

Proof. We first prove that the above strategy is in equilibrium for bothcategories of bidders given that agents all participate; we then prove thatparticipation is rational for all agents.

For the proof of equilibrium we consider a one-stage mechanism whichbehaves as follows:

1. The center announces n, the number of bidders in the main auction.

2. Bidders submit bids (messages) to the mechanism.

3. The bidder with the highest bid is allocated the good.

4. The winning bidder is made to pay be(vi, Pn,si).

This one-stage mechanism has the same payment rule for bidding clubbidders as the bidding club protocol given above, but no longer implementsa first-price payment rule for singleton bidders. In order to prove that thestrategies given in the statement of the theorem are an equilibrium, it issufficient to show that truthful bidding is an equilibrium for all biddersunder the one-stage mechanism. Observe that this mechanism may beseen as a mechanism M in the sense of lemma 1: it allocates the good tothe agent who submits the highest message, and (by definition of be) the

72

Page 77: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

auction Mi in which all agents are subject to agent i’s payment rule andreceive the signal si has truth revelation as a symmetric equilibrium.

Strategy of non-club bidder: Assume that all bidding club agents bidtruthfully. Further assume that all non-club agents also bid truthfullyexcept for agent i. The probability distribution Pn,1 correctly describes thebeliefs of non-club agents, given the auctioneer’s announcement that thereare n bidders in the main auction. Although agents in bidding clubs haveadditional information about the number of agents—each agent knows thatthere is at least one other agent in his own club—their prescribed behavioris to place bids of be(µ, Pn,1) in the main auction. Agent i thus faces astochastic number of agents distributed according to Pn,1 and all biddingbe(v, Pn,1). Using the result from lemma 1, i’s strategic decision is the sameas under a mechanism where all agents are subject to his payment rule andshare his signal si, and with a stochastic number of bidders distributedaccording to Pn,1. In particular, it does not matter that the club membersare subject to different payment rules and have additional information, andso i will also bid be(v, Pn,1).

Strategy of club bidder: Assume that all agents accept the invitationto join their respective clubs and then truthfully declare their valuations,excluding agent i who decides to participate but considers his bid. Onceagain, observe that i is in a setting that is exactly described by lemma1: Pn,k really does describe the distribution over the number of agentsgiven his signal, and the bidder submitting the highest (global) messagewill always be allocated the good. Therefore the information asymmetrydoes not affect i’s strategy, and so truthful bidding is a best response foragent i.

We now turn to the question of participation; for this part of the proofwe return to the original, multi-stage mechanism.

Participation of non-club bidder: Because there is no participation fee,it is always rational for a bidder to participate in a first-price auction.

Participation of club bidder: Likewise, because there is no participationfee, all bidding club bidders will participate in the auction, but must decidewhether or not to accept their coordinators’ invitations. Assume that allagents except for i join their respective clubs and bid truthfully, and agenti must decide whether or not to join his bidding club. Agent i knows thenumber of agents in his bidding club and updates his distribution over thenumber of agents in the whole auction as Pn,k.

Consider the classical stochastic case where all bidders have the sameinformation as i (and are subject to the same payment rules): from propo-sition 3 it is a best response for i to bid be(vi, P

n,k). In this setting i’sexpected gain is the same as in the equilibrium where all bidding clubmembers (including i) join their clubs and bid truthfully, by corollary 1.

As a result of i declining the offer to participate in the bidding clubthere are n− 1 bidders in the main auction placing bids of be(v, Pn+k−1,1)and k − 1 other bidders placing bids of be(v, Pn,k). Note that this occurs

73

Page 78: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

because the singleton bidders and other bidding clubs in the main auctionfollow a strategy that depends on the number of bidders announced bythe auctioneer; hence they bid as though all the k − 1 bidders from thedisbanded bidding club might each be independent bidding clubs. Weknow from lemma 2 that be(v, Pn+k−1,1) > be(v, Pn,k). Thus the singletonbidders and other bidding clubs will bid a higher function of their valuationsthan the bidders from the disbanded bidding club. It always reduces abidder’s expected gain in a first-price auction to cause other bidders tobid above the equilibrium, because it reduces the chance that he will winwithout affecting his payment if he does win. This is exactly the effect ofi declining the offer to join his bidding club: the k − 1 other bidders fromi’s bidding club bid according to the equilibrium of the classical stochasticcase discussed above, but the n − 1 singleton and bidding club bidderssubmit bids that exceed the symmetric equilibrium amount. Therefore i’sexpected gain is smaller if he declines the offer to participate than if heaccepts it.

6.4. Do bidding clubs cause agents to gain?

We can show that bidders are better off being invited to a bidding clubthan being sent to the auction as singleton bidders. Intuitively, an agentgains by not having to consider the possibility that other bidders who wouldotherwise have belonged to his bidding club might themselves be biddingclubs.

Theorem 2. An agent i has higher expected utility in a bidding clubof size k bidding as described in theorem 1 than he does if the bidding clubdoes not exist and k additional agents (including i) participate directly inthe main auction as singleton bidders, again bidding as described in theorem1.

Proof. Consider the counterfactual case where agent i’s bidding clubdoes not exist, and all the members of this bidding club become single-ton bidders. We will show that i is better off as a member of the bid-ding club than in this case. If there were n potential coordinators inthe original auction and k agents in i’s bidding club, then the auction-eer will announce n + k − 1 as the number of participants in the newauction. Under the equilibrium from theorem 1, as a singleton bidder iwill bid be(vi, P

n+k−1,1). If he belonged to the bidding club and followedthe same equilibrium i would bid be(vi, P

n,k). In both cases the auctionis economically efficient, which means i is better off in the auction thatrequires him to pay a smaller amount when he wins. Lemma 2 showsthat ∀k ≥ 2,∀n ≥ 2,∀v, be(v, Pn+k−1,1) > be(v, Pn,k), and so our resultfollows.

We can also show that singleton bidders and members of other biddingclubs benefit from the existence of each bidding club in the same sense. Fol-

74

Page 79: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

lowing an argument similar to the one in theorem 2, other bidders gain fromnot having to consider the possibility that additional bidders might repre-sent bidding clubs. Paradoxically, other bidders’ gain from the existence ofa given bidding club is greater than the gain of that club’s members.

Corollary 2. In the equilibrium described in theorem 1, singleton bid-ders and members of other bidding clubs have higher expected utility whenother agents participate in a given bidding club of size k ≥ 2, as comparedto a case where k additional agents participate directly in the main auctionas singleton bidders.

Proof. Consider a singleton bidder in the first case, where the club of kagents does exist. (It is sufficient to consider singleton bidders, since otherbidding clubs bid in the same way as singleton bidders.) Following theequilibrium from theorem 1 this agent would submit the bid be(vi, P

n,1).Theorem 2 shows that it is better to belong to a bidding club (and thus tobid be(vi, P

n,k)) than to be a singleton bidder in an auction with the samenumber of agents (and thus to bid be(vi, P

n+k−1,1). Since the distributionPn,k is just Pn,1 with k − 1 singleton agents added, ∀k ≥ 2, be(vi, P

n,1) <be(vi, P

n,k). Thus ∀k ≥ 2, be(vi, Pn,1) < be(vi, P

n+k−1,1).

Finally, we can show that agents are indifferent between participatingin the equilibrium from theorem 1 in a bidding club of size k (thus, wherethe number of agents is distributed according to Pn,k) and participating inan economic environment with a stochastic number of bidders distributedaccording to Pn,k, but with no coordinators.

Theorem 3. For all τi ∈ T , for all k ≥ 1, for all n ≥ 2, agent i obtainsthe same expected utility by:

1. participating in a bidding club of size k in the economic environmentfrom section 5.1 and following the equilibrium from theorem 1;

2. participating in a first-price auction with participation revelation inan economic environment with a stochastic number of bidders dis-tributed according to Pn,k where all bidders receive the null signal,and where there are no coordinators.

Proof. First we will show that agent i’s expected utility in case (2) aboveis the same as in a classical first-price auction with a stochastic number ofbidders (i.e., without participation revelation). Second, we will show thatagent i’s expected utility in this classical stochastic setting is the same asin case (1) above.

From proposition 4 it is an equilibrium for agent i to bid be(vi, j) ina first-price auction with participation revelation (case (2)), where j isthe number of bidders announced by the auctioneer. Since the number ofagents is distributed according to Pn,k, the expected payment of agent iis

∑∞j=2 pn,k

j be(vi, j). This is the definition of be(vi, Pn,k) from equation 4.

75

Page 80: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

From proposition 3 this is an equilibrium bid of agent i when the numberof agents is distributed according to Pn,k (without information revelation).Since both the classical first-price auction with a stochastic number ofbidders and the first-price auction with participation revelation are efficient,agent i’s expected utility is the same under both auctions.

Under the equilibrium from theorem 1 (case (1)) the amount of i’spayment will be be(vi, P

n,k) if he wins. Since both the mechanism from case(1) and the classical first-price auction with a stochastic number of biddersare efficient, agent i has the same expected utility in both auctions.

This theorem shows that an agent would be as happy in a world with-out bidding clubs as he is in our economic environment. The difference be-tween the two worlds is that in the latter bidding club coordinators makea positive profit on expectation, and indeed never lose money. That is,in the bidding club economic environment some expected profit is shiftedfrom the auctioneer to the bidding club coordinator(s) without affectingthe bidders’ expected utility. We observe that it would be easy for coordi-nators to redistribute some of these gains to bidders along the lines of thesecond-price auction protocol proposed by Graham and Marshall: coordi-nators make a payment to every bidder who accepts the invitation to join,where the amount of this payment is less than or equal to the ex ante ex-pected difference that bidder makes to the coordinator’s profit. With thismodification coordinators would be budget balanced only on expectation(violating requirement 2 from section 1.3), but agents would strictly preferthe bidding club economic environment to the economic environment inwhich coordinators are not present.

7. DISCUSSION

In this section we consider the trustworthiness and legality of coordina-tors, and also discuss two ways for auctioneers to disrupt bidding clubs intheir auctions.

7.1. Trust

Why would a bidding club coordinator be willing to provide reliableservice, and likewise why would bidders have reason to trust a coordinator?For example, a malicious coordination protocol could be used simply todrop all its members from the auction and reduce competition. While thisis a reasonable concern, all the bidding club protocols discussed in thispaper allow the coordinator to make a profit on expectation. There is thusincentive for a trusted third party to run a reliable coordination service.Indeed, coordinators would be very inexpensive to run: as their behavior isentirely specified, they could operate without any human supervision. Theestablishment of trust is exogenous to our model; we have simply assumedthat all agents trust coordinators and that all coordinators are honest.

76

Page 81: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

7.2. Legality

We have often been asked about the legal issues surrounding the useof bidding clubs. While this is an interesting and pertinent question, itexceeds both our expertise and the scope of this paper. We should note,however, that uses of bidding clubs exist that might not fall under the legaldefinition of collusion. For example, a corporation could use a bidding clubto choose one of its departments to bid in an external auction. In this waythe corporation could be sure to avoid bidding against itself in the externalauction while avoiding dictatorship and respecting each department’s self-interest. Coordinators may also be permitted by the auctioneer: e.g., byan internet market seeking to attract more bidders to its site.

7.3. Disrupting Bidding Clubs

There are two things an auctioneer can do to disrupt bidding clubs in afirst-price auction. First, she can permit “false-name bidding.” Our auctionmodel has assumed that each agent may place only a single bid in theauction, and that the center has a way of uniquely identifying agents. Forexample, the auctioneer might use user accounts keyed to credit card billingaddresses in combination with a reputation ranking, making it impossiblefor bidders to place bids claiming to originate from different agents. Second,she can refrain from publicly disclosing the winner of the auction.

If bidders can bid both in their bidding clubs and in the main auction,they are better off deviating from the equilibrium in theorem 1 in thefollowing way. A bidder i can accept the invitation to join the biddingclub but place a very low bid with the coordinator; at the same time, ican directly submit a competitive bid in the main auction. Agent i willgain by following this strategy when all other agents follow the strategiesspecified in theorem 1 because accepting the invitation to join the biddingclub ensures that the club does drop all but one of its members and alsocauses the high bidder to bid less than he would if he were not bound to thecoordination protocol. If the bidding club drops any bidders other than ithen all agents’ bids will also be lowered because the number of participantsannounced by the auctioneer will be smaller, compared to the case wherethe bidding club did not exist or where it was disbanded. However, iffalse-name bidding is impossible and the winner of the auction is publiclydisclosed then the bidding club coordinator can detect an agent who hasdeviated in this way. Because the agent has agreed to participate in thebidding club the coordinator has the power to impose a punitive fine onthis agent, making the deviation unprofitable. If either or both of theserequirements does not hold, however, the coordinator will be unable todetect defection and so the equilibrium from theorem 1 will not hold.

77

Page 82: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

8. CONCLUSION

We have presented a formal model of bidding clubs which departs inmany ways from models traditionally used in the study of collusion; mostimportantly, all agents behave strategically based on correct informationabout the economic environment, including the possibility that other agentswill collude. Other features of our setting include a stochastic number ofagents and a stochastic number of bidding clubs in each auction. Agents’strategy space is expanded so that the decision of whether or not to join abidding club is part of an agent’s choice of strategy. Bidding clubs neverlose money, and gain on expectation. We have showed a bidding clubprotocol for first-price auctions that leads to a (globally) efficient allocationin equilibrium, and which does not make use of side-payments. There arethree ways of asking the question of whether agents gain by participatingin bidding clubs in first-price auctions:

1. Could any agent gain by deviating from the protocol?

2. Would any agent be better off if his bidding club did not exist?

3. Would any agent would be better off in an economic environmentthat did not include bidding clubs at all?

We have showed that agents are strictly better off in the first two sensesand no worse off in the last sense; furthermore, we have described a simpleside-payment scheme that would make agents strictly better off in all threesenses. We have also showed that each bidding club causes non-members togain in the second sense. Finally, we have discussed ways for an auctioneerto set up the rules of her auction so as to disrupt bidding clubs.

REFERENCES

Cramton, P. and Palfrey, T. (1990). Cartel enforcement with uncertaintyabout costs. International Economic Review, 31(1):17–47.

Feinstein, J., Block, M., and Nold, F. (1985). Assymetric behavior andcollusive behavior in auction markets. The American Economic Review,75(3):441–460.

Graham, D. and Marshall, R. (1987). Collusive bidder behavior at single-object second-price and english auctions. Journal of Political Economy,95:579–599.

Graham, D., Marshall, R., and Richard, J.-F. (1990). Differential pay-ments within a bidder coalition and the shapley value. The AmericanEconomic Review, 80(3):493–510.

78

goodelle
Page 83: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

Harstad, R., Kagel, J., and Levin, D. (1990). Equilibrium bid functionsfor auctions with an uncertain number of bidders. Economic Letters,33(1):35–40.

Hendricks, K. and Porter, R. (1989). Collusion in auctions. Annale’sD’economie de Statistique, 15/16:216–229.

Leyton-Brown, K., Shoham, Y., and Tennenholtz, M. (2000). Biddingclubs: institutionalized collusion in auctions. In ACM Conference onElectronic Commerce.

Mailath, G. and Zemsky, P. (1991). Collusion in second-price auctionswith heterogeneous bidders. Games and Economic Behavior, 3:467–486.

McAfee, R. and McMillan, J. (1987). Auctions with a stochastic numberof bidders. Journal of Economic Theory, 43:1–19.

McAfee, R. and McMillan, J. (1992). Bidding rings. The American Eco-nomic Theory, 82:579–599.

Monderer, D. and Tennenholtz, M. (2000). Optimal Auctions Revisited.Artificial Intelligence, 120(1):29–42.

Riley, J. and Samuelson, W. (1981). Optimal auctions. American Eco-nomic Review, 71:381–392.

Robinson, M. (1985). Collusion and the choice of auction. Rand Journalof Economics, 16(1):141–145.

von Ungern-Sternberg, T. (1988). Cartel stability in sealed bid secondprice auctions. The Journal of Industrial Economics, 18(3):351–358.

79

Page 84: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

Mechanism Design with Incomplete Languages

Amir Ronen∗

Department of Computer ScienceStanford University

([email protected])

AbstractA major achievement of mechanism design theory is the family of

truthful mechanisms often called VCG (named after Vickrey, Clarkeand Groves). Although these mechanisms have many appealing prop-erties, their essential intractability prevents them from being appliedto complex problems like combinatorial auctions. In particular, VCGmechanisms require the agents to fully describe their valuation func-tions to the mechanism. Such a description may require exponentialsize and thus be infeasible for the agents.

A natural approach for this problem is to introduce an intermediatelanguage for the description of the valuations. Such a language mustbe succinct to both the agents and the mechanism. Unfortunately, theresulting mechanisms are neither truthful nor do they satisfy individualrationality.

This paper suggests a general method for overcoming this difficulty.Given an intermediate language and an algorithm for computing theresults, we propose three different mechanisms, each more powerfulthan its predecessor, but also more time consuming. Under reasonableassumptions, the results of our mechanisms are at least as good as theresults of the algorithm on the actual valuations. All of our mechanismshave polynomial computational time and satisfy individual rationality.

1 Introduction

1.1 Motivation

The theory of mechanism design may be described as studying the design

of protocols under the assumption that the participants behave according∗This research was supported by Darpa grants number F30602-98-C-0214 and F30602-

00-2-0598.

Appendix G

80

Page 85: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

to their own goals and preferences and not necessarily as instructed by the

protocol. The canonical mechanism design problem can be described as fol-

lows: A set of rational agents need to collaboratively choose an outcome

o from a finite set O of possibilities. Each agent i has a privately known

valuation function vi : O → R quantifying the agent’s benefit from each

possible outcome. The agents are supposed to report their valuation func-

tions vi(·) to some centralized mechanism that chooses an outcome o that

maximizes the total welfare∑

i vi(o). The main difficulty is that agents

may choose not to reveal their true valuations but rather report carefully

designed lies in an attempt to influence the outcome to their liking. The

tool that the mechanism uses to motivate the agents to reveal the truth

is monetary payments. These payments are to be designed in a way that

ensures that rational agents always reveal their true valuations – making

the mechanism, so called, incentive compatible or truthful. To date there is

only one general technique known for designing such a payment structure,

sometimes called the generalized Vickrey auction [21], the Clarke pivot rule

[1] the Groves mechanism [5], or, as we will, VCG. In certain senses this

payment structure is unique [4, 17].

Although VCG mechanisms have many appealing properties, their in-

tractibility prevents them from being applied to complex problems like

combinatorial auctions. This intractability is twofold: Firstly, VCG mecha-

nisms require the agents to fully describe their valuation functions. Secondly,

it requires the mechanism to find the optimal allocation.

The problem of combinatorial auctions (CA) is an important example of

a mechanism design problem. In CA, the designer would like to auction a set

S of items (e.g. radio spectra licenses) among a group of agents who desire

them. As items may be substitutes (e.g. two licenses in the same place)

or complementary (e.g. licenses in two neighboring states) the valuation

of each agent may have a complex structure. A formal definition of the

81

Page 86: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

problem can be found in section 2.1.

Consider a VCG mechanism for CA: The mechanism first asks each agent

to declare her valuation function, i.e. to report a function wi : 2S → R+. It

then computes the optimal allocation and the payments of each agent.

Such a mechanism is clearly intractable. Firstly, finding the optimal

allocation is NP-hard even to approximate. Secondly, the mechanism relies

on the agents’ ability to describe their valuations in a way which is succinct

to its allocation algorithm. This ability cannot be taken for granted. For

example a naive solution will require each agent to report a vector of 2|S|−1

numbers to the mechanism. This of course is not feasible unless the number

of items is very small. On the other extreme the designer can ask the

agents to submit oracles, i.e. programs that return for every set s their

valuation vi(s). However, it is not difficult to see that in order to find the

optimal allocation or even a reasonable one, the allocation algorithm must

query these oracles an exponential number of times. The natural solution

for this problem is to introduce the notion of a bidding language. Such

a language should enable the agents to efficiently represent or at least to

approximate their valuations, but should also allow the allocation algorithm

to compute the desired allocation in polynomial time. Hopefully such a

language will capture most ”real life” valuations. Various bidding languages

were proposed in recent years. The interested reader is pointed to [11].

The drawback of this approach is that there are always valuation func-

tions which are impossible to represent in polynomial-time. We therefore

call such languages incomplete. Since VCG mechanisms with incomplete

languages are not optimal, the impossibility results of [13] imply that they

cannot be truthful! In other words, instead of describing their true valua-

tion according to the designer’s instructions, agents may have incentive to

misreport. Therefore, there is no guarantee, even when the agents are ratio-

nal, that the mechanism will find a reasonable allocation. Moreover, such

82

Page 87: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

mechanisms do not even guarantee individual rationality. That is, there are

cases where truthful agents will pay for their allocated sets more than their

actual valuations for them.

Our goal in this paper is to prevent these phenomena.

1.2 This work

This paper proposes a general method for overcoming the non truthfulness

of VCG mechanisms with incomplete languages. We first introduce the no-

tions of oracles1, descriptions and consistency checkers in the context of

VCG mechanisms. Oracles are programs that represent the agents’ valua-

tions. They are used by the mechanism to measure the agents’ welfare. A

consistency checker is a function that checks whether an agent’s description,

which is given in the intermediate language, is consistent with her oracle.

These additions to the VCG method still do not suffice to guarantee its

truthfulness.

We then describe three mechanisms which guarantee that under rea-

sonable assumptions, truth-telling is the rational strategy for the agents.

Each mechanism is more powerful but also more time consuming than its

predecessor. All of our mechanisms have polynomial computational time.

Following [13] we adopt the concept of feasibly dominant actions (FDAs).

Informally speaking, we assume that the agents choose their actions (strate-

gies) according to their strategic knowledge. We say that an action is feasibly

domiant if the agent is not aware of any circumstances where another strat-

egy is better for her. It was argued in [13] that when feasibly dominant

actions are available for the agent, it is irrational for her not to choose one

of them. It was also shown in [13] that if the payment of a non-optimal mech-

anism is calculated according to the VCG formula, the existence of FDAs

must rely on further assumptions on the agent’s knowledge. Our mecha-1Some advantages of using oracles were discussed in [19]

83

Page 88: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

nisms guarantee that, under such reasonable assumptions, truth-telling is

indeed an FDA. Each of them handles a more general form of knowledge

than its predecessor (i.e. more sophisticated agents).

When the agents are truthful, the result of our mechanisms is at least

as good as the result of the allocation algorithm on the truthfully reported

descriptions. Our mechanisms also satisfy individual rationality.

Note that our method does not make any assumptions on the algorithm

or the bidding language. The designer needs to design an intermediate lan-

guage, a consistency checker and an allocation algorithm such that, when the

agents prepare their descriptions according to her instructions, the overall

result is good. She then gets the mechanism for free.

For simplicity we prove all our theorems directly for the combinatorial

auction problem. Our results however are much more general and can be

applied to any VCG, weighted VCG or compensation and bonus [14] mech-

anism.

1.3 Related work

Non optimal VCG mechanisms were first studied in [13]. This paper dis-

cusses VCG mechanisms where the optimal algorithm is replaced by a poly-

time approximation or heuristic. This paper shows that mechanisms con-

structed this way cannot be truthful. It then proposes a general way of

dealing with this non-truthfulness using a certain form of appeal functions.

The problem of combinatorial auctions has been studied by several re-

searchers in recent years. A comprehensive survey of various aspects of

this problem can be found in [2]. In particular, various bidding languages

[3, 7, 20] and restrictions on the classes of bids that can be submitted (e.g.

[6]) were proposed. A comparative study of some of these languages can be

found in [11].

An alternative approach to the one that is taken here is to consider

84

Page 89: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

mechanisms where the agents are not required to declare their valuation

functions (non-revelation mechanisms). Examples of such mechanisms are

the simultaneous ascending auction [10] and iBundle [16]. The efficiency of

these auctions however is dependent on strong assumptions on the agents’

behaviour. They are also specifically designed to address the combinatorial

auction problem.

Finally, there is an extensive literature in the field of mechanism design.

An introduction can be found in [8, chapter 23] and [15, chapter 10]

Organization of this paper: The rest of the paper is organized as follows:

Section 2 formally defines combinatorial auctions and VCG mechanisms for

CA and explains their intractability. Section 3 provides an example of a

VCG mechanism with incomplete language and demonstrates the drawbacks

of such mechanisms. Section 4 defines our most basic mechanism, describes

the main concepts of [13] and shows that under reasonable assumptions on

the agents’ knowledge, truth-telling is an FDA. Sections 4 to 6 define ex-

tended versions of this mechanism and prove their basic properties. Section

7 discusses additional implementation issues and section 8 concludes the

paper.

2 Preliminaries

2.1 Combinatorial auctions (CA)

The problem of combinatorial auctions (CA) has been extensively studied

in recent years (see e.g. [7] [20] [3] [6] [11] ). The importance of this problem

is twofold. Firstly, several important applications rely on it (e.g. the FCC

auction [9]). Secondly, it is a generalization of many other problems of

interest, in particular in the field of electronic commerce. A recent survey of

various aspects of this problem can be found in [2]. For simplicity we prove

all our theorems directly for this problem.

85

Page 90: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

The problem: A seller wishes to sell a set S of items (radio spectra licenses,

electronic devices, etc.) to a group of n agents who desire them. Each agent

i has, for every subset s ⊆ S of the items, a non-negative number vi(s)

that represents how much s is worth for her. The function vi(.) is called

the agent’s valuation or type. We assume that vi(.) is privately known to

the agent. Given a (possibly partial) allocation s = (s1, . . . , sn) we shall

define the total welfare of the agents as g =∑

i vi(s). In this paper we will

be interested in mechanisms (protocols) which are designed to maximize

the total welfare. This goal is justified in many settings. There is also a

basic correlation between maximizing welfare and maximizing the seller’s

revenue. Solving the problem without monetary transfers is impossible (see

a discussion at [8, chapter 23]). We assume that the mechanism can ask

for payment from the agents and that the overall utility of each agent i is

ui = vi(s) + pi where s denotes the chosen allocation and pi the amount of

currency that the mechanism pays to the agent2. In an auction, pi will be

non-positive. This utility is what each agent tries to maximize.

For the sake of the example we take some standard additional assump-

tions on the type space of the agents:

No externalities The valuation of each agent depends only on the items

allocated to her. I.e. {vi(si)|s ⊆ S)} completely represents the agent’s

valuation.

Free disposal Items have non-negative values. I.e if s ⊆ t then vi(s) ≤

vi(t).

Normalization vi(φ) = 0.

Note that the problem allows items to be complementary, i.e.

vi(S⋃T ) ≥ vi(S) + vi(T ) or substitutes, i.e. vi(S

⋃T ) ≤ vi(S) + vi(T )

2This is called the quasi-linearity assumption.

86

Page 91: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

(S, T disjointed). For example an agent may be willing to pay $200 for

a TV set, $150 for a VCR, $450 for both and only $200 for two VCRs.

The structure of the valuation functions might therefore be complex. The

problem of finding an optimal allocation is equivalent to set-packing and is

NP -hard even to approximate within any reasonable factor.

Note that the valuation functions are not known to the mechanism in

advance. Moreover, if the mechanism is not carefully designed, the agents

will have an incentive to manipulate it for their own self interest. Such

manipulations might severely damage the efficiency of the mechanism. In

mechanism design problems the agents are assumed to be rational in a game

theoretic sense. They choose strategies which are good for them and not nec-

essarily act as instructed. The goal of the designer is to design a mechanism

(protocol) that produces good results under this assumption. Comprehen-

sive surveys of mechanism design theory can be found in [15, chapter 10] [8,

chapter 23].

In order to handle complex problems like combinatorial auctions the

mechanism needs to address the following issues:

• Agents’ valuations might be complex to express.

• The allocation and payments might be hard to compute.

• The mechanism needs to be designed to find good allocations even

though the agents follow their own self interest.

Let us summarize our notations and terminology regarding this problem.

Notations: We shall denote the whole set of items by S and a (possibly

partial) allocation by s = (s1, . . . , sn). Note that the sis are disjointed. We

denote the type of agent i by vi and the group’s type by v = (v1, . . . , vn).

Let pi denote the amount of currency that the mechanism pays to each agent

i and ui the agent’s utility. Given an allocation s and a type v we denote by

87

Page 92: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

gs(v) the welfare∑

i vi(si). Finally we shall use the following vectorial nota-

tion: given a vector a = (a1, . . . , an) we let a−i = (a1, . . . , ai−1, ai+1, . . . , an)

and (bi, a−i) denote the vector (a1, . . . , ai−1, bi, ai+1, . . . , an).

2.2 VCG mechanisms for CA

One of the major achievements of mechanism design theory is the VCG

method for constructing truthful mechanisms. In this subsection we briefly

describe these mechanisms for CA and discuss some of their properties.

The simplest kind of mechanisms are protocols (called revelation mecha-

nisms) where the agents are simply required to (privately) report their types

to the mechanism. According to these declarations the mechanism computes

the allocation and the payments. Note that agents may lie if it is beneficial

for them. Such a mechanism can be denoted by a pair m = (k(w), p(w))

where k denotes the allocation function, p the payment function and w the

agents’ declaration .

Definition 1 (truthful mechanism) A revelation mechanism is called

truthful if truth-telling is a dominant strategy for all agents. I.e. if lying to

the mechanism can never be more beneficial than declaring vi.

VCG mechanism are a special kind of revelation mechanisms.

Definition 2 (VCG mechanism) A VCG mechanism for CA is a reve-

lation mechanism m = (k(w), p(w)) such that:

• The mechanism chooses an allocation s = k(w) that maximizes the

total welfare gs(w) according to the declaration w.

• The payment is calculated according to the VCG formula: pi(w) =∑j 6=iw

j(s)) + hi(w−i) (hi(.) can be any real function of w−i).

Theorem 2.1 ([5]) A VCG mechanism is truthful.

88

Page 93: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

Proof: Assume by contradiction that the mechanism is not truthful. Then

there exists an agent i of type vi, a type declaration w−i for the other

agents, and wi 6= vi such that vi(k((vi, w−i))) + pi((vi, w−i)) + hi(w−i) <

vi(k((wi, w−i))) + pi((wi, w−i)) + hi(w−i). Let s = k((vi, w−i)) denote the

chosen allocation when the agent is truthful and let s′ = k((wi, w−i)). The

above inequality implies that gs((vi, w−i)) < gs′((vi, w−i)). This contradicts

the optimality of k(.).

Rational agents will therefore reveal their true type to the mechanism.

Thus, when agents are rational the mechanism will result in the optimal

allocation!

Note that the main trick of this method is to identify the utility of

truthful agents with the declared total welfare. Similar techniques were

introduced in [14] for handling different type of problems. The results pre-

sented here are applicable to their methods as well.

Another desirable property of mechanisms is called individual rationality.

This means that the utility of a truthful agent is guaranteed to be non-

negative. A special kind of VCG mechanism called Clarke’s mechanism

[1] can guarantee this property. It also guarantees that the payment of

agents who are not allocated any object is zero. It does so by setting hi =

−∑

j 6=i k(w−i) where k(w−i) denotes the result of the algorithm when agent

i is ”ignored”. Until section 7 we shall only be interested in truthfulness.

Thus, for simplicity we can assume that hi(w−i) ≡ 0.

It is worth notifying that weighted VCG mechanisms are possible as well

(see e.g. [17] [14]). Also the designer can impose her own preferences by

”pretending” to be one of the agents. To date VCG is the only general

known method for the construction of truthful mechanisms. There is also

some evidence [17] that other methods are generally impossible.

89

Page 94: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

2.3 The intractability of VCG mechanisms

Although VCG mechanisms have many desirable properties, their essential

intractability prevents them from being used for complex problems like CA.

This intractability is twofold: VCG mechanisms require the agents to fully

describe their valuation functions and require the mechanism to find opti-

mal allocations.

The second aspect has been extensively discussed in [13]. This paper

discusses VCG mechanisms where the optimal algorithm is replaced by a

poly-time approximation or heuristics. It shows that mechanisms which are

constructed in this way cannot be truthful. The paper proposes a method

to overcome this non-truthfulness. It suggests a bounded rationality variant

of truthfulness called feasible truthfulness and shows that under reasonable

assumptions there is a general way of constructing poly-time feasible truthful

mechanisms.

An even more fundamental obstacle on the way to the application of VCG

mechanisms (and revelation mechanisms in general) to complex problems is

the fact that the agents are required to describe their valuation functions

to the mechanism. Consider for example a VCG mechanism for CA. One

natural way in which an agent can describe her valuation function to the

mechanism is by reporting a vector of numbers denoting her valuation for

every possible combination of items. This however is infeasible unless the

number of items is very small as it will require a vector of size 2|S| − 1. On

the other extreme, the designer can ask the agent to construct an oracle, i.e.

a program that returns for every set s the agent’s valuation vi(s). However

it is not difficult to see that in order to find the best allocation or even

a reasonable one, the algorithm needs to query the oracle an exponential

number of times.

The natural solution for this problem is to introduce the notion of a

90

Page 95: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

bidding language (see e.g. [11]) – a language that will enable agents to effi-

ciently represent or at least approximate their valuations but will also allow

the mechanism’s algorithm to compute the desired allocation in polynomial

time. Hopefully such a language will capture most ”real life” valuations.

In addition the designer must provide the agents with instructions of how

to construct these descriptions from their actual valuations. Given such a

language L we can define VCG mechanisms as before. The bidding lan-

guage and allocation algorithm must be constructed in a way that when the

agents follow the designer’s instructions, the results will be good (heuristi-

cally, within a certain factor from the optimum etc.)

The problem with this approach is that there are always valuation func-

tions which are impossible to represent in polynomial-time. We therefore

call such languages incomplete. As such a mechanism is not optimal, the

impossibility results in [13] imply that VCG mechanisms with incomplete

languages cannot be truthful! In other words, agents may have incentives

not to follow the designer’s instructions. Therefore there is no guarantee,

even when the agents are rational, that the overall results will be good.

Moreover, such mechanisms do not even guarantee individual rationality.

That is, there are cases where truthful agents will pay for their allocated

sets more than their actual valuations for them.

In this paper we propose a general method for overcoming this non-

truthfulness. Our solution is in the same spirit of [13]. However several

additional steps are needed to guarantee the good game theoretical proper-

ties of the resulting mechanisms.

3 Example VCG with OR bids

In this section we describe a simple example for a VCG mechanism with

an incomplete bidding language. We shall use this example throughout

91

Page 96: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

A B ABAgent1: 1 1 1.25 (2)Agent2: 0.8 0.8 1.2 (1.6)

Figure 1: Type matrix for the OR example

the paper. We first describe the language and the mechanism. Then we

analyze what strategies rational agents might choose when participating in

it. Note that our language is less expressive than what we expect from real

life mechanisms. We will demonstrate that even with such a language it is

possible to construct mechanisms where truth-telling is the rational strategy.

Following [11] we define an atomic bid to be a pair (s, p) where s ⊆ S is

a set of items and p is a price. The semantic of such a bid is ”my maximum

willingness to pay for s is p”. A description in this language consists of a

polynomial number of such pairs. Given such a description (sj , pj) we can

define, for every set s, the price ps to be the maximal3 sum of pjs such that

sj ⊆ s are disjointed: max{∑

j pj |(sj ⊆ s) and ∀j 6= k, sj⋂sk = φ}. This so

called OR language was used in [20].

Proposition 3.1 [11] OR bids can represent only super-additive valuation

functions.

The OR language therefore assumes that if an agent is willing to pay

up to PA for item A and PB for item B, then she is willing to pay at least

(PA + PB) for both.

Consider now the following (toy) example of a VCG mechanism: There

are only two items A and B. As shown in figure 3, the type of Agent1 is

(1, 1, 1.25) and of Agent2 is (0.8, 0.8, 1.2).3For the sake of the example we ignore the fact that computing this maximum might

be NP -hard.

92

Page 97: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

Suppose that the designer instructs the agents to submit their true

valuation for every singleton. In this case we can define a description

di = {(sj , pj)} as truthful if for every item j, pj = vi(j). In other words,

such a description was prepared according to the designer’s instructions.

Consider a VCG mechanism with this language. After the descriptions are

reported, the mechanism allocates the items optimaly (according to the de-

scriptions but not to the actual allocations!). It then calculates the payments

according to the VCG formula. We assume that the designer has a small

reserved price for each item, so objects which are not desired by the agents

are not allocated.

In the example, when both agents are truthful, the mechanism will assign

the valuation in brackets to the set AB (see figure 3). The mechanism in this

case will allocate both items to Agent1 resulting in a utility of ui = 1.25 for

each agent (recall that we assume the simplified form where hi ≡ 0). The

optimal allocation will allocate to each agent one item, resulting in a welfare

of 1.8.

The above mechanism is not truthful. For example if Agent1 ”gives up”

item B and declares (1, 0, 1.25) while Agent2’s declaration remains the same,

it will cause the algorithm to produce the optimal result and therefore will

increase Agent1’s utility to 1.8! The same is true for Agent2. On the other

hand if both agents are ”giving up” the same item, only one item will be

allocated (to Agent1). This will result in a welfare of only 1.0. We shall

call a declaration where the agent reports a 0 value on one of the items

singleton concession. Another reasonable strategy for an agents is to find

a description which will bring the mechanism’s interpretation as close as

possible to her actual valuation. Formally we define the l∞-approximation

of vi(.) to be the description that minimizes maxs |vi(s) − di(s)|4. Such a4For the sake of the example we ignore the fact that calculating such a description

might be NP-hard.

93

Page 98: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

description for Agent1 is {2/3, 2/3, 4/3} . Note that the worthwhileness

of such declarations is highly dependent on the declarations of the others.

There are cases where such declarations will considerably improve the result

of the algorithm and therefore will increase the agent’s utility. On the other

hand there are many cases where such designated ”lies” will severely damage

the total welfare and henceforth the agent’s utility.

Note that the Clarke version of the above mechanism does not satisfy

individual rationality. For example, if both agents are truthful, Agent1 gets

both items, but pays 1.6, thereby loosing 0.35.

In this paper we will try to prevent these bad phenomena from happen-

ing.

4 Mechanism1

In this section we describe our first and most basic mechanism. We first

describe the building blocks of the mechanism – oracles, descriptions and

consistency checkers. Then we define the mechanism and formulate its basic

properties. Finally we show that under reasonable assumptions truth-telling

is the rational strategy for the agents.

We start with a formal definition adopted from [13] of computationally

bounded algorithms5.

Definition 3 (algorithm of degree d) Let n denote the number of agents.

We say that a function F is of degree d if its running time is bounded by

some polynomial of degree d of n.

Our mechanism fixes a constant c = O(nd) and terminates each function

that runs more than c time units (see section 7 for more details).5There are several alternative definitions. This one simplifies the formalization of the

results.

94

Page 99: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

4.1 Oracles and valid descriptions

All the mechanisms described in this paper ask the agents to prepare oracles

that represent their valuation functions. These oracles are queried by the

mechanisms in order to measure the total welfare. Formally:

Definition 4 (oracle) An oracle is a function w : 2S → R+. It is called

truthful for agent i if wi(s) = vi(s) for every set s.

We shall assume that agents are capable of preparing such oracles6. We

also assume that all the oracles are of degree d.

As mentioned earlier, it is hard for allocation algorithms to work with

oracles. We assume that the allocation algorithm accepts as input descrip-

tions in some bidding language (e.g. the OR language) and ask the agents

to prepare such descriptions. A consistency checker verifies that the agents’

descriptions are consistent with their oracles.

Definition 5 (valid description) A consistency checker is a function

ψ(w, d) such that:

• ψ(w, d) gets an oracle w and a description d in the bidding language

and returns a ”corrected” oracle w′.

• for every oracle w there exists at least one description d such that

w = ψ(w, d). Such ”fixpoint” descriptions are called valid.

Semantically, a valid description was prepared according to the designer’s

instructions. Since the mechanism can always use the ”corrected” oracle w′

we shall assume that agents’ descriptions are valid. We also assume that a

consistency checker of degree d is available to the designer and that given a6The tools which must be provided by the designer in order to make this assumption

realistic are not discussed in this paper.

95

Page 100: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

declaration d, the designer can compute an oracle wd such that d is a valid

description of wd. We say that an agent’s description is truthful if it is a

valid description of a truthful oracle.

In the OR language for example we can define w′(s) = p for every atomic

bid (s, p) in the description. Creating an oracle w from a description d such

that d is valid is straight forward.

4.2 Appeal functions

Another basic building block of our mechanism is the notion of appeal func-

tions. This is a modification of the appeals that were introduced in [13].

Intuitively an appeal function lets an agent incorporate her own knowledge

about the algorithm into the mechanism. The idea is that instead of declar-

ing a falsified type, the agent can follow the designer’s instructions and ask

the mechanism to check whether the false description would have lead to

better results. The mechanism will then choose the better of these two

possibilities leveraging both the agent’s utility and the total welfare.

Definition 6 (appeal) An appeal function gets as input the agents’ oracles

and valid descriptions and returns a tuple of alternative descriptions. I.e.

it is of the form: l(w1, . . . , wn, d1, . . . , dn) = (d′1, . . . , d′n) where di is a valid

description of wi.

Note that the d′is do not have to be valid. The semantics of an appeal l is:

“when the agents’ type is w = (w1, . . . , wn) and is described by (d1, . . . , dn),

I believe that the output algorithm k produces a better result if it is given

d′ instead of the actual description d”.

We assume that all appeal functions are of degree d for some reasonable

value of d. In section 7 we will discuss ways to enforce such a limit.

In our OR example (section 3) an appeal for Agent1 might try to give

up one of the items (i.e. perform a singleton concession) or try to give up

96

Page 101: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

item A for herself and B for Agent2 etc.

The actual implementation of the appeal functions is discussed in sub-

section 7.

4.3 Mechanism1

We can now define our first mechanism.

Definition 7 (mechanism1) Given an allocation algorithm k(d), and a

consistency checker for the bidding language we define mechanism1 as fol-

lows:

1. Each agent submits to the mechanism:

• An oracle wi(.).

• A (valid) description di.

• An appeal function li(.).

2. Let w = (w1, . . . , wn), d = (d1, . . . , dn). The mechanism computes the

allocations k(d), k(l1(w, d)), . . . , k(ln(w, d)) and chooses among these

allocations the one that maximizes the total welfare (according to w!).

In other words, the mechanism tries all the appeals and chooses the

one that yields the best result.

3. Let s denote the chosen allocation. The mechanism calculates

the payments according to the VCG formula: pi =∑

j 6=iwj(s) +

hi(w−i, d−i, l−i) (hi(.) can be any real function).

Note that hi(.) is independent of agent i. Until section 7 we simply assume

that it is always zero. Note also that we do not require the allocation

algorithm k(.) to be optimal. It can be any polynomial time approximation

or heuristic.

97

Page 102: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

An action (strategy) in mechanism1 is a triplet (wi, di, li). We say that

such an action is truthful if wi is truthful. The following two observations

are key properties of the mechanism:

Proposition 4.1 Consider mechanism1 with an allocation algorithm k(.).

Let d = (d1, . . . , dn) denote the agents’ descriptions. If all the agents are

truth-telling, the allocation chosen by the mechanism is at least as good

as k(d).

Proposition 4.2 If the allocation algorithm k, the appeal functions, oracles

and consistency checkers are of degree d, then the mechanism is of degree

d+ 2.

Let s denote the chosen allocation. Let v = (vi, w−i). Since we assume

that hi() = 0, the utility of agent i equals gs(v) – the total welfare when the

allocation is s and the type is v . Lying to the mechanism, i.e. submitting an

oracle wi 6= vi, is thus beneficial for the agent only if it causes the mechanism

to compute a better result (relatively to v). (For a more comprehensive

discussion see [13].) Note that when an agent lies to the mechanism, she

may not only cause damage to the algorithm’s result, but may also cause

the mechanism to prefer the wrong allocation on the second stage. Thus,

an agent needs to have a good reason for lying to the mechanism.

We will show that under reasonable assumptions on the agents, truth-

telling is the rational strategy for the agents. Thus, when the agents are

rational, the result of the mechanism is at least as good as the result of the

allocation algorithm on the truthful descriptions.

98

Page 103: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

4.4 An example

Consider the OR example of section 3. Suppose that Agent1 notices that

usually the result of the algorithm improves when she is giving up item A.

In a VCG mechanism the agent may be tempted to misreport in order to

increase the total welfare and henceforth her own utility. In many cases how-

ever this will cause damage to the overall welfare and henceforth to Agent1.

In our mechanism Agent1 can, instead of lying, declare her true type to the

mechanism and ask it to check whether such a lie would have been helpful.

If so, it prefers the result that was obtained by ”lying”. Otherwise, the

mechanism prefers the result of the algorithm on the truthful description

and thus prevents the damage that would have been caused by the lie. This

form of appeal functions provides the agents with a lot of power. Suppose,

for example, that Agent1 notices that the result improves if she gives up

item A while Agent2 is giving up item B. As before, the agent can ask the

mechanism to check whether such a transformation of the input would have

improve the overall result.

We note that not every knowledge of the agent about the allocation

algorithm k(.) can be exploited in this mechanism. Suppose that Agent1

notices that when both agents submit l∞-approximations of their valuations

the overall result improves. However, as she is given an oracle for v2, she

cannot compute Agent2’s approximation as it requires her to query the

oracle for every possible subset. Therefore, she cannot exploit her knowledge

about the algorithm. Such phenomena is problematic and do not occur in

the setting of [13].

4.5 When is it rational to tell the truth to the mechanism?

It was shown in [13] that even with full descriptions available, non-optimal

VCG mechanisms cannot be truthful (unless they produce unreasonable re-

99

Page 104: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

sults). That paper introduces a bounded rationality variant of the concept

of dominant strategies called feasible dominance and shows that under rea-

sonable assumptions truth-telling is feasibly dominant for the agents. This

paper follows this pattern. In this section we first describe the basic con-

cepts of [13]. We then consider mechanis1 and analyze the conditions under

which truth-telling is feasibly dominant for the agents.

4.5.1 Feasibly dominant actions (FDAs)

In this section we briefly describe the main concepts of [13]. The reader is

referred to this paper for a more comprehensive discussion.

Notations: We denote the action (strategy) space of agent i by Ai. Given a

tuple a = (a1, . . . , an) of actions chosen by the agents, we denote the utility

of agent i by ui(a).

In mechanism1 an action for the agent is a triplet (wi, di, li).

In classical game theory, given the actions of the other agents a−i, the

agent is (implicitly) assumed to be capable of responding by the optimal ai.

As the action space is typically very complex, this assumption is not natural

in many real-life situations. The concept of feasibly dominant actions re-

formulates the concept of dominant actions under the assumption that the

agent has only a limited capability of computing her response. It is meant

to be used in the context of revelation games.

Definition 8 (strategic knowledge) Strategic knowledge (or response

function) of agent i is a partial function bi : A−i → Ai.

Knowledge is a function by which the agent describes (for herself!) how

she would like to respond to any given situation. The semantics of ai =

bi(a−i) is “when the others’ actions are a−i, the best action which I can

think of is ai”. The fact that a−i is not in the domain of bi means that the

100

Page 105: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

agent does not know how to respond to a−i or alternatively will not regret

her choice of action when the others played a−i. Naturally we assume that

each agent is capable of computing her own knowledge and henceforth that

bi is of degree d.

Definition 9 (feasible best response) An action ai for agent i is called

feasible best response to a−i if either a−i is not in the domain of the agent’s

knowledge bi or ui((bi(a−i), a−i)) ≤ ui(a).

In other words, other actions may be better against a−i but at least

when choosing her action the agent was not aware of these.

The definition of feasibly dominant actions now follows naturally.

Definition 10 (feasibly dominant action) An action ai for agent i is

called feasibly dominant if it is a feasible best response against any a−i. We

also call such an action FDA .

It was argued in [13] that if an agent has feasibly dominant actions

available, then it is irrational not to choose one of them.

4.5.2 When is it rational to tell the truth to the mechanism?

Recall that the overall utility of each agent i equals gs(v) where s denotes

the chosen allocation and v = (vi, w−i). It is not difficult to see that when

the agent declares a falsified valuation, there are cases where she will con-

sequently lose. The agent needs therefore a good reason for lying to the

mechanism. When the appeals of the agents are time-limited (i.e. of degree

d) it was shown in [13] that the existence of FDAs for the agents must rely on

further assumptions on the agents’ knowledge. Here we formulate two such

assumptions and show how to construct computationally efficient truthful

FDAs for the agents.

101

Page 106: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

Definition 11 ([13]) (declaration based knowledge) Knowledge bi(.) is

called declaration based if it is of the form bi(w−i, d−i) = (wi, di).

The semantics of declaration based knowledge is: “If I knew that the

others declare (w−i, d−i), regardless of their appeals, I would like to declare

(wi, di)”. In our OR bids example of section 3, such knowledge for Agent1

may be: ”If Agent2 has a high valuation for item B, I would like to give it

up”.

A declaration based knowledge naturally defines an appeal function

which we also denote by bi(.): bi(w, d) = (bi(w−i, d−i), d−i).

Theorem 4.3 If bi(.) is a declaration based knowledge for agent i then

(vi, di, bi) is feasibly dominant for the agent.

Proof: Let s denote the chosen allocation. Let v = (vi, w−i). Recall that

the utility of agent i equals gs(v). Also let φ denote the empty appeal.

Assume by contradiction that there exists a−i = (w−i, d−i, φ−i) that con-

tradicts the agent’s knowledge. Note that the appeals of the other agents

can be assumed empty and also that it must be that a−i is in the domain of

bi(.). Let (w′i, d′i) = bi(a−i). Let s = k(d) and let s′ = k(d′i, d−i) denote the

allocation when she lies. By the assumption, gs(v) < g′s(v). However when

the agent truthfully submits (vi, di, bi) the mechanism computes s = k(d)

and s′ = k(d′i, d−i) and takes the better among them according to v. A

contradiction.

Definition 12 ([13])(appeal independent knowledge) Knowledge bi(.)

is called appeal independent if it is of the form bi(w−i, d−i) = (w′i, d′i, li).

Theorem 4.4 If bi(.) is an appeal independent knowledge of agent i then

there exists a truthful FDA of degree d for the agent.

102

Page 107: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

Proof: Define an appeal li as follows. Given (w−i, d−i) let (w′i, d′i, l′i) =

bi(w−i, d−i). li computes k(d′i, d−i), k(l′i((w′i, w−i), (d′i, d−i))) and takes the

best according to (vi, w−i). Since all the functions involved are of degree d,

so is li(.). Similarly to theorem 4.3, (vi, li) is an FDA.

The semantics of declaration based knowledge is the same as declaration

based except that the agent also submits an appeal li.

Agents who are not capable of reasoning about others’ appeals or do

not want to count on them would have appeal independent knowledge. We

argue that this would be the most common case. In all of the examples of

section 4.4 the agents’ knowledge was appeal independent.

5 Mechanism2: moving information around

A major difficulty that arises when coping with incomplete languages is

the asymmetric knowledge of the agents regarding their own valuations.

For example, in the setting of section 3, it is reasonable to assume that

Agent1 can compute her own l∞-approximation but Agent2 cannot compute

it. Thus, Agent1 might face the following considerations:

• The result of the algorithm improves significantly when all agents re-

port their l∞-approximations.

• Reporting my l∞-approximation instead of my truthful description,

will enable Agent2 to compute the optimal result.

In other words, in mechanism1, agents may want to misreport in order to

pass useful information about their own valuation to the others. In order to

prevent this we modify the mechanism to allow the agents to convey such

information.

Definition 13 (information structure) An information structure Ii

for agent i is a sequence of descriptions (possibly with repetitions)

103

Page 108: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

(d0, d1, . . . , dk) such that d0 is a valid description.

In addition we require each agent to provide for each dj an example

(w−i, d−i) such that k(dj , d−i) is a better allocation than k(d0, d

−i). This is

done in order to force the agents to submit only useful information.

Ii contains additional information that the agent can pass to the others’

appeals. The semantics of Ii is ”My valid description is d0. Nevertheless, I

suggest that you first try to work with d1, after that with d2, etc”. Many

alternative ways to define such information structures are possible. It may

be interesting to compare between different structures.

We can now define our second mechanism.

Definition 14 (mechanism2) Given an allocation algorithm k(d), and

consistency checker for the bidding language we define mechanism2 as fol-

lows:

1. Each agent submits to the mechanism:

• an oracle wi. (let w = (w1, . . . , wn))

• an information structure Ii. (let I = (I1, . . . , In))

• an appeal function of the form li(w, I) = d.

2. The mechanism computes the allocations

k(d), k(l1(w, I)), . . . , k(ln(w, I)) and chooses among these alloca-

tions the one that maximizes the declared total welfare.

3. The mechanism calculates the payments according to the VCG formula.

We can now expand the definition of knowledge under which the exis-

tence of truthful FDAs is guaranteed. This definition refers to knowledge

that was obtained by checking a representative family of (tuples of) appeals

of the other agents.

104

Page 109: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

Definition 15 [13] (d-obtainable knowledge) Knowledge bi(.) is called

d-obtainable if the following holds:

1. bi is of degree d.

2. Every appeal function that appears in the domain or in the range of

bi(.), is of degree d.

3. There are at most nd appeal functions that appear in the domain or

in the range of bi(.). Moreover there exists a representative family Li

of no more than nd (n− 1)-tuples of appeals such that for every tuple

ϕ−i that appears in the domain of bi there exists a ψ−i ∈ Li such that

for all (w−i, I−i), bi(((w−i, I−i), ϕ−i)) = bi(((w−i, I−i), ψ−i)).

The assumption that agents’ knowledge is d-bounded is justified by the

immense complexity of the appeal space. It assumes that an agent cannot

think about more than a small family of representative cases Li. For a more

comprehensive discussion on this assumption see [13]. We need an additional

assumption on the appeal class that the agent considers. We will remove

this assumption later on.

Definition 16 (monotonic appeal) We say that an appeal function l(.)

is monotonic if for every w and for every two structures I = (I1, . . . , In)

and I ′ = (I ′1, . . . , I ′n) such that Ij is a subset of I ′j for all j, k(l(w, I ′)) is

at least as good as k(l(w, I ′)).

In other words, giving more information to the appeal can just help it

to compute a better result. We cannot expect the appeals to be monotonic

as such monotonicity usually requires exponential time. However, it is rea-

sonable to think that appeals will be monotonic in general, that is that the

addition of useful information and, in particular, of truthful descriptions,

105

Page 110: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

usually helps the appeals to improve the overall result. Changing the order

of the dis in the information structures does not affect monotonic appeals.

Definition 17 (monotonic d-obtainable knowledge) Knowledge for

agent i is called monotonic d-obtainable if it is d-obtainable and all the

appeals that appear in its domain or in its range are monotonic.

Theorem 5.1 If the agent’s knowledge is monotonic d-obtainable, she has

a truthful FDA of degree 3 · d.

Proof: Let Ii denote a maximal sequence of useful information that an

agent i can compute (i.e. it contains all the cases that the agent finds

useful). Let bi be a d-obtainable knowledge for agent i. Given (w−i, I−i) we

shall define an appeal li as follows: Let L be the family of all appeals that

appear in the domain or in the range of bi. Let Li be the representative

family. We define ω to denote the set of all the ”useful lies” ω = {wi|∃ψ−i ∈

Li, ϕis.t.(wi, Ii, ϕi) = bi(w−i, I−i, ψ−i)}. Obviously |W |, |L| are bounded by

a polynomial of degree d.

For every pair (wi ∈ ω, l ∈ L) we let li compute the result of l as if she

had submitted (wi, Ii, l), i.e. compute k(l(wi, w−i), (Ii, I−i)). The appeal

returns the best of these allocations according to (vi, w−i).

As all the functions involved are of degree d, it is not difficult to verify

that the appeal is of degree 3 · d.

We now show that submitting (vi, Ii, Li) is an FDA. Otherwise there ex-

ists a triplet (w−i, I−i, l−i) that contradicts bi(.). Since bi(.) is d-obtainable

we can assume that l−i is in the representative family. Let (wi, ιi, δi) =

bi(w−i, I−i, l−i). Because of the monotonicity we can assume that ιi con-

tains all the useful information that i can think of (i.e. ιi = Ii). However

the appeal li checks the case where i submits (wi, Ii, δi). Therefore lis result

must be at least as good as the result of the mechanism in this case – a

contradiction.

106

Page 111: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

6 Mechanism3: adding meta-appeals

In section 5 we assumed that the agents’ appeals are monotonic. Our final

step is to get rid of this assumption. We first define the notion of a meta-

appeal.

Definition 18 (meta appeal) A meta appeal is a function that gets a

vector of information structures I = (I1, . . . , In) and returns a list of vectors

of the form I ′ = (I ′1, . . . , I ′n) such that I ′j is a subset of Ij.

In other words, the meta appeals compute a list of alternative informa-

tion structures for the group. Note that many variants of this definition are

possible. We assume that all the meta-appeals are of degree d.

Definition 19 (mechanism3) Given an allocation algorithm k(d), and a

consistency checker for the bidding language we define mechanism3 as fol-

lows:

1. Each agent submits to the mechanism:

• An oracle wi. (let w = (w1, . . . , wn))

• An information structure Ii. (let I = (I1, . . . , In))

• An appeal function of the form li(w, I).

• A meta appeal χi(.).

2. The mechanism computes a list Γ containing all the results of the meta-

appeals as well as the original tuple of information structures I.

3. The mechanism computes, for every pair (lj , I ′) such that I ′ ∈ Γ and lj

is an appeal, the allocation k(lj(w, I ′)). It also computes k(d). It then

chooses among these allocations the one that maximizes the declared

total welfare.

107

Page 112: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

4. The mechanism calculates the payments according to the VCG formula.

Note that the mechanism is of degree d+ 3.

We can now define d-obtainable knowledge similarly to the previous sec-

tion. We add however the condition that it ignores the meta appeals of the

other agents.

Definition 20 [13] (d-obtainable knowledge of mechanism3) We say

that knowledge bi(.) of mechanism3 is d-obtainable if the following holds:

1. bi(.) is of degree d.

2. bi(.) ignores the meta-appeals of the other agents, i.e it is of the form

bi(w−i, I−i, l−i) = (wi, Ii, li).

3. Every appeal function that appears, in the domain or in the range of

bi(.), is of degree d.

4. There are at most nd appeal functions that appear in the domain or

in the range of bi(.). Moreover there exists a representative family Li

of no more than nd (n− 1)-tuples of appeals such that for every tuple

ϕ−i that appears in the domain of bi(.) there exists a ψ−i ∈ Li such

that for all (w−i, I−i), bi(((w−i, I−i), ϕ−i)) = bi(((w−i, I−i), ψ−i)).

The main justification behind the assumption that bi ignores the meta-

appeals is that the space of meta-appeals is extremely complex. Moreover,

properties of the meta-appeals are only partially connected to the actual

bidding language or the algorithm. The only potential profit from lying

that we can imagine are ”extra-trials” of the allocation algorithm when the

others’ appeals are forced to use the agent’s false description. We presume

that such potential gains are negligible compared to the obvious loss caused

by lying. It is also natural to think that if the appeals of the other agents

108

Page 113: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

ignore the agent’s recommendation to use d1, they have a good reason to do

so. We argue that knowledge which is not d-obtainable is unlikely to exist.

However, this ”thesis” needs to be checked experimentally.

Theorem 6.1 If the agent’s knowledge is d-obtainable, then she has a truth-

ful FDA (vi, I, li, χi) such that li is of degree 3 · d and χi of degree d.

Proof: Similarly to the proof of 5.1, given (w−i, I−i), we define the set of

”useful lies” ω = {wi|∃ψ−i ∈ Li, ϕis.t.(wi, Ii, ϕi) = bi(w−i, I−i, ψ−i)}, and

the family L of appeals which appear in bi(.). In addition we define the

set of useful information structures χi = {Ii|∃ψ−i ∈ Li, ϕis.t.(wi, Ii, ϕi) =

bi(w−i, I−i, ψ−i)}. We define I to be a union of all I ∈ χi, an appeal li like

in the proof of 5.1. The proof that (vi, I, li, χi) is an FDA is similar to 5.1.

6.1 Example: Mechanism3 with OR bids

Consider mechanism3 for CA with OR bids (section 3). Suppose that Agent1

notices the following phenomena:

1. When all agents perform l∞-approximations the result of k(.) usually

improves considerably.

2. The result also typically improves if agents perform singleton conces-

sions on different items. The improvement however is less significant

than in the first case.

Such an agent may anticipate three kinds of appeals:

• Appeals of agents that notice the first phenomenon and will therefore

leverage from her l∞-approximation.

• Appeals of agents who notice only the second phenomenon and will

only be disturbed by her l∞-approximation.

109

Page 114: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

• Appeals that will work best with her valid description.

Mechanism3 gives Agent1 the possibility of constructing a strategy that

will dominate every case that she can think of! She just needs to include

in her meta appeal three information structures. One that includes only her

valid description, one that will include her singleton concessions as well and

one that will also include her l∞-approximation.

We note that there exist additional ways to justify why truth-telling

is the rational strategy for the agents. Those are omitted from the paper

mainly due to space constraints.

7 Other implementation issues

In this section we address two additional issues which a designer may face

when implementing our mechanisms: guaranteeing individual rationality

and forcing reasonable time limitations on the agents.

In [13] it was shown that the allocation algorithm can be transformed

in polynomial time to an algorithm which satisfies additional monotonicity

requirements. With such an algorithm it is possible to define the function

hi(.) of our mechanisms similarly to Clarke’s mechanism [1]. The proof that

the resulting mechanisms satisfy individual rationality is similar to [13].

This paper shows that if enough computational time is given to the

agents, they can construct truthful FDAs. On the other hand the mechanism

needs to find a way to enforce reasonable time limits on the computational

time of the agents, i.e. to enforce time limits on the appeals and meta

appeals. This issue was discussed in [13]. In particular it was suggested that

knowledge-reflecting structure will be chosen for description of the appeal

functions. Such a structure enables the limitation of the computational time

of the appeals according to the agents’ own limitations and thus preserves

the existence of truthful FDAs. We presume that severe limitations can

110

Page 115: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

be imposed on the length of the lists produced by the meta appeals while

preserving the existence of FDAs. Finally, we think that it is a good heuristic

to charge small fees for extra computational time.

At a first glance our protocols may seem to put a lot of burden on the

agents. However we argue that with the right tools (e.g. tools for building

oracles), mechanisms like ours can become even more ”agent friendly” than

non-revelation mechanisms.

8 Conclusions and further research

In this paper we propose a general way to overcome the deficiencies of VCG

mechanisms with incomplete languages. Given an intermediate language, a

consistency checker, and an algorithm for the computation of the outcomes

(e.g. allocations) we construct three mechanisms, each more powerful but

also more time-consuming than its predecessor. All our mechanisms have

polynomial computational time and satisfy individual rationality.

We adopt the strong concept of feasible dominant strategies of [13] which

is a bounded rationality version of dominant strategies and showed that un-

der reasonable assumptions on the agents’ knowledge, truth-telling is feasibly

dominant for the agents. In addition when an agent lies to the mechanism,

there are cases where she will consequently lose.

When the agents are truth-telling the results of our mechanisms are at

least as good as the mechanisms’ algorithm. Our methods are general and

can be applied to any VCG , weighted VCG or compensation and bonus [14]

mechanism.

The paper assumes that in practice, agents will have only limited knowl-

edge and thus will not be able to do better than their truthful FDAs. This

thesis can and should be checked by experiments with ”real” agents. On

the other hand we feel that this assumption will remain true even when

111

Page 116: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

severe time limitations are forced on the agents. In fact it will not even be

a surprise if even in a VCG mechanism, if the bidding language and the

allocation algorithm are reasonably designed, the agents will not be able to

do better than truth-telling! This too can be checked experimentally.

Very little is currently known about the revenue of mechanisms for com-

plex problems. In particular note that when a non-optimal VCG mechanism

is naively used for a combinatorial auction, there are even cases where the

mechanism must pay to the agents instead of vice-versa!

In our constructions, there are several tools that the designer must pro-

vide to the agents. Tools to construct oracles, descriptions, appeals etc.

Methods for providing such tools were not discussed in this paper and are

crucial for the success of our mechanisms.

Finally we note that it might be fruitful to explore the possibility of using

appeal functions in situations where the agents have budget limits. When

such limits exist, agents may have incentives to cause others to run out of

budget and it is not likely that dominant strategy mechanisms exist. One

natural way to deal with budget limits, is to truncate the agent’s valuation

to her limit and then use VCG [12]. Truth-telling in this mechanism is a

safe strategy for the agent as she never pays more than her budget. We

argue that appeals of certain forms can play the role of threats and prevent

the worth-willingness of causing others to run out of budget.

Acknowledgments: We thank Yoav Shoham for many fruitful discussions.

We thank Inbal Ronen for reviewing previous versions of this paper.

References

[1] E. H. Clarke. Multipart pricing of public goods. Public Choice, pages

17–33, 1971.

112

Page 117: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

[2] Sven de Vries and Vohra Rakesh. Combinatorial auctions: A survey.

To appear.

[3] Yuzo Fujishima, Kevin Leyton-Brown, and Yoav Shoham. Taming the

computational complexity of combinatorial auctions: Optimal and ap-

proximate approaches. In IJCAI-99, 1999.

[4] J. Green and J.J. Laffont. Characterization of satisfactory mechanisms

for the revelation of preferences for public goods. Econometrica, pages

427–438, 1977.

[5] T. Groves. Incentives in teams. Econometrica, pages 617–631, 1973.

[6] R. M. Harstad, Rothkopf M. H., and Pekec A. Computationally man-

ageable combinatorial auctions. Technical Report 95-09, DIMACS, Rut-

gers university, 1995.

[7] Daniel Lehmann, Liadan O‘Callaghan, and Yoav Shoham. Truth reve-

lation in rapid, approximately efficient combinatorial auctions. In ACM

Conference on Electronic Commerce (EC-99), pages 96–102, November

1999.

[8] A. Mas-Collel, Whinston W., and J. Green. Microeconomic Theory.

Oxford university press, 1995.

[9] J. McMillan. Selling spectrum rights. Journal of Economic Perspectives,

pages 145–162, 1994.

[10] Paul Milgrom. Putting auction theory to work: the simultaneous as-

cending auction. Technical Report 98-002, Dept. of Economics, Stan-

ford University, 1998.

113

Page 118: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

[11] Noam Nisan. Bidding and allocation in combinatorial auctions. In Proc.

of the Second ACM Conference on Electronic Commerce (EC00), pages

1–12, 2000.

[12] Noam Nisan. Private communication, 2000.

[13] Noam Nisan and Amir Ronen. Computationally feasible vcg mecha-

nisms. In Proc. of the Second ACM Conference on Electronic Commerce

(EC00), pages 242–252, October 2000.

[14] Noam Nisan and Amir Ronen. Algorithmic mechanism design. Games

and Economic Behaviour, 2001. To appear; Extended abstract ap-

peared in STOC99.

[15] M. J. Osborne and A. Rubinstein. A Course in Game Theory. MIT

press, 1994.

[16] David Parkes. ibundle: An efficient ascending price bundle auction.

In In Proc. ACM Conference on Electronic Commerce (EC-99), pages

148–157, 1999.

[17] Kevin Roberts. The characterization of implementable choise rules.

In Jean-Jacques Laffont, editor, Aggregation and Revelation of Prefer-

ences, pages 321–349. North-Holland, 1979. Papers presented at the

first European Summer Workshop of the Econometric Society.

[18] Amir Ronen. Mechanism design with incomplete languages.

http://task.stanford.edu/, 2001. (full version).

[19] Toumas Sandholm. Issues in computational vickrey auctions. Interna-

tional Journal of Electronic Commerce, 4(3):107–129, 2000.

[20] Tuomas W. Sandholm. Approaches to winner determination in combi-

natorial auctions. Decision Support Systems, to appear.

114

Page 119: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

[21] W. Vickrey. Counterspeculation, auctions and competitive sealed ten-

ders. Journal of Finance, pages 8–37, 1961.

115

Page 120: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

��������� ��� �� ���������������

����� ���������� ���� �����������|}

������� ���������

��� ��� ����

� ����������

��� ������ ����� ������ ���������� �� ������� �� ������� �� ����������� � ��������� ������� ������ ����� ��� ���������� ��������� ��������� �� ����� ���������� �� ��� ���������� ������� ������������� ����������� �� �� ������ � �������� �������� �� �� ����� ������ ������ �� � �������� �� �� �� ���� ������� �� ��� ���� � ������ ������ ������������ ����� �������� ����� �������� � ��� ������� ��� ������

� ������� ������� ��� ��� ���� ��� ������ �� � ������ ��� ������������ � ������� �����! �� ��� ���������� ���� ���� ��������� ��� �� ������� ������ ��������� �� � ������ ��� ���������������� ������� ��� �����" ���������! �� ���� ����� �� ��� ���� ������ ��� ������ ���� ������� ������ ����� ����� ����� � �� � ������ ������������ ������� ���� ���������� �� ������� ������� ����� ����� ������ ����� �#����� ������� � ��� �������

��� ���� ������� ��� ������ ���� �������� �� ������� �#���������� $� ������ �� �������� � ������ �������� �� ������ %

��������� ������ �� ��� �� �������� �������� �� ����������� ����������� �����������

|������� ������� ����������� �������� ��!�����"� �������� �� ������ ������������ ��������������

}��� ����� ���������� ���������� �������� ����� � ����� ����� ��� �!�"#�$��!%&�'����(�

Appendix H

116

Page 121: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

������� ������� �� ��� '��� ��������� ��� ����� ������ ��� ����������� ���� ���� ��� ������ ��� ������� � ���� ���� ���� �� �������� ��� ���� ��������� �(��� ��� �� �� ������� ��� ����� ��������� ��� ��� ���� ����� ���� �� ����� �� � ����� ������ ������� �� ��� �� ���� ���� ��� ��������� �� ������ � ��������� ��� ���� �������� �� �� ���������� �#���� ��� �������� ����������� �� � ��� ������� ��� ������� ���� ����������� �� ���� �������

��� �#����� ������� ����� ���� ������� ����� �� � ���� ������� ��������� ������ ��� �� )�� ���� ��������� ���������� ��������� *���� ��+ ������� ��������� �� ��� ���� ����� ,���� ��� � �������� ������ )�"� �������� ���� � ������� �������� )�� ��� �� )� ��� ������� �������� �� ��� ������� ���������� ,���"� �������� �� � ������ )�"� �������� �� ������ ����� -������ ,��� �� �� ��� )�"��������� �#������ ����� �� �� ��� ��� �� ��������� �� ����� �� ������� �������� ��� ������� ������ ������ � ������ ��� ��� ������������$� ������ ��� �#����� ������� � ��� ������ �� ���� ����������� �� � ������� ��� ������� ���������� ,���"� �������� ����� )�"� ������������ ���� ������ �� ���� � ��� ������ ���� ������� ��� �#����� �������� �� ������ ���� ��� ����� ������ ��� ������� � ���� ���� ��������� � ���� ������ �� �� ��� ������ ����� ����� � � ������ ���������� � ������ ���� ���� ����� ��� ������ �#����� �������� ����� ���� ������ ���� �#����� �� ���� ��� �� ������ ��������� �� ���������� ��������� ��� ��������� ��� ���� � ����� ����� ���� ��������� �� �� ������������� ������ ���� ��������� ����� ��������� �� ��� ��'����� ���� ������ ������ ���� � ����������� ���

� �� ������

,����� . ������ ���� )� �� ,��� ����� � � ������ ���������� ��� �� �������� �� ������� ������ ���� ��� ����� � ������ �� / � &�0�� �� �� ���"� �������� ��� ���� ��������� �� ��� �������� 1/� &2� 0�� ���� )�"� �������� ����� �� ��������� ���"� �������� �� �������������������� )�� ��� �� )� ��� ����� �� ��������� 0�� �� �� ,���"��������� ����� �� ����� � �� 3 � � ��� ������ ���� ������� � �*/� &+� ������ ���� ,��� �� �� ��� ��� ��������� ����� ������������� ��� �� ������ ��������� � ��� ����� ��� ������ � �� ������

117

Page 122: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

�����$� ������ �� �������� � ��� �������� ������� �� ��� '��� ������

���� ���� ��� ������ ��� ������� ��� ���� ������ ����� � ���� �������� � �� ��� �� � ������� ������ �� ��� ���� ������� ���� ������� ��� ������ ���� ����� ���� ��� ��� ������� ��� ���� �������� �� �� ������� % ��� ������� �����

����� � ����� � ���� � ����� ������� ����������� � ���� ���������� ������ � ��� �� ����������

$� ���� ���� ����� �� �#�������� ����������� � ���� �������� ������������ � ���� �#������

0�� ���"� �������� ��5 6�� �� � �( � � ��60�� )�"� �������� �� 6�� �� � �( � � ��60�� ,���"� �������� �� 6�� �� � �( � � �R6 ����� �� �� ��� �#��������

�� ����� ���� ,��� ���� ��� ������ �� � �� ���� ��� �� )� ������ ���������� ����� 7�� ���� �R � � �� �������� ,���"� �#����� ���( � ��� ���� ��� ������ �� � �� ��� �� )� ��� ��� ���������� �����

8���� ��� �� )� ��� ��� ��� ����� �� ��� �� � ����� ������� ��������� ��� ����� ��� ���������� ������� ���� ��� ������ ������ ����������� � ����� 9����� ���"� �������� �������� ������� ��� ���� ���������� ��� ����� ����� � � ��� $� ��� ��� ���� )�"� �������� �� �������� ������� ���� ,��� �� �� ������ ��� ������� � ���� ���� ��� ����� �� � � ��� �� �� �� ������ �������� �� ������� ��� ���� �����"����������� -���� ��� ��� �� )� ���� ��� ������ ��� ��� ��������������������� ���� ��� ��������� ���� ���� ���� �������� �� ����� ����� ��)� �'������ ,���"� �������� �� � ���� ������� � ���������� ������� ���� �� )�� ���� �� ���� ��� ������ ��� ��� �������������

$� �� �#�������� ��������� ,���"� �������� � �� ��������� ��,����� ��� '��� ���� ����� ,��� *�� ������� ����+ ��� ������ ���

������� � ���� �� ��� �� ��� ������� :����� ���� ,���"� ���������� )�"� �������� ���� �� -���� ��� ��� �� �� ��� �� � � �� 3 � ������� �� ��� ����� ����� )� ����� ����� �� �� ����� � ��� ;� ���� �������� ��� �#����� ���( �� ��� ��� �� ����������� *�� �� �������� ������� � ��������� ���������� ���+� ;� ����� � � �� 3 � ��� �#����� ���( �� ���������������� �� ��� ���� )�� �� ���� ���� �R < ��3�� 8 ,���"� �������� �� *��������������+ � �� ����� ��� ����� �� �������� �� � �� ��� ���� �����)� �����

118

Page 123: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

0.1 0.2 0.3 0.4 0.5a

-0.05

0.05

0.1

0.15

0.2- E RA + E RN

;����� &5 �*� +��*�+ �� � ������ �

�� ��� ���� ���� ,��� ������� ����� �� ��� �� � �� ���� ���� �������� �� ���� ���� ���� ��� �� ��� ����� ��� ,��� ���� ������ ����������� = � �� < �� % ��� ���� �� ��� )� �� ���� % ����� ���� ��� ������ �� ��� ���� ��������� � ��� �������� 1��� &2 % ������� �� �� ��� �� ����� �� ����� ����� � � �� �� ���� ���� ,���"������ �� � ��� ���������� �� ��� �� ��� �� �� ���� ���������� =�� ��� ���� ��� ������ �� ��� ���� ��������� � ��� �������� 1�� &2��������� � ������ ���� ��� ���� ��� ������ �� ���� )�� ��� �������� ���� ��� ������� ��� �� ��� ����� �� ��� ����� �� � )�"� �������� �� ������ ���������� = �� � ���� ���������� =� -����� ��� �#����� ���( ��=*��3�+3��� < =*����+3� �� ��� ���� �� �( ���� ���( �� ���������$� >��� ���� ���� ,��� ���� ��� �� ��� ������ �� � � �� 3 4��

$� �� ���� � ��� ���������� ��� ������"� ������� �� � ������ ���� ��� ���������� ���� �� ���� ����� < �� ���� ���� �� 3 �� � ���

������� � ��� ���������� ����� �� � ��� ������� ���� �� ����� < ��� ��� ���� ����������� ���3 4��� ��� �#����� ������� ���� ������ �� ����� � �*�+ < &�4 3 &�4� � �2�4 � ���? �� �*�+ <&�. 3 4� � @�2 3 A���. ������������� ��� ����� ������ �� ;����� & �����*�+� �*�+ �� � ������ ��

�� ������� � ��� ����� �� ������ ���������� ��� ������� ���������� ����� � ������ �#����� ������� � ��� ������ � ��� ������ �*����#������� ������ ���� &B&+ �� �� ����� � ���� �#����� ������� �

119

Page 124: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

���� ������ � *���� &B&+�

� �������

��� ���� � �� ������� �� � ��� ����� �� ����� ��� �� ���������� ����������� �� ������ ����������� ;� � ����� � �� �������� ��� ��� ���� �#�������� �� ���� ���� �� ������� ��'������ ��� �� ������ ��������� ������� ��� �� ��� ����� ������� ��� �������� ������ �� ���� ����# ���� �� ������ ��������� �� �������� � ��� ������ $��� ������� ��� �������� ��� ���� ���� �� � ��� ���������� ��� ������ ��� ����(� �� �� �������� ������ � ��� ��� ������� �� ��� �� ��������������� ��� ��� ��� ���� ���� �������� �� ����� ��� � ������ ���������� ���� � ��'�� �� ������ ���������� ��� ���������� ��� ������ ������'������ ����� �� ��� ������� ��������� �� ������ % ��� ������� ������� ��� ������� � ���� ���� ���� �� �� ����������� � ��� ���� ����� �� ������ � *���� �� ��� �#����� �������+ ��� �(������ ������� ����#����� ������� � ��� �� ���������� �� � ������ �� �� '� ���� ��������� � �(������ �� �� �������

���������

120

Page 125: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

������� ��� ������������� ��������������� ������� �!�"�����

#%$'&)(+*-,/.103254"6�27(+&"8�9 $:*;&=<><?276)&[email protected]�G�.HG�27IJ .HILK)M�,+$:(�E�N'*;$'6)N:$POQ$:K�2R(>,/IL$:6C,

E�,S276�TU.7(+&FVW6)*-DH$'(+X+*-, 4E",S276�TU.H(+&ZY J\[^]7_H`HaCb

c K�$:&�I1254"6ZYdX+G).HG)27IFeRfQN'X:gdX>,S2R6�TU.H(+&hgi$'&)Mj�$:k)(>M�27(>4=lHm�Y�l a7aHa

nBoHprqrs�t5u+qvQw�xzyS{}|U~��'�:��wh��y:w���~+��w�|���{U~>|�������� �-���H�����U���>�������'xz����xz�)���>wry:w�|����zxd r��{Uxz~+y

~>¡�{U�:w��r�d�>�}�}xz� ���h¢�£�¤¥|Uwr¦�xd�}xz~>yP~>��w�|���{U~�|§{U~�{U�'w�¨©�:�ª{Uxª«¬�>�>wryS{§� ���}w+­¯®;yQ{U�'w�'|U~S�rw��}�7�)w��'w�°:y:wC±'�}²>�d³>´}���}²µ����� �-�����U¶-·>¶¬���U¸+���'xd���¹wry'|Uxz���¹��{��>y5�'��|���º�wr�zxzw�¡'��{���{Uwr���xª{U�§{U�:w��}~+�»|U�rw�~�¡¼w �>�����:xzwr�rw�~>¡¼xzy'¡½~>|U¨���{Uxd~>yR­Cv!w��}�'~ ��{U�5��{)¢�£�¤¾|Uwr¦�xz�}xd~>y� �>y\º�w¹�'w�|Uxz¦/w �B¡½|U~+¨º�wr�zxzw�¡�¡��'�}xz~+yR­�vQwZ{U�'wry\y'~>{Uw�{U�5��{�{U�'wZ¡��'�}xd~>y¿~+��w�|���{U~>|�'w�°:y:wr�©�¿�}wr¨�xª«¬�d��{}{Uxd��w+¸���y5�¯xzyÀ�:��|}{Uxz�r�'�i��|�xz�§xd�'wr¨���~�{UwryS{ ¸"�>�}�}~S�rxd��{Uxz¦/w>¸��>y:��r~+¨�¨%�'{���{Uxd¦+w+­�¢���~>y:w��r~>y:�}w ÁS�:w�y:�rw+¸'�)w�xz�z�z�:��{}|���{Uw¹�'~��º�wr�zxzw�¡C¡��'�}xz~+y¿� ��y?º�wxª{Uw�|���{Uw �!��xª{U�'~+�'{©�'xªÃ����:�ª{;ÄS¸CxzyÅ�r~>y�{}|�����{%{U~¿º�w��dxzw�¡�|Uwr¦�xz�}xz~+y¯���:~>�}wÆxª{Uw�|���{Uxd~>y�5�����'|U~�¦/w �����5�>�z�zwry'�+xzy:�»­)ÇHxzy5���z�zÄS¸»�)wZ�'w�°:y:whº�w��dxzw�¡�²>� È��»�U�����:É����:w�|Uwr�>��¡��'�}xd~>y�'|U~��»�:�rwr�R��º�wr�zxzw�¡:��{���{Uw"��xª{U�¹¨�~>|Uw�xzy'¡½~>|U¨���{Uxd~>y�{U�5�>y�xz�R��~+�}�}wr�}�}w �¹ºSÄhw�xz{U�'w�|H~�¡xª{U�7{;��~h��|U�+�:¨�w�y�{U�r¸��»xªÊË�:�}xz~+yµ�'|U~��»�:�rwr�H����{���{Uw���xª{U�¹�dw��}�Cxdy»¡�~�|U¨���{Uxz~+y¼­�Ç:�:�}xz~>y�>y:�\�'xªÊË�'�}xd~>y¿��|Uw¹��Ä»¨�¨�w�{}|Uxz�h~>��w�|���{U~�|U�r¸Ë�>y:�Æ{U~+�>w�{U�:w�|h�»w�°5y'w¹�§�'xz��{}|Uxzº:�»{Uxz¦/w�d��{}{Uxz�rw+­

Ì Í�Î\Ï)Ð�Ñ ÒÅÓQÔhÏ)Õ�ÑÆÎÖ}× Ø�Ù¼ÚSÛ�ܽÝ�Þ5ßÆ×ËàSØAá�â½Ú'Ý�Ý�ܽá>Ú�âËØhà»ã�äHå'æµâ½árÙËà'çËã�ãRèà'×�å:é?êÚ»ã�ë�ì>×�í�à»ãrÝ>å'Ú�×Rë¿îÅÚ�ä5ܽ׼Ý�à'×Åïdð+ñ�å7ð òó ã�à ó à:Ý�ì+ë¯Ú?Û�Ù¼ì�à»ã�ß!à�í�ô�ã�ì>Ú'Ý�à'×¼Ú�ÞËâ½ì>õ\Þ7ì�â½Üiì�í�ã�ì>ö5ܽÝ�Üià'×�åËÛ�ÙËì?÷Æøhù^ú�ûËürý�þ ÿ�ÙËì�×¼á�ì�í�à'ã�Û�Ù���ZÙËìµÜ½×:Û�ì�×:Û�Üià'× à�íCÛ�ÙËì¹Û�ÙËì�à'ã�ß?Ü�Ý�Û�àBí�à»ã��¿Ú�â½Ü��>ìµÚ�×��©á>á�Ú�� Ý��;ãrÚ �>à»ã ó ã�Üi×Rá�Ü ó âiì'å'ì>×¼Ý�çËã�ܽ×��Û�Ù¼Ú�Û�Þ7ì�â½Üiì�í-Ý�árÙ¼Ú»×��»ì�à'×Ëâiß\Ø�ÙËì>׿í�à»ãrá�ì+ëBÛ�à�Þ5ßÆ×¼ì�Ø�Üi×Ëí�à»ã��?Ú�Û�ܽà»×����ZÙËì��\à:ÝUÛhá�à���?à»×ØhÚ/ßÂà�í ó ã�ì>Ý�ì�×:Û�ܽ×���Û�Ù¼ìÀæ¹é§î Û�ÙËì>à»ã�ßLÜ�Ý¿Û�ÙËã�à'ç��»Ù�Û�ÙËì¯í-Ú�\à'ç¼Ý æ¹é§î ó à'Ý�Û�ç¼â½Ú�Û�ì>Ý>åØ�ÙËܽárÙQÜ�� ó à:Ý�ì§ã�ì>Ý�Û�ã�ܽá�Û�ܽà»×¼Ý�Û�ÙRÚSÛµÚSÛ�Û�ì�� ó Û¹Û�à¿á�Ú ó Û�ç¼ã�ì©Û�Ù¼Ü½Ý ó ã�ܽ׼á�Ü ó â½ì ó ã�ì+á�Ü�Ý�ì>âiß��������� ���! #"!$#�%�&�('*)!+�,&-�"!,�,&+� #.%+�./���!+�,103240 '�5768+�����9 '*,&'�:;24.<�&+�-=5&2��&�('�5����324.�2�>�+� #:;2���+�-<?�@� #5�-� #:=A

0(��'���'�.!'�,&,�6B'C+�.!-���"!)('C�!'�5&'D�&�!'B�FEHGI0 #,���"!�J2��&'�,�KL24,M@� #5&:N"!�J2��&'�)1+�.;O PRQRS�@� #5T���!'CU3.!+ ��'C0!5� #0 #,�+ �&+� #.!24�-�24,�'�VXWDYL@MZ[+�,\2H�&�('� #5^]=+�.=,� #:*'�0!5& #0 #,&+ ��+� #.324�3�J24.!$#"!24$#'<?�_;24.!)*`�245�'8,�'�.a��'�.!-�'�,�+�.b���32��c�J24.!$#"324$#'#?24.()edN+�,82b5�'�f�+�,&+� #.� #0 '�5X2��� #5R?!���!'�.hgi P7ZjdM_k+�:*0!��+�'�,M_i8l YL@\Zjm7_k+�,�,X2���+�,�U 249!��'#?!���!'�.eZjdM_;noZjm7_

Appendix I

121

Page 126: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

�ZÙËì�ã�ì Ù¼Ú'ÝÅÞ7ì�ì�×p�Bç¼árÙ Ý�çËÞ¼Ý�ì�q:çËì>×:ÛPØhà»ã�äFÜ½× Ý�ì>ö»ì>ã�Ú»âBë�Ü�Ý�á�Ü ó â½Ü½×Ëì>Ý>åÆá�à»×¼Ý�ܽÝ�Û�ܽ×���\à:ÝUÛ�âiß!à�í�á�à� ó â�Ú�ܽ×'ÛrݵڻÞRà'ç�Û%Ú�×Rëo�?à5ëËÜsrRá>ÚSÛ�Üià'׼ݹà»í�Û�ÙËìÆæ¹é§î ó à'Ý�Û�ç¼â½Ú�Û�ì>Ý%Ú�×¼ëÀÝ�ì�Û��Û�ܽ×��D�t�ZÙËì?á�ÚSÛrÚ�â½ß�ÝUÛ%í�à»ãk�Bç¼árÙÅà»í�Û�ÙËÜ�Ý©Ø�à'ã�ä!ܽ×Àã�ì>á�ì>×:۵߻ì+Ú�ãrݹټÚ'ݵÞ7ì�ì>×ÅÛ�Ù¼ìÆÜiÛ�ì>ã�Ú�Û�ì>ëá�Ú»Ý�ì¿à�íZÞ7ì�â½Üiì�íZã�ì�ö5Ü�Ý�ܽà»×vu Û�Ù¼ÚSÛÆܽÝ>å�ã�ì�ö5Ü�Ý�ܽ×�� ó ã�ì�ö5ܽà»ç¼Ý�âißh��ã�ì>ö5ܽÝ�ì>ëÀÞRì>âiܽì�í-Ý4w#�o�ZÙËì æ¹é§îó à:ÝUÛ�çËâ½Ú�Û�ì+Ý�ã�ì+ÝUÛ�ã�Ü�á Û�Ú»×¿Ú �»ì>×:Û� Ý�ÞRì>âiܽì�í-Ý�ÚSí Û�ì�ã�Ú�Ý�ܽ×��»â½ì�ã�ì>ö5ܽÝ�Üià'×�å�ÞËçËÛ ó ã�àSö:Ü�ë�ì�×Ëà�Ú'Ý�Ý�Ü�Ý��Û�Ú�×Rá�ì�ܽ×�ë�ì�Û�ì�ã��?Üi×Ëܽ×��¹Ø�ÙRÚSÛ�Ú»×�Ú �»ì>×:Û�à»ç��'Ù'Û"Û�à¹Þ7ì�â½Üiì>ö»ì�Ú�í Û�ì>ã"Ú¹Ý�ì�q:çËì�×Rá�ì�à»í5ã�ì�ö5Ü�Ý�ܽà»×RÝa�xPà'ã�äPà»× ì<y5Û�ì�×¼ë�ܽ×��Àæ¹é§î Û�à¯Û�Ù¼ì ÜdÛ�ì�ãrÚSÛ�ì>ë�á�Ú'Ý�ì¿Ü½×¼á�â½ç¼ë�ì+Ý�ï�z5å�ð�{�å�ð»ð�å�ð�|�åH}}5åH} ~/ò�åÞËç�Û¹ÜiÛZÜ�Ýhí-Ú�ܽãZÛ�à¿Ý�Ú/ß?Û�Ù¼Ú�ÛµÚ»Ýhà�í�×ËàSØ=Û�ÙËì©Û�ÙËì>à»ã�ß à�í�ÜdÛ�ì�ãrÚSÛ�ì>ëWÞRì>âiܽì�í)ã�ì>ö5ܽÝ�Üià'×�Ü�ÝZ×Ëà�Û¹ÚÝ�ì�Û�Û�â½ì>ë��¿ÚSÛ�Û�ì>ã��ËÝ�ì�ì ïið(}+òCí�à'ãµë�Ü�Ý�á�ç¼Ý�Ý�ܽà»×Wà�í�Ý�à��\ì§à»í"Û�Ù¼ì�à»ç�ÛrÝUÛrÚ�×¼ë�ܽ×��¿Ü�Ý�Ý�çËì+Ýa�

�%çËã�ë�Üiã�ì>á�Û�Üi×:Û�ì>ã�ì+ÝUÛ�â½Üiì+ݧÜi×��Bç¼âdÛ�Üs�}Ú �'ì�×:Û©Þ7ì�â½Ü½ì�íZã�ì>ö:Ü�Ý�Üià'×�åHÛ�Ù¼Ú�Û�ܽÝ>åCÛ�ÙËì Ý�ÜiÛ�çRÚSÛ�ܽà»×Üi× Ø�ټܽárÙ Ú»×�Ú �'ì�×:Û�Ü�Ý�ܽ×�í�à»ã��?ì>ë?Þ5ß��BçËâiÛ�Ü ó â½ìµà�Û�ÙËì�ãhÚ �'ì�×:Û�Ý>å'Ú»×¼ëCåh�\à'ã�ìµÜi×:Û�ì>ã�ì+ÝUÛ�Üi×D�»â½ß»åØ�ÙËì�×��BçËâiÛ�Ü ó â½ì\Ú�»ì�×:Ûrݵܽ×�í�à»ã��¥ì>Ú'árÙÅà�Û�ÙËì�ã(�t�¹àSØhì�ö»ì>ã>å7ÜiÛ%Û�çËã�×RÝ%à'ç�Û©Û�Ù¼Ú�Û%Û�ÙËܽݩÜ�Ý�Ý�çËìܽݧܽ×Ëì<y5Û�ã�Ü�á�Ú»ÞËâißÀÞRà'çË×¼ëPÛ�àQÛ�Ù¼ÚSÛ�à»í�ÜiÛ�ì�ãrÚSÛ�ì>ëPÞRì>âiܽì�í�ã�ì�ö5Ü�Ý�ܽà»×����¹à»Û�à»×Ëâ½ßÀë�àQØhì¿ö5Üiì>Ø�BçËâiÛ�Ü���Ú�»ì>×'Û§ã�ì�ö5ܽÝ�ܽà»×PÚ'Ý�ÚWÝ�ì�q:çËì>×¼á�ì?à»í�ã�ì�ö5ܽÝ�ܽà»×¼Ý%ì>Ú'árÙPà»íhÚ!Ý�ܽ×��»â½ì¿Ú �'ì�×:Û� Ý©ÞRì>âiܽì�í-Ý>åÞËç�Û%Øhì�Ø�ܽâiâ�Ý�ÙËàSØ Û�ÙRÚSÛµçË×Rë�ì�ãµÛ�ÙËì��BçËâiÛ�Ü���Ú�»ì>×'Û ó ì�ãrÝ ó ì+á Û�ܽö»ì�ÜiÛ�ì>ã�Ú�Û�ì+ëQÞRì>âiܽì�í�árÙ¼Ú»×��»ìܽÝ�ç¼× ó ã�à»ÞËâ½ìa�¿ÚSÛ�ܽá�

�ZÙËìWÞRÚ»Ý�ܽÝBí�à»ã\Û�Ù¼Ü½Ý ó Ú ó ì>ãÆÜ�ÝÆÛUØ�à�à»ÞRÝ�ì>ã�öSÚSÛ�Üià'×¼Ý>å�ÞRà»Û�ÙLà»íµØ�ÙËÜ�árÙ1Ú»ã�ì!ë�Ü�Ý�á�ç¼Ý�Ý�ì>ëí�çËã�Û�ÙËì�ãµÜi×WÛ�ÙËì�×¼ì<y5ÛµÝ�ì+á Û�Üià'×��ð�*�ZÙËì¯æ¹é§î�ã�ì>ö5ܽÝ�Üià'×Là ó ì>ã�Ú�Û�à»ã?á�à'×:Û�Ú�ܽ׼Ý\ÛUØhà�Ú'Ý�ßT���?ì�Û�ã�ܽì>Ý\ܽ×ÂÜiÛ�Ý?ÛUØhà�Ú�ã��»çM��?ì�×:ÛrÝa���ZÙËì©à»Þ5ö5Üià'ç¼ÝhÚ»Ý�ßh���?ì�Û�ã�ß?Ü�Ý�Û�ÙËì ó ã�ì>á�ì+ë�ì�×Rá�ì©à�í�Û�ÙËì©Ý�ì>á�à»×¼ë�Ú�ã��»ç��?ì�×:ÛàSö'ì�ãhÛ�ÙËìtrRã�Ý�Û¹à»×Ëì��7�ZÙËìe�?à»ã�ì�Ý�ç¼Þ�Û�â½ì�Ú»Ý�ßT���?ì�Û�ã�ß»åËØ�ÙËÜ�árÙQܽݹì<y ó à:Ý�ì+ë�à'×Ëâ½ß�Þ5ßìay�Ú�?Üi×Ëܽ×��©Û�Ù¼ì*�\à�ë�ì>â�Û�Ù¼ì�à»ã�ì�Û�ܽá�árÙRÚ�ãrÚ»á Û�ì�ã�Ü��+ÚSÛ�Üià'×�à�í7Û�ÙËì¹æ¹é§î Ý�ì�Û�Û�ܽ×��¼å'ܽÝ)Û�ÙËìã�ܽárÙ¼ì�ã�ÝUÛ�ã�çRá Û�ç¼ã�ì§à�í�Û�Ù¼ìtr¼ã�Ý�Û¹Ú�ã��»ç��?ì>×'Û�Ú»Ý�á�à� ó Ú»ã�ì+ë Û�àÆÛ�ÙËì�Ý�ì>á�à'×¼ëB�

}M�*�ZÙËì�ö'ì�ã�ß�Ý�ì�Û�Û�Üi×���à»íHæ¹é§î ã�ì�ö5Ü�Ý�ܽà»×ÆܽÝ�à ó ì�×\Û�àe�¿Ú�×5ßÆÜi×:Û�ì>ã ó ã�ì�Û�ÚSÛ�Üià'×¼Ý>å�Ú�×¼ë\ã�ì<�Ý�à»â½ö5Üi×�� ó ã�à'ÞËâ½ìa�¿Ý�Ú'Ý�Ý�à�á�Ü�ÚSÛ�ì+ë?Ø�ÜdÛ�ÙWæ¹é§î¥ã�ì�ö5Ü�Ý�ܽà»×�ã�ì(q'ç¼Üiã�ì>Ý�ܽ×��'ì�×Ëì>ã�Ú»âRárÙËà5à:Ý��ܽ×���Ú�\à'×��¿Û�ÙËì>Ý�ìBܽ×:Û�ì�ã ó ã�ì�ÛrÚSÛ�ܽà»×RÝa�¹Ö}× ó Ú�ã�Û�Ü�á�çËâ�Ú�ã+åRÛ�ÙËì>ã�ìÆÜ½Ý©Ú árÙËà'ܽá�ì�ÞRì�ÛUØ�ì>ì�×Ú!ú}ü<���Ëý�þR�3�M�Ëü�þ4�&�Ëü4��ú����Sü¹Ú»×¼ëWÚ����M�zú&�����a�:ü<�RúB�¼ü�þ4�&�¼ü4��ú&���/üa�xPìZØ�ܽâiâ¼Ú'ë�à ó Û�Û�ÙËì*�BçËâiÛ�Ü��}Ú �»ì>×:Û ó ì�ãrÝ ó ì>á�Û�ܽö»ì'å»Ú�×¼ë\Ø�ܽâiâËëËì�ö»ì>âià ó Ú%Û�Ù¼ì�à»ã�ß�à»íH�rü<�s�-ü^�

�4�T�4�-ý �¿Ø�ÙËÜ�árÙ�ã�ìa�?àSö»ì+Ý�Û�ÙËì�Ý�ì+á�à'×¼ë�Ý�à»çËãrá�ì©à�í)Ú»Ý�ßT���?ì�Û�ã�ßÆí�ã�à� ÞRì>âiܽì�í)ã�ì>ö5ܽÝ�Üià'×�u�Þ¼ç�Û×Ëà�Û+åCÜi×ÀÛ�ÙËÜ�Ý ó Ú ó ì�ã+å7Û�ÙËì�r¼ãrÝ�Û©Ú»Ý�ßT���\ì�Û�ã�ß�w#�%�5à��\ì\à�í�Û�ÙËì?Ý ó ì>á�Üsr7áÆá�à»×:Û�ã�ÜiÞ¼ç�Û�ܽà»×¼Ý%à�íÛ�ÙËÜ�Ý ó Ú ó ì>ã¹Ú�ã�ì§Ú'Ýhí�à»â½âiàSعÝ��

� �ZÙËì§×Ëì>ØFí�çRÝ�ܽà»×Qà ó ì�ãrÚSÛ�à»ãhܽÝ�Û�ì>árÙ¼×Ëܽá>Ú�â½âißWÚ�×¼ë!á�à'×¼á�ì ó Û�ç¼Ú»âiâ½ß�á�âiì+Ú�ã(�� Ö�Ûrݹë�ì<rR×ËÜdÛ�Üià'×¯Ú óËó ì>Ú»â½ÝZÛ�à Ú�×Ëà»Û�ÙËì>ã�×ËàSö»ì>â�ë�ì<rR×ËÜdÛ�Üià'×�å¼à»í8�¼ü4�3�s�»þ�ürü4���rüa�J�-ü��e� úX��ú�ü�åØ�ÙËÜ�árÙ!ì�×Ëã�ܽárÙ¼ì>Ý�Û�ÙËì�Ý�Û�Ú»×¼ëËÚ»ã�ë�×Ëà�Û�Üià'×Wà»í�Þ7ì�â½Üiì�í)Ý�Û�Ú�Û�ì§Ø�ÜdÛ�Ù�Û�ÙËì�Ý�à»çËãrá�ì©à�í�ì>Ú»árÙÞ7ì�â½Üiì�í��

� æ¹é§î ã�ì>ö5ܽÝ�Üià'×!á>Ú�×!ÞRì�ëËì�ã�Üiö'ì>ë�í�ã�à�� ÞRì>âiܽì�í)í�ç¼Ý�ܽà»×��i8� YL@_k+�,�,&2��&+�,^U 249!��'#?!���!'�./ZjdM_k+�,�,X2���+�,�U 249!��'i�� YL@\Z �8� Z*�N24.!)b_ ��� _��<?(���!'�.eZ � dM_ �8� Z*�BdM_�i8  KLZjdM_�V�m;`N+�:*0!��+�'�,�Zjd�K _Nm�`#Vi8¡ YL@BKLZId�_Vm�`7+�,�,X2���+�,�U 249!��'#?!���!'�./Zjd�K _Nm;`<VC+�:*0!��+�'�,�KLZ¢dM_Vm�`

122

Page 127: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

� Ö�Û�ì�ãrÚSÛ�ì+ë¯í�ç¼Ý�ܽà»×�ܽݩ×Ëà»Û§à'×Ëâ½ßÅØ�ì>âiâ�ë�ìar¼×Ëì+ëÀÞËçËÛBÚ�â�Ý�àWì<y5Û�ã�ìa�?ì�â½ßÅØhì�â½âs��Þ7ì�Ù¼Ú/ö'ì>ëB��ZÙËÜ�Ý�ܽÝ)Þ7ì>á>Ú�ç¼Ý�ìhÛ�ÙËìhí�ç¼Ý�ܽà»×\à ó ì�ãrÚSÛ�à'ã�ëËì<r¼×Ëì+Ý�Ú/��üa�����£���Sú;ú��^�rü�å'Ú�×RëBÜ½× ó Ú�ã�Û�Ü�á�çËâ�Ú�ãÜ�ÝZܽëËìa� ó à»Û�ì>×'Û+å¼Ú»Ý�Ý�à5á�ܽÚ�Û�ܽö»ì'å5Ú»×¼ëQá�à����Æç�Û�Ú�Û�ܽö»ì��

� æµ×�Ú'ëËë�ÜiÛ�ܽà»×¼Ú»â¹à ó ì�ãrÚSÛ�à»ã¿Ü�Ý ë�ìar¼×Ëì>ë�åe� � ¤*�T�#�-ý3�Rå�Ø�ÙËܽárÙAÜ�Ý Ý�ßT���?ì�Û�ã�ܽá!Û�à�í�çM�Ý�Üià'×���Ø�ÙËì>ã�ì+Ú»Ý�í�ç¼Ý�Üià'×!ܽ×��»ì>×Ëì�ãrÚ�âCÚ»ë¼ëËÝZÜi×Ëí�à»ã��?Ú�Û�ܽà»×�åËëËÜs¥Hç¼Ý�Üià'×!ã�ìa�?àSö'ì>ÝZÝ�à��?ì���à�'ì�Û�Ù¼ì�ã+å'Û�ÙËì§í�ç¼Ý�Üià'ׯÚ�×¼ë!ë�Ü�¥7ç¼Ý�ܽà»×Qà ó ì�ãrÚSÛ�à»ãrÝhëËì<r¼×Ëì�Ú¦� �L�rú;þ#�^�a��ú����Sü/���Sú;ú��^�rüa�

xPì�×ËàSØ ó ã�à5á�ì�ì+ë¿Û�à á�àSö»ì>ã�Û�ÙËì>Ý�ì ó à'Üi×:ÛrÝ�Üi×Qà'ã�ë�ì>ã��

§ ¨ª©�« Ïc¬ Ñ®­e¯8°²±³±´«ZÏ)Ð�Õ<«;¯ Õ>ζµ¸·º¹ Ð�«*»WÕa¯�Õ�ÑÆÎ�5Üi×Rá�ì�Û�ÙËìµá�à'×¼á�ì ó Û�çRÚ�âËì>âiì��\ì>×:Û�Ý�à»íHà'çËã�Ú óËó ã�à'Ú»árÙ\Ú�ã�ì¹Ú»Ý�Ü�� ó à'ã�ÛrÚ�×:Û�Ú»Ý�Û�ÙËì�Û�ì>árÙË׼ܽá>Ú�âà»×Ëì+Ý�å�ܽ×!Û�ÙËÜ�Ý�Ý�ì+á Û�Üià'×WØhì�ÝUÛrÚ�ã�ÛZØ�ÜiÛ�Ù¯Ú?Ý�à�?ì�Ø�Ù¼Ú�Û�âiì>×���Û�Ù:ß ó ã�ìa�¬í�à'ãR�¿Ú»âHëËܽÝ�á�ç¼Ý�Ý�Üià'×�à�íÞ¼Ú»áräT�»ã�à»ç¼×¼ë�Ú»×¼ë�ܽ×:Û�çËÜiÛ�ܽà»×8�1�ZÙËì�ã�ì��¿Ú�ܽ×ËÜi×D�?Ý�ì>á�Û�ܽà»×¼Ý�Ú»ã�ìk�?à'Ý�Û�â½ß í�à'ãR�¿Ú�â&�æµÝ�á�â½Ú'Ý�Ý�ܽá>Ú�â½âiß ó ã�ì+Ý�ì>×:Û�ì>ë�å+ڻקæ¹é§î ã�ì�ö5Ü�Ý�ܽà»×©à ó ì�ãrÚSÛ�à»ãF¼hÚ»á>á�ì ó Û�ÝHÛUØhà�Ú�ã��»ç��?ì>×'ÛrÝ£½Ú�u�ÛUß ó ܽá>Ú�â½âiß'å ó ã�à ó à'Ý�ÜiÛ�ܽà»×¼Ú»â�â½à�'ܽá(wµÛ�ÙËì>à»ã�ß�¾ Ú�×¼ë�ÚQÝ�ì>×:Û�ì�×Rá�ìe¿�Üi× Ý�à�?ì?â½Ú»×��»çRÚ �»ì

À ½=Ú�×¼ë ó ã�à�ë�ç¼á�ì>Ý�Ú§×Ëì>Ø1Û�ÙËì>à»ã�ß�¾Á¼�¿8���%ã>å»í�ã�à��Û�ÙËìµÝ�ìa�¿Ú�×:Û�Ü�á ó à»Ü½×'Û�à�íCö5Üiì>Ø�å'Ú§ã�ì<�ö:Ü�Ý�Üià'×\à ó ì>ã�Ú�Û�à'ã�ܽÝ�ç¼Ý�çRÚ�â½âißÆö:ܽì�Øhì>ë?Ú'Ý�Ú»á>á�ì ó Û�ܽ×��§ÛUØ�à�Ý�ì�ÛrÝ�à�íCܽ×:Û�ì�ã ó ã�ì�ÛrÚSÛ�ܽà»×RÝ£½�Û�ÙËà:Ý�ìÝ�Ú�Û�Ü�ÝUí�ß5ܽ×��¾ Ú�×¼ë�¿"åSã�ì>Ý ó ì+á Û�Üiö'ì�â½ß½=Ú�×Rë ó ã�à5ëËç¼á�ܽ×��§Ú¹Û�ÙËÜiãrëBÝ�ì�Û+åSà»×ËìZÝ�ÚSÛ�Ü�Ý�í�ß:ܽ×��t¾[¼3¿8�æµÝZØhì©Ý�Ù¼Ú�â½âCë�Ü�Ý�á�ç¼Ý�Ý>å:Û�ÙËÜ�ÝZÜ�Ý�Ú%�?Ü�Ý�â½ì>Ú'ë�Üi×D�Æö5ܽì�Ø=Ø�ÙËÜ�árÙ!ܽÝhì<y ó à'Ý�ì>ë�Þ:ß�âià5à»ä5ܽ×��%�?à'ã�ìá�â½à'Ý�ì�â½ß�ÚSÛZÛ�Ù¼ì�Ý�ì��?Ú»×:Û�Ü�á�ÝZà�í�æ¹é§î ã�ì>ö:Ü�Ý�Üià'×��Ö}×¼ë�ì�ì+ëCåËÛ�ÙËìBì>×:Û�ܽã�ìBëËܽÝ�á�ç¼Ý�Ý�Üià'×!ܽׯÛ�ÙËÜ½Ý ó Ú ó ì�ã%Ø�Üiâ½â"Þ7ìÆÝ�ìa�¿Ú»×'Û�ܽá�ã�Ú�Û�ÙËì>ã�Û�Ù¼Ú»×ÅÚ3yT�Üià��?Ú�Û�Ü�á�å�Ú�×RëAÝ�à ÜiÛ�Ø�ܽâiâ%ÞRìÀç¼Ý�ì�í�çËâ%Û�àLÝ�Û�Ú»ã�Û Þ5ß1ã�ì+á�Ú�â½â½Üi×���Û�ÙËìÀØhì�â½âs��ä5×ËàSØ�×[�\à�ë�ì>â

Û�ÙËì>à»ã�ì�Û�Ü�áµárÙRÚ�ãrÚ»á Û�ì�ã�Ü��+ÚSÛ�Üià'׿à�í"æ¹é§î ã�ì>ö:Ü�Ý�Üià'×Åïdð�ÃËåHð!z>òX�1Ä�ì�Û=Å Þ7ìµÛ�ÙËì§Ý�ì�Ûhà»í"Ø�à'ã�â�ëËÝu�Ü&� ì��iå�ܽ×:Û�ì�ã ó ã�ì�ÛrÚSÛ�ܽà»×RÝ�w�í�à»ã À ��æ3ã�ì�ö5Ü�Ý�ܽà»×Wà ó ì�ãrÚSÛ�à'ã*¼BÝ�ÚSÛ�Ü�Ý�r¼ì>Ý�Û�Ù¼ì§æ¹é§î ó à'Ý�Û�çËâ�ÚSÛ�ì>ÝÜdí%Ú»×¼ë à'×Ëâiß�Üií¹í�à»ã\ì�ö'ì�ã�ßÀÛ�ÙËì�à'ã�ßj¾ Û�ÙËì>ã�ìWì<y�ܽÝ�Û�Ý\Ú¯Û�à�ÛrÚ�â ó ã�ì<��à»ãrë�ì�ã�Üi×D�jÆ�àSö»ì�ã�ÅÝ�ç¼árÙ\Û�Ù¼ÚSÛ�Û�ÙËì¹Øhà»ã�â½ë¼Ý��?Üi×ËÜ��¿Ú�âRØ�ÜdÛ�Ù¿ã�ì+Ý ó ì>á�Û�Û�àÇÆ�Ú�ã�ì�ì<yËÚ»á�Û�â½ß�Û�ÙËà'Ý�ì�Û�ÙRÚSÛ�Ý�ÚSÛ�ܽÝ�í�ß�¾Ú�×¼ëCå»í�à»ã�ì>ö»ì>ã�ßBÝ�ì�×:Û�ì�×¼á�ì�¿�å�Û�ÙËì¹Øhà»ã�â½ëËÝ�Û�Ù¼ÚSÛ�Ý�ÚSÛ�ܽÝ�í�ß�¾Á¼T¿¿Ú�ã�ì ó ã�ì+á�Ü�Ý�ì>âißBÛ�ÙËì*�?ܽ×ËÜ��?Ú»âØ�à'ã�â�ëËÝ>åCØ�ÜdÛ�Ù�ã�ì>Ý ó ì>á Û§Û�à�Æ�å�Ý�Ú�Û�Ü�ÝUí�ß5ܽ×���¿8�?Ö}×¼ë�ì>ì>ëCå"Û�ÙËì¿ã�à»â½ìÆà»íhà'ã�ëËì�ã�Üi×��:ݵܽ×�ÞRì>âiܽì�íã�ì>ö:Ü�Ý�Üià'×WÚ�×RëW×¼à»×M�X�?à»×Ëà»Û�à»×¼Ü½á%âià��»Ü�á�ÝZÙ¼Ú'ÝZÞRì>ì�×QØhì�â½âCì>Ý�Û�Ú�Þ¼âiÜ�Ý�Ù¼ì>ëWÜi×!Û�Ù¼ì�âiÜiÛ�ì>ã�Ú�Û�çËã�ì�Ö}×ÀÛ�ÙËì¿Ý�ì(q'ç¼ì�â;åCØ�ì?Ø�ܽâiâ�á>Ú�â½â)Ú ó Ú»ÜiãÇu�ÅÉÈ�Æ�wµÚ!Þ7ì�â½Üiì�í�� ú£�Sú}ü�å"Ú»×¼ë�ÚWÝ�ì�Û�à�íhØ�à'ã�â�ëËÝ

ÊÌ˦ŠÚ?Þ7ì�â½Üiì�í=��ü�ú&��Ö}×:Û�ç¼ÜdÛ�Üiö'ì�â½ß»åRÚ?Þ7ì�â½Ü½ì�í�Ý�ì�Û%ë�ì+Ý�á�ã�ܽÞ7ì>ݹڻ×QÚ�»ì>×'Û( ݵڻá Û�ç¼Ú�â"ÞRì>âiܽì�í-Ý>åØ�ÙËÜiâ½ìÆÚ�ÞRì>âiܽì�í�Ý�Û�Ú�Û�ì\ë�ì>Ý�á�ã�ÜiÞ7ì>Ý%ÙËÜ�Ý%á�à»×¼ëËÜdÛ�Üià'×¼Ú�â�ÞRì>âiܽì�í�Ý�ì�ÛrÝ;�'Üiö'ì�×ÀÚ»×:ß ó à'Ý�Ý�ܽÞËâ½ì�×Ëì�ØÜi×�í�à'ãR�¿Ú�Û�ܽà»×��ÎÍ�âiì+Ú�ã�âiß'åCì�ö'ì�ã�߯ÞRì>âiܽì�í�ÝUÛrÚSÛ�ì\ܽ׼ë�çRá�ì>ÝBÚ�Þ7ì�â½Üiì�íZÝ�ì�Û+å"×¼Ú�?ì�â½ß¯Û�ÙËì¿Ý�ì�Û�à�í�\ܽ×ËÜ��¿Ú�â�Øhà»ã�â½ë¼Ý�ܽ×!Û�ÙËì�Þ7ì�â½Ü½ì�í�ÝUÛrÚSÛ�ì��æ¹âiÛ�ÙËà'ç��»Ù�í�à»ã�Û�ÙËà:Ý�ì?í-Ú�\ܽâ½Ü½Ú»ã�Ø�ÜdÛ�Ù�Û�ÙËì�æ¹é§î ó à'Ý�Û�ç¼â½Ú�Û�ì>Ý�Û�ÙËìÇ�?à�ë�ì�â�Û�ÙËì>à»ã�ì�Û�ܽáárÙ¼Ú�ãrÚ»á�Û�ì�ã�Ü��>ÚSÛ�Üià'×WØZڻݵà»Þ5ö5Üià'ç¼Ý�ܽ×ÅÙËܽ׼ëËÝ�Ü��'Ù:Û>åHÜd۩ټڻݹí-Ú�ãR�;ã�ì>Ú'árÙËܽ×��?ã�Ú�\Ü�rRá>ÚSÛ�ܽà»×RÝa��Ö}×

ó Ú�ã�Û�Ü�á�çËâ�Ú�ã+å/ÜdÛ��?ì>Ú»×¼Ý�Û�Ù¼Ú�Ûhþ�ü<�(�L�#�-ý3���L�;�e�M�\�^Ï<�¼üa�zÿ/�»ü�Ð1�Hü4��ý��¼ü�þ��Sú&�-ý3��ú�û���ú"ú£��Ñ'ü#�;�!�=� ú��Ð�þ4� ú1�Sþ���M�?üa�¼ú��HýSú1���¿ü�þ�üe�rüa�J�-ü�����ü�ú^Ò��<��ú7�b�4�M�����rü<�s�-ü^�Â� úX��ú�ü(Ó��ZÙËì©æ¹é§î ó à'Ý�Û�çËâ�ÚSÛ�ì>ÝÚ�ã�ì%×Ëà»Û�ã�ì>×¼ë�ì�ã�ì>ë��?ì>Ú�×¼Üi×��'âiì+Ý�ÝZÞ5ß?Û�ÙËܽݹà»Þ¼Ý�ì�ã�öSÚSÛ�ܽà»×"å:Þ¼ç�Û�ÜdÛ¹Ü�ÝZÜ�� ó à»ã�Û�Ú»×'ÛhÛ�à?ã�ì>Ú»âiÜ���ìÛ�Ù¼Ú�Û©Û�ÙËì�ßÀìa� ó â½àSßÀÚ��?Ü�Ý�â½ì>Ú'ë�ܽ×��Q×Ëà�ÛrÚSÛ�ܽà»×RÚ�â)ì>á�à»×Ëà��BßQÞ5ßÅÜ�� ó â½Ü�á�ÜiÛ�â½ßÅÞËçËܽâ½ëËÜi×��¯Ü½×:Û�àÛ�ÙËì©ã�ì>ö5ܽÝ�Üià'× à ó ì�ãrÚSÛ�à»ã�Üi×�í�à'ãR�¿Ú�Û�ܽà»×Î�?à»ã�ì%Ú'á�á�çËã�Ú�Û�ì>âiß?á�à'×¼Ý�Ü�ë�ì>ã�ì+ë Ú»Ý ó Ú�ã�Û�à»í"ÜiÛ�ÝNr¼ãrÝUÛ

123

Page 128: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

Ú�ã��»ç��?ì�×:Û(�bÍ�à»×¼Ý�Ü�ë�ì�ã+å5í�à»ã¹ìayËÚ � ó â½ì»å�Û�ÙËìer¼ãrÝUÛ ó à'Ý�Û�ç¼â½Ú�Û�ìÎu-Ý�ì�ì§í�à5à»Û�×Ëà»Û�ì�ܽ×QÖ}×'Û�ã�à�ë�çRá#�Û�ܽà»×\w�åHÔ Õ�ð��c¾Ö¼�¿�Ü�� ó â½Üiì+Ý7¿F��e�ZÙËìÆ×¼Ú�ܽö»ìÆã�ì+Ú»ë�ܽ×�� à»í�Û�ÙËÜ�Ý ó à:ÝUÛ�çËâ�ÚSÛ�ìÆܽÝ!ôRxAÙËì�ׯÛ�ÙËìÞRì>âiܽì�í�Ý�ì�Û�ö/Ú»âiÜ�ëËÚ�Û�ܽ×��Ǿ Ü�ÝZã�ì�ö5ܽÝ�ì>ë�Þ5ߦ�a����õ��:Û�ÙËì�á�à'ã�ã�ì>á�Ûhã�ì>Ú'ë�Üi×D�\Ý�ÙËà»ç¼â½ëWÞRìQôRxAÙËì�×Ú�×5ß¿ÞRì>âiܽì�íb� úX��ú�ühØ�Ù¼à'Ý�ì%ܽ׼ë�ç¼á�ì>ëWÞRì>âiܽì�íb��ü�úHöSÚ�â½Ü�ëËÚSÛ�ì>Ý*¾ ܽÝZã�ì�ö5Ü�Ý�ì+ë Þ5ߦ�a���<� õ×!Ö}×�Û�ÙËìã�ì��?Ú»Üi×Rë�ì�ã©à�í�Û�ÙËì ó Ú ó ì�ã+åHØ�ì?Ú»Ý�Ý�ç��?ìÀô�æ¹é§î Ý�ì�Û�Û�Üi×��:õWÚ»×¼ë=ô�æ¹é§î ã�ì�ö5Ü�Ý�ܽà»×¼õ ã�ì�í�ì�ãÛ�à?Û�ټܽÝ=�?à�ë�Ü�r¼ì>ë!ö5Üiì>Ø�åËì<y ó â½Ü�á�ÜiÛ�â½ß�ܽ׼ë�Ü�á�ÚSÛ�Üi×D�?Ú»×5ß ã�ì�í�ì�ã�ì�×¼á�ì©Û�à\Û�ÙËì�à'ã�Ü��»Ü½×¼Ú»â7ö5ܽì�Øe�

ØËã�à� Û�ÙËÜ�Ý ó ì>ã�Ý ó ì+á Û�Üiö'ì»å:ÜdÛZܽÝhá�â½ì>Ú»ã�Û�Ù¼ÚSÛ�Û�ÙËì§æ¹é§î Ý�ì�Û�Û�Üi×D�Æá�à'×:Û�Ú�ܽ׼Ý�ÛUØ�à\Ý�à'çËã�á�ì>Ýà�í�Ú»Ý�ßh���?ì�Û�ã�ß��ÙØ�ÜiãrÝ�Û>å�Ú»Ý?Ü�Ý¿Øhì�â½âs��ä5×ËàSØ�×�å�Û�ÙËìÅÝ�ì>á�à»×¼ë�Ú�ã��»çD�\ì>×:Û\Û�à�Û�ÙËìÀã�ì>ö:Ü�Ý�Üià'×à ó ì�ãrÚSÛ�à'ã�ÛrÚ�ä»ì+Ý�á�à� ó â½ì�Û�ì ó ã�ì>á�ì>ë�ì>×¼á�ì%àSö»ì>ã�Û�Ù¼ì�r¼ãrÝ�Û�à»×¼ì�u¬Ý�ì>ì ó à'Ý�Û�çËâ�ÚSÛ�ìkÕ�ð©Ú�Þ7àSö»ì!w#��5ì>á�à»×¼ëCå/Ú»Ý�Ø�ì�Ù¼Ú/ö'ìcÚUç¼ÝUÛ�ë�ܽÝ�á�çRÝ�Ý�ì>ëCå>Û�Ù¼ì�r¼ãrÝUÛ"Ú»ãR�'ç��?ì�×:Û�ܽÝ"Úhí�çËâiâ'ÞRì>âiܽì�íËÝ�Û�Ú�Û�ì»å/Ø�ÙËì>ã�ì+Ú»ÝÛ�ÙËìBÝ�ì>á�à»×¼ëWÜ�ݵÚ��?ì�ã�ì§ÞRì>âiܽì�í�Ý�ì�Û(�N�ZÙËÜ�ݹÝ�ì>á�à'×¼ëQÚ»Ý�ßT���?ì�Û�ã�ß Ü½Ý;�\à'ã�ì�Ý�çËÞ�Û�âiì�Ú»×¼ëCåRØ�ìÞRì>âiܽì�ö'ì»åËçËâiÛ�Ü��¿ÚSÛ�ì�â½ßWë�ì>ì ó ì�ã(�

�5à�?ì ã�ì>á�ì�×:ÛBØhà»ã�äÀÜ½× Û�ÙËì!Ú�ã�ì>ÚQÙ¼Ú'ÝBÚ�Û�Û�Ú'árä»ì+ë�Û�ÙËì�r¼ã�Ý�ÛÆÝ�à»çËãrá�ì�à�í¹Ú»Ý�ßT���?ì�Û�ã�ß��ZÙËܽÝ\Ú»Ý�ßT���\ì�Û�ã�ß�ܽÝÆà�í Û�ì�× Ü½×:Û�ì�ã ó ã�ì�Û�ì>ë Ú'ÝÀô�×Ëì�Ø�ܽ×�í�à»ã��¿ÚSÛ�Üià'× àSö»ì�ã�ã�ܽë�ì+Ý�à»â�ë ܽ×�í�à»ãR��?Ú�Û�ܽà»×�å õ?Ú�×¼ëWÛ�ÙËì�ã�ì�Ù¼Ú/ö»ì§Þ7ì�ì>×QÝ�ç����»ì>Ý�Û�ܽà»×RÝhÛ�Ù¼Ú�Û�Û�ÙËÜ�ݹárÙËã�à»×Ëà'âià��»Ü�á�Ú»â ó ã�ì>á�ì>ë�ì>×¼á�ì§Ü�ÝçË×3ÚUç¼Ý�Û�Ü�r¼ì>ë?ܽ×��»ì>×Ëì�ãrÚ�âBu�ã�ì>á>Ú�â½âiܽ×���Ý�Ü��?ܽâ½Ú»ã�á�à»×Rá�â½ç¼Ý�ܽà»×RÝ�ܽ×\Û�ÙËì¹á>Ú»Ý�ìZà�íH×Ëà»×M�X�?à»×Ëà»Û�à'×ËܽáÛ�ìa� ó à»ãrÚ�â�ã�ì+Ú»Ý�à»×Ëܽ×��¼å�á í��Zï } ÃSò�w<�¦�¹àSØhì�ö'ì�ã+å�ÜdÛ( ÝÆÜ�� ó à»ã�Û�Ú»×:Û�Û�àÅã�ì>Ú»âiÜ���ì?Û�Ù¼ÚSÛÆÛ�ÙËì>ã�ì¿Ü�Ý×Ëà�Û�ÙËÜi×D�Wܽ×ÅÛ�ÙËì\æ¹é§î Ý�ì�Û�Û�Üi×��WÛ�à!çË×ËÜLq:çËì�â½ßÅÝ�Ú�×¼á�Û�ܽà»×¯Û�ÙËìBÛ�ìa� ó à»ãrÚ�â�Üi×:Û�ì>ã ó ã�ì�Û�ÚSÛ�Üià'×��Ö}× ó Ú�ã�Û�Ü�á�ç¼â½Ú»ã>å:Ý�ì>ö»ì�ãrÚ�â¼ã�ì+Ý�ì+Ú�ãrárÙËì�ãrÝ�árÙËà5à'Ý�ì�Û�àÆö5Üiì>ØÂÛ�ÙËì ó ã�à�á�ì>Ý�Ý�Ú»Ý�à»×ËìµÜi× Ø�ټܽárÙ¿Û�ÙËìÞRì>âiܽì�í�Ý�ì�ÛrÝ�à»í�ÛUØhàBÚ �'ì�×:Û�ÝhÚ�ã�ìµá�à��BÞËܽ×Ëì+ë?Û�à ó ã�à�ë�çRá�ì%Ú�Û�ÙËܽãrëB��Ö}׿Û�ټܽÝhö5Üiì>Ø�å:Û�ÙËìÂr¼ãrÝUÛÚ»Ý�ßh���?ì�Û�ã�ß Ú�?à»çË×:Û�Ý\Û�àÛ�»Ü½ö:ܽ×���à»×ËìQÚ �'ì�×:Û¦u Û�ÙËì�Ô ìay ó ì�ã�Û��wÆÛ�à»Û�Ú»â ó ã�ì+á�ì+ë�ì�×¼á�ìWàSö'ì�ãÛ�ÙËì à»Û�ÙËì>ã�u Û�ÙËì¦Ô ×ËàSö5Ü�á�ì�w å)Ú�×RëÀÛ�ÙËì>Ý�ì ã�ì>á�ì�×:ÛBÚSÛ�Û�ìa� ó ÛrÝ�Ù¼Ú/ö'ì¿ÞRì>ì�×j�'ì>Ú�ã�ì>ëÀÛ�àSØZÚ�ãrëËÝá�Ú ó Û�çËã�Üi×D�Àâ½ì>Ý�Ý�ÞËÜ�Ú»Ý�ì>ë ä5ܽ׼ëËÝÆà�í¹Þ7ì�â½Üiì�í ó à:à'âiܽ×��\��ؼà»ãBìayËÚ � ó â½ì»å¹ï } {/ò�ç¼Ý�ì Û�ÙËì Û�ì�ã���Sþa�a� ú¬þR��ú��-ý ��Û�àPëËì>Ý�á�ã�ÜiÞ7ì!ÚPá�à����Æç�Û�Ú�Û�ܽö»ìWã�ì>ö:Ü�Ý�Üià'× à ó ì�ãrÚSÛ�à»ã(�ÀÖ}× Û�ÙËì�ܽã¿Ý�ß�Ý�Û�ì�� Û�ÙËìí-Ú�ܽã�×Ëì+Ý�Ý\Ü�Ý Ú»árÙ¼Üiì>ö»ì>ëLÞ5ß à�?ÜiÛ�Û�ܽ×���Û�ÙËì�r¼ã�Ý�Û�æ¹é§î Ú3y�Üià��Üu^Õ�ðQÚ»ÞRàSö'ì(w�u Û�ÙËì�ßÂÚ�â�Ý�àá�à»×RÝ�Ü�ë�ì�ã�Ú'ëËë�ܽ×��Wà»Û�ÙËì>ã§ã�ì+ÝUÛ�ã�Ü�á Û�Üià'׼ݩà»×PÚ»ã�Þ¼ÜdÛ�ã�Ú�Û�ܽà»×�åCÞËçËÛ©Û�ÙËì>Ý�ì¿Ú�ã�ìÆ×¼à�Ûe�»ì+Ú�ã�ì>ëQÛ�à �ØhÚ»ã�ëËݹí-Ú�ܽã�×¼ì>Ý�Ý�w<�?ïið�ÝSò ó â½Ú'á�ìÆÚ»×PÚ'ëËë�ÜiÛ�ܽà»×¼Ú»â"í-Ú»Üiã�×Ëì>Ý�Ý%ã�ì(q'ç¼Üiã�ìa�?ì�×:Û¹Û�Ù¼ÚSÛ�Ú�\à'çË×:Û�ݹÛ�àã�ì(q'ç¼Üiã�Üi×D�¿Û�Ù¼Ú�Û%Ø�ÙËì>×ÅÛUØhà ܽ׼á�à'×¼Ý�Ü�Ý�Û�ì�×:Û%Û�ÙËì>à»ã�Üiì+Ý%Ú»ã�ì/�?ì>ãR�'ì>ëQì>Ú'árÙÅà»×¼ìBÙ¼Ú'Ý�Û�à²�»Ü½ö»ìç ó Ý�à��\ì�Û�ÙËܽ×��\�1�µÛ�ÙËì�ã�ã�ì>Ý�ì>Ú»ã�árÙWÜi×!Û�ÙËì�Ú�ã�ì>ÚÆÜi×Rá�â½ç¼ë�ì>ÝBï ñ�å\~�å\}�ñSò&�

�5Üi×Rá�ì�Øhì�Ú�»ã�ì�ì¿Ø�ÜiÛ�ÙFïið(}+òhÛ�ÙRÚSÛBÛ�ÙËìQæ¹é§î Ý�ì�Û�Û�Üi×��ÀܽÝÆçË×¼á�âiì+Ú�ãBà'× Ü�Ý�Ý�çËì>ÝBà�í�ܽ×M�Û�ì�ã ó ã�ì�ÛrÚSÛ�ܽà»×"å7Øhì?á�à»×RÝ�Ü�ë�ì�ã§ÜiÛk�?ì+Ú�×Ëܽ×��»â½ì>Ý�Ý%Û�àQÚ�ã��»çËìBÛ�Ù¼Ú�Û©à'×Ëì\Üi×:Û�ì>ã ó ã�ì�Û�ÚSÛ�Üià'×T½�Û�ÙËìÛ�ìa� ó à»ãrÚ�â�à'×ËìWà'ã\Û�ÙËìo�ÆçËâdÛ�Üs�}Ú �'ì�×:Û?à»×Ëì<½�Ü�Ý?ã�Ü��»Ù:Û¿Ú�×¼ë1Ú�×¼à�Û�Ù¼ì�ã?Ø�ã�à'×��¼å�à»×Ëâ½ß Û�ÙRÚSÛà»×ËìµÝ�ÙËà'çËâ½ë\Þ7ì%á�âiì+Ú�ã�à»×¿à'×Ëì Ý�ܽ×:Û�ì�ã ó ã�ì�ÛrÚSÛ�ܽà»×¿Ú»×¼ë¿Ý�Ù¼à»çËâ�ë?ì<y ó â½à»ã�ìZÜdÛrÝ�á�à»×¼Ý�ì�q:çËì>×¼á�ì+Ýa��¹àSØhì�ö»ì>ã>å'Ø�ì%ë�àÆÚ»ãR�'çËì�Û�ÙRÚSÛ�Û�ÙËìk�BçËâiÛ�Ü���Ú�»ì>×'Û ó ì>ã�Ý ó ì+á Û�Üiö'ìµâ½ì>Ú»ë¼Ý�Û�à�q:çËÜdÛ�ì©Ú�Û�Û�ã�Ú'á Û�ܽö»ìó ã�à ó ì>ã�Û�Üiì+Ýa�

xPì ã�ì ó â½Ú'á�ì\Û�ÙËì�à ó ì�ãrÚSÛ�à'ã�à�í�Þ7ì�â½Üiì�íZã�ì�ö5Ü�Ý�ܽà»×�Þ5ßÅÛ�ÙËì à ó ì>ã�Ú�Û�à»ã�à»í%��üa�J�-ü����4�T�#�-ý3�\�Ä�ܽä»ì7�?ì>ãR�'Üi×��%Ú�×¼ë�Ú»ã�Þ¼ÜdÛ�ã�Ú�Û�ܽà»×�å>í�ç¼Ý�Üià'×�Üi×5ö'à»â½ö»ì>Ý�ÛUØ�à%Ú �»ì>×:Û�Ý>å+Ø�Ù¼à'Ý�ì�ÞRì>âiܽì�í-Ý)Ú»ã�ì�í�çRÝ�ì+ëB�� ó ì+á�Ü�rRá�Ú»âiâ½ß»åhÛ�ÙËìÅÚ�ã��»ç��?ì>×'ÛrÝ\Û�à Þ7ì�â½Üiì�í§í�ç¼Ý�ܽà»×AÚ»ã�ì!ÛUØhàPí�ç¼âiâ%ÞRì>âiܽì�í�Ý�Û�ÚSÛ�ì>Ý��ßÞ¹×Ëâ½Ü½ä»ì�\ì>ãR�'Üi×D�BÚ�×¼ë�Ú�ã�ÞËÜdÛ�ã�Ú�Û�ܽà»×�å'ÙËàSØhì�ö»ì>ã>å�Û�ÙËì>ã�ì%ܽÝ�×Ëà�Û�ÙËܽ×���í-Ú»ÜiãhÚ�Þ7à»ç�Û�í�ç¼Ý�ܽà»×8��Ö}×Rë�ì�ì+ëCå5Üi×Ú ó ã�ì+á�Ü�Ý�ì�Ý�ì�×¼Ý�ì©í�ç¼Ý�ܽà»×!Ü�ݹÚÆí-Ú�ÜiÛ�Ù�í�çËâ8�»ì�×¼ì�ãrÚ�â½Ü��+ÚSÛ�ܽà»×Wà�í)æ¹é§î ã�ì�ö5Ü�Ý�ܽà»×�Û�à\Û�Ù¼ìe�BçËâiÛ�Ü���#à .!'t+�:*0 #5^�X24.<�=-X�!24.!$#'e+�,*.!'�-�'�,�,X245^]�gÂá�'k5&'£685&+ ��' i�� 24,%â£ã � nÁã��/24.()�_ �/� _�a?C���!'�.Z � d�_ � � Z � da_ � ? äH6F�('�5�'Fã � 24.!);ã � 245�'�9'���+�'£@,��&2��&'�,�WDá²+ ���! #"(�c���!+�,\-X�324.!$#'�68�!+�-X�h?<'�,&,�'�.a��+J24��� ]<?24���� �68,C�&�!'�5�'�,&"!� �� 4@�275&'£f�+�,&+� #.*�& b)!'�0'�.!) #.Â0!24,��85&'£f�+�,&+� #.!,�?�:* #,^��+ �&'�5&2��&'�)�5&'£f(+�,�+� #.Â0(5& #0 #,X24��,æå

+�.(-���"()!+�.!$= #"!5� �6F.aå�245�'�+�.!-� #.!,&+�,^�&'�.<�F68+ ���Â���!'�0 #,���"!�J2���'�,�W

124

Page 129: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

Ú �»ì>×:Û©á�Ú'Ý�ì��txPì?Ý�Ù¼àSؾÛ�Ù¼Ú�Û%í�ç¼Ý�ܽà»×ÀÜ�Ý©ì<y5Û�ã�ìa�?ì>âißQØhì�â½âs��ÞRì>Ù¼Ú/ö»ì+ëCåCÚ�×¼ëÅÜi× ó Ú�ã�Û�Ü�á�ç¼â½Ú»ã>åÜdÛrÝ�ÜdÛ�ì�ãrÚSÛ�Üià'× ó à'Ý�ì>ÝZ×Ëà ó ã�à»Þ¼âiì��?Ý���5Üi×Rá�ì!Üi×ÂÛ�ÙËÜ�Ý ó Ú ó ì>ã\Øhì¯ë�à�×Ëà�Û ë�ܽã�ì+á Û�âiß árÙ¼Ú»âiâ½ì�×D�»ì Û�ÙËì�r¼ãrÝUÛ Ú»Ý�ßT���\ì�Û�ã�ß Ú�×¼ëá�à»×:Û�Üi×5çËìÀÛ�àLã�ì�â½ß1à»×Fë�à��?Üi×¼Ú»×¼á�ìÀÜi×FÛ�ÙËì ó ã�à�á�ì>Ý�Ý à�í�í�çRÝ�ܽà»×�å¹ÛUØ�àÂã�ì��?Ú»ã�ä�Ý Ú»ã�ìÀÜi×à»ãrë�ì�ã(�²Ø�ܽã�Ý�Û>å�Ú»Ý�Ø�ì�Ý�Ù¼Ú»âiâ�Ý�ì�ì'å"Û�ÙËì�×¼à�Û�ܽà»×�à»í�ëËà�?Üi×RÚ�×¼á�ì¿Ø�ì ÙRÚ/ö»ì¿Ü�Ýe�Bç¼árÙj�?à'ã�ì

r¼×Ëì<�X�»ãrÚ�ܽ×Ëì+ë�Û�Ù¼Ú»×LÛ�Ù¼ÚSÛ?à�í©à»×ËìQÚ �'ì�×:Û\ëËà�?Üi×RÚSÛ�ܽ×���Ú�×Ëà»Û�ÙËì>ã��èç�Ý�Ý�ì>×'Û�ܽڻâiâ½ß»å�Øhì!Ø�Üiâ½âÙ¼Ú/ö»ì�à'×ËìBÚ �'ì�×:Ûµë�à��\ܽ׼Ú�Û�ܽ×���Ú�×¼à�Û�Ù¼ì�ã¹à'×Ëâiß!Ø�ÜiÛ�ÙÅã�ì>Ý ó ì+á Û¹Û�à ó Ú»ã�Û�ܽá�çËâ½Ú»ã�ÚUç¼ëM���?ì�×:Û�Ý���5ì>á�à»×¼ëCå/Ø�ÙËܽâ½ì�à»ç¼ã�í�ãrÚ �?ì�Øhà»ã�äµá�Ú»×�Þ7ì�Ú»ë¼Ú ó Û�ì>ë©Û�àµìa�BÞ¼ã�Ú'á�ì�Ü�ë�ì>Ú'Ý�à'ׯôUí-Ú�ܽãrõ=�\ì>ãR�'Üi×D�¼åÜdÛ)ܽÝ�ܽ׼Ý�Û�ã�ç¼á Û�Üiö'ì�Û�à©Ý�ì>ì�Û�Ù¼ÚSÛ)Ý�à'âiö5ܽ×��¹Û�ÙËì ó ã�à'ÞËâ½ìa�¿Ý�Û�Ù¼ÚSÛ)Ù¼Ú/ö'ì ó â�Ú �»ç¼ì>ë�ÜiÛ�ì>ã�Ú�Û�ì>ëBÞRì>âiܽì�íã�ì>ö:Ü�Ý�Üià'×�ë�à5ì>Ýh×Ëà»Ûhã�ì�q:çËܽã�ì©ë�à»Ü½×��?Ú/ØhÚ/ß\Ø�ÜiÛ�ÙQë�à�?ܽ׼Ú�×¼á�ì%Ú'ÝhÚ��?ì�Û�ÙËà�ë¿í�à»ãZã�ì>Ý�à»â½ö5Üi×��á�à»×�é¼Ü½á�Û�ܽ×��?ÞRì>âiܽì�í-Ý��æ ó ã�à ó à'Ý�Ú�â¹ã�ì�â�ÚSÛ�ì>ëÂÛ�à à'çËã�àSØ�×AÜ�Ý¿Û�Ù¼ÚSÛWà�í ï Ý�òµí�à'ã�ã�ì�ö5ܽÝ�ܽ×�� ÞRì>âiܽì�íBÝ�Û�Ú�Û�ì+Ý Þ5ßá�à»×Rë�ÜdÛ�Üià'×¼Ú�âhÞ7ì�â½Üiì�í-Ýa�I�ZÙ¼Ú�Û\Øhà»ã�ä�á�Ú�×LÞRì!Û�Ù¼à»ç��'Ù'Û\à�í%Ú'Ý�ÛrÚ�ä5ܽ×��Àܽ×:Û�à�Ú»á�á�à»çË×:ÛB×¼à�Û

×Ëì>á�ì>Ý�Ý�Ú»ã�ܽâißÀÛ�ÙËìWã�ì>ö5ܽÝ�Üi×D�ÅÚ �»ì>×:Û� Ý�çË×¼á�à»×¼ë�ÜiÛ�ܽà»×RÚ�â�Þ7ì�â½Ü½ì�íUå�ÞËçËÛÆÙËÜ�ÝBá�à»×¼ë�ÜiÛ�ܽà»×RÚ�â�à'×Ëì>Ý��Ö}×BÚµÝ�ì�×¼Ý�ì�à'çËã�á�à»×¼Ý�Û�ã�ç¼á�Û�ܽà»×§Û�Ú»ä»ì+Ý�ܽ×'Û�à%Ú'á�á�à'çË×:Û"ÙËÜ�Ý"ì>×:Û�ܽã�ì�Ý�ì�Û�à�í¼á�à»×¼ë�ÜiÛ�ܽà»×RÚ�â:ÞRì>âiܽì�í-Ý���µÛ�ÙËì>ã�ë�Ü�¥7ì>ã�ì>×¼á�ì+Ý"ܽ׼á�âiç¼ëËì�Û�Ù¼ì�í-Ú»á�Û"Û�Ù¼ÚSÛ�à»çËã�Ú óËó ã�à'Ú'árÙ�Ú�â�Ý�à�ÛrÚ�ä'ì>ÝCܽ×:Û�à%á�à»×¼Ý�Ü�ë�ì�ãrÚSÛ�Üià'×Ý�à'çËã�á�ì>Ý¿à»í�Üi×Ëí�à»ã��?Ú�Û�ܽà»×AÚ�×Rë�Û�ÙËìPã�ì>â½Ú�Û�ܽö»ìÅá�ã�ì+ë�ÜiÞ¼Üiâ½ÜdÛUß1à�í�Û�ÙËì>Ý�ìÀÝ�à»çËãrá�ì+Ýa�êØ�ܽ׼Ú�â½âiß'åØ�ì?Û�ÙËÜi×¼äÀÜiÛÆÚWí-Ú�ܽãBÝUÛrÚSÛ�ìa�?ì�×:Û�Û�ÙRÚSÛ�à»ç¼ãBÚ óËó ã�à:Ú»árÙÀÜ�Ý�Þ¼Ú'Ý�ì+ëÀà'× á�â½ì>Ú�ã�ì�ã�Ý�ìa�¿Ú�×:Û�ܽá>Ú�âçË×¼ë�ì>ã ó ܽ×Ë×Ëܽ×��:Ýa�

ë)ì�ã�Ù¼Ú ó Ý�á�âià:Ý�ì+ÝUÛ�Û�àBà'çËã�Øhà»ã�äBÜ�Ý�Û�ÙËì%ã�ì+á�ì�×:Û ó ã�à ó à:Ý�Ú»âËÜi×Pï |�òRí�à'ã�á�à��BÞËܽ×Ëܽ×���ܽ×�í�à»ãR��?Ú�Û�ܽà»×¿í�ã�à��á�à»×Mé¼Ü�á Û�Üi×D�ÆÝ�à»çËãrá�ì+Ýa���¹ì§Ú»ëËë�ã�ì>Ý�Ý�ì>Ý�ÚÆá�à� ó â½ìa�?ì>×'ÛrÚ�ã�ß ó ã�à'ÞËâiì�� Û�àBà'çËãàSØ�×���ë�ì>á�ܽë�ܽ×��¿Ø�Ù¼Ú�Û�Üi×�í�à'ãR�¿Ú�Û�ܽà»×WÛ�à¿ã�ì�ÚUì>á�Û*�»Ü½ö»ì�×WÛ�Ù¼ì�Ý�çËÞRÝ�ì�ÛZà»í�ܽ×�í�à'ãR�?ܽ×��¿Ý�à'çËã�á�ì>Ýã�ì�ÚUì>á Û�Üi×D��Û�ÙËì?Üi×�í�à'ãR�¿Ú�Û�ܽà»×��ÆÖ}×��¿Ú�ä5ܽ×��WÛ�ÙËÜ�Ý�ë�ì>á�ܽÝ�ܽà»×�åFÍhÚ»×'ÛUØhì�â½â�Ú»Ý�Ý�çD�\ì+Ý%Ú��'ì�×Ëì>ã��Ú�â½Ü��+ÚSÛ�ܽà»×Là�í¹à»ç¼ã\á�ã�ì+ë�ÜiÞ¼Üiâ½ÜdÛUß�à'ã�ë�ì>ã�ܽ×��Rå�Ü½× Û�ÙËÜ�Ý\á>Ú»Ý�ìWÚ ó Ú�ã�Û�Ü�Ú�â ó ã�ìa�;à'ã�ëËì�ãBàSö'ì�ã\Ý�ì�Û�Ýà�í�Ý�à»çËãrá�ì+Ýa�/�µì?ì<y ó â½à»ã�ì>Ý©Ú�×5ç��ÆÞRì>ã©à»í�ØZÚ/ß5Ý%à�í�Üi×¼ëËç¼á�ܽ×��¯Ú ó Ú»ã�Û�ܽڻâ ó ã�ìa�;à'ã�ëËì�ãµàSö'ì�ãÝ�ì>×'Û�ì�×¼á�ì>Ý�Þ¼Ú»Ý�ì>ë à»× Û�ÙËÜ�Ý�à'ã�ëËì�ã�Üi×��Rå»Ø�ÙËÜ�árÙWá>Ú�׿Û�ÙËì�×WÞRì©ç¼Ý�ì>ë Û�àÆëËì�Û�ì>ãR�?ܽ×Ëì§ÚBÝ�çËÞ¼Ý�ì�Ûu-Ú�âiÛ�Ù¼à»ç��'Ù ×Ëà�ÛhÚ�â½âLw�à�íCÛ�Ù¼ì%Ý�ì�×:Û�ì>×¼á�ì+Ý�Û�à�ã�ì�ÚUì>á�Û����ZÙËì ó ã�à ó à:Ý�Ú»âRÚ�â�Ý�à�ë�Üs¥Hì�ãrÝ�Ø�ÜdÛ�Ù à»çËãrÝÜi×BÛ�Ù¼Ú�Û�Û�ÙËìZÝ�à'çËã�á�ì>Ý�à�í¼Ü½×�í�à'ãR�¿ÚSÛ�Üià'×BÚ�×Rë�ã�ì>Ý�çËâdÛ�Üi×D�%Þ7ì�â½Üiì�í7Ý�Û�Ú�Û�ì>Ý�Ú�ã�ì�ì+Ý�Ý�ì�×:Û�ܽڻâiâ½ß�ÞRì>âiܽì�íÝ�ì�Û�Ý��:×¼à»×M�;Û�ã�Üiö5Ü�Ú�â7á�à»×Rë�ÜdÛ�Üià'×¼Ú�â7Þ7ì�â½Üiì�í-ÝhÚ»ã�ì%×Ëà�ÛZÚ»á>á�à»ç¼×'Û�ì>ë?í�à'ã���Ö}×WÚ'ëËë�ÜiÛ�ܽà»×�å:Û�ÙËì%Øhà»ã�äë�à5ì>ݧ×Ëà»ÛBÚ»ëËë�ã�ì>Ý�Ý%Û�ÙËì ó ã�à'ÞËâiì�� à�íZá�à��BÞËܽ×Ëܽ×��!Û�ÙËì+Ý�ì?Þ7ì�â½Üiì�í�Ý�Û�Ú�Û�ì+Ýa�Ç�ZÙËì¿ë�ì��»ã�ì�ì\Û�àØ�ÙËܽárÙBÛ�ÙËì�í�ãrÚ �?ì�Øhà»ã�ä©á>Ú ó Û�çËã�ì>Ý�à»çËã�Üi×:Û�çËÜdÛ�Üià'×¼Ý)ܽ×ÆÝ ó ì+á�Ü�rRá�ëËà�¿Ú�ܽ׼Ý�ë�ì>Ý�ì�ã�ö»ì+ÝCí�çËã�Û�ÙËì>ãã�ì+Ý�ì+Ú�ãrárÙ��

�µÛ�ÙËì>ã�ã�ì>â½Ú�Û�ì>ë\ã�ì>Ý�ì>Ú»ã�árÙÆÜi×¼á�âiçRë�ìBï ÃSòRØ�ÙËÜ�árÙ¿Ú óËó ã�à:Ú»árÙËì+Ý�ܽ×�í�à'ãR�¿ÚSÛ�Üià'×\Ú��'ã�ì��'ÚSÛ�Üià'×í�ã�à��^Ú ó à'Ý�Ý�ÜiÞËܽâ½Ü½Ý�Û�Ü�á!âià��»Ü�á ó à»Ü½×:Û à�íµö5ܽì�Ø�åZÚ�×¼ë1Ý�ì>ö»ì>ã�Ú»â ó Ú ó ì>ã�Ý\ܽ×1Ú�Ý ó ì+á�Ü�Ú�â�Ü�Ý�Ý�çËìà�í�ì7û¼ürýSþ#�^� ïið(í/òµØ�ÙËÜ�árÙ1Ú»â½Ý�à Ý�ì>ì�ä Û�à ì<y5Û�ì>×¼ëÂÛ�ÙËìÅæ¹é§î�í�ãrÚ �?ì�Øhà»ã�ä�Û�à ë�ì+Ú�â¹Ø�ÜiÛ�Ù×Ëà»×M� ó ã�Üià'ã�ÜiÛ�Ü���ì+ë ã�ì�ö5Ü�Ý�ܽà»×��

î ïð«�ñ>Õ<«;òóò»Ó�¯)Õ�ÑÆÎØ�ܽãrÝUÛ+å:Ú�Þ¼ÜdÛhà�í�ÝUÛrÚ�×¼ë¼Ú�ãrëÆ×¼à�Û�Ú�Û�ܽà»×���x�ì%Ú»Ý�Ý�ç��?ì¹Ý�à��\ìµâ½Ú»×��»ç¼Ú�»ì À ��æ=Ø�à'ã�â�ë�ô=ܽÝhÚ�×Üi×:Û�ì>ã ó ã�ì�Û�ÚSÛ�Üià'×�àSö»ì>ã À å»Ú�×Rë�ØhìhÝ�Ú/ß©Û�Ù¼Ú�Û�í�à'ã�Ú%Ý�ì�×:Û�ì>×¼á�ì�¿oË À å ôpõ öI¿ÆÜ�¥/¿Æì>öSÚ�â½ç¼ÚSÛ�ì>ÝÛ�à%Û�ã�çËì�ܽ×�ô/��é©Üiö'ì�×ÆÚ©Ý�ì�Û)à�í¼Øhà»ã�â½ë¼ÝFÅ Ú»×¼ëBÚ©Ý�ì�×:Û�ì>×¼á�ì�¿"åD÷�¿�÷*öùø(ôùË�ÅÌõ!ôpõ öj¿8ú��Ö�íT¿WÚ�×¼ë�û�Ú�ã�ì¹Ý�ì�×:Û�ì>×¼á�ì+Ý�å»Û�ÙËì>×e¿�õ ößû§Üs¥�ücôùËI÷£¿H÷3È�ôýõ öþûT��æ¹â�Ý�à¼å»Ü½×?Û�Ù¼ì¹Û�ã�ì>Ú�ÛR�?ì�×:Û

125

Page 130: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

Û�Ù¼Ú�ÛÆí�à'âiâ½àSعÝBØhì��?Ú»ä»ìWç¼Ý�ì!à�í©ÚÀ×5ç��ÆÞRì>ã\à»í�u ó ã�ìa�£wUà»ãrë�ì�ãrÝa� é©Ü½ö»ì�×1ÚPÝ�ì�Û�ÿ Ú»×¼ëLÚu ó ã�ì<��w}à'ã�ë�ì>ã��FàSö»ì>ã�ÿÆå�Ø�ì�ë�ìar¼×Ëìt�?Üi×�u��eÈ#ÿtw1öÁø��I˦ÿpõ�ü��Û˦ÿ�È������ú��Ä�ì�Û�ç¼Ý�Ý�Û�Ú»ã�Û�Þ5ßÆí�à»ã��?Ú»âiâ½ßÆëËì<r¼×Ëܽ×���Þ7ì�â½Ü½ì�í�ÝUÛrÚSÛ�ì>Ý���ؼà»ã�ã�ì>Ú'Ý�à'×¼Ý�Û�ÙRÚSÛ�Ø�ܽâiâRÞRìÂ�¿Ú»ë�ì

á�â½ì>Ú�ã+å�Øhì§Ý�Ù¼Ú»âiâ�á>Ú�â½âHÛ�ÙËì��º�3�Hý3�Rÿ3�?ý �T��Þ7ì�â½Üiì�í�Ý�Û�Ú�Û�ì+Ýa���� ��������������F÷Â��u¬Ú�×Ëà'×:ßT�?à»çRÝ�w)Þ7ì�â½Ü½ì�í"Ý�Û�ÚSÛ�ì�u-àSö»ì�ãNÅùwÂ�L�k���D�3� þtu^Å È���w���û¼ü�þ�ü=Å�L�/�Ç��ü�ú�ý��N�¼ý!�4�#�^�<�½ü���ýSþ#���3�/� �c���ª�L�/� ú�ý�úX� �T�7þ�ü<��ý�þR�»ü�þÆý3�Sü�þ;ŪÓxPì?ç¼Ý�ì��Û�àQë�ì�×Ëà»Û�ìÆÛ�ÙËì¿Ý�Û�ã�ܽá�Û©ö'ì�ãrÝ�ܽà»×Àà�í��e��Ö}×PÛ�ÙËÜ�ݧÚ�ã�Û�Ü�á�â½ìBÛ�ÙËì?Ý�ì�ÛeÅ Ø�ܽâ½â�×¼à�Ûó â½Ú/ß�Ú¯ã�à»â½ì»å�Ú»×¼ë á�Ú»× ÞRìWÚ»Ý�Ý�ç��?ì+ëÀÛ�àÅÞ7ìÇr�y�ì>ëB��x�ìWë�ì>×Ëà�Û�ì¿Þ5ß��� ¿Û�ÙËì¦ÔªÚ �'×Ëà'Ý�Û�Ü�á ÞRì>âiܽì�í�ÝUÛrÚSÛ�ì»åËܽ×QØ�ÙËܽárÙ!�3ܽÝhÛ�ÙËì�á�à�� ó âiì�Û�ì�ã�ì�â�ÚSÛ�ܽà»×8�

�"à�r¼ãrÝUÛ�Ú óËó ã�à!y�Ü��?Ú�Û�ܽà»×�å'Û�Ù¼ì©Þ7ì�â½Ü½ì�í"í�çRÝ�ܽà»×!à ó ì�ãrÚSÛ�à»ã�Øhì§Ø�Üiâ½âCë�ì<rR×Ëì�Ú»á�á�ì ó ÛrÝ�ÛUØ�àÞRì>âiܽì�í§ÝUÛrÚSÛ�ì+ÝÆÚ»×¼ë ó ã�à�ë�ç¼á�ì>Ý\ÚÅÛ�ÙËÜiãrëB�j�¹àSØhì�ö'ì�ã+å)Üi×Âà»ãrë�ì�ãBí�à»ãÆÛ�ÙËì!à ó ì�ãrÚSÛ�à»ã�Û�àPÞ7ì�\ì+Ú�×Ëܽ×��»í�çËâ¬å�ÜdÛÆØ�Üiâ½â�ã�ì�q:çËܽã�ì?Ú»ëËëËÜdÛ�Üià'×¼Ú�â�Ü½× ó çËÛ>å�Ø�ÙËܽárÙ"å"ܽ×'Û�çËÜiÛ�ܽö»ì�â½ß»å�Ø�Üiâ½â�Ú'ë!ÚUç¼ë�Ü�á�Ú�Û�ìÞRì�ÛUØ�ì>ì�×!Û�ÙËì§ÛUØhà\ÞRì>âiܽì�í�Ý�Û�ÚSÛ�ì>ÝZØ�ÙËì>ã�ì©Û�ÙËì�ß!ë�ܽÝ�Ú �'ã�ì>ì�Ö�۵ܽݹÛ�ì�� ó Û�ܽ×�� Û�à ã�ì+Ý�à'âiö'ì�á�à»×�é¼Ü½á�Û�ݵÞ:ßQë�ì+á�â�Ú�ã�Üi×��¿à'×Ëì�Ú �'ì�×:Û�u#"�w=�?à»ã�ìBá�ã�ì+ë�ÜiÞ¼âiìÛ�Ù¼Ú»×�Û�ÙËì à»Û�ÙËì>ã�u�$kw�Ú�×Rë�Ù¼Ú/ö»ì?ÙËÜ�Ý*ÚUç¼ë���?ì�×:Û�ÝBë�à�?ܽ׼ÚSÛ�ì�o� ó ì+á�Ü�rRá�Ú»âiâ½ß»å�à»×¼ì�á�à'çËâ½ëë�ì<r¼×¼ì�í�ç¼Ý�ì>ëLÞRì>âiܽì�í©ÝUÛrÚSÛ�ì%$'&)(*" Û�àPÞRìWÛ�ÙËì�þ�ü�Ð1�Hü<�¿ü<�Rú�à»í�" Þ5ß+$/�j�µì�ã�ìWÜ�Ý�Û�ÙËì

ë�ì<r¼×¼ÜdÛ�Üià'×!à»í�Û�ÙËÜ�Ý�ÝUÛ�ã�Ú/Ø=�X�¿Ú�× í�çRÝ�ܽà»×Qà ó ì�ãrÚSÛ�à»ã+å,&(.- �$/& ( - "pöêøhu^ô � È�ô × wb�

uæô × È�ô � w10Ë2"3(�u�u^ô � ÈRô × w=Ë2"34�u^ô × ÈRô � w50Ë6$Âw#ú87Ö}×�à»Û�ÙËì>ã�Øhà»ãrëËÝ�å"Ø�ì?Øhà»çËâ�ë�á�à»×RÝUÛ�ã�ç¼á�Û§Û�ÙËì¿í�çRÝ�ì+ëPÞ7ì�â½Ü½ì�í¹ÝUÛrÚSÛ�ì ڻݩí�à»â½â½àSعÝa�§í�à'ã�ì>Ú»árÙó Ú�ܽã�à�íhØ�à'ã�â�ëËÝ>åCØ�ÙËì�×Ëì>ö»ì>ã%Û�ÙËìÇ�?à»ã�ì\á�ã�ì+ë�ܽÞËâiì¿Ú�»ì�×:Û�ÝUÛ�ã�Ü�á Û�âiß ó ã�ì�í�ì�ãrݧà»×Ëì\Øhà»ã�â½ëÅÛ�àÛ�ÙËì�à»Û�ÙËì>ã>å¼Ø�ì�Ý�ܽëËì�Ø�ÜdÛ�ÙQÛ�ÙËÜ�Ý ó ã�ì�í�ì>ã�ì>×¼á�ì���Ö}×Åá�Ú'Ý�ì+ÝZØ�ÙËì�ã�ì§Û�ÙËìe�?à:ÝUÛµá�ã�ì+ë�ÜiÞ¼âiì�Ú�»ì�×:Ûټڻݩ×Ëà ó ã�ì�í�ì�ã�ì�×Rá�ì»åCØhì\í�à»â½âiàSØÛ�ÙËì?ãrÚ�×Ëä5ܽ×���à»í�Û�ÙËì?â½ì>Ý�Ý�á�ã�ì>ë�ܽÞËâ½ì\Ú�»ì>×'Û(�%�µÚ�Û�çËãrÚ�â½â½ß»å& (.-Bܽݩ×Ëà»Û�Ú�Ý�ßh���?ì�Û�ã�Ü�áÆà ó ì�ãrÚSÛ�à'ã��e�ZÙËÜ�Ý©à ó ì�ãrÚSÛ�à»ã©Ü½Ý©Ü½âiâ½ç¼Ý�Û�ãrÚSÛ�ì+ëÅܽ×�Ø�Ü��»çËã�ìWð��/�ZÙËìë�à�ÛrÝ�â�Ú�Þ7ì�â½ì>ë�Ø�ÜdÛ�Ù�â½àSØ�ì>ã��}á�Ú'Ý�ìZâ½ì�Û�Û�ì�ãrÝZÚ�ã�ì¹Ø�à'ã�â�ëËÝ��»Û�ÙËì§á�ܽã�á�âiì+Ý�ã�ì ó ã�ì>Ý�ì�×:Û�ì(q:çËÜiöSÚ»âiì>×¼á�ìá�â�Ú»Ý�Ý�ì+Ý�à»í�Øhà»ã�â�ëËÝa�

Ø�Ü��»ç¼ã�ìPð�²�ZÙ¼ìWÝ�Û�ãrÚ/Ø=�&�¿Ú�× í�ç¼Ý�ܽà»×Âà ó ì�ãrÚSÛ�à»ã²u-ÞRì>âiܽì�í§Ý�ì�Û�Ý\ܽ×Lì>Ú'árÙ Þ7ì�â½Üiì�í©Ý�Û�ÚSÛ�ìWÚ»ã�ìÙËÜ��'ÙËâ½Ü��'Ù'Û�ì>ëDw<�

�ZÙËܽݹܽÝ%Ú?Ø�ì>âiâ���ëËì<r¼×Ëì+ë!à ó ì�ãrÚSÛ�ܽà»×Qܽ×!Û�Ù¼Ú�ÛµÜdÛ ó ã�à�ë�çRá�ì>ݵÚ\Û�à�ÛrÚ�â ó ã�ìa�;à'ã�ëËì�ã(�1�¹àSØ=�ì�ö»ì>ã>å�Û�ÙËì>ã�ì?Ü�Ý�Ú ó ã�à'ÞËâ½ìa� Ø�ÜiÛ�Ù�Û�ÙËܽÝ�ëËì<r¼×ËÜiÛ�ܽà»× ó ì�ã�Û�Ú»Üi×¼Üi×��!Û�àWÛ�ÙËì¿ÜiÛ�ì�ãrÚSÛ�Üià'×Pà»í�Û�ÙËì

126

Page 131: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

à ó ì�ãrÚSÛ�à'ã���Í�à»×¼Ý�Ü�ë�ì�ã�Û�ÙËã�ì�ì¹Þ7ì�â½Üiì�í"ÝUÛrÚSÛ�ì>Ý�æBå�91Ú�×Rë�Í Ø�ÜiÛ�Ù�Üi×¼á�ã�ì+Ú»Ý�Üi×D��à»ãrë�ì�ã�à»í�ë�à���Üi×¼Ú»×¼á�ì�u-æ ë�à��\ܽ׼Ú�Û�ì>ëÀÞ5ß:9%åCÚ»×¼ë¯Þ7à�Û�Ù�Þ5ß�Í=w#��ë�ã�ì>Ý�ç��¿Ú�Þ¼âiß'å¼Û�ÙËì\Ú»ÞRàSö'ìÆë�ìar¼×ËÜiÛ�ܽà»×Ø�à'çËâ½ëÎ�»Ü½ö»ì��?ì+Ú�×Ëܽ×���í�ç¼âRܽ×:Û�ì�ã ó ã�ì�ÛrÚSÛ�ܽà»× Û�à²u#$!& (.-;"�w<& (.-�=\å5Ý�ܽ׼á�ì'å:ܽ×:Û�çËÜiÛ�ܽö»ì>âiß¿Ý ó ì+Ú�äh�Üi×��Rå�Ú»âiâZÛ�ÙËìQÜi×�í�à'ãR�¿Ú�Û�ܽà»×Lܽ×>= ë�à��\ܽ׼Ú�Û�ì>Ý¿Ú»âiâhÛ�ÙËìQÜi×�í�à'ãR�¿Ú�Û�ܽà»×1Üi×?$�&@(.-;"Î�?9hç�ÛØ�Ù¼ÚSÛ�Ú�Þ7à»ç�Û�u�$>&A(.-�=ew�&A(.-;"CBþ�µì�ã�ìBÜiÛ©Øhà»çËâ�ëÅÝ�ì�ì�� Û�Ù¼ÚSÛ�Ý�à�?ìBà»í�Û�ÙËì\ܽ×�í�à»ã��¿ÚSÛ�ܽà»×Üi×D$?&E(F-�= ë�à�?ܽ׼ÚSÛ�ì>ݹÛ�ÙËì\Üi×Ëí�à»ã��?Ú�Û�ܽà»×¯Ü½×D" u-ÞRì+á�Ú»ç¼Ý�ìBÜdÛ©à»ã�Ü��'Üi×¼Ú�Û�ì+ë!í�ã�à��G=ew%Ú�×¼ëÝ�à��\ì§Ü�ݹë�à��\ܽ׼Ú�Û�ì>ëWÞ5ß ÜiÛ%u�Þ7ì>á�Ú»ç¼Ý�ì�ÜdÛ¹à'ã�Ü��»Ü½×¼ÚSÛ�ì>ë í�ã�à��H$Âw<�

�ZÙËì ó ã�à»ÞËâ½ìa� ܽÝ�Û�Ù¼Ú�Û�Û�ÙËìµÝ�Û�Ú�×RëËÚ�ãrë?ÞRì>âiܽì�í�Ý�Û�Ú�Û�ì¹Ü�Ý�×Ëà»Û�ã�ܽárÙ?ì>×Ëà»ç��'Ù?Û�à�ã�ì ó ã�ì+Ý�ì>×'ÛÛ�ÙËì�Ý�à»çËãrá�ì�à�í¼ì+Ú»árÙBܽ×�í�à'ãR�¿ÚSÛ�Üià'×BÜiÛ�ì��!å»Ø�ÙËܽárÙÆܽÝ�Û�ÙËìZã�ì+Ú»Ý�à»×�ØhìhÛ�ì�ã���ÜiÛbÔªÚ�×¼à»×5ßT�\à'ç¼Ý����%çËãWÚ'á Û�çRÚ�â%ëËì<r¼×ËÜiÛ�ܽà»×FØ�Üiâ½â%ì>×Ëã�ܽárÙ=Þ7ì�â½Ü½ì�íBÝ�Û�ÚSÛ�ì>Ý�Ø�ÜdÛ�Ù=Û�ټܽݲ�?ܽÝ�Ý�ܽ×�� ܽ×�í�à'ãR�¿ÚSÛ�Üià'×���"à!ë�ì�ö'ì�â½à ó ܽ×:Û�çËÜiÛ�ܽà»×Àí�à»ã%Û�ÙËìÆí�à»â½âiàSØ�ܽ×���ëËì<r¼×ËÜiÛ�ܽà»×RÝ�åCÜ��¿Ú �»Ü½×Ëì\Ú�Ý�ì�Û§à�í�ܽ×�í�à»ã��¿ÚSÛ�ܽà»×Ý�à'çËã�á�ì>Ý�Ú»×¼ë¯Ú¿Ý�ì�Ûµà�í�Ú�»ì�×:ÛrÝa�b�ZÙËìBÝ�à»çËãrá�ì+Ý�á�Ú�×QÞ7ì�Û�ÙËà»ç��'Ù:Û¹à�í�Ú»Ý ó ã�Ü��?ÜdÛ�Üiö'ìBÚ �'ì�×:Û�ÝØ�ÜdÛ�Ù�r�y�ì>ëou-Ú»×Ëà»×5ßT�?à»ç¼Ý4w�Þ7ì�â½Üiì�í"ÝUÛrÚSÛ�ì+Ýa�Hç�Ú'árÙ¿Ý�à'çËã�á�ì�Üi×Ëí�à»ã��?Ý�Ý�à��\ì�à»íHÛ�ÙËì©Ú �»ì>×:Û�Ý)à�íÜdÛrݵÞ7ì�â½Üiì�í�Ý�Û�ÚSÛ�ì�Hܽ×Åì<¥Hì>á�Û>åCì>Ú'árÙÅÝ�à'çËãrá�ì�à¥7ì>ã�ݹÛ�ÙËìÆà ó ܽ×Ëܽà»×¯Û�Ù¼ÚSÛ§á�ì�ã�Û�Ú»Üi×ÀØ�à'ã�â�ëËݹڻã�ì�\à'ã�ì§â½Üiä'ì�â½ß¿Û�ÙRÚ�×Qà�Û�ÙËì�ãrÝ�å�Ú»×¼ë!ã�ì��?Ú»Üi×RÝZ×Ëì�ç�Û�ã�Ú»â�Ú�Þ7à»ç�Û�à»Û�ÙËì>ã ó Ú�ܽã�Ý��æ¹×?Ú �»ì>×:Û� Ý�Þ7ì�â½Ü½ì�íCÝUÛrÚSÛ�ìZÜ�Ý�Ý�Ü�� ó â½ß�Û�ÙËì�Ú�?Ú»â��:Ú �¿ÚSÛ�Üià'×Bà»í7Ú�â½â:Û�ÙËì+Ý�ì�à ó ܽ×ËÜià'×¼Ý>åSì>Ú»árÙÚ�×Ë×Ëà»Û�Ú�Û�ì>ë Þ:ß�ÜiÛ�Ý\à»ã�Ü��»Ü½×ßu�à»ã�ô ó ì>ëËÜ��'ã�ì>ì>õ�w<���µí%á�à»çËãrÝ�ì'å�Û�ÙËì+Ý�ìWà ó ܽ×Ëܽà»×¼ÝÆÜi×è�»ì>×Ëì�ãrÚ�âá�à»×�é¼Ü½á�ÛhØ�ÜiÛ�ÙWà»×Ëì©Ú�×Ëà»Û�ÙËì>ã>å5Ú�×Rë?Û�ÙËì§Ú �'ì�×:ÛN�Æç¼Ý�Ûhã�ì>Ý�à»â½ö»ì�Û�ÙËì>Ý�ì%á�à»×MéRܽá�Û�Ý�Üi×Wà»ãrë�ì�ã�Û�àÚ�ã�ã�ܽö»ì¹Ú�Û�Ú�á�à'ÙËì�ã�ì�×:Û�Þ7ì�â½Üiì�í�Ý�Û�ÚSÛ�ì�H�ZÙËì�ã�ìµÚ»ã�ì�öSÚ�ã�ܽà»ç¼Ý ó â½Ú»ç¼Ý�ÜiÞËâ½ì¹ØZÚ/ß�Ý�à�í ó ì�ã�í�à»ã��\ܽ×��Û�ÙËÜ�Ý�ã�ì>Ý�à»â½ç�Û�Üià'×���Ö}× Û�ÙËÜ�Ý ó Ú ó ì�ã�ØhìµÚ'Ý�Ý�ç��?ì�Û�ÙRÚSÛ�Û�ÙËì%Ú�»ì>×'Û ó â½Ú'á�ì+Ý�Ú�Ý�Û�ã�ܽá�ÛBô�á�ã�ì+ë�ܽÞËÜiâ��ÜdÛUß�õ¿ãrÚ�×Ëä5ܽ×��¿à»×QÛ�ÙËìÆÝ�à»ç¼ã�á�ì>Ý>å¼Ú�×¼ëÀÚ»á�á�ì ó ÛrÝZÛ�Ù¼ìBÙËÜ��»ÙËì+ÝUÛR�;ãrÚ�×¼ä»ì>ë!à ó Üi×¼Üià'×¯à ¥Hì�ã�ì>ëQà»×ì�ö»ì>ã�ß ó Ú�ܽã�à�í)Ø�à'ã�â�ëËÝ��

�ZÙËì©í�à»â½âiàSØ�ܽ×��¿ë�ì<rR×ËÜdÛ�Üià'ׯá�à»×RÝ�Ü�ë�ì�ãrÝhà»×Ëâ½ßÇr¼×¼ÜdÛ�ì�Ý�ì�Û�Ý�à»í�Ý�à»çËãrá�ì+Ýa�:Û�ټܽÝ�ã�ì>Ý�Û�ã�Ü�á Û�ܽà»×á�Ú�×!Þ7ì�ã�ì>â½Ú y5ì+ë�Ú�Û�Û�ÙËì ó ã�ܽá�ì§à�í�á�à� ó â½Ü�á�ÚSÛ�Üi×D�ÆÛ�ÙËì�Ý�ç¼Þ¼Ý�ì(q:çËì�×:Ûµë�ì>ö»ì�â½à ó �\ì>×:Ûhܽ×!Û�ÙËÜ�Ýó Ú ó ì�ã(���� ������������JI øb���Süa���7Ð1�D� ú}ü;��ü�ú�ý����3�Hý3�Rÿ3�?ý �T�;�rüa�J�-ü��=� úX��ú�ü<�FK/L/M ú�ûËü ó ì+ë�Ü��»ã�ì�ì>ëÞRì>âiܽì�í�ÝUÛrÚSÛ�ì¦u�àSö»ì>ãeÅùw©Ü½×¼ë�çRá�ì>ë�Þ5ßNK �L���e�4�M�C��ú&�-ý3�+O �BÅQP�ÅSRT }VUXWZY\[^]\_��4�D�rûú ûD�SúO�uæô � ÈRô × w7öùøhu^Å È���wbË2Kè� ô � �èô × úa`�øb� úMÓxPì?Ø�ܽâiâ)ç¼Ý�ìcM Û�àQë�ì�×¼à�Û�ìÆÛ�Ù¼ì?Ý�ì�Û§à�íhÚ�â½â�à»í�Ý�à»çËãrá�ì+ݵàSö'ì�ãÂÅ å�Ú»×¼ë¯Û�ÙËã�à'ç��»Ù¼à»ç�Û%Û�ÙËÜ�Ýó Ú ó ì�ã\Ø�ì!Ø�ܽâiâ�á�à'×¼Ý�ܽë�ì>ã ó ì>ë�Ü��»ã�ì�ì+ë Þ7ì�â½Üiì�í%Ý�Û�ÚSÛ�ì>ÝÆÛ�ÙRÚSÛ¿Ú�ã�ì Üi×Rë�ç¼á�ì+ë Þ5ß Ý�çËÞ¼Ý�ì�Û�ÝÆà�íM��¦�¹à»Û�ì¿Û�Ù¼ÚSÛBÞ7à�Û�Ùþøú�Ú»×¼ë��� ܽ׼ë�ç¼á�ì¿Û�ÙËì!Ý�Ú�?ì ó ì+ë�Ü��'ã�ì>ì>ë�Þ7ì�â½Üiì�í¹ÝUÛrÚSÛ�ì���Ø�ì�Ø�Üiâ½âë�ì�×Ëà»Û�ì ÜiÛBÛ�à5àÅÞ5ß!d �¦Ø�ܽ׼ڻâiâ½ß»å)Øhì Ø�Üiâ½âhç¼Ý�ì�d8e*fhg!Û�àÀë�ì>×Ëà�Û�ì Û�ÙËì ó ì>ë�Ü��»ã�ì�ì+ë�ÞRì>âiܽì�íÝUÛrÚSÛ�ì§Ü½×¼ë�ç¼á�ì>ë!Þ5ßiM��

�¹ì<y5Û¹Øhì�ë�ì<r¼×¼ì�Ú ó Ú�ã�Û�Ü�á�çËâ�Ú�ã ó à»â½Ü½á�ß í�à'ã�ã�ì+Ý�à'âiö5ܽ×��\á�à»×MéRܽá�Û�Ý�Ø�ÜiÛ�ÙËÜ½×¯Ú ó ì+ë�Ü��»ã�ì�ì>ëÞRì>âiܽì�í�Ý�Û�Ú�Û�ì��;xPì\Ú»Ý�Ý�çD�\ìÆÚ ÝUÛ�ã�Ü�á Û©ã�Ú»×Ëä5Üi×D�:j à'×6MÉu¬Ú�×¼ëQÛ�Ù5ç¼Ý%Ú»â½Ý�à¿à»×ÅÛ�ÙËì\Ý�à'çËã�á�ì>ÝÛ�Ù¼Ú�Û!ܽ׼ë�ç¼á�ì�Ú»×5ß ó Ú�ã�Û�Ü�á�çËâ�Ú�ã:Otw#�§Û�ÙËì�Ý�Û�ã�Ü�á Û�×¼ì>Ý�ÝWà»í�Û�ÙËì�ãrÚ�×Ëä5ܽ×�� Ü�ÝQÚLÝ�Ü��'×ËÜsr7á�Ú�×:Ûã�ì+ÝUÛ�ã�Ü�á Û�Üià'×WÛ�Ù¼ÚSÛ%Ø�ì�ë�Ü�Ý�á�ç¼Ý�Ý�í�çËã�Û�Ù¼ì�ã¹Ü½×¯Û�Ù¼ìer¼×¼Ú»â"Ý�ì+á Û�Üià'×��=xPì�ܽ×:Û�ì>ã ó ã�ì�Û5� � jJ� × Ú»ÝÔE� × Ü½Ýk�?à»ã�ìÆá�ã�ì+ë�ܽÞËâiìÆÛ�ÙRÚ�×+� � ��ÆæµÝ©ç¼Ý�ç¼Ú�â;åCØ�ì?ë�ìar¼×Ëì2k�åHã�ì+Ú»ë=ô�ڻݩá�ã�ì>ë�ܽÞËâ½ì\Ú'Ý�õ¼åCÚ»ÝÛ�ÙËì�ã�ì<é¼ìay�Üiö'ì©á�âià:Ý�çËã�ì§à�íljt�

xPì\Ú�â�Ý�à�Ú'Ý�Ý�ç��?ì�Û�Ù¼ÚSÛm�� BÜ�ݹÛ�ÙËìÆâ½ì>Ú'ÝUÛ©á�ã�ì>ë�ܽÞËâ½ìÆÝ�à»çËãrá�ì'å¼Ø�ÙËÜ�árÙ¦�¿Ú/ß��\ì>ã�ÜiÛ©Ý�à�?ìì<y ó â�Ú�×¼Ú�Û�ܽà»×��§Ö�Ût�\Ü��»Ù:Û§ÞRì?Ú»Ý�ä»ì+ë¯Ø�Ù5߯ì�q:ç¼Ú�Û�ì�Û�ÙËì��?à'Ý�Û©Ú �'×Ëà'Ý�Û�Ü�áBÝ�à»ç¼ã�á�ìBØ�ÜiÛ�ÙÅÛ�ÙËì

127

Page 132: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

âiì+Ú»Ý�ÛÆá�ã�ì>ë�ܽÞËâ½ì¿à»×Ëì��QÖ}×�í-Ú»á ÛÆØ�ì�ëËà»×� ÛBÙ¼Ú/ö'ì?Û�à¼å)ÞËçËÛÆÝ�ܽ׼á�ì Üi× Û�ÙËì�ëËì<r¼×ËÜiÛ�ܽà»×RÝ�Û�ÙRÚSÛí�à»â½âiàSØ�åSÚ �'×Ëà'Ý�Û�Ü�á�Ü�ÝR�FÜ�Ý"àSö'ì�ã�ã�Ü�ëËë�ì>שÞ5ß�Ú�×5ß©à ó ܽ×Ëܽà»×�ã�ìa�:Ú�ãrë�â½ì>Ý�ÝHà�í¼á�ã�ì>ëËÜiÞËܽâ½ÜdÛUߧãrÚ�×Ëä5ܽ×��¼åØ�ìe�?Ü��»Ù:ÛµÚ»Ý�Øhì�â½â"Ú'Ý�Ý�ç��?ì§Û�Ù¼Ú�Û%Ú�â½â"Ú�»×Ëà:ÝUÛ�ܽá�ܽÝR� à»ã�Ü��»Ü½×¼ÚSÛ�ì>Ýhí�ã�à� Û�ÙËì�â½ì>Ú'ÝUÛµá�ã�ì+ë�ÜiÞ¼âiìÝ�à'çËã�á�ì»å5Ø�ÙËÜ�árÙQØ�Üiâ½â ó ì�ã��\ÜiÛµÝ�Ü�� ó â½ì�ã¹ë�ìar¼×ËÜiÛ�ܽà»×¼Ý��Ö}×'Û�çËÜiÛ�ܽö»ì�â½ß»åÂ�»Ü½ö»ì>×FÚ ó ì>ëËÜ��'ã�ì>ì>ë=Þ7ì�â½Ü½ì�í\Ý�Û�Ú�Û�ì�OÆå�OCn Ø�Üiâ½â§ã�ì�Û�Ú�ܽ×Aí�ã�à�oO Û�ÙËìÙËÜ��'ÙËì>Ý�Û���ãrÚ�×Ëä'ì>ë à ó ܽ×ËÜià'×!Ú»ÞRà'ç�Û�Û�Ù¼ì�ã�ì>â½Ú�Û�ܽö»ì©âiܽä»ì>âiܽÙËà5à�ë�Þ7ì�ÛUØhì�ì>×QÚ»×5ß¿ÛUØ�à\Øhà»ã�â½ëËÝ����� ������������>p øb���Süa��ÅÉÒ�M;ÒqO �3�C�CjÉ�!�k����ý �/ü<Ò�ú û¼ühë�à�?ܽ׼ÚSÛ�Üi×��BÞRì>âiܽì�í"Ý�Û�Ú�Û�ì¿ý���O�L�%ú�ûËü��4�M�C��ú&�-ý3��O n �ÅQP�ÅrRTsKþ�#�D�rûWú ûD�Súcücô � È�ô × Ë�Å ú û¼ü���ý3���½ýt�����h�?ûËý ���!�;ulvæ��?Ú y�uwO�uæô × È�ô � wRwcjù�¿Ú yBuxO�uæô � È�ô × wRw¿ú�ûËüa�NO n uæô � È�ô × wtöÉ�¿Ú3y�uwO�u^ô � È�ô × w�waÓ�yZú ûM�ü�þz���L��ü<Ò�O n uæô � È�ô × wNöJ�� hÓ {

Í�âiì+Ú�ã�âiß'å�í�à»ã©Ú�×5ß²ô � È�ô × Ë�Å ì�ÜiÛ�ÙËì>ã�OCn1uæô � È�ô × wbö|� à'ã�OCn1u^ô × È�ô � wbö}� à»ãÞRà»Û�Ù���5à�?ì�Ø�ÙRÚSÛ¹Ý�ç¼ã ó ã�Ü�Ý�ܽ×��»â½ß»å,O n Üi×Rë�ç¼á�ì+ݹÚ?ÝUÛrÚ�×¼ë¼Ú�ãrë�u¬Ú�×Ëà'×:ßT�?à»çRÝ�w�ÞRì>âiܽì�í�Ý�Û�Ú�Û�ì�

��� ������������?~ ì7û¼ü\à»ãrë�ì�ã�ܽ×��¯Üi×Rë�ç¼á�ì+ë Þ5ß�OCnp�L� ú�ûËü!þ�üa���Sú&�-ý3�vÆ��þÅ�PÛÅ �4�D�rûú ûD�Sú�ô � Æ¢ô × � ¤�OCn1uæô × ÈRô � w1ö'� ÓxPì�ë�ì>×Ëà�Û�ì§Û�ÙËì�Ý�Û�ã�Ü�á Û�ö»ì>ã�Ý�Üià'×�à»íbÆ Þ:ßD�e����b�<�F���������#�<�'�ùÆ �L�/� ú�ý�úX� �T�7þ�ü<��ý�þR�»ü�þÆý3�²ÅªÓ

�ZÙ:çRÝ Ú ë�à�?ܽ׼ÚSÛ�Üi×D� ÞRì>âiܽì�í�Ý�Û�ÚSÛ�ì¯Ü½ÝWÚÛ�»ì>×Ëì�ãrÚ�â½Ü��+ÚSÛ�Üià'×Là�í©Û�Ù¼ìÀÝUÛrÚ�×¼ë¼Ú�ãrëL×Ëà»Û�ܽà»×�à�íu-Ú�×¼à»×5ßT�\à'ç¼Ý4w�Þ7ì�â½Ü½ì�í�ÝUÛrÚSÛ�ì»å�ã�ì ó ã�ì>Ý�ì�×:Û�ܽ×��\Û�Ù¼ì�Ú �»ì>×:Û� ÝZà»ãrë�ì�ã�Üi×D�Æà'×!Øhà»ã�â�ëËÝhÞ¼Ú»Ý�ì>ë!à»×Û�ÙËìZÚ �'ì�×:Û� Ý�à ó Üi×¼Üià'×�à�í�Û�ÙËì�ܽã�ã�ì�â�ÚSÛ�Üiö'ì�â½Üiä'ì�â½ÜiÙ¼à:à�ë�Ú'Ý"Øhì�â½â:Ú'Ý"Ø�ټܽárÙBÝ�à»ç¼ã�á�ì�Û�ÙËìhà ó Üi×Ëܽà»×à»ã�Ü��'Üi×¼Ú�Û�ì+ë í�ã�à���*�¹àSØ�åËÜií�Û�Ù¼ìBÚ �'ì�×:Û¹â½Ú�Û�ì>ã¹Üi×:Û�ì�ãrÚ»á ÛrÝ�Ø�ÜiÛ�ÙÅÚ�×¼à�Û�Ù¼ì�ãµÚ�»ì�×:ÛµÚ»×¼ëWÛ�ÙËì>ßë�ܽÝ�Ú �'ã�ì>ì�àSö»ì>ã�Ý�à��?ì ó ܽì>á�ì%à�í�ܽ×�í�à»ã��¿ÚSÛ�Üià'×�å'ܽ×:Û�çËÜiÛ�ܽö»ì>âiß\Û�ÙËì�ß¿á�ڻ׿ã�ì+Ý�à'âiö'ìZÛ�ÙËì©á�à'×Mé¼Ü�á ÛÞ¼Ú»Ý�ì>ëWà»×!Ø�ÙËà¿Ù¼Ú'ÝhÛ�ÙËì�Ý�Û�ã�à»×��'ì�ã�Ý�ç óËó à»ã�Û��

�ZÙËìBí�çRÝ�ܽà»×�à ó ì�ãrÚSÛ�à»ãµØhì¿ë�ì<rR×Ëì\á>Ú ó Û�çËã�ì>Ý%Û�ÙËÜ�ݧܽ×'Û�çËÜiÛ�ܽà»×���xPì�r¼ãrÝUÛt�»Ü½ö»ì\Ú!ö»ì�ã�ß×¼ÚSÛ�çËã�Ú»âCë�ì<r¼×¼ÜdÛ�Üià'× í�à»ãhÛ�ÙËì©í�ç¼Ý�ܽà»×Wà�í�ÛUØhà ó ì+ë�Ü��'ã�ì>ì>ë¿Þ7ì�â½Ü½ì�í�ÝUÛrÚSÛ�ì+ÝlO � Ú»×¼ë%O × Þ¼Ú»Ý�ì>ëà»×QÛ�Ù¼ì�ܽãµã�ì>Ý ó ì+á Û�ܽö»ìÆÝ�ì�Û�ݹà»í�Ý�ç ó¼ó à'ã�Û�Üi×��WÝ�à'çËã�á�ì>Ý���Øhì�Ý�Ü�� ó â½ßQá�à�ÆÞËÜi×¼ì�Û�ÙËì����;�ZÙËì�×Ø�ì?Ý�Ù¼àSØ�Û�Ù¼Ú�Û©ÜiÛ�Ü½Ý ó à'Ý�Ý�ÜiÞËâ½ìÆÛ�àQá�à� ó çËÛ�ì\Û�ÙËì?×Ëì>Ø ó ì>ë�Ü��»ã�ì�ì+ëÅÞ7ì�â½Üiì�íhÝ�Û�ÚSÛ�ì¿ë�Üiã�ì>á�Û�â½ßÜi×AÛ�ì�ã��¿Ý�à»í�Û�ÙËì�O � Ú»×¼ë'O × Ø�ÜiÛ�Ù¼à»ç�Û!×Ëì�ì+ë�Üi×D� Û�àÂã�ì�í�ì�ã�Û�à Û�ÙËì�Ý�ì�Û�ÝWà�íBÝ�à»çËãrá�ì+Ýa�ØËçËã�Û�ÙËì>ãR�?à'ã�ì'å:Øhì§Ý�ÙËàSØFÙËàSØAÛ�à?ë�ì�Û�ì�ã��?Üi×¼ì%Û�ÙËì§×Ëì�Ø3ë�à�?ܽ׼ÚSÛ�Üi×D�ÆÞ7ì�â½Üiì�í�Ý�Û�Ú�Û�ì§Þ¼Ú»Ý�ì>ëà»×!Û�ÙËà:Ý�ì�Ú'Ý�Ý�à�á�Ü�ÚSÛ�ì+ë Ø�ÜiÛ�ÙNO � Ú�×Rë6O × ��æµÝ�ÜiÛ�Û�ç¼ã�׼ݹà»ç�Û+å�Û�ÙËì�ã�ì>Ý�çËâiÛ�Ø�Üiâ½â��¿ÚSÛrárÙWÛ�ÙËìá�à»×�é¼Ü½á�Û���ã�ì+Ý�à'âiçËÛ�ܽà»× ó à»â½Ü�á�ß�Øhì§à»ç�Û�âiܽ×Ëì+ë!Ú»ÞRàSö'ì���� ������������J� øb���Süa���;��ü�ú"ý��1��ý ��þR�rü#�qMþ� �c��jÙ�3�=��rý �/ü<Ò<K � È�K × L�M;ÒCú û¼ü8�Ëü4�3�s�»þ�ürü4���üa�J�-ü���� úX��ú�ü�O � ���c�3�D�rü4�¦��ÿ�K � ÒÂ� �c���¼ü4�3�s�»þ�ürü4���rü<�s�-ü^��� ú£�Sú}üCO × ���c� �D�rü�����ÿiK × Ò¹ú û¼üí�ç¼Ý�ܽà»×=ý���O � � �c��O × Ò��»üa�7ýSú}ü4��O � &�(FO × Ò;�L�\ú û¼üÂ�¼ü�� �s��þ�ü�ü4����üa�J�-ü����rú£�Sú}ü����C�3�D�rü4�¦��ÿK � `%K × Ó�z� 4��'��&�!'8"!,�'F 4@�&�('F5�'�,^�&5�+�-£�&+� #.!,�WZ��+�.!+ �&'�.!'�,�,C24,&,�"!5&'�,D�&�32��B2H:;2�>(+�:;24��,& #"!5�-�'8'£>(+�,^�&,��<6B'F-� #"!��)5�'�24)!+�� ]15&'�0!�J24-�'B+ �D9a]16B'�2��#'�5D5&'���"(+�5&'�:*'�.a��,� #.N���!'B+�.(U3.!+ ��'B,&'£��WZ���!'�249!,&'�.!-�'� 4@ �&+�'�,�+�.N���!'B5&24.b��+�.($� '�.!,�"!5&'�,c���32��B���!'�:;2�>�+�:;24�!,& #"(5&-�'F+�,c"!.!+E�a"!'z�(5�'�:* �f(+�.($1���!+�,C5�'�,���5&+�-£�&+� #.*+�,c.! 4�B,���5X24+�$#�<�^@� #5^6�245&)hW

128

Page 133: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

�%Þ:ö5ܽà»ç¼Ý�â½ß»å:Û�ÙËì�Ý�ì�Û¹à»í ó ì>ëËÜ��'ã�ì>ì>ëWÞRì>âiܽì�í�ÝUÛrÚSÛ�ì>ÝZÜ�ݹá�â½à'Ý�ì>ëWçË×¼ë�ì>ã5&(b����b�<�F���������#�<�>I� Ó%uxO � & (.O × w<u^ô � ÈRô × w1ö�O � uæô � È�ô × wq`2O × uæô � È�ô × w� Ó%uxO � & (.O × w n uæô � È�ô × w7ö�?Ú yCuxO ��� u^ô � È�ô × w#È�O ×;� uæô � ÈRô × wRw� �;�?Ú yCuxO � � u^ô × È�ô � w#È�O × � uæô × ÈRô � wRw�j�?Ú yCuxO � � u^ô � È�ô × w#È�O × � uæô � ÈRô × wRw4Òb� �c�� ýSú�ûËü�þz���L��ü�Ó� Ó5v^�*Æ � Ò�Æ × ÒH� �c��Æp�Sþ�ü%ú�ûËü§ýSþ��»ü�þ#���h� �����c� �D��ü4����ÿ�O ��� Ò�O ×;� Ò��3�C�%uxO � & (FO × w n Òþ�ü<�&�Ëü4��ú����Süa�zÿ!Ò�ú�ûËüa�ô � �[ô × � ¤ô � � � ô × � �c�/�?Ú yCuxO × � u^ô � È�ô × w#È�O × � uæô × ÈRô � wRw�k>O � � u^ô � ÈRô × wBý�þô � � × ô × � �c�/�?Ú yCuxO � � u^ô � È�ô × w#È�O � � uæô × ÈRô � wRw�k>O × � u^ô � ÈRô × waÓ

�ZÙËì�Ý�ì+á�à»×Rë ó ã�à ó ì�ã�ÛUߧí�à»ã��¿Ú�â½Ü��>ì>Ý�Û�Ù¼ìhÜ�ë�ì+ÚµÛ�Ù¼ÚSÛ+å�í�à'ã)Úk�'Üiö'ì�× ó Ú�ܽã�à�íRØhà»ã�â�ëËÝ�å/Û�ÙËìZ×Ëì�Øë�à�?ܽ׼ÚSÛ�Üi×��¿Þ7ì�â½Ü½ì�í�ÝUÛrÚSÛ�ì�Ý�ÙËà'çËâ½ëQárÙËà5à:Ý�ì%Û�ÙËìBà»ãrë�ì�ãZà�í�Û�Ù¼ì ó Ú�ܽãZÛ�Ù¼Ú�Û��»Ü½ö»ì+Ý�Û�ÙËì/�?à:ÝUÛá�ã�ì>ë�ܽÞËâ½ì?Ý�ç óËó à»ã�Û�Þ7ì�ÛUØhì�ì�×PÛ�Ù¼ì?ÛUØ�à¯Ú �'ì�×:Û�Ý%í�à'ã§Û�ÙËÜ�Ý ó Ú»Üiã�à»í�Øhà»ã�â½ëËÝ>å"Ú'Ý�Ý�Ü��'×Ëܽ×���Û�ÙËìÝ�Ú�\ì?Ý�ç óËó à»ã�Û©Û�à!Û�ÙËÜ�ݧà»ãrë�ì�ã+å�Ú�×RëN�� ?Û�àWÛ�ÙËì¿à óËó à'Ý�ÜdÛ�ìÆà'ã�ë�ì>ã����ZÙËì\Û�ÙËܽã�ë ó ã�à ó ì�ã�ÛUßë�ì>Ý�á�ã�ÜiÞ7ì>Ý©ÙËàSؾÛ�à!ëËì�ã�Üiö'ì�Û�Ù¼ì\×¼ì�Øܽ׼ë�ç¼á�ì>ëÀà»ãrë�ì�ã�Üi×D� í�ã�à� Û�ÙËà:Ý�ì\à�í�Û�ÙËì\ÛUØhà í�ç¼Ý�ì>ëÞRì>âiܽì�í�ÝUÛrÚSÛ�ì>Ý��

Ø�Ü��»ç¼ã�ìÇ}WÜiâ½âiçRÝUÛ�ã�Ú�Û�ì>Ý©Û�ÙËì?í�ç¼Ý�Üià'×�à ó ì�ãrÚSÛ�ܽà»×Pà»×PÛ�ÙËã�ì�ì¿ë�à��?Üi×¼Ú�Û�ܽ×��¯Þ7ì�â½Üiì�íZÝ�Û�Ú�Û�ì>Ý��xPì!á�Ú�×Lö:ܽì�Ø�$BåF" å�Ú»×¼ë�=\å)Û�àPÞRì!Ú �»ì>×:Û�Ý�Ø�ÜiÛ�ÙLÜi×�í�à'ãR�¿Ú�Û�ܽà»× à»íhí�ã�à� Ý�à»çËãrá�ì+Ý�à�íÙËÜ��'Ù�u¬Ý�à'çËãrá�ì%ñhw åM�?ì>ëËÜiç�� u-Ý�à»çËãrá�ìe}w�åËÚ�×¼ë�âiàSØpu¬Ý�à'çËãrá�ì\ð(w�á�ã�ì>ë�ܽÞËܽâiÜiÛUß»å�ã�ì>Ý ó ì+á Û�Üiö'ì�â½ß�9�ì+á�Ú»ç¼Ý�ì\Øhì\×¼àSØý�¿Ú�ä»ì ó ã�ì>á�ì>ë�ì>×¼á�ì?ë�ì>á�ܽÝ�ܽà»×¼Ý§ÚSÛ�Ú!â½à5á>Ú�â)ãrÚSÛ�Ù¼ì�ã©Û�Ù¼Ú»×��'âià'Þ¼Ú�â)â½ì�ö'ì�âÞ¼Ú»Ý�ì>ëLà»×�Ý�à'çËã�á�ì>Ý\à�í§Ý�ç óËó à'ã�Û+å�í�ç¼Ý�ܽ×��+$ Ø�ÜiÛ�Ù>= Ú�×¼ëLÛ�ÙËìQã�ì>Ý�çËâiÛ?Ø�ÜiÛ�Ù?" ܽÝ?×¼àSØØ�ì>âiâ���ëËì<r¼×Ëì+ë!Üi×ÅÚ¿á�à»×¼á�ì ó Û�ç¼Ú�â½âiß%ÚUç¼Ý�Û�Ü�r¼ì>ëQØZÚ/ß»å�çË×¼âiܽä»ì�ܽ×!Û�ÙËìÆá�Ú»Ý�ì�à�í�Û�Ù¼ìBÝUÛ�ã�Ú/Ø*�¿Ú»×à ó ì�ãrÚSÛ�à'ã�ë�Ü�Ý�á�ç¼Ý�Ý�ì+ë ì+Ú�ã�âiܽì�ã(���µà�Û�ܽá�ìµÛ�ÙËì�ë�ì ó ì�×¼ëËì�×¼á�ì%à»í�Û�ÙËìkr¼×RÚ�âHÞRì>âiܽì�í)ÝUÛrÚSÛ�ì%à'×WÚ�â½âÛ�ÙËã�ì�ì�Ý�à»çËãrá�ì+Ýa�

xPìÆØ�Üiâ½â"í�ç¼ã�Û�ÙËì�ã§ì<y ó â½à»ã�ì�Û�ÙËì ó ã�à ó ì�ã�Û�ܽì>ݵà�í�Þ7ì�â½Ü½ì�í�í�ç¼Ý�Üià'×Àܽ×Åâ�ÚSÛ�ì>ã©Ý�ì>á�Û�ܽà»×¼Ý>å7ÞËç�Ûr¼ã�Ý�Û�Ø�ìZë�Ü�Ý�á�ç¼Ý�Ý�Û�ÙËì�á�à'×Ë×Ëì+á Û�ܽà»×ÆÞRì�ÛUØ�ì>ì�×\ÞRì>âiܽì�í7í�ç¼Ý�Üià'×\Ú»×¼ë\á�â�Ú»Ý�Ý�Ü�á�Ú»â:æ¹é§î ã�ì>ö5ܽÝ�Üià'×��

� � «*»WÕa¯�Õ�ÑÆζ­e¯ Ó¯ÎQÒo«¹Ð��\¯���«¹Ô¹Õz ¦«¹Ò ò»Ó�¯)Õ�Ñ\Î�¹àSØ Û�Ù¼Ú�ÛBØ�ì¿ÙRÚ/ö»ì¿ë�ì<rR×Ëì>ë�í�ç¼Ý�ܽà»×�å�à»×Ëì á>Ú�×�ö5Üiì>Ø Û�ÙËì¿Û�ã�Ú'ë�ÜdÛ�Üià'×¼Ú�â�æ¹é§î ã�ì>ö:Ü�Ý�Üià'×à ó ì�ãrÚSÛ�à'ã�Ú»ÝZÛ�ÙËì�Ú óËó âiÜ�á�Ú�Û�ܽà»×Qà�í�Û�ÙËì�í�ç¼Ý�ܽà»×Qà ó ì>ã�Ú�Û�à»ãhÛ�à Ú ó Ú�ã�Û�Ü�Ú�â½âißWÝ ó ì>á�Üsr¼ì+ëQÜi× ó ç�Ûu�à»×¼âiß!Û�Ù¼ìBÞ7ì�â½Üiì�í�Ý�ì�Û%à»í�Û�ÙËì\ì<y ó ì>ã�Û%ܽÝÂ�»Ü½ö»ì>×�åR×¼à�Û%ټܽݵí�çËâ½â�Þ7ì�â½Üiì�í�Ý�Û�ÚSÛ�ì(w<�µÖ}צ�'ì�×Ëì>ã�Ú»â¬åÛ�ÙËìWí�çËâ½â�ÞRì>âiܽì�í§ÝUÛrÚSÛ�ì!à�íµÛ�ÙËì!ì<y ó ì�ã�Û?Ý�Û�ã�à»×��'âiß Ú ¥7ì+á Û�ÝBÛ�Ù¼ì!ã�ì>Ý�çËâiÛ�ܽ×��Fô�í�ç¼Ý�ì>ëËõPÞRì>âiܽì�íÝUÛrÚSÛ�ì��1�¹àSØhì�ö'ì�ã+å'ÜiÛ�Û�çËã�×¼ÝZà'ç�ÛZÛ�Ù¼Ú�ÛhÛ�ÙËì�Þ7ì�â½Üiì�íN��ü�ú�ë�ì<r¼×¼ì>ë!Þ5ß?Û�ÙËì§í�ç¼Ý�ì>ë!Þ7ì�â½Üiì�í)Ý�Û�Ú�Û�ìë�ì ó ì�×¼ë¼Ý¹à»×¼âißWà»×!Û�ÙËìBÞRì>âiܽì�í�Ý�ì�Û¹à»í�Û�ÙËì�ìay ó ì�ã�Û��bx�ì�×ËàSØ Ý�ÙËàSØ Û�Ù¼Ú�Û¹Û�ټܽݵܽݹÝ�à¼å7Ú�×¼ëÛ�Ù¼Ú�Û�Û�ÙËì�æ¹é§î ã�ì�ö5ܽÝ�ܽà»× ó ã�ì+á�Ü�Ý�ì>âiß�á>Ú ó Û�çËã�ì>ÝhÛ�Ù¼ì ó ã�à ó ì>ã�Û�Üiì+Ýhà�í�Û�ÙËÜ�Ý�Þ7ì�â½Üiì�í)Ý�ì�Û(�

129

Page 134: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

31 ab dc 1

33

A C3

a bA C B 2 c d13

3 3

3

3 c

b

a

dC

Ba bc2 2

d

2

A a1 1b d

c

1

Ø�Ü��»ç¼ã�ì/}M�1�ZÙ¼ì§á�à»ã�ã�ì+á Ûhí�ç¼Ý�Üià'×!à ó ì>ã�Ú�Û�à'ã��

Ö}× à»ãrë�ì�ã�Û�à��\Ü��?ܽá©æ¹é§î ã�ì�ö5Ü�Ý�ܽà»×¿Øhìµ×¼ì�ì>ë¿Û�àBÞ7ì§Ú�ÞËâ½ì¹Û�àÆë�Ü�¥7ì>ã�ì>×:Û�Ü�ÚSÛ�ì%ÞRì�ÛUØ�ì>ì�×Û�ÙËì�ì<y ó ì>ã�Û)Ú �'ì�×:Û�Ú�×Rë§Û�ÙËì�×ËàSö5ܽá�ì�FxPìhë�à%Ý�à¹Þ5ß�ë�ì<r¼×¼Üi×��©Ú�×�à'ã�ëËì�ã�Üi×���à'×BÚ �'ì�×:Û�Ý>åTu�à»ã+åì�q:çËܽö/Ú»âiì>×:Û�â½ß»å/à»× ó ì>ë�Ü��»ã�ì�ì+ë§ÞRì>âiܽì�í¼Ý�Û�Ú�Û�ì+Ý�w<�"Ö}×:Û�çËÜiÛ�ܽö»ì>âiß'åSà»×Ëì�á�Ú�×�ëËܽÝ�Û�ܽ×��»ç¼Ü½Ý�Ù�ÞRì�ÛUØ�ì>ì�×Û�ÙËìÎÏa�D�3�¼ú&� ú;ÿµà�í�Üi×�í�à'ãR�¿Ú�Û�ܽà»×WÚ»×!Ú�»ì>×'ÛhÙ¼Ú»Ýtu-Ø�ÙËܽárÙ!Øhà»ã�â½ëËÝhÙËì§á�Ú»×!ëËܽÝ�Û�ܽ×��»ç¼Ü½Ý�Ù\whÚ�×¼ëÜdÛrÝ/Ï<�D� �J� ú¬ÿ�u�Ø�ÙRÚSÛ¹Ú�ã�ì©Û�ÙËì�Ý�à»çËãrá�ì+Ý�à»í"Û�ÙËì>Ý�ì�ë�ܽÝ�Û�ܽ׼á�Û�ܽà»×¼Ý4w#�1�ZÙËì©í�à'âiâ½àSØ�Üi×D�\ë�ìar¼×ËÜiÛ�ܽà»×ã�Ú»×Ëä�ÝhÚ�»ì>×'ÛrÝbr¼ãrÝUÛ�à»×oq:ç¼Ú»âiÜiÛUß»å�ÞËã�ì>Ú»ä5Üi×��ÆÛ�ܽì>ݹÞ:ß²q'çRÚ�×:Û�ÜiÛUßC���� ������������>¡�÷7�:ü<�Rú�$ × ��� ú�û¦�t�Ëü4�3�s�»þ�ürü4����üa�J�-ü��t� úX��ú�ü�O × ý �/ü�þ�Å Ù¼Ú»Ý�Ú'ÝZã�ì>âiÜ�Ú�Þ¼âiìÝ�à'çËã�á�ì>Ý�Ú'Ý��a�:üa�¼ú*$ � ��� ú û��¼ü�� �s��þ�ü�ü4���rü<�J�-ü��e� ú£�Sú}ü1O � ý3�Sü�þÂÅ ¢w��þ#� ú;ú}ü<�%$ ×�£ $ � ýSþO ×�£ O �h¤ � ¤ê� úb�L��ú û¼ü��4�3��üBú�û��Sú���ûËüa�7üa�Sü�þ1O ×;� u^ô � ÈRô × w50ö'�; Wú û¼ü<��?Ú y�uwO �z� u^ô � È�ô × w#È�O �z� uæô × ÈRô � wRwlk?O ×;� u^ô � ÈRô × waÓ���b�<�F���������#�<�?p|¥"ü�ú1O � �3�C�:O × ��ü%�¼ü4�3�s�»þ�ürü4�I�rü<�J�-ü��²� úX��ú�ü<�4Ò��3�c���½ü�ú/Æ � � �c��Æ ×��üWú�ûËüQýSþ��»ü�þ#���h� �Î���c�3�D�rü4����ÿ6O ��� �3�C�%O ×;� Ò§þ�ü<�&�Ëü4��ú����Sü<�dÿÓN¦���þ ú û¼ü�þ4Òt�iü�úÂÆ �rüWú û¼üýSþR�'ü�þ#���T�����c� �D��ü4����ÿ²uwO � &E(FO × w n Ó�væ��O ×�£ O � Òhú û¼ü<��ô � � × ô × �����c�J�-ü<�Âô � �Ùô ×��ýSþ�� ����ô � ÈRô × Ë�Å Ó

�¹à�Û�ì©Û�Ù¼Ú�ÛZÚ»×:ß ó ì>ëËÜ��'ã�ì>ì>ë Þ7ì�â½Ü½ì�í�Ý�Û�Ú�Û�ì©Ù¼Ú»ÝZÚ'Ý�ã�ì�â½Ü�Ú�ÞËâ½ì©Ý�à»ç¼ã�á�ì>ÝhÚ»Ýld åËÚ�×Rë�d�e*f�g�ÙRÚ»ÝÚ»ÝZã�ì�â½Ü½Ú»ÞËâiì�Ý�à»çËãrá�ì+ÝZÚ»Ý�Ú�×5ß à»Û�ÙËì>ã ó ì+ë�Ü��»ã�ì�ì>ëWÞ7ì�â½Üiì�í)Ý�Û�Ú�Û�ì�Ö}×WÛ�ÙËì�í�à'âiâ½àSØ�Üi×D�¼åËØhì�ç¼Ý�ì§Û�ÙËì�×¼à�Û�Ú�Û�ܽà»×DOl§\Û�à ë�ì�×Ëà»Û�ì§Û�ÙËì�Þ7ì�â½Üiì�í�Ý�ì�Û©ë�ì<r¼×¼ì>ëQÞ5ßÚ ó ì+ë�Ü��»ã�ì�ì>ë�Þ7ì�â½Üiì�íµÝUÛrÚSÛ�ì%OÆå"Û�Ù¼ÚSÛ\ܽÝ>å�Û�ÙËì!Ý�ì�ÛBà�í�Øhà»ã�â½ëËÝ/�\ܽ×ËÜ��¿Ú�â�Ø�ÜiÛ�ÙLã�ì+Ý ó ì>á�Û�Û�àÛ�ÙËì§à»ãrë�ì>ã�ܽ×��Bܽ׼ë�çRá�ì>ë�Þ5ß%O n ��æ¹â�Ý�àRå5Ø�ì©ç¼Ý�ì�OI¼*¨ Û�à\ë�ì>×Ëà�Û�ìµÛ�ÙËì§ã�ì>ö5ܽÝ�Üià'× à�í�ÞRì>âiܽì�í

130

Page 135: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

ÝUÛrÚSÛ�ìmO3Þ5ß¿Þ7ì�â½Üiì�í)Ý�ì�Û�¨ÂÚ»á>á�à'ã�ë�ܽ×��BÛ�à?æ¹é§î u�Û�Ù¼Ú�ÛZÜ�Ý>å'Û�ÙËì©Û�ÙËì§Øhà»ã�â½ë¼Ý�ܽ×�¨LÛ�Ù¼Ú�ÛZÚ»ã�ì�\ܽ×ËÜ��¿Ú�â"Ú'á�á�à'ã�ëËÜi×��ÆÛ�à�O n w<����b�<�F���������#�<�~}¥"ü�ú*O � �3�c�©O × �rü*�Ëü4�3�s�»þ�ürü4�²�rüa�J�-ü��t� úX��ú�ü<�Â�#�D�rûQú�û���ú*O × £ O � Óì7û¼ü<��uwO � & (.O × w@§Mö'O � ¼kuxO × §TwaÓª �<�t�<«#«�¬­�b®~q¯#�>¥�ü�úcO � È�O × ÈzO { �rüÎ�Ëü4� �s��þ�ürü4�v�rü<�J�-ü��o� ú£�Sú}ü#�o�#�D�rû�ú�û���úmO ×?£ O � ÒO { £ O � Ò=�3�c��O × §Mö3O { §CÓì7û¼ü<��uwO � & (.O × w@§MöýuxO � & (.O { w°§cÓ�ZÙ:çRÝ�åhæ¹é§î ã�ì�ö5Ü�Ý�ܽà»×ÂܽÝ?Ý�Ü�� ó â½ßLÚ ó ã�à ÚUì>á�Û�ܽà»×Âà�í%Þ7ì�â½Ü½ì�íµí�çRÝ�ܽà»×�åhܽ×ÂØ�ÙËÜ�árÙ1à»×ËìÜ��'×Ëà»ã�ì>Ý©Ú�â½â�ÞËç�Û�Û�ÙËì?ÞRì>âiܽì�í�Ý�ì�Û�à�í�à'×Ëì?à�í�Û�ÙËì?ܽ×ËÜdÛ�ܽڻâ�ÞRì>âiܽì�í�Ý�Û�ÚSÛ�ì>Ý>å"Ú�×RëPÚ»âiâ)ÞËç�Û�Û�ÙËì

ÞRì>âiܽì�í�Ý�ì�Û¹à�í�Û�ÙËì�ã�ì>Ý�çËâiÛ�ܽ×��?ÞRì>âiܽì�í�Ý�Û�ÚSÛ�ì�

± ² «�ñañ���³�«Â©�­*»�«¹ÒÅÎ�«�¯F¯ Ñ�òAÕrÏ�«µÐ�­©Ï�«¹Ò�´�± Óoñ Ï)Õ\�\­�µ�«µÎ\϶³�«l�ñ�Õa«;òÙò'Ó�¯�Õ�ÑÆÎ

xPì��\ì>×:Û�ܽà»×Ëì+ëAÜi×=Û�ÙËìPÜi×:Û�ã�à�ë�ç¼á�Û�ܽà»×AÛ�Ù¼ÚSÛWÛ�ÙËì ó ã�à'ÞËâ½ìa� à�í�ÜiÛ�ì�ãrÚSÛ�Üià'×=Ù¼Ú'Ý ó ã�àSö»ì+ëÚ¦�¿Ú3ÚUà'ã\árÙ¼Ú»âiâ½ì�×D�»ì¿Û�à�æ¹é§î��}Ý�ÛUß:â½ì�ã�ì�ö5Ü�Ý�ܽà»×8��x�ìW×¼àSØ Ý�ÙËàSØ Û�Ù¼ÚSÛ\Û�ټܽÝ\Ü�ÝB×Ëà»ÛÆÛ�ÙËìá�Ú»Ý�ì§í�à»ãµí�ç¼Ý�Üià'×��;�"à¿Þ7ìa�'Üi×ÀØ�ÜdÛ�Ù�åR×¼à�Û�ì�Û�Ù¼Ú�ÛµÜdÛ�ì�ãrÚSÛ�Üià'ׯܽݹí�à»ã��?Ú»âiâ½ß�Øhì�â½â���ë�ìar¼×Ëì+ëB�¼Û�ÙËìà»ç�Û ó ç�Û�à�í7í�ç¼Ý�Üià'×ou-Ú ó ì>ë�Ü��»ã�ì�ì+ëBÞ7ì�â½Üiì�í�ÝUÛrÚSÛ�ì!w�ܽÝ�Ú§â½ìa�'ÜdÛ�Ü��¿Ú�Û�ìZÜi× ó ç�Û�Û�à�Ú�×Ëà»Û�ÙËì>ã�í�çRÝ�ܽà»×à ó ì�ãrÚSÛ�ܽà»×8�

ØËã�à� Û�ÙËìBÝ�ì�ÛR�¬Û�ÙËì�à'ã�ì�Û�Ü�á�ë�ì<r¼×¼ÜdÛ�Üià'×Åà�í�í�ç¼Ý�Üià'×�åRÜiÛ¹í�à'âiâ½àSعÝ�Ü����?ì+ë�ܽÚ�Û�ì>âißWÛ�Ù¼Ú�ÛµÜiÛ�ì�ãR�ÚSÛ�ì+ëQÞRì>âiܽì�í�í�ç¼Ý�ܽà»×¯Ü�ݹ׼à�Û%à'×Ëâ½ßWØhì�â½âs�}ë�ìar¼×Ëì>ë�å¼ÞËç�Û§Ú�â�Ý�à¿ìay5Û�ã�ìa�?ì�â½ßWØhì�â½â��;Þ7ì�Ù¼Ú/ö'ì>ëB�ZÖ}×ó Ú�ã�Û�Ü�á�çËâ�Ú�ã+å¼ÜiÛ§Üi×ËÙ¼ì�ã�ÜdÛrݹÛ�ÙËìBÜ�ë�ìa� ó à�Û�ì�×¼á�ì»åCá�à����Æç�Û�Ú�Û�ܽö5ÜdÛUß'å7Ú»×¼ëÅÚ»Ý�Ý�à5á�ܽÚ�Û�ܽö5ÜdÛUß ó ã�à ó �ì�ã�Û�ܽì>ÝZà�í.`b�

�"à©ë�ìa�?à»×RÝUÛ�ã�Ú�Û�ì�Û�ÙËì�Øhì�â½â��;Þ7ì�Ù¼Ú/ö'ì>ë�×¼ì>Ý�Ý�à�íËÜiÛ�ì>ã�Ú�Û�ì+ë�ÞRì>âiܽì�íRí�ç¼Ý�Üià'×�åSØ�ìN�»Ü½ö»ì�Ý�ì�ö'ì�ãrÚ�âã�ì>â½Ú�Û�ì>ëÂì<yËÚ� ó âiì+ÝÆØ�ټܽárÙ�ë�ì ó ì�×RëÂà'×LÛ�ÙËì>Ý�ì ó ã�à ó ì�ã�Û�ܽì>Ý���Û�ÙËì¯ì<yËÚ� ó âiì+Ý\Ú»ã�ìQÝ�Û�ÚSÛ�ì>ëÜi×�í�à'ãR�¿Ú»âiâ½ß�í�à'ãµã�ì>Ú'ëËÚ�ÞËܽâ½ÜdÛUß'å¼ÞËç�Û�á�ڻׯì>Ú'Ý�ܽâiß!Þ7ìÆÝ�Û�ÚSÛ�ì>ëQí�à'ãR�¿Ú»âiâ½ß!Ú»×¼ë ó ã�àSö»ì+ëB��Ö}×ÀÚ�â½âà�í"Û�ÙËìa� Ú»Ý�Ý�ç��?ì%Û�Ù¼Ú�ÛZÛ�ÙËì>ã�ì§Ú�ã�ì5·�Ú �'ì�×:Û�Ý>å'ì+Ú»árÙWØ�ÜdÛ�ÙWÙËÜ�ÝZàSØ�×�Þ7ì�â½Ü½ì�í�ÝUÛrÚSÛ�ì©àSö»ì>ã�Û�ÙËìÝ�Ú�\ì!Ý�ì�ÛÆà»í¹Ø�à'ã�â�ëËÝ/Å å�Ú�â½âZÚ �»ã�ì�ì>Üi×D�¯à»× Û�ÙËì�ܽãÆìay ó ì>ã�Û�ܽÝ�ì�ãrÚ�×Ëä5ܽ×��Åã�ì�â�ÚSÛ�Üiö'ì¿Û�àPà»×ËìÚ�×Ëà»Û�ÙËì>ã>åËÚ»×¼ëWÚ»âiâCì�� ó âiàSß5ܽ×��?ÞRì>âiܽì�í)í�ç¼Ý�ܽà»×QÚ»ÝhÛ�Ù¼ìe�\ì�Û�ÙËà�ë!à�í)ç ó ëËÚSÛ�ì�

� �%×ËìQà»í©Û�ÙËìÅÚ �'ì�×:Û�Ý?Ü�Ý?Û�ÙËì¦�¿Ú�×¼Ú�»ì>ã��'¸%çËì+ÝUÛ�Üià'×��¦xAܽâiâ¹Û�ÙËìÅà'ã�ëËì�ã¿Ü½×1Ø�ټܽárÙÙËì��'ì�Û�Ý©ÞËã�ܽì�í�ì>ëPÞ5ßÅÙËÜ�ݧöSÚ�ã�Üià'ç¼Ý©ìa� ó â½àSß»ì�ì+ݧÚ3¥Hì>á Û�ÙËÜ�ݧã�ì>Ý�çËâdÛ�Üi×D�!Þ7ì�â½Ü½ì�íhÝ�Û�Ú�Û�ìbBæµ×¼Ý�Øhì�ã(���¹à\�� �ZÙËì�Ý�Ú �?ìe�¿Ú�×RÚ �»ì>ãZܽݹá�à»×¼Ý�Ü�ë�ì�ã�Üi×D�\Ø�Ù¼ì�Û�Ù¼ì�ã�Û�àÇ�»ì�ÛµëËÜiã�ì>á�Û�â½ß�ç ó ëËÚSÛ�ì>ëQÞ5ß Û�ÙËìì�� ó âiàSß'ì�ì>Ý>å�à'ã?Û�à�Ù¼Ú/ö»ì!ÙËÜ�Ý¿ö5ܽá�ì<�X�¿Ú�×¼Ú�»ì�ã��»ì�Û¿ç ó ë¼ÚSÛ�ì+ë1Þ:ß Û�ÙËì¯ã�ì>Ý�Û¿à�íµÛ�ÙËìì�� ó âiàSß'ì�ì>Ý>å/Ú�×¼ë§Û�ÙËì�×ÆÙ¼Ú/ö»ì�Û�ÙËì�ö:Ü�á�ìa�&�¿Ú»×¼Ú �'ì�ã�ç ó ëËÚ�Û�ìhÙËÜ����.¸%çËì>Ý�Û�ܽà»×8�8�5ÙËà»ç¼â½ëÛ�ÙËì��?Ú»×¼Ú �'ì�ã©Ø�à'ã�ã�ß!Û�Ù¼Ú�Û§Û�ÙËì?ã�ì>Ý�çËâdÛ§Ø�ܽâiâ�Þ7ì Ý�ä'ì�Øhì>ë¯Þ5߯Û�ÙËì?ö5ܽá�ì<�X�?Ú»×¼Ú �'ì�ã( Ýó ì�ãrÝ�à'×¼Ú�âC޼ܽÚ'Ý�ì+Ý\BÀæ¹×¼Ý�Ø�ì>ã��N�¹à¼åËÛ�ÙËì/�¿Ú»×¼Ú �'ì�ã( Ýhã�ì>Ý�çËâdÛ�Üi×D�¿ÞRì>âiܽì�í�Ý�Û�ÚSÛ�ì�Ø�Üiâ½â"Þ7ìÚ'ÝZÜi×WÛ�ÙËìtrRã�Ý�Ûµá�Ú'Ý�ì��

131

Page 136: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

� �ZÙËìt�¿Ú�×RÚ �»ì>ãZܽÝ=�'à»×Ëì�Ú�×Rë Û�ÙËì§Û�ì+Ú � ×Ëì>ì>ëËÝ�Û�à?ã�ì+Ú»árÙ!á�à»×RÝ�ì>×¼Ý�çRÝa��ç�Ú'árÙWÚ�»ì�×:ÛÞËã�à'Ú'ëËá�Ú'ÝUÛrÝ�ÙËÜ�Ý\ÞRì>âiܽì�í§ÝUÛrÚSÛ�ì�Û�à�Ú�â½â�Û�ÙËì!à�Û�Ù¼ì�ãrÝBØ�ÙËàPã�ì+á�ì>Üiö'ì�ÜiÛ\Ü����\ì+ë�Ü�ÚSÛ�ì>âiß'åÚ»×¼ë�ܽ׼á�à»ã ó à'ã�Ú�Û�ì>ÝZÚ»âiâHÛ�ÙËì�ÞRì>âiܽì�í�Ý�Û�ÚSÛ�ì>ݹá�à���Bç¼×Ëܽá>ÚSÛ�ì>ë�Û�à?ÜdÛ(�a¸%çËì>Ý�Û�ܽà»×8��xAÜiâ½âÛ�ÙËìBÚ �'ì�×:Û�ݹì�×¼ëQç ó Ø�ÜdÛ�ٯܽë�ì>×:Û�Ü�á�Ú�â�ÞRì>âiܽì�í�Ý�Û�Ú�Û�ì+Ý\B�æµ×¼Ý�Øhì�ã(�.¹�ì+Ý�å7ì�ö'ì�ׯÜií)Û�ÙËì>ßì+Ú»árÙ ó ì�ã�í�à»ã���Û�ÙËì§í�ç¼Ý�Üià'×¼Ý�ܽ×QëËÜs¥Hì�ã�ì�×:Û�à»ãrë�ì�ãrÝ��

� �ZÙËìZÝ�ÜdÛ�ç¼ÚSÛ�Üià'×BÜ�Ý�Ú'Ý)Ú�Þ7àSö»ì»åSÞ¼ç�Û�Ú �»ì>×:Û�Ý�ë�à»×� Û)Ù¼Ú/ö»ìhçË×Ëâ½Ü��?ÜiÛ�ì+ëBÞËã�à'Ú'ëËá�Ú'ÝUÛ)á�Ú ó Ú3�ÞËܽâ½ÜdÛUß���Ö}×RÝUÛ�ì>Ú»ë�å�ì>Ú»árÙ!Ú �'ì�×:Ûhá>Ú�×Wá�à���Bç¼×Ëܽá>ÚSÛ�ì%Ø�ÜiÛ�ÙQÝ�à��?ì%à»í"Û�Ù¼ì©à»Û�ÙËì>ã�Ý>å�Ú�×¼ëÛ�ÙËܽÝ?á>Ú ó Ú»ÞËÜiâ½ÜiÛUß�Ü�Ý\×Ëà�Û?×Ëì+á�ì>Ý�Ý�Ú�ã�Üiâ½ß�Ý�ßT���\ì�Û�ã�ܽá��¸%çËì>Ý�Û�ܽà»×8��ÍhÚ�× Û�ÙËì¯Ú �'ì�×:Û�Ýã�ì>Ú'árÙPá�à'×¼Ý�ì�×¼Ý�ç¼Ý§Û�ÙËã�à'ç��»Ù�Ú ó ã�à�á�ì>Ý�Ý©à�íZí�ç¼Ý�Üià'×<Bæµ×¼Ý�Ø�ì>ã���¹�ì+Ý�å�Üs¥ÂÛ�ÙËì á�à����ÆçË×ËÜ�á�ÚSÛ�Üià'×��»ãrÚ ó Ù!ܽÝ%ÝUÛ�ã�à'×��»â½ß�á�à»×Ë×Ëì+á Û�ì>ëju Û�ÙËìBá�à���BçË׼ܽá>ÚSÛ�ܽà»×��'ã�Ú ó ÙWÜ�Ý�Û�ÙËìë�ܽã�ì>á Û�ì>ë��»ãrÚ ó ÙÆܽ׿Ø�ÙËÜ�árÙ Ú �»ì>×:Û�Ý�Ú�ã�ìZ×Ëà�ë�ì>Ý�Ú�×¼ë?ë�ܽã�ì>á Û�ì>ë?Ú�ãrá�Ý�ã�ì ó ã�ì+Ý�ì>×:Û�á�à����ÆçË×ËÜ�á�ÚSÛ�Üià'× á�Ú ó Ú�ÞËܽâiÜiÛUßC��Ú¯ëËÜiã�ì>á�Û�ì>ë��'ã�Ú ó ÙPÜ�Ý�Ý�Û�ã�à»×��'âißPá�à»×¼×Ëì>á�Û�ì>ë�ÜiíhÛ�ÙËì>ã�ì¿Ü�ÝÚ�ë�ܽã�ì+á Û�ì>ë ó ÚSÛ�Ù1í�ã�à�� Ú�×5ßÂ×Ëà�ë�ìQÛ�à Ú»×:ßLà�Û�Ù¼ì�ã#w#�=Ö}×1Û�ټܽÝWá�Ú»Ý�ìQì>Ú'árÙ1Ú�»ì�×:ÛÝ�ÙËà»ç¼â½ë�Ý�Ü�� ó â½ßPá�à���BçË׼ܽá>ÚSÛ�ì¿Ù¼Ü½Ý�Þ7ì�â½Ü½ì�í¹ÝUÛrÚSÛ�ì?Û�àÅÚ�â½â�Û�ÙËì�Ú �'ì�×:Û�ݧټì á�Ú�×"å"ܽ×M�á�à»ã ó à'ã�Ú�Û�ìWÛ�ÙËìÅÞ7ì�â½Ü½ì�í�ÝUÛrÚSÛ�ì>Ý á�à���BçË׼ܽá>ÚSÛ�ì+ëÂÛ�à ÙËÜ��Qå¹Ú�×¼ë1ã�ì ó ì>ÚSÛ(� æ¹í Û�ì�ã%ºã�à»çË×RëËÝ�Ú�â½â�Ú �'ì�×:Û�ݧØ�ܽâ½â�Ù¼Ú/ö»ì?Ü�ë�ì�×:Û�ܽá>Ú�â�Þ7ì�â½Üiì�íZÝ�Û�Ú�Û�ì+Ý�å�Ø�ÙËì�ã�컺¯Ü�ݧÛ�ÙËìWë�ܽÚ�?ì<�Û�ì�ã©à�í�Û�ÙËì\á�à���Bç¼×Ëܽá>ÚSÛ�Üià'צ�»ãrÚ ó ÙIu�Û�ÙËì¿ë�Ü�Ú �?ì�Û�ì�ã©à�íhÚ�ë�ܽã�ì+á Û�ì>ë¦�'ã�Ú ó Ù¯Ü�ݵÛ�ÙËìâ½à»×��'ì>Ý�ÛZÝ�ÙËà'ã�Û�ì>Ý�Û�ëËÜiã�ì>á�Û�ì>ë ó Ú�Û�Ù!ÞRì�ÛUØ�ì>ì�ׯÚ�×5ß¿ÛUØhà?×Ëà�ë�ì>ÝZܽ×!Û�ÙËìe�'ã�Ú ó Ù\w<�

¼l½�¾ ¿NÀFÁÃÂ�Ä�Å,ÆhÇ8ÀaÈHÉ­ÀÊÆ^É­ËÌÅZÄÍÉ­ËqÎ�ÅZË�ÏÐÆhÇ�Æ�ÀFÈÑÄÍÂ5Â�ÅZÀFÄ�Ò<ÓÐËqÇÖ�Û"ܽÝ"×¼ÚSÛ�çËãrÚ�â/Û�à%Ú»Ý�äµØ�Ù5ßµØhì�ë�à¹×Ëà»Û"Ý�Ü�� ó â½ß%ìay:Û�ì�×¼ë§à'×Ëì�à�í5Û�ÙËì�ì<y�Ü�ÝUÛ�Üi×D��ÜdÛ�ì�ãrÚSÛ�ì+ë§ÞRì>âiܽì�íã�ì>ö:Ü�Ý�Üià'×ÂÚ ó¼ó ã�à:Ú»árÙËì+Ý�Û�à Ú»á�á�à�?à�ëËÚSÛ�ì!Ú��Bç¼âdÛ�Üs�}Ú �'ì�×:Û ó à'Üi×:Û¿à�í©ö:ܽì�Øe�þ� ó ì>á�ÜsrRá>Ú�â½âiß'åØ�ì�á�à'çËâ½ë Ú»Ý�Ý�çD�\ìPÛ�ÙRÚSÛQÞRà»Û�Ù¾Ú�ã��»ç��?ì�×:ÛrÝ Û�à�Ú»×Fà ó ì>ã�Ú�Û�à»ã!Ú�ã�ìÀí�ç¼âiâ�Þ7ì�â½Üiì�í\Ý�Û�Ú�Û�ì>Ý>åÞËç�ÛQÛ�Ù¼Ú�Û!à'×Ëâ½ß�Û�ÙËì�Þ7ì�â½Üiì�í\Ý�ì�Û ó à'ã�Û�Üià'×Fà�íBÛ�ÙËì Ý�ì>á�à'×¼ë=Ú»ãR�'ç��?ì�×:Û!Ü�Ý!ç¼Ý�ì>ëFëËçËã�ܽ×��ã�ì>ö:Ü�Ý�Üià'×��7u&�%Þ5ö5ܽà»ç¼Ý�âiß'å�Ú»Ý�Ý�à�á�ܽÚ�Û�ܽö:ÜiÛUߧë�à5ì>Ý�×Ëà�Û��¿Ú�ä»ìb�BçRárÙBÝ�ì>×¼Ý�ìN�'Üiö'ì�×�Û�ÙËì�Û�ìa� ó à»ãrÚ�âÜi×:Û�ì>ã ó ã�ì�Û�ÚSÛ�Üià'×�à�íËÜiÛ�ì>ã�Ú�Û�ì>ë§ã�ì�ö5Ü�Ý�ܽà»×�Ú'ÝCÛ�ÙËì�Ú»ãR�'ç��?ì�×:Û�Ý�Ú�ã�ì�à�í¼ë�Ü�¥Hì�ã�ì�×:Û"ÛUß ó ì>Ý>å/ÚµÞRì>âiܽì�íÝUÛrÚSÛ�ì�Ú»×¼ëÆÚ©Þ7ì�â½Üiì�í�Ý�ì�Û�� Ô(w�æ%á�á�à»ãrë�Üi×D�»â½ß»å�ØhìZÞËã�ܽì<éRß�Û�Ú»ä»ìZÚ%â½à5à»äBÚ�Û�Ý�à��\ìZà»í¼Û�Ù¼ì�ã�ì+á�ì�×:ÛÜdÛ�ì�ãrÚSÛ�ì+ëÅã�ì�ö5Ü�Ý�ܽà»× ó ã�à ó à:Ý�Ú»â½Ý>åHì<y5Û�ì>×¼ë�ì>ë�ڻݩë�ì+Ý�á�ã�ܽÞRì+ëPÚ»ÞRàSö'ì»åCÚ�×RëÀÝ�çËÞMÚUì>á�Û§Û�ÙËì��¥Û�àà»×Ëì§à�í�Û�Ù¼ìe�\à:ÝUÛ�Þ7ì�×ËÜ��»×Qܽ×5ö/Ú»ã�Ü�Ú�×Rá�ì%Û�ì>Ý�Û�ݹÜ��¿Ú �'Üi×RÚ�ÞËâ½ì»å�×¼Ú�?ì�â½ß»åËÚ»Ý�Ý�à5á�ܽÚ�Û�ܽö5ÜdÛUß��

xPì?Ý�Ù¼à»çËâ�ë ó à»Ü½×'Û§à»çËÛ%Û�Ù¼ÚSÛ§Ø�ì?ë�à'×� Û§×Ëì+á�ì>Ý�Ý�Ú�ã�Üiâ½ßWö5ܽì�Ø�Û�ÙËì\Üi×5öSÚ�ã�ܽڻ׼á�ìBà�í�Ú'Ý�Ý�à �á�Ü�ÚSÛ�ܽö5ÜdÛUßWÚ»ÝZà»Þ5ö5ܽà»ç¼Ý�âiß¿öSÚ»âiÜ�ëCå�ì�ö'ì�×��»Ü½ö»ì>×!Ú��Bç¼âdÛ�Üs�}Ú �'ì�×:Ûhܽ×:Û�ì>ã ó ã�ì�ÛrÚSÛ�Üià'×���ì<y ó ì�ã�Üiì>×¼á�ìÙ¼Ú»ÝBÛ�Ú»ç��»Ù:ÛÆçRÝ�Û�àPÞ7ìWØZÚ�ã�ß�à�í ó à'Ý�Û�çËâ�ÚSÛ�ì>ÝÆã�ì+ÝUÛ�Üi×��Pà»× â½à5à'Ý�ì�ܽ×:Û�çËÜiÛ�ܽà»×LÚ»âià'×Ëì��9hç�ÛÚ»Ý�Ý�à�á�Ü�ÚSÛ�Üiö5ÜiÛUßÆÜ�ÝhÚ�×¼Ú�Û�çËãrÚ�âHá�ã�ÜdÛ�ì�ã�Üià'×?Û�à?á�à'×¼Ý�Ü�ë�ì>ã>å5Ú�×¼ë ÜdÛZܽÝ�ܽ×'Û�ì�ã�ì>Ý�Û�ܽ×���Û�àÆÝ�ì�ì<½�ì>ö»ì�×Ø�ÜdÛ�ÙËà»ç�Û©ÚSÛ�ÛrÚ»árÙËܽ×���Ú?öSÚ»âiçËì;ÚUç¼ëM��?ì>×'Û%Û�à?Û�ÙËìBà'ç�Û�á�à�?ì#½AÙËàSØ Û�ÙËì+Ý�ì ó ã�à ó à:Ý�Ú»â½Ýhí-Ú»ã�ìã�ì>â½Ú�Û�ܽö»ì%Û�à?Û�ټܽݹá�ã�ÜiÛ�ì>ã�ܽà»×8�Õ4Y�.!-�+�)!'�.<�X24��� ]<? + @T6B'�-� #.!,�+�)!'�58���!'� #5�+�$#+�.324�h�FEHGó0 #,^�&"!�J2���'�,H24,F240!0!��+�'�)Â�� b���!'�5&'£f�+�,&+� #.� 4@� #.!'

���!'� #5�]79<]N24.! 4�&�('�5�?#�&�('q�a"!'�,���+� #.b 4@368�!'£���!'�5c24,�,& a-�+J2��&+ f�+ ��]b�( #��)!,�+�,D��'�$#+ ��+�:;2��&'#WZÖ8 R6B'£f<'�5R?#2�,&+�:*0!��''£>(24:*0!��'*,&�( R68,7���32��b24,&,� �-�+J2���+ f(+ ��]�+�,724-���"324��� ]�+�.!-� #.!,&+�,^�&'�.<�N68+ �������!';0 #,���"!�J2��&'�,�gÐ×C #.!,&+�)('�5����!'��6C k0 #,&,�+�9!��'b24,�,& a-�+J2��&+� #.!,1 4@�_/5&'£f�+�,&'�)/9a]t`�5�'£f(+�,�'�)/9<];_CØ à i `(W7YL@�24,&,� �-�+J2���+ f(+ ��]/+�,�24,�,&"!:*'�)h?���!'*�FEHG 0 #,���"!�J2���'�,^å²+�.�03245^�&+�-�"!�J245�? i P#? iFl ?�24.() i�� å²@� #5&-�'=���!'*5&'�,�"!� �1 4@8��'�@s�724,&,� �-�+J2���+� #.��� '�.<�X24+��ÚÙ_bmk`(?h24.!)%���!'=5�'�,&"!� �� 4@C5&+�$#�<�124,&,� �-�+J2���+� #.%�� Â'�.<�X24+��h_=m�ÙM`(?M2;-� #.a��5X24)(+�-���+� #.hW��8�(+�,H-�24.9 '1�&5&24-�'�)t�� *�&�('N+�.!)('�0 '�.!)!'�.!-�'= 4@D���!'N #5�+�$#+�.324�T�FEHGù5&'£f�+�,&+� #.t #.k0324,���5&'£f(+�,�+� #.!,RWqÖ8 R6B'£f<'�5R? ���!'+ ��'�5X2���'�)t5&'£f�+�,&+� #.� #0'�5&2��& #5�,86B'�-� #.!,�+�)!'�5F�('�5�'1245&'#?!��+E�#'H@�"!,�+� #.h?(�(+�,��� #5�]aA�)!'�0 '�.()!'�.a��W

132

Page 137: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

æµÝWÜdÛ¯Ý�àÂÙ¼Ú ó¼ó ì>×¼Ý�å%×Ëà�ÛQà»×¼âißAë�à5ì>ÝWÛ�ÙËܽÝ!ܽ×5ö/Ú»ã�Ü�Ú�×Rá�ìÅ×Ëà»Û!Ù¼à»â�ë�í�à»ãQÚ�×5ß�à�í�Û�ÙËìó ã�à ó à:Ý�Ú»â½Ý>å/ÞËç�Û�ܽ×Bì>Ú'árÙBá�Ú'Ý�ìhÜiÛ�Ü�Ý ó à:Ý�Ý�ÜiÞ¼âiì�Û�à©ì�ö'ì�×%�'ì�Û�á�à'×Mé¼Ü�á Û�ܽ×��%ã�ì+Ý�çËâiÛ�Ý)ë�ì ó ì>×¼ë�ܽ×��à»×ÂÛ�ÙËìQã�ì�ö5ܽÝ�ܽà»×Là'ã�ëËì�ã(�IxPì¯á�à»×¼Ý�ܽëËì�ã?ÙËì�ã�ì!Û�ÙËì ó ã�à ó à'Ý�Ú�â�ÝÆÜi×¾ï z�å�ð�{�å7}�í5å�ð�|Ëå7} ~/òX�xPì¿ë�ì+Ý�á�ã�ܽÞRìÆì>Ú'árÙÀà�í�Û�ÙËì+Ý�ì ó ã�à ó à'Ý�Ú�â�ݹÞ7ì�â½àSØ Ú�×¼ëPÝ�Ù¼àSØ�Û�Ù¼Ú�Û%Û�ÙËì�ã�ì\ì<y�ܽÝ�Û�ݧÚ�Û©â½ì>Ú'ÝUÛà»×ËìPì<yËÚ � ó â½ìPÝ�ç¼árÙAÛ�Ù¼ÚSÛju-Úhw�Ú»âiâÂr¼ö»ì ó ã�à ó à'Ý�Ú�â�Ý�Ú �'ã�ì>ìÅà»×=Û�ÙËì�ã�ì+Ý�ç¼âdÛ!à�íBÜiÛ�ì>ã�Ú�Û�ì>ëã�ì>ö:Ü�Ý�Üià'×�å+í�à»ã)Ú�×5ßkr�y�ì>ëBÚ'Ý�Ý�à�á�Ü�ÚSÛ�Üià'×�à»ãrë�ì�ã�à�í¼ã�ì�ö5Ü�Ý�ܽà»×"å/Ú»×¼ë�u-Þ\w�Û�ÙËì+Ý�ìZë�Ü�¥7ì>ã�ì>×'Û)à»ãrë�ì>ã�Ýà�í%ã�ì�ö5Ü�Ý�ܽà»×Âß5Üiì>â½ëÂÞRì>âiܽì�í�Ý�ì�Û�ÝÆÛ�Ù¼ÚSÛ�Ú»ã�ì!×Ëà�Û¿à'×Ëâiß ëËܽÝ�Û�ܽ׼á Û+å�Þ¼ç�Û�Ú»á�Û�ç¼Ú»âiâ½ßj�Bç�Û�ç¼Ú�â½â½ßÜi×¼á�à»×¼Ý�Ü�ÝUÛ�ì�×:Û��1�ZÙËÜ�Ý�á�à'çË×:Û�ì>ã���ì<yËÚ � ó â½ì©Ü½Ý¹Ý�ÙËàSØ�×WÜi×oØ�Ü��'çËã�ì§ñD�

pr

r

pr

pr

A B C

A B C

pr

pr

pr

p

r

rC

B

A

Ø�Ü��»ç¼ã�ì\ñ��kÍ�à'çË×:Û�ì�ãR��ì<yËÚ � ó â½ìBÝ�ÙËàSØ�Üi×D��Ú�âiÛ�ì>ã�×¼Ú�Û�ܽö»ìÆÜdÛ�ì�ãrÚSÛ�ì+ë¯ã�ì�ö5ܽÝ�ܽà»×Åà ó ì>ã�Ú�Û�à»ãrݹڻã�ì×Ëà�ÛµÚ'Ý�Ý�à�á�Ü�ÚSÛ�ܽö»ì��Í$%Èh"�È\= Ú�ã�ì%Þ7ì�â½Üiì�í�ÝUÛrÚSÛ�ì+Ýa�

ÛC��Üq����«#�����XÝÞ���¬­��Üq�t¬Z«2�t��ßÌ��������� à��Sú&��þR�3�%þ�üa���L�#�-ý3�Rå ó ã�à ó à:Ý�ì+ëAÞ5ß>9�à'ç�Û�ܽâ½Üiì>ã�ï�z>ò�åì<y5Û�ì>×¼ëËÝÆÛ�ÙËì¯æ¹é§î ܽë�ì+ÚÀà»í��?Üi×¼Ü��¿Ú�â½â½ß árÙ¼Ú�×��'Üi×D�ÀÞ7ì�â½Üiì�í-ÝBÛ�à�Ú óËó â½ß�Û�àPÛ�ÙËìQÚ �'ì�×:Û� Ýá�à»ç¼×'Û�ì�ã�í-Ú»á�Û�ç¼Ú»âCÞRì>âiܽì�í-Ý©Ú»ÝZØhì�â½â���é©Ü½ö»ì�ׯÚ?Þ7ì�â½Üiì�í�ÝUÛrÚSÛ�ì»å�Û�ÙËܽÝ%Ú óËó ã�à:Ú»árÙQÝ ó ì>á�Üsr¼ì+ÝZÛ�ÙRÚSÛØ�ì�à'×Ëâiß\árÙ¼Ú�×D�»ìhÛ�ÙËìµà»ãrë�ì�ã�Üi×D�©Ú'Ý��Bç¼árÙ¿Ú'Ý�ܽÝ�ã�ì(q'ç¼Üiã�ì>ëÆÞ5ß�Û�Ù¼ìµæ¹é§î ó à:ÝUÛ�çËâ�ÚSÛ�ì+Ý�å:Ú�×¼ë×Ëà��\à'ã�ì�����b�<�F���������#�<�>�Öì7û¼üZþ�ü<�#�M�zú����T�t��üa�J�-ü�����ü�ú^�1�T�#���h�Â�iü��rú8�3�C�%þ#�s�+û�úB�3�4��ý(�a�^�Sú&�-ý3��ý��*á%ý3��ú&���J�-ü�þZâ ��c�Sú&��þR� �Cþ�ü<�(�L�#�-ý3� ý��¼ü�þ��Sú}ýSþ4���4�3�Û��ü%���c�rý ���#�L� ú�üa�¼ú£Ó

�¬­�bã»��äbåÍ�J¬Z�Íæ����X¬­�t«��8Ý�çw�<�tèNÜÍ«�¬Z������� é©Ú�ã�Ø�ܽárÙËì!Ú�×¼ë¢ë)ì>Ú�ã�â%ïið�{Sò�Ý�ç����»ì>Ý�Û\Ú'ëT�ë�ÜdÛ�Üià'×¼Ú�â ó à'Ý�Û�ç¼â½Ú�Û�ì>Ý?ܽ×AÚ�×�ÚSÛ�Û�ìa� ó Û¿Û�à Ú»ëËÚ ó Û?Û�ÙËìÅæ¹é§î í�ãrÚ �?ì�Øhà»ã�ä�í�à»ã¿ÜiÛ�ì>ã�Ú�Û�ì>ëã�ì>ö:Ü�Ý�Üià'×��Wæ%Ý�Üi×�Û�ÙËì�á>Ú»Ý�ì¿à�íZ×¼ÚSÛ�çËã�Ú»â�ã�ì�ö5ܽÝ�ܽà»×�å"Û�ÙËÜ�ÝBÚ óËó ã�à'Ú»árÙ�ë�ì�ã�ܽö»ì>Ý�ÜdÛrÝ�Üi×RÝ ó ܽãrÚ3�Û�ܽà»× í�ã�à� ÚÆ×Ëà»Û�ܽà»×�à»í8�?Üi×¼Ü��?Ü���ܽ×��\árÙRÚ�×��'ì¹Û�àÆÛ�Ù¼ì©Þ7ì�â½Ü½ì�í�Ý�Û�Ú�Û�ì��H�µàSØ�ì>ö»ì>ã>å'ÜdÛ�ã�ì�â�Ú3y�ì>ÝÛ�ÙËì¿á�à'×¼Ý�Û�ãrÚ�ܽ×'Û%Û�Ù¼Ú�Û§árÙRÚ�×��'ì%�BçRÝUÛ§ÞRì¿á�à� ó â½ì�Û�ì>âißo�?ܽ×ËÜ��\Ü���ì+ëCåHÛ�Ù5ç¼Ý§Ú»âiâ½àSØ�Üi×D� í�à»ã�ÚØ�ÙËà»â½ì�í-Ú�\ܽâ½ß!à»í�ã�ì>ö5ܽÝ�Üià'ׯà ó ì�ãrÚSÛ�à»ãrÝ�åRÜi×¼á�âiçRë�Üi×D�W×¼Ú�Û�çËãrÚ�â�ã�ì>ö5ܽÝ�Üià'×Åڻݵà»×ËìÆÜi×¼Ý�Û�Ú»×:Û�Ü�Ú3�Û�ܽà»×����5à�?ì�Ø�ÙRÚSÛ�Ý�ç¼ã ó ã�Ü�Ý�ܽ×��»â½ß»åH×¼Ú�Û�çËãrÚ�â)ã�ì�ö5Ü�Ý�ܽà»×� Ý©â½Ú'áräQà�íhÚ'Ý�Ý�à�á�Ü�ÚSÛ�Üiö5ÜiÛUßQÚ óËó â½Üiì+Ý©Û�àì�ö»ì>ã�ßÇ�?ì��BÞ7ì�ã¹Ü½×!Û�ÙËì§í-Ú�\ܽâ½ß à�í)à ó ì�ãrÚSÛ�à»ãrÝa�

133

Page 138: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

���b�<�F���������#�<�?¡¸ì7û¼ü\þ�ü<�4�M�dú����T���rü<�s�-ü^�/��ü�ú��k�T�#���h���iü��rú;�3�c� þ#�s�+û�ú;�!�4��ý��a�^�Sú&�-ý3� ý����3�¼ÿþ�ü<�(�L�#�-ý3�Pý4�Ëü�þR��ú�ý�þ4�Â�a�Sú&�L�æ�rÿ3���h��ú û¼üÐê��Sþz���^�rûËü��3�c�më¹ü���þ#�h�Ëý3� ú��M����ú�ü<�/��� ���rüt���c�rý3�D�#�L�4�ú�üa�¼ú£Ó

ì �F��åÍ�FÝí��ä8�<�ÍæÍ�°���#�<�ͬZ«#��îV¬Z������� Ö}×�ï }í+ò�å� ó à»ÙË×BÜi×:Û�ã�à�ë�ç¼á�ì>Ý"á�à»×¼ëËÜdÛ�Üià'×¼Ú�â½Ü��+ÚSÛ�Üià'שà ó �ì�ãrÚSÛ�à'ã�Ý�àSö»ì>ã�ýSþ��3���c� �/��ý �c� � ú��-ý �c�3�1�4�M�C��ú&�-ý3�D�¦u&�eÍbØ)Ý�w<� �eÍbØ)ÝWá>Ú�×FÞ7ì�ö:ܽì�Øhì>ëFÚ»ÝÚ�×Ëà'×:ßT�?à»çRÝCÞRì>âiܽì�í7ÝUÛrÚSÛ�ì+Ý"Ü��BÞ¼çËì>ë�Ø�ÜiÛ�ÙBÛ�ÙËìhÚ»ëËëËÜdÛ�Üià'×¼Ú�â:ÝUÛ�ã�ç¼á�Û�çËã�ì�à»í¼Ú�×�à'ã�ëËÜi×¼Ú»â»ãrÚ�×Ëäh�Üi×���u-Ú»ä/ÚWÚ�ï8�}þR�3�MÑ3���T��wµàSö»ì>ã¹Ø�à'ã�â�ëËÝ%Ý�à¿Û�Ù¼ÚSÛ§ÜiÛ%Ü�Ý ó à'Ý�Ý�ܽÞËâ½ì�Û�àWÝ ó ì+Ú�ä¯Ú�Þ7à»ç�Û§ë�ìa�'ã�ì>ì>Ýà�í)ÞRì>âiܽì�í�� ð�� ó à»Ù¼× ó ã�à ó à:Ý�ì+ëWá�à»×¼ë�ÜiÛ�ܽà»×RÚ�â½Ü��+ÚSÛ�ܽà»×!à ó ì�ãrÚSÛ�à»ãrÝhàSö»ì�ãhÛ�ÙËì>Ý�ì�í�çË×¼á�Û�ܽà»×¼ÝµÚ»Ýq'çRÚ�â½ÜdÛrÚSÛ�ܽö»ì\ö»ì>ã�Ý�ܽà»×¼Ýµà�í ó ã�à'Þ¼Ú�ÞËܽâ½Ü½Ý�Û�Ü�áBá�à»×¼ë�ÜiÛ�ܽà»×RÚ�â½Ü��+ÚSÛ�ܽà»×8�%�ZÙËì?Ý�ì�Û§à»í�à ó ì�ãrÚSÛ�à»ãrݹÜ�Ýó Ú�ãrÚ �?ì�Û�ì�ã�Ü��>ì>ë Þ5ß��¾Ø�ÙËÜ�árÙ ÛrÚ�ä'ì>Ý\à»×Âà»ãrë�Üi×RÚ�âZöSÚ�â½çËì>Ý�� Ö}×:Û�çËÜiÛ�ܽö»ì>âiß'å�ã�ì�ö5ܽÝ�ܽ×��PÞ5ß ÚÝ�ì>×'Û�ì�×¼á�ìN¿!ç¼Ý�Üi×D�\Ú ó Ú�ã�Û�Ü�á�ç¼â½Ú»ã��8��á�à»×¼ëËÜdÛ�Üià'×¼Ú�â½Ü��+ÚSÛ�Üià'×�à ó ì�ãrÚSÛ�à»ã�Ø�Üiâ½â�á�Ú»ç¼Ý�ì%Û�Ù¼ì©Ú�»ì�×:ÛÛ�à¿Þ7ì�â½Üiì>ö»ì=¿QØ�ÜdÛ�Ù2�Ir¼ã��\×¼ì>Ý�Ýa����b�<�F���������#�<�>ñÖì7û¼ü\þ�ü<�4�M�dú����T���rü<�s�-ü^�/��ü�ú��k�T�#���h���iü��rú;�3�c� þ#�s�+û�ú;�!�4��ý��a�^�Sú&�-ý3� ý����3�¼ÿ��ý ���a���c��ú��-ý ��ý��Ð�N����ý �c� � ú��-ý �c�3�s�Þò���ú��-ý � ý��¼ü�þR��ú�ý�þ4���4�3���rü/���c�rý3�D�4�L� ú}ü<�RúXÓxPì ÙRÚ»Ý�Û�ì�×�Û�à ó à'Üi×:ÛBà'ç�Û�Û�ÙRÚSÛBܽ×�Û�ÙËÜ�Ý ó Ú ó ì�ã�� ó à»Ù¼× Ú�â�Ý�à¯ë�ìar¼×Ëì>ÝÆÚ�×�à ó ì�ãrÚSÛ�à»ãí�à»ã\Û�ÙËìQá�à'×¼ë�ÜiÛ�ܽà»×¼Ú»âiÜ��>Ú�Û�ܽà»× à»íµà»×¼ìo�eÍbؾÞ5ß Ú�×Ëà»Û�ÙËì>ã��jxAÜiÛ�Ù1ܽ×'Û�çËÜiÛ�ܽà»×¼Ý\Þ¼Ú'Ý�ì+ë à»×

ó»ìa¥7ã�ì�ßC Ý1�»ì>×Ëì�ãrÚ�â½Ü���ì>ë?á�à»×¼ë�ÜiÛ�ܽà»×RÚ�â½Ü��+ÚSÛ�ܽà»×Åïið�~Sò;å'Û�ÙËÜ�Ý�à ó ì�ãrÚSÛ�à'ãF���%Ú»Ý�Ý�à�á�ܽÚ�Û�ܽö»ì»å»Û�ÙËà'ç��»Ù×Ëà�ÛZá�à���Æç�Û�Ú�Û�ܽö»ì�u^�»Ü½ö»ì>׿ÛUØ�à��eÍbØ�ÝlïWÚ�×¼ë%ô�åZïWá�à»×Rë�ÜdÛ�Üià'×Ëì>ë Þ:ßiô²�»ì>×Ëì�ãrÚ�â½â½ßÆÜ�Ý�×¼à�ÛÛ�ÙËìBÝ�Ú �?ì�Ú'Ý�ôÅá�à»×¼ë�ÜiÛ�ܽà»×¼ì>ë!Þ5ß%ïcw#�=�ZÙËì�à ó ì�ãrÚSÛ�à»ãZÞ7ì�Ù¼Ú/ö'ì>Ý;q:çËÜiÛ�ì�Ý�Ü��?ܽâ½Ú»ã�â½ß Û�à¿à»çËãrÝÜiׯÛ�ÙËì?Ý ó ì>á�ܽڻâ�á�Ú'Ý�ìÆØ�ÙËì�ã�ì�Û�ÙËìÆá�à»×¼ë�ÜiÛ�ܽà»×¼Üi×��!Ú �'ì�×:Û� ݵÝ�à»çËãrá�ì+ݹÚ�ã�ìBÚ»âiâ��?à»ã�ì�ã�ì>âiÜ�Ú�Þ¼âiìÛ�Ù¼Ú»×WÛ�ÙËì�á�à'×¼ë�ÜiÛ�ܽà»×Ëì+ëWÚ�»ì>×'Û( Ý��

õF�Xå�è/¬Z���aÝÞ�lçx�<�bè!ÜÍ«�¬­������� Ö}×�ïdð(|/ò�å Ä�ì�ÙD�?Ú»×Ë× ó ã�à ó à'Ý�ì>Ý"ß»ì�Û�Ú�×Ëà»Û�ÙËì>ã�Ý�ì�Û�à»í ó à:ÝUÛ�çM�â½Ú�Û�ì>Ý"Üi×:Û�ì�×¼ë�ì+ë�Û�à¹ã�ì��»çËâ�ÚSÛ�ì�Ý�ì�q:çËì�×Rá�ì>Ý"à�íËã�ì�ö5Ü�Ý�ܽà»×¼Ý��8�µì ó ã�àSö5ܽë�ì+Ý�ÚµÝ�ìa�¿Ú»×'Û�ܽá�Ú»á�á�à»çË×:ÛÞ¼Ú»Ý�ì>ëQà'ׯØ�Ù¼ÚSÛ©ÙËìBá�Ú»âiâ�Ým���^�'ü<�\���h�Qþ��3�TÑ��¿ý��'ü<�J�ZØ�ټܽárÙ�å7â½Üiä'ì��eÍbØ)Ý�åHá�ڻׯÞRìÆö:ܽì�Øhì>ëÚ»Ý�Ú»ç���?ì�×:Û�ì>ë Ú�×¼à»×5ßT�\à'ç¼Ý§Þ7ì�â½Üiì�íµÝUÛrÚSÛ�ì+Ýa�o�µì ó ã�àSö:Ü�ë�ì+Ý�ÚQã�ì+á�çËãrÝ�Üiö'ì?ëËì<r¼×ËÜiÛ�ܽà»× í�à»ãá�à� ó çËÛ�ܽ×��\Û�ÙËì§ã�ì>Ý�çËâdÛ�à»í�Ú?Ý�ì(q'ç¼ì�×¼á�ì%à»í�ã�ì�ö5ܽÝ�ܽà»×¼Ý�Þ¼Ú»Ý�ì>ëWà»×!Ú��»Ü½ö»ì�×WØ�ܽëËì�×Ëܽ×��\ã�Ú»×Ëä�\à�ë�ì>â�����b�<�F���������#�<�?ö¸ì7û¼üZþ�ü<�#�M�zú����T�t��üa�J�-ü�����ü�ú^�1�T�#���h�Â�iü��rú8�3�C�%þ#�s�+û�úB�3�4��ý(�a�^�Sú&�-ý3��ý��*¥�ü�ûM��� �D��â �þ�ü<�(�L�#�-ý3� ý��¼ü�þ��Sú}ýSþ��4�3����ü%���c�rý ���#�L� ú�üa�¼ú£Ó

÷G�#«�«#��¬Zè��8Ý��;�t¬Z�Í��è!Üq��¬­���#�<�Í� xAܽâiâ½Ü½Ú�¿Ý©ï } ~SòC�'ì�×Ëì>ã�Ú»âiÜ���ì+ÝN� ó à»Ù¼×� Ýh×Ëà»Û�ܽà»× à»í�á�à'×M�ë�ÜdÛ�Üià'×¼Ú�â½Ü��>ÚSÛ�Üià'×�à ó ì�ãrÚSÛ�à'ã�Ý©Û�àÅܽ׼á�â½ç¼ë�ì!Ú�×5ßÀà ó ì>ã�Ú�Û�à»ãrݧàSö»ì>ã��eÍbØ)Ý�Û�ÙRÚSÛÆÝ�ÚSÛ�ܽÝ�í�ßÀÛ�ÙËìæ¹é§î ó ã�à ó ì�ã�Û�ܽì>Ý>å'ã�ì�í�ì>ã�ܽ×���Û�à�Û�ÙËܽÝhâ�Ú�ã��»ì>ã�á�â�Ú»Ý�Ý�à»í"à ó ì�ãrÚSÛ�à»ãrÝ�Ú»Ý�Û�ÙËì§Ý�ì�Û�à»í�ú¬þ��3���#���M�úX��ú��-ý ���#�H��ÙËì�ë�ì>Ý�á�ã�ÜiÞ7ì>Ý�ÛUØhà ó Ú�ã�Û�Ü�á�çËâ�Ú�ã�Ý�çËÞM�}á�â�Ú»Ý�Ý�ì+Ý"à»í¼Û�ã�Ú»×¼Ý��Æç�Û�Ú�Û�ܽà»×¼Ý��N�rý3�C�3� ú&�-ý3�C�3�J��Þò���ú��-ý � à ó ì�ãrÚSÛ�à»ãrÝZØ�ÙËÜ�árÙQÚ»ã�ì§ì(q'ç¼ÜiöSÚ�â½ì�×:Û�Û�à�� ó à'ÙË×� ݹá�à'×¼ë�ÜiÛ�ܽà»×¼Ú»âiÜ��>Ú�Û�ܽà»×!à ó ì�ãrÚSÛ�à»ãrÝ�åÚ�×¼ëo���ø#�T� ú��¿ü<�Rú)ý4�Ëü�þR��ú�ý�þ4�rå�Ú%í-Ú �?ܽâiß\à�íHà ó ì�ãrÚSÛ�à»ãrÝ ó Ú�ãrÚ �?ì�Û�ì�ã�Ü��>ì>ëÆÞ5ßc�QØ�ÙËÜ�árÙÆÛrÚ�ä'ì>Ýùzú 0 #�!.o24-£��"324��� ]�)!'£U3.!'�) à ×Ú��,*6F+ ����5�'�,&0 '�-��;�� �,&"(9(U3'���)!,� 4@ lzû -��� #,&'�)�"(.!)!'�5�üj24.!)2ýBWÖ8 R6B'£f<'�5R?��&�('=24)!)!+ ��+� #.324�\,���5&"!-£��"!5&'N)! �'�,1.! 4��0(�J2�]/2;5& #��'7+�./ #"!5�5&'�,&"(� �&,�24.()%,� !?�@� #5����!'=,&2��#'b 4@

-��J245�+ ��]a?�6B'�"!,�'H�&�!'1,�+�:*0!��'�5C)!'£U3.!+ ��+� #.hW

134

Page 139: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

à»×Àà'ã�ëËÜi×¼Ú»â)ö/Ú»âiç¼ì>Ý��%x�ì\ÙRÚ/ö»ìÆÚ»âiã�ì>Ú'ë�߯Ý�ì>ì�×ÀÛ�Ù¼Ú�Û�á�à»×¼ëËÜdÛ�Üià'×¼Ú�â½Ü��+ÚSÛ�Üià'×Àà ó ì�ãrÚSÛ�à»ãrݵڻã�ì×Ëà�ÛµÚ'Ý�Ý�à�á�Ü�ÚSÛ�ܽö»ì���æ%ÝZÜdÛ�Û�çËã�×¼Ý�à»çËÛ>å5Û�Ù¼ì�Ý�Ú�\ì§Ü�ÝZÛ�ã�çËì©í�à»ã%Ú»ë!ÚUç¼Ý�ÛR�?ì>×'Û¹à ó ì>ã�Ú�Û�à'ã�Ý�����b�<�F���������#�<�?þ¸ì7û¼ü\þ�ü<�4�M�dú����T���rü<�s�-ü^�/��ü�ú��k�T�#���h���iü��rú;�3�c� þ#�s�+û�ú;�!�4��ý��a�^�Sú&�-ý3� ý����3�¼ÿ��ý ���a���c��ú��-ý ��ý����1�����^ø#�T� ú��¿üa�¼úhý4�Ëü�þR��ú�ý�þ4����� ���rü/���C�rý3���#�L� ú}ü<�RúXÓ

ÿ ¨ª©�«s³�«�ñ�Õa«;òóña­©Ï"Ï)Õ�Ô*«�� ò»Ó�¯)Õ�ÑÆÎ ­�ÎQÒ ÒÅÕ���Ó�¯)Õ�Ñ\ÎÖ}ׯÛ�ÙËì ó ã�ì�ö5Üià'ç¼Ý¹Ý�ì>á�Û�ܽà»×¯Øhì/�?ì�×:Û�ܽà»×Ëì+ëQÛ�ÙRÚSÛ�&(ÂܽÝ%ܽëËìa� ó à»Û�ì>×'Û+å7á�à���BçËÛ�ÚSÛ�Üiö'ì»åHÚ�×¼ëÚ»Ý�Ý�à�á�Ü�ÚSÛ�Üiö'ì���ZÙ:çRÝ�å5Ú�Ý�ì�Û�à»í ó ì+ë�Ü��'ã�ì>ì>ë?Þ7ì�â½Üiì�í"ÝUÛrÚSÛ�ì>Ý�Û�Ù¼ÚSÛhܽÝ�á�â½à'Ý�ì>ë?çË×Rë�ì�ã�&(Àí�à»ã��¿Ý�Ú��üa�����£���Sú;ú��^�rü�ï í/òX��Ö}×'Û�çËÜiÛ�ܽö»ì�â½ß»å»ÙËÜ��'ÙËì�ã�Ý�Û�Ú�Û�ì+Ý)Üi×ÆÛ�ÙËì�â�ÚSÛ�Û�Ü�á�ì�á�à'×:Û�Ú�ܽ×��\à'ã�ìhܽ×�í�à»ã��¿ÚSÛ�ܽà»×Û�Ù¼Ú»×!â½àSØhì�ã�à»×¼ì>Ýeu�Ø�Ù¼ì�ã�ì»å¼Ú'Ý�ì<y ó â�Ú�ܽ×Ëì>ë�åBÔ �?à'ã�ì��ܽݹëËì�Û�ì>ãR�?ܽ×Ëì>ë²r¼ãrÝ�Û¹Þ:ß�q:ç¼Ú�â½ÜdÛUß!Ú�×¼ëÛ�ÙËì>×ÀÞ5ßoq'çRÚ�×:Û�ÜiÛUß�w#�m&(�Ú»á>á�ì ó Û�ݵÛUØ�à ó ì+ë�Ü��»ã�ì�ì>ëÅÞRì>âiܽì�í�Ý�Û�Ú�Û�ì+Ý%Ú»×¼ë¯ã�ì�Û�ç¼ã�׼ݵÛ�Ù¼ìÆâ½ì>Ú'ÝUÛó ì+ë�Ü��'ã�ì>ì>ë1Þ7ì�â½Üiì�í§Ý�Û�Ú�Û�ìQÛ�ÙRÚSÛ�á�à'×:Û�Ú�ܽ׼ݿÚ�Û âiì+Ú»Ý�Û�Ú»Ý��BçRárÙ1ܽ×�í�à'ãR�¿ÚSÛ�Üià'×1Ú'Ý?ÞRà»Û�Ù�à�íÛ�ÙËì����k�¹à»Û�ì�Û�Ù¼ÚSÛ%Û�ÙËÜ�Ý%Ý�ìa�?Ü��;â�ÚSÛ�Û�Ü�á�ìÆÙ¼Ú»Ý%Ú ô�çË×¼ÜdÛrõ ì�â½ìa�?ì>×'Û+å�d� �u-Ý�Üi×¼á�ì©O?&E(*d� /ö}OtwÚ�×¼ë!Ú�×�ô�Ú»×Ë×ËܽÙËܽâ½Ú�Û�à»ãrõ�ì>âiì��?ì�×:Û>å,d e*fhg u-Ý�ܽ׼á�ìCO& (�d e*f�g öJd e*fhg w<�

�ZÙËܽÝ?Ý�ç���'ì>Ý�Û�Ý�Û�Ù¼Ú�Û\Û�ÙËì�ã�ì��\Ü��»Ù:Û?ÞRìQÚ�Ý�ßT���?ì�Û�ã�Ü�á à ó ì�ãrÚSÛ�à»ãÆÛ�à�&(hå�à'×Ëì�Ø�ټܽárÙÛ�Ú�ä'ì>Ý©ÛUØhà ó ì+ë�Ü��»ã�ì�ì>ëPÞ7ì�â½Üiì�í�Ý�Û�Ú�Û�ì+Ý�Ú�×¼ë�ã�ì�Û�çËã�×RÝ©Û�ÙËìÎ�'ã�ì+ÚSÛ�ì>Ý�Û�Ý�Û�Ú�Û�ì á�à»×:Û�Ú»Üi×Ëܽ×��Q×Ëà�\à'ã�ì%Üi×Ëí�à»ã��?Ú�Û�ܽà»×�Û�Ù¼Ú»×�ì�ÜiÛ�Ù¼ì�ãZà»×Ëì��)Ö}×�í-Ú'á Û>å:Û�ÙËܽÝZà ó ì>ã�Ú�Û�à'ã�á�Ú»×�Þ7ì§ã�ì+Ú»ë�ܽâiß¿ë�ìar¼×Ëì+ëB���� ������������Jñ øb���Süa�/K � È�K × L�M;ÒBú�ûËü��Ëü4�3�s�»þ�ürü4�j�rü<�J�-ü��²� úX��ú�üiO � ���C�3�D�rü4�j��ÿ6K � Ò�3�c���¼ü4�3�s�»þ�ürü4�Ç��üa�J�-ü��Â�rú£�Sú}ü�O × ���c�3�D�rü4�Ç��ÿ1K × Ò�ú�ûËüZëËÜs¥Hç¼Ý�Üià'×�ý��lO � �3�C�mO × Ò1�»üa�7ý�ú�ü4�O � & 4.O × Ò7�L��ú�ûËü*�Ëü4�3�s�»þ�ürü4�²�rüa�J�-ü��t� úX��ú�ü/���c�3�D�rü4����ÿCK ��� K × ÓÖ}×�à�Û�ÙËì�ãhØhà»ãrëËÝ�å:ØhìµÛ�ã�Ú»×¼ÝUí�à'ãR� í�ç¼Ý�ܽà»×�ܽ×:Û�à\ë�Üs¥Hç¼Ý�Üià'×WÞ5ß?ã�ì ó â�Ú»á�Üi×��BÛ�ÙËì©çË×Ëܽà»×Wà�í�Û�ÙËìÝ�à'çËã�á�ì>ÝhÞ5ß¿Û�Ù¼ì�ܽã¹Üi×:Û�ì�ãrÝ�ì+á Û�ܽà»×8�

�"ã�Üiö5Ü�Ú�â½âiß'å"Øhì Ù¼Ú/ö'ìÆÛ�ÙËìWárÙRÚ�ãrÚ»á Û�ì�ã�Ü��+ÚSÛ�Üià'×Àà�í5O � &í4.O × ë�ܽã�ì+á Û�âiß�Üi×�Û�ì�ã��¿Ý�à�í�O �Ú�×¼ë6O × ����b�<�F���������#�<�'���uwO � & 4.O × w<uæô � È�ô × w1ö'O � u^ô � ÈRô × w � O × u^ô � È�ô × waÓ�¹àSØhì�ö»ì>ã>å+çË×Ëâ½Ü½ä»ì�Û�ÙËìhá�Ú»Ý�ì�à»íËí�ç¼Ý�Üià'×�åSÜiÛ�Ü�Ý"×¼à�Û ó à:Ý�Ý�ÜiÞ¼âiì�Û�à ó ã�àSö5Ü�ë�ì�Ú%árÙ¼Ú�ãrÚ»á�Û�ì�ã�Ü��>ÚSÛ�Üià'×à�í;uwO � &E4.O × w nßu�à»ã%ÜdÛrݹܽ׼ë�ç¼á�ì>ë¯à'ã�ëËì�ã�Üi×��:Ý�whë�ܽã�ì>á Û�âiß!ܽׯÛ�ì>ãR�¿Ý¹à»íaO � � Ú�×¼ëDO × � u-à»ãÛ�ÙËì>Üiã%Üi×¼ëËç¼á�ì+ë¯à»ãrë�ì�ã�ܽ×��'Ý4w#�¼Û�ÙËìÆâ½Ú�Û�Û�ì>ã%Ý�Ü�� ó â½ßQë�à ×Ëà»Û©á�à'×:Û�Ú�ܽ×Qì�×Ëà'ç��»Ù¯Ü½×�í�à'ãR�¿ÚSÛ�Üià'×���ZÙËܽݵܽݹÜiâ½âiçRÝUÛ�ã�Ú�Û�ì>ëQܽצØ�Ü��'çËã�ìeÃD�N�ZÙ¼ìerD�»ç¼ã�ì�Ý�ÙËàSعÝhÛ�Ù¼ìBë�Ü�¥7ç¼Ý�ܽà»×¯à�í)ÛUØhà ó ì+ë�Ü��»ã�ì�ì>ëÞRì>âiܽì�í�Ý�Û�Ú�Û�ì+ݵÚ�â½à»×D� Ø�ÜdÛ�Ù¯Û�Ù¼ìÆá�à'ã�ã�ì>Ý ó à»×¼ë�ܽ×�� ë�à��?Üi×¼Ú�Û�ܽ×�� Þ7ì�â½Üiì�í�Ý�Û�ÚSÛ�ì>Ý����¹àSØ�åHá�à'×M�Ý�Ü�ë�ì�ã¹Û�ÙËì�á>Ú»Ý�ì©Ø�Ù¼ì�ã�ì�Ú �»ì>×:Û�"�Ú»â½Ý�à?Ù¼Ú»ëQÝ�à»çËãrá�ì?ð�ڻݵÚ\Ý�à»ç¼ã�á�ì»å�Ü&� ì��iåÚK�èöêø'ðÈ4}MÈ�ñMú��æ¹âiÛ�ÙËà'ç��»ÙBÛ�ÙËì�ë�à��\ܽ׼Ú�Û�ܽ×��µÞ7ì�â½Üiì�í7Ý�Û�Ú�Û�ì>Ý�í�à'ãÍ$�Ú�×Rëm"3Ø�à'çËâ½ë�ÞRìZÜ�ë�ì�×:Û�Ü�á�Ú»â:Û�à¹Û�ÙËà'Ý�ìhÜi×Û�ÙËìÂrD�»ç¼ã�ì'å»Û�ÙËì%ë�à��?Üi×¼Ú�Û�ܽ×��BÞ7ì�â½Ü½ì�í�Ý�Û�Ú�Û�ìµã�ì>Ý�çËâiÛ�ܽ×���í�ã�à��ë�Ü�¥Hç¼Ý�ܽà»×WØ�à'çËâ�ë?ÞRì©ì<yËÚ»á�Û�â½ßÛ�Ù¼Ú�Û�à�í�$/���ZÙ5ç¼Ý>å»ÜiÛ�ܽÝ�Ü�� ó à'Ý�Ý�ܽÞËâ½ì»å'Üi×Ç�'ì�×Ëì>ã�Ú»â¬åSÛ�à�ë�ì�Û�ì�ã��?Üi×Ëì¹Û�ÙËì¹×¼ì�Ø�ë�Üs¥Hç¼Ý�ì>ë Ý�Û�Ú�Û�ì�»Ü½ö»ì�×QÝ�à»â½ì�â½ß¿Û�ÙËì�ëËà�?Üi×RÚSÛ�ܽ×��?Þ7ì�â½Üiì�í)Ý�Û�Ú�Û�ì>Ý��

Í�âiì+Ú�ã�âiß'å�&4�Ú�â�Ý�à¿í�à'ãR�¿Ý©Ú Ý�ìa�?Üs��â�ÚSÛ�Û�ܽá�ì�t�¹àSØhì�ö'ì�ã+åËÛ�Ù¼ìÆã�à»â½ì>ݵà�íFd� ÆÚ»×¼ë:d e*f�g Ú»ã�ìã�ì>ö»ì�ãrÝ�ì>ëB��Û�ÙËì�ô�ç¼×ËÜdÛrõ�ì�â½ìa�?ì�×:ۿܽݻd e*fhg uwO &@4.d e*fhg ö Otw\Ú�×¼ëLÛ�Ù¼ì1ô�Ú�×Ë×¼ÜiÙËܽâ�ÚSÛ�à'ã�õì�â½ìa�?ì�×:ÛµÜ�Ý5d uxO?&E4�d ö�d w#�¹æ¹â�Ý�à ×Ëà�Û�ì�Û�ÙRÚSÛ>åRÛ�à��»ì�Û�ÙËì>ã>å¼Û�ÙËìBí�ç¼Ý�ܽà»×ÅÚ»×¼ë¯ë�Ü�¥7çRÝ�ܽà»×

135

Page 140: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

à ó ì�ãrÚSÛ�à'ã�Ý�ã�ì ó ã�ì+Ý�ì>×:Û"Ú%�3�L� ú;þ#�^�<��ú&���/ü*���Sú;ú��^�rü�ï íSò:àSö'ì�ãHÛ�ÙËì�Ý�ì�Û�à�í ó ì+ë�Ü��»ã�ì�ì>ë§Þ7ì�â½Ü½ì�íRÝ�Û�Ú�Û�ì>Ý��Ö}× ó Ú�ã�Û�Ü�á�çËâ�Ú�ã+å:Û�ÙËì>ßWÚ�ã�ì�Ú�Þ¼Ý�à»ã�ÞËÜiÛ�ܽö»ì§Ú�×¼ë!ë�Ü�ÝUÛ�ã�ܽÞËç�Û�Üiö'ì�

� ÓWÏ)ÓQÐ�« ¬ ÑBÐ �xPì¿Ý�ç����?Ú»ã�Ü���ì+ëQÛ�Ù¼ì��¿Ú»Üi×�á�à»×:Û�ã�ܽÞËç�Û�Üià'׼ݩà�í�Û�ÙËÜ½Ý ó Ú ó ì�ã©Üi×PÛ�ÙËì?ܽ×:Û�ã�à5ëËç¼á Û�Üià'×�å�Ú�×¼ëë�ܽÝ�á�çRÝ�Ý�ì>ë�ã�ì�â�ÚSÛ�ì>ë�ã�ì>Ý�ì>Ú�ãrárÙ�Üi× Û�ÙËì�r¼ã�Ý�Û�ÛUØhàÀÝ�ì>á Û�Üià'×¼Ý��¦xPì!á�à'×¼á�â½ç¼ë�ì Ù¼ì�ã�ì�Ø�ÜiÛ�ÙLÚë�ܽÝ�á�çRÝ�Ý�Üià'×?à�í�Ý�ì>ö»ì>ã�Ú»â�à�íHÛ�ÙËìÂ�¿Ú�×5ßÆë�ܽã�ì>á Û�Üià'×¼Ý�ܽ׿Ø�ÙËÜ�árÙ?Û�ÙËÜ�Ý�Øhà»ã�äBá>Ú�׿Þ7ì¹ì<y5Û�ì>×¼ë�ì+ëB�

�ZÙËìÆã�ì>Ý�Û�ã�Ü�á Û�ܽà»× ó ã�ì>á�â½ç¼ë�ܽ×��!ì�q:ç¼Ú»âiâ½ß¯ã�Ú»×Ëä»ì+ëÀÝ�à»çËãrá�ì+ݵÜ�Ý�Ú�×PÜ�� ó à»ã�Û�Ú»×'Û§à»×¼ì�BÖ�ÛrÝã�à5à�Û�ã�ì+Ý�Ü�ë�ì+Ý�ܽ×�Û�ÙËì�í-Ú»á�Û"Û�Ù¼ÚSÛ)ÜdÛ)Ü�Ý"çË×Rá�â½ì>Ú�ã�Ø�Ù¼ÚSÛ�Û�à§ë�àµÜ½×BÝ�ÜdÛ�ç¼ÚSÛ�Üià'×¼Ý�Ø�ÙËì�ã�ì�Ý�à»çËãrá�ì+ÝCà�íì�q:ç¼Ú�â�á�ã�ì+ë�ܽÞËÜiâ½ÜiÛUß�à ¥Hì�ã¹á�à»×MéRܽá�Û�ܽ×��¿Üi×Ëí�à»ã��?Ú�Û�ܽà»×��N�%×¼ì ó à'Ý�Ý�ÜiÞËܽâ½ÜdÛUßWØ�à'çËâ½ëWÞ7ì§Û�à?Û�Ú»ä»ìÛ�ÙËì�ë�Ü�Ý�Ú�»ã�ì�ì��\ì>×:Û�Ú»Ý�ã�ì>Ú»Ý�à»×�í�à»ã)Ú �»×¼à'Ý�Û�Ü�á�Ü�Ý����F�µàSØ�ì>ö»ì>ã>å/Ý�çRárÙÆÚ ó à'âiÜ�á�ß�á>Ú�×Bâ½ì>Ú'ë�Û�à©Úâià:Ý�Ý�à�íHÛ�ãrÚ�×¼Ý�ÜiÛ�ܽö:ÜiÛUßÆÝ�à�Û�Ù¼ÚSÛ�Û�ÙËì%ã�ì+Ý�çËâiÛ�à»íCÚ�í�ç¼Ý�ܽà»×¿Ü�Ý�×¼à�âià'×��»ì>ã�Ú�×¼à�Û�Ù¼ì�ã�ëËà�?Üi×RÚSÛ�ܽ×��ÞRì>âiܽì�í�Ý�Û�ÚSÛ�ì�H�ZÙ¼ÚSÛ�Û�ì>árÙË׼ܽá>Ú�â½ÜdÛUßBÚ'Ý�Ü�ë�ì'åSÛ�ÙËìµÜ½Ý�Ý�ç¼ìhã�ì>Ý�çËã�í-Ú»á�ì>Ý�ÜdíCâ�ÚSÛ�ì�ã�Øhì¹Ú�ã�ìhܽ×�í�à'ãR�?ì>ëÞ:ßÅÚWâ½àSØ�ì>ãµãrÚ�×Ëä'ì>ëÅÝ�à»çËãrá�ìBÛ�Ù¼Ú�Û©ÙRڻݧë�ì<rR×ËÜdÛ�ì?à ó ܽ×Ëܽà»×¼Ý%à»×ÀÛ�ÙËì��¿ÚSÛ�Û�ì�ã(���%×Ëì?á�à'çËâ½ëàSö»ì�ã�ã�ܽë�ì¹Û�ÙËì�Ú �'×Ëà'Ý�Û�Ü�á�Ü�ÝR� Ú'ÝhØ�ì©Ù¼Ú/ö»ì%Üi×WÛ�Ù¼ì%Û�ã�ì+ÚSÛ��\ì>×:ÛZÚ»ÞRàSö'ì»å:Û�ÙËì>ã�ì>Þ5ß\ì+Ý�Ý�ì�×:Û�ܽڻâiâ½ßó ã�à��?à�Û�ܽ×��¯Û�ÙËì!à ó ܽ×Ëܽà»× à»í�Û�ÙËìWâiàSØhì�ãÆã�Ú»×Ëä»ì+ë Ý�à'çËã�á�ì àSö»ì>ã�Û�Ù¼ì!á�à�BÞ¼Üi×Ëì+ë à ó Üi×Ëܽà»×à�í�Û�ÙËì§ÙËÜ��»ÙËì>ã�ã�Ú»×Ëä»ì+ë à'×Ëì>Ý��1�ZÙËì>Ý�ìt�¿Ú/ß�Ý�ì>ìa� ã�ì>Ú'Ý�à'×¼Ú�Þ¼âiì<½�Øhì��?Ü��»Ù:Ûµá�à'×¼Ý�Ü�ë�ì>ã�Û�ÙËìâiì+Ý�Ý\á�ã�ì>ë�ܽÞËâ½ìWÝ�à»çËãrá�ì�Û�àPÞRìQÚ¯Û�Üiìa�;ÞËã�ì>Ú»ä»ì>ã����¹àSØhì�ö»ì>ã>å�Û�Ù¼ì!Ú óËó ã�à'Ú»árÙ Þ¼ã�ì+Ú�ä�ÝBë�àSØ�×Üdí�ܽ׼ÝUÛ�ì>Ú'ëQà�í)Ù¼Ú/ö5Üi×��?ÛUØhà?ì�q:ç¼Ú»âiâ½ßh�;ãrÚ�×Ëä'ì>ë!Ý�à'çËãrá�ì>ÝZØ�ÜiÛ�Ù¯à óËó à'Ý�ÜdÛ�ì�à ó ܽ×Ëܽà»×¼Ý>åËØhì�Ù¼Ú»ëà»×Ëì©Ù5çË×¼ë�ã�ì>ë Û�Ù¼Ú�ÛZö'à�Û�ì>ë¿à'×Ëì%ØZÚ/ß¿Ú�×¼ëWà»×¼ì¹Û�ÙRÚSÛ�ö»à»Û�ì>ë Û�ÙËì§à»Û�ÙËì>ã�ØZÚ/ß���ZÙËÜ�ÝhØ�à'çËâ½ëÚ�â�Ý�à�ã�ì>Ý�çËâiÛ�Üi× Ú�Û�ܽì�)Ö�í�Ú�âiàSØhì�ã�ã�Ú»×Ëä»ì+ëÆÝ�à»çËãrá�ì¹á>Ú �?ìµÚ»âià'×���â½Ú�Û�ì>ã�Ú�×¼ë¿Ý�Ü�ë�ì>ë?Ø�ÜiÛ�Ù¿Û�ÙËìà»×Ëì§ã�ì�×Ëì��'Ú»ëËì©Ý�à»çËãrá�ì'å5Û�ÙËì§í�çRÝ�ܽà»×!à ó ì�ãrÚSÛ�à»ãZØ�à'çËâ�ë í�à'ã�á�ì§Ú�»ã�ì�ì��\ì>×:ÛhØ�ÜiÛ�Ù¯ÜiÛ��

�%×Ëìhá�à»çËâ�ëCå�à»íRá�à'çËãrÝ�ì'åSÜi×5ö»ì>×:Û��?à»ã�ìZá�â½ì�ö»ì>ã�Ý�árÙËì��?ì>Ý)Ý�çRárÙÆÚ»Ý�ö'à�Û�ܽ×��©Ø�ÜdÛ�ÙBÛ�ÙËì=�¿Ú3�ÚUà»ã�ÜdÛUßQà»ã©ç¼Ý�Üi×D� Û�Ù¼ìÆ×Ëìay5Û%ÙËÜ��»ÙËì+ÝUÛ§à ó Üi×Ëܽà»×RÚSÛ�ì+ëÅÝ�à»çËãrá�ì+Ý�Û�àWÞËã�ì>Ú�ä¯ë�ì+Ú»ë�â½à�árä�Ýa�k�¹àSØ=�ì�ö»ì>ã>å�Ø�ÜdÛ�ÙËà»çËÛBØ�ì+Ú�ä'ì�×Ëܽ×��¯Ý�à��?ì¿Þ¼Ú»Ý�ܽá¿Ú'Ý�Ý�ç�� ó Û�Üià'×¼Ý�å"Û�ÙËì+Ý�ì�Ø�Üiâ½â�Ú»âiâ�Þ7ì�ë�à5à��\ì+ë�Üi×�»ì�×¼ì�ãrÚ�â;å:Ý�Üi×Rá�ìµÜiÛ�Ü½Ý ó à'Ý�Ý�ܽÞËâ½ìµÛ�àÆö5ܽì�ØAà»ç¼ãhÝ�ì�Û�Û�ܽ×��\Ú'Ý�Ú��»ì>×Ëì�ãrÚ�â½Ü��>ÚSÛ�Üià'×?à�í�Û�ÙËì©Ý�ì�Û�Û�ܽ×��æ¹ã�ã�àSؾڻëËëËã�ì+Ý�Ý�ì>ë!ܽׯÙËܽݵ֣� ó à:Ý�Ý�ÜiÞ¼Üiâ½ÜdÛUß��ZÙËì>à»ã�ìa� ï }/òX��9ZÚ»Ý�ܽá>Ú�â½âiß'å¼Øhì�á�Ú�×o�?à�ë�ì>â"ÙËÜ�ÝÝ�ì�Û�Û�ܽ×���Ú'Ý�à»×¼ì�Ø�ÙËì�ã�ì�Ú�â½â�Ú�»ì�×:ÛrÝ�Ú�ã�ì�Üi×�í�à'ãR�?ì+ëQÞ:ß�·�ì�q:ç¼Ú�â½â½ß���ãrÚ�×Ëä'ì>ë�Ý�à»ç¼ã�á�ì>Ý�� �ox�ìÚ�ã�ì©ì+Ý�Ý�ì�×:Û�ܽڻâiâ½ß Ú»Ý�ä5Üi×��\Û�Ù¼ÚSÛ�Û�ÙËì§í�à»â½âiàSØ�ܽ×��?á�à'×¼ë�ÜiÛ�ܽà»×¼ÝZÙËà'â½ë��

� çË×¼ã�ì+ÝUÛ�ã�Ü�á Û�ì>ë!ë�à�¿Ú�ܽ×���Ý�à»çËãrá�ì+Ýhá>Ú�×QÞ7ì�Ú�ã�ÞËÜdÛ�ã�Ú»ã�ß\Û�à�Û�Ú»â ó ã�ì<��à»ãrë�ì�ãrÝhàSö»ì�ã=Å å� ã�ì>Ý�Û�ã�ܽá�Û�ì+ë�ãrÚ�×��'ì�)Û�ÙËì�ÞRì>âiܽì�í�Ý�Û�Ú�Û�ì�ܽ׼ë�ç¼á�ì>ëQÞ5ßWÚ?Ý�ì�Ûµà»í�Û�ÙËì>Ý�ì�Ý�à'çËãrá�ì>Ý�Ý�ÙËà'çËâ½ëÞ7ì�Ú�×Ëà»Û�ÙËì>ãhÛ�à�ÛrÚ�â ó ã�ìa�;à'ã�ë�ì>ã>å� ܽ׼ë�ì ó ì>×¼ë�ì�×Rá�ìBà»í�Üiã�ã�ì>âiì>öSÚ�×:ÛµÚ�âiÛ�ì>ã�×¼Ú�Û�ܽö»ì+Ýa��Û�ÙËìÆà»ãrë�ì�ã�Üi×D�¿ÞRì�ÛUØ�ì>ì�×QÛUØhà Ø�à'ã�â�ëËÝܽ×LÚ�×ÂÜi×Rë�ç¼á�ì+ë Þ7ì�â½Üiì�í©Ý�Û�ÚSÛ�ìWÝ�ÙËà»ç¼â½ë à'×Ëâ½ß�ëËì ó ì�×¼ëLà»×LÙ¼àSØ Û�ÙËìQÝ�à'çËã�á�ì>ÝBã�Ú»×ËäÛ�ÙËà'Ý�ì©ÛUØ�à?Øhà»ã�â½ëËÝ>å� Øhì>Ú»ä²ë�Ú�ã�ì�Û�à ó ã�Üi×¼á�Ü ó âiì��hÜií�Ú�â½â)Ý�à'çËã�á�ì>ݹÝ�Û�ã�Ü�á Û�â½ß ó ã�ì�í�ì�ã%à»×ËìBØ�à'ã�â�ë!Û�à�Ú»×Ëà�Û�ÙËì�ã+åÛ�ÙËÜ½Ý ó ã�ì�í�ì�ã�ì�×¼á�ì§Ý�ÙËà'çËâ½ë!Þ7ì ó ã�ì>Ý�ì�ã�ö»ì>ë�å5Ú»×¼ë� Gt #5&'824-�-�"!5X2���'�� ]<?!+�.=�!+�,D@� #5�:b"!�J2���+� #.h?#'�24-X��24$#'�.<�B+�,c+�.(@� #5&:*'�)N9a]��Çâ&+�.!)!+ f�+�)!"324��,�äH68�!'�5&'8'R24-&�

+�.()!+ f(+�)("324��� ,F9 '���+�'£@c,^�X2���'b-�24.e9 'N24.a] 4@c���!'70 #,�,&+�9!��'1,� #"!5&-�'�,RWq���!'7)!+�,���+�.!-£�&+� #.e+�,8.! 4��+�:*0 #5^�X24.<��('�5�'#?3�! R6C'�f<'�5�W

136

Page 141: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

� ×Ëà'×¼ë�Ü�á ÛrÚSÛ�à'ã�Ý�ÙËÜ ó ��Ý�Üi×¼á�ì�Û�ÙËìZÝ�à'çËã�á�ì>Ý�Ú�ã�ì�ì�q:ç¼Ú»âiâ½ßh�;ãrÚ�×¼ä»ì>ë�å>×Ëà ó Ú�ã�Û�Ü�á�ç¼â½Ú»ã�Ý�à»çËãrá�ìÝ�ÙËà»ç¼â½ëWÙ¼Ú/ö»ì©ÜiÛ�Ý�à ó Üi×¼Üià'×¼Ý�ë�à�?ܽ׼ÚSÛ�ì� �æ¹ã�ã�àSØ ó ã�àSö'ì>ë¿Û�Ù¼ÚSÛ�Û�ÙËì�ã�ì�Ü�ÝZ×Ëà ó à'âiÜ�á�ß�Û�Ù¼Ú�Û¹à»Þ7ì�ß�Ý�Ú�â½âCà�í�Û�ÙËì+Ý�ì�á�à»×¼ë�ÜiÛ�ܽà»×RÝa�Ö�Û"ܽÝ�á�â½ì>Ú»ã>å>ÙËàSØ�ì>ö»ì>ã>å�Û�Ù¼Ú�Û�Ý�ܽ׼á�ìH�Ëü4� �s��þ�ürü4��Þ7ì�â½Ü½ì�í¼Ý�Û�ÚSÛ�ì>ÝCã�ì�ÛrÚ�ܽשÛ�ÙËì�í�çËâiâ ó ì+ë�Ü��»ã�ì�ì�à�íì>Ú»árÙÆÞRì>âiܽì�íUå»ÜdÛ�Ü�Ý ó à'Ý�Ý�ÜiÞËâ½ì�Û�à©ì<y ó ì>ã�Ü��?ì�×:Û�Ø�ÜdÛ�Ù��¿Ú»×:ߧä5ܽ׼ëËÝ)à�íRÜi×¼ëËç¼á�ì+ëBÞ7ì�â½Üiì�í-Ý�ÝUÛrÚSÛ�ì>Ýà�Û�Ù¼ì�ã\Û�Ù¼Ú»×Âë�à��?Üi×¼Ú»×:Û\à'×Ëì>Ý�� Ö}× ó Ú»ã�Û�ܽá�çËâ�Ú�ã+å�Üi×LÛ�ÙËì¯Ý�ì>á�à»×¼ë Ý�ì>á�Û�ܽà»×ÂØ�ìQë�Ü�Ý�á�ç¼Ý�Ý�ì>ëã�ì+á�ì�×:Û�ܽ×:Û�ì>ã�ì+ÝUÛhÜi× ô�í-Ú�ܽã�õ/�\ì>ãR�'Üi×D��à�í�Þ7ì�â½Üiì�í-Ýa�)Ö�ÛhØ�Üiâ½âHÞRì%Üi×:Û�ì>ã�ì+ÝUÛ�Üi×D��Û�à\Ý�ì>ìµÜií�Ø�ì©á�Ú�×á�Ú ó Û�çËã�ì©Û�ÙËì�Ý ó ì+á�Ü�rRá ó ã�à ó à'Ý�Ú�â�Ýb�?Ú'ë�ì§ã�ì+á�ì>×'Û�âiß'åËÚ�×¼ëWÜií)×¼à�Û�Ø�Ù5ß�Ö}×�Û�ÙËÜ½Ý ó Ú ó ì�ãZØ�ÙËì�×!Øhì©Ú'Ý�Ý�ç��?ì>ë Û�Ù¼Ú�Û�Ú»âiâ�Ú�»ì>×'ÛrÝ�Ý�Ù¼Ú»ã�ì%Û�ÙËì�á�ã�ì+ë�ÜiÞ¼Üiâ½ÜdÛUß�ã�Ú»×Ëä5Üi×��à»× Ý�à»çËãrá�ì+Ýa�QÖ}×I�'ì�×Ëì>ã�Ú»â¬å)Ú�×¼ë�Û�ÙËì>Ý�ì�ãrÚ�×¼ä:ܽ×��:Ý�á�Ú»× ö/Ú»ã�ß�Ú �?à'×��ÅÚ �'ì�×:Û�Ý>å�Ú»×¼ë ì>ö»ì�×árÙ¼Ú�×��'ì�Ø�ÜiÛ�Ù¼Üi×?Ú�×?Ú �'ì�×:Û�àSö'ì�ã�Û�Ü��?ì�HØËçËã�Û�ÙËì>ãR�?à'ã�ì'å�Ú�×?Ú �»ì>×:Û� Ý�ãrÚ�×¼ä:ܽ×��µí�çË×¼á Û�Üià'׿á�Ú�×ë�ì ó ì�×¼ëÀà»×¯Û�ÙËì?á�à»×:Û�ì<y5Û��Cë�Ü�¥7ì>ã�ì>×:Û§Ý�à»çËãrá�ì+Ý;�¿Ú/ß!Ù¼Ú/ö»ìBëËÜs¥Hì�ã�ì�×:Û�Ú�ã�ì>ڻݹà�í�ìay ó ì�ã�Û�Ü�Ý�ì�ç�y ó â½à»ã�Üi×��ÆÛ�Ù¼ì�ÞRì>Ù¼Ú/ö5Üià'ã�à�í"í�çRÝ�ܽà»×¯Ú�×Rë!ëËÜs¥Hç¼Ý�Üià'×!ܽ×!Û�ÙËì+Ý�ìe�?à'ã�ìt�»ì>×Ëì�ãrÚ�âCÝ�ì�Û�Û�ܽ×��'ÝZÜ�ÝÚ�×!à»Þ5ö5Üià'ç¼ÝZ×Ëì<y5Û¹Ý�Û�ì ó �

�ZÙËì¿Ø�à'ã�äÀÙËì�ã�ì?Ù¼Ú»Ý�Þ7ì�ì>×Iq:ç¼Ú�â½ÜdÛrÚSÛ�Üiö'ì\ܽ×�×¼ÚSÛ�çËã�ì�²�µàSØ�ì>ö»ì>ã>åCà»í Û�ì�× ëËà�¿Ú�ܽ׼ݧà�íÜi×:Û�ì>ã�ì+ÝUÛ§Ù¼Ú/ö'ì\Ú'ëËë�ÜiÛ�ܽà»×¼Ú»âHq:ç¼Ú»×:Û�ÜiÛ�ÚSÛ�Üiö'ì\Ý�Û�ã�ç¼á�Û�çËã�ì�u-ì� �D�½å"Ú ó ã�à»Þ¼Ú»ÞËÜiâ½ÜiÛUßQëËܽÝ�Û�ã�ÜiÞ¼ç�Û�ܽà»×ã�Ú�Û�ÙËì>ãµÛ�Ù¼Ú�×�ÚWÝ�Ü�� ó âiìÆÛ�à»Û�Ú�â ó ã�ì<��à»ãrë�ì>ã%àSö'ì�ãµØhà»ã�â�ëËÝ%ëËì<r¼×Ëܽ×��¯ÚWÞRì>âiܽì�íZÝUÛrÚSÛ�ì!w¹Ø�ټܽárÙÚ �»ì>×:Û�Ý�á�Ú�×�Û�Ú�ä'ìhÚ'ë�öSÚ�×:Û�Ú�»ì�à»í¼Ø�ÙËì>×%�?à�ë�Üdí�ß5ܽ×��%Û�ÙËì>ÜiãH�?ì�×:Û�Ú»â�ÝUÛrÚSÛ�ì+Ýa��Í�à»×¼Ý�ì�q:çËì>×:Û�â½ß»åì<y5Û�ì>×¼ë�ܽ×��QÛ�ÙËÜ�ݧØhà»ã�ä¯Û�à ó ã�àSö5ܽë�ì ó ã�Üi×Rá�Ü ó âiì+ë�Ú»á�á�à»çË×:ÛrÝ%à»íhÙ¼àSØÛ�Ù¼ì¿ÞRì>âiܽì�í�ÝUÛrÚSÛ�ì>ݧà�íÚÎ�»ã�à»ç ó à�í�Ú �'ì�×:Û�Ý%árÙ¼Ú�×D�»ìBç¼×¼ë�ì�ã§Ý�ç¼árÙÀá�à'×¼ë�ÜiÛ�ܽà»×¼ÝµÜ½Ý§Ú»×Ëà�Û�ÙËì�ã%Ü�� ó à»ã�Û�Ú�×:Û%í�à»â½âiàSØ�ç óÝUÛ�ì ó �

Ø�ܽ׼ڻâiâ½ß»å:Ø�ìµ×Ëà�Û�ì¹Û�ÙRÚSÛ�Ø�Ù¼Üiâ½ì¹Û�ÙËã�à'ç��»Ù?Û�ÙËÜ�Ý ó Ú ó ì>ã�Øhì¹ö5ܽì�Øhì>ë?Û�ÙËì�u ó ã�ìa��à'ã�Ý�Û�ã�ܽá�Û4wà»ãrë�ì�ã�Üi×D�'Ý)à»× ó à'Ý�Ý�ܽÞËâ½ì¹Øhà»ã�â½ëËÝ�Ú'Ý�ë�ì+Ý�á�ã�ܽÞËܽ×���Ô Þ7ì�â½Ü½ì�ía½å5Üi×?í-Ú'á Û�Û�Ù¼ì�ã�ì¹Ü½Ý�×Ëà�Û�ÙËܽ×���Üi×?Û�ÙËìí�à»ã��?Ú»âiÜ�ÝR� Û�àÇ�¿Ú»ä»ì§Û�Ù¼ìÇÔ ó ã�ì�í�ì�ã�ì�×¼á�ìËܽ×:Û�ì�ã ó ã�ì�ÛrÚSÛ�ܽà»×Qâ½ì>Ý�Ý¹Ú ó Û%u�ܽ׼ë�ì�ì+ëCåRÛ�ÙËÜ�ݹã�ìa�¿Ú�ã�äÚ óËó â½Ü½ì>Ý�Û�àÎ�?à'Ý�Ûµà�í)Û�ÙËìBØhà»ã�ä ܽ×Àæ¹ÖZà»×¯Þ7ì�â½Ü½ì�í�ã�ì�ö5ܽÝ�ܽà»×¯Ú�×¼ëQ×¼à»×��?à»×¼à�Û�à'×ËÜ�á§ã�ì+Ú»Ý�à»×M�Üi×��Tw#�k�ZÙËܽݵã�ڻܽÝ�ì>ݹÛ�ÙËì�q:çËì>Ý�Û�ܽà»×¯Ø�ÙËì�Û�ÙËì>ãµÛ�ÙËì�ã�ìBÜ�ݵڻ×Åܽ×'Û�ì�ã�ì>Ý�Û�ܽ×���á�à»×Ë×Ëì+á Û�Üià'×QÛ�à�Þ7ì�?Ú'ë�ì¿Þ7ì�ÛUØhì�ì�×�Û�ÙËì ë�ì�ö'ì�â½à ó �?ì>×'Û�ܽ×�Û�ÙËÜ�Ý ó Ú ó ì>ãBÚ�×Rë�á�â½Ú'Ý�Ý�Ü�á�Ú�â�Øhà»ã�ä¯Ü½×�ì+á�à'×Ëà�?Ü�á�Ýà»× ó ã�ì�í�ì�ã�ì�×¼á�ì�Ú ��'ã�ì��'Ú�Û�ܽà»×��

µ Ô���ÎQѬ ña«¹Ò6µ�±Ü«µÎ\ÏH¯xPì Øhà»ç¼â½ë�â½Üiä'ì¿Û�àÅÛ�Ù¼Ú»×Ëä!é©Ú�×¼Üiì>âNÄ�ì>Ù��¿Ú�×Ë× Ú»×¼ë�Û�Ù¼ìÎ�?ìa�BÞ7ì�ãrÝ�à»íZÛ�ÙËì��µà»Þ7à�ÛrÝ�ã�ì<�Ý�ì+Ú�ãrárÙ��'ã�à'ç ó ÚSÛe�:Û�Ú»×�í�à»ãrë¦Þµ×ËÜiö'ì�ãrÝ�ÜiÛUßWí�à»ãµÜ½×¼Ý�Ü��'Ù:Û�í�çËâ�í�ì�ì+ë�Þ¼Ú»áräC�kë)ì>ë�ã�ÜdÛ�à îÅÚ/ß5×¼Ú�ãrëT�Õ�ì�Ü�ë?Ö�Ö# Ý�Øhà»ã�ä�ØZÚ»Ý ó Ú»ã�Û�âiß\Ý�ç ó¼ó à'ã�Û�ì>ë\Þ5ß\Úe�%ÚSÛ�ܽà»×RÚ�âDë�Ù:ß�Ý�ܽá>Ú�âD��á�Üiì>×¼á�ìÂÍ�à'×¼Ý�à»ã�Û�ܽç��ØËì�â½â½àSعÝ�ÙËÜ ó �͹�à'Ú/öÎ�5Ù¼à»Ù¼Ú�� Ý�Ø�à'ã�ä?ØZÚ»Ý ó Ú�ã�Û�â½ß¿Ý�ç ó¼ó à'ã�Û�ì>ë Þ5ß?Û�ÙËìk�%ÚSÛ�Üià'×¼Ú�âB��á�ܽì�×Rá�ìØËà»ç¼×¼ëËÚSÛ�Üià'×Wç¼×¼ë�ì�ã©é©ã�Ú»×'Û;�µàD��Ö�Ö��T�X|Tz5ðaÃMz3{¼ð�� ���!+�,C+�,c24-���"324��� ];2�,^�&5� #.!$#'�5C5&'�,���5&+�-£��+� #.=�&�!24.;���32��B:;24)('F9<]=�F5&5& �61W\E�+ f<'�.*�(+�,c@� #5�:b"!�J2���+� #.b+�.��'�5&:*,� 4@D+�.!)!+ f�+�)!"324�hf4245&+J249!��'�,8�&�32��8�&2��#'N #.Â,& #"!5�-�'�,�24,8f424��"!'�,R?3���!'1-� #.()!+ �&+� #.e,^�X2���'�,F���32��H.( = #.!'+�.()!+ f(+�)("324�3,&�! #"(��)*24� 6�2�]�,c)( #:*+�.32��&'#W,Ö8'�5�'C6B'F245&'B)!'�:;24.!)!+�.($H�&�32��C.! �+�.!)!+ f�+�)!"324���������M)! #:*+�.32���'#W

137

Page 142: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

� «;ò «¹Ð�«µÎQÔ*«�¯ïið�ò�ÍhÚ�ã�âià:Ý8ç��/æµâ½árÙ¼à»çËã�ãRèà'×�å3ë)ì�Û�ì�ã�é?êÚ�ãrë�ì�×Ëí�à»ãrÝ�åSÚ�×Rëcé©Ú/ö:Ü�ëBîÀÚ�ä5Üi×RÝ�à'×��M�%×ÆÛ�Ù¼ì�â½à�'ܽáà�íCÛ�ÙËì�à'ã�ß¿árÙ¼Ú»×��»ì���ë�Ú»ã�Û�ܽڻâD�?ì�ì�ÛZá�à»×:Û�ãrÚ»á�Û�ܽà»× Ú�×Rë¿ã�ì>ö5ܽÝ�Üià'׿í�çË×¼á�Û�ܽà»×¼Ý�����ý3��þ#�C�3�ý����7ÿ3���rý3�s�^��¥�ý�� �^��åcí{���í�ð�{! Mí»ñ{Ëå¼ð�|ÝhíM�

ï }/ò#"§ì�×Ë×¼ì�Û�Ù�óD��æ¹ã�ã�àSØe�$�Cý��a�^�3� %"û¼ý3�^�rü�� �c��v<�c�3���(�^�3�D� �'&\� �J�¼ü#�#��xAܽâ½ì�ß»åM�µì�Ø>¹�à'ã�äHå}S×¼ëWì>ëËÜdÛ�Üià'×�å�ð�|�~»ñD�

ï ñSò�ÍÂ�V9ZÚ�ãrÚ�â;å�C�("§ãrÚ�çRÝ�å ó\�»î¯Ü½×Ëä»ì>ã>å:Ú�×¼ë*)����C���5ç¼ÞËã�Ú»Ù��¿Ú�×ËÜ�Ú�×8�CÍ�à�ÆÞËܽ×ËÜi×D�©ä5×ËàSØ�â��ì>ëM�'ìWÞRÚ»Ý�ì>Ý?á�à'×¼Ý�Ü�Ý�Û�ܽ×��Pà»í;r¼ãrÝUÛR�;à'ã�ëËì�ãÆÛ�ÙËì>à»ã�Üiì+Ýa�+%�ý �;�C��úX��ú��-ý �c�3�Fv<�¼ú}ü<���s�s�'üa�c�rü�åÝDu�ð(w#� Ãhí, �z�ð»å7ð�||h}M�

ï Ã�ò%�C��9hì>×�í�ì�ã�Ù¼ÚSÛ+å*é%��éµç¼ÞRà'ܽÝ>å�Ú�×¼ëj�%��ë�ã�Ú'ë�ì��ØËã�à� Ý�ìa�¿Ú�×:Û�Ü�á?Û�àÀÝ�ß5×:Û�Ú'á Û�ܽá Ú ó �ó ã�à'Ú»árÙ¼ì>Ý�Û�à¿Ü½×�í�à»ã��¿ÚSÛ�Üià'×!á�à�BÞ¼Üi×¼Ú�Û�ܽà»×!Üi× ó à'Ý�Ý�ÜiÞËܽâ½Ü½Ý�Û�Ü�á©âià��»Ü�á ��Ö}×69k�,9hà»ç¼árÙ¼à»×M�î¯ì�ç¼×ËÜiì>ã>å5ì+ë�ÜdÛ�à»ã+å�÷N�(��þ�üX�h�Sú&�-ý3��� �c�m¦��T�#�-ý3�Àý���v<�;�¼ü�þæ��ü4��úÍv<����ýSþ#�Ç�Sú&�-ý3�DÒ��Rú&�D�3�-ü<����2¦��8ò�ò<���Hü#�4�/�3�C�-�Hý��rú.%�ý �;�C��ú����T��å ó Ú �»ì+ݧðaÃRð/ 7ð(~Ëð���ë�Ù5ß�Ý�ܽá>Ú�)�ì>ã�â�Ú �RåCð�||TzT�

ï í/òBé§Ú�ã�ã�ì�Û�ÛC9hܽã�ä5ÙËà¥��N¥F�Sú;ú&�^��üjìHûËürý�þ ÿSå�ö»à'âiçD�\ì�}íQà»í-%�ý ���iý�Ïa�M���M� ëb�D�a�J�^�4�Sú&�-ý3�D�#�æ;�?ì�ã�Ü�á�Ú�×QîÅÚ�Û�ÙËì��?Ú�Û�Ü�á�Ú»â8�5à�á�ܽì�ÛUß»åCð(| ÃhÝ��

ï ~Sò�æ/�b9hà»ã��»Ü�ëËÚ�Ú�×¼ë%�t�>Ö£�?ܽì�â½Üi×¼Ý�ä5Ü���éµì+á�Ü�Ý�ܽà»×%�?Ú»ä5Üi×��µÜi×Bá�à���?ÜdÛ�Û�ì>ì>Ý��"æ í�ãrÚ �?ì�Øhà»ã�äí�à»ãZë�ì>Ú»âiܽ×��ÆØ�ÜdÛ�Ù�ܽ׼á�à'×¼Ý�Ü�Ý�Û�ì�×Rá�ß¿Ú�×¼ë ×Ëà»×M�X�?à»×Ëà»Û�à'×Ëܽá�ÜdÛUß���Ö}×2ëZþ�ý��rürü4�3���T�3��ý��©ú û¼ü0Qý�þRÑ!��û¼ý��Lý3�%àÆý3�\�?ý �7ýSú}ý3�\�^�.1%ü4�!��ý3�\���h�»å ó Ú �'ì>Ý;}Ëð/ 5ñh}�å7ð�|�Ý ÃD�

ï�z+ò�Í�ã�Ú»Ü��59�à'ç�Û�ܽâ½Üiì>ã��:Ö�Û�ì>ã�Ú�Û�ì>ëBã�ì>ö5ܽÝ�Üià'×�Ú�×¼ë%�?Üi×ËÜ��¿Ú�â�árÙ¼Ú»×��»ì�à�íRá�à»×¼ë�ÜiÛ�ܽà»×RÚ�â:ÞRì>âiܽì�í-Ý����ý3��þ#�c� �"ý��5ë�ûT���iý3��ý4�ËûM�^��� ��¥"ý���^��å\}�íM��} ~'ñ, 5ñ�{�íËå¼ð�|�|~D�ï ÝSò�Í�ã�Ú»Ü���9hà'ç�Û�ܽâiܽì�ã!Ú�×¼ëAîÅà»Ü�Ý�èì+Ý é©à»â�ëËÝR�a�?ܽë�Û�� �%×�Û�ÙËìÅã�ì�ö5ܽÝ�ܽà»×Aà�í�á�à»×¼ë�ÜiÛ�ܽà»×RÚ�âÞRì>âiܽì�í�Ý�ì�Û�Ý���Ö}×=é��bÍ�ã�à�á>á�à¼åNÄ1�7ØRÚ�ã�Ü32×¼Ú'Ý?ë�ì�âkÍ�ì>ã�ã�à¼åhÚ�×¼ë1æ/�N�¹ì�ã���Ü��¼å�ì>ë�ÜiÛ�à'ã�Ý>å%�ý3�c� � ú��-ý �c� ���;ua¦�þ�ý3�|ë�ûT���iý3��ý��Ëû�ÿBú}ý-%�ý3���c��ú}ü�þ��B�<�-üa�c�rü�å�árÙ¼Ú ó Û�ì�ãH|Ëå ó Ú�»ì+Ý�}~hz4 ñ{�{��D��y5í�à»ãrë²Þµ×ËÜiö'ì�ãrÝ�ÜiÛUßÎë�ã�ì>Ý�Ý>å7ð(||híM�

ï |Sòmó»à'ÙË×�ÍhÚ»×'ÛUØhì�â½â&�tÕ¹ì>Ý�à»â½ö:ܽ×��!á�à»×�é¼Ü½á�Û�ܽ×���ܽ×�í�à'ãR�¿ÚSÛ�Üià'×��5��ý3��þ#�c� ��ý��m¥"ý���^�#Ò�¥F�3�\�� �D���'ü<ÒN�3�C�Cv<����ý�þ#����ú��-ý �¼åczM�ið(|Ëð/ �}}{Ëå¼ð(||�Ý��

ïdð({Sò�æµë�×RÚ�×6é%Ú»ã�Ø�Ü�árÙËì�Ú�×Rë%ó'ç¼ë�ì+Ú�ë)ì>Ú�ã�â&�1�%×!Û�ÙËì�âià��»Ü�á©à�í)ÜdÛ�ì�ãrÚSÛ�ì+ë!ÞRì>âiܽì�í)ã�ì>ö5ܽÝ�Üià'×��÷%þ ú�� Ðb�<�^� �<v#�Rú�üa���J�s�:ü<�C�rü�åcÝ|Du�ð/ �}w#�½ð/ �} |¼å¼ð�|�|hzM�

ïdð'ð�ò�î¯Ü�árÙ¼Ú�ì>â�ØËã�ì�ç¼×¼ë�Ú�×Rëmé%Ú»×Ëܽì�âTÄ�ì�ÙD�?Ú»×Ë×��X9hì�â½Üiì�íËã�ì>ö:Ü�Ý�Üià'×�Ú�×¼ë�ãrÚSÛ�Üià'×¼Ú�â»Üi×�í�ì>ã�ì>×¼á�ì���"ì+árÙË×ËÜ�á�Ú»â�Õ�ì ó à»ã�Û;�*ÕÁ|à �rð�~¼åh�µì�ÞËã�ì�ØùÞ¹×Ëܽö»ì>ã�Ý�ÜiÛUß»åHð�|�| ÃD�

ïdð!}/ò/�¹Ü½ã�ØËã�ܽì>ëM�¿Ú�×ÅÚ»×¼ë6ó'à'Ý�ì ó Ù%¹��c�µÚ»â ó ì�ã�×��l9�ì>âiܽì�í�ã�ì�ö5Ü�Ý�ܽà»×��hæá�ã�ÜiÛ�ÜLq'ç¼ì�hÖ}×Nëhþ�ý3��rü�ü4� ���h� �Æý���ú�ûËü�¦7� �rú û%v#�Rú�ü�þ#�c��ú��-ý �c� ��%�ý3�(��ü�þ�üa�c�rü?ý3�6ëhþ#���c�a� �c�½ü#�¿ý��76��7ý �=�iü4�a�:ü1µü£�7þ�ü<��ü<�RúX��ú��-ý ���3�C�81%ü4�!��ý3�\���h�6¢96:1 â ;=< ¤ å ó Ú�»ì>Ý=ÃT}�ð� hÃ'ñ¼ð»åRð�||�~��

ïdð+ñSò/ë)ì�Û�ì�ã�é?êÚ»ã�ëËì�×�í�à'ã�Ý��=6��7ý �=�iü4�a�:ü;���C¦7�J�?>Xu�ùÀý��'ü<�s���h�Bú û¼üFêBÿ3�c�3���^�<��ý�� @F�c�L� ú�üa���^��Rú£�Sú}ü#�#��î¯Ö��Áë�ã�ì>Ý�Ý�åHð(|ÝÝD�

138

Page 143: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

ïdð�Ã�ò�æµëËÚ� é©ã�àSö»ì����ZØhà��\à�ë�ì>âiâ½Ü½×��'Ýhí�à'ã�Û�ÙËì>à»ã�ß�árÙ¼Ú»×��»ì��A��ý ��þ#�c�3�"ý��5ë�ûM���iý!��ý��¼ûT�^�4�3�¥"ý���^��å�ð3zT�½ð(íhz4 7ð3z3{¼å¼ð�|�ÝÝD�

ïdð!í/ò%�5ö»ì>×è�%ö»ìÎ�%Ú�×¼Ý�Ý�à'×��ßì7û¼ürýSþ#�^�VuQ÷B�<��ü4�3�L��ûC�5ý ��þ#�c� �hý��©ë�ûT���½ý!��ý��¼û5ÿSåN~'ñDu�ð/ M}�w åð�|�|hzT�

ïdð(~Sò/Õe�cÍÂ�Zó'ì<¥Hã�ì>ß�%ìHûËü�¥�ýR��^�?ý��Ðê\ü��a�L�#�-ý3�\�7Þ¹×Ëܽö»ì>ã�Ý�ÜiÛUß�ë�ã�ì>Ý�Ý�åDÍ�ÙËÜ�á�Ú�»àRå7ð(|~híM�ïdð3z+ò/�¹Ü½ã�à»í�ç��?Ü7"�ÚSÛrÝ�ç¼×Ëà Ú�×¼ë�æ¹â½Þ7ì�ã�Û�àè����î¯ì>×¼ë�ì>â��>à»×��Ùë�ã�à ó à:Ý�ÜiÛ�ܽà»×RÚ�âZä5×ËàSØ�âiì+ëM�»ìÞ¼Ú»Ý�ì ã�ì�ö5Ü�Ý�ܽà»×LÚ»×¼ëj�?ܽ×ËÜ��¿Ú»âhárÙ¼Ú»×��»ì���÷%þ ú&� ÐN�a�^�3�*v#�Rú�üa���J�s�:ü<�C��ü�åNí�}�u-ñhw#��} ~'ñ, M}| ÃRå

ð�|�|Ëð�ïdð(ÝSò%��èì>Þ¼Ú»Ý�Û�ܽì�×D"§à»×¼Üiì+á<��×5ß!Ú�×¼ëoÕ¹Ú��èà»×oë�ܽ×Ëà�ë?èì�ã�ìa���=�%×QÛ�Ù¼ì�âià��»Ü�á�à�í��?ì�ã��»Ü½×��\��Ö}×

ëhþ�ý(�rü�ü4� ���h� ��ý��§ú û¼üE�c�F>'ú û»v<�¼ú}ü�þ#�C�Sú&�-ý3�C�3��%�ý3�(��ü�þ�ü<�C��ü?ý �%ëhþ#���c�a� �c�½ü#�\ý��'6��7ýt�=�J�ü4�a�:ü.1%üX�Hþ�ü#��üa�¼ú£�Sú&�-ý3�Û�3�c�81µü4�!��ý �D���h�D¢96:1 â ;?G ¤ å ó Ú �'ì>ÝbÃ�Ý�Ý, hÃh|ݼåRð(||ÝD�

ïdð(|Sòmé%Ú»×Ëܽì�â�Ä�ì>Ù��¿Ú�×Ë×8�Í9hì�â½Ü½ì�í�ã�ì�ö5Ü�Ý�ܽà»×"å'ã�ì�ö5Ü�Ý�ì+ëB��Ö}×:ëhþ�ý(�rürü�� ���h� ��ý��§ú û¼ü5¦�ý3��þ ú}ürü<�Rú ûv#�Rú�ü�þ#�c��ú��-ý �c�3�H��ý3���¼ú�%�ý3�(��ü�þ�üa�c�rü�ý���÷%þ ú�� Ðb�<�^� ��v<�¼ú}ü<���J�s�:üa�c�rü�¢°v��H%"÷Ðviâ ;=I ¤ å ó Ú �'ì>Ýð(í»ñ Ã! Hð(í Ã�{Ëå¼ð�||híM�

ï }{Sò/ë�Ú�à'âiàÎÄ"ÜiÞ7ì�ãrÚSÛ�à»ã�ìBÚ�×¼ëÀîÅÚ�ãrá�à²�ËárÙ¼Ú�ì>ã�í��§æ¹ã�ÞËÜiÛ�ãrÚSÛ�ܽà»×8�Zæ á�à����Bç�ÛrÚSÛ�ܽö»ìÆà ó ì�ãrÚ3�Û�à'ãµí�à'ã%Þ7ì�â½Üiì�í�ã�ì�ö5Ü�Ý�ܽà»×��©Ö}×�ëZþ�ý��rürü4�3���T�3� ý��Æú û¼ü*�Hü4�rý3�C�J0Qý�þ#���K%�ý3�(��ü�þ�ü<�C��ü�ý3�ú û¼ü�¦��M�c�� �?üa�¼ú£�3�J��ý���÷%þ ú&� ÐN�a�^�3�,v<�¼ú}ü<���s�s�'üa�c�rü»¢,0!y:%̦¼÷Ðv2â ;(I ¤ å ó Ú �'ì>Ý=}�ð3z4 M}�} ݼåð�|�|�íM�

ï }Ëð�ò�æ¹ÞËÙRÚ/ß'ÚeÍÂ��µÚ/ß:Ú�äC��Ö�Û�ì>ã�Ú�Û�ì>ëÆÞ7ì�â½Üiì�íCárÙRÚ�×��'ì�Þ¼Ú'Ý�ì+ëBà'×?ì ó Ü�ÝUÛ�ìa�?Ü�áZì�×:Û�ã�ì�×¼árÙD�\ì>×:Û��@hþRÑ'ü<�\�¼ú&�D�L� åMüð�� ñhí�ñ, �ñ|�{Ëå¼ð(||ÃD�ï }�}/ò�æ¹ÞËÙRÚ/ß'ÚÇÍÂ�\�µÚ/ß:Ú�äHåM�¹à'ãR�¿Ú�×�¹��cØËà5à¼åËîÀÚ�çËã�ܽá�ì/ë�Ú �'×:çRá�á�àRå:Ú»×¼ëQæ¹ÞHë�çËâH��ÚSÛ�Û�Ú�ã(�

Í�Ù¼Ú�×D�»Ü½×��¯á�à»×Rë�ÜdÛ�Üià'×¼Ú�â�ÞRì>âiܽì�í-ÝBçË×¼á�à»×¼ë�ÜiÛ�ܽà»×RÚ�â½âiß��WÖ}×�ëhþ�ý��rürü4�3���h� �Wý�� ú û¼üL�c�F>'ú û%�ý3����ü�þ�üa�c�rüQý3�êì7û¼ürýSþ�ü�ú��^�4� ��÷t���¼ü4��ú���ý��*1��Sú&�-ý3�C�3�J� ú;ÿ�� �c�M6%�Hýt�=�iü4�a�:üN¢aì'÷71'6&�v ¤ å ó Ú �»ì+ݧð»ð(|, 7ð+ñ�íËå¼ð�|�|~D�ï }»ñSò/ë)ì�Û�ì�ã.N��DÕ¹ì�ö»ì+Ý����=�%×!Û�Ù¼ìBÝ�ì��¿Ú�×:Û�Ü�á�Ý�à»í�Û�ÙËì�à'ã�ß!árÙ¼Ú�×D�»ì��æ¹ã�ÞËÜdÛ�ã�Ú�Û�ܽà»×!ÞRì�ÛUØ�ì>ì�×

à»â�ëÂÚ»×¼ëÂ×Ëì�Ø Üi×Ëí�à»ã��?Ú�Û�ܽà»×��1Ö}×3ëhþ�ý(�rü�ü4� ���h� �Åý��!ú û¼üèìq��ü<� �rú û1÷8%�ùO�­vSø�÷8%�ì�Xv�ø�ù+y.êP�­vSø�÷'1/ìK�Rÿ3���Ëý3�4���M� ý3�%ëZþ#���c�a� �c�iü<�Bý��Ðê��Sú£��4�!��ü:�7ÿ!� ú�üa�%��¢�ë1y.êE�â ; � ¤ å ó Ú �'ì>Ý;z5ð� TÝh}�å7ð�||'ñ��

ï } Ã�ò/ç�ã�ÜiäÎ��Ú»×¼ë�ì>ØhÚ»âiâHÚ�×¼ëi¹�à'Ú/öÎ�5ÙËà'Ù¼Ú �����¹à»×��&�?à»×¼à�Û�à'×ËÜ�á�Û�ì�� ó à'ã�Ú»â7ã�ì>Ú'Ý�à'×Ëܽ×��D�CÖ}×é%�Cé§Ú�Þ¼Þ¼Ú/ß»å�ÍÂ��ó\�\�µà���»ì�ã+åRÚ»×¼ë:ó\�Ræ%�cÕ�à'ÞËܽ׼Ý�à'×�å7ì>ë�ÜiÛ�à'ã�Ý>åRQ��3�C��rý>ý(ѯý���¥"ý���^����¿÷%þ ú&� ÐN�a�^�3�­v#�Rú�üa���J�s�:ü<�C�rüe�3�c��¥�ýR��^�Ðëhþ�ý���þ��3�������T��å5ö»à'âiçD�\ì*ÃRå ó Ú�»ì+Ý�Ã'ñ�|, hÃh|ÝD���y5í�à»ãrë�Þ¹×¼Üiö'ì�ãrÝ�ÜiÛUß�ë�ã�ì>Ý�Ý>å7ð(||híM�

ï }�í/òexPà'âdíæ�:Ú�×��%� ó à'ÙË×��F�%ãrë�Üi×RÚ�â¼á�à»×¼ë�ÜiÛ�ܽà»×RÚ�â�í�çË×Rá Û�ܽà»×RÝa��æ=ë�ß5×¼Ú�?ܽáZÛ�ÙËì�à'ã�ß\à�íCì ó ܽÝ��Û�ì��\Ü�áBÝ�Û�Ú�Û�ì+Ýa��Ö}×oxAܽâiâ½Ü�Ú �¸Ä1�C�µÚ»ã ó ì�ã©Ú�×¼ë69hã�ܽڻ×��5ä5ß:ã��¿Ý�åRì>ë�ÜiÛ�à'ã�Ý>å�%1� �T�<��ú��-ý �����êÆü4�a�L�#�-ý3��ÒÍá%üa�J�-ü^�-%"ûD�3�h�:ü<Òb�3�C�8�Rú£�Sú&�L�rú&�^�<��Òqvhv å ó Ú�»ì>ݵð�{híS 7ð+ñ Ã\�?"§âiç5Øhì�ãZæµá>Ú3�ë�ìa�?Ü�áeë�çËÞ¼âiÜ�Ý�Ù¼ì�ãrÝ�åHð�|�ÝÝD�

139

Page 144: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

ï }~Sò�îÅÚ�ã�ßh��æµ×Ë×Ëì�xAÜiâ½âiÜ�Ú �¿Ý��/�"ãrÚ�×RÝ��Æç�Û�Ú�Û�ܽà»×¼Ýµà�í�ä5×ËàSØ�â½ì>ëM�'ìÆÝ�ß�ÝUÛ�ìa�¿Ýa�§Ö}×+ëhþ�ý(��ürü4�3����h� �¯ý��Qú û¼ü%¦�ý3��þ ú�û+v<�¼ú}ü�þ#�C�Sú&�-ý3�c� �E%�ý3����ü�þ�üa�c�rü�ý3�?ëZþ#���c�a� �c�iü<�Pý��L6��7ý �=�iü4�a�:ü1µü£�7þ�ü<��ü<�RúX��ú��-ý ���3�C�81%ü4�!��ý3�\���h�6¢96:1 â ;UT ¤ å ó Ú�»ì>Ý*~¼ð�|! T~�}|ËåRð�||ÃD�

140

Page 145: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

µ µ �6��«µÎQÒÅÕ/VW�YX Ð�Ñ�Ñ%ò ¯���b�<�F���������#�<�'�ùÆ �L�/� ú�ý�úX� �T�7þ�ü<��ý�þR�»ü�þÆý3�²ÅªÓ

���b�q�<ç[Z Ö}×QÛ�Ù¼ì�í�à»â½âiàSØ�ܽ×��RåËØ�ìBç¼Ý�ìc�S\^]�Û�à ë�ì>×Ëà�Û�ì�Û�Ù¼ìBÝ�à'çËãrá�ìe�¿Ú3yBuwO�u^ôA\RÈ�ô_]!wRw#��9hßéµìar¼×ËÜiÛ�ܽà»×Ç}�å8O�uæô \ È�ô ] w�ܽÝ)Ú»âiØZÚ/ß�Ý"ë�ìar¼×Ëì>ë\Ú�×Rë�×Ëà»×��;ì�� ó ÛUß»å»Ý�à1� \^] Ü�Ý�Ú»âiØZÚ/ß�Ý"ë�ìar¼×Ëì+ëB�

xPìP×Ëì�ì+ë�Û�à1Ý�ÙËàSØ Û�Ù¼ÚSÛ�Æ Ü�Ý!á�à»×¼×Ëì>á�Û�ì>ëFÚ�×¼ë1Û�ã�Ú»×¼Ý�ÜiÛ�ܽö»ì��§Ø�ìorRã�Ý�Û ó ã�àSö»ìQÛ�ÙËìí�à»ã��\ì>ã��Ç�5ç ó¼ó à:Ý�ì�ô � ÈRô × Ë¢Å �2jÜ�Ý�ÚWÛ�à�ÛrÚ�â�à'ã�ë�ì>ã%à'×DM%å�Ý�à!ì>ÜdÛ�ÙËì�ãc� ×#� 0j}� ��× à»ã� ��× 0j'� ×#� �t�ZÙ5ç¼Ý�åHÞ5ß6éµìar¼×ËÜiÛ�ܽà»×�ñËåHì�ÜiÛ�ÙËì>ã�OCn1u^ô � ÈRô × w;ö}� à»ã�OCn7uæô × È�ô � w;ö � åã�ì+Ý ó ì>á�Û�ܽö»ì�â½ß�*9hß�é%ì<r¼×¼ÜdÛ�Üià'×oüå�ì�ÜiÛ�Ù¼ì�ã*ô × Æ[ô � à'ã*ô � Æ¢ô × ��¹àSØ1Ø�ìµÝ�ÙËàSØÂÛ�Ù¼Ú�Û�Æ�ܽÝ�Û�ãrÚ�×¼Ý�ÜiÛ�ܽö»ì�HØ�ÜiãrÝ�Û>å:Ø�ì*�¿Ú�ä'ì�Û�Ù¼ì�í�à»â½âiàSØ�ܽ×���à'Þ¼Ý�ì>ã�öSÚ�Û�ܽà»×¼Ýí�à»ã¹Ú�ã�ÞËÜiÛ�ãrÚ�ã�ßÇô`\�ÈRô�]eËoÅ Ú�×Rë%K�L�M��ð�=ô \ Æèô ] Ü�¥:� ]a\ k?� \^] å'Þ5ß?Ú�ÝUÛ�ã�Ú»Ü��'Ù:Û�í�à»ã�ØZÚ�ãrëBÚ ó¼ó âiÜ�á�Ú�Û�ܽà»×¿à»í�é%ì<rR×ËÜdÛ�Üià'×¼ÝNÃBÚ�×¼ëñ?Ý�ÙËàSØAÛ�ټܽÝ��}M�=ô \ � [3bFc ô ] å¹Ý�ܽ׼á�ìÅÞ5ß�é%ì<r¼×ËÜiÛ�ܽà»×Á}�ì�ÜiÛ�ÙËì>ã�� \d] ö � Ú»×¼ëCåZÛ�Ù¼ì�ã�ì�í�à»ã�ì»åhí�çËâ½â½ßá�à»×Ë×Ëì+á Û�ì>ëCåRà»ã;ô \ � [3b^c ô ] åRÜiׯØ�ÙËÜ�árÙÅá�Ú'Ý�ì§Û�ÙËì�ã�ì>Ý�çËâiÛ��Bç¼Ý�ÛµÞ7ì�Û�ã�çËì/�'Üiö'ì�×:� \d]Ü�Ý�á�à'×Ë×Ëì>á�Û�ì+ëB�

ñ��ZÖ�íF� \d] ö � Û�Ù¼ì�×�üÌ�%Ë!K�7\ô ] � [ ô \ ��Ö�í�Û�Ù¼ì�ã�ì�ØhÚ'ݹÚ�×D�Se�Ý�çRárÙQÛ�Ù¼Ú�ÛµÛ�ټܽݵØ�ì>ã�ìí-Ú»â½Ý�ì»åCÛ�ÙËì�×/�Se�Øhà»ç¼â½ëPÞRì¿Ü½×�O�uæô \ È�ô ] w å�Ú�×¼ë�Ý�Üi×Rá�ì�� ܽÝt�?ܽ×ËÜ��?Ú»â�Ø�ã�Û�j§åÍ� \d]Øhà»çËâ�ëW×Ëà�Û�Þ7ìe�?Ú y�uwO�uæô \ È�ô ] w�w�åRÚ?á�à'×:Û�ãrÚ»ë�Ü�á Û�Üià'×��

ÃD�ZÖ�íÐ� \d] ö � ]a\ Û�ÙËì�×/� \^] ö � ��Ö�í�×Ëà�Û+å�Û�ÙËì�×jô \ � [3bFc ô ] Þ:ß!éµìar¼×ËÜiÛ�ܽà»×¢}!Ú�×RëCåÝ�Üi×Rá�ìm� \d] ö3� ]a\ å�ô ] � [3b^c ô \ å¼Ú?á�à»×:Û�ãrÚ»ë�Ü�á Û�Üià'×WÝ�Üi×Rá�ì»� [3bFc Ü�Ý�á�à'×Ë×Ëì>á�Û�ì+ëB�íM�ZÖ�í��4\^]/j��,f�g�å�Û�Ù¼ì�×[ô�]�� [ihaj ô`\R�ÂÖ�í©×Ëà�Û+å�Û�Ù¼ì�×>�Sf/gÇË}O�u^ô`\�ÈRô�](w\Ú»×¼ë�S\^]>0ö�¿Ú3yBuwO�u^ô`\�ÈRô�]3w�w åRÚ\á�à»×:Û�ãrÚ»ëËܽá�Û�ܽà»×���¹àSØ�å"Ý�ç óËó à'Ý�ìÇô � È�ô × È�ô { Ë[Å åFô � ÆÉô × å�Ú»×¼ë�ô × Æpô { �Çx�ì ×Ëì�ì+ëPÛ�à¯Ý�Ù¼àSØÛ�Ù¼Ú�Û�ô � Æ ô { �/9hß�Û�Ù¼ì�r¼ãrÝ�ÛÎ�%Þ¼Ý�ì�ã�öSÚSÛ�Üià'×=ð»åa� ×#� k�� ��× Ú»×¼ë�� {�× k�� ×\{ å�Ú»×¼ë ÜiÛÝ�çlk á�ì>Ý%Û�à!ì>Ý�Û�Ú»ÞËâiÜ�Ý�ÙÅÛ�ÙRÚSÛm� {#� k}� ��{ ��Ö�íl� {#� ö �; \Û�ÙËì�×PØ�ì� ã�ì\ëËà»×Ëì���æµÝ�Ý�ç��?ì\×Ëà�Û(��ZÙËì�×�ô { � [3m3n ô � Þ:ß�éµìar¼×ËÜiÛ�ܽà»×¦}M�ª ¬,���|�HZN� ×#� ö � ��× �[�ZÙËì�×>� ×4� ö � u&�%Þ¼Ý�ì�ã�öSÚSÛ�Üià'×¢ÃhwÆØ�ÙËÜ�árÙLÜ�� ó â½Üiì+Ý�ü��jËK�7Fô � � [ ô × �ÀÖ}× ó Ú�ã�Û�Ü�á�çËâ�Ú�ã+å�ô � � [ m3n ô × å�Ý�à¦ô { � [ mon ô × Ø�ÙËÜ�árÙ Ü�� ó â½Ü½ì>Ý©� ��{ ËO�uæô { ÈRô × w�Þ5ß�éµìar¼×ËÜiÛ�ܽà»×o}M���ZÙ5ç¼Ý>å,� {#� k>� {�× åËÝ�à»� {#� k>� ×\{ �7��Üi×¼á�ìm� {#� 0ö'� Ú»×¼ë�� ܽÝÂ�?Üi×ËÜ��¿Ú�â)Ø�ã�Û©j§åÌ� ×h{ 0ö � �/Í�à»×¼Ý�ì�q:çËì>×:Û�â½ß»åCô × � [3p m ô { åCÚ�×¼ëPÝ�ܽ׼á�ì�ô � � [3p m ô × åô � � [3p m ô { åhÝ�à/� ×h{ ËÃO�uæô � È�ô { w<�ó�ZÙËì>ã�ì�í�à»ã�ì»å�� ×\{ k � ��{ åZÚ�×¼ë1Þ5ß Û�ã�Ú»×¼Ý�ÜdÛ�Üiö5ÜiÛUß»å� {#� k>� ��{ �ª ¬,���NIqZÐ� {R× ö�� ×\{ �N�ZÙ¼ì ó ã�à5à»í�Û�Ù¼ÚSÛ5� {#� kJ� ��{ Ü�ݵÚ�â��?à'Ý�Û�ܽë�ì>×:Û�Ü�á�Ú�â�Û�à?Û�Ù¼ìer¼ãrÝUÛá�Ú»Ý�ì»åËÝ�Ø�ÜiÛ�árÙËܽ×��i� {R× Ø�ÜiÛ�Ù:� ×4� Ú»×¼ë2� ×\{ Ø�ÜiÛ�Ù:� ��× �ª ¬,���?pqZ2� ×4� jÊ� ��× Ú�×Rë/� {R× j�� ×\{ �ox�ì ó ã�àSö»ì?Û�ÙËì ã�ì>Ý�çËâiÛBÞ5ß�á�à»×:Û�ãrÚ»ëËܽá�Û�ܽà»×���5ç óËó à'Ý�ìc� ��{ j?� {4� �1�ZÙËì�×:� {4� 0ö'�� �Ú�×¼ë�ô { � [3m3n ô � Þ:ß�éµìar¼×ËÜiÛ�ܽà»×¦}M�Ø�ܽãrÝUÛ%Ý�ç óËó à'Ý�ì�� ×#� ö �� ���ZÙËì>×�å7Þ:ß��%Þ¼Ý�ì�ã�öSÚSÛ�Üià'×Qñ¼å�üÌ��Ë!K�7cô � � [ ô × Ú»×¼ëCå7Üi×ó Ú�ã�Û�Ü�á�çËâ�Ú�ã+åcô � � [imon ô × Ú»×¼ëoô � � [ p m ô × �k�µà�Û�ì�Û�ÙRÚSÛm� ×\{ 0ö}�� »åCÝ�à�ô × � [ p m ô { Þ5ß

141

Page 146: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

éµìar¼×ËÜiÛ�ܽà»×�}M�N�ZÙ:çRÝ�å�ô { � [imon ô × Ú»×¼ë�ô � � [ p m ô { åËÜ�� ó âiß5ܽ×��?Û�ÙRÚSÛ1� {#� ËDO�uæô { ÈRô × wÚ�×¼ë!� ×\{ Ë�O�u^ô � È�ô { w#����à¼å�Þ5ßD� {4� k|� {�× j|� ×h{ k}� ��{ å�Ú!á�à'×:Û�ãrÚ»ë�Ü�á Û�Üià'×��%�5ç óËó à'Ý�ìÜi×¼Ý�Û�ì+Ú»ë2� ×4� 0ö3� �1�ZÙËì>ײô × � [ip n ô � ��x�ì�á�à'×¼Ý�ܽë�ì>ãZÛUØ�à¿á>Ú»Ý�ì>Ý��ª ¬,���Dp<¬Z�� ��× k � {�× �k�ZÙËì>×!� ×#� j'� ×h{ åCÝ�àWÞ5ߦ�%ÞRÝ�ì>ã�öSÚSÛ�Üià'×�í�åCô � � [ p m ô × �e�5Üi×Rá�ìô × � [ p m ô { Þ5ßè�%Þ¼Ý�ì�ã�öSÚSÛ�ܽà»×[}�å1ô � � [ p m ô { å�Ý�à�� ×\{ Ë O�u^ô � È�ô { w#�è�ZÙ5ç¼Ý>å� ×\{ k3� ��{ Ú�×¼ë!Þ5ß�Û�ã�Ú»×¼Ý�ÜdÛ�Üiö5ÜiÛUß�à�í�k�å<� ×#� j>� {#� �l9hßo�%Þ¼Ý�ì�ã�öSÚSÛ�Üià'×�íËåDô � � [imonô × �7��àÇô { � [ mon ô × Þ5ß Û�ã�Ú»×¼Ý�ÜiÛ�ܽö5ÜdÛUß�à�ía� [ mon �1�ZÙËì�ã�ì�í�à'ã�ì'å­� {#� Ë:O�uæô { ÈRô × w åRÝ�à� {#� k?� {�× �.9hç�Û�Û�Ù¼ì�×:� {4� j?� ×h{ k>� ��{ å¼á�à'×'Û�ã�Ú'ë�Ü�á Û�ܽ×��\à»çËã¹Ú'Ý�Ý�ç�� ó Û�Üià'×��ª ¬,���Dpsr�Z1� {R× j�� ��× �H�ZÙËì�×�� ��× 0öJ�� µÚ�×Rë�ô × � [an p ô { Þ5ß��%Þ¼Ý�ì�ã�öSÚSÛ�Üià'×�í���� ��× 0öJ�� Ü�� ó âiܽì>ÝHô � � [an p ô × Þ5ßcé%ì<r¼×ËÜiÛ�ܽà»×Î}Ëå'Ý�àÂô � � [�n p ô { �H�ZÙ5ç¼Ý>å8� ��× ËDO�uæô � È�ô { w�åÜ�� ó âiß5ܽ×�� Û�Ù¼ÚSÛ:� ��× ks� ��{ � �ZÙ:çRÝ�å%Þ:ß�Û�ã�Ú»×¼Ý�ÜdÛ�Üiö5ÜiÛUß1à»í�k�å5� ×#� j � {4� Ú�×RëCåÞ5ßI�%Þ¼Ý�ì>ã�öSÚ�Û�ܽà»×Ií�å�ô � � [ mon ô × �D9hßPÛ�ãrÚ�×¼Ý�ÜdÛ�Üiö5ÜiÛUßPà»ím� [ m3n åHô { � [ m3n ô × å�Ý�à� {#� ËÃO�u^ô { È�ô × w?Ø�ÙËÜ�árÙ1Ü�� ó â½Üiì+Ý�� {4� k � {R× �39hç�Û¿Û�Ù¼ì�×3� {4� j � ��× k � ��{ åá�à»×:Û�ãrÚ»ë�Ü�á Û�Üi×D�Æà'çËã¹Ú»Ý�Ý�çD� ó Û�ܽà»×��

���b�<�F���������#�<�>I� Ó%uxO � & (.O × w<u^ô � ÈRô × w1ö�O � uæô � È�ô × wq`2O × uæô � È�ô × w� Ó%uxO � & (.O × wznbuæô � È�ô × w7ö�?Ú yCuxO � � u^ô � È�ô × w#È�O × � uæô � ÈRô × wRw� �;�?Ú yCuxO � � u^ô × È�ô � w#È�O × � uæô × ÈRô � wRw�j�?Ú yCuxO � � u^ô � È�ô × w#È�O × � uæô � ÈRô × wRw4Òb� �c��; QýSú�ûËü�þz���L��ü�Ó� Ó5v^�*Æ � Ò�Æ × ÒH� �c��Æp�Sþ�ü%ú�ûËü§ýSþ��»ü�þ#���h� �����c� �D��ü4����ÿ�O � � Ò�O × � Ò��3�C�%uxO � & (FO × wzn�Òþ�ü<�&�Ëü4��ú����Süa�zÿ!Ò�ú�ûËüa�ô � �[ô × � ¤ô � � � ô × � �c�/�?Ú yCuxO × � u^ô � È�ô × w#È�O ×;� uæô × ÈRô � wRw�k>O �z� u^ô � ÈRô × wBý�þô � � × ô × � �c�/�?Ú yCuxO ��� u^ô � È�ô × w#È�O �z� uæô × ÈRô � wRw�k>O ×;� u^ô � ÈRô × waÓ

���b�q�<ç[Zð�;�5ç óËó à:Ý�ì�ô � ÈRô × Ë Å å§Ú�×¼ë O � Ú»×¼ë�O × Ú�ã�ìÀܽ׼ë�çRá�ì>ë Þ5ß=Ý�ì�Û�Ý!à�í\Ý�à'çËã�á�ì>ÝK � È\K × ËDMµåCã�ì+Ý ó ì>á�Û�ܽö»ì>âiß��Â�5ç óËó à'Ý�ì��ÇËèuwO � &A(.O × w<uæô � È�ô × w#�t�ZÙËì�×�å7Þ5ß6éµìar��×ËÜiÛ�ܽà»×RÝ;}ÆÚ»×¼ëoí�å

� Ë øTu^Å È;��wbË6K � `2K × �ô � �¢ô × úa`�øt� úö øTu^Å È;��wbË6K � ��ô � �[ô × úa`¦øb�� úa`

øTu^Å È;��wbË6K × ��ô � �[ô × úa`¦øb�� úö O � uæô � È�ô × wq`%O × uæô � ÈRô × w

142

Page 147: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

�µàSؾÝ�ç óËó à'Ý�ìC��Ë+O � uæô � È�ô × w�`DO × u^ô � È�ô × w#���ZÙ¼ì�×�åCÚ �:Ú�Ü½×¯Ú óËó â½ß5Üi×D��éµìar¼×ËÜ��Û�Üià'×¼Ý;}\Ú�×¼ëoí�å� Ë u�øhu�Å È;��wNË:K � � ô � �¢ô × úa`�øb�� 3ú!wq`

u�øhu�Å È;��wNË:K × � ô � �¢ô × úa`�øb�� 3ú!wö øTu^Å È;��wbË6K � `2K × �ô � �¢ô × úa`�øt� úö uxO � & (.O × wauæô � È�ô × w

}M��9hß�é%ì<r¼×¼ÜdÛ�Üià'×Qñ¼åuxO � & (.O × wznbuæô � È�ô × w7ö�?Ú yCuRuwO � & (.O × w<u^ô � ÈRô × w�wÜiíH�?Ú y�u�uwO � & (.O × w<u^ô × ÈRô � w�wlj�?Ú y�u�uwO � & (.O × w<u^ô � ÈRô × w�w�å¼Ú�×Rë�; �à»Û�ÙËì>ã�Ø�Ü�Ý�ì��ZÙ5ç¼Ý>å�ÜdÛµÝ�çlk á�ì+ÝhÛ�à Ý�ÙËàSØAÛ�Ù¼ÚSÛ

�¿Ú3yBuwO � � uæô × ÈRô � w#ÈzO × � uæô × È�ô � w�wj �¿Ú3y�uwO � � uæô × ÈRô � w<È�O × � u^ô × È�ô � w�wÜ�¥

�¿Ú3yBu�uxO � & (.O × wauæô × È�ô � wRwj �¿Ú3y�u�uxO � & (.O × wauæô � È�ô × wRwö �¿Ú3y�uwO � � uæô � ÈRô × w<È�O × � u^ô � È�ô × w�w

�µàSØ�å�Þ:ß Û�ÙËìtr¼ãrÝ�Û ó Ú�ã�Û¹à�í"Û�ÙËÜ�Ý ó ã�à ó à'Ý�ÜdÛ�Üià'×�åuxO � & (.O × wauæô � È�ô × w ö O � u^ô � ÈRô × wÌ`6O × u^ô � È�ô × w

Ý�à�¿Ú3yBu�uxO � & (.O × wauæô � È�ô × wRwö �¿Ú3yBuwO � uæô � È�ô × w�`6O × uæô � ÈRô × wRwö �¿Ú3yBuæ�¿Ú yBuxO � u^ô � È�ô × w�w#ÈR�¿Ú3y�uwO × uæô � È�ô × wRw�w

Ú»×¼ëCåËÝ�Ü��?ܽâ½Ú»ã�â½ß»å�¿Ú3yBu�uxO � & (.O × wauæô × È�ô � wRwö �¿Ú3yBuæ�¿Ú yBuxO � u^ô × È�ô � w�w#ÈR�¿Ú3y�uwO × uæô × È�ô � wRw�w

uot�ötw*��ç óËó à'Ý�ì�¿Ú3yBu�uxO � & (.O × wauæô × È�ô � wRwj �¿Ú3y�u�uxO � & (.O × wauæô � È�ô × wRwö �¿Ú3y�uwO � � uæô � ÈRô × w<È�O × � u^ô � È�ô × w�w

143

Page 148: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

�ZÙËì>×�¿Ú3yBu�uxO � & (.O × wauæô × È�ô � wRwö �¿Ú3yBuæ�¿Ú yBuxO � u^ô × È�ô � w�w#ÈR�¿Ú3y�uwO × uæô × È�ô � wRw�wj �¿Ú3yBuwO �z� uæô � ÈRô × w<È�O ×;� u^ô � È�ô × w�w

Ý�à?ÜdÛ¹Ü�ÝZì�×Ëà'ç��»ÙWÛ�à¿Ý�ÙËàSØ�¿Ú3yBuwO � � uæô × ÈRô � w#ÈzO × � uæô × È�ô � w�wk �¿Ú3yBuæ�¿Ú yBuxO � u^ô × È�ô � w�w#ÈR�¿Ú3y�uwO × uæô × È�ô � wRw�w

��à ë�à Ý�à¼åhØ�ìÅà'×Ëâ½ßL×Ëì>ì>ëÂÛ�àLÝ�ÙËàSØ Û�Ù¼Ú�Û2O ��� uæô × È�ô � wNk �¿Ú yCuxO � uæô × È�ô � w�wÚ»×¼ë�O ×;� u^ô × ÈRô � w�k �?Ú yCuxO × u^ô × È�ô � w�w<� �ZÙËì+Ý�ìPí�à»â½âiàSØ¥Ü����?ì>ë�Ü�ÚSÛ�ì�â½ßAí�ã�à��é%ì<r¼×¼ÜdÛ�Üià'×Qñ?Ú»×¼ë Û�ÙËì§í-Ú»á�Û�Û�Ù¼Ú�Û5�� �ܽÝ=�?ܽ×ËÜ��¿Ú»â�Ø�ã�Ûcjt�uXöqu¢w*��ç óËó à'Ý�ì

�?Ú y�uwO �z� u^ô × È�ô � w#ÈzO × � uæô × È�ô � wRwj �¿Ú3yBuwO �z� u^ô � ÈRô × w#ÈzO × � uæô � È�ô × wRwz7æ%Ý�Ý�ç��?ì»å�Ø�ÜiÛ�Ù¼à»ç�ÛÀâ½à'Ý�ݯà»í��'ì�×Ëì>ã�Ú»âiÜiÛUß»å©Û�Ù¼ÚSÛ�O × � uæô � È�ô × w3k O � � uæô � È�ô × w<��ZÙËì>×�åZO ��� uæô × È�ô � wlj>O ��� uæô � È�ô × w�Ú�×¼ë¿Ý�ܽ׼á�ì1O �z� uæô � ÈRô × w10ö3�; 'å'Þ5ß©éµìar¼×ËÜ��Û�Üià'×�ñe�¿Ú3y�uwO � uæô � È�ô × wRw7ö'O �z� uæô � ÈRô × w#�7�%Þ¼Ý�ì�ã�ö»ì%Ú�â�Ý�à�Û�Ù¼ÚSÛÐO × � uæô × È�ô � wÐjO �z� u^ô � ÈRô × w#�x�ì�×ËàSØ Ý�ÙËàSØAÛ�Ù¼ÚSÛ;�¿Ú yBuxO × u^ô � È�ô × w�wlkJO ��� uæô � È�ô × w<�7�5ç óËó à'Ý�ì�¿Ú3yBuwO × uæô × È�ô � wRwlj¢�¿Ú yBuxO × u^ô � È�ô × w�w#�N�ZÙËì�×QÞ5ß�é%ì<r¼×ËÜiÛ�ܽà»×¯ñ¼å

�¿Ú3yBuwO × uæô � È�ô × wRwö O ×;� u^ô � ÈRô × wk O �z� u^ô � ÈRô × w

�%×WÛ�ÙËì�à»Û�ÙËì>ã�Ù¼Ú�×RëCåËÜií��¿Ú3yBuwO × uæô × È�ô � wRw1öþ�¿Ú yBuxO × u^ô � È�ô × w�whÛ�ÙËì�×�¿Ú3yBuwO × uæô � È�ô × wRw¦ös� j O � � uæô � ÈRô × w<�ýØ�ܽ׼ڻâiâ½ß»å�Üií/�¿Ú3y�uwO × uæô � ÈRô × wRw�j�¿Ú3yBuwO × uæô × È�ô � wRw�Û�ÙËì>×!í�ã�à��Héµìar¼×ËÜiÛ�ܽà»×Åñ?Ú�×Rë Û�ÙËì�à»Þ¼Ý�ì�ã�öSÚSÛ�Üià'×WÚ�Þ7àSö»ì'å�¿Ú3yBuwO × uæô � È�ô × wRwj �¿Ú3y�uxO × uæô × È�ô � w�wö O ×;� u^ô × ÈRô � wj O �z� u^ô � ÈRô × w

�ZÙËì>ã�ì�í�à»ã�ì»å�¿Ú3yBu�uxO � & (.O × wauæô � È�ô × wRwö �¿Ú3yBuæ�¿Ú yBuxO � u^ô � È�ô × w�w#ÈR�¿Ú3y�uwO × uæô � È�ô × wRw�wö �¿Ú3yBuwO �z� uæô � ÈRô × w<È��¿Ú3y�uwO × uæô � È�ô × wRw�wö O � � u^ô � È�ô × wz7

144

Page 149: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

�µà�Û�ì©Û�Ù¼ÚSÛ¹Øhì§Ù¼Ú/ö»ì�Ú�â�Ý�à\Ý�ÙËàSØ�×WÛ�Ù¼Ú�Û;�?Ú yCuxO × u^ô × ÈRô � w�w�j?O ��� uæô � È�ô × w#��µàSØ�å'Ý�Üi×¼á�ì1O �z� u^ô × È�ô � wlj?O �z� uæô � ÈRô × w�Ú»×¼ëCå5á�à'×¼Ý�ì(q:çËì�×:Û�â½ß»å­O ��� uæô � È�ô × w�0ö�� »åCÞ5ß6é%ì<r¼×ËÜiÛ�ܽà»×�ñ²�?Ú yCuxO � u^ô × È�ô � w�w�jÁ�¿Ú3yBuwO � uæô � È�ô × wRw�öÃO ��� uæô � È�ô × w<�ë�ç�Û�Û�Üi×��ÀÛ�ÙËÜ�ÝÆÛ�à��»ì�Û�ÙËì�ãÆØ�ÜdÛ�Ù Û�ÙËìWã�ì>Ý�çËâiÛ�ÝBí�ã�à�� Û�ÙËì ó ã�ì�ö5Üià'ç¼Ý ó Ú�ãrÚ �'ã�Ú ó Ù�å�Ø�ìÙ¼Ú/ö'ìµÛ�Ù¼ÚSÛ

�¿Ú3yBu�uxO � & (.O × wauæô × È�ô � wRwö �¿Ú3yBuæ�¿Ú yBuxO � u^ô × È�ô � w�w#ÈR�¿Ú3y�uwO × uæô × È�ô � wRw�wj O � � u^ô � È�ô × wö �¿Ú3yBu�uxO � & (.O × wauæô � È�ô × wRw

Ú»×¼ëCåM�»Ü½ö»ì>×Wà'çËã�ì>Ú»ã�â½Ü½ì�ã�Ú»Ý�Ý�çD� ó Û�ܽà»×�å�?Ú y�u�uwO � & (.O × w<u^ô � ÈRô × w�wö �¿Ú3yBuwO �z� u^ô � ÈRô × w#ÈzO × � uæô � È�ô × wRwz7

ñ��=x�ì�ÝUÛrÚ�ã�Û�Þ:ß ó ã�àSö5ܽ×��?Ú�×QÚ�çMy�ܽâiÜ�Ú�ã�ß¿âiì����?ÚD�õa��è/è�¬�� øb���Süa��Å �3�C�©M �3�Ç����ý �/ü<Ò���ý�þ²�3�Rÿ¯ú#��ý��Ëü4�3�s�»þ�ürü4���rüa�J�-ü���� úX��ú�ü<�O � Ò*O × Òb�3�C��� �¼ÿ¿úw��ý»��ýSþ#���!�;ô � ÈRô × Ë�ÅÉÒ

O � u^ô � È�ô × w � O × uæô × È�ô � w1öùøt�; ú87���t�Ì��ç[Zs9�ßCéµìar¼×ËÜiÛ�ܽà»×�}�å��; %Ë6O � uæô � ÈRô × w�Ú»×¼ë��� eËDO × u^ô × ÈRô � w å:Ý�à�øb�� ú#vO � u^ô � ÈRô × w � O × u^ô × È�ô � w#��5ç óËó à:Ý�ìi��ö¸u^Å È;� [ w�ËJO � uæô � È�ô × w � O × uæô × È�ô � w%í�à»ã�Ý�à�?ìi��Ë�M��²�ZÙËì�×��Ë3O � uæô � È�ô × w�Ú�×¼ë���Ë'O × u^ô × ÈRô � w#�Àæ%Ý�Ý�ç��?ì���0ö��; ��¦�ZÙËì�×"å�Þ5ß�éµìar¼×ËÜ��Û�Üià'×j}Ëå8ô � � [ ô × Ú»×¼ë�ô � � [ ô × å)Ú!á�à»×:Û�ãrÚ»ëËܽá�Û�ܽà»×��Î�ZÙËì>ã�ì�í�à»ã�ì»å��²ö��� 'å�Ý�àO � u^ô � ÈRô × w � O × u^ô × È�ô � w�vÙøt�� ú��ª �<�b��«�«#¬­�b®I̯�� ø=���/üa�eÅpÒ�M;Ò<O � Ò,O × ÒTô � Ò��3�C�Nô × �3�b��rý3�Sü<ÒT�¿Ú yBuxO � u^ô � È�ô × w�w7ö�¿Ú3yBuwO × uæô × È�ô � wRwe�����c�J�-ü<�;�¿Ú3y�uwO � uæô � È�ô × wRw1ö3�� TÓ���t�Ì��ç[Z �5ç óËó à'Ý�ìe�¿Ú3yBuwO � u^ô � ÈRô × w�w1öß�¿Ú3y�uwO × uæô × È�ô � wRw#�7�ZÙËì>×�å¼Ý�Üi×Rá�ì�¿Ú3yBuwO � uæô � È�ô × wRwbËDO � uæô � È�ô × wZÚ»×¼ë��¿Ú3yBuwO × u^ô × ÈRô � w�wbËDO × uæô × ÈRô � w å�¿Ú3yBuwO � uæô � È�ô × wRwbËDO � uæô � È�ô × w � O × uæô × ÈRô � w<�.9�ß�Û�ÙËì�Ú»ÞRàSö'ì%ã�ì>Ý�çËâiÛ>å�¿Ú3yBuwO � uæô � È�ô × wRw1ö3�� ��x�ì ó ã�à5á�ì�ì+ë Û�à ó ã�àSö'ìµÛ�ÙËì ó ã�à ó à'Ý�ÜdÛ�Üià'×��uot�ötwB��ç óËó à'Ý�ì1ô � � � ô × Ú�×¼ëk�¿Ú3yBuwO ×;� uæô � ÈRô × w<È�O ×;� u^ô × È�ô � w�w�kJO ��� uæô � È�ô × w<��ZÙËì>×�åBÞ5ß éµìar¼×ËÜiÛ�ܽà»×ýÃRå�O � � u^ô � È�ô × w|0öQ� Ú�×¼ë O � � uæô × ÈRô � wvö � å\Ý�à

145

Page 150: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

O �z� u^ô × ÈRô � wlj?O �z� uæô � ÈRô × w<��æ¹â�Ý�àRå<O × � uæô � È�ô × w�kJO �z� u^ô � È�ô × wÚ»×¼ë/O × � uæô × È�ô � w»kHO ��� uæô � È�ô × w<���5ç óËó à:Ý�ì6O × � uæô × È�ô � w�öGO ��� uæô � È�ô × w<��ZÙËì>×�å�O × � uæô × ÈRô � w�0ö � Ú�×RëCå�Û�Ù:çRÝ�å*O × � uæô � È�ô × weö�� ��ØËçËã�Û�Ù¼ì�ã��\à'ã�ì'å�Þ5ßé%ì<r¼×¼ÜdÛ�Üià'×Qñ¼å<O × � uæô × È�ô � w1öþ�¿Ú3yBuwO × uæô × È�ô � wRwZÝ�à

�¿Ú3yBuwO � uæô � È�ô × wRwö O �z� u^ô � ÈRô × wö O ×;� u^ô × ÈRô � wö �¿Ú3y�uxO × uæô × È�ô � w�w

Ú»×¼ëCå�Þ5ß[Í�à'ã�à'âiâ�Ú�ã�ßI}M�½ð»åN�?Ú y�uwO � u^ô � ÈRô × w�w²ö � åhÚ�á�à'×:Û�ãrÚ»ë�Ü�á Û�Üià'×��þÍ�à'×¼Ý�ì<�q:çËì>×'Û�âiß'å<O × � uæô × È�ô � wlj?O � � u^ô � ÈRô × w å¼Ý�à

�?Ú y�uwO �z� u^ô × È�ô � w#ÈzO × � uæô × È�ô � wRwj O ��� uæô � È�ô × wö �¿Ú3yBuwO � � u^ô � ÈRô × w#ÈzO × � uæô � È�ô × wRwz7

9hß�é%ì<r¼×¼ÜdÛ�Üià'×Qñ?Ú»×¼ë�ë�ã�à ó à'Ý�ÜdÛ�Üià'×�}�åuwO � & (.O × w n uæô � ÈRô × w 0ö �� ��3�c�uwO � & (.O × wznNuæô × ÈRô � w ö �

Ý�à¼å�Þ5ß�é%ì<r¼×¼ÜdÛ�Üià'×�ÃRåMô � �èô × ��5Ü��?Üiâ�Ú�ã�â½ß»å�ÜiíMô � � × ô × Ú�×¼ëk�¿Ú3y�uxO ��� uæô � È�ô × w<È�O �z� u^ô × ÈRô � w�wlk>O × � uæô � È�ô × w åÛ�ÙËì�×�ô � �èô × �uXöqu¢w*��ç óËó à'Ý�ìkô � �[ô × �.9hß�é%ì<rR×ËÜdÛ�Üià'×�üå

uwO � & (.O × w n uæô � ÈRô × w 0ö �� ��3�c�uwO � & (.O × wznNuæô × ÈRô � w ö � È

Þ5ß�é%ì<r¼×¼ÜdÛ�Üià'×Qñ�¿Ú3y�u�uxO � & (.O × wauæô × ÈRô � w�wj �¿Ú3yBu�uxO � & (.O × wauæô � È�ô × wRw#È

Ú»×¼ë�Þ5ß�ë�ã�à ó à:Ý�ÜiÛ�ܽà»×�}�¿Ú yBu^�¿Ú3y�uwO � uæô × ÈRô � w�w<È��¿Ú3y�uxO × uæô × È�ô � w�w�wö �?Ú yCu^�¿Ú3y�uwO � uæô × ÈRô � wRwÌ`²�?Ú yCuxO × u^ô × È�ô � w�wRwj �?Ú yCuxO � u^ô � È�ô × wÌ`6O × uæô � È�ô × w�wö �?Ú yCu^�¿Ú3y�uwO � uæô � ÈRô × wRw#È��¿Ú yBuxO × u^ô � È�ô × w�w�w

146

Page 151: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

æ%Ý�Ý�ç��?ìt�?Ú y�uwO × u^ô � ÈRô × w�wlk[�¿Ú3yBuwO � uæô � ÈRô × wRw#�1�ZÙËì>×�¿Ú3y�uwO � uæô × ÈRô � w�w�j �¿Ú3y�uwO � uæô � È�ô × wRwe�3�C��¿Ú3y�uwO × uæô × ÈRô � w�w�j �¿Ú3y�uwO � uæô � È�ô × wRwz7

é%ì<r¼×¼ÜdÛ�Üià'×Qñ��'Üiö'ì>Ýhç¼ÝZÛ�Ù¼ÚSÛO � � uæô � È�ô × w ö �¿Ú3y�uwO � uæô � È�ô × wRwe�3�C�O � � uæô × È�ô � w ö �

�ZÙ5ç¼Ý>å�Þ5ß�éµìar¼×ËÜiÛ�ܽà»×oÃ�ô � � � ô × ��æµâ½Ý�à¼å5í�ã�à��éµìar¼×ËÜiÛ�ܽà»×¯ñ\Ø�ì§ä5×ËàSØO ×;� u^ô � ÈRô × w!k �¿Ú3yBuwO × uæô � È�ô × wRw Ú»×¼ë3O × � uæô × È�ô � w!k �?Ú y�uwO × u^ô × ÈRô � w�w<��ZÙËì>ã�ì�í�à»ã�ì»åT�»Ü½ö»ì>×�Û�Ù¼ÚSÛ¹à'çËã�à»ã�Ü��»Ü½×¼Ú�âCÚ»Ý�Ý�ç�� ó Û�ܽà»×QÚ�×¼ë�Û�Ù¼ì§í-Ú»á Û�Û�Ù¼ÚSÛ�¿Ú3yBuwO × uæô × È�ô � wRwlj¢�¿Ú yBuxO � u^ô � È�ô × w�w åRØ�ì§Ù¼Ú/ö'ì

�¿Ú3yBuwO ×;� uæô � ÈRô × w#ÈzO × � uæô × È�ô � w�wk �¿Ú3yBuæ�¿Ú yBuxO × u^ô � È�ô × w�w#ÈR�¿Ú3y�uwO × uæô × È�ô � wRw�wk �¿Ú3yBuwO � uæô � È�ô × wRwö O � � u^ô � È�ô × w

�5Ü��?Üiâ�Ú�ã�â½ß»å�Üií��¿Ú3yBuwO � uæô � È�ô × wRwlk¢�¿Ú yBuxO × u^ô � È�ô × w�w�å�Û�ÙËì>×�ô � � × ô × Ú»×¼ë�¿Ú3y�uwO �z� uæô � ÈRô × w<È�O �z� u^ô × È�ô � w�wlkJO × � uæô � È�ô × w 7

���b�<�F���������#�<�?p|¥"ü�ú1O � �3�C�:O × ��ü%�¼ü4�3�s�»þ�ürü4�I�rü<�J�-ü��²� úX��ú�ü<�4Ò��3�c���½ü�ú/Æ � � �c��Æ ×��üWú�ûËüQýSþ��»ü�þ#���h� �Î���c�3�D�rü4����ÿ6O ��� �3�C�%O ×;� Ò§þ�ü<�&�Ëü4��ú����Sü<�dÿÓN¦���þ ú û¼ü�þ4Òt�iü�úÂÆ �rüWú û¼üýSþR�'ü�þ#���T�����c� �D��ü4����ÿ²uwO � &E(FO × w n Ó�væ��O ×�£ O � Òhú û¼ü<��ô � � × ô × �����c�J�-ü<�Âô � �Ùô ×��ýSþ�� ����ô � ÈRô × Ë�Å Ó

���b�q�<ç[Z �5ç ó¼ó à:Ý�ì%O � £ O × å�ô � È�ô × ËþÅ å�Ú�×¼ë�ô � Æ × ô × �29hß�éµìar¼×ËÜiÛ�ܽà»×IÃRå�ÜiÛÝ�çlk á�ì>ݧÛ�àÅÝ�ÙËàSØ Û�Ù¼Ú�Û�uwO � &í(.O × w n7uæô � È�ô × w�0öH� Ú»×¼ëvuwO � &í(.O × w n7uæô × È�ô � weöH� �9�ß6é%ì<r¼×¼ÜdÛ�Üià'צüåqO ×;� u^ô � È�ô × wm0ö �� ÆÚ�×RëNO × � uæô × È�ô � w=ö �� �k��Üi×¼á�ì©O �c£ O × Ú�×¼ëO × � uæô � È�ô × wN0ö �� »åN�?Ú y�uwO �z� u^ô � È�ô × w#ÈzO ��� uæô × È�ô � wRw2k¶O ×;� uæô � ÈRô × w\Þ5ß�éµìar¼×ËÜ��Û�ܽà»×�~D�k�ZÙ5ç¼Ý>å�O �z� uæô � ÈRô × w1k|O × � uæô � È�ô × w%Ú�×¼ë!O ��� uæô × È�ô � w1k|O × � uæô � È�ô × w�åCÝ�à�?Ú y�uwO �z� u^ô � È�ô × w#È�O ×;� uæô � ÈRô × wRw�öÊO × � uæô � È�ô × w<�Bæ¹â�Ý�àRåCÝ�ܽ׼á�ì%ü��ÎË!M�7Ì�60ö �� *u�; mj?�'åËØhì©ÙRÚ/ö»ìk�?Ú y�uwO �z� u^ô × È�ô � w#È�O ×;� uæô × È�ô � wRw7ö3O �z� u^ô × ÈRô � w#��5ç óËó à'Ý�ì�O �z� u^ô × È�ô � w*ö O ×;� u^ô � È�ô × w=ö �Bí�à»ã%Ý�à�?ìC�ÇË:M��t�ZÙËì�×"åRÞ5ß6éµìar¼×ËÜ��Û�ܽà»× ñËå*�oËJO � uæô × È�ô � w�Ú�×¼ë+�¦Ë>O × uæô � ÈRô × w#�29hç�Û�Û�Ù¼ì�×�å�Ý�Üi×¼á�ì���0öH�; 'å�Þ5ß!éµìar��×ËÜdÛ�Üià'×Û}²ô × � [ ô � Ú»×¼ë�ô � � [ ô × ���ZÙËܽݧÜ�Ý�Ú!á�à»×:Û�ã�Ú'ë�ܽá�Û�ܽà»×�Ý�ܽ׼á�ì»�²Ë+M¾Ü�� ó â½Üiì+Ý� [ Ü�Ý©á�à»×Ë×¼ì>á Û�ì>ëB���ZÙËì�ã�ì�í�à'ã�ì'åc�¿Ú yCuxO � � uæô × È�ô � w<È�O × � u^ô × ÈRô � w�wköÊO � � uæô × È�ô � wcjO × � uæô � È�ô × w1öß�¿Ú3yBuwO � � u^ô � ÈRô × w#ÈzO × � uæô � È�ô × wRw#�F9hß�ë�ã�à ó à:Ý�ÜiÛ�ܽà»×�}�åuwO � & (.O × wzn7u^ô � È�ô × w50ö3� Ú�×Rë�uxO � & (.O × w n7uæô × È�ô � wNöJ� �

147

Page 152: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

���b�<�F���������#�<�~}¥"ü�ú*O � �3�c�©O × �rü*�Ëü4�3�s�»þ�ürü4�²�rüa�J�-ü��t� úX��ú�ü<�Â�#�D�rûQú�û���ú*O ×�£ O � Óì7û¼ü<��uwO � & (.O × w@§Mö'O � ¼kuxO × §TwaÓ���b�q�<ç[Z Ä�ì�Û¦Æ � åkÆ × åZÚ�×¼ëóÆ Þ7ìQÛ�Ù¼ì¯à»ãrë�ì�ã�Üi×D�'ÝÆÜi×¼ëËç¼á�ì+ëÂÞ5ß>O � å�O × åhÚ»×¼ëJO öO � & (.O × å:ã�ì+Ý ó ì>á�Û�ܽö»ì>âiß��H�5ç óËó à:Ý�ì�ôêË6Oa§\����àBÝ�ÙËàSØLÛ�Ù¼ÚSÛNôêË6O � ¼1uxO × §Tw å'ÜdÛZÝ�çlk á�ì>ÝÛ�àer¼ãrÝ�Û�Ý�Ù¼àSØÂÛ�Ù¼ÚSÛNôêË6O × §BÚ�×Rë\Û�Ù¼Ú�Û�í�à»ã�ì�ö»ì>ã�ß�ô7e�ËDO × §¼åôêÆ � ô.e��*9hß?ë�ì<r¼×¼ÜdÛ�Üià'×�åôðËj�?Üi×�u�ÆeÈRÅêw�å�Ý�à�ücô7eNËIÅÃ7CôðÆêô7e&�c9hߦë�ã�à ó à'Ý�ÜdÛ�Üià'×ÀñËåcüCô7eNËIÅ 7Bô Æ × ô7e&��ZÙ:çRÝ�åMôêË��?Üi×Fu�Æ × È�Åêw å¼Ý�à�ôÁËDO × §D��¹àSØ âiì�Û�ô7ekË�O × §\�¦x�ìWÝ�ÙËàSØ Û�Ù¼Ú�Û�ô Æ � ô7e&���5ç óËó à:Ý�ìW×Ëà�Û+å�Ü�� ì�½å�ô7e1� � ô/��ZÙËì�×�å¼Þ:ß%éµìar¼×ËÜiÛ�ܽà»×�üå<O �z� uæô/ÈRô7eæw7ö'�; �Ú»×¼ë6O ��� uæô7e&ÈRôÂw10ö'�; h�7�5ܽ׼á�ìeô.e�ËDO × §Mö�\ܽ×8u�Æ × È�Åùw�åNô Æ × ô7e%Ú�×¼ë[ô7e�Æ × ô/�Á�5à�O × � uæô%È�ô7e�w�ö O ×;� uæô/ÈRô7eæw²ö �� ¯Þ5ßéµìar¼×ËÜiÛ�ܽà»×�Ã\�Â�ZÙ5ç¼Ý>å\�¿Ú yCuxO × � uæô%È�ô e w#ÈzO × � u^ô e È�ôÂwRw;ö}�� »j�O �z� u^ô e È�ôÂw;ö �ÆÝ�ܽ׼á�ìü��eË%M�7��»0ö'� us� j?��a9�ß²ë�ã�à ó à'Ý�ÜdÛ�Üià'×�}�åDô7eq�[ô/�F9hç�Û�Û�ÙËì�×oôÊ0Ë��?ܽ×8u�ÆeÈ�Åùw<�Í�à»×:Û�ãrÚ»ë�Ü�á Û�Üià'×����ZÙËì�ã�ì�í�à'ã�ì'å�ücô7eBËoÅÃ7MôêÆ � ô7e&�xPì�×¼àSØ ó ã�àSö'ì)Û�ÙËì�à�Û�Ù¼ì�ã�ëËÜiã�ì>á�Û�ܽà»×�à�í5Û�ÙËì ó ã�à ó à'Ý�ÜiÛ�ܽà»×��F��ç óËó à'Ý�ì7ôêË:O � ¼\uxO × §hw<��ZÙËì�×�ô Ë��?ܽ×8u�Æ � È��?ܽ×8u�Æ × ÈRÅêw�w<�/�ZÙËܽݩÜ�� ó âiܽì>Ý%Û�Ù¼ÚSÛtô Ë��?ܽ×Fu�Æ × ÈRÅêwµØ�ÙËÜ�árÙ�åHÜi×Û�çËã�×�åËÜ�� ó âiܽì>ÝhÛ�Ù¼ÚSÛ=üCô7eBËoÅÃ7MôêÆ × ô7e&�7�5ç óËó à'Ý�ìeô7e�Ë�Å ��xPì�Ý�ÙËàSØAÛ�Ù¼ÚSÛ;ôêÆèô7e&��5ç óËó à'Ý�ì�×Ëà�Û+å�Ü�� ì�½å�ô7eq�èô%�1ë�ã�à ó à'Ý�ÜdÛ�Üià'×�}��»Ü½ö»ì+Ýhç¼ÝZÛUØhà\á>Ú»Ý�ì�ð�=ô7eq� × ô/�7�ZÙËì>×�ôÊ0Ë��?Üi×Fu�Æ × ÈRÅêw#�NÍ�à'×'Û�ã�Ú'ë�Ü�á Û�ܽà»×8�}M�=ô7em� � ô%�è�5ܽ׼á�ì�ô7em� ô�åNô.eeÆ × ô Þ5ßIë�ã�à ó à'Ý�ÜiÛ�ܽà»× ñ��I�ZÙ5ç¼Ý>å�Ý�Üi×Rá�ì�ôºË�?ܽ×8u�Æ × ÈRÅêw å�Ý�à¯Ü½Ý%ô7e&�:9�çËÛBÜdí;ô.e�� � ô�å)Û�ÙËì�×jôs0Ë[�?ܽ×8u�Æ � È��?ܽ×8u�Æ × ÈRÅêw�w<�Í�à'×'Û�ã�Ú'ë�Ü�á Û�ܽà»×8�

�ZÙËì�ã�ì�í�à'ã�ì'åüCô7e�Ë�ÅÃ7�ôùÆèô.e¬åRÝ�à�ôùËjuwO � & (.O × w°§D�ª �<�t�<«#«�¬­�b®~q¯#�>¥�ü�úcO � È�O × ÈzO { �rüÎ�Ëü4� �s��þ�ürü4�v�rü<�J�-ü��o� ú£�Sú}ü#�o�#�D�rû�ú�û���úmO ×?£ O � ÒO {�£ O � Ò=�3�c��O × §Mö3O { §CÓì7û¼ü<��uwO � & (.O × w@§MöýuxO � & (.O { w°§cÓ���b�q�<ç[Z æ ó¼ó ì+Ú�â½Üi×��?Û�àÎë�ã�à ó à'Ý�ÜdÛ�Üià'×Îüå

uxO � & (.O × w°§ö O � ¼kuwO × §hwö O � ¼kuwO { §hwö uwO � & (.O { w@§»7

xPì�ܽ×:Û�ã�à5ëËç¼á�ì�Ý�à�?ì�×Ëà»Û�ÚSÛ�Üià'×!í�à»ãZÛ�Ù¼ì ó ã�à:à»í-ÝhÛ�Ù¼Ú�Û¹í�à»â½â½àSØe��é©Üiö'ì�ׯÚ\ÞRì>âiܽì�í�Ý�Û�Ú�Û�ìu^Å È;��w�å�â½ì�Ûxw ÞRì¯ÚPÛ�à»Û�Ú»âhà'ã�ëËì�ã?àSö»ì>ã\Ý�çËÞ¼Ý�ì�ÛrÝÆà»íkÅ Ý�ç¼árÙLÛ�Ù¼Ú�Û?Üií/ÊÛÈ�ÊyeEvÖÅ åÊzw Êye�Üs¥ð�;Ê Ú�×¼ëoÊye�Ú»ã�ì©×Ëà»×��;ì�� ó ÛUß»å

148

Page 153: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

}M�;Ê Ú»×¼ë�Êye�Ú�ã�ì²u-×Ëà�Û§×Ëì>á�ì>Ý�Ý�Ú»ã�ܽâ½ß²�¿Ú3y�Ü��¿Ú�â�w�ì�q:çËܽöSÚ�â½ì�×¼á�ìÆÝ�ì�ÛrÝ�åHÜ&� ì��iåHí�à'ã%ì>ö»ì�ã�ßôªËIÊ Ú�×Rëoô7e1ËjÅ åHÜdíNô7e1ËIÊ Û�Ù¼ì�×�ôÃ�Áô7e�Ú»×¼ëoô7eF�óôÖu-Ú»×¼ëÅÝ�Ü��?ܽâ½Ú»ã�â½ßí�à'ã�Êye�w�å¼Ú�×¼ë

ñ��ZØhà»ã�â½ë¼ÝZÜi×�Ê Ú�ã�ì�Ý�Û�ã�ܽá�Û�â½ß ó ã�ì�í�ì>ã�ì+ëWÛ�à Øhà»ã�â½ë¼Ý�Üi×�Êye¬å¼Ü&� ì�½å¼í�à'ã¹ì�ö'ì�ã�ß�ô � Ë�ÊÚ»×¼ë�ô × ËoÊye¬å�ô � �èô × ��ZÙ:çRÝ�åRØ�ì�á>Ú�×Qã�ì ó ã�ì>Ý�ì�×:Û�Û�ÙËìBÞRì>âiܽì�í�Ý�Û�ÚSÛ�ì>ݵÜi×oØ�Ü��'çËã�ìBñ¿Ú»ÝÐ$ùö ÷£¿�÷.wÜ÷4{c¿C4²ûM÷�w÷U{c¿»4|{�û�÷SåÍ"¸öÖ÷<û�÷*wÌ÷U{c¿»4|{�û�÷-wÌ÷£¿»4|{�û�÷Så�Ú�×¼ë+= öÖ÷<ûM÷-wÌ÷U{�û�÷3�¿æ¹é§îã�ì(q'ç¼Üiã�ì>ÝhÛ�Ù¼ÚSÛ�u#$~}/u#"�}�=�§Tw#§hw#§TöÉ÷U{c¿C4�{�û�÷3�ZÖ}×Qì>Ú'árÙQà�í�Û�ÙËì�í�à'âiâ½àSØ�ܽ×�� ó ã�à:à»í-Ý�å¼Ø�ìÝ�ÙËàSØFÛ�Ù¼Ú�Û¹âiì�í ÛµÚ»Ý�Ý�à�á�ܽÚ�Û�ܽà»×��'Üiö'ì>ÝZÚ�×Qܽ׼á�à'×¼Ý�ܽÝ�Û�ì>×'Û�ã�ì>Ý�çËâiÛ>åËÝ ó ì+á�Ü�rRá�Ú»âiâ½ß»åF÷�¿m4M{�û�÷ �ÛC��Üq����«#�����XÝÞ�m�ͬ­��Ü��t¬Z«.�t��ßÌ��������� �ZÙËì©×¼ÚSÛ�çËãrÚ�â7ã�ì>ö:Ü�Ý�Üià'× à ó ì�ãrÚSÛ�à»ã�} � Ü�ÝZë�ì<rR×Ëì>ëWÚ»Ýí�à»â½âiàSعÝ����� ������������>ö væ�.� ö u^Å È;��w%�L������üa�J�-ü��e� ú£�Sú}ü#Òhú û¼ü<�Ûu���}���¿cw=ö u^Å È;�'eLw%�L�Bú û¼ü��üa�J�-ü��k� úX��ú�ü§þ�ü#�#�M�zú&���h���rþ�ý3� ú û¼üt�c��ú���þR� �Cþ�üa���L�#�-ý3��ý��'� ��ÿ���ü<�Rú�üa�c�rüH¿¢� �t� �c� ý �D�zÿ�� ���ýSþÇ�3���Cô � È�ô × 0Ë��\ܽ×Fu��eÈ^¿Cw4ÒHô � �'eCô × � ¤óô � �Ùô × � �c�3Ò*��ÿWú�ûËü�÷Æøhù¸�¼ý!� ú&�M���Sú}ü<��Ò��ýSþ�� ����ô � Ë��\ܽ×8u��eÈ^¿cw%�3�c�%ô × ËoÅpÒ�ô � � e ô × Ó���b�<�F���������#�<�>�Öì7û¼üZþ�ü<�#�M�zú����T�t��üa�J�-ü�����ü�ú^�1�T�#���h�Â�iü��rú8�3�C�%þ#�s�+û�úB�3�4��ý(�a�^�Sú&�-ý3��ý��*á%ý3��ú&���J�-ü�þZâ ��c�Sú&��þR� �Cþ�ü<�(�L�#�-ý3� ý��¼ü�þ��Sú}ýSþ4���4�3�Û��ü%���c�rý ���#�L� ú�üa�¼ú£Ó���b�q�<ç[Z æ ó¼ó âiß5ܽ×��WÛ�ÙËì\à ó ì�ãrÚSÛ�à»ã�Û�à Û�Ù¼ìÆÞ7ì�â½Üiì�í�Ý�Û�ÚSÛ�ì>Ý1$/È\"ÎÈ\=\åCØhì%�'ì�Û�u�u�$�}��:"m§w�}=�i=�§Tw@§Möp÷�¿�4*{�û�÷µØ�ÙËÜ�ár٠ܽÝ�ܽ׼á�à'×¼Ý�Ü�Ý�Û�ì�×:ÛhØ�ÜdÛ�Ù Û�Ù¼ì%ã�ì>Ý�çËâiÛ�à�í�ã�Ü��»Ù:ÛhÚ'Ý�Ý�à�á�Ü�ÚSÛ�ܽà»×8��¬­�bã»��äbåÍ��¬,��æ ���X¬­�t«��8ÝÐçx�,�tè!Ü�«#¬­���#�<� Ä"ì�Û8� ö u^Å È;��w©Þ7ì ÚQÞ7ì�â½Üiì�í�Ý�Û�Ú�Û�ì'å\¿ÞRì�Ú?Ý�ì>×:Û�ì�×Rá�ì§Üi× À �.é©Ú�ã�Ø�Ü�árÙËì©Ú»×¼ë�ë)ì>Ú�ã�âHÝ�ç����»ì+ÝUÛ�Ú?Ý�ì�ÛZà»í ó à'Ý�Û�çËâ�ÚSÛ�ì>Ýeu¬Ý�ì>ì?ïið�{SòCí�à»ãÛ�ÙËì>Üiã�ì�×5ç��?ì�ãrÚSÛ�Üià'×\w7Û�à%Ý�ç ó¼ó âiì��?ì�×:Û�Û�Ù¼ì�æ¹é§î ó à'Ý�Û�çËâ�ÚSÛ�ì>ÝHí�à'ã"ÜiÛ�ì>ã�Ú�Û�ì>ë§ã�ì�ö5Ü�Ý�ܽà»×"å�Û�ÙËì�×Ý�ÙËàSØ�Þ5ß?ØhÚ/ßÆà�í"ÚBã�ì ó ã�ì+Ý�ì>×:Û�ÚSÛ�Üià'×\Û�ÙËì>à»ã�ìa� Û�Ù¼Ú�ÛhÚ�×Wæ¹é§î¥à ó ì�ãrÚSÛ�à»ã�}��q�ÀÝ�ÚSÛ�Ü�Ý�í�ß:ܽ×��Û�ÙËì ó à'Ý�Û�çËâ�ÚSÛ�ì>ÝZà'ÞRì>ß5ÝhÛ�ÙËì§í�à»â½â½àSØ�Üi×��?ã�çËâ½ì>Ý��ð�ZÖ�íHô � õ öI¿ÅÚ�×Rë�ô × õ öI¿"å5Û�Ù¼ì�×�ô � �èô × Ü�¥�ô � �'eMô × �}M�ZÖ�íHô � õ ö�{c¿ÅÚ�×Rë�ô × õ ö�{c¿"å�Û�ÙËì�×�ô � �¢ô × Ü�¥�ô � �'eMô × �ñ��ZÖ�íHô � õ öI¿ÅÚ�×Rë�ô × õ ö�{c¿"å�Û�ÙËì�×�ô � �èô × à'×ËâißWÜdí�Ú�×Rë�ô � �'eMô × �ÃD�ZÖ�íHô � õ öI¿ÅÚ�×Rë�ô × õ ö�{c¿"å�Û�ÙËì�×�ô � �èô × à'×ËâißWÜdíHô � �'eMô × �

Ø�ÙËì�ã�ìÇu���} �q� ¿Cw1öêu^Å È��'e�whÜ�ÝZÛ�Ù¼ì�ã�ì+Ý�çËâiÛ�à�í)ã�ì>ö5ܽÝ�Üi×D�-� Þ:ß�¿8����b�<�F���������#�<�?¡¸ì7û¼ü\þ�ü<�4�M�dú����T���rü<�s�-ü^�/��ü�ú��k�T�#���h���iü��rú;�3�c� þ#�s�+û�ú;�!�4��ý��a�^�Sú&�-ý3� ý����3�¼ÿþ�ü<�(�L�#�-ý3�Pý4�Ëü�þR��ú�ý�þ4�Â�a�Sú&�L�æ�rÿ3���h��ú û¼üÐê��Sþz���^�rûËü��3�c�më¹ü���þ#�h�Ëý3� ú��M����ú�ü<�/��� ���rüt���c�rý3�D�#�L�4�ú�üa�¼ú£Ó���b�q�<ç[Z Ä�ì�Û_} �q� Þ7ì%Ú�×Wæ¹é§î¥à ó ì�ãrÚSÛ�à»ã�Û�Ù¼ÚSÛhܽÝ�Ú/�?ìa�ÆÞRì>ã�à�íCÛ�ÙËì%Ú»ÞRàSö'ì�í-Ú �?ܽâiß\à�íà ó ì�ãrÚSÛ�à'ã�Ý��8�ZÙËì>×�å�'Üiö'ì�×�$/Èh"�È\= Ú»Ý�Ú»ÞRàSö'ì»åSÞ5ß�Û�Ù¼ì�Û�ÙËܽã�ë\ã�çËâiì/÷�¿F4.{�û�÷'w ÷4{c¿F4.{�û�÷Üi×�$J} � "©§¼å"Ý�àju�u�$J} � "©§Twq} � =c§hw\§Mö ÷£¿»4|{�û�÷\Ø�ÙËÜ�árÙPÜ�ݧÜi×Rá�à»×RÝ�Ü�ÝUÛ�ì�×:Û§Ø�ÜdÛ�ÙPÛ�ÙËìã�ì+Ý�çËâiÛ�à�í)ã�Ü��»Ù:Û¹Ú»Ý�Ý�à5á�ܽÚ�Û�ܽà»×��

149

Page 154: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

ì �F��åÍ�FÝí��ä8�<�ÍæÍ�°���#�<�ͬZ«#��îV¬Z������� Ä�ì�Û�� Þ7ì§Û�ÙËì�Ý�ì�Û¹à»í�à'ã�ë�ܽ׼ڻâ½Ý����� ������������>þ�÷Â� à»ãrë�ܽ׼Ú�âCá�à'×¼ë�ÜiÛ�ܽà»×¼Ú»âHí�çË×¼á�Û�ܽà»×ju&�eÍbØ�wt�L�%�3�Rÿ;�4�M�c��ú��-ý ��ï��Å RT�®�4�D�rû\ú�û���úH�TôêËoÅÃ7�ïFuæôÂwNöþ{\Ó�¦"ý�þ1Ê�vvÅÉÒq��ü��'ü^Ð7�7ü�ï8u&Êêw1öþ�?ܽ×s���=��ï8u^ôÂwaÓì7û¼ü���üa�J�-ü��t��ü�úZý��%� �/y:%q¦Ù�L�ÐïV§Töùø�ôùË¦Ê õVï8u^ôÂw7öÙ{MúTÓ��� ����������������|¥�ü�úFïþ�rüÎ� �?y�%q¦bÓ�}��è�L��� �:���}á�à'×¼ë�ÜiÛ�ܽà»×¼Ú»âiÜ��>Ú�Û�ܽà»×¯à ó ì�ãrÚSÛ�à»ã�� ¤ ��L�/�Î�7ý �D�#ò+ü�þ�ý�ýSþ��3���c� ��� �c�!Òc��ýSþ%�3�Rÿ���ü<�Rú�üa�c�rü�¿oË À �3�c�²�3�RÿeôêË�ÅÉÒ

u#ïE} � ¿cwauæôÂw7ö�� ï8u^ôÂw��+ï8us¿cw � ��ôpõ öj¿�M��ï8u^ôÂw��+ï8u�{c¿cw � ��ôpõ ö�{c¿

���b�<�F���������#�<�>ñÖì7û¼ü\þ�ü<�4�M�dú����T���rü<�s�-ü^�/��ü�ú��k�T�#���h���iü��rú;�3�c� þ#�s�+û�ú;�!�4��ý��a�^�Sú&�-ý3� ý����3�¼ÿ��ý ���a���c��ú��-ý ��ý��Ð�N����ý �c� � ú��-ý �c�3�s�Þò���ú��-ý � ý��¼ü�þR��ú�ý�þ4���4�3���rü/���c�rý3�D�4�L� ú}ü<�RúXÓ���b�q�<ç[Z Ä�ì�Ûlï��7È�ïs��È\ï���ÞRì¹Û�ÙËìe�eÍbØ�Ý�ã�ì ó ã�ì+Ý�ì>×:Û�ܽ×���ÞRì>âiܽì�í�Ý�Û�Ú�Û�ì>Ýa$/È\"ÎÈ\=\å:ã�ì>Ý ó ì+á#�Û�ܽö»ì�â½ß»å�Ý�ç¼árÙ�Û�Ù¼ÚSÛiïs�=us¿�4�û w�öGïs�=us¿�4K{�û3w�ö {��Hï��=u�{c¿�4�û w��Hïs�=u�{c¿%4W{�û w�åïs�ÂuJ¿»4�û wÂöÃïs�Âu�{c¿i4�û wÂöÉ{6�|ï���u�{c¿i4�{�û wm�|ï���us¿©4|{�û3w�å�Ú»×¼ëNï��;uJ¿»4�{�û wtöïs�;u�{c¿%4~{�û w�ö {J� ïs��uJ¿%4jû3w�ö ïs��u�{c¿%4jû3w<�ÙÄ�ì�ÛL}�� n Ú�×¼ë�}�� p ÞRìÅÚ»×:ß ÛUØ�à���}á�à'×¼ë�ÜiÛ�ܽà»×¼Ú»âiÜ��>Ú�Û�ܽà»×Qà ó ì�ãrÚSÛ�à»ãrÝa��Ö�ÛµÜ�ݹì+Ú»Ý�Üiâ½ß�Ý�ì�ì>×QÛ�Ù¼Ú�Û�u#ï���}l� n uwïs�|}�� p ïs��§hw#§Tw�§Mö÷U{c¿F4�{�ûM÷ �1�¹àSØ�å�Þ5ßcé%ì<rR×ËÜdÛ�Üià'×Qð�{Ëå\u#ï��7}l� n ï��q§Tw<uJ¿F4�{�û w��ùu#ï��'}l� n ï��̧Tw<u�{c¿F4�{�û w#��5çËÞ¼Ý�ì�q:çËì>×'Û�á�à'×¼ë�ÜiÛ�ܽà»×Ëܽ×��WÞ:ß:ï��*§�çRÝ�ܽ×��WÛ�ÙËì©� × à ó ì�ãrÚSÛ�à»ã ó ã�ì+Ý�ì>ã�ö'ì>ÝZÛ�ÙËÜ�Ý%à'ã�ëËì�ã�Üi×��Ú�×¼ë ó ã�à5ëËç¼á�ì+ÝhÛ�ÙËì�Þ7ì�â½Üiì�í�Ý�ì�Û/÷�¿c4�{�ûM÷ �õF�Xå�è/¬Z���aÝÞ��çx�,�tè!Ü�«#¬­���#�<� x�ì�ã�ì�í�ì�ã�Û�ÙËì�ã�ì>Ú'ë�ì�ãHÛ�àBïdð(|/ò�í�à'ãCÛ�ÙËì ó à:ÝUÛ�çËâ½Ú�Û�ì+Ý�Ä�ì>Ù��¿Ú�×Ë×ó ã�à ó à:Ý�ì+ÝZÝ�ÙËà'çËâ�ë��'àSö»ì�ã�׿Û�ÙËì�ÞRì>Ù¼Ú/ö5Üià'ãZà�í�Ú?Ý�ì�q:çËì>×¼á�ì§à�í�ã�ì>ö5ܽÝ�Üià'×¼Ý���Ä"ì�Ù��¿Ú�×¼×��'Üiö'ì>Ý�\à�ë�ì>âs�;Û�Ù¼ì�à»ã�ì�Û�ܽá�Ý�ìa�¿Ú»×'Û�ܽá>Ý�ܽ׿Û�ì>ãR�¿Ý�à»í����^�'ü<�\���h�?þ��3�TÑ%�¿ý��'ü<�J�rå�ë�ìar¼×Ëì+ë?ÞRì>âiàSØe��ÞµÝ��Üi×���Û�ÙËì+Ý�ì;�?à�ë�ì>â½Ý>å»ÙËì%ë�ì>Ý�á�ã�ÜiÞ7ì>Ý�Ú§ã�ì+á�çËãrÝ�Üiö'ì¹ë�ì<r¼×¼ÜdÛ�Üià'×?í�à»ã�á�à� ó ç�Û�Üi×���Û�ÙËì%ÞRì>âiܽì�í�Ý�ì�ÛÛ�Ù¼Ú�Û¹ã�ì+Ý�çËâiÛ�Ýhí�ã�à� Ú\Ý�ì�q:çËì>×¼á�ì§à»í�ã�ì�ö5Ü�Ý�ܽà»×¼Ý�Û�Ù¼ÚSÛ¹à'ÞRì>ß¿Û�ÙËì ó à'Ý�Û�ç¼â½Ú�Û�ì>Ý����� ��������������<�F÷ Ø�Ü�ë�ì�×¼Üi×��!ãrÚ�×Ëäo�?à�ë�ì�â��L�Ç�e�4�M�C��ú&�-ý3�Y051 �=� RTÜ}�� �7¡��4�D�rûú ûD�Sú� Ó7��ý�þ%�3�¼ÿ�·�Èa¢ Ë-�ÁÒ7� ��·D�J¢ ú û¼ü<��051�u°·8w�v£0-1�u¤¢�w#ÒN�3�c�� Ó7��ý�þ%�3�¼ÿeôêË�ÅÉÒ�ú û¼ü�þ�ü/�L�t��ý3�¿ü5·�Ë5�Ì�#�D��û¯ú ûD�Sú�ôÁËy051%u�·8w4Ò

Ø�ÙËì�ã�ì�� ܽÝ�Ú¹Ý�çlk á�ܽì�×:Û�â½ß%â½à»×D��Üi×ËÜiÛ�Ü�Ú�â5Ý�ìa��?ì>×'Û�à»í5Û�ÙËì�à»ãrë�Üi×RÚ�â�Ýa��ؼà»ãD¿oË À å/Øhì�ë�ìar¼×ËìþR�3�MÑBuJ¿Cw1ö=Ú»ãR���?Üi×s¥l�4¦¦uæôêËy051%u�·8wÌ4²ôpõ öj¿Cw#�1�ZÙËì§Þ7ì�â½Üiì�í�Ý�ì�ÛD051a§Tö§051�u^{hw#�Ä�ì�Û�¨�ÞRì\Ú Ý�ì(q:çËì�×¼á�ì�à�í�Ý�ì�×:Û�ì>×¼á�ì+Ý¹Ü½× À Ø�Ù¼ì�ã�ì�¡?ܽݵÛ�Ù¼ìBìa� ó ÛUßQÝ�ì�q:çËì�×Rá�ìBÚ»×¼ëK©Ü½ÝZÛ�ÙËì�á�à'×¼á�Ú�Û�ì>×¼ÚSÛ�Üià'×�à ó ì�ãrÚSÛ�à»ã(�

��� ��������������­I ø=���Sü<�j�i���^�'ü<�\���h�!þR� �TÑ��¿ý��'ü<�.0-1%Ò�ú�ûËü��rü<�s�-ü^�/��ü�úZþ�ü<�4�M�dú����T���rþ�ý �ú û¼ü�þ�üa���L�#�-ý �¦��ü4Ï<�¼üa�c�rü/�rýSþ þ�ü#�&�¼ý3�c� ���h��ú�ý#¨v� �c� ý��rü�ÿ3���T��¥�ü�ûM��� �D��â �N�Ëý3�rú&�M����ú�ü<�4Ò1�'ü<��7ýSú}ü4� ï ¨7ò�ª'«1Ò1�L�`¬1u¤¨�w��»ü�Ð1�Hü�� þ�ü4�<��þ4�#���Sü<�dÿ²�!�1��ý3���½ýt�H�;u� Ó;ûMu�¡�w7öþ{��3�c��¬1u�¡�w1ößÅ�­�u�{�waÓ

150

Page 155: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

� Ó5v^�:®ß�L������ü<�Rú�üa�c�rüÎ��ü�Ïa�¼üa�c�rü#ÒF¿[Ë À Òe�3�C�Åú�ûËü�þ�ü!üa>��L� ú^�eôÖËJ¬1u9®DwÎ�4�D�rû�ú�û���úôýõ öj¿8Ò�ú û¼ü<�²ûMu9®5©�¿Cw7övûMu9®Dw�� �c�E¬1u¤®5©�¿cw1öêø�ôùË�¬1u9®Dw�õ3ôýõ öj¿8úTÓ

� ÓiyZú�ûËü�þz���L��ü<Ò�ûMu9®|©a¿cw��L�!ú û¼ü��#��� ���iü<�rú5·°¯ ûMu9®Dw��#�D�rûÂú ûD�Sú�ú�ûËü�þ�üÅü[>��L� ú^�%ô Ë051%u�·8w%� �c�%ôêõ öI¿�ÒN� �c�E¬1u¤®5©�¿Cw1öÁø�ôê˱051�u�·8w*õ!ôpõ öj¿8úTÓ��ûËü�þ�üÂû¦���4�\��þ�üa�(�L�4�-ý ����ü4Ïa�¼ü<�C�rü#��ú}ýWý�þR�3���C�3���4Ò;�3�c��¬ß�Ç���c��þ�ü<�(�L�#�-ý3����ü4Ï<�¼üa�c�rü<�§ú}ý�4�D�<��ü�ú���ý��;Å Ó�ZÙËÜ½Ý ó ã�à�á�ì+ë�çËã�ì%Ü�Ýhì�q:çËܽö/Ú»âiì>×:Û�Û�à\ÜiÛ�ì>ã�Ú�Û�ܽö»ì>âiß¿Ú ó¼ó âiß5ܽ×��ÆÛ�ÙËì©í�à»â½â½àSØ�Üi×��Æã�ì>ö5ܽÝ�Üià'× à ó ì�ãR�ÚSÛ�à'ãZÛ�à\Û�Ù¼ìe�\ì��BÞ7ì�ãrÝZà�í ¨H���� ��������������Xp v^�8051ê�L�;�����^�'ü<�\���h�BþR�3�MÑk�?ý(�»üa�Rý �/ü�þ�ÅpÒ�ú�ûËüa� ú�ûËül���^�'ü<�\���h�BþR�3�MÑ�?ý(�»üa�Fu?0-1~}=²�¿cw�þ�ü#�#�M�zú&���h���rþ�ý3� ú�ûËüBþ�ü<�(�L�#�-ý3� ý��D051 ��ÿÇ��üa�¼ú}ü<�C�rü�¿I�L���'ü^Ð7�7ü4���3���ý3���iý �H� u� Ó%u?0-1~} ² ¿cw<u�{�w1öêø�ôùË�Å õ!ôpõ öI¿è�3�C��ôêË�051�uUþR� �TÑ�uJ¿cwRw4úMÓ� Ó1¦�ýSþ%� ���Z·�Ë5�Ì�#�D�rûQú�û��Sú�·³¯v{MÒ7u?0-1~} ² ¿cwau°·8w1ö´0-1�uUþR� �TÑ8uJ¿Cw��+·8w�Ó

Ä�ì�Û�0-1�µ1ÞRìÅÛ�ÙËìÅã�ì>Ý�çËâiÛ�à�í�ç¼Ý�ܽ×��~}�²AÛ�àLÜdÛ�ì�ãrÚSÛ�ܽö»ì>âißÂã�ì>ö5ܽÝ�ìY0-1 Þ5ßÂá�à»×¼Ý�ì>á�ç�Û�ܽö»ì�\ì��BÞ7ì�ãrÝ�à�í`¨)å�Û�Ù¼Ú�ÛBܽÝ>å*0-1.¶²ö·051 Ú»×¼ëCå�ã�ì+á�çËãrÝ�Üiö'ì�â½ß»å#051 µ=¸ ¹ ö·051 µ } ² ¿�í�à»ãÚ�×5ß/¿oË À �1Ä�ì�Û�þR�3�MÑ µ us¿cwhÞ7ì§Û�ÙËì�ãrÚ�×Ëä¿à»íc¿Qܽ×y0-1 µ �õF�Xè�è�¬NI ¥�ü�úW051 �rü������^�'ü<�\���h�1þR� �TÑI�?ý(�»üa��Óºì7û¼ü<�J¬1u¤¨�w�öº0517µBu�{�w4Ò�� �c���ýSþ�� ���a· ËY� �#�D��ûFú�û���ú1·£¯ {MÒ»0-1�uæûMu�¨�w7��·8w�ö¼0-1�µCu°·8w�Ó v<�I�D�Sþ ú&�^�a�M���Sþ4Òï ¨7ò/ª`«�ö´0-1�µb§cÓ���b�q�<ç[Z �ZÙËì ó ã�à5à�í�ܽÝ�Þ5ß Ü½×¼ë�çRá Û�ܽà»×Qà'×�Û�ÙËì�âiì>×���Û�ÙQà�í�¨H�ÛC¬Z���6äV¬,���HZZÖ�í�¨oö�¡�å5Û�Ù¼ì�×y0-1ÁöB051 µ å¼Ý�à¬1u¤¨�w

ö 051�u�{�wö 051.µBu�{�w 7

ØËçËã�Û�ÙËì>ãR�?à'ã�ì'å5í�à»ã¹Ú�â½â�·W¯[{¼å 051�u^ûMu¤¨�w��+·8wö 051�u°·8wö 051�µcu�·8wz7½ ��æÍÜÍä ����ß<��äV¬Z���HZ��5ç óËó à:Ý�ì5¬1u�¨�weö�0-1 µ u^{hw§Ú�×¼ëÀí�à»ã�Ú»âiâ*·�¯p{Ëå#051�uæûMu¤¨�w_�

·8wNö¾0-1 µ u°·8w<�Nx�ìBÝ�ÙËàSØ Û�Ù¼Ú�Û7¬1u¤¨�©R¿cw7ö¾051 µ=¸ ¹ u�{�wZÚ�×Rë!í�à»ãµÚ»âiâÌ·K¯þ{Ëå.0-1�uæûMu�¨D©¿cw���·8w1ö§051 µ=¸ ¹ u�·8whØ�ÙËì�ã�ì=¿oË À �

Ø�ܽãrÝUÛh×Ëà�Û�ì¹Û�ÙRÚSÛ%þ��3�TÑ µ uJ¿Cw1öþûMu¤¨-©�¿cwR��ûMu¤¨�w<��Ö�í�þR� �TÑ µ us¿cw1öÙ{Ëå'Û�ÙËì>× Û�ÙËì�ã�ìµìay5Ü�Ý�Û�Ýô Ë�051 µ u�{�w©Ý�çRárÙÅÛ�ÙRÚSÛeô õ öÙ¿8��9�ßÅÛ�ÙËì?ܽ׼ë�ç¼á�Û�ܽö»ì?Ù5ß ó à�Û�ÙËì>Ý�Ü�Ý�åBôðËC¬1u¤¨�w§Ý�à¼åCÞ5ßéµìar¼×ËÜiÛ�ܽà»×Åð!}�åûMu�¨#©L¿cw7öþûMu¤¨�w�Ú�×¼ëQþR� �TÑ¿µcus¿cw7ößûMu¤¨�©�¿Cw¿��ûTu�¨�w#�b�%×?Û�ÙËì¹à»Û�ÙËì>ã�Ù¼Ú�×RëCå

151

Page 156: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

Üdí�þR� �TÑ µ us¿cwNöJ·K¯þ{Ëå5Û�Ù¼ì�×QÛ�Ù¼ì�ã�ì�ì<y�ܽÝ�Û�Ý;ôêËY0-1 µ u°·8wböB051�uæûMu¤¨�w��+·8wZÝ�ç¼árÙ!Û�ÙRÚSÛôºõ öp¿8�Àî¯à'ã�ì>àSö»ì>ã>å"í�à»ãÆÚ»âiâNô ËB051 µ u�{�w�öÀ¬1u�¨�w å�Ú»×¼ë�í�à'ãÆÚ�â½â7ô ËB0-1 µ u°·qe�w�ö0-1�uæûMu�¨�w��·qe�w§Ý�ç¼árÙPÛ�Ù¼Ú�Û/{:�|·qe��|·�å�ô õ ö�{c¿F���ZÙ5ç¼Ý>å�ûMu¤¨³©#¿cwkö ûMu¤¨�w��·�å�Ý�àþR�3�MÑlµBuJ¿Cw1ö>·¦övûMu�¨�©�¿cw���ûMu¤¨�w<��5ç óËó à'Ý�ì�ôêËM¬1u¤¨-©�¿cw#���ZÙËì>×�ôýõ öj¿F�)Ö�í�Û�ÙËì>ã�ì%ì<y�Ü�ÝUÛrÝ7ô7e8Ëx¬1u�¨�whÝ�ç¼árÙ Û�Ù¼Ú�ÛNô7e�õ ö

¿"å5Û�Ù¼ì�×�Þ5ß�é%ì<rR×ËÜdÛ�Üià'×Pð!}�åhûMu¤¨x©&¿Cw1öþûMu�¨�w å¼Ý�àWþ��3�TÑ¿µBuJ¿Cw1öß{\Ú�×¼ë�å:Þ5ßié%ì<r¼×¼ÜdÛ�Üià'×Pð+ñËåôêË�051�µ=¸ ¹�u�{�w<�1�µÛ�ÙËì�ã�Ø�Ü�Ý�ì'åMûTu�¨M©�¿cw7ö>·³¯[ûMu¤¨�wZÜ�ÝhÛ�Ù¼ì�Ý��¿Ú�â½â½ì>Ý�ÛZà'ã�ëËÜi×¼Ú»âCÝ�ç¼árÙWÛ�ÙRÚSÛÛ�ÙËì>ã�ì?ì<y�Ü�ÝUÛrÝtô e ËÀ051%u�·8w§Ú�×¼ë�ô e õ öó¿"å�Ú�׼볬1u¤¨³©4¿cwtö ø�ôðËÁ0-1�u°·8w�õ\ô õ öÁ¿8ú���ZÙËì�ã�ì�í�à'ã�ì'å

ôê˱051�u^ûMu¤¨�©�¿Cw�wö 0-1�uæûMu¤¨�w��=þR�3�MÑ µ uJ¿Cw�wö 0-1 µ uUþR� �TÑ µ uJ¿cwRwz7

Ú�×¼ëCå¼Þ:ß�éµìar¼×ËÜiÛ�ܽà»×�ð>ñ¼åMôêË�051 µ=¸ ¹ u�{�w#��5ç óËó à'Ý�ì�ôêË�051 µ=¸ ¹ u�{�w<�F�ZÙËì>×eôpõöI¿�Ú�×RëÂôê˱051 µ u�þR�3�MÑ µ us¿cw�w<�"Ö�íCþR� �TÑ µ us¿cw1ö

{Ëå5Û�Ù¼ì�×�ûMu¤¨�©�¿Cw1öþûMu�¨�w�Ú�×¼ëôê˱0-1�µCu^{hwö ¬1u¤¨�wö ¬1u¤¨�©�¿cwz7

�µÛ�ÙËì>ã�Ø�Ü�Ý�ì'åCþR�3�MÑ µ us¿cw�¯v{ËåËÝ�à�ûMu�¨�©�¿cw�¯[ûMu¤¨�w�Ú»×¼ëôê˱051 µ u�þR�3�MÑ µ us¿cw�wö 0-1�uæûMu¤¨�w��=þR�3�MÑ µ uJ¿Cw�wö 0-1�uæûMu¤¨�©�¿cwRw

Ý�à�ôêËM¬1u¤¨�©�¿Cw#�1�ZÙËì>ã�ì�í�à»ã�ì»ål¬1u¤¨M©�¿Cw7ö§051 µ=¸ ¹ u^{�w<��¹àSؾâiì�Û5·~¯ó{¿í�à»ã©Ý�à��?ìm·IËx�ù�k�ZÙËì�×"åHÝ�ܽ׼á�ì þ��3�TÑ µ uJ¿Cw=öùûTu�¨D©�¿cw �jûMu�¨�w åCÞ5ß

éµìar¼×ËÜiÛ�ܽà»×�ð>ñ 051�uæûMu¤¨�©R¿cwq��·8wö 051�uæûMu¤¨�w��Aþ��3�TÑ¿µBuJ¿Cw��+·8wö 051�µCuUþR� �TÑsµcus¿cw���·8wö 051 µ=¸ ¹ u�·8w

Ø�ܽ׼ڻâiâ½ß»å�ÜiÛ�í�à»â½âiàSعÝhÛ�Ù¼ÚSÛï ¨7ò ª'«ö ¬1u�¨�wö 051 µ u^{�wö 051 µ §©7

152

Page 157: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

���b�<�F���������#�<�?ö¸ì7û¼üZþ�ü<�#�M�zú����T�t��üa�J�-ü�����ü�ú^�1�T�#���h�Â�iü��rú8�3�C�%þ#�s�+û�úB�3�4��ý(�a�^�Sú&�-ý3��ý��*¥�ü�ûM��� �D��â �þ�ü<�(�L�#�-ý3� ý��¼ü�þ��Sú}ýSþ��4�3����ü%���c�rý ���#�L� ú�üa�¼ú£Ó���b�q�<ç[Z xAÜ�ë�ì�×Ëܽ×��QãrÚ�×Ëäo�?à�ë�ì�â�ݧá�à'ã�ã�ì>Ý ó à'×¼ë�ܽ×�� Û�à�Û�ÙËì?ÞRì>âiܽì�íZÝUÛrÚSÛ�ì>Ý©Üi×ÛØ�Ü��'çËã�ì\ñÚ�ã�ì� 051.�=u�·8w1öÃÂÄ Å ÷�¿�÷ � �6·¦öÙ{

÷�¿c(�û�÷ � �6·¦öðÅ ý�ú û¼ü�þz���L��üTÈ

051 � u°·8w1ö�ÂÄ Å ÷#û�÷ � �6·¦öß{÷U{c¿c(²û�÷ � �6·¦ö�ðÅ ýSú�ûËü�þz���L��üTÈ

Ú�×¼ë 051��*u°·8w7ö � ÷4{�û�÷ � �:·�öÙ{Å ýSú�ûËü�þz���L��ü

Ø�ÙËì�ã�ì�·�Ë5�ù�b�ZÙËì�ã�ì+Ú»ë�ì>ãZØ�Üiâ½â"ì+Ú»Ý�Üiâ½ß�á�à'×Mr¼ã���Û�ÙRÚSÛ�u?0-1��x} ² u?0-1��D} ² 051��̧Tw°§w@§TöÉ÷U{c¿c4�{�û�÷©Ø�ÙËì>ã�ì+Ú»Ýtu�u?0-1:��} ² 0-1��̧TwH} ² 0-1���§Tw@§Möý÷£¿c4�{�û�÷3�÷G�#«�«#��¬Zè��8ÝÍ�;�t¬Z�Í��è!Üq��¬­���#�<�Í� Ö�Û%Ü�Ý%ì>Ú»Ý�ßWÛ�à�ö'ì�ã�Üdí�ßWÛ�ÙRÚSÛ©Û�ÙËìBí�à»â½âiàSØ�ܽ×���ë�ìar¼×ËÜiÛ�ܽà»×à�í�Ú»ë!ÚUç¼Ý�ÛR�?ì>×'Û¹à ó ì>ã�Ú�Û�à'ã�Ý�ܽݹì�q:çËܽö/Ú»âiì>×:ÛhÛ�àÇxAÜiâ½â½Ü½Ú�� Ý�� Æ��� ���������������~}¥�ü�ú1ïý�rü��3� y�%q¦bÓ_}?Çó�L�o� ���B�}Ú»ë!ÚUç¼Ý�ÛR�?ì>×'Û à ó ì>ã�Ú�Û�à'ã�� ¤Ê� �L�¦��7ý3�\�#ò+ü�þ�ýWýSþR� ���c� ���3�C�!ÒD��ý�þ��3�Rÿ���üa�¼ú}üa�c�rü�¿oË À � �c���3�¼ÿeôêË�ÅÉÒ

uwïE} Ç ¿cwauæôÂwbö ÂÈÈÄ ÈÈÅ{ � ��ôpõ öÛ¿[� �c�:ïFuæôkw7öJïFuJ¿Cw� � ��ôpõ ö�{c¿8È��3�C�

ïFuæôÂw��/� ý�þ�ï8u^ôÂw7öJïFu�{c¿CwïFuæôkw ý�ú û¼ü�þz���L��ü�7

���b�<�F���������#�<�?þ¸ì7û¼ü\þ�ü<�4�M�dú����T���rü<�s�-ü^�/��ü�ú��k�T�#���h���iü��rú;�3�c� þ#�s�+û�ú;�!�4��ý��a�^�Sú&�-ý3� ý����3�¼ÿ��ý ���a���c��ú��-ý ��ý����1�����^ø#�T� ú��¿üa�¼úhý4�Ëü�þR��ú�ý�þ4����� ���rü/���C�rý3���#�L� ú}ü<�RúXÓ���b�q�<ç[Z Ä�ì�Û�} Ç n Ú�×¼ë*} Ç p ÞRì¹ÛUØ�à��B�}Ú»ë!ÚUçRÝUÛ��\ì>×:Û�à ó ì�ãrÚSÛ�à'ã�Ý��FÄ"ì�Ûaï � È\ï � È\ï � ÞRìZÛ�ÙËìÝ�Ú�\ì�Ú'ÝZÜi×WÛ�ÙËì ó ã�à5à�í�Û�àÇë�ã�à ó à'Ý�ÜiÛ�ܽà»×�z�å�Ø�ÜdÛ�Ù!Û�ÙËì�Ú'ëËë�ì>ëWã�ì>Ý�Û�ã�ܽá�Û�ܽà»×WÛ�Ù¼Ú�ÛÐï � u�{c¿C4{�û3w`¯�� � ��æµÝ¹ç¼Ý�çRÚ�â;å�u#ï���}(Ç n u#ï��D}(Ç p ï���§hw°§hw@§Möp÷4{c¿c4x{�û�÷ �F9hß�éµìar¼×ËÜiÛ�ܽà»× ð�üå

uwïs�x}?Ç n "�§Tw<uJ¿c4�{�û wö � �� uwï � } Ç n "�§Tw<us¿c4x{�û wö ï � u�{c¿C4�{�û w

Ý�àou�uwïs�x}(Ç n ïs�q§hwR}(Ç p ï���§hw°§Töp÷£¿c4�{�û�÷3�É ����,� b,�'�'*O l P#?!0hW �#¡4� ST@� #582b,&+�:*+��J245\)!'£U3.!+ ��+� #.hW

153

Page 158: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

���b�<�F���������#�<�'���uwO � & 4.O × w<uæô � È�ô × w1ö'O � u^ô � ÈRô × w � O × u^ô � È�ô × waÓ���b�q�<ç[Z �5ç ó¼ó à:Ý�ì�ô � ÈRô × Ë¢Å å"Ú»×¼ë+O � Ú»×¼ë�O × Ú»ã�ì\Üi×Rë�ç¼á�ì+ë�Þ:ßÅÝ�ì�Ûrݧà�íZÝ�à'çËã�á�ì>ÝK � È�K × Ë%Mµå'ã�ì>Ý ó ì>á Û�Üiö'ì�â½ß���5ç óËó à'Ý�ì5�/Ë�uxO � & 4.O × wauæô � ÈRô × w<�H�ZÙ¼ì�×�å:Þ5ßCé%ì<r¼×ËÜiÛ�ܽà»×RÝb}Ú�×¼ëoz5å

� Ë øhu�ÅÉÈ���wNË2K ��� K × � ô � �èô × úa`¦øt� úö u£øTu^Å È;��wNË:K � � ô � �èô × ú �

øhu�ÅÉÈ���wNË2K × � ô � �¢ô × ú!w�`�øb� úö u£øTu^Å È;��wNË:K � � ô � �èô × úa`�øb� ú3w �

u£øTu^Å È;��wNË:K × � ô � �èô × úa`�øb� ú3wö O � uæô � È�ô × w � O × uæô � È�ô × w

�¹àSØ Ý�ç óËó à'Ý�ì2�ÛË|O � u^ô � È�ô × w � O × u^ô � È�ô × w#�I�ZÙËì>×�å�Ú�'Ú»Üi×LÚ óËó â½ß5Üi×��!éµìar¼×ËÜ��Û�ܽà»×¼Ý;}?Ú»×¼ë�íËå� Ë u£øTu^Å È;��wNË:K � � ô � �èô × úa`�øb�� ú3w �

u£øTu^Å È;��wNË:K × � ô � �èô × úa`�øb�� ú3wö u£øTu^Å È;��wNË:K � � ô � �èô × ú �

øhu�ÅÉÈ���wNË2K × � ô � �¢ô × ú!w�`�øb�� úö øhu�ÅÉÈ���wNË2K � � K × � ô � �èô × úa`¦øt�� úö uwO � & 4.O × w<uæô � È�ô × w

154

Page 159: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

A Bb

d ca

2b

d ca2

(Ψ )Dominating Belief State(Ψ)Pedigreed Belief State

d b c a1 1

2 2

2

2

a b c d23 3

3

33

d ab c

S = {1, 2}A

A

1

1

1

1

2

2

1,2

db ca

S = {2, 3}B

B 2

2

2

23

3

3

3 3

a bc d3 c

abd2 c b d a1

Sources:

1,2

Ø�Ü��»ç¼ã�ì;Ã\���ZÙ¼ìµë�Ü�¥7ç¼Ý�ܽà»×¿à ó ì�ãrÚSÛ�à»ã(�ÍK � Ú»×¼ëiK � Ú»ã�ì�Û�ÙËìµÝ�ì�ÛrÝ�à�í"Ý�à»ç¼ã�á�ì>Ý�Û�ÙRÚSÛ�ܽ׼ë�çRá�ìÛ�ÙËì ó ì>ë�Ü��»ã�ì�ì+ë�Þ7ì�â½Ü½ì�í�ÝUÛrÚSÛ�ì>Ýhí�à»ã¹Ú�»ì>×'ÛrÝl$�Ú�×¼ë2" å�ã�ì>Ý ó ì>á Û�Üiö'ì�â½ß�

155

Page 160: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

�� ����� ��� ��� ���� � �� � ��� �� ������� ���������� � �� ����� � ��� ������ ��� ������

������������ � ���������� � �������� �������

������� ����������

��������� �� �������� �� ���������� � ��� �!

�������" �# $%&�'" ��#������������� �����

����� �����

����� �� �������� �� ��� �� (� ��� �)�*��� � ��� �!

+������� $,$�%" ����������������������

��������

-� ��� ��� ��� ���.��� ���*�� �� ���.����� � ����� � *� ��� �� �������� ������ *� ���� -� ������� ������" ���� � ����� �� ��� ����� � *� ���� /��! ����� �� �������� ��0 �� � �� �� �� ���!�� � � ���� ����� ��� -� ������� ����� �� ��� 1��� .���� � � ���� �� ���� ���� ��� � ��� ��� /��" �� ����� *� � ��!�� �������� ��� *� �� ����� �� � ���� .������ *! � ��� �� ������� �� ��! � ��������� �� �* �!� /� � �������� � � ���� ���#����2� ������ * �! /������ � ��� ����.���! ����� 3 �!" �� � � � � ��� ���.�����!.*���� �������� ��� ���* � ��� .������ � �� ��� �� ������ -� ���� ������ � �������� ��� �4�� ��� ��� ��*� �� ����� ���������" �������� �!" �� ����� �.� �!" ��" ����" � ��.*��� �� ��� ���.����" �� �� ����� *� � �������� ��! ��.���� � ��! �� ������ � ��� ����� � *� ��������

�������� ���������� � �� *� ���" ��� .���� �!�.����

� �������

-� ��� �������� ��� ��� .���� ���� � ���������� ��� ������ *! ������� �� ��! � � �� �� �� .�* �!" �� ����� ����� �� ����� �! ���* � ��� �*� �� ������� /� � ���� � �������� ����� ���*���5�,� 3 � � � ������� ��� ���������� � ��� ����.� � *� ���6 ��� �������� � � ����2� *� �� ����� *!�������� � ��� ������ � ���� ������ �������"������ � ��� ��� ���� � �� �* �! �� ����� �������6��" �&� ���* � ��� ������ � �� ��� �� ����� � ���� ���� � ��.*��� �� ���� ����� ��

/�� ��� � ��� �� ����� �! ��� ���� �7��� �!� �� ��� 4��� ���*�� �������� ��� ����7� ���������� � ����� � ���������� ������ ��� *� �������� ��� ,$89��� /�� ���� �� �������� ��� *�� ����� 1��� .���� � � ���� �� ��� �� �� ���� ���.��������� � ���� � ��*����� � �� ��� ��� �� ���� *� ������)��� ��" ����� ���� �� �� �� � �� �� �� *���������� � :����� �� ����� ��0 ��" �� �� � � �� �.� � �� *� ���� �� ��� ���" ��� �7����" � � ���.� � �� �� � ���*��� �� � ����� ��� � :����*����� �� � � �� �� � �� �� ���� ������*! �7.������� � ��������� ��� �" ��� ����� ��! ��! �������� �� ����� �� � �� � ��� ��� ����� �� *������� )��� ��" � ��� ����� ��� �����! � ��� � �� ������� � ��� �� �� � �� �" �� ���� *� � �� �� ��� ����*����� ����� � �� � �� ��� ����� ;��� *������ � ����������� �� � ������� � ��� ����� -� ������� ����������� � �� �� ��� � �� �� � � �7� � �� -���� ����� ���� ��� ���������� � �� �� ���� �� ���������� ����� �� ���*��� ��:���� *! ��� ��� �����������

/�� ����� ���*�� ��������� ��� � ���� ����������! �� �*��� ���* � ��� ������ � ���� ������ � ��� �� ������� �� ������ � *� �� ������ ���� ������ �� ����� �� �� ��� �� �� ��� *! ���� ��. �*� �������" !�� ��� ��� �� �*� ������� �� � ���� �� ��� � ���� ��<�� ������� �� � � �� . �� /���" ���� ���� � ���������� � ���� ��*� �� ��*� ��� � �� � ���� � ��� �� �*� �������� � ��� �� ��� ����� � �� �� � ���� �� �*� ������"*�� ���� � � *����� �� ������ ����� �� �� �� ��.���� ������� �� �� *����� ������ � � � � �*���=�!���.�� � �� �� ������ ����� ��� ��� � ���.� � �� �� � ���*�� ��� *� �� ������ ��� ������������ ���� ���.������" *�� ��� �� #����2� ������ * . �! /������ �#���� ,$9&� ��� ����� ��� ������� ���1�� �� �* �!� #� �� ��� ���" ��� ����� >�� ���.������� � ���� �� �� � ���� �� �� � � ��� ��

/� ��� ��� ��� �� �� ���*��" ��� ��� ��� ����.

Appendix J

156

Page 161: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

� �!�� � ����� �5 # ��*�� ����� � � �� � ����� ���� �� ���� � ��*�� �� ����� ��� � ��.���� � (���� ������ � �*��� ��� ������ �� �� �. ����� �� ���<�� (��� ����� ���� �� �����.� � ���� � ����� �� ������� �� ��! � ���� * �! ���������! �����" ���*! ���� ��� �� �7������ �� ��.�������� �� / �� ��� �� ��� � �.��< � ����� ����� ���� �" �� �� �� �� ��� ��� ��*�� �� �� � ���� � �� � ���� ����� ���� �� ������ � �� ��������� ���� � ��� � �� *� 4��� ���* �� *����� *�. � ��������� �� ��� ��*��� ������" ���� ����� ���� �� ���������� ������ � � ����! �� ��� ��*��� ?���! ���� �� � ������ ������ ���� � ��" � ��� ������� @�!� ��A *��� �� � ��� ��*��2� ����5 ��� ��*�� ���������� �� ������ � �� � ��� �� �� ��<����� *��� ��� � �� � �� � �� ����� �� ������ � ���� �� �! � � �� �� /� � � ��� *���� �������� ���� ���� ��*��� � �� ��� �������� � ���������� � ���� ���� ������� ����� ������ � ���� � . ��� ������ ��� ��� �� ���!���

� ���� � ����� �" ��� ��*�� ���� � ����� �� ������* � �� ������ ��� *� �� ������ �� ��� �� ���������� �! ��� � �� � :���� � ���� =���� ��" ���*� �� ����� ������ *! ��� ����� �� ����� *� �� .�� � �� ������� �� ��� ����� �� ���� ��� ��� -� � ����� *� ���� � ����� ���

/�� ����� � ���� >�� �� �����5 #���� ���� ���. � ��! ��4 � �� �� � � ����� � �� ��� ���������� �������� � ��<� ���� �� ��� � ��� ��" �� .������� ������" ���� � � ���� �� ��� �������� ������ >�� *� �� ������� -� ��� ����� *� ��� ���������� ��� *� �� ����� �� � ���� � � ��� *�. �� ������ �� �� ������ ������� ��� ����� ���������� ����! ���.�������� 3 �!" �� ����� *� � � ������.�����!.*���� �������� ��� ��� � ���� *� �� ���������� ��� �4�� ��� ��� ��*� �� ��� �� ���������"�������� �!" �� ����� �� �!" �� �� ����� *� ��������� ��! �:��� � ��! �� ������ � �� � *� ��������

� ���� � �� ��

-� *�� *! ��4 � �� ��� ��.<�� ������� �� ��* ��! ���� ���6 ���! � *� ����� �� �� ������������� ������

������� � ������ � �� �� ��� ��� ��� � B� ���� �� B�B� � ���� �� � � � � ��� ��� �� �� �� � �� � � ��� ��� �� ���� �� �� ���� ���

��� ���� ��� ���� ������� � � � ����� �� �� �������� �� � �� ����� � ��������

�� ��0�7 � �� � � � ��� � � B� � �� ���0�7 � ��� �� � ��� � � B�

�� �!����� � �� � � � � � � � ��� �� � � B� � ����!����� � �� � � � � � �� � ��� �� � � B� � �� �� .�!����� � �� � � � � � � �� � C � ����� � � B�

�� � ��� �� ��� � �� �� ��� �� ��� B ��� � � � � �� � � � ��� � ��� �� � � B�

�� ���� �� � � � � � � � ��� �� � � B�

�� ������ �� � � � � � � � � � � � ��� �� �� � � B�

�� ���� � � �� � � � � � � � � � � � ����� �� � � B�

�� 1��� .���� � � �� � � � �� ������ �� ���� ���

!� � ���� � � ������ �� �� ��� ��

��� B �� � � � � ��� � � � � �� � B�� C �� �� �� �� C � ��� ��" �� �� ����� �� � � B�

#� ��!� � �� ���� � � � � �� � B� �� �� �"$���� �� � �� ��� �� �� ��� �� %�� �� �� �� ������ �� ��

�&� ���� ���.����� �� � �� � � �� ���� ��� � �� ���� ����� �� � �� ��� � �$�'"" �� �

��� � �1� ���� ���� � �� � �� �()��� �'"" $�� � �� ���� ���

���������� �

�� �� ���� �� ����� �� "����� �� ��� ��"�������

�� *��' ���� �� �� ��� �� +���$ ���� ���

�� ,�� �#!�- *��' +���$ ���� �� �� ��� �� ' �� �

D � � ���� � � �� � ��� �� ������ �� �� � ��*����� ����� ������ ��" �� ���� ��� �� � �< ��� ��*���2�@*���A ������ � �� ������� �� ��� ���� �� -� ��4��� � ��� �� @*���A ������ �� *� ��� ��*���2� ��� � 5

������� � �� � �� �� ��� ��� ��� � B� �� � � � �� ������� �� � B� �� � ��� �� ��� �� � �� ������� �� � ��

����� C �� � 5� �� � � �� � �

���� �� ���� ������������ �� �� ������ �� ���������� � � ��������� � �� �� �� ����� � � � ���������������������� ���������������� ��������

157

Page 162: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

# ��� ��� ��� � �� �� �� ��� �� �� � ��! ��*��� � �.����! ��*��� �� 5

������� � . ��� �� ���� � ��� ��� � B �� ��� ��� � 5 �� � � � �� � � �� � � ��� � �����' � B�

?��" � ��! ��!� � ���� � ��4�� � ��� �� ���� �"�� �� �� ��� �� �� ���� ��*��� �� ��� �� ���5

���������� � ,�� �#!�- /��� �� ��� � ��� ��� � B� � ��� � ��� ��� � ���� ��� ��� ��� �� � �� ' �� ��

�� � ���� � � �� ��!� �" ������ � �� � �!����� �� � �� *� � ��(� *������ �� ���� ���������5

������� � /��� �� ��� ��� ��� � B� � �� � � �� ��0 �� ��� �� �� )�� ��� � � � � ��� ��� � � � � �� � B �� � � � C �� �� C � C �� �� C �� %���� � � B�

� �������� �� �� �� ��

-� ��� �������� *� �� �������� �" *�� ��� ���.�� �! � ���� ��! ���� �������� �������� � ���*�� ���� �� ��� � ��� �� �����!� /�� �������� � �� �� ���������� ������ ��� *� ���" �� ��� � ����� � �� � ��*���� � � ����� � �������� � ������.����6 ���� ��" �� �� ��� ���" ��� ������ ��� �1��!��� �� �� �������� � *� ���� � ��� ��� � ��� ������� �!" ��� ������� ���������� � �� � ����2����������� � � ���� ���.������ (��� ���� ���.������� � ��������� �� ����� * � ��� ���< ���������� ��� � ��� " �� ���� � �� � ���� ��� ���� �.����� � � �� *� �� ���� �� �������*� �� ������ ���� �� � �� � �� � �� �" ��� � � :���� *������ �� ��

���������!" #����2� ������ * �! /������ �#����,$9&� ������ ���� � �������� � �������� � �� �������.������ �7 ��� ��� ��! � ��� ���� � ��� ��� ����� ��*� ������� ��5

������� � 0 � 1 � ���� ��� ��� �� ��� � ����� � ��� � � � � �� �� � ����������� ��� $ ���'� ��� ��� � �� � �� ��� B� �� � � C ����� � � � �����

������ ���� ��� ��� �������� ������� �� � ��� �������� � � � ����� ���� ���� � ��� ���� ������ ����

�� � ������� �� � � ������ ������ � ��������� ��������� ������ � � � ����� ������ ���������

� ����� ���� ����� �� ��� �� � �� � � �� � � ��$����� ��� B�

� ������ ���� ���� � �� ��"�� �� � �� � � �� �$ ���� �� � � ��$����� ��� B�

� ������ �� � ��� �� � �� � ��� �� � �� � � ��

� ��������� �� ����� �� #����� �� ���#�������� �� C ������ � � � ��

���� ��� ��� �� � � B�

� �� � �� � ��� � ��� �� � �� � � � �� � �� ��

� ?�.� �������� �� ��� �� �� ��������� �� � � � ��� ��' ��� �� � ��"�� �� � �� ��'�� � � B� � �� � �"���� � � ��

���������� � ,.���% �#��- ��� �� �� ���� ������ �� � � ���� �� �� � ���� ���� �� � ��$"��� ,%2- 3� � ���� ���� ��������� �� ��$���� � �� ���� �� ����� �������

/� � ����� * �! ������� �� ����������� �� ��< ������<� �� �� #����2� ��������< ���� ���� � ����. �� ��� ������ E� ��� �� ���<� ��� ����� ���� ������� � �" ��1� � � ���� ��� ����� �� � �������� ��! ��� ��! ���� �! �� 1��� .���� � �! ������ ������ �� ���� � �! �� � ���� ���.������ /� � ���<.� � ��� ��� �� �� �������� ��� �7 ����� �� ��������� � ���� � ��� ��! � ��� ����� ��� � ��"�� � �� ������ � ���� �� ���� ��4�� ��� �� ���.� �� ��� ,$89�� )��� ��" �� � ���� � ��� �� � ��.��� �� �� ���*����

3 ���" ����" 1��� .���� � � ���� �� �� � ���� ����.���! ����� ��� �� � � ���� �� 1��� .���� � � *���� � ���� ���.�����" �� � :����� ���� � � ������ � �5

���������� � 0 � 1 �� ��� ��� ��� � B �� � � 1 � � �'"" �� �� �� ��� ,���� � � �

�� � � � �� � � �-� �� � �� � � �� +���$ ���� ��1� �� ���� ��� �� � �� �� ���� ���

/���� ��� *�� ���� � ����� � �� �� ������� �� �� � :����� ����� *� ���� � �6 ��! ����� ������ � :����� ����� *� ���� � �� �� ��* �;�!����� �� ������ �1��! �� ��� �;�!� �������� ������� �1��!" �� ���� ������ ���� ��� ����;�!� ���� �� ������� �1��!� �� ����� ���� ����1��� .���� � � ���� �� ���� ��� �� ���� ���.���������� *� ��������� ��� ! �� ��������� �� � :��.����

� �� ��� �7 ����� �� � ��� �� ���� � � �����! ���.4� �� ��� ���� �� ��� � ��� �� ���*���" �� � ������� �� ���� ����*�� )��� ��" ������� �������.� �" ��� ����� �� ��� �������� � ���� �� �! *� ��.�*� ��� ��< � ��� � ��" *�� ���� *� ��������*� ��

158

Page 163: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

� �� ��������� ���� � ���� ��! *� � �� ������������ ��6 ����1���!" � ���� �� �� ��� ��.��� ���

�����!" ��� ���� �! ������� � � �7���� �! ����� �.� � ��� �������� � ��������� ����������� � �����"� * ��! ���� � � �� �7����� ���� ���� *� ���� �.�� �� *����� � �� � �� ������ �� � �� �5 � � � ��� �� �" � � � �� � �� �" � � � �� � � �" �� � �� �

�� � �� �� /��� �! ������� �� � ��� �� ��� 4��� ������� ��" ���� ��� ��������� � �� ���� �� �� �����.��� � ���< ���������" ��������� �� ��� ��� ��� ������� �� �� � �� �" �� � :������ )��� ��" ��.� ��� ��� � ���� � ����� � ����� � ��! � �� ������*����� � ��� � �� � �� � ���������" *�� ����� ��! ������� ��� � ���� �� �� � ����" ������� �������� ��� ��! ������� �� � �� ��� �� /�� �����2��� �� ��� ��0 ��" � � ���� � ���� ���� �� 4� �� �! �� ��� ����� ���� � ������� ��� /���" ������� �! ������� � � ����� �! � ������� � ������0 ��� �� �� �7 ��� /� �" �� ��! �����" � �����.�� ��� � �� ��� �� �������� ���������� �� �� �����*�� ��� ������� �� / ���<! ,$�$� ��� ������.� � �������� ���� � ���� � ���� ��* �����)��� ��" ��� ������� � � ������� ��� � �� ��� ���������� ��������� ���������� � �� � ���� � ����� ����� ! �� � � :������ �� �� ��

� ������ ��� ��� �� ������

F�� �� ��� �� ��� ���� �� *� �� �������� �� #���� ���.����� � �� ��� ��� �� ���� *� ����� � ��� �! ��.�������� ���������� � ��� � *� �� ����� ��� *� �� �� � � ����� �! �D�� � ,$886 �������� =���>� ,$$,6 F���� �� =�� ��� ,$$�6DG�������� �� =�< �� ,$$%�� ������ �� ������.���" ���� �� �������� ���� � <� ����" ����� �� � :�����" �1�� <� ����� 3�� ��� ���� ��� �� ��������" ������ �� ��� � � ���� ������ � � �� ���� ����� � ���� � �C ��� �� F�� � *� � 4 ��" �.����! ��� �� ���� *� ����� � ��������� ��� � �� ��������� � � � ���� ���.����� � � � /�� *� �� ��. � � �������� �� �� � ���� ��� ��� � �� *� ��@ � � ��� �A ������ � �� � ��� ������� �� ���� � � ��� ����� ��� ��� �� ��� �� ����� ��� ��! � �

��� ��� ��! �6 �� �� �� �����H��� /�� � ���2� �.��� � �� *� ��� ��� � ����� ����� � � ��� ����������� �� � ���� ��� *� �� �H� �� �� ���� � ��� ��� *� �� �����" � � �� � �� *� ���� � � �� ��������� �H�" �� ��� �����H���

�� ����� ���� �� � ����� �� ���� *� �� �������� � � �����! � � �� �� ��������� �������� � ��" �� ������" � ��� ������� *� �� ��� ���*��� ����� *�� ��� ��� ��� ���� �� -� ������� � ���� � �� �����

���*��� �� �� ����� >�� ��� ���� ���.����� �����.����� � �� �� �� ������� ������ � �*��� ��0 ����

��� ������� �������!� ������

-� ��<� ��� �� <� ���� �� �� � � �� � �� ��� �� <�. ���� � �� ������� ! ����" � � ���� *� �� �����.��� ����� � �� �� ��0 �� � �� �� ��� �������������� /� � ��� �� �� ���� ���� ���� �� ���� ��.�����" *�� ��� � � �� �� ����� �� ����� ������ ,$$�"�� ,$� ��� � �������� �������� � *��� � :��.��� �� �������* �!� � <� �����" ������ ������ � ��!����� � ���� � �� �������� ��� �� <� .���� �����" ��� ��� �� ��� � �� � ���< <� ���� ���.� ��" �� ����� ��� ��� ����� �� � ��� � � �� ���.��� �!�

-� �����! ��4� �����4� 1��� � �5

������� " . ����� >�� *� �� ����� � �� "��$���� ���� �� �� ��� ��� �� �� � �� �����1������4� 1��� � � ��� � �� ��� � ��

-� ������� � � � �� ��� @����� � ����� �� ��.� ��� � �� ��� ��! ���� <�! ��� ��A -� ���������1�� <� ����" �� �� �� ��� ����� �� �� @����� .� ��"A � �� ��� ���� ��� � � ��4�� ���� ���� � � �

� �� �! � � �� � �� � �� �� -� ��4� ��� ��0 ������ � ��������� � �� �" ������ �" �� ���� ���

: � � � �� � � �� �� ����� *�� � ���� �� ����� �������� ������ �� ��� ��� � ���� �� � �� � �� ����� ����� ��! ���� <�! ��� ��� ������ � ����" �� ����� ! ����< ���� � ���� ��! ��������� ��0 ��� �*� �� ����� ��� ���� �� ��4 � � %�

3�� �� � ���" �� � ����� �� ����� >�� *� �������� � ��! �� *� �� ������ ��� ��� ���� ��� �� �������� �7���� ��� �� �� �� ���� ����� ����� ��

��� ���#�����

F�� �� ��� ��� ��! ��� ��� �� �� ���������� � �;��� 4��� 3 ���" �� ����� � �� ��� ��� � ��� �� ���.�� �! ���� ��� �� <� ���� ����� *� ���� � ��

#� �� � ������� ��� ��� ��� ���� �" ����� � ��.�� � ����� � ����� ��! ����� � ��I � :���������� �� *� ���� � �6 �� ��� ����� �� � ���)��� ��" ���� � �! �� ��� �� <� ���� *! ���� ������ �������� ���� � �! �� ����� � ��� # � �.�� �7���� � ��� ���� �5 �C ���� �� " �� �����C ���� ��� ��� �� � )��� ��" � �� *�! ���� ��� �� <�. ���� ����� *� ���� � �" ��� ����� � �� � ���� .� � ��� ��! ��� ��� �� <� ���� � ��� ������5

���������� � ������ �� ��� � �� ���� �� ��

159

Page 164: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

� �� � ����������� ���� � ��" �� ���� ��� ��� ���� �� �� � �� "������

� ������!" ���� � �! �� ������ �! ��� �������! � ��� �� <� ���� �� ����� � �� ��� *��� ��1� ����� *� ���� � ��

-� ����� �� � ��� ���� ��0 ��� ��� ��� ���� � � ��� ��������<� #� 4��� ����" �� � ��! ���������� ��*�5 � � �� ��! ���� *� ��� � ����� �� � �.����� � ��� ���� � <� ���� �� ����� � �� �" ��� �� �" !�� ����� ���� � � ���� <�! ��� �� )��.� ��" �� ��� ���� �� � ���� � �! ����� ���� ����!��.*���� ��4 � � �� ��0 ��� ���4 � � %�" ������ ��� *� �� ����� ���������� �� �� � �� ���� ������� ���� �� ��� �� �! ������� � �� ��0 ��� ������ �� ���� � ��� � ���������� � �� �� �� �� �����.� ��" *�� ����� ���� �� *� ������ ���� � ��� �� �� ��� ���� �� �� �� ���

?��" �� ����� ���� ������" ���� � � ���� �� ������ �� �� ������� ���� � <� ����" ����� � ��"�� ��0 ��� ���� � ����� �� ������ � �������"�� 4��� �� � ��� ���� ��� � ���0�7 �! ���� � ��� ��� ���� �� ���� �� ���� ��� ��� �� ��� �� �� �������.������" ���" ��0 ��.����� F�� � *� ��� ��� �� �������.������ � �� � " ��" ��� ��� �� ��� � ��� �� ��� ���

���������� " �� � �� ���()�� �� ���� �� � �����"����� � � ��� �� � � +��� ���

�����!" ��� ���� � ���������� � ������� ��������� ���� *� �� ����� ���� � �� ��� ���� *� ����� �� ���� �� ����� � ���� � �1��! <�! �� � ����.� �! � �� � ��0 ��" �� ����! ������ ���������6 ����� � �� �� ���� �� � ��� ���� ���� � ������ ����� �� �� ��� �����

���������� $ �� � �� �� �� �� � ���% C ���� � � � ���� �� � �� � � �

�� 5�� ��' � ��� �� � ��� � �C � �"���� �

�� � � ��

�� *��' �� �� � �� ����' ��� � ,� � �� ��� ������ ���- �� ����' ��� ��� � ,� �� �� ��� ������ ���-�

3 ���� , ����� ����� �7����� �� *� �� ������5 ���� �� � � ���� ���.�����" �� �� �� � ��� ��� �� ��.� � �� � ���� ���.�����" �� �� �� �� � � �����

/���" ����� >�� *� �� ������ ��� �� � * � ��������� ��� ��� �� ��� �� �� ���� ���.������� /��!����! ����� >� ����� *! ���<� � ��� ������� ����� ���� �� ����� �� ��� ��! ������� ��� �1��! <�!" ��� � ��� ��� ���� * �! �� ��0 ���� ?��

(c)

P P P

P P

(b)(a)

P

3 ���� ,5 /���� �7����� �� ����� >�� *� �� ������5��� � ���� ���.�����" �*� ��� ��� �� ��� � �� � �������.�����" ��� � ����� �(��� � ��� ��������� � �������� � �� �� ��� ��! ��� ������ � ��� # ���*����� � ���� � ����� ���� � � �� ��� � ��! � ��� ���� � ��� �� �� ��� �� � ���6 � ��� � .����� ���� � �� �� ��� ���� �� ����� �� ��� � ���� ����"��� ��� �� ����� ���������� *! � � ��� � ��! ��.����� � ����� � � ��� ���� ��� � ��� �� ����" ��!� �������� ������ ����

�� �� � �� �� �� *����� ����� � �� ��0 �� ���� � �� *� ���� # *� �� ����� � � ����� � �*������ � �� *� �� �H� � ���" �����H��� � ��� ��� �� ����� ����� ��� ��! � � ���� � *��� ����� �� �� ���. ��! � �� �� �� � ��! � ��������� �� � ��0 ���*��� �� � *� ��" �� ��� �����H��" � ��� ��� �� ��� ���! ��������

3 �!" �� ������� ��� ���������� �� ����� �� �����4 � �� �� ����� � ������� ��� ��� ��� ���� ��3 ���" � ��*����� ��� ���� �� ���� ���.������5

���������� & � � � �� �� � � �� �()�� ��$ ���� �� ��

�����!" � � ���� ��*����� �� � ��*����� *! ������ �� ����" 1��� .���� � � ���� ��" �� ��� ���.���� � �� ��� ��� ������ � � � F�� � *� ��� ��� ������" 1��� .���� � � ���� �� � �� � " �� ��" ������ �� ��� � ��� �� ��� ���

���������� '

�� � � C � �

�� � �� ��

�� � �� � �� � �� �� �� �"� ��

�� � � � �� � �� �� �� %� �"� ��

J������ ������" ���� � � ���� �� �������� ��� ������������" � � ���*�*! �� ��� �� ������� ���� ����� ���� �� ��� �� ��� �� �� ����" 1��� .���� � � ��.�� ��� #�� " � ���� ���� ��*����� ��� �����" *���� � � �� ��� ������� � � ��5

160

Page 165: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

���������� �(

�� �� � C ���

�� � �� ���

�� �� �� � �� � �� �� �� �"� ��

�� �� � � �� � �� �� �� %� �"� ��

� ��� �7� ���� �" �� ��4� � ����� �������� ��� �! *���� � �� � �� ���������� � ���� ��� ������ ����� �� �� �*�!� ������� ���! ��� 4�� ��.� �� �� #����2� ��� � ���

� � �������� ��� �� �����

�������

������� � ���� � ������ *! � ��� �� �������" ����� �� �� � ��� *� �� ������ ������� ������� ������� ���� ��� ��<�� ��� ������� *! � � �� ���� * �!�-� ������� � �������� ��� �������� � ��� ����2�*� �� ����� � *! �������� � ��� *� �� ������ �� ���������� � �� � ������ � ��� ��� ���� * �! ��<. � �� ��� ��������

)*����� � � %��� �� ������� )"�� ���" ����� ��1� ��"�� � ��� ������ �� �� ��� ��� ������� ����� �� ��1� ���� � � � � �" �� �" �' � � ��� 1' � �� �� � � ���� �� � ��� ���� �� ��1 2 � � � �� 1��� �$ ���� . ��" ���� � ���� �� %� � � � �$" � ��1 2 �'� "� �� � ���� �+�� ��� �����$" ��� � � �� �� � � � ��� �� �� � �����%� ��� 1' ����� �� � ��1 2 �'� " �� 1' ������� �� � � � ���� �'� "� �� � ���"� �� � %���� �� ��� � ��� � � �� � � �� �$��� � .� � �� ������ �� �� 1� �� ������� 1� � �"�� � ������� � ��1 2 �'� " �� ������� �� ����� � � �����%��� �� )�� �� ���$� � ��� 1����� �� ���� � ���� � 1 2 � ���1� �

�� ��� � �"�� � �����""� � ������ ���1 2 �����"� 1���� �� ���� ���� �� ��%���� %� � �� ��� �� �� "�� �� 1� ������� ���1�"� 6�%��� �� �"� � � ���� �����" �� ����� � ���1�" ���� ���$�� ������ � �� � �������

�� ��� � "��� ��� � �" �' ��������� �����$ �� �' �� �� $� � �����" ��� � � ��$1 2 �'� " �� %��2���� �� %� ��� ��� 1' ������ %�� ���� �� � �'� " � ��������� ���� ��� ����� �� �� �� �� %� %����

���� �� �� %� � ������ �� � ��1 2�'� " �����

�� ��� � ��� �� %��2��� �� � ��1 2 �'�$ "� 2��%� � � ��1 2 �'� " ����� 1� ����7 2��% %� �� �� %� � $�������8� 1��� �"���� %� � � � ���� �'� "� �� ����� ��1� � �� �� %� �� � � � �����'� " %���� �� ������� �� � ��1 2 �'�$ " �� �� �����

0 � �� � 1 ������� ���� ���1�� ����� ��� � � ��1 2 �� � � ���� �'� "�� ��� $ ���'� � �2'� �� 1��� � � ��� � �� ���� �� ���%� �� 5���� ��

t

FD

FD

FD

FD DF

FD

F

F

s ssp m

3 ���� �5 /�� *� �� ������ �� ��" ��" �� �� (7��.�� ,�

F�� �� *�� ��� ����� �� ������ *! ��4 ��������5

������� $ ! �� ��� � �� �������� �� � ����� � � ! �� ��� � � 1��� � �� ��

-� ����� ��� ����� � �� �� ��0 �� ���� �� �� ������� � *! "� �� !�" ������� �!� �� � ���� *� �������� ���� ��� *� �� ����� �� � ������ � ��0 �� ����" ���" ��!� �� )��� ��" �� � � �� �������! � �� ���������� �� ��:�� ���� ��� ���� ����! �� @*� ���� *����� ���� * � ���A

-� ������ ���� ��� ����2� ���� * �! ��< � � ����� ������� � � ���� ���.�����5

������� & # �� � ��' ����� ��� � �� ��<��

������� ' ��2 5 ! � # ������ � � ���� ��2�

������� �( $ �� � � � ��$���� ��� ! ��$�� � 1' � ������� ��� #� �� ��� � $ �� ��

161

Page 166: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

���"��� % ���"����9 % �' �� �� �� ���� *� �� �� $�

�� � �� �� ��� �� $ � � � !�

-� ��� � �� & �� ����� ��� ��!����� � �� �!�.���� � ����� �� �� �� $" ������� �!�� /�� 4 ������� ! �#� ������ ���� � ��7 �� ������ ���<� �.��!� �7 ���" �� �� � �������! ��� ���� �� ��� �������-��<�� ������� �� ��� ���� *�" *�� �� ��� �� �� ���������� ! ���� ��� � ��� � ����� ��

-� ��� ����! �� ��� ��� ��� ������ �������� � ���*.��� � ��� ���� �" ������ � ���� � ������ *!� ��� �� ������� � � !� -� ��< �� ��� ���� � �����K�1��.��<�� �� ��� ��!.��<�� ������ �������� �K*����� ��� ��� � ��� ����� �����

��� )+������,�� ����#�� �--��-����

������� � ��� ������� �� � ��� ���� ��< �� ����$� � ��! �������� ��� � �!" �� ��� ��<� ��:���� �� �� ��� ���!" �� �� ��<� ��� � � �� ������� ��5

������� �� �� � � !� �� #���� �� � �� ������� ��

J! � ��! ��< � ��� � � �� ��� ������ *� �� ������"�� ��! ��� ���� � �!� )��� ��" �� �� �� ��� ���.��� �!5

���������� �� �� � � !� �� #���� �� "����� 1� �� � �����' ���� ���

/���" �� <�� ���� ������ � � , ���� �� ��� �!��<� ��� ���� � � ������ �� #���� �� ��� � *� �������5

������� �� �� � � !� �� �$%#���� �� � �$� ��� #������

���������� �� �� � � !� �� �$%#���� � ��

?�� ����� � �!" *! ��< � � �� �� �� � ���������� ���!" �� ��! ������� ��! ��0 ���" �� �������� ��! ������� ��*���� �� � �

)*����� � ������ �� �� ���� � �� � �� ��1� � ���� �� *)"�� � � ������� +���' ���1�� �� � ���� 1��� � %��� 1 � ����' ��� � �� ��� ���� ��� � �� � ��(� ���� ��' 1����

����� � �� ����� � � ������� ����������� ����������� � �� � �� ��� � � � ������ ��� � � ��������� ����������� �� ��������

��� .���#�����,�� ����#�� �--��-����

?�7�" ��� ��� ��� ���� ����� ��� ������� ��� ��� ��!��<��" ���" $� � � ���� ������ -� ��4� � ������������ ���� ����.��<�� ������� ��4� ��� *� �� �������� � ���� ��<�� �������� /��� �" ������ � �������� � �� � �� � �� �����" ��� �� �� �� � ����.��<�� ������� �����! � ��� �� ����� �� ����.��<���������" �� ����.��<�� ������� ��� ������� ���� ����.��<�� ������� ��� ����� �5

������� �� �� � � !� �� �$%%���� �� � �$� ����

��� �� 5 � � �� � � � ����� � � � �� � "�� �

���

/�� ��4 � � �� ��� �$%%� �������� ���� �� ��!� $� *� � � ���� �����" �� �� � ��� � �� ����� ����� ���� � ��� ���� � ��*.���� �� )��.� ��" ��� ���� ���� $� � � ���� �����" ��� ����� �����! � �$%%� � ��������� �� *� � *� �� ������

���������� �� �� � � ! �� $� �� � � ����� �� �$%%���� � ��

)*����� � ������� �� � �� ��1� � ���� ��*)"�� �� � ��� �� �� ������� "�� ��$�1� �� � "��� %��� �� ���� �� �������"�� ���1� �� � �����""�� �� ���� 1$��� � � ���%� �� 5���� �� �����"� � ��1� ��$� �' � � ��1 2 �'� " �� ����� 1� � � �������7 %���' 1�� � ������ ���1�" �� ������2� ������ � �

FD

FD

FD

FD

3 ���� &5 /�� *� �� ����� ����� �������� � (7��.�� & ��� �� � �� � ���

?��� ���� �� � ���� �� ��� ��!.��<�� ������� � �.���� �7���! ���� ��� ����� �=�!���.�� � �� �������� �����" �7���� ���� ��� ������� ��� �� �*� ����� ��� ��0 ��� *� �� ������� # ����� � � ��������! ���� � ���� ������� #D= *� �� �� � � �#.������L� �� ,$8'� �� *� ������ �� ��� �������.

162

Page 167: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

� � �� ��� �������" ��� ������ �� ��� ������"����� ��� ������ � ��� ����� ���� ���� *� ������ �������

��� /����� �--��-����

� ��� ����� ����" �� ��! �� � �� ��� ��<� ���.������� �� ��� �� ������� �� ���� ��<� �� � *� ������ � �� 4��� ��� ��� ��� ���� � ����. �! ����� ������� ��������" �$%�5 3 ��� ���.* � �1� .��< ������� �� � �$%#�" ��� ������������ ��� ��!.��<�� ������ �� � ���� � ����� �!�$%%� 5

������� �� 0 � � !� 5�� �' � � #�� C �$%#���� � � 5 ���"��� C � � �� "� � � ����������� ���� � ��" �� ���� .���� � ���"��� C �� � # 5 � � �� ���"��� C � � �$%������ � �� ����

��� �� 5� � #� � ������ & � � ���"����� � "� ��

�$%� ���� ��4�� � �� � ���� *� �� �����5

���������� �� �� � � !� �� �$%���� � ��

���������!" � ���*�� � �� �� � @� ��.��.��1���A �������� � � ������� ��� ����� �� ��.������ � � ������� �� ����� � ������ �� *�.���� ��� � ��� ������� �� � :���� ��<�� ��.��1���!" �� �� ���� � � ����! ��� � ��� �����! �� �� � � � ���� �:��� � ��� 4� �������.� � ����� *! ������ � �����0���� �� �� ��� ���� ������ ��� �1� .��< �������� � ����" �� ��� ��.�� � �7���� �����5

)*����� � 0 � C ��� �� � � ������ � � !�� � � � C ���� ��� �� %� � 1��� � ���C ���� ��� ��� �� �� ��C��C ���� ��� ��� �� ��� %�� �� � �� & ��� ��� �$%���� ������ ��� ��� ��� ��� ��� ��� ��� ��� ��� ��� ��� ��� �� � .������ � � ���� � ��� � �� �� ' ��� �� �� ��� ��� �� � ���� 1 �� �� � ���� �� ����� �� � ��%� ��2 ��������� �������� ,��� �� �� ��� ��-%�� � ���' � �������� �� � ��� ���� �

J������ �� ����� ���� ��� �:����" �� ������� �������������� � �������� �� �� � ���� ��� �� � ���*��*! ���! � ��4���� ��� ��4�� ��4 � � ,&��� ��� ��� �� ������ *� �� ������ *����� ��� � ���� �� � ������5

������� �� �� ��<.*���� �������� � �� � ������ � � � ! �� �$%��� C �$%%������

(������ �!" �$% ������� � � � *� �� �����5

���������� �� �� � � !� �� �$%��� � ��

)*����� � ������� �� � �� ��1� � ���� ��*)"�� �� � ��� �� �� � ��� ������� "�� ��$�1� �� � "��� �� � �����""�� 1� �� � %� � ������� +���' ���1�� �� ���$� 1��� � � ���%� �� 5���� �� � ��� ���� � ��1� � ��� �����" ��� 1�� � � �� � �'� "��� ��1� ��� ���� ��� �� �� ���� � ���� ��" �����"� ��� %� �� �� �� �� %������ 1� � ������ �� � ��1 2 �'� " %�%��2����

F

FD

FD

3 ���� %5 /�� *� �� ����� ����� �������� � (7��.�� ' ��� �� � �� & ���

-� �*��� � ���� �$%" ��� ��� �� �� ��� ��� ��������� (7���� %" ���� ���� *!���� ��� ���*������� *�� �*� � �� �7������� �� � ������� �5

)*����� " .���" �� �� �� $ � � �� *)"$�� �� �$%��� C ���� ��� ��� �� �

-� ��� �*��� � ���� �$% *��� �� �� ��� ���� ������ ��2 � ��� �����" ����� � �� �$%#� ��� �������� �� � �1�� ��<" �� �� �$%%� ��� ���������� ��� ����! ��<��5

���������� �" ������ � � !�

�� �� $� �� ����' ��� �� �$%��� C �$%#�����

�� �� $� �� � � ����� �$%��� C �$%%�����

��� 0���� ��!������

3 �!" � ����� ������� �� �� �� �$% � ���� � ��� �4�� ������� ��� ��� 4��� �� �� #����2� ��� .� ��� F�� � *� � �������� �� �� ���������� ��� *� ���������� " � � � " �� � ��� �� � ������� ��� � � � � �� � ! "������� �!" �� �� � C ���� � � � � � ���� -� ��.� ��� ���� ��� � � ��������!�

163

Page 168: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

������#��� ��-� /�� ������ �� ��� �������� ����� � � *� � ������" ���� � � *� �� ����������� ��� � ���� ���.������

������� �" ���� 4��� ����� ���� ����� ����� �� � �� ��

1������#��� ����� � � ��!" ��� ��� �� ����������� � ���� � � *� ������" ���� � � *� �������� �� ������� ������ ��� ���� ���.�������

������� �$ ���� 4��� ������ ���� ���� � 5�� � � �� � 1 �' ""1� �� ��

������ ���#���� D���� >�� *� �� ������ �����!�������� ��� �� <� ����� ����1���!" �� ��� �������� ��� �� ������ ���� �� �� ��� �������� ����� � ���� �� ��� � ��� �� ��� �� �� ��4� ��������� �� � ��� E* ���!" *������ �� ��� ��� ��� ������� � �� ��0 ���" �$% � �� ��� ��! ��� �� �. � ����� ������ �� � �� �� �� ����� �! ���������� � � ������� �� � � ���0 ���� *� �� ���� ������ � ��� ��! ���� <�! ��� ������" �� � ������� *� ���� �� ��� ���������� *� �� ������ ?� ������� � � � ������� ! ������� ��� ��� ������

������� �& ���� 4��� ������ �� � ��� ��� �� � ��� �� � �� � � ��

������#� �2 ������!�� ��������!�� ��.0 ��� ��� ��4�� ����� �� �!���" �� ������� !* ��!� J! ��� � ��� �7 ����� �� ��0 ���" �� ��.���� �! �� � ���� � ���� *� ��� ���� �� ����� ���:��� ��� ���� � *����� � �� � �� �����" >�" *! � � ���� � �!��� #� � �����" �� ��� �����<� ��# �� ��! ���� ��� ���� � *����� ���������� *� ������� �� ����� ����� ����� ���������� ����� ��� ���� ��0 ���

������� �' ���� 4��� ��������� �� ����� ��#����� �� ���#�� ������ ���� � � � � �

�� � ! �� � �

�� & ��� ��� �� � �� ��C ����� � � � � � ����� ��� ���

�� � � �� � �� � �� � ��� � ��� �� � ����� �� ������ �� � � � �� � �� ��

3����#��������� #� � �� ��� ������ �� � �� ���. � �" �� ��� ��� ����� ��� �� ������ ���� ���� ��4� �.� �������� � � �� *� �� ������ �����.��� ��� �� <� ����� 3��� �� � �������� �" ��� ���.� � ��1� ��� ���� ������ ������� �� ��� � �������< *� @� �������A ��� ���� ��� ����� *! #�.���� )��� ��" ��� ���� � �� � �! ��� ����� *! #�.��� ��� �� ����� � � ���� ��� ��<�� �1��!�/���" �� ��<� �� � �7� � � ��� �� ��4 � � ��

�.� �������� � *! ��� � ��� ���.��� � � ���� �������� *� �� �1�� ��<� ?��" �$% ������ � ��� ���1� .��< ������� �1��! *! ��< � � ��� � �� ����� ���!" �� ��� �� �� �� ������ � ��0 ���� ��" .�� � �!" ����� ��� � � �������� )��� ��" *������ #�.��� � � �� ������ ��� ��0 ��� � � ������� �" ���� ������� � *� @� �������A *! � � ��4 � �� -���� �� ��� �! ��� ��4 � � �� �.� �������� � ����! ���� � ������ �� ���!� ���� �� �� �������� ����� ���� � �� *� � ���������

������� �( ���� 4��� ?�.� �������� �� ���� & �� ��� �� � �� �� �� �� �� �� � � � �����' �"1�� ��� �� ���� 1��� � � �� ��'�� � � �� � �� � �� � ��� � �"���� � � � ��� �� ��

-� �� ���� ���� �$% ���� ��� �4�� ����� ��� .� ��5

���������� �$ 0 � C ���� � � � � �� � ! ���$% �

�� � � � � � ��� C �$%���� �$% � ���� , �"����� ������� ��- �� �� � ���� ���� �� ���"��� 3� � ���� ���� ��.� �� ���$�� �������

� ��� ����� ���

�� ���" �� �� � �! ��� ����� ��� ���� ����� � � .�� ���� ���� �������� �� ������ ��� *� �� �������� ������ *! � ��� �� �������� =�� .���� ��.� � � ��� ������� �� �������� � ��� *� �� ������ �� ���� �� �����" ���� � �� �� ������� � ��� �� �������������� -� ������� �� ����� >� �� � ���� ��

# ���� � � �����"� 1' � ��� �� ������� � � ! �#��� �2� ���� � 1��� � � ��� *� �� ����������� *! �������� � ��� *� �� ������ �� �� ���.��� �������" ���" �$%���� #����� ��� ��� �� ������� ���� ����� ��� ���" ���" ����1���!" $��� -���4� ��� ��� � �� �� � ��� �� *� � ���� ������*! ��� ���* �� � �� ������ �������5

������� �� 0 ' C ���� � � � � �� 1 � ���� � �� � � � �� �� �� �����"� 1' �� � !��� ��� � �� '� %�� � (� �'�� �� � �� �����"�1' � C

���� ���

��� ����� ���� �!���� � � �������� �� ���� ��� ��"���� ������� ��� �� ���� � � � � ���� �� � ������"�� ����� ��� #����� � � ������ ������!�� �� ���� ������������ ������� �� ����� ������� �� ���� � �"�� �������� ������� ���� ���� �� ������� ��������$������� � � ������ ��� ��� �!����� �������� �����"��� ���� ��#��� ���� �� ������� � ���� �� ������� ��� ��� �������� ������� ���� � �� ���� ��� ��� �� �����

164

Page 169: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

?�� ����� � �! � � �� ���.������� � ��4 � �" ��.� � � ��������" �������� �" �� ����� �� ��/���� ������� �� �������� ��� �� ��� ��1� ��� ��� .���� *� �� �������� � ��� ��� �� ���� �� �������� ��*�� ���� �

� ��� ��� .���� ����� ��*�� ����� � ����� *�� ���� � ," �� �! �� � � � ���� ��� ��� ��� *� �������� ���� ����� ���� ��� �� -� ��� �! �������� ��� *� �� ������ �� ��� �� � � ������� �� ��� ���� ��� ��� ����� *� �� ����� �� ��0��� �� ������� ����!� # �* ��� 1���� � � ������� � � ���� .*� �� ������� ��� *� �� ����� ����� *! ��� �����2��� � ���! ���� ��� � � � *� �� ������" ���� �"� ����� �� � �� �������� ��� *� �� ������ �� ��� � ������ �������� /� � � � ��! ��� ��*� *������ ����� �7���� �� ���� �K��" �� ��� ���� �� ��� �������*�� �7����" ����� �� �K� ������ *� �� ������6�� ���� <� �� �������� ���� ����2� <������ ���������! �� ���� *��

� ����" �� �� �� �� � � � ������� �� � �1�� ��<�-� � ��! ��<� ��� ���� � � ������ �� ��� � � ����� �����2 *� �� ������5

���������� �& 0 ' �� � 1 � �� :��� ��� ������ � �� ��7� ���� � 1��� � � �� $�� ����' ��$

� �� �� � C(� �'�� ����

�������

���� �7� ��$

�� � 1��� � �

���������!" ��� �1�� ��< ���� � ���� �� �� ���� � ������� �� � :���� ��<�" �� �����! ����������� ��� ����� *� �� ����� ����� ��� � �� ��! ��� ���� *� �� ������ *����� ��� �" �� ��� ��.�� � � ��� �7���� �����������5

)*����� $ 0 � C ��� � � ������ %� �� � ��

�� �� � �����"� 1' ���� � �� %� � 1��� � ��C ���� �� �� �� %� � 1��� � ��C ���� �� ���� ���'� ��7� 1��� � �� � �" � ��7� ����7� �� � �" � ��7�� �� �� � ��� �� � 1���� ���� � 1' (� ���� ��� ��

�� � %��� �� �� � ��� �� � �� �� � ����� ;�� 2��%��� � 1��� � ��� � ���� �� � �� �� ��Æ �� ��� �"�� ��� ����� � 1��� � � � �� "�� �����" ��� 1�� � ������� ���� ��

)��� ��" � ������� ��� ����! ���.������� *! ���� .* �!" �� �� �� �� ���� *����� ��� ���� � � ����� � � �������� �� � ����� �� ����� ��� ���� �� ��� �$%%���� ��� ��< �� ��� � �����.��<�� ������������� � �� -� ��4� ������ 1��� � � �� ���� �� *� �� ������ � �� �� � ��� � �� ������ �5

������� �� 0 � 1 � �� �����"� 1' � �� ���� � � � !� �7� ��� ����� *� �� ����� �� ���

��� �� %�� �C �$%%���� �� � 5�� # �� � � ����� ��� C ��7����"��� 5 � � �� � � � � � �� ��

� ��� � ����� �� � �� �2� ��� ����� *� �� ������� �� � ��� ��

C ���� �� ��5 ����� ��� C � �

-� �� �! ���� � �� �2� �*� �" ����" ��� ��< ����� ������ ���� �� ������ � ��� �� �2� ���*���� � �$%%����" �� ���� �� ���� � ����.��<�� ������5

���������� �' 0 � 1 � �� �����"� 1' � �� ���� � � � ! �� %� � ������ 1��� � ��� ������

� �� �

��

� � �� � � � � � C ���"�������� � � � �� � "�� �

��

/�� *� �� ����� ����� *! � ��� ����� *� �� �������� �� �" �* ���!" ��� ���� � � ������ �� ��

?��" � � �! ��� ��� ����� *� �� ������ �� � ��� �������" �� �� ������� ��� �� ��� ����� *� �� ���������� ��� �� -� � ��! ���* � ��� �*��� �� ���� � ��� ��4���� ���� 1����

���������� �( 0 ' �� � 1 � �� :��� ��� ���$�� � � ��$����� �� � C(� �'�� ��

�� � �� � �� ������ �� 5

�� � '� � � #� � ��� ���

��� � '� �� & � � #� � ���

� ��

��� ��

�� � 5�� # �� � � ����� ��� C ��7�� 5 � ���

���� � ' � ��

�� ��� �� �� �7� ������ 1��� � �

3��� ��� �������� � �� ��� ����� *� �� ������"�� ��� ����� �! � ����� � ��*��� �� �� � ���"����� ��� �� *! ��� ������ ������ �� *����� ��� ����� � �!" �� ��� ��� � �� ������ � �� �� ��!��� �� ������� ���� �� ��� ������ �� ��� #������� �" �� ��.���! ������ �� ������� ��� �� *�. �� ������ ������� �!" ��� ���� � ���� ����� ���������� ��� ��� ��!.��<��" ��� ������ � ��������!5

���������� �� �� ' �� � � � �� :��� ��� ���$� �� � � ����� �� ��� �� �� � ������ 1���� �� (� �'�� �� ��C��

165

Page 170: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

)*����� & 0 7� ���2 �� "�� � �� ��1� � ���� ������� �� *)"�� �� ������ � ���$�� �����""� �� �� �� �� � �" �' "� 1� ��� � %��2� ��� �"��' �� � � �� ��� �� � ��� �'� ��� � ��1� �� � �+�� �����" ������" %� ��� �� �� �� � +��' � "����� ��� �� �� �� � +��' � �����""�� .�$��" � � �� � �� � ��1� �� ��2 � ���� � � �"� �������� � ��� �� ��2 � �� � � �� %� �� � ��2 �� %�� � ���� � � �" ���1��� '������� ��� �� *)"�� �� �� �� �7 ������ 1$��� � � �� � ���� �� ��� ������ � ���%� ��5���� ��

2

FD

FD

FD

A2

FD

1

1

1

1

1

1

FD

FD

FD

F

FD

1A 1A A2

FD1

1

2 2

1

FD

2

2

1 2

3 ���� '5 /�� ��� ����� *� �� ������ �� ���� �� .������ *! �� �� �� �� �� ���� �� ������ *! ��"�� ��� ����� �� ��� � ��� � (7���� 8�

�� ��� �� ��� �� ������ �' �����" ��� 1�� ��������� �� � � ��� �� ������� �� ��� ��$���" ���� 6�%��� % � � � � ������ � %�� � ��1� �� 1��� � � �� ��� � � � %� � �"�� � �� *)"�� � %�� �� %� ���' �� �� �����"� 1' �� �� ���� � ,%7� ���' ��� � � �� � �� %����� �� � � ���% � �1����-� <���$+�� �'� � ��% 2��%� � ��� � �� � �'� "�.��� � ���'����'� � ��� ���� ��� �� ���� �� � ���� �� %�� � � ��1� � ��� � �� �7 �$��� ��

/�� �� �� �*�� �� ��1� ��� ������� ����� *!�� � ������ �� *� ��*���� �� -������ �7� � �!���� � � �� � ����2� ������ ������� � ��1� ���)�*�*��� ����� �� ����� ��� ����� ���� ����� ��� �������2 *� �� ������ ��� ��! ������� ���.� ���" ���� � � ��� ����� *� �� ����� �! ��1� ���)���� ����� ��� ����� ����� =���� ��" �� �!���� ��� �� ���� ���������� � ��� �� �� ����� ������" *�� � ��� ��� ��� ��� ����� � �� �� ����� ��! �� ������ � ��� � � ��" ��� ���� �� � �������" �� �! ��� �� ��� ��� ��� �� �� �� ���

�� � ������ ��� ����� �� � ��� ������� ��� ���.* �� ��� �� ��������

�� ����!" � �� ��� ���� �$%� �� ��� *�� � ������ ����� �������� �" � ��! ���� � ��� ��< �� �����7 ��� ������� � ������� ���� �� � � �� ��Æ.� �� ������ � �� ������� ��� ����� *� �� ���������� ��� �� /� ���������� �� �" �� � � � �7��������� ��� �� �� �� ������� ���� ��� ���� ����������� *� �� ������" !�� ! �� � :���� *� �� ������ �������� �5

)*����� ' 0 �� !� �� $ 1 � �� *)"$�� �� ������ �� � ��� ��� ���� �� ���� �����"� 1' � � �� ���� � ��� ��� ���� ������ ��� ���'� %�� �� C �� C ��� � ��� C ���� �� ��� ��� C ���� �� � �$%� �� � � � ���$��� 1��� � � �� �� ���� �� � +�� �� %� ��� �������� ��� � %� � ���"����� �� ��� �� ��� ����� �������1��� '� �� � C(� ����� �� � ���� C(� ������ �

�� �� �� �7� ���� � 1��� �

+��� �� � ���� ���� ��� ��� �� � %��� ��7� ������ ��� ��� ��� ��� ��� ��� ��� ��� ��� ��� ��� ��� �� �

! �����

-� �� � ����� *�� � ����� ��! ��� ���������.� � ��� ��������� *� ��� �� �� ���� �� �� ����������0 �� � �� �� � ����� ���� 4� � ��� �* �! ����<� ��� � ��� -� �� � �������� � �� � � ����.���� �� �� ��<�� �� ����� �� �� � ���������� � ������ � ���� �� ���* � ��� *� �� ������ �� � ����� ������ ������� ����! ���.������� *! ���� * �!�3 �!" �� �� � ����� *�� � ����� �� ��� ��� � ���*� �� ������ �� � :���� ����� �� �� ������� ���

/�� �������� � ������� �� �� � � ������� ���� ���;��� ���� � ����� �� � ���� ����� ��������< *���� � �� �� /��� �" �� ������ �� �! ��� ��� ��< � ����� ������� ������� � �� � ������ � � �� � �� �� ���" ��� +��� ' �� ��������" *�� ��� ��� ��������� ��������� ���� ���� ���� +�� � ' �� ��������� ����� �7��� � ���� ��� � ���� ���� ��4�� ���������� �������� �" �� ���� ����� �� ���� ����� ��.�� ��� ����� ��� (7��� � �� � � ���� ����� � �����*;��� �� ������� ���������

#����� ���*�� �� �� ����� �� ������� ����! � ��. ��� � � ���� �������� � �� ��� ������� �� �� ������" ���" �� ��� ��������� �� ��� ���! ���������

0#,�����-�����

���� �� =�!���.�� � �� ��� ����! ��������� *!� ?�� �� ��!� �� �� ��� ������ �� 3����� ��/�� 4� ��� � �� �� � ����� ��� �� ��� � �� ���

166

Page 171: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

4�� � ������� �� ��� +�� �� )L�M�� #����� 3����� �������� #�� 4� � ��� �����

��2���#��

����� (� #������L�" ����� DG��������" �� �� �=�< ��� E ��� �� � �� �����! �����5 ���� ����� ������� � �� �� � � ���� ��� =����� ���'"1��� 0��� " '�5',�N'&�" ,$8'�

����� +� #����� �� �� <��� �� ��������� >�$��� - �!" ?�� O��<" �� �� � �" ,$9&�

����� DG�������� �� �� � =�< ��� ?����.�� � ������ *���� � �7������ ��� .� �� �� ��$ ����� " 9'�,�5,$�N�%'" +����! ,$$%�

#��� D�� �� /�� ���� �� ��� �����! ������=����� �� 3��������� � 0��� " ,�5,'�N,��" ,$88�

�� ������ �� #� / ���<!� �������� �����!5# ��!� � �� ��� � � ���� � �<� * ���" �� "%����5�9&N�$," =���� ,$�$�

) ����� ������ �� #*���� E� =���>�� �����.� � �� <������ *��� �� � � �� � �� ������.� �� �� �� ����� " '��&�5�9&N�$%" ,$$,�

�� � =� ������ . <���� �� ?� �� ���"� ����'��� ���� � ��� �! �����" ,$$��

�� � F���� �� =������=�� ���� -��� ����� ��� � �� <������ *��� ��� H .� �� �� �� �$���� " ''�,�5,N9�" =�! ,$$��

���� �� =�!���.�� � �� �� O�� ������� J� ����� �5 #������� � ��� ����� *� �� ������� =������� 0��� � 0����� �� �����" ���" ����� /� ��.�����

#����!� ��� ��� � ��� �� �����!� � �� +� #������ =� �� ��� �����" �� ����" 6��1��2 �� ? �$" � � * ���"� �" ���� ���" ������� ��" �����,��&N,,8,� (� �� �� ��� ��* �����" ,$89�

167

Page 172: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

A NormativeExamination of EnsembleLearning Algorithms

David M. Pennock [email protected]

NEC ResearchInstitute,4 IndependenceWay, Princeton,NJ08540USA

Pedrito Maynard-Reid II [email protected]

ComputerScienceDepartment,StanfordUniversity, Stanford,CA 94305USA

C. LeeGiles [email protected]

NEC ResearchInstitute,4 IndependenceWay, Princeton,NJ08540USA

Eric Horvitz [email protected]

MicrosoftResearch,OneMicrosoftWay, Redmond,WA 98052-6399USA

Abstract

Ensemblelearning algorithmscombinethe re-sultsof several classifiersto yield an aggregateclassification.Wepresentanormativeevaluationof combinationmethods,applying and extend-ing existing axiomatizationsfrom SocialChoicetheory and Statistics. For the caseof multipleclasses,we show thatseveralseeminglyinnocu-ousanddesirablepropertiesare mutually satis-fied only by a dictatorship. A weaker set ofpropertiesadmitonly theweightedaveragecom-bination rule. For the caseof binary classifi-cation,we give axiomaticjustificationsfor ma-jority vote and for weightedmajority. We alsoshow that,even whenall componentalgorithmsreport that an attribute is probabilisticallyinde-pendentof theclassification,commonensemblealgorithmsoftendestroy thisindependenceinfor-mation. We exemplify thesetheoreticalresultswith experimentson stockmarket data,demon-stratinghow ensemblesof classifierscanexhibitcanonicalvotingparadoxes.

1. Intr oduction

A recenttrendin machinelearningis to aggregatetheout-puts of several learning algorithms togetherto producea compositeclassification(Dietterich, 1997). Under fa-vorableconditions,ensembleclassifiersprovably outper-form their constituentalgorithms,an advantageborn outby muchempiricalvalidation. Yet theredoesnot seemtobeasingle,obviousway to combineclassifiers—many dif-ferentmethodshave beenproposedandtested,with noneemerging as the clear winner. Most evaluation metrics

centeron generalizationaccuracy, eitherderiving theoreti-cal bounds(Schapire,1990;Freund& Schapire,1999)or(morecommonly)comparingexperimentalresults(Bauer& Kohavi, 1999;Breiman,1996;Dietterich,in press;Fre-und& Schapire,1996).

We take insteada normativeapproach,informedby resultsfrom Social Choicetheory and statisticalbelief aggrega-tion. First, we identify severalpropertiesthatanensemblealgorithmmight ideally possess,andthencharacterizetheimpliedform of thecombinationfunction.Section4 exam-inesthecaseof morethantwo classes.Weshow that,undera setof seeminglymild andreasonableconditions,no truecombinationmethodis possible.Theaggregateclassifica-tion is alwaysidentical to that of only oneof the compo-nentalgorithms. The analysismirrors Arrow’s celebratedImpossibility Theorem,which shows that the only votingmechanismthatobeysa similar setof propertiesis a dicta-torship(Arrow, 1963).Underslightly weakerdemands,weshow thattheonly possibleform for thecombinationfunc-tion is aweightedaverageof theconstituentclassifications.

Section5 considersthespecialcaseof binaryclassification.Basedon May’s (1952)seminalwork, we presenta setofaxiomsthat necessitatethe useof simplemajority vote tocombineclassifiers.Wethenextendthis result,deriving anaxiomaticjustificationfor theweightedmajority vote.Ma-jority andweightedmajority aretwo of themostcommonmethodsusedfor classifiercombination(Dietterich,1997).Onecontribution of this paperis to provide formal justifi-cationsfor them.

Section6 exploresthe independencepreservation proper-ties of commonensemblelearningalgorithms. Supposethat, with someattribute valuesmissing, all of the con-stituentalgorithmsjudgeoneattributeto bestatisticallyin-dependentof the classification.We demonstratethat this

Appendix K

168

Page 173: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

independenceis generallylost after combination,render-ing the aggregateclassificationstatisticallydependentontheattributein question.

Section7 presentsempiricalevidenceof violationsof thevarious axioms. We show that an ensembleof neuralnetworks—trainedto predictstockmarket data—cangen-eratecounterintuitive results,reminiscentof so-calledvot-ing paradoxesin the Social Choiceliterature. Section8summarizesanddiscussesfuturework.

2. EnsembleLearning

We presenta very brief overview of ensemblelearning;see(Dietterich,1997) for an excellentsurvey. Represen-tative algorithmsincludebagging(Breiman,1996),boost-ing (e.g.,ADABOOST (Freund& Schapire,1999)),andamethodbasedon Error-CorrectingOutputCodes(ECOC)(Dietterich& Bakiri, 1995). Ensemblealgorithmsgener-ally proceedin two phases:(1) generateandtrain a setofweaklearners,and(2) aggregatetheir classifications.

The first stepis to constructcomponentlearnersof suffi-cient diversity (Hansen& Salamon,1990). Onecommontechniqueis to subsamplethetrainingexamples,eitherran-domly with replacement(Breiman,1996),by leaving outrandomsubsets(as in cross-validation),or by an induceddistributionmeantto magnifytheeffectof difficult trainingexamples(Freund& Schapire,1999). Another techniquebaseseachlearner’s predictionson differentinput features(Tumer& Ghosh,1996). Themethodof Error-CorrectingOutputCodes(ECOC)generatesclassifiersby having eachlearnwhetheran examplefalls within a randomlychosensubsetof theclasses.Anotherapproachinjectsrandomnessinto the training algorithmsthemselves. Thesefour tech-niquesapply to arbitrary classifieralgorithms—therearealsomany algorithm-specifictechniques.And, of course,itis possibleto createanensembleby mixing andmatchingdifferenttechniquesfor differentclassifiers.

After generatingand training a set of weak learners,theensemblealgorithmcombinestheindividual learners’pre-dictionsinto a compositeprediction. The choiceof com-binationmethodis thefocusof this paper. Commonmeth-odscanbe categorizedloosely into two categories: thosethat combine votes, and those that can combine confi-dencescores.Theformertypeincludesplurality vote1 andweightedplurality; thelatterincludesstacking,serialcom-bination,weightedaverage,andweightedgeometricaver-age.

Baggingand ECOC are examplesof algorithmsthat useplurality vote. Theensemble’s chosenclassis simply that

1This is the familiar “one person,onevote” procedurewherethecandidatereceiving themostvoteswins. We reserve majorityvoteto referto thespecialcaseof two candidates.

which is predictedmost often by the individual learners.Weighted plurality is a generalizationof plurality vote,whereeachalgorithm’s vote is discounted(or magnified)by a multiplicativeweight;classesarethenrankedaccord-ing to thesumof theweightedvotesthey receive. Weightscanbechosento correspondwith theobservedaccuracy oftheindividualclassifiers,usingBayesiantechniques,or us-ing gatingnetworks(Jordan& Jacobs,1994),amongothermethods.TheADABOOST algorithmcomputesweightsinanattemptto minimizetheerrorof thefinal classification.

Stacking turnstheproblemof finding a goodcombinationfunction into a learning problem itself (Breiman, 1996;Lee & Srihari, 1995; Wolpert, 1992): The constituental-gorithms’ outputsare fed to a meta learner’s inputs; themetalearner’s output is taken as the ensembleclassifica-tion. Serial combinationusesonelearner’s top

�choices

to reducethe spaceof candidateclasses,passingthe sim-plified problemontothenext learner, etc. (Madhvanath&Govindaraju,1995).Weightedalgebraic (or geometric)av-eragecomputestheaggregateconfidencein eachclassasaweightedalgebraic(or geometric)averageof theindividualconfidencesin that class(Jacobs,1995; Tax et al., 1997).Somevariantsof boostingemploy weightedaveragecom-bination(Druckeret al., 1993).

3. Notation

Let ������������ �������������� denotea vectorof � attributevariableswith domain����������������� �!� . Denoteacor-respondingvectorof values(i.e., instantiatedvariables)as" �#��$ � %$ �������%$ � ��&�� . Eachvector " is categorizedinto oneof ' classes, ( � %( �������%(*) . Thereare + clas-sifiers, or learners,whichattemptto learnafunctionalmap-ping from instantiatedattributesto classes.Differenttypesof classifiersreturn different amountsof information—somereturn a single vote for one predictedclass,othersreturna rankingof theclasses,andstill othersreturnconfi-dencescoresfor all classes.2 Our contentionis thatconfi-denceinformation is usuallyavailable,whetherexplicitly(e.g., from neuralnet activation values,or Bayesiannetor decisiontree likelihoods)or implicitly from observedperformanceon the trainingdata. Thuswe denotelearner,’s classificationasan assignment�.-�/0��������12-3/ ) � of con-

fidencescoresto theclasses,where -3/54�&76 . Eachclassi-fier is a function 8 /:9 �<;=6 ) . Whenconfidencemag-nitudeinformationis truly unavailable,we adoptLee andSrihari’s (1995) conventionsfor encodingclassifications:A singlevotefor class( 4 is representedasa classificationvectorwith a 1 in the > th positionandzeroselsewhere;arank list of the classesis representedasa vectorwith a ?

2Thesethreeoutputconditionscorrespondto LeeandSrihari’s(1995)definitionsof TypeI, TypeII, andTypeIII classifiers,re-spectively.

169

Page 174: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

in thetop classposition, ?�@A?CB1' in thesecondplacepo-sition, ?�@EDFB1' in thethird placeposition,etc. Note that,technically, thesetwo encodingsintroduceunfoundedcom-parative information. For example,a vote for ( 4 conveysonly that all otherclassesare lesspreferredthan ( 4 , butareotherwiseincomparableamongthemselves.Variantsofthelimitativetheoremsin thispaperarealsopossibleusingmorefaithful representationsof votesandrankings.

An ensemblecombinationfunction G acceptsan + -tupleofclassificationsandreturnsa compositeclassification;that

is, G 9IH ;J6 ) , where HLKNM 6 )PORQ . Thus,assumingH � M 6 ) ORQ , theaggregateclassificationof arbitraryclas-sifiers 8 � �������28 Q on aninput " is G M 8 ��M " O �������28 Q M " ORO .For a given input vector " &S� , we find it convenienttodefine T asthe +U�V' matrix of all learners’confidencescoresfor all classes.That is, - /W4 is learner

,’s confidence

that " is in class> . Let X / be an + -dimensionalrow vec-tor with a 1 in the

,th positionandzeroselsewhere;simi-

larly, let Y 4 be an ' -dimensionalcolumnvectorwith a 1in the > th positionandzeroselsewhere. Then X / T is the,th row of T , and TZY[4 is the > th column of T . In other

words, XC/\T]�^8C/ M " O is learner,’s classification,and TZY[4

is thevectorof all confidencescoresfor class> . NotethatXC/\T�YC4E�_-3/54 . We denotethe ensembleclassificationbyT�`P�^�a-b`c��%-3`% d������1%-3` ) �e�fG M T O . We write gihkj toindicatethatevery componentof g is strictly greaterthanthecorrespondingcomponentof j .

4. Multiple Classes

In this section,we proposea normativebasisfor ensemblelearningwhen ' l�m . Our treatmentis similar in spiritto Pennock,Horvitz andGiles’s (2000)analysisof theax-iomaticfoundationsof collaborativefiltering.

4.1 An Impossibility Theorem

Wepresentfivepropertiesadoptedfrom SocialChoicethe-ory, arguetheir meritsin thecontext of ensemblelearning,anddescribewhichexistingalgorithmsexhibit whichprop-erties. Eachpropertyplacesa constrainton the allowableform of G .Property 1 (Universaldomain (UNIV)) H � M 6 ) O Q .

UNIV requiresthat G be definedfor any combinationofclassificationvectors.Sinceanarbitraryclassifiermayre-turn an arbitrary classification,it seemsonly reasonablethat G shouldreturnsomeresult in all circumstances.Allexistingensemblecombinationmethods,to ourknowledge,aredefinedfor all possibleclassifieroutputpatterns.

Property 2 (Non-dictatorship (ND)) There is no dictator,such that, for all classificationmatricesT andall classes> and

�, - /W4 hn- /0o:p - `R4 hn- `2o .

In words, G is not permittedto completelyignore all butoneof theclassifiers,irrespectiveof T . Weconsiderthede-sirability of this axiom to be self-evident,sincethe wholepoint of ensemblelearningis to improve uponthe perfor-manceof theindividualclassifiers.

Property 3 (Weak Pareto principle (WP)) For allclasses> and

�, TZY 4 hnTZY o:p - `q4 hn- `2o .

WP capturesthe natural ideal that, if all classifiersarestrictly moreconfidentaboutoneclassthananother, thenthis relationshipshouldbe reflectedin the ensembleclas-sification. Essentiallyall voting schemes(e.g., plurality,pairwisemajority, Bordacount)satisfyWP. Weightedplu-rality andweightedaveragingmethodsobey WP whenallweightsarenonnegative (andat leastone is positive). Ifa particularclassifier’s predictionsare badenough,somecombinationfunctions(e.g.,weightedaveragewith nega-tive weights,or stacking)mayestablisha negative depen-dencebetweenthat classifier’s opinion and the ensembleresult,andthusviolateWP. However, researcherstypicallystrive to generateensemblesof algorithmsthatareasaccu-rateaspossiblefor agivenamountof diversity(Dietterich,1997;Dietterich,in press).

Property 4 (Independence of irr elevant alter-natives (IIA)) Consider two classification matri-ces Tr2TZs . If T�Y 4 �tT�suY 4 and TZY o �vT�s0Y o , then- s`q4 hn- s`2ow - `R4 hx- `co .Under IIA, the final relative rankingbetweentwo classescannot dependon the confidencescoresfor any otherclasses.For example,supposethat,in classifyinga fruit aseitheranapple,abanana,or apear, theensembleconcludesthat“apple” is mostlikely. Now imaginethatwe learnonepieceof categoricalknowledge(andnothingelse):thefruitis not a pear. Every classifierdiminishesits confidencein“pear”, but leavesits relative confidencesbetween“apple”and“banana”untouched.Intuitively, theensembleshouldnotsuddenlyconcludethatthefruit is abanana;indeed,ad-mitting suchareversalis contraryto mostformalreasoningprocedures,includingBayesianreasoning.Seeminglyun-foundedreversalslike this are preciselywhat IIA guardsagainst. Weightedaveragingmethodsdo satisfy IIA, al-thoughplurality vote, and most other voting techniques,canviolate it. In Section7, we illustrate the paradoxicalresultsthancanoccurwhenIIA is not met.

Property 5 (Scaleinvariance (SI)) Considertwo classifi-cation matrices TycT�s . If XC/zTZs{�S|}/.XC/aT�~7�{/ for all

,and

for any positiveconstants|�/ and any constants�{/ , then-rs`q4 hn-rs`2o w - `R4 hx- `co for all classes> and�.

Different classifiers(especiallythosebasedon differentlearningalgorithms)may reportconfidencesusingdiffer-ent scales—one,say, rangingfrom 0 to 1; anotherfrom-100to 100.Evenif they sharea commonrange,oneclas-sifier maytendto reportconfidencescoresin thehigh end

170

Page 175: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

of thescale,while anothertendsto usethelow end.SI re-flectstheintuition thatall classifiers’scoresshouldbenor-malizedto a commonscalebeforecombiningthem. Onenaturalnormalizationis:X[/zT sb� XC/zT�@V���u� M XC/qT O���C� M X / T O @������ M X / T O � (1)

This transformsall confidencescoresto the � ���?1� range,fil-teringoutany dependenceonmultiplicative( | / ) oradditive( � / ) scalefactors.3 LeeandSriharijustify asimilarnormal-izationsimply because“eachoutput[classification]vectoris definedover a differentspace”(1995,p.42). Ensemblecombinationschemesbasedon votesor rankingsare bydefinition invariantto scale;weightedaveragingmethods,on theotherhand,arenot.

Different researchersfavor differing subsetsof thesefiveproperties,at leastimplicitly via their choiceof combina-tion methods.Roberts(1980)provesthat no combinationalgorithmwhatsoevercan“have it all”.

Proposition1 (Impossibility) If '=hAD , no function G si-multaneouslysatisfiesUNIV, ND, WP, IIA, andSI.

Proof: Follows from Sen’s (1986) or Roberts’s (1980,Theorem3) extensionsof Arrow’s(1963)originaltheorem.

4.2 WeightedAverageCombination

We might weaken SI, allowing the final classificationtodependon the magnitudesof confidencedifferences,butnot on additivescaleshifts.

Property 6 (Translation invariance (TI)) ConsidertwoclassificationmatricesTr2TZs . If X / T�s{�S|}X / T�~7� / for all

,andfor any(single)positiveconstant| andanyconstants��/ , then -rs`q4 hn-rs`2o w -b`R4�hx-b`co for all classes> and

�.

TI canbeenforcedby anadditive normalization,or align-ing all classifiers’scoreswith a commonreferencepoint(e.g., X / T�s � X / TP@������ M X / T O ).This weakeningis sufficient to allow for a non-dictatorialcombinationfunction G . Moreover, the only such G com-putestheensembleconfidencein eachclassasa weightedaverageof the componentlearners’ confidencesin thatclass.

Proposition2 (Weighted Average) If '�hnD , then theonly function G satisfying UNIV, WP, IIA, and TI issuch that j�TZY 4 h j�TZY o�p - `q4 h - `2o , wherej������ � R� ������1R� Q � is a row vectorof + nonnegativeweights,at leastoneof which is positive. If G is alsocon-tinuous,then j�TZY 4 h�j�TZY o w - `R4 hn- `co .

3If �:���������a �¡�¢£��¤¦¥§�����z {¡ thenset ���a §¨ to © .

Proof: Follows from Roberts’s (1980)Theorem2.

Certainly there may exist classificationdomainswheresomeof thesepropertiesdo not seemappropriateor jus-tified. However, webelieve that,becausethepropertiesarevery natural,understandingthe limitations that they placeon thespaceof ensemblelearningalgorithmshelpsto clar-ify whatpotentialalgorithmscanandcannotdo.

5. Binary Classification

Now considerthesubsetof learningproblemswhere 'ª�« ( « �¬D . In this case,the impossibilityoutlinedin Propo-sition 1 disappears;thefive propertiesUNIV, WP, IIA, SI,andND arein factperfectlycompatible.For example,allfivearesatisfiedby thestandardmajority vote:­ - `c� @®- `2 ­°¯f±±±±± Q² /u³}� ­ - /�� @®- /� ­´±±±±± (2)

where ­1µ�­ � ¶· ¸ ? 9 ifµ hn�� 9 ifµ �S�@�? 9 ifµ�¹ � �

Notethatthepropertiesarenecessarybut not sufficient forcharacterizingmajority vote.Proposition3 below providesonesufficientcharacterization.

5.1 Majority Vote

The use of majority vote for ensemblelearning is typi-cally motivatedby its simplicity, its observedeffectiveness,andits perceivedfairnesswhenthe constituentalgorithmsareessentially“createdequal” (Dietterich,1997). For ex-ample,the componentalgorithmsemployed for bagging,ECOC,andrandomizationaregenerallya priori indistin-guishable,and(2) is typically usedto combineclassifica-tionsin thesecases.

May (1952)providesan axiomaticjustificationfor major-ity vote.His treatmentis directly applicablewhenthecon-stituentalgorithmsreturnonly votes(equivalentto rankingssince '=�ºD ), ratherthanarbitraryconfidencescores.Wenow generalizehisaxiomsandhischaracterizationtheoremto applyto confidencescores.

Property 7 (Neutrality (NTRL))

If G M �a- �%� 2- �q �c���������a- Q � %- Q � O �i�.- `1� %- `2 �then G M �a- �R 2- ��� �c���������a- Q %- Q � � O �i�.- `2 %- `1� �1�

Under NTRL, the effect of every algorithm reversingitsvote is simply to reversetheaggregatevote. NTRL estab-lishesa symmetrybetweenthe two classnames,( � and( , ruling outany a priori biasfor oneclassnameover the

171

Page 176: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

other. Indeed,the subscripts1 and2 are assignedto thetwo classesarbitrarily; NTRL simply ensuresthatthefinalresultdoesnotdependonhow thetwo classesareindexed.NTRL is astrictly strongerconstraintthanIIA.

Property 8 (Symmetry (SYM))G M �a- �%� 2- �q �c���������a- Q � %- Q � O� G M �.-3/�»��C%-3/�»R ��c���������a-3/0¼I�C%-3/0¼� �� Owhere ½ , � , ������� , Q:¾ is any permutation of½�?d2D´������c%+ ¾ .SYM is strongerthanND andis sometimesreferredto asanonymity. WhereasNTRL implies an invarianceunderclassnamereversal,SYM enforcesaninvarianceunderanypermutationof algorithmnames,or subscripts.It simplyinsiststhatournumberingschemehasnoeffecton theout-put of the combinationrule. Note that SYM doesnot, byitself, rule out a posteriorbiasbasedon theclassifiers’re-portedconfidencescores.

Property 9 (Positive Responsiveness(POSR)) Considertwo classificationmatricesTr2TZs . If

­ - `1� @®- `% ­ &�½¿���? ¾ ,and X / T s �nX / T for all

,�À��Á , and X�ÂÃT s is such thateither

1. -ys � hn-  � and -ys �S-  , or

2. -ys � �S-  � and -ys ¹ -  ,then

­ -ys`c� @®-ys`2 ­ �º? .If the currentaggregatevote is tied (

­ - `c� @S- `2 ­ �#� ),then,underPOSR,any changeby any algorithm

,in apos-

itive directionfor ( � (i.e., -� � increasesor -� decreases)breaksthis deadlock,yielding - `1� h¬- `% . Moreover, anychangeof oneof theconstituentvotesthatstrictly favors ( �cannotswing the ensemblevote in the oppositedirection,from ( � to undecidedor to ( . Combinedwith NTRL,POSRis a strongerversionof WP, but is still quite rea-sonable.Note that, becausethereareonly two classes,ifany learner’svotesareobservedto benegatively correlatedwith thecorrectclassification(and,for example,aweightedaveragemethodassignsa negative weight), thenits votescansimply be reversed,renderingPOSR(anda nonnega-tiveweight)appropriateagain.

Proposition3 (Majority Vote)Anaggregationfunction Gis the majority vote(2) if andonly if it satisfiesUNIV, SI,NTRL,SYM,andPOSR.

Proof: Choosescalingparametersasin Equation1: | / �M « - /0� @A- /� « OcÄ � (or if - /0� �^- /u , set | / �_? ) and � / �@�| / ���u� M - /0� %- /u O . Let X / T�sb�A| / X / T�~7� / for all,. Then�.- s/�� 2- s/u �Å� ¶· ¸ �q?F��F� 9 if - /�� hn- /u ���§��F� 9 if - /�� �S- /u ���§�?¿� 9 if - /�� ¹ - /u �

Thatis, with only two classes,andtwo degreesof freedomin choosingthe scalingconstants,SI effectively restrictsthedomainH of G to votes.May (1952)provesthatNTRL,SYM, andPOSRarenecessaryandsufficientconditionsformajority votewheninputsarevotes.We referthereadertoMay’sarticlefor theremainderof theproof.

Notice that, when the componentalgorithmsreturn onlyvotes,andnootherinformationis available,SI is avacuousrequirement;in this setting,Proposition3 becomesa verycompellingnormativeargumentfor theuseof majorityvotefor classifiercombination.

5.2 WeightedMajority Vote

Whenthecomponentalgorithmsdoreturnmeaningfulcon-fidencescores,SI mayseemoverly severe,asit essentiallystripsawaymagnitudeinformation.Confidencescoresmayreflectmany sourcesof information—forexample,theacti-vationlevelsof aneuralnetwork’soutputnodes,theposte-rior probabilitiesof a Bayesiannetwork’soutputvariables,or an algorithm’s observed performanceon the trainingdata(asis usedin Boosting). Regardlessof its origin weinterpret - /0� h�- /� asa predictionin favor of classone,- /� h^- /�� asa predictionin favor of classtwo, and themagnitudeof thedifferencein confidencescores

« - /� @Æ- /0� «astheweightof algorithm

,’s conviction.

Thenwe definetheweightedmajority voteas­ - `1� @®- `% ­Ç¯ ±±±±± Q² /�³�� « - /�� @®- /� « � ­ - /�� @£- /� ­ ±±±±±� ±±±±± Q² /�³�� - /�� @®- /u ±±±±± � (3)

Property 10 (SeparableSymmetry (SSYM))G M �.-Z�%�C2-Z�R ��c���������a- Q �[2- Q �� O� G M �.-3/�»%�[%-�4q»R ��c���������a-3/0¼I�C%-{4q¼� �� Owhere ½ , � , ������� , Q�¾ and ½c> � a> �������z> Qƾ are any twopermutationsof ½�?d%D�������c�+ ¾ .SSYM is a strongerconstraintthanSYM. UnderSSYM,the ensembleclassificationdependson the set of confi-dencescoresfor classoneandthesetof confidencescoresfor classtwo, but not on theidentity of thealgorithmsthatreturnthosescores.

Proposition4 (WeightedMajority Vote)Theonlyaggre-gationfunction G thatsatisfiesUNIV, TI, NTRL,SSYM,andPOSRis theweightedmajority vote(3).

Proof: Under UNIV and NTRL, TÈ�ÇÉ implies that- `c� �J- `% . Thus, under POSR,if - Q � h�- Q and

172

Page 177: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

-�/0���Ê-3/u £�#� for all,VÀ�#+ , then -b`1�£hË-b`2 . Simi-

larly, becauseof NTRL, if - Q hA- Q � and - /0� �S- /� �S�for all

, À��+ , then - `2 hÌ- `1� . Given an arbitraryclas-sificationmatrix T , we canmake the following invariancetransformations.We invoke TI andSSYM alternatelyandrepeatedlyasfollows:G M �R�.- ��� %- �R �1��a- 2� %- � �c��a-3Í � %-�Í �c������5� O �G M �R�.-}���Î@®-Z�R �%�F�c¿���§%-3 % *@£-� 2���c¿�.- Í �[2- Í [�c������ � O �G M �R���§����c��a-Z�%�Î@£-}�q d%-3 % *@£-� 2���c¿�.- Í �[2- Í [�c������ � O �G M �R���§����c��a-Z�%�r~E-� 2�Ï@®-Z�R °@£-� � d��F�1�.��2- Í *@®- Í ���1������5� O ������P�GPÐÎÑ��.��%�F�c¿���§��F�1��.������c������cFÑ Q² /u³}� - /�� @®- /� ���ÒÆÒ*ÓThus if Ô / -3/��@n-�/� is greaterthan(lessthan,equalto)zero, then -b`1�Õ@�-b`2 is greaterthan(lessthan,equalto)zero,preciselytheweightedmajority vote(3).

6. IndependencePreservation

Considerthe learners’predictionswhenasked to evaluatean example "�Ö with somemissingvalues. Without lossof generality, let � � %� ������1���× betheattributevariableswith missingvalues,and let ��×*Ø � ��������� � be the vari-ableswith known values. Let " Ö×*Ù �J��$ Ö×*Ø � ������1�$ Ö� �denotethe vector of known values. If we definea priorjoint probability distribution ÚrÛ M " O over all possiblecom-binationsof attribute values,then we can computeeachlearner’s inducedposteriordistribution over classificationsgiventheknown values"�Ö× Ù :ÚrÛ /RM X / T « " Ö×*Ù O � ²ÜÃÝ�Þ�ß »¿à�á5á5á�à ßIâÅã2äåRæ M ÜÃç è[éâ Ù O ³3ê æ�ëÚrÛ M�ì

« " Ö×*Ù O �Similarly, we cancomputetheensemble’s posteriordistri-butionoverclassifications:ÚÅÛ `FM T ` « " Ö× Ù O � ²ÜÃÝ�Þ�ß » àbá5á5á�à ß â ã2äí M å » M ÜÃç èCéâ Ù O ç5î5î5î ç å ¼ M ÜÃç è[éâ Ù O�O ³ ëdïÚÅÛ M�ì « " Ö× Ù O �Now we canascertainwhethersomeattributesarestatisti-cally independentof theclassification.Again without lossof generality, selectattribute �ð×*Ø � for this purpose.Whatif everyconstituentalgorithmagreesthat ��×ÎØ � is indepen-dentof theclassification,giventheremainingknownvalues$ Ö×*Ø �������%$ Ö� ? It seemsnaturalanddesirablethat suchaunanimousjudgmentof “irrelevance”shouldbepreservedin the ensembledistribution. The following propertyfor-mally capturesthis ideal:

� � � �ðÍ X � T X T XCÍ[T T `0 0 0 �q?F��F� �����?¿� �R?d%�F� �R?d����0 1 0 ���§�?¿� �q?d%�F� �.���?¿� �.���?[�1 0 0 �q?F��F� �����?¿� �.���?¿� �.���?[�1 1 0 �q?F��F� �q?d%�F� �R?d%�F� �R?d����ÚrÛ /RM �R?d%�F� « �ðÍ�A� O 0.75 0.5 0.5 0.5� � � �ðÍ X � T X T XCÍ[T T `0 0 1 ���§�?¿� �����?¿� �R?d%�F� �.���?[�0 1 1 �q?F��F� �����?¿� �R?d%�F� �R?d����1 0 1 �q?F��F� �q?d%�F� �.���?¿� �R?d����1 1 1 �q?F��F� �q?d%�F� �.���?¿� �R?d����ÚrÛ /RM �R?d%�F� « �ðÍ�ñ? O 0.75 0.5 0.5 0.75

Table1. Examplewhereplurality voteviolatesIPP.

Property 11 (Independence Preservation Property(IPP))

If ÚrÛ /RM X / T « "{Ö×°Ù O �SÚrÛ /�M X / T « $ Ö×*Ø �������%$ Ö� O for all,

then ÚÅÛ `ÃM T ` « " Ö× Ù O ��ÚrÛ `FM T ` « $ Ö×*Ø ��������$ Ö� O �Table 1 presentsa constructive proof that plurality votefails to satisfy IPP. Three attributes eachhave domain��/���½¿�§�? ¾ , andthe prior distribution over attribute val-ues ÚÅÛ M " O �Ê?CBCò is uniform. Variables�� and �ð havemissingvalues(i.e., óô�UD ). Eachof threeconstituentalgorithmsagreethat the classificationis independentof� Í . But combinationby plurality vote destroys this in-dependence:Accordingto theensemble,theclassificationdoesin factdependon the valueof ��Í . Similar examplesdemonstratethatalgebraicandgeometricaveragesalsovi-olate IPP. It remainsan openquestionwhetherany rea-sonableensemblecombinationfunction can satisfy IPP.Resultsfrom StatisticsconcerninggeneralizedvariantsofIPParemostlynegative: No acceptableaggregationfunc-tion hasbeenfoundthatpreservesindependence(Genest&Zidek, 1986),andseveral impossibility theoremsseverelyrestrict the spaceof potentialcandidates(Genest& Wag-ner, 1987;Pennock& Wellman,1999).

7. Experimental Observations

Wehaveshown, in theory, thattheclassof potentialensem-ble algorithmsis severelylimited if we wanta smallnum-ber of intuitive propertiessatisfied.Onemight arguethatsituationswherethesepropertiescomeinto conflict mayneverarisein practiceif we usepopularaggregationmeth-ods.Thepurposeof thissectionis to show by examplethat,in fact,suchconflictsdooccurin practice.Specifically, wewill give examplesfrom a stockmarket predictiondomainwhereIIA breaksdown if we baseour aggregationon vot-ing.

173

Page 178: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

rankorder # rankorder #

UP h SAME h DOWN 6 DOWN h SAME h UP 5UP h DOWN h SAME 1 SAME h UP h DOWN 5DOWN h UP h SAME 3 SAME h DOWN h UP 1

Table2. Six learnedvotepatterns,andthenumberof neuralnet-worksthatlearnedeach.An instanceof theBordaparadox.

To this end, we report results of empirical tests of anensemblelearnertrained on stock market data. We re-trieved daily closing pricesof the Dow between1/20/97and 1/18/00from MSN Investor.4 From this, we gener-atedan approximatelyzero-meanand unit-variancetimeseriesof the form ½[õdö��^òF÷ M�ø ��ù{ö°@ ø �yù{ö Ä � O ¾ , where ù{öis the Dow’s price on day ú . The attributes are � ��.õdö Äbû %õdö Ä�ü ��������õFö Ä ��� . The classesarediscreteintervalsof õFö suchthat (°��� UP

¯ M õFöÆhk��� mF÷ O , (Î �� DOWN¯M õdö ¹ @���� mF÷ O , and ( Í � SAME

¯ M @���� mF÷ ýñõdö�ýt�§� m�÷ O .Theintervalsaresuchthateachclassfrequency is roughly?CBCm . Thecomponentlearningalgorithmsarebackpropaga-tion neuralnetworks built usingFlake’s (1999) NODELIB

codelibrary; eachconsistsof an input layerof five nodes,a hiddenlayer of from oneto seven nodes,andan outputlayer of threenodes. Diversity is dueonly to differencesin thenumberof hiddennodesandto randomizationin thetraining algorithm. The time seriesõ ö wasdivided into atrainingsetof 562daysanda testsetof 187days.

Table 2 shows the learnedclassrankingsfor twenty onenetworks(threeeachwith ?d2D´������ccþ hiddennodes)on testday7/14/99. If we usestandardplurality vote to combinepredictions,then DOWN wins with 8 votes,UP placesinsecondwith 7 votes,andSAME comesin lastwith 6 votes.By this measurewe shouldshort the Dow. But are wesure?SinceSAME is presumablythe leastlikely outcome,let’s focuson therelative likelihoodsbetweenonly DOWN

and UP.5 If we ignoreSAME andrecomputethe vote,wefind that UP actuallybeatsDOWN by 12:9! This is a vividdemonstrationthat plurality vote violatesIIA; the prefer-encebetweenUP andDOWN dependson SAME. Soshouldwe invest in the Dow? Well, the other two pairwisema-jority votesrevealthat SAME beatsUP by 11:10andSAME

beatsDOWN by 12:9. Thenaccordingto thepairwisema-jority, SAME wins againstbothotherclasses,UP comesinsecond,andDOWN is last,completelyreversingtheoriginalorderpredictedby thethree-way plurality vote. This is anillustration of the so-calledBorda voting paradox, namedaftertheeighteenthcenturyscientistwho discoveredit.

Table3 demonstratesanotherclassicvoting paradox,dueto Condorcet,oneof Borda’s peers.Thetablelists theac-

4http://moneycentral.msn.com/investor5Or we mayhave receivedoutsideinformationthatdiscounts

thelikelihoodof SAME.

, - /�� - /� - / Í rankorder

1 -0.33 -0.41 -0.25 SAME h UP h DOWN

2 -0.45 -0.25 -0.27 DOWN h SAME h UP

3 -0.31 -0.35 -0.37 UP h DOWN h SAME

Table3. Confidencescoresand correspondingvote patternsforthreeneuralnetworks.An instanceof theCondorcetparadox.

tivationvalues(confidencescores)of threenetworks(withone,two, andthreehiddennodes)on testday4/23/99.Plu-rality vote is tied, sinceeachalgorithm ranksa differentclasshighest.Whataboutpairwisemajority vote? In thiscase,SAME beatsUP by 2:1,andUP beatsDOWN by 2:1. Sois SAME our predictedoutcome?Not necessarily—DOWN

beatsSAME, alsoby2:1. Weseethatpairwisemajorityvotecanreturncyclical predictions,a violation of our genericdefinition of a classification6 ) , which assumesthat ag-gregationreturnsa transitiveorderingof classes.

Thesetwo “paradoxes” illustrate the undesirableconse-quencesof violating someof the basicpropertiesof G de-fined earlier. The examplesalso constitutean existenceproof thatsomeof thesamecounterintuitiveoutcomesthathave perplexed social scientistsfor centuriescan and dooccurin thecontext of ensemblelearning.

8. Conclusion

We identifiedseveral propertiesof combinationfunctionsthat Social Choice theoristsand statisticianshave foundcompelling,andarguedtheir applicabilityin thecontext ofensemblelearning.Wecatalogedcommonensemblemeth-odsaccordingto thepropertiesthey do anddo not satisfy,andshowedthatnocombinationfunctioncanpossessthemall. We providedaxiomaticjustificationsfor weightedav-eragecombination,majority vote, andweightedmajorityvote.We describedhow commonaggregationmethodsfailto respectunanimousjudgmentsof independence.Finally,we exemplifiedthefundamentalandunavoidabletradeoffsamongthe variouspropertiesusing an ensemblelearnertrainedon stockmarketdata.

Drucker, et al. (1993) presentempirical evidence thatweightedaverageoutperformsplurality vote in somecir-cumstances.Future work will examinewhetherthe ax-iomatic framework developedin this papercanaid in de-riving theoreticalboundson the performanceof weightedaverageandothercombinationrules. We alsoplan to ex-plorenormativejustificationsfor individualclassifiers,andinvestigatewhether, in somecases,a complex individualclassifiermight reasonablybe interpretedasan ensembleof simplerconstituentclassifiers.

174

Page 179: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

Acknowledgements

Thanksto Gary Flake, Michael Wellman,andthe anony-mous reviewers. Pedrito Maynard-ReidII was partiallysupportedby a NationalPhysicalScienceConsortiumFel-lowship.

References

Arrow, K. J. (1963). Socialchoiceand individual values.New Haven,CT: YaleUniversityPress.2ndedition.

Bauer, E., & Kohavi, R. (1999). An empiricalcomparisonof voting classificationalgorithms: Bagging,boosting,andvariants.MachineLearning, 36, 105–142.

Breiman,L. (1996). Baggingpredictors.MachineLearn-ing, 24, 123–140.

Dietterich,T. G. (1997). Machinelearningresearch:Fourcurrentdirections.AI Magazine, 18, 97–136.

Dietterich,T. G. (in press). An experimentalcomparisonof threemethodsfor constructionensemblesof decisiontrees:Bagging,boosting,andrandomization.MachineLearning.

Dietterich,T. G., & Bakiri, G. (1995). Solvingmulticlasslearning problems via error-correcting output codes.Journalof Artificial IntelligenceResearch, 2, 263–286.

Drucker, H., Schapire,R., & Simard,P. (1993). Boost-ing performancein neuralnetworks. InternationalJour-nal of PatternRecognition andArtificial Intelligence, 7,704–719.

Flake,G.W. (1999).Anindustrialstrengthnumericalmod-eling research and developmenttool (TechnicalReport99-085).NECResearchInstitute.

Freund,Y., & Schapire,R. E. (1996). Experimentswitha new boostingalgorithm. Proceedingsof the Thir-teenthInternational Conferenceon Machine Learning(pp.148–156).

Freund,Y., & Schapire,R. E. (1999). A shortintroductionto boosting.Journalof theJapaneseSocietyfor ArtificialIntelligence, 14, 771–780.

Genest,C., & Wagner, C. G. (1987). Furtherevidenceagainstindependencepreservation in expert judgementsynthesis.AequationesMathematicae, 32, 74–86.

Genest,C., & Zidek, J. V. (1986). Combiningprobabilitydistributions: A critiqueandanannotatedbibliography.StatisticalScience, 1, 114–148.

Hansen,L., & Salamon,P. (1990).Neuralnetwork ensem-bles. IEEE Transactionson Pattern Analysisand Ma-chineIntelligence, 12, 993–1001.

Jacobs,R. A. (1995). Methodsfor combining experts’probability assessments.Neural Computation, 7, 867–888.

Jordan,M. I., & Jacobs,R. A. (1994). Hierarchicalmix-turesof expertsandtheEM algorithm. Neural Compu-tation, 6, 181–214.

Lee,D., & Srihari,S.N. (1995).A theoryof classifiercom-bination: Theneuralnetwork approach.ProceedingsoftheThird InternationalConferenceonDocumentAnaly-sisandRecognition (pp.42–45).

Madhvanath,S.,& Govindaraju,V. (1995).Serialclassifiercombinationfor handwrittenwordrecognition.Proceed-ingsof theThird InternationalConferenceonDocumentAnalysisandRecognition (pp.911–914).

May, K. O. (1952).A setof independentnecessaryandsuf-ficient conditionsfor simplemajority decision. Econo-metrica, 20, 680–684.

Pennock,D. M., Horvitz, E., & Giles, C. L. (2000). So-cial choicetheoryandrecommendersystmes:Analysisof the axiomatic foundationsof collaborative filtering.Proceedingsof theSeventeenthNationalConferenceonArtificial Intelligence. In press.

Pennock,D. M., & Wellman, M. P. (1999). Graphicalrepresentationsof consensusbelief. Proceedingsof theFifteenthConferenceon Uncertaintyin Artificial Intelli-gence(pp.531–540).

Roberts,K. W. S. (1980). Interpersonalcomparabilityandsocial choicetheory. Review of EconomicStudies, 47,421–439.

Schapire,R. E. (1990). Thestrengthof weaklearnability.MachineLearning, 5, 197–227.

Sen, A. (1986). Social choice theory. In K. J. ArrowandM. D. Intriligator (Eds.),Handbookof mathematicaleconomics,vol. iii , 1073–1181.Amsterdam:Elsevier.

Tax,D. M. J.,Duin, R. P. W., & vanBreukelen,M. (1997).Comparisonbetweenproductandmeanclassifiercombi-nationrules.Proceedingsof theWorkshopon StatisticalPatternRecognition.

Tumer, K., & Ghosh,J. (1996).Error correlationanderrorreductionin ensembleclassifiers. ConnectionScience,8, 385–404.

Wolpert, D. (1992). Stacked generalization.Neural Net-works, 5, 241–260.

175

Page 180: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

AggregatingLearnedProbabilistic Beliefs

Pedrito Maynard-Reid IIComputerScienceDepartment

StanfordUniversityStanford,CA 94305-9010

[email protected]

Urszula ChajewskaComputerScienceDepartment

StanfordUniversityStanford,CA 94305-9010

[email protected]

Abstract

We considerthe task of aggregating beliefs of sev-eral experts. We assumethat thesebeliefs are rep-resentedas probability distributions. We argue thatthe evaluationof any aggregation techniquedependson the semanticcontext of this task. We proposeaframework, in which we assumethatnaturegeneratessamplesfrom a‘true’ distributionanddifferentexpertsform theirbeliefsbasedon thesubsetsof thedatatheyhave a chanceto observe. Naturally, the optimal ag-gregatedistributionwouldbetheonelearnedfrom thecombinedsamplesets.Sucha formulationleadsto anaturalwayto measuretheaccuracy of theaggregationmechanism.

We show that the well-known aggregation operatorLinOP is ideally suited for that task. We proposea LinOP-basedlearning algorithm, inspired by thetechniquesdeveloped for Bayesianlearning, whichaggregates the experts’ distributions representedasBayesiannetworks. We show experimentallythat thisalgorithmperformswell in practice.

1 Intr oduction

Belief aggregation of subjective probability distributionshasbeena subjectof greatinterestin statistics(see[GZ86,CW99]) and, more recently, artificial intelligence (e.g.,[PW99]) andmachinelearning(ensemblelearningin par-ticular [PMGH00]), especiallysinceprobabilisticdistribu-tions are increasinglybeing usedin medicineand otherfields to encodeknowledge of experts. Unfortunately,many of the aggregationproposalshave lacked sufficientsemanticalunderpinnings,typically evaluating a mecha-nism by how well it satisfiespropertiesjustified by littlemorethanintuition. However, ashasbeennotedin otherfieldssuchasbelief revision (cf. [FH96]), theappropriate-nessof propertiesdependson theparticularcontext.

We take a moresemanticapproachto aggregation:we firstdescribethe realistic framework in which the expertsorsourceslearn their probability distributionsfrom dataus-ing standardprobabilisticlearningtechniques.We assume

aDecisionMaker (DM) — thetraditionalnamefor theag-gregator— wantsto aggregatea setof theselearneddis-tributions. This framework suggestsa naturaloptimalag-gregationmechanism:constructthedistributionthatwouldbe learnedhadall the sources’datasetsbeenavailabletotheDM. Sincetheoriginaldatasetsaregenerallynotavail-able,the aggregationmechanismshouldcomeascloseaspossibleto reconstructingthe datasetsandlearningfromthecombinedset.

For intuition, considerthe the task of creatingan expertsystemfor somespecializedmedicalfield. We would liketo take advantageof theexpertiseof severaldoctorswork-ing in this field. Each of thesedoctors sharpenedhisknowledgeby following many patients. The doctorscanno longer recall the specificsof eachcase,but they haveformed over the yearsfairly accuratemodelsof the do-main that canbe representedassetsof conditionalprob-abilities. (In fact,many expert systemshave beencreatedover the yearsby eliciting suchconditionalprobabilitiesfrom experts[HHN92].) Of course,if therewasa doctorwho hadseenall of thepatientstheothersdoctorssaw, theidealexpertsystemwould resultfrom eliciting hermodel.However, thereisn’t onesuchexpert. Therefore,our sys-temwould benefitfrom incorporatingtheknowledgeof asmany expertsaswe canfind. The systemwould alsoac-countfor thedifferinglevelsof experienceof differentdoc-tors – someof themmay have practicedfor muchlongerthanothers.

One of the best-known aggregationoperatorsis the Lin-earOpinionPool(LinOP) which aggregatesa setof distri-butionsby taking their weightedsum. It hasbeenshownin the statisticscommunitythat, undersomeintuitive as-sumptions,learning the joint distribution from the com-bineddatasetis equivalentto usingLinOPovertheindivid-ual joint distributionslearnedfrom theindividualdatasets.However, whereasthe weights in typical usesof LinOPareoften criticized for beingad-hoc,our framework pre-scribessemantically-justifiedweights: the estimatedper-centagesof the dataeachsourcesaw. Intuitively, a highweight meanswe believe a sourcehas seena relatively

Appendix L

176

Page 181: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

largeamountof dataandis, hence,likely to bereliable.

However, joint distributionsarehardly the preferredrep-resentationfor probabilisticbeliefsin real-world domains.BNs (akabelief networks,etc.) [Pea88] have gainedmuchpopularityasstructuredrepresentationsof probabilitydis-tributions.They allow suchdistributionsto berepresentedmuchmorecompactly, thereforeoften avoiding exponen-tial blowup in bothmemorysizeandinferencecomplexity.

Thus,we assumethesourcesbeliefsareBNs learnedfromdata.Accordingto oursemantics,theaggregateBN shouldbeonetheDM would learnfrom thecombinedsetsof data.We describea LinOP-basedBN aggregationalgorithm,in-spiredby the algorithmdesignedto learnBNs from data.The algorithmusessources’distributions insteadof sam-ples to searchover possibleBN structuresandparametersettings. It takes advantageof the marginalizationprop-erty of LinOPto makecomputationmoreefficient. We ex-plore the algorithm’s behavior by runningexperimentsonthewell-known, real-lifeAlarm network [BSCC89]andonthesmallerartificial Asianetwork [LS88].

2 Formal Preliminaries

Werestrictourattentionto domainswith discretevariables.We considerhow to computethe aggregatedistribution,andhow theaccuracy of our computationdependson howmuchwe know aboutthesources.

Formally, we considerthe following setting: Thereare�

sourcesand � discreterandomvariables,whereeachvari-able � hasdomain ������� � . We follow theconventionofusingcapital lettersto denotevariablesandlowercaselet-tersto denotetheirvalues.Symbolsin bolddenotesets.�is thesetof possibleworldsdefinedby valueassignmentsto variables.Thetruedistribution or modelof theworld is� . Eachsource� hasadataset ��� sampledfrom (unknownto us) � . We will assumethateach��� is finite of size ��� .The correspondingempirical (i.e., frequency) distributionis �� � . Eachsource� learnsa distribution � � over � . Thisis � ’s modelof theworld. Thecombinedsetof samplesis����� � � � of size � . Thecorrespondingempiricaldistri-bution is �� . TheDM constructsanaggregatedistribution � .The optimal aggregatedistribution ��� is positedto be thedistribution theDM would learnfrom � .

Sinceit is unrealisticto expecttheDM to haveaccessto thesources’samplesets,we considerhow to useinformationaboutthesources’learneddistributionsto at leastapprox-imate ��� . Specifically, we considerthesituationwheretheDM knows the sources’distributionsandhasa goodesti-mateof thepercentage� � �!���#"$� of thecombinedsetofsampleseachsource� hasobservedaswell aswhat learn-ing methodit used.

We make a numberof assumptions.First, we assumethatthesamplesarenot noisyor otherwisecorrupted,andthey

arecomplete(no missingvalues).

Second,we assumethat theindividual samplesetsaredis-joint (so �%�!& � ��� ). This impliesthattheconcatenationof the ��� equals� , sowe don’t have to concernourselveswith repeatswhenaggregating. This assumptionis not al-waysappropriate.It is invalidatedwhenmultiple sourcesobservethesameevent.However, thereareinterestingdo-mainswherethis propertyholds. For example,in our mo-tivating medicaldomain,doctorsare likely to have seendisjoint setsof patients.

Third, we assumethat thesourcesbelieve their samplestobeIID — independentandidenticallydistributed.Thema-chinelearningalgorithmsusedin practicecommonlyrelyon thisassumption.

Finally, we assumethatthesamplesin thecombinedset �aresampledfrom � andIID. This assumptionmayappearoverly restrictive at first glance. For one, it may seemtoprecludethecommonsituationwheresourcesreceivesam-plesfrom differentsubpopulations.For example,if doctorsarein differentpartsof theworld, thecharacteristicsof thepatientsthey seewill likely bedifferent.

In fact,wecanaccomodatethissituationwithin our frame-work by assuming� is a distributionover thedomainvari-ablesand a sourcevariable ' which takes the differentsourcesasvalues; '(�)� meanssource� observedthe in-stantiateddomainvariables.This generalizeddistributionis sampledIID. Each � � consistsof thesubsetof sampleswhere '*�+� . It is not necessaryto keeparoundthe ' val-ues;computingthe � � and ��� without ' will give thesameresultsaslearningdistributionsover thecompletesamplesandmarginalizing out ' . Thus,althoughsampleswill beIID, differentsubpopulationdistributionswill bepossible,capturedby differentconditionalprobability distributionsof thedomainvariablesgivendistinctvaluesof ' . ,3 AggregatingLearnedJoint Distrib utions

We first considerthecasewheresourceshave learnedjointdistributions,andtheaggregateis alsoa joint.

3.1 Learning joint distrib utions: review

Givensamplesof avariable� , thegoalof alearneris to es-timatetheprobabilityof futureoccurencesof eachvalueof� . In oursetting,thedomainof � is � andtheparametersthat needto be learnedarethe - �.- probabilites.The dis-tribution over � is parameterizedby / . Two standardap-proachesareMaximumLikelihoodEstimation(MLE) andMaximumA Posteriori estimation(MAP).0

Two implicationsof this formulationarethattheassumptionthatthe 132 aredisjoint is implicit and 4�2 will approach576�8�9;:=<as > approaches? for all : .

177

Page 182: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

An MLE learnerchoosesthememberof a specifiedfamilyof distributionsthatmaximizesthelikelihoodof thedata:

Definition 1 If � is a random variable, ������@A�B�CED ,GFEHEHIH$F D�JLK , and /M�)NPO ,QFEHIHEH$F OARQS where O �T�VUW�@XY�Z-/[� , thentheMLE distributionover � givendataset � is

\^]7_ �� F �[�`�baQcedf�Wa Dgih �=�j-lkm�It is easyto show thattheMLE distribution is theempiricaldistribution if samplesareIID.

MAP learning,on theotherhand,followstheBayesianap-proachto learningwhich directsus to put a prior distribu-tion over the valueof any parameterwe wish to estimate.Wetreattheseparametersasrandomvariablesanddefineaprobabilitydistributionover them.More formally, we nowhave a joint probability spacethat includesboth the dataandtheparameters.

Definition 2 If � is a random variable, ������@A�B�CED , FEHEHIH$F D J K , and /M�)NPO , FEHIHEH$F O R S where O � �VUW�@X � -/[� , then the MAP distribution over � givendata set �andprior UW�#/[� is thedistribution

\^n h �@ F h �P/[� F �[�`o h ��p-@�q�`� r h ��p-=/[� h �P/!-@�q���Y/Theappropriateconjugateprior for variableswith multino-mial distributionsis Dirichlet. sutvcl�P/w-yx , FEHIHEHEF x J � , whereeachx � is a hyperparametersuchthat x �fz|{ .Wewill assumethatDirichlet distributionsareassessedus-ing the methodof equivalentsamples: given a prior dis-tribution } over � and an estimatedsamplesize ~ , x � issimply }��@X � ��~ . We usetheseto parameterize

\^n h :

Definition 3 If � is a random variable, ������@A�B�CED ,GFEHEHIH$F D�JLK , / ��N#O ,lFEHIHEH$F OARQS where O ��� � �@XY��-/[� , } is a probability distribution over � , and~ z�{ , then\^n h �� F N�} F ~�S F �[� denotesthe distribution\^n h �@ F����LF �q� where � � ��sutvcy�#/�- }�� D , ��~ FEHIHEHIF }�� D J ��~�� .

We will omit the � argumentfrom theMLE andMAP no-tationsinceit is understood.

3.2 LinOP: review

Let us turn to the problemof aggregation. We will showthat joint aggregationessentiallyreducesto LinOP. LinOPwas proposedby Stone in [Sto61], but is generallyat-tributedto Laplace.It aggregatesasetof joint distributionsby takinga weightedsumof them:

Definition 4 Given probability distributions � ,GFEHIHEHEF �Y�and non-negative parameters � , FEHIHEHEF � � such that& � � � ��� , the LinOP operator is definedsuch that, for

any �+�� ,] tv��� h �@� ,lF���,lFIHEHEHIF �Y� F�� ���E�@�u�`���Y�|� � � � ���u� HLinOP is popularin practicebecauseof its simplicity. Asdescribedin [GZ86], it also has a numberof attractivepropertiessuch as unanimity (if all the � � � ��� , thenLinOPreturns�Y� ), non-dictatorship(nooneinput is alwaysfollowed), and the marginalization property (aggregationandmarginalizationarecommutativeoperators).However,LinOPhasoftenbeendismissedin theaggregationcommu-nitiesasanormativeaggregationmechanism,primarily be-causeit failsto satisfyanumberof otherpropertiesdeemedto benecessaryof any reasonableaggregator, e.g.,theex-ternal Bayesianityproperty(aggregationandconditioningshouldcommute)andthe preservationof sharedindepen-dences.Furthermore,typical approachesto choosingtheweightsareoftencriticizedasbeingad-hoc.

However, this dismissal may have been overly hasty.LinOP provesto betheoperatorwe arelooking for in ourframework: usingit is equivalentto having the DM learnfrom thecombineddatasetunderintuitiveassumptions.

3.3 MLE aggregation

SupposethesourcesandtheDM areMLE learners.As hasbeenknown in statisticsfor sometime, theDM needonlycomputetheLinOPof thesources’distributions.

Proposition1 ([Win68, Mor83]) If � � � \^]7_ ��� � � foreach ��� C � FEHIHEH�F � K and ��� � \^] _ ���q� , then ��� �] tv��� h �=� ,QF���,lFEHIHEHIF � � F�� ��� .Althoughstraight-forward,thispropositionis illuminating.For one,theweightcorrespondingto eachsourcehasaveryclearmeaning;it is thepercentageof totaldataseenby thatsource.TheDM only needsto provide accurateestimatesof thesepercentages.A high weight indicatesthattheDMbelievesasourcehasseenarelatively largeamountof dataandis, hence,likely to bevery reliable. Thus,we addressa commoncriticism of LinOP, that the weightsareoftenchosenin an ad-hocfashion. Also, if � is known, theDM cancomputethe numberof samplesin � that were� : � ] t���� h ��� , F�� , FIHEHIHEF � � F�� � � . Thus,

] t���� h can beviewedasessentiallystoringthesufficientstatisticsfor theDM learningproblem.

It is now easyto seewhy a propertysuchaspreservationof independencewill not alwayshold given our learning-basedsemantics.In our framework, sourcesdo not havestrongbeliefsaboutindependences;any believedindepen-dencedependson how well it fits the source’s data. Theindependencepreservationpropertydoesnot take into ac-countthepossibility that,becauseof limited data,sourcesmayall have learnedindependenceswhicharenot justifiedif all thedatawastakeninto account.

178

Page 183: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

Consider, for example,thefollowing distribution over twovariables� and � : � ���� $�A�¡�l"l¢ , � ���¤£ $�A�¡�l"Q¥ , � � £�� E�¦��l"Q§ , and � � £� £  $�¨�B�l"G¢ . Obviously, � and � arenot in-dependent.Supposetwo sourceshave eachreceiveda setof six samplesfrom this distribution: � , consistsof oneeachof �¤  and � £   , two eachof £��  and £� £   ; �[© consistsofone eachof � £  and £� £  , two eachof ��  and £�¤  . FurthersupposeeachusedMLE to learna distribution over � and� . � and � areindependentin eachof thesedistributions.TheLinOPdistribution,ontheotherhand,effectively takesinto accounttheevidenceseenby bothsourcesandactuallycomputes� wherethevariablesarenot independent.

3.4 MAP aggregation

MLE learnersareknown to haveproblemswith overfittingandlow-probability eventsfor which datanever material-ized. MAP learningoftendoesa betterjob of dealingwiththeseproblems,especiallywhendatais sparse.

Consequently, supposethe sourcesandthe DM areMAPlearnerswith Dirichlet priors. The optimal aggregatedis-tribution is a variationon

] t���� h : ©Proposition2 Suppose, for each �ª� C � FEHIHEH$F � K , � �^�\^n h ��N�} � F ~ � S F � � � and ��� � \^n h ��N@} F ~�S F � � � . Then,

� � �@� �`� ��¬«�~ �P� ] tv��� h �=� , F�� , FEHIHEH$F � � F�� � �7«�}����u��~��« � �

~I��­«�~ � � � �@� �f®�} � �@�¦��� H (1)

Thefirst term in Equation1 is theDM’s MAP estimation,the secondterm accountsfor the sources’priors by sub-tractingout their effect.

Corollary 2.1 Suppose, for each ��� C � FEHIHEH�F � K , � � �\^n h ��N�} � F ~ � S F � � � and ��� � \^n h ��N@} F ~�S F � � � . Then,¯ t��°�±=²´³fµ°=¶�±=²·³TµE¸ ¶ � � � ] t���� h ��� ,GF���,lFIHEHIHEF � � F�� ��� HThus, as � becomeslarge, the LinOP distribution ap-proaches��� . This is not surprisingsinceit is well-knownthatMLE learningandMAP learningwith Dirichlet priorsareasymptoticallyequivalent.Theimplicationis thatif �is large,not only do we not needto know � to aggregate,we do not needto know what priors the sourcesusedei-ther. And if we approximatetheaggregatedistribution bytheLinOPdistribution,thisapproximationwill improvethemoresamplesseenby thesources.

4 AggregatingLearnedBayesianNetworks

Bayesiannetworks(BNs) arestructuredrepresentationsofprobability distributions. A BN   consistsof a directed¹

We omit proofsfor lackof space.

acyclic graph(DAG) º whosenodesarethe � randomvari-ables. The parentsof a node � are denotedby h a����p� ;� aY�@�p� denotesa particular assignmentto h a��@�p� . Thestructureof thenetwork encodesmarginal andconditionalindependenciespresentin thedistribution. Associatedwitheachnodeis theconditionalprobabilitydistribution (CPD)for � given h a����p� .Weconsiderthecasewheresources’beliefsarerepresentedas BNs learnedfrom data. We briefly review the tech-niquesusedfor learningBNs from data. For a morede-tailedpresentation,see[Hec96].

4.1 Learning Bayesiannetworks: review

If the structureof the network is known, the taskreducesto statisticalparameterestimationby MLE or MAP. In thecaseof completedata,thelikelihoodfunctionfor theentireBN convenientlydecomposesaccordingto thestructureofthe network, so we can maximizethe likelihood of eachparameterindependently.

If thestructureof thenetwork is notknown,wehaveto ap-ply Bayesianmodelselection.More precisely, we defineadiscretevariable » whosestatesº correspondto possiblemodels,i.e., possiblenetwork structures;we encodeouruncertaintyabout » with theprobabilitydistribution UW�@º�� .For eachmodel º , we definea continuousvector-valuedvariableO¦¼ , whoseinstantiationskl¼ correspondto thepos-sibleparametersof themodel. We encodeour uncertaintyabout O¦¼ with aprobabilitydistribution UW��kl¼½-yº�� .We scorethecandidatemodelsby evaluatingthemarginallikelihood of the data set � given the model º , that is,the Bayesianscore UW���¾-¿º���ÁÀ½UW���¾-·k ¼ F º���UW��k ¼ -º���UW�º���Ãmk ¼ HIn practice, we often use some approximation to theBayesianscore. The mostcommonlyusedis the MDLscore,which convergesto the Bayesianscoreas the datasetbecomeslarge.TheMDL scoreis definedasÄ�Å ��ceÆIÇ´È � �@É �7Ê �[�¿�

� Ë� ��Ì , �ÍIÎEÏ�Ð ¶@Ñ ��Ò ¶ �� �@XY� F�� a��@X��#��� ¯ ��dW�� �@XY�Ó- � a���XY�����® ¯ �md`�Ô sutv�Õ d �×Ö ®;s ] �@d � �

wheresutv�^Õ d � Ö is thenumberof independentparametersinthegraphand s ] ��d � � is thedescriptionlengthof º � . Find-ing the network structurewith the highestscorehasbeenshown to beNP-hardin general.Thus,wehaveto resorttoheuristicsearch. Sincethe searchcaneasilyget stuckina localmaximum,weoftenaddrandomrestartsto thepro-cess.TheBN learningalgorithmis presentedin Figure1.

Why are we interestedin learningBNs rather than joint

179

Page 184: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

1. pick a random DAG g2. parameterize g to form b3. score b4. loop5. for each DAG g’ differing from g by

adding, removing, or reversing an edge6. parameterize g’ to form b’7. score b’8. pick the b’ with the highest score and replace

g with g’ and b with b’ if score(b’) Ø score(b)9. until no further change g10. return b.

Figure1: Bayesiannetwork learningalgorithm.

distributions? Besidessomeobvious reasonsconcerningcompactrepresentationandefficient inference,a distribu-tion learnedby theBN algorithmmaybecloserto theorig-inal distributionusedto generatethedatain thefirst place.

First, note that the networks which canbe parameterizedto representexactly theMLE- or MAP-learnedjoint distri-butionsare,in general,fully connected.Intuitively, a dis-tribution learnedfrom finite sampledatawill alwaysbe alittle noisy, sotrueindependenceswill almostalwayslooklike slight dependencesmathematically. As a result, theBNs we areinterestedin (eitherfor the sourcesor for theDM) will not be exact representationsof the independen-ciespresentin theMLE- orMAP-learneddistributions,but,rather, will accountfor this overfitting.

BN learning ‘stretches’the distribution that bestfits thedata to match candidatenetwork structures. For everystructure,welook for thebest(producingthehighestscore)parameterizationof that structure. Thescorebalancesthefit to thedatawith modelcomplexity.

4.2 LinOP-basedAggregationAlgorithm

Now supposeeachsourcehaslearnedaBN  �� with DAG º��from �[� usingthe MDL scoreandthe DM is given theseBNs as well as the �Ù� . According to our semantics,theaggregateBN shouldbeascloseaspossibleto theonetheDM would learnfrom � .

We cannotapplytheBN learningalgorithmdirectly, sincewe don’t have thedatausedby sourcesto learntheir mod-els. A simplesolutionwould beto generatesamplesfromeachsourcemodelandtrain theDM on thecombinedset.That algorithm,althoughappealinglysimple, raisessomenew questions.It is notclearhow many samplesweshouldgeneratefrom eachsource.Onepossibilitywouldbeto usethesamenumberasthe(estimated)numberof samplesthateachsourceusedto learnits model.However, if thatnum-ber is small, thesampleswill not representthe generatingdistribution adequately, introducingadditionalnoiseto theprocess.If wegeneratemoresamplesthaneachsourcesaw(increasingit proportionallyto preservethe �Ù� settings),wegive too muchweightto theMLE componentof thescore,thuspossiblychoosinga suboptimalnetwork. In fact,our

experimentsdescribedin Section5 show thatthisalgorithmdoesverybadlyin practice.

Instead,we can adaptthe BN learning algorithm to usesources’distributionsinsteadof samples.

The main difference is in the way we compute theMLE/MAP parametersfor eachstructurewe considerandthe way we computethe score(lines 2, 3, 6 and7 in Fig-ure 1). Our algorithm relies on the observation that it isnot necessaryto have the actualdatato learna BN; it issufficient to have their empiricaldistribution. As we havedemonstratedin Section3, we cancomeup with saiddis-tribution by applying the LinOP operatorto distributionslearnedby oursources.

We cantake advantageof the marginalizationpropertyofLinOP to make computationmore efficient. As is notedin [PW99], we canparameterizethe network in top-downfashionby first computingthe distribution over the roots,then joints over the secondlayer variablestogetherwiththeirparents,etc.Theconditionalprobabilitiescanbecom-putedby dividing the appropriatemarginals(usingBayesLaw). In many cases,thatwould requireonly localcompu-tationsin sources’BNs.

The MDL scorealsorequiresknowing only the empiricaldistribution for � and � . Again, sincetheempiricaldis-tribution is theLinOPdistribution if theweightsarechosencorrectly and the sourcesusedMLE or MAP (assumingsufficient data)learning,it is possibleto scorethe candi-datenetworkswithouthaving theactualdata.Furthermore,themarginalsusedin theMLE scorearefamily marginals.If thepreviousparameterizationstepis doneby computingmarginals,thenthesewill havealreadybeencomputed.

Although the MDL scorerequiresknowledgeof � , thisdependencemay not be strong,especiallyfor large � inwhich casethesecondtermis dominatedby thelikelihoodtermand � becomesa factorcommonto all networksandcanbe ignored. Otherwise,a roughapproximationof �shouldsuffice.

As in traditional BN learning,cachingcan make the pa-rameterizationandscoringof ‘neighboring’networksmoreefficient. Sincewe aremakingonly local changesto thestructure,only a few parameterswill needupdating.If anarc is addedor removed,we only needto recomputenewparametersfor thechild node,andif anarcis switched,weonly needto recomputeparametersfor the two nodesin-volved. Also, sincetheseLinOP marginalsdon’t change,cachingcomputedvaluesmayhelpto furtherspeedup fu-turecomputations.

180

Page 185: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

5 Experiments

We implementedthe BN aggregationalgorithmin MatlabusingKevin Murphy’sBayesNetToolboxÚ andexploreditsbehavior by runningexperimentson thewell-known, real-life Alarm network [BSCC89], a 37-nodenetwork usedaspartof a systemfor monitoringintensivecarepatients,andon thesmaller8-nodeartificial Asianetwork [LS88].

In our experiments,we learnedtwo sourceBNs from datasampledfrom theoriginal BN, thenaggregatedtheresultsusingour algorithm(AGGR). We hadboththesourcesandtheDM useMAP to parameterizetheir networks. In com-putingLinOP, weusedthe � � asweights.Wecomparedourproposal’s accuracy againstlearning from the combineddatasets(OPT) by plotting the Kullback-Leibler(KL) di-vergence[Kul59] Û of eachdistribution from thetruedistri-bution for differentvaluesof �%�Ü- �;- .5.1 Sensitivity to �We consideredthesituationwheretheDM knows thepri-ors usedby the sourcesand adjustsfor the unduly largenumberof imaginarysamples.All sourcesandDMs usedtheDirichlet prior definedby theuniform distribution andan estimatedsamplesizeof 1. We varied the total num-berof samples� between200and20000,having sourcesseethe samenumberof samplesin somecasesand dif-ferentnumbersin others.We conductedmultiple runsforeachsettingandaveragedthem. Figure2(a) plots the av-eragesfor theAlarm network whensourceshaveequal � � .Due to softwarelimitations,we hadto starteachstructuresearchwith the fully disconnectedgraphandusedno ran-dom restartsfor this larger network. As canbe seen,inspiteof the limited search,our algorithmdoesfairly wellasfar ascomingcloseto theoptimalandimproving on thesources.Not surprisingly, the KL divergencedropsasthetotalnumberof samplesincreases.Furthermore,theexper-imentsonsourceswith different � � showednodependenceof the performanceof thealgorithmon the relative differ-encein � � .We ran similar experimentson Asia. Here,we variedthenumberof samplesbetween200 and3000,with five runsper setting. For eachrun, we usedfive randomrestarts.Figure2(b) plots the averagefor eachsetting. The plotshows that whenwe areable to explore the searchspacesufficiently in thelearningandaggregationalgorithms,ouralgorithmconsistentlyimprovesonthesourcesandcloselyapproximatesto theoptimal.

ÝAvailableat http://www.cs.berkeley.edu/murphyk/bnt.html.ÞThe KL divergenceof distribution ß from à is definedas&bá�âyã à�6Âä´<GåçæyèZéyê á�ëì ê á�ëlí

0.0

0.2

0.4

0.6

0.8

1.0

1.2

1.4

1.6

1.8

200 2000 10000 20000

M

KL

Div

erge

nce

S1S2OPTAGGR

(a)

0.000

0.010

0.020

0.030

0.040

0.050

0.060

0.070

0.080

0.090

0.100

200 600 1000 1400 1800 2200 2600 3000

M

KL

Div

erge

nce

S1S2OPTAGGR

(b)

Figure2: Sensitivity to � (a) Alarm network results. (b)Asianetwork results.

5.2 Sensitivity to the DM’ sestimation of �We hypothesizedearlierthat the actualvalueof the DM’sestimateof M doesnot matterall that much. To demon-stratethis,we ranexperimentson theAsianetwork similarto thoseabove,but leaving � fixedandvaryingtheDM’sestimate1 orderof magnitudeabove andbelow � . Fig-ures3(a)summarizestheresultsfor �%�(� {�{ .Any approximationabove0.25ordersof magnitudebelow� providesimprovementover the sources.Estimatesbe-low this madethecomplexity penaltysufficiently strongtoselectDAGs with fewer arcsthanthe original andunder-fit thedata.On theotherhand,althoughoverestimating�did notincreasetheKL distancefrom theoriginal,thereis adangerof extremeoverestimatescausingoverfitting. How-ever, we did not find any increasein thecomplexity of theaggregatenetworksfor the1 orderof magnituderangeweconsidered;they remainedat8–9arcson average.

Figure3(b)summarizingtheresultsfor �î�V� {�{m{�{ shows

181

Page 186: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

0.00

0.05

0.10

0.15

0.20

0.25

-1.00 -0.75 -0.50 -0.25 0.00 0.25 0.50 0.75 1.00

M'/M (log10)

KL

Div

erge

nce

S1S2OPTAGGR

0.0000

0.0002

0.0004

0.0006

0.0008

0.0010

0.0012

0.0014

0.0016

-1.00 -0.75 -0.50 -0.25 0.00 0.25 0.50 0.75 1.00

M'/M (log10)

KL

Div

erge

nce

S1S2OPTAGGR

0.00

0.02

0.04

0.06

0.08

0.10

0.12

0.14

0.16

0.18

200 600 1000 1400 1800 2200 2600 3000

M

KL

Div

erge

nce

S1S2OPTAGGR

(a) (b) (c)

Figure3: Asianetwork results(a)varyingDM’sestimateof � ( �î�V� {m{ ). (b) varyingDM’sestimateof � ( �î�V� {mï ).(c) with differentsubpopulations.

that, aspredicted,the rangeof “slack” increaseswith � ;the moresamplesseenby the sources,the lessimportanttheaccuracy of theDM’sestimate.

5.3 Subpopulations

Our algorithmperformswell whencombiningsourcedis-tributionslearnedbasedonsamplesfrom differentsubpop-ulations.To show this,wemodifiedtheAsianetwork to ac-comodatetwo sources,adoctorpracticingin SanFranciscoandonepracticingin Cincinnati. Theprobabilitydistribu-tionsof thetwo root nodesin theAsia network, represent-ing whethera patientsmokesandwhethershehasvisitedAsia would be significantlydifferent for the two doctors.A patientfrom SanFranciscois lesslikely to bea smoker,andonefrom Cincinnatiis lesslikely to have visitedAsia.Thus,weaddedasourcevariableasdescribedin Section2,gave the sourcesequalpriorsof seeingpatients,madethesourcevariableaparentof thetwo rootvariables,andgavethem appropriateCPDs. We drew � samplesfrom thisextendednetwork andhadeachsourcelearnfrom the ap-propriatesubset,then usedAGGR to combinethe resultsusingthe correct �Ù� and � . Figure3(c) plots the KL di-vergenceof eachdistribution from theoriginal distributionwith the sourcevariablemarginalizedout. Becausethesourcesarelearningthedistributionsfor differentsubpop-ulations,what they learn is relatively far from the overalldistribution. The DM takesadvantageof the informationfrom both sourcesandlearnsa BN that approximatestheoriginalmuchmorecloselythaneithersource.

5.4 Comparison to samplingalgorithm

In eachof the above experiments,we also comparedtheperformanceof ouralgorithmto thealternative intuitiveal-gorithm SAMP we describedin Section4.2 in which wesample� � � samplesfrom eachsource� ’s BN and learna BN from the combineddata. SAMP did very badly in

general,consistentlyworsethannot only AGGR, but worsethanthesourcesaswell, oftenby anorderof magnitude.

6 RelatedWork

A wealthof work exists in statisticson aggregatingprob-ability distributions. Good surveys of the field include[GZ86, CW99]. Many of theearlier, axiomaticapproachessufferedfrom a lackof semanticalgrounding.For this rea-son,the communitymoved towardsmodelingapproachesinstead. The most studiedapproachhasbeenthe supra-Bayesianone, introducedin [Win68] and formally estab-lished in [Mor74, Mor77]. Here, the DM hasa prior notonly over the variablesin the domain,but over the possi-ble beliefsof the sourcesaswell. Sheaggregatesby us-ing Bayesianconditioningto incorporatethe informationshereceivesfrom the sources.In fact, Proposition1 de-rivesfrom this body of work. However, almostall of thiswork hasbeenrestrictedto aggregatingbeliefsrepresentedaspointprobabilitiesor odds,or joint distributions.

Therehasbeensomerecentinterest,particularlyin AI, intheproblemof aggregatingstructureddistributionsinclud-ing [MA92, MA93, PW99]. But, like the early axiomaticapproachesin statistics,muchof this work focuseson at-temptingto satisfy abstractpropertiessuchas preservingsharedindependences,andoftenrunsinto impossibilityre-sultsasa consequence.

In somesense,what we are doing could also be viewedasensemblelearningfor BNs. Ensemblelearninginvolvescombiningtheresultsof differentweaklearnersto improveclassificationaccuracy. Becauseof its simplicity, LinOP isoften usedwithout justificationto do the actualcombina-tion. Our resultsjustify this usewhen the weak learnersuseMLE, MAP, or BN learning.

Anothernew areain AI thatbearssimilaritiesto our workis that of on-line or incrementallearning of BNs (e.g.,

182

Page 187: CONTROL COORDINATION OF MULTIPLE AGENTS ...CONTROL COORDINATION OF MULTIPLE AGENTS THROUGH DECISION THEORETIC AND ECONOMIC METHODS 6. AUTHOR(S) Yoav Shoham 5. FUNDING NUMBERS C - F30602-98-C-0214

[Bun91, LB94, FG97]). There,we aregivena continuousstreamof samplesandwe want to maintaina BN learnedfrom all thedatawehaveseensofar. Becausethestreamisverylong,it is generallynotpossibleto maintainthefull setof sufficientstatistics.Approachesrangefrom approximat-ing the sufficient statisticsto restrictingthe network thatcanbe learned.We essentiallydo theformerby assumingthatthesufficientstatisticsfor thedataseenby eachsourceis encodedin its network. Cross-fertilizationbetweenthetwo fieldsmayproveprofitable.

7 Conclusion

We have presenteda new approachto belief aggregation.We believe that we cannot formulate that problem pre-cisely or measuresuccessof different techniqueswithoutansweringquestionsaboutthe way in which sources’be-liefs were formulated. We argued that a framework inwhich thesourcesareassumedto have learnedtheir distri-butionsfrom datais bothintuitivelyplausibleandleadsto averynaturalformulationof theoptimalDM distribution—onewhich would be learnedfrom the combineddatasets— anda naturalsuccessmeasure— a distancefrom thegenerating,‘true’ distribution.

Basedon the observation that LinOP is the appropriateoperatorfor this framework if sourcesandDM areMLElearners,we presenteda LinOP-basedalgorithmto aggre-gatebeliefsrepresentedby Bayesiannetworks.Ourprelim-inary resultsshow thatthis algorithmperformsverywell.

Onedirectionof future work will involve finding waystorelaxthevariousassumptions.For example,wewould liketo extendthe framework to allow for continuousvariablesandto allow for dependencebetweensources’samplesets.

In ourframework, theDM completelyignoressources’pri-ors. This maybeappropriateif thepriorsareknown to beunreliableor uninformative. However, the priors usedinrealapplicationsareofteninformativein andof themselves.Thus,a seconddirectionwill involvefinding valid waysoftakingadvantageof sources’priors to improve thequalityof the aggregation. For example,if sourcesuseDirichletpriorsandtheDM truststheir estimatedsamplesizes,shemaychoseto incorporatetheminto herestimateof � .

Acknowledgements

PedritoMaynard-ReidII waspartially supportedby a Na-tional PhysicalScienceConsortiumFellowship. UrszulaChajewska was supported by the Air Force contractF30602-00-2-0598underDARPA’sTASK program.

References

[BSCC89] I. Beinlich, G. Suermondt, R. Chavez, andG. Cooper. The ALARM monitoring system. In

Proc.EuropeanConf. onAI andMedicine, 1989.

[Bun91] W. Buntine. Theory refinementon bayesiannet-works. In Proc.UAI’91, pages52–60,1991.

[CW99] R. T. ClemenandR. L. Winkler. Combiningproba-bility distributionsfrom expertsin risk analysis.RiskAnalysis, 19(2):187–203,1999.

[FG97] N. FriedmanandM. Goldsmidt.Sequentialupdateofbayesiannetwork structure.In Proc. UAI’97, pages165–174,1997.

[FH96] N. FriedmanandJ. Y. Halpern. Belief revision: Acritique. In Proc.KR’96, pages421–431,1996.

[GZ86] C. GenestandJ. V. Zidek. Combiningprobabilitydistributions:A critiqueandanannotatedbibliogra-phy. StatisticalScience, 1(1):114–148,1986.

[Hec96] D. Heckerman. A tutorial on learningbayesiannet-works. TechnicalReportMSR-TR-95-06,MicrosoftResearch,1996.

[HHN92] D. Heckerman,E.Horvitz,andB. Nathwani.Towardnormative expert systems: Part I. The Pathfinderproject.Methodsof Informationin Medicine, 31:90–105,1992.

[Kul59] S.Kullback. InformationTheoryandStatistics. Wi-ley, 1959.

[LB94] W. Lam andF. Bacchus. Learningbayesianbeliefnetworks: An approachbasedon the mdl principle.ComputationalIntelligence, 10:269–293,1994.

[LS88] S.L. LauritzenandD. J.Spiegelhalter. Localcompu-tationswith probabilitiesongraphicalstructuresandtheir applicationto expert systems.In J. RoyalSta-tistical Society, SeriesB (Methodological), volume50(2),pages157–224,1988.

[MA92] I. Matzkevich andB. Abramson.Thetopologicalfu-sionof Bayesnets.In Proc.UAI’92, pages191–198,1992.

[MA93] I. Matzkevich andB. Abramson. Somecomplexityconsiderationsin thecombinationof beliefnetworks.In Proc.UAI’93, pages152–158,1993.

[Mor74] P. A. Morris. Decisionanalysisexpertuse.Manage-mentScience, 20:1233–1241,1974.

[Mor77] P. A. Morris. Combining expert judgements: Abayesianapproach. ManagementScience, 23:679–693,1977.

[Mor83] P. A. Morris. An axiomaticapproachto expert reso-lution. ManagementScience, 29(1):24–32,1983.

[Pea88] J. Pearl. ProbabilisticReasoningin Intelligent Sys-tems. MorganKaufmann,1988.

[PMGH00] D. M. Pennock,P. Maynard-ReidII, C. L. Giles,andE. Horvitz. A normative examinationof ensemblelearningalgorithms. In Proc. ICML’00, pages735–742,2000.

[PW99] D. M. PennockandM. P. Wellman. Graphicalrep-resentationsof consensusbelief. In Proc. UAI’99,pages531–540,1999.

[Sto61] M. Stone.Theopinionpool.Annalsof MathematicalStatistics, 32(4):1339–1342,1961.

[Win68] Robert L. Winkler. The consensusof subjec-tive probabilitydistributions. ManagementScience,15(2):B61–B75,October1968.

183