
Pergamon

Socio-Econ. Plann. Sci. Vol. 28, No. 3, pp. 147-165, 1994
Copyright © 1994 Elsevier Science Ltd
0038-0121(94)00009-3
Printed in Great Britain. All rights reserved
0038-0121/94 $7.00 + 0.00

Scheduling as a Multi-person, Multi-period Decision Problem

ANN VAN ACKERE, London Business School, Regent's Park, London NW1 4SA, U.K.

Abstract--We look at the issue of scheduling several jobs sequentially in a single facility as a multi-person decision problem, by making the assumption that a job cannot be performed unless some third party (labelled agent) is present. We introduce private information by assuming that the scheduler does not know the agent's valuation of time and extend the model by allowing the agent's private information to change over time. We conclude that reputation building takes place when the agent and the scheduler interact repeatedly.

INTRODUCTION

In the literature scheduling is generally considered to be a single-person decision problem: someone selects a schedule and this schedule is then implemented. In many practical situations, scheduling turns out to be a multi-person issue--one (or more) individuals make up the schedule, but this schedule can only be implemented to the extent that the people concerned agree to it. In other words, it is not sufficient to select a schedule that is optimal with respect to the criteria being considered (e.g. minimal costs or maximum utilization); it is also necessary to motivate the people involved in the implementation process to respect it. We use the term optimal schedule to denote the schedule that optimizes the scheduler's objective, cost minimization in this model. More generally, a scheduler may be trading off several objectives, or he may be aiming for a robust schedule; i.e. a schedule that is expected to perform well under a wide variety of circumstances.

It is also generally assumed that the scheduling problem occurs only once. But, in many instances, similar situations, involving the same individuals, arise repeatedly. A decision that is optimal in a once-only situation may no longer be appropriate in a long-term relationship.

Consider the following scenario. A hospital operating room's schedule for the next day has been set. A new patient is admitted to the hospital. After a preliminary examination, it becomes clear that this patient will need surgery the following day. The operating room scheduling office is contacted and asked to add this case to the next day's schedule. What starting time should be selected? Individuals involved in this decision include, among others, the scheduler, the patient and the surgeon. In this example, it is likely that the patient will be present at the scheduled starting time. On the other hand, the surgeon may arrive late, either due to unexpected events (e.g. an emergency call) or because he† expects to have to wait for the operating room to become available.

The latter is expected to occur in cases of severe capacity shortage. The scheduler aims for maximum utilization, which, in the presence of uncertainty, inevitably leads to being behind schedule. A surgeon faced with this form of institutionalized delay may look at the schedule and elect to arrive at what he believes to be a more realistic starting time.

This situation can be formalized as follows. There is one facility available (e.g. an operating room) to perform a specific kind of job. A number of jobs of unknown duration have been previously scheduled, to be carried out sequentially. A request arrives to schedule an additional job. The question we intend to address here is: given the existing schedule, what starting time should be selected for this additional request? The trade-off is between keeping the facility idle and running behind schedule. This situation is repeated over time.

Scheduling dentist office hours also fits this framework. A dentist has scheduled a number of patients for the next day. Another patient calls, asking for an appointment. Note that in this case the facility is a person, namely, the dentist. The earlier the time selected by the dentist, the more likely it is that the patient will have to wait. On the other hand, if the dentist selects a later time, he may have to wait for the patient to arrive.

†We use a single gender here for purposes of simplicity.

The dentist may also be faced with the choice of referring a patient to another dentist because his schedule is full (which may lead to the loss of this patient), vs attempting to "squeeze him in," leading to a delay for patients scheduled later in the day. This may, in turn, cause dissatisfaction and lead to the loss of other patients. In this regard, we have shown in earlier work [5] that a patient's inability to commit to arriving on time may induce the dentist to select a scheduling system that makes both him and the patient worse off.

The purpose of the current work is to illustrate the impact of the human element on scheduling in a service context. This is done in a very stylized way, by assuming that all individuals minimize their expected costs. We focus on examples in the health sector, as this is an area where resources (e.g. operating rooms, practitioners) are expensive and in limited supply. It is therefore natural to aim for a high level of utilization. But this often occurs at the expense of other parties involved, who may refuse to cooperate. One such occurrence is the scheduling of patients by some general practitioners (GPs) in the U.K. It is quite common for more than one patient to be given the same appointment time at the start of the day, to avoid the GP having to wait should a patient arrive late. Another common occurrence is for the first patient to be scheduled half an hour before the GP's expected arrival time. Patients have little recourse, as their choice of GP is very limited unless they can afford a private GP (which is typically not covered by private medical insurance).

The situation is quite different in the private sector. For non-urgent care (e.g. a flu jab or an annual check-up at the dentist), patients have the option to walk out if they are kept waiting too long. If this occurs repeatedly, several outcomes are possible: the provider may modify his scheduling habits to reduce waiting times, the patient may switch permanently to another provider, or he may accept the situation. Whether or not the provider modifies his behavior depends on his expectations regarding the patients' reaction, and whether he can afford to lose some of his patients.

Throughout this paper, we will use the following terminology. The scheduler will be referred to as the principal. The person involved in the scheduling issue and who may arrive late is called the agent. Any other individuals who need to be present are lumped together in one group, labelled "others". Table 1 summarizes this terminology for the two examples.

We model explicitly the behavior of the principal and the agent. The principal selects a starting time. The agent decides when to arrive. Their incentives are represented by several cost parameters, which are discussed in the next section. Basically, the principal trades off idle time of the facility against waiting time of the agent, while the agent trades off having to wait against being late. We assume that each player attempts to minimize his expected costs; i.e. we assume risk neutrality. This assumption is made for reasons of analytical tractability.

In Ref. [6] we look at a similar situation, making the assumption that all players know each other's cost parameters. Here, we extend this model in three ways:

1. We assume that the principal does not know the agent's cost parameters (private information);
2. We look at a multi-period model; i.e. the same situation is assumed to arise repeatedly between the same players; and
3. We allow the agent's cost parameters to change over time.

The results of the one-period model illustrate that the scheduler must take the agent's incentives into account: the best schedule is useless unless it is respected. Introducing private information shows how uncertainty can affect the scheduler's decision. Finally, the extension to multiple periods illustrates how what is optimal in a once-only interaction can differ from what is optimal in a long-term relationship, especially if the time valuation of the agent can change over time.

Table 1. Examples

            Hospital operating room    Dentist's practice
Facility    Operating room             Dentist
Principal   Manager, nurse             Dentist or his assistant
Agent       Surgeon                    Patient
Others      Patient, nurses            None

Table 2. The cost types

Cost incurred by    Cause of cost                 Symbol
Principal           Facility idle                 C_P^I
Principal           Keeping the agent waiting     C_P^A
Principal           Keeping the others waiting    C_P^O
Agent               Being late                    C_A^L
Agent               Having to wait                C_A^W

At this stage, a word of caution is in order. The model presented in the next two sections is a very stylized version of reality. The simplifications are required for analytical tractability. It is the qualitative conclusions rather than the technicalities which matter. For this reason, we have relegated all mathematical derivations to the Appendices, and provide an illustrative numerical example in the next to last section. Also, we focus on examples where only two parties play an active role. One could consider more parties, at the cost of increased complexity, but this would not alter the basic conclusion: a selected schedule can only be implemented if all parties agree to it. As noted earlier, we herein focus on scheduling and waiting times, but (as discussed in the concluding section) a similar approach could be used whenever various parties with conflicting interests cooperate to perform a task.

The paper is structured as follows. In the next section, we formalize the model and summarize results for the one-period, no private information case. We then discuss the general model, followed by a brief numerical example. Finally, we summarize the main results and state our conclusions.

THE MODEL

In this section, we first discuss the cost structure of the players. We then formalize the proposed model and summarize results for the one period, no private information case.

The players' costs

The costs incurred by the agent are of two types: costs of having to wait for the facility to become available, and costs of being late. The former consist mainly of the agent's opportunity cost of time, but also include other elements, such as being unable to meet appointments later in the day. Lateness costs include loss of goodwill, the possibility that the facility has been allocated to some other job, loss of the right to perform jobs at that facility, etc. For instance, if a patient arrives late, the doctor may see another patient first. We assume that the agent incurs lateness costs whenever he is late, independent of whether or not the facility is available. This assumption is discussed in detail in Ref. [7].

The principal's goal is to minimize the expected costs incurred by the facility. These costs are of three types, two of which will be modelled explicitly:

(1) Costs of idle time: while the facility is idle, some costs are being incurred. Examples include wages of idle personnel and opportunity costs.

(2) Waiting costs: keeping jobs (which may involve customers) and/or agents waiting causes a loss of goodwill, which may induce a loss of jobs in the long run.

(3) Operating costs, which are independent of the schedule: these will not be modelled explicitly since they are irrelevant for our purposes. Examples include the cost of electrical power and wages incurred while the job is being performed, as well as the cost of supplies.

For analytical tractability, costs are assumed to increase linearly in time. This allows us to express all costs on a per unit basis. The following notational convention is adopted: each cost type is denoted by C_X^Y, where the subscript X refers to the player incurring the cost (A or P) and the superscript Y refers to the type of cost. For example, C_P^I is the cost per time unit (e.g. per hour) incurred by the principal (P) if the facility is idle (I). Table 2 summarizes the notation for the various cost types.

As mentioned earlier, one of our aims is to keep the mathematical side of the modelling exercise as simple as possible, focusing mainly on the qualitative implications. For instance, by focusing specifically on scheduling, we only consider one particular aspect of service quality; namely, waiting time or delay. Still, other elements such as attention paid to waiting patients have a considerable impact on a patient's perceived quality of service. This is especially true in a hospital environment. A supportive, caring attitude of nursing staff can go a long way towards relieving patient anxiety. We basically assume that these other factors are unaffected by the scheduling decision.

The one-period, no private information model

Consider the following situation. A number of jobs have been scheduled sequentially in a single facility. The time until completion of this sequence is unknown and represented by a random variable, denoted X, with cumulative probability distribution G(·). We can assume without loss of generality that the first job of this sequence is scheduled to start at time zero. An agent wishes an additional job to be scheduled after this sequence. As we are interested in determining an optimal starting time for this job, we make the following assumptions:

1. The agent's arrival time is deterministic; i.e. he arrives precisely at the time he selects (this assumption is relaxed in Ref. [6]);
2. The others arrive at the scheduled starting time;
3. A job cannot start until both the agent and the others are present, and the facility is available;
4. The principal and the agent know G(·); and
5. The density function g(·) exists and the inverse cumulative distribution function G^{-1}(x) is single-valued for all x, 0 ≤ x < 1.

To summarize the results for the one-period model in which the principal knows the agent's cost parameters, we need to introduce the following notation and definitions. Let t_A be defined by G(t_A) = max(0, 1 - C_A^L/C_A^W) and let t_P be defined by G(t_P) = (C_P^A + C_P^O)/(C_P^I + C_P^A + C_P^O).

Proposition 1. (a) Given a scheduled starting time t, it is optimal for the agent to arrive at time max(t, t_A); (b) The principal minimizes his expected costs by selecting the starting time max(t_A, t_P).

Proof. See Ref. [6]. This proof is a variation on the classical newsboy model; see, for instance, Ref. [3]. □

This result says that there exists a critical time, denoted t_A, such that the agent will never agree to arrive before t_A. Consequently, if the principal schedules the job to start before this time, the agent arrives at time t_A. Otherwise, he shows up on time. In other words, t_A is the earliest implementable starting time. This implies that it is never optimal for the principal to schedule the job to start before t_A. If the time favored by the principal (t_P) is before t_A, he selects t_A. Otherwise he chooses t_P. We say that a conflict of interest arises whenever t_A > t_P, as this is the situation in which the principal is unable to implement the schedule he favors.
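The two critical times are newsvendor-style fractiles, so they are easy to compute for any distribution with a single-valued inverse c.d.f. The sketch below is an illustration, not taken from the paper: the uniform distribution and all cost rates are hypothetical numbers chosen to produce a conflict of interest.

```python
def critical_times(G_inv, C_A_L, C_A_W, C_P_I, C_P_A, C_P_O):
    """Newsvendor-style fractiles of the one-period model.

    t_A solves G(t_A) = max(0, 1 - C_A^L / C_A^W): the earliest time the
    agent will agree to arrive.
    t_P solves G(t_P) = (C_P^A + C_P^O) / (C_P^I + C_P^A + C_P^O): the
    principal's unconstrained optimum.
    """
    t_A = G_inv(max(0.0, 1.0 - C_A_L / C_A_W))
    t_P = G_inv((C_P_A + C_P_O) / (C_P_I + C_P_A + C_P_O))
    return t_A, t_P

# Hypothetical numbers: X (time until the earlier jobs finish) uniform
# on [0, 8] hours, so G^{-1}(q) = 8q; all cost rates are per hour.
G_inv = lambda q: 8.0 * q
t_A, t_P = critical_times(G_inv, C_A_L=10.0, C_A_W=40.0,
                          C_P_I=60.0, C_P_A=20.0, C_P_O=10.0)
start = max(t_A, t_P)   # Proposition 1(b): the implementable optimum
print(t_A, t_P, start)  # 6.0, ~2.67, 6.0: a conflict of interest (t_A > t_P)
```

With these numbers the agent's high waiting cost pushes t_A well past the principal's preferred fractile, so the principal is forced up to t_A.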

A REPEATED GAME WITH PRIVATE INFORMATION

In this section we generalize the proposed model by considering a multi-period scenario where only the agent knows his costs, and these costs may change over time. We first discuss how these generalizations are incorporated into the model, and then determine the optimal strategies for both players.

Model description

We now drop the assumption that the principal knows the agent's costs; in other words, the agent has private information. We consider the following specific situation: the agent is either strong or weak. He is strong if his waiting costs are large compared to his lateness costs. He is weak if his waiting costs are low compared to his lateness costs. Let C̄_A^L (C̲_A^L) and C̄_A^W (C̲_A^W) denote the strong (weak) agent's lateness and waiting costs. Let t̄_A (t̲_A) denote the strong (weak) agent's earliest arrival time; i.e.

G(t̄_A) = max(0, 1 - C̄_A^L/C̄_A^W) and G(t̲_A) = max(0, 1 - C̲_A^L/C̲_A^W).

Define c̄ = C̄_A^L/C̄_A^W and c̲ = C̲_A^L/C̲_A^W. The definition of weak and strong implies that c̄ < c̲. If we add the requirement c̄ < 1 (i.e. the strong agent's lateness cost is strictly less than his waiting cost), we obtain that t̄_A > t̲_A. Denote t̄ = max(t̄_A, t_P). This is the starting time the principal would select if he knew with certainty that the agent is strong. Similarly, t̲ = max(t̲_A, t_P) is the starting time the principal would select if he knew with certainty that the agent is weak. We assume that t̄_A > t_P; i.e. if the principal is convinced that the agent is strong, then a conflict of interests arises. This implies that t̄ = t̄_A and t̄ > t̲. The principal does not know to which type this specific agent belongs. Formally, we assume that the principal assesses a probability δ that the agent is strong.

If the principal knew that the agent was strong (i.e. δ = 1), he would select t̄, while if he knew that the agent was weak (δ = 0), he would select t̲. It will never be optimal to select a time before t̲ or after t̄. It is intuitively clear that for sufficiently high values of δ (i.e. the principal is reasonably certain that the agent is strong) the principal will select t̄ and, similarly, for sufficiently low values of δ, he will select t̲. For intermediate values of δ, a time between t̲ and t̄ is optimal. (Details of this result can be found in Ref. [7].)
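The role of δ can be illustrated numerically. The sketch below is our own illustration with hypothetical numbers (X uniform on [0, 8] hours, per-hour cost rates); it restricts the principal to the two candidate times t̲ and t̄, uses the model's idle/waiting/lateness cost structure, and computes the belief at which the principal switches from t̲ to t̄.

```python
# One-shot comparison of the two candidate start times for a principal who
# believes the agent is strong with probability delta. All numbers below
# are hypothetical.
HI = 8.0                                 # X ~ Uniform[0, HI]
C_P_I, C_P_A, C_P_O = 60.0, 20.0, 10.0   # idle, agent-waiting, others-waiting
c_strong, c_weak = 0.25, 0.75            # lateness/waiting ratios of the two types

def e_wait(t):   # E(X - t)^+ for X ~ Uniform[0, HI]
    return (HI - t) ** 2 / (2.0 * HI)

def e_idle(t):   # E(t - X)^+
    return t ** 2 / (2.0 * HI)

t_bar = HI * (1.0 - c_strong)                         # strong agent's earliest time
t_P = HI * (C_P_A + C_P_O) / (C_P_I + C_P_A + C_P_O)  # principal's fractile
t_low = max(HI * (1.0 - c_weak), t_P)                 # the "weak" start time

def principal_cost(scheduled, arrival):
    # idle time + others waiting while the agent is late + waiting for X
    return (C_P_I * e_idle(arrival)
            + C_P_O * (arrival - scheduled)
            + (C_P_A + C_P_O) * e_wait(arrival))

cost_high = principal_cost(t_bar, t_bar)   # both types show up at t-bar
def cost_low(delta):                       # strong mimics t-bar, weak complies
    return (delta * principal_cost(t_low, t_bar)
            + (1.0 - delta) * principal_cost(t_low, t_low))

# Indifference belief: above it the principal prefers the later start time.
delta_star = (cost_high - cost_low(0.0)) / (cost_low(1.0) - cost_low(0.0))
print(t_bar, t_low, delta_star)
```

With these particular numbers the switch occurs around δ ≈ 0.65; the exact cutoff is, of course, an artifact of the assumed cost rates.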

In many practical situations, the scheduler may only have a few discrete alternatives. For instance, in the operating room example, if room time is billed per 15 min period, it is logical to schedule time periods that are multiples of 15 min. From here on, we thus restrict the principal's choice to either t̲ or t̄. This assumption considerably simplifies the model and enables us to look at repeated games.

The problem faced by the principal and the agent can be represented by the decision tree shown in the left part of Fig. 1. The principal initially selects either t̄ or t̲. If he selects t̄, the agent will show up at t̄, independent of whether he is weak or strong, because max(t̄, t̄_A) = max(t̄, t̲_A) = t̄. This implies that we can, without loss of generality, delete from the decision tree the branch consisting of the agent choosing t̲ following the principal's choice of t̄. If the principal selects t̲, the agent chooses between t̄ and t̲. The expected cost matrix is shown in Table 3. The first action denotes the principal's decision, while the second action is the agent's choice.

Taking the negative of the cost matrix, adding a constant to each column and normalizing appropriately yields the payoff matrix shown on the right in Fig. 1:

Actions (P, A)    Strong agent    Weak agent    Principal
(t̄, t̄)            c               a             0
(t̲, t̄)            0               -1            b - 1
(t̲, t̲)            -1              0             b

Fig. 1. A discrete model.

Table 3. Expected costs in the discrete game

Actions (P, A)    Strong agent                            Weak agent                              Principal
(t̄, t̄)            C̄_A^W E(X - t̄)^+                        C̲_A^W E(X - t̄)^+                        C_P^I E(t̄ - X)^+ + (C_P^A + C_P^O) E(X - t̄)^+
(t̲, t̲)            C̄_A^W E(X - t̲)^+                        C̲_A^W E(X - t̲)^+                        C_P^I E(t̲ - X)^+ + (C_P^A + C_P^O) E(X - t̲)^+
(t̲, t̄)            C̄_A^L (t̄ - t̲) + C̄_A^W E(X - t̄)^+       C̲_A^L (t̄ - t̲) + C̲_A^W E(X - t̄)^+       C_P^I E(t̄ - X)^+ + C_P^O (t̄ - t̲) + (C_P^A + C_P^O) E(X - t̄)^+

Explicit expressions for a, b and c are given in Appendix A. It follows from the computations that 0 < b < 1. We make the additional assumption that a > 1, as this considerably simplifies the analysis in the repeated version of this game. A necessary and sufficient condition for this assumption to hold is:

[E(X - t̲)^+ - E(X - t̄)^+] / (t̄ - t̲) > c̲/2.    (1)

This condition requires that the two types of agents should not be too different. How restrictive is this assumption? Consider the case where X is distributed uniformly over the interval [0, 2m] (where m is the expected value of X), and t_P < t̲_A. Equation (1) reduces to c̄ > 0, which is satisfied whenever C̄_A^L > 0; i.e. the strong agent has a non-zero lateness cost. If t_P > t̲_A, equation (1) reduces to:

c̲ - c̄ < C_P^I / (C_P^I + C_P^A + C_P^O);

i.e. an upper limit on the difference between the two types. Next, consider the case where X is distributed exponentially with mean λ and t_P < t̲_A. Then equation (1) reduces to:

2(c̲ - c̄) > c̲ ln(c̲/c̄).

This inequality is satisfied for c̲/c̄ ∈ (1, 4.92]. Remember that c̲ > c̄ by assumption, implying that c̲/c̄ > 1. Rounding off 4.92 to 5, this condition reduces to c̲ < 5c̄. A sufficient, but not necessary, condition for this to hold is c̄ > 0.2; i.e. the probability that the strong agent must wait if he arrives at time t̄_A is at least 20%.

An alternative interpretation is that in any one period, the benefit to the weak agent of inducing the principal to select t̄ (payoff a) exceeds the cost of pretending that he is strong (opting for strategy t̲t̄ rather than t̲t̲, and so getting payoff -1 rather than 0).
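The exponential-case bound quoted above, c̲/c̄ ∈ (1, 4.92], can be checked numerically. The sketch below is our own illustration, not the paper's derivation: it locates the largest ratio r = c̲/c̄ for which 2(r - 1) > r ln r holds, by bisection on the boundary equation.

```python
import math

# Condition (1) for X exponential and t_P < t_A (weak):
# 2*(r - 1) > r*ln(r), where r = c_weak / c_strong > 1.
def holds(r):
    return 2.0 * (r - 1.0) > r * math.log(r)

# Bisect on the boundary 2(r - 1) = r ln r; the condition holds on (1, lo]
# and fails beyond it.
lo, hi = 1.0, 10.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if holds(mid):
        lo = mid
    else:
        hi = mid
print(round(lo, 2))   # 4.92, matching the bound quoted in the text
```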

As mentioned above, we assume that the scheduling problem is repeated over time; that is, the game depicted in Fig. 1 is played N times in a row between the same two players. This creates an incentive for the agent to build a reputation of belonging to the strong type, thus inducing the principal to select t̄ rather than t̲. We assume that the agent knows his initial type at the start of the game, but this type can change over time. As an example, we consider in detail the case where the agent's type can evolve from weak to strong as time goes by. Between any two repetitions there is an external shock such that, if the agent is weak before the shock, there is a small probability s that he becomes strong. If he is strong before the shock, he remains strong. This shock is observable to the agent, but not to the principal.

To illustrate the underlying idea, consider the operating room example. At the beginning of his career, a surgeon is very likely to be of the weak type. As time goes on, he becomes more experienced, develops a private practice etc. Consequently, his opportunity cost of time increases relative to his lateness cost. In other words, he moves closer to the strong type. It is clear that in reality this is a gradual process and our formulation, implying that this switch happens overnight, is a very crude approximation.

As another example, consider the case where the agent is the patient. The term weak may characterize patients with a poor health record and limited health insurance, while the strong agent benefits from a well paid job that includes health insurance as a perk. Getting a job may change the patient's status from weak to strong, while being made redundant results in the opposite transition. In this specific example, the introduction of a national health insurance scheme may also


affect patients' relative strength, although the British experience indicates that this certainly does not eradicate all differences.

Non-economic differences may also play a role. For instance, a teaching hospital may have a stronger incentive to retain a sought-after specialist, and therefore become more inclined to satisfy his requests as his reputation grows. As mentioned, in this model we illustrate the impact of a type change by focusing on a transition from weak to strong. One could also consider more complex patterns of change, but this would add another level of complexity to the model.

It is worth noting that in the special case where s = 0 (i.e. the agent's type does not change over time) the model's structure is similar to that of the model analysed in Kreps and Wilson [2], and their results apply. Kreps and Wilson (hereafter referred to as KW) consider a monopolist who faces either a sequence of N entrants or an entrant with N entry opportunities. In each round, the entrant chooses between entering and staying out. If the entrant decides to enter, the monopolist chooses between fighting and acquiescing. Associating the entrant with N entry opportunities with our principal, the monopolist with our agent, entering (staying out) with playing t̲ (t̄), and fighting (acquiescing) with playing t̄ (t̲), it is clear that these two situations are different illustrations of the same game. The only difference lies in the slightly more general structure of our payoff matrix: KW require that a = c. Allowing a ≠ c does not modify the equilibrium.

The notion of equilibrium that we will use in this section is that of sequential equilibrium, as defined in Ref. [1]. The following concise definition of this notion is given in Ref. [2], p. 257: "There are three basic parts to the definition of a sequential equilibrium: (a) whenever a player must choose an action, that player has some probability assessment over the nodes in his information set, reflecting what that player believes has happened so far; (b) these assessments are consistent with the hypothesized equilibrium strategy. For example, they satisfy Bayes' rule whenever it applies; and (c) starting from every information set, the player whose turn it is to move is using a strategy that is optimal for the remainder of the game against the hypothesized future moves of his opponent (given by the strategies) and the assessment of past moves by other players and by 'nature' (given by the assessment over nodes in the information set)..." Note that part (c) of the definition must hold for every information set, even those that are reached with probability zero. This implies that a player is always willing to carry out his strategy, even at nodes off the equilibrium path. Because of this requirement, this notion of equilibrium is more restrictive than the traditional Nash concept.

The optimal strategies

We first describe how the principal updates his beliefs with respect to the agent's type, and then state the strategies for the various players. We follow this with a specification of our assumptions on the magnitude of s and then state our main results.

Updating the principal's beliefs

Let P_n denote the principal's assessment, at stage n, of the probability that the agent is strong (stages are numbered by the number of periods remaining, so play starts at stage N).

(a) P_N = δ.
(b) If the principal plays t̄ at stage n + 1, then P_n = s + (1 - s)P_{n+1}.
(c) If the principal plays t̲ at stage n + 1, the agent plays t̄ and P_{n+1} > 0, then

P_n = max[P*_n, s + (1 - s)P_{n+1}],

where P*_1 = b and, for n ≥ 2,

P*_n = ((P*_{n-1} - s)/(1 - s)) b.    (2)

(d) If the principal plays t̲ at stage n + 1, and either the agent plays t̲ or P_{n+1} = 0, then P_n = s.
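The critical beliefs of equation (2) and the updating rules (a)-(d) are straightforward to implement. The sketch below is a minimal illustration; the stage indexing follows the convention above, and the move labels 'high' (for t̄) and 'low' (for t̲) are ours, introduced for readability.

```python
def p_star(n, b, s):
    """Critical beliefs of eq. (2): P*_1 = b, P*_n = ((P*_{n-1} - s)/(1 - s)) * b."""
    p = b
    for _ in range(n - 1):
        p = (p - s) / (1.0 - s) * b
    return p

def update_belief(P_next, n, principal_move, agent_move, b, s):
    """P_n as a function of P_{n+1} and the moves observed at stage n+1.

    'high' stands for t-bar, 'low' for t-underline.
    """
    if principal_move == 'high':                # rule (b): no information, shock only
        return s + (1.0 - s) * P_next
    if agent_move == 'high' and P_next > 0.0:   # rule (c): agent mimicked the strong type
        return max(p_star(n, b, s), s + (1.0 - s) * P_next)
    return s                                    # rule (d): agent revealed weakness, then shock
```

Note how rule (d) captures the external shock: even an agent known to be weak is assessed a probability s of having just become strong.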

Strategy of the principal

(a) If P_n > P*_n, then the principal plays t̄.
(b) If P_n = P*_n, then the principal plays t̄ with probability R^P_n and t̲ with the complementary probability, where:

R^P_1 = 1/(sc + (1 - s)a)

and

R^P_{n+1} = 1/(sc + (1 - s)a - sc Σ_{l=1}^{n} (-1)^{l+1} Π_{j=n-l+1}^{n} R^P_j)    (3)

for n ≥ 1.
(c) If P_n < P*_n, then the principal plays t̲.

Strategy of the agent

(a) If the agent is of the strong type, he always plays t̄.
(b) If the principal plays t̲ at stage 1, then the weak agent plays t̲.
(c) If the principal plays t̲ at stage n > 1 and P_n > (P*_{n-1} - s)/(1 - s), then the weak agent plays t̄.
(d) If the principal plays t̲ at stage n > 1 and 0 ≤ P_n ≤ (P*_{n-1} - s)/(1 - s), then the weak agent plays t̄ with probability R^A_n(P_n) and t̲ with the complementary probability, where:

R^A_n(P_n) = P_n(1 - P*_{n-1}) / ((1 - P_n)(P*_{n-1} - s))    (4)

for n > 1.

Note that if P_n = 0, then R^A_n(0) = 0, and if s + (1 - s)P_n = P*_{n-1}, then R^A_n(P_n) = 1.
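The two boundary properties of equation (4) just noted can be checked directly: the mimicking probability vanishes at P_n = 0 and equals one at the belief where s + (1 - s)P_n = P*_{n-1}. A small sketch with hypothetical parameter values:

```python
def r_agent(P_n, p_star_prev, s):
    """Eq. (4): probability that the weak agent plays t-bar after the
    principal has played t-underline (p_star_prev is P*_{n-1})."""
    return P_n * (1.0 - p_star_prev) / ((1.0 - P_n) * (p_star_prev - s))

s, p_prev = 0.1, 0.4                 # hypothetical shock and critical belief
boundary = (p_prev - s) / (1.0 - s)  # belief at which s + (1-s)P_n = P*_{n-1}
print(r_agent(0.0, p_prev, s))       # 0.0: a zero belief is never rebuilt by mimicking
print(r_agent(boundary, p_prev, s))  # 1.0 up to rounding: certain mimicking at the boundary
```

This is exactly the Bayes'-rule consistency behind rule (c) above: mixing at rate R^A_n makes the posterior after an observed t̄, followed by the shock, land on P*_{n-1}.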

Assumption. s < min(P*_{N-1}, 1 - 1/a).

The requirement that s < 1 - 1/a ensures that the probabilities R^P_n are well defined. This is a sufficient, though not necessary, condition. The assumption s < P*_{N-1} implies that s < P*_n for all n ≤ N - 1. Looking at the strategy of the principal and at how he updates his beliefs clarifies this assumption: we require that the possibility of a single external shock does not provide the principal with sufficient evidence that the agent is strong to induce him to play t̄. Consider the following situation. At stage n + 1, the principal plays t̲. The agent replies with t̲. Consequently, the principal knows that the agent is weak. Now the external shock occurs, and the principal assesses a probability P_n = s that the agent is strong. If s > P*_n, this would induce the principal to play t̄. For the case s = P*_n, he would play t̄ with positive probability.

Proposition 2. The strategies and beliefs stated above constitute a sequential equilibrium.

Proof. The proof consists of two main parts: (1) checking that the principal's beliefs are consistent with his strategy (i.e. Bayes' rule is satisfied whenever it applies); and (2) checking the optimality of the strategies for (2a) the principal, (2b) the strong agent, and (2c) the weak agent. As the proof (especially part 2) is long and tedious, details are relegated to Appendix B. □

What can we conclude from these propositions? As mentioned above, if we let s = 0 and a = c, the result is identical to the KW result. It is interesting to compare the strategies of the players for the case s = 0 (referred to as the constant type model) and the case s > 0 (referred to as the variable type model). First, note that P*_1 = b, while for n > 1, P*_n < b^n. This implies that if the agent can become strong, the principal will play t̄ for smaller values of P_n than if the agent is of a constant type. An intuitive argument for this is as follows. Assume that at some stage n, the principal's assessment is P_n. If no more information is received, this assessment will remain unchanged in the constant type model, but will increase in the variable type model. This implies that, although at stage n the assessments are the same, the probability that the agent will be strong at some future point in time is higher in the variable type model. Consequently, the principal tends to "back off" and play t̄.

For n > 1, R^w_n(P_n) > P_n(1 - b^{n-1})/[(1 - P_n)b^{n-1}], with equality if s = 0. Therefore, when the principal plays _t, the weak agent will play t̄ with higher probability in the variable type model, for the same assessment P_n. Again, this can be explained intuitively. In the variable type model, the number of periods that the weak agent will have to play t̄ and incur a loss in order to convince the principal that he is strong is smaller than in the constant type model, for two reasons: (1) the principal will play t̄ for a lower value of P_n, and (2) once the agent becomes strong, playing t̄ does not cause him to incur a loss any longer.

Scheduling as a multi-person, multi-period decision problem 155

Now, let us look at the probability with which the principal plays t̄ when he is indifferent between his two alternatives. R^P_n is increasing in n; i.e. the larger the number of remaining periods, the higher the probability that the principal plays t̄. In the constant type model, this probability is independent of n. This difference occurs because, in the variable type model, the higher the number of remaining periods, the higher the probability that the weak agent becomes strong, and thus the lower the probability of future gains (when the agent plays _t) to offset the short run losses (when the agent plays t̄).

Consider the case a = c. Then R^P_1 = 1/a, and for n > 1 we have R^P_n > 1/a, with equality if s = 0. This means that the principal is more likely to play t̄ in the variable type model. The intuition is similar to the one for P*_n < b^n. If a ≠ c, the relative magnitude of R^P_n and 1/a depends on the magnitudes of a and c. If c ≥ a, then R^P_n ≥ 1/a. For c < a, this is not necessarily the case, because c < a means that the weak agent has more to gain by inducing the principal to play t̄ than the strong agent, and thus a higher incentive to play t̄. By playing _t with a higher probability whenever he is indifferent between his two alternatives, the principal reduces the weak agent's incentive to play t̄.

Next, we turn to the uniqueness issue. The suggested equilibrium is not unique, for reasons similar to those discussed in Ref. [2], pp. 262-264. Before stating our result, we need to introduce the concept of plausible beliefs, as defined by Ref. [2], p. 263.

Definition. The beliefs {P_n} of the principal are said to be plausible if, for any two histories of play, h_n and h'_n, such that h_n and h'_n are the same except that at one or more stages the agent played t̄ in h_n and _t in h'_n, the beliefs satisfy P_n(h_n) ≥ P_n(h'_n).

The intuition is that beliefs are only considered plausible if the principal interprets an agent playing t̄ as an indication that the agent is likely to be of the strong type. In other words, whenever the principal observes an agent playing t̄, he should either revise his beliefs upwards or keep them unchanged. Similarly, observing _t should induce the principal to revise his assessment downwards or to keep it unchanged. The beliefs of the equilibrium given in Proposition 2 are plausible.

Recall that δ = P_N is the principal's initial assessment that the agent is strong. Define δ_N = δ and, for 1 ≤ n < N, let δ_n = s + (1 - s)δ_{n+1}. This means that if the principal always plays t̄, then at stage n his beliefs will satisfy P_n = δ_n.
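The sequence δ_1, ..., δ_N can be generated directly from this recursion (an illustrative sketch; the function name is ours):

```python
def initial_path_beliefs(delta_N, s, N):
    """delta_N = P_N; delta_n = s + (1-s) delta_{n+1} for n = N-1, ..., 1."""
    d = [delta_N]
    for _ in range(N - 1):
        d.append(s + (1 - s) * d[-1])
    return list(reversed(d))          # [delta_1, ..., delta_N]
```

For s > 0 the sequence rises towards 1 as the horizon shortens, reflecting that an uncontradicted assessment drifts upwards in the variable type model.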

Proposition 3. If δ_n ≠ P*_n for all n, then every sequential equilibrium with plausible beliefs has on-the-equilibrium-path strategies as stated in Proposition 2. This implies that every sequential equilibrium with plausible beliefs has the same value function as the sequential equilibrium described in Proposition 2.

Proof. The proof is given in Appendix C. □

Why did we require δ_n ≠ P*_n for all n? Assume that for some m we have δ_m = P*_m. Then P*_m = s + (1 - s)δ_{m+1}. This yields P*_{m+1} < (P*_m - s)/(1 - s) = δ_{m+1} and, inductively, for all n > m, P*_n < δ_n. This implies that at all stages n > m, the principal plays t̄ and, when reaching stage m, he is indifferent between playing _t or t̄. At this point, any randomization will yield a sequential equilibrium, implying that in this case we do not have uniqueness on the equilibrium path.

A NUMERICAL EXAMPLE

In this section we present a simple numerical example to illustrate the implications of the results derived previously. We first apply the one period, one agent type model. Consider the operating room example, and assume that a block of time allotted to one surgeon is expected to end at a certain time τ. Also assume that the actual time at which the operating room will become available follows a uniform distribution on the interval [τ - m, τ + m], where m is some positive number. At what time should the start of the next block be scheduled?

Note that the assumptions about timing are equivalent to assuming that X is uniformly distributed on the interval [0, 2m]. For simplicity, we choose the time unit such that m = 1. Consider the following parameter values: C^A_3 = 2, C^A_4 = 1, C^P = 2, C^O = 1. These values yield t_A = 1; i.e. the agent considers time 1 to be a plausible starting time. If C^P ≤ 3, then t_P ≥ 1. It is optimal for the principal to schedule the next block to start at time t_P ≥ t_A, and no problem arises. What if the principal attaches a much higher cost to the facility being idle, say C^P = 6? In this case, t_P = 2/3; i.e. the scheduler would minimize his expected costs if the agent would agree to show up at time


2/3. In this scenario, his expected costs would equal 2. But the agent does not arrive until time 1, resulting in an expected cost for the principal equal to 8/3 > 2. His second best option is to schedule a starting time t_P = 1, which yields an expected cost equal to 9/4 < 8/3, but larger than 2. This illustrates that it is in the scheduler's interest to take the agent's behavior into account.
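Some of these numbers can be checked directly. For X uniform on [0, 2], E(t - X)^+ = t²/4 and E(X - t)^+ = (2 - t)²/4. The sketch below evaluates a weighted cost w_i·E(t - X)^+ + w_l·E(X - t)^+; the weights (6, 3) are a hypothetical combination, not the paper's explicit cost decomposition, chosen because they reproduce the reported values t_P = 2/3, an expected cost of 2 at t = 2/3, and 9/4 at t = 1:

```python
def expected_cost(t, w_idle, w_late):
    """Weighted expected cost of scheduling at time t when X ~ U[0, 2].

    Uses E(t - X)^+ = t^2/4 and E(X - t)^+ = (2 - t)^2/4 for 0 <= t <= 2.
    """
    return w_idle * t * t / 4.0 + w_late * (2.0 - t) * (2.0 - t) / 4.0

# Grid search for the minimizing start time (illustrative weights, see above).
best_t = min((i / 1000.0 for i in range(2001)),
             key=lambda t: expected_cost(t, 6, 3))
```

The minimizer lands at t ≈ 2/3 with an expected cost of 2, matching the values quoted in the text.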

Next, consider a one period model with two types of agents. Let C^P = 2 and C^O = 1 as before, and let C^s_3 = 3, C^w_3 = C^w_4 = 1, C^s_4 = 3/2. This implies t̄_A = 4/3 > t_P = 1 > t_A = 2/3, t̄ = max(t_P, t̄_A) = 4/3 and _t = max(t_P, t_A) = 1. Applying the expected cost matrix of Table 3 here yields Table 4.

If the principal knew the agent to be weak, he would select _t, the agent would arrive on time, and the principal would achieve his lowest expected cost (3/2). If he knew the agent to be of the strong type, he would select t̄, the agent would arrive on time and the scheduler's expected cost would equal 5/3. But what if the principal is unsure? Letting δ denote the probability that the agent is strong, the principal's expected cost equals 5/3 if he selects t̄, and 2δ + (3/2)(1 - δ) if he selects _t. If he is risk neutral, he will select _t for δ < 1/3 and t̄ for δ > 1/3. If he is risk averse, he may seek to reduce the likelihood of incurring a very high cost at the expense of a higher expected cost. In this case, he may select t̄ for smaller values of δ; i.e. even when he assesses the likelihood of the agent being strong at less than 1/3.
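The risk-neutral comparison can be sketched as follows, using the costs quoted in the text (5/3 for t̄ against either type; 2 against a strong agent and 3/2 against a weak one for _t); the function name is ours:

```python
def principal_expected_cost(delta, play_high):
    """Principal's expected cost given probability delta of a strong agent.

    play_high=True means selecting t-bar (cost 5/3 regardless of type);
    play_high=False means selecting t-underline (2 if strong, 3/2 if weak).
    """
    if play_high:
        return 5.0 / 3.0
    return 2.0 * delta + 1.5 * (1.0 - delta)
```

Setting the two expressions equal gives 3/2 + δ/2 = 5/3, i.e. indifference at δ = 1/3, as stated above.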

Consider, now, how this argument is modified for a repeated game. First assume that the agent's type does not change over time. Consider the situation of a weak agent. If the principal selects t̄, he only incurs a cost of 1/6, while if the principal selects _t and he shows up on time (optimal behavior in the one-period game), he incurs a cost of 3/8. Therefore, if he can convince the principal that he is of the strong type (i.e. that he will not show up before t̄), he gains 3/8 - 1/6 = 5/24 each period. The question he needs to answer is therefore: is it worth incurring a cost of 1/8 for a number of periods (the time required to convince the principal that he is of the strong type) in the hope of reaping a benefit of 5/24 per period in the future? The theorems presented herein state when this behavior is worthwhile for the case of risk-neutral players.
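As a back-of-the-envelope check on this trade-off: with a mimicking cost of 1/8 per period and a benefit of 5/24 per period once the reputation is established, k periods of mimicking pay off over m remaining periods whenever (5/24)m > (1/8)k. The undiscounted sketch below (illustrative only) computes the smallest such m:

```python
def breakeven_horizon(cost_per_period, gain_per_period, mimic_periods):
    """Smallest integer m with gain_per_period * m strictly exceeding the
    total mimicking cost cost_per_period * mimic_periods (no discounting)."""
    m = 1
    while gain_per_period * m <= cost_per_period * mimic_periods:
        m += 1
    return m
```

For instance, a single period of mimicking already pays off after one future period, while two periods of mimicking require at least two.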

What if the agent's type can change? We consider the specific example of when a weak player may become strong. In this case, the principal is more likely to select t̄ than if types were fixed, because he assesses a higher likelihood to the agent being strong at any future point in time, and therefore has more to lose from selecting _t.

A similar argument could be developed for the case of scheduling a dentist's practice. A dentist does not wish to waste time waiting for a patient to arrive. But neither do patients like to wait for the dentist. As the duration of a dentist visit is, by its very nature, quite unpredictable, the dentist needs to strike a balance between his interests and those of his patients. Over time he learns about their punctuality and willingness to wait, and may account for this when setting appointments. Also, as his reputation becomes well established, patients may be more willing to wait.

Next, consider a quite different example: the efforts of the British National Health Service (NHS) to induce a large number of women to test for breast cancer at regular time intervals. It is in the NHS's interest to achieve a high level of utilization of the equipment (C^P high). This requires tight scheduling, causing waiting time for patients. This may cause patients to increase the time between tests, or refuse to have the test altogether (depending on their willingness to wait, C^A_4), thereby defeating the initial purpose of screening a large fraction of the female population (high cost of keeping patients waiting, C^O large). The principal thus needs to trade off these various elements when setting a schedule, which, in this case, amounts to selecting the utilization level of the equipment.

Table 4. A numerical example: expected costs

Actions            Expected costs
P      A           Strong agent   Weak agent   Principal
t̄      t̄             13/16          1/6          5/3
_t     t̄              2/3           1/2           2
_t     _t             3/4           3/8          3/2


4. A. van Ackere. The principal/agent paradigm: its relevance to various functional fields. Eur. J. Opl Res. 70 (1993).
5. A. van Ackere. The impact of conflicting interests on the choice of an appointments system. Belg. J. Opers Res. Statist. Comput. Sci. 31, 97-109 (1992).
6. A. van Ackere. Conflicting interests in the timing of jobs. Mgmt Sci. 36, 970-984 (1990).
7. A. van Ackere. Conflicting interests and private information in scheduling problems. Unpublished Ph.D. Thesis, Graduate School of Business, Stanford University, Stanford, CA (1987).

APPENDIX A

Explicit Expressions for the Payoffs

a = C^w_4{E(X - _t)^+ - E(X - t̄)^+} / [C^w_3(t̄ - _t) - C^w_4{E(X - _t)^+ - E(X - t̄)^+}]

b = [C^P{E(t̄ - X)^+ - E(_t - X)^+} + (C^P + C^O){E(X - t̄)^+ - E(X - _t)^+}] / [C^P{E(t̄ - X)^+ - E(_t - X)^+} + (C^P + C^O){E(X - t̄)^+ - E(X - _t)^+} + C^O(t̄ - _t)]

c = C^s_3(t̄ - _t) / [C^s_4{E(X - _t)^+ - E(X - t̄)^+} - C^s_3(t̄ - _t)]

APPENDIX B

Proof of Proposition 2

The proof consists of two main parts: (1) we check that the principal's beliefs are consistent with his strategy (i.e. we check that Bayes' rule is satisfied whenever it applies); and (2) we check the optimality of the strategies for (2a) the principal, (2b) the strong agent and (2c) the weak agent. Each of these three subparts is preceded by several lemmas which are used in the actual proof. In the proof, the abbreviations A_s and A_w will be used to denote the strong and weak agent, respectively.

(1) Consistency of the updating of beliefs

(1a) If the principal plays t̄, he learns nothing about the agent's type. Only the shock takes place. This implies P_n = P_{n+1} + s(1 - P_{n+1}) = s + (1 - s)P_{n+1}.

(1b) The principal plays _t.

(1b.1) If P_{n+1} > (P*_n - s)/(1 - s), then both A_s and A_w play t̄. Again the principal learns nothing about the agent's type and only the shock takes place. This yields P_n = s + (1 - s)P_{n+1} = max[P*_n, s + (1 - s)P_{n+1}].

(1b.2) If 0 < P_{n+1} ≤ (P*_n - s)/(1 - s), then the principal learns something from the agent's behavior. Incorporating this in his assessment using Bayes' rule yields an updated probability that we denote P'_{n+1}. We still need to incorporate the effect of the shock to obtain P_n.

(1b.2.1) The agent plays t̄. In this case, Bayes' rule implies:

P'_{n+1} = Pr(A_s | t̄) = Pr(t̄ | A_s)Pr(A_s) / [Pr(t̄ | A_s)Pr(A_s) + Pr(t̄ | A_w)Pr(A_w)]
= P_{n+1} / [P_{n+1} + (1 - P_{n+1})R^w_{n+1}(P_{n+1})]
= (P*_n - s)/(1 - s),

where Pr denotes probability. This yields:

P_n = s + (1 - s)P'_{n+1} = P*_n = max[P*_n, s + (1 - s)P_{n+1}].

(1b.2.2) The agent plays _t. In this case, Bayes' rule yields P'_{n+1} = 0, implying P_n = s.

(1b.3) If P_{n+1} = 0 and the agent plays _t, Bayes' rule yields P'_{n+1} = 0, implying P_n = s. If either P_{n+1} ≥ (P*_n - s)/(1 - s) and the agent selects _t, or P_{n+1} = 0 and the agent selects t̄, then Bayes' rule does not apply. Following the same arguments as in KW, p. 267, we arbitrarily set P'_{n+1} = 0


SUMMARY AND CONCLUSIONS

In this paper, we have looked at scheduling as a multi-person decision problem by assuming that some third party (labelled agent) must be present for a job to be performed. We also introduced private information by assuming that the principal does not know the agent's costs. This implies that the scheduler does not know the earliest time at which the agent is willing to arrive. We showed that the principal would either choose the time that is optimal if the agent is strong, or the time that is optimal if the agent is weak, or some time in between these two extremes. The time selected depends on the principal's assessment of the probability that the agent is strong.

We then restricted the principal's choice to only two values: the optimal time if the agent is strong and the optimal time if the agent is weak. In this framework, we were able to look at repeated games. We also extended the model by allowing for type changes: after each repetition there was a small probability that the agent, if weak, became strong. We showed that the agent tries to use the principal's ignorance to build a reputation of being strong in order to induce the principal to select a later starting time in future periods.

We imposed the assumption s < P*_{N-1} in seeking to limit the impact of the probability of a change of type on the principal's beliefs about the agent's type. This assumption can be reinterpreted as imposing an upper bound on N for a given s and b. An interesting extension would be to look at what happens when this assumption is not satisfied; i.e. for a given s and b, let N become increasingly large, but still finite.

What policy implications can we thus draw from this analysis? The numerical example in the previous section illustrates how the resulting schedule depends on the incentives of the various parties. In summary, one can state that the schedule most desirable from an efficiency (e.g. utilization or cost) point of view is useless if the various parties involved are unwilling to cooperate. When drawing up the schedule, it is therefore necessary to account for the incentives and the relative power of the various parties, where these may change over time as a result of changing circumstances. In addition, individuals may be tempted to misrepresent their incentives (e.g. overstate their value of time) in order to obtain a better deal (e.g. the first time slot, with a very low risk of delay).

More generally, our aim was to show that the straightforward application of an optimization technique (e.g. minimizing expected costs) may result in a suboptimal decision when the various parties have conflicting objectives. We illustrated this in the context of scheduling, when one party may want to avoid waiting, while another party may want to maximize utilization. The result can be a schedule that satisfies neither party.

A similar analysis could be applied to a variety of situations where conflicting interests occur. Consider, for instance, the issue of dividing a budget between different social services. Each service has an incentive to overstate its needs (in the same way that the agent has an incentive to overstate his waiting costs) to obtain a "larger share of the pie" (a better schedule). As the budgeting process is repeated each period, needs may change (change of type), while parties may learn more about each other's real needs (identify the others' types). Other examples include an individual's attempt to overstate the seriousness of an illness or the importance of family responsibilities to obtain a preferential appointment.

Although game theory has been used extensively in a variety of areas to analyse problems involving several parties with conflicting interests (see, for instance, Ref. [4]), one finds very few applications to the field of services in general, and to the health care sector in particular. This paper has attempted to illustrate the potential of using a game-theoretic approach to gain some insight into these types of problems.

Acknowledgements—This paper is based on Chapter 7 of the author's Doctoral Dissertation. I am very grateful to my advisor, E. Porteus, to R. Wilson and to the anonymous referees for their useful suggestions.

REFERENCES

1. D. Kreps and R. Wilson. Sequential equilibria. Econometrica 50, 863-894 (1982).
2. D. Kreps and R. Wilson. Reputation and imperfect information. J. Econ. Theory 27, 253-279 (1982).
3. J. O. McClain and L. J. Thomas. Operations Management: Production of Goods and Services. Second Edition. Prentice-Hall, New York (1985).


for both cases, implying that P_n = s. The idea is that: (1) observing _t convinces the principal that the agent is weak at that instant; and (2) if the principal is convinced that the agent is weak (P'_{n+1} = 0), then the agent's behavior will not influence this conviction.

(2) Optimality of the strategies

(2a) Optimality of the principal's strategy. Denote by V^P_n(P_n) the principal's expected cumulative payoff from period n through 1 if he assesses probability P_n that the agent is strong and all players follow the stated strategies. Let V̄^P_n(P_n) [_V^P_n(P_n)] denote the principal's expected cumulative payoff from period n through period 1 if he assesses probability P_n that the agent is strong, plays t̄ (_t) at stage n and from thereon all players follow the stated strategies. Note that:

V^P_n(P_n) = V̄^P_n(P_n) if P_n > P*_n,  and  V^P_n(P_n) = _V^P_n(P_n) if P_n < P*_n.   (5)

Lemma 1. V̄^P_n(P*_n) = _V^P_n(P*_n) = V^P_n(P*_n).

Proof. The proof is by induction on n. If n = 1, then V̄^P_1(P*_1) = V̄^P_1(b) = 0 and _V^P_1(P*_1) = _V^P_1(b) = 0. But V^P_1(P*_1) = R^P_1 V̄^P_1(P*_1) + (1 - R^P_1)_V^P_1(P*_1) = 0. This yields the desired conclusion for n = 1.

Consider n > 1 and assume the lemma holds for n - 1. Using the definition of R^w_n(·), the recursive formula for P*_n, the inductive assumption and some algebra, we obtain:

_V^P_n(P*_n) = [P*_n + (1 - P*_n)R^w_n(P*_n)][b - 1 + V^P_{n-1}(P*_{n-1})] + [1 - P*_n][1 - R^w_n(P*_n)][b + V^P_{n-1}(s)]
= bV^P_{n-1}(P*_{n-1}) + (1 - b)V^P_{n-1}(s)
= [b² + (1 - b)s(1 - s)/(P*_{n-2} - s)][b - 1 + V^P_{n-2}(P*_{n-2})] + [1 - b² - (1 - b)s(1 - s)/(P*_{n-2} - s)][b + V^P_{n-2}(s)]

and

V̄^P_n(P*_n) = V^P_{n-1}[s + (1 - s)P*_n]
= {[s + (1 - s)P*_n] + [1 - s - (1 - s)P*_n]R^w_{n-1}[s + (1 - s)P*_n]}[b - 1 + V^P_{n-2}(P*_{n-2})]
+ {1 - [s + (1 - s)P*_n] - [1 - s - (1 - s)P*_n]R^w_{n-1}[s + (1 - s)P*_n]}[b + V^P_{n-2}(s)].

The last equality holds because s + (1 - s)P*_n < P*_{n-1}, implying that the principal plays _t. Appropriate substitutions and some algebra yield that:

[s + (1 - s)P*_n] + [1 - s - (1 - s)P*_n]R^w_{n-1}[s + (1 - s)P*_n] = b² + (1 - b)s(1 - s)/(P*_{n-2} - s),

implying that V̄^P_n(P*_n) = _V^P_n(P*_n). But:

V^P_n(P*_n) = R^P_n V̄^P_n(P*_n) + (1 - R^P_n)_V^P_n(P*_n).

This yields the desired result. □

Lemma 2. _V^P_n(P_n) = A_n + B_nP_n, where:

A_1 = b,  A_n = b + V^P_{n-1}(s),
B_1 = -1,  B_n = (1 - s)[V^P_{n-1}(P*_{n-1}) - V^P_{n-1}(s) - 1]/(P*_{n-1} - s).

Proof. For n = 1, _V^P_1(P_1) = b - P_1, as required. Consider n > 1. Then:

_V^P_n(P_n) = [P_n + (1 - P_n)R^w_n(P_n)][b - 1 + V^P_{n-1}(P*_{n-1})] + (1 - P_n)[1 - R^w_n(P_n)][b + V^P_{n-1}(s)]
= [b + V^P_{n-1}(s)] + (1 - s)[V^P_{n-1}(P*_{n-1}) - V^P_{n-1}(s) - 1]P_n/(P*_{n-1} - s),


where the second equality follows by substituting for R^w_n(P_n) the expression given in equation (4) and rearranging the terms. □
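Lemmas 1 and 2 can be checked numerically. The sketch below is illustrative only: it assumes the threshold recursion P*_n = b(P*_{n-1} - s)/(1 - s) (the form consistent with P*_1 = b and with the identity P*_n(1 - s)/(P*_{n-1} - s) = b used in the optimality argument below), iterates A_n = b + A_{n-1} + B_{n-1}s and B_n = (1 - s)B_{n-1} - (1 - s)/(P*_{n-1} - s), which follow from Lemma 2, and then verifies the equality of the two value functions at P*_n asserted by Lemma 1:

```python
def check_lemma1(b=0.5, s=0.02, N=4):
    """Illustrative check that Vbar^P_n(P*_n) = V_^P_n(P*_n) for n = 2..N."""
    p_star = [b]                               # P*_1 = b
    for _ in range(N - 1):                     # assumed recursion (see lead-in)
        p_star.append(b * (p_star[-1] - s) / (1 - s))
    A, B = b, -1.0                             # Lemma 2: V_^P_1(P) = b - P
    for n in range(2, N + 1):
        p_n, p_prev = p_star[n - 1], p_star[n - 2]
        v_bar = A + B * (s + (1 - s) * p_n)    # play t-bar: V^P_{n-1}[s+(1-s)P*_n]
        A, B = b + A + B * s, (1 - s) * B - (1 - s) / (p_prev - s)
        v_low = A + B * p_n                    # play t-underline: Lemma 2 at stage n
        if abs(v_bar - v_low) > 1e-9:
            return False
    return True
```

Under these assumptions the two expressions agree at every stage, including the constant-type case s = 0.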

Using these lemmas, we show that, given the agents' strategies, the principal's strategy is optimal. The proof is by induction on n. The optimality of the principal's strategy for the case n = 1 is straightforward. Assume that the principal's strategy is optimal for some n ≥ 1 and consider period n + 1.

Case 1. P_{n+1} ≥ (P*_n - s)/(1 - s). In this case, A_w plays t̄. This implies that:

V̄^P_{n+1}(P_{n+1}) = V^P_n[s + (1 - s)P_{n+1}]
> b - 1 + V^P_n[s + (1 - s)P_{n+1}]
= _V^P_{n+1}(P_{n+1})

and thus it is optimal for the principal to play t̄.

Case 2. P*_{n+1} < P_{n+1} < (P*_n - s)/(1 - s). In this case, if the principal plays _t, the weak agent randomizes between t̄ and _t. This yields:

_V^P_{n+1}(P_{n+1}) = [P_{n+1} + (1 - P_{n+1})R^w_{n+1}(P_{n+1})][b - 1 + V^P_n(P*_n)]
+ (1 - P_{n+1})[1 - R^w_{n+1}(P_{n+1})][b + V^P_n(s)]
= b + _V^P_n(s) + [_V^P_n(P*_n) - _V^P_n(s) - 1]P_{n+1}(1 - s)/(P*_n - s)
= b + A_n + B_n[s + (1 - s)P_{n+1}] - P_{n+1}(1 - s)/(P*_n - s),

where the second equality holds by equation (5) and Lemma 1, and the third equality is obtained by Lemma 2 and rearranging the terms. Also:

V̄^P_{n+1}(P_{n+1}) = V^P_n[s + (1 - s)P_{n+1}]
= _V^P_n[s + (1 - s)P_{n+1}]
= A_n + B_n[s + (1 - s)P_{n+1}],

where the second equality holds by equation (5) and the third one by Lemma 2. This yields:

V̄^P_{n+1}(P_{n+1}) - _V^P_{n+1}(P_{n+1}) = P_{n+1}(1 - s)/(P*_n - s) - b
> P*_{n+1}(1 - s)/(P*_n - s) - b
= 0,

where the inequality holds because P_{n+1} > P*_{n+1} and the last equality holds by the definition of P*_{n+1}. This implies V̄^P_{n+1}(P_{n+1}) > _V^P_{n+1}(P_{n+1}), and thus it is optimal for the principal to play t̄.

Case 3. P_{n+1} = P*_{n+1}. It follows immediately from Lemma 1 that in this case the principal is indifferent between playing _t or t̄, and thus any randomization yields the same expected payoff.

Case 4. 0 < P_{n+1} < P*_{n+1}. Using the same arguments as in Case 2, we obtain:

V̄^P_{n+1}(P_{n+1}) - _V^P_{n+1}(P_{n+1}) = P_{n+1}(1 - s)/(P*_n - s) - b.

In the present case, P_{n+1} < P*_{n+1}, which yields:

V̄^P_{n+1}(P_{n+1}) - _V^P_{n+1}(P_{n+1}) < P*_{n+1}(1 - s)/(P*_n - s) - b = 0.

This implies that it is optimal for the principal to play _t.

Case 5. P_{n+1} = 0. In this case, V̄^P_{n+1}(P_{n+1}) = V^P_n(s) < b + V^P_n(s) = _V^P_{n+1}(P_{n+1}), implying that it is optimal for the principal to play _t.


(2b) Optimality of the strong agent's strategy. Denote by V^s_n(P_n) the expected cumulative payoff of the strong agent if the principal assesses a probability P_n that the agent is strong and all players follow the stated strategies.

Lemma 3. V^s_n(P_n) is non-decreasing in P_n.

Proof. The proof is again by induction on n. Consider n = 1:

V^s_1(P_1) = c if P_1 > P*_1;  R^P_1 c if P_1 = P*_1;  0 if P_1 < P*_1.

This shows the result for n = 1. Assume the result holds for some n ≥ 1 and consider period n + 1.

V^s_{n+1}(P_{n+1}) =

c + V^s_n[s + (1 - s)P_{n+1}]   if P_{n+1} > P*_{n+1};

R^P_{n+1}{c + V^s_n[s + (1 - s)P_{n+1}]} + (1 - R^P_{n+1})V^s_n{max[P*_n, s + (1 - s)P_{n+1}]}   if P_{n+1} = P*_{n+1};

V^s_n{max[P*_n, s + (1 - s)P_{n+1}]}   if P_{n+1} < P*_{n+1}.

Case 1. s + (1 - s)P_{n+1} ≥ P*_n. Then the result follows immediately from the inductive hypothesis and c > 0.

Case 2. s + (1 - s)P_{n+1} < P*_n. We know that the result holds for n = 1. Consider n = 2. Then:

V^s_2(P_2) = c if P_2 > P*_2;  R^P_2 c + (1 - R^P_2)R^P_1 c if P_2 = P*_2;  R^P_1 c if P_2 < P*_2.

This implies that the result holds for n = 2. Consider some n ≥ 2 and assume that the result holds for n and n - 1. Consider period n + 1:

V^s_{n+1}(P_{n+1}) =

c + V^s_{n-1}(P*_{n-1})   if P_{n+1} > P*_{n+1};

R^P_{n+1}[c + V^s_{n-1}(P*_{n-1})] + (1 - R^P_{n+1})(R^P_n{c + V^s_{n-1}[s + (1 - s)P*_n]} + (1 - R^P_n)V^s_{n-1}(P*_{n-1}))   if P_{n+1} = P*_{n+1};

R^P_n{c + V^s_{n-1}[s + (1 - s)P*_n]} + (1 - R^P_n)V^s_{n-1}(P*_{n-1})   if P_{n+1} < P*_{n+1}.

But:

R^P_n{c + V^s_{n-1}[s + (1 - s)P*_n]} + (1 - R^P_n)V^s_{n-1}(P*_{n-1}) < R^P_n c + V^s_{n-1}(P*_{n-1}) < c + V^s_{n-1}(P*_{n-1}),

where the first inequality follows from the inductive assumption. This yields the desired result. □

Given Lemma 3, it is straightforward to show the optimality of the strong agent's strategy. Assume that the principal plays _t at stage n. The strategy is clearly optimal for n = 1. Consider n > 1 and assume that the result holds for n - 1. Playing t̄ yields V^s_{n-1}{max[P*_{n-1}, s + (1 - s)P_n]}, while playing _t yields V^s_{n-1}(s) - 1. Using the assumption that P*_{n-1} > s and Lemma 3, it follows immediately that t̄ is optimal for the strong agent.

(2c) Optimality of the weak agent's strategy

Lemma 4. If P_n < P*_n and n > 1, then V^s_n(P_n) = V^s_{n-1}(P*_{n-1}).

Proof. If P_n < P*_n, the principal plays _t and A_s plays t̄. This implies that A_s receives 0 this period and the principal updates P_n to max[s + (1 - s)P_n, P*_{n-1}] = P*_{n-1}. This yields V^s_n(P_n) = V^s_{n-1}(P*_{n-1}), as required. □

Lemma 5. V^s_n(P*_n) - V^s_n(s) = c Σ_{i=1}^{n} (-1)^{i+1} Π_{j=n-i+1}^{n} R^P_j.


Proof. For n = 1, we have V^s_1(P*_1) - V^s_1(s) = cR^P_1, as required. Consider n > 1 and assume that the lemma holds for n - 1. Then:

V^s_n(P*_n) - V^s_n(s) = R^P_n{c - V^s_{n-1}(P*_{n-1}) + V^s_{n-1}[s + (1 - s)P*_n]}
= R^P_n[c - V^s_{n-1}(P*_{n-1}) + V^s_{n-1}(s)]
= cR^P_n - cR^P_n Σ_{i=1}^{n-1} (-1)^{i+1} Π_{j=n-i}^{n-1} R^P_j
= c Σ_{i=1}^{n} (-1)^{i+1} Π_{j=n-i+1}^{n} R^P_j,

where the second equality holds by Lemma 4 and the third one by the inductive assumption. □

Denote by V^w_n(P_n) the expected cumulative payoff of the weak agent from period n through period 1 if the principal assesses a probability P_n that the agent is strong. Clearly, the weak agent's strategy is optimal for n = 1. Using these lemmas, we show that it is optimal for n > 1 and s + (1 - s)P_{n+1} ≤ P*_n.

Case 1. P_{n+1} = 0. Assume that the weak agent's strategy is optimal for some n ≥ 1 and consider period n + 1. If the principal selects _t, then playing _t yields A_w an expected payoff equal to sV^s_n(s) + (1 - s)V^w_n(s), while playing t̄ yields only -1 + sV^s_n(s) + (1 - s)V^w_n(s). It follows immediately that playing _t is optimal.

Case 2. 0 < P_{n+1} ≤ (P*_n - s)/(1 - s). Consider n = 2 and assume that the principal plays _t. Playing t̄ yields A_w an expected payoff equal to -1 + R^P_1[sc + (1 - s)a] = 0. Playing _t yields sV^s_1(s) + (1 - s)V^w_1(s) = 0. This implies that A_w is indifferent between playing _t and t̄. Now assume that the weak agent's strategy is optimal for some n ≥ 2 and for n - 1. Consider period n + 1. Denote by V̄^w_n(P_n) (respectively _V^w_n(P_n)) the expected cumulative payoff of A_w from period n through period 1 if the principal assesses a probability P_n that the agent is strong, the principal plays _t, the weak agent plays t̄ (respectively _t) and from there on all players follow their strategies. We need to show that _V^w_{n+1}(P_{n+1}) = V̄^w_{n+1}(P_{n+1}):

_V^w_{n+1}(P_{n+1}) = sV^s_n(s) + (1 - s)V^w_n(s)
= sV^s_{n-1}(P*_{n-1}) + (1 - s)[sV^s_{n-1}(s) + (1 - s)V^w_{n-1}(s)],

where we obtain the second equality by using Lemma 4 and the inductive assumption for n - 1, and

V̄^w_{n+1}(P_{n+1}) = -1 + sV^s_n(P*_n) + (1 - s)V^w_n(P*_n)
= -1 + R^P_n[sc + (1 - s)a] + (1 - s)[sV^s_{n-1}(s) + (1 - s)V^w_{n-1}(s)] + sR^P_n V^s_{n-1}(s) + s(1 - R^P_n)V^s_{n-1}(P*_{n-1}),

where we obtain the second equality from the strategies at stage n, Lemma 4 and the inductive assumption for n - 1. This yields:

_V^w_{n+1}(P_{n+1}) - V̄^w_{n+1}(P_{n+1}) = 1 - R^P_n[sc + (1 - s)a] + sR^P_n[V^s_{n-1}(P*_{n-1}) - V^s_{n-1}(s)]
= 1 - R^P_n{sc + (1 - s)a - sc Σ_{i=1}^{n-1} (-1)^{i+1} Π_{j=n-i}^{n-1} R^P_j}
= 1 - R^P_n(1/R^P_n) = 0,

where the second equality follows from Lemma 5 and the third one from the definition of R^P_n.
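The alternating-sum expression of Lemma 5 can be checked against its defining recursion. The sketch below is illustrative (parameter values are hypothetical): it builds R^P_n from R^P_n = 1/[sc + (1 - s)a - sF_{n-1}] with F_n = R^P_n(c - F_{n-1}), then re-evaluates F_N through the closed form:

```python
def lemma5_check(a=2.0, c=2.0, s=0.05, N=6):
    """Check that F_n = R^P_n (c - F_{n-1}) matches Lemma 5's closed form."""
    base = s * c + (1 - s) * a
    R, F = [], 0.0
    for _ in range(N):
        R.append(1.0 / (base - s * F))    # R^P_n = 1/[sc+(1-s)a - s F_{n-1}]
        F = R[-1] * (c - F)               # F_n = V^s_n(P*_n) - V^s_n(s)
    # closed form: F_N = c * sum_{i=1}^{N} (-1)^{i+1} prod_{j=N-i+1}^{N} R^P_j
    total, prod = 0.0, 1.0
    for i in range(1, N + 1):
        prod *= R[N - i]
        total += (-1) ** (i + 1) * prod
    return abs(F - c * total) < 1e-9
```

The two computations agree, confirming that the alternating sum is just the unrolled form of the recursion.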

Corollary 1. sV^s_n(s) + (1 - s)V^w_n(s) = -1 + sV^s_n(P*_n) + (1 - s)V^w_n(P*_n).

Proof. This follows immediately from _V^w_{n+1}(P_{n+1}) = V̄^w_{n+1}(P_{n+1}), which was shown above. □

Now we give three more lemmas, which will allow us to complete the proof of the optimality of the weak agent's strategy.

Lemma 6. If 0 < P_n < P*_n and n > 1, then V^w_n(P_n) = sV^s_{n-1}(s) + (1 - s)V^w_{n-1}(s).


Proof. If 0 < P_n < P*_n and n > 1, the weak agent randomizes between t̄ and _t. Using the corollary stated above immediately yields the desired result. □

Lemma 7. If 0 < P_n < P*_n, then V^w_n(P_n) < V^w_n(P*_n).

Proof.

V^w_n(P*_n) = R^P_n{a + sV^s_{n-1}[s + (1 - s)P*_n] + (1 - s)V^w_{n-1}[s + (1 - s)P*_n]}
+ (1 - R^P_n)[sV^s_{n-1}(s) + (1 - s)V^w_{n-1}(s)]
= R^P_n a + sV^s_{n-1}(s) + (1 - s)V^w_{n-1}(s)
> sV^s_{n-1}(s) + (1 - s)V^w_{n-1}(s)
= V^w_n(P_n).

The first equality follows from the corollary. The second one holds because of Lemmas 4 and 6, which imply that for P_n < P*_n, V^s_n(P_n) and V^w_n(P_n) are independent of P_n. The last equality follows from Lemma 6. □

Lemma 8. If P_n > P*_n, then V^w_n(P_n) ≥ V^w_n(P*_n).

Proof. Consider P_1 > P*_1. Then V^w_1(P_1) = a > R^P_1 a = V^w_1(P*_1). This shows the result for n = 1. Assume that the result holds for some n ≥ 1 and consider P_{n+1} ≥ P*_{n+1}. Using the corollary yields:

V^w_{n+1}(P_{n+1}) - V^w_{n+1}(P*_{n+1}) = (1 - R^P_{n+1})a + s{V^s_n[s + (1 - s)P_{n+1}] - V^s_n(s)} + (1 - s){V^w_n[s + (1 - s)P_{n+1}] - V^w_n(s)}.

Case a. s + (1 - s)P_{n+1} < P*_n. In this case, Lemmas 4 and 6 yield:

V^w_{n+1}(P_{n+1}) - V^w_{n+1}(P*_{n+1}) = (1 - R^P_{n+1})a > 0,

implying that V^w_{n+1}(P_{n+1}) > V^w_{n+1}(P*_{n+1}).

Case b. s + (1 - s)P_{n+1} ≥ P*_n. In this case, using Lemmas 3 and 7 as well as the inductive assumption yields:

V^w_{n+1}(P_{n+1}) - V^w_{n+1}(P*_{n+1}) ≥ (1 - R^P_{n+1})a > 0,

again implying that V^w_{n+1}(P_{n+1}) > V^w_{n+1}(P*_{n+1}). □

With these lemmas we are ready to show the remaining case for the optimality of the weak agent's strategy.

Case 3. $s + (1-s)P_{n+1} > P^*_n$. It is again obvious that the strategy is optimal for $n = 1$. Consider $n = 2$ and assume that the principal plays $\underline{t}$. Then $\bar{V}^w_2(P_2) = -1 + sc + (1-s)a$, while $\underline{V}^w_2(P_2) = sV^s_1(s) + (1-s)V^w_1(s) = 0$. Using the assumption that $s < 1 - 1/a$, or equivalently $(1-s)a > 1$, this yields that it is optimal for $A_w$ to play $\bar{t}$.

Now assume that the result holds for some $n \geq 2$ and for $n - 1$, and consider period $n + 1$. Using the corollary yields:

$$
\bar{V}^w_{n+1}(P_{n+1}) - \underline{V}^w_{n+1}(P_{n+1}) = s\{V^s_n[s + (1-s)P_{n+1}] - V^s_n(P^*_n)\} + (1-s)\{V^w_n[s + (1-s)P_{n+1}] - V^w_n(P^*_n)\} > 0,
$$

because by Lemmas 3 and 8 both terms on the right-hand side of the equality sign are positive. This implies that playing $\bar{t}$ is optimal in this case and concludes the proof of the optimality of the weak agent's strategy. □
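The $n = 2$ comparison at the start of Case 3 can be checked numerically. This is a minimal sketch: the parameter values for $s$, $c$ and $a$ below are illustrative assumptions (not taken from the paper), chosen only to satisfy the standing assumption $s < 1 - 1/a$.

```python
# Illustrative parameters (assumptions, not from the paper), with s < 1 - 1/a.
s, c, a = 0.2, 0.5, 2.0
assert s < 1 - 1 / a              # equivalently, (1 - s) * a > 1

# Weak agent's stage-2 values when the principal plays late:
v_bar = -1 + s * c + (1 - s) * a  # arrive on time now; reputation pays off at stage 1
v_under = 0.0                     # arrive late now; type is revealed, no future rent

# (1 - s) * a > 1 makes reputation building strictly profitable:
assert v_bar > v_under
```

Any parameter triple with $c \geq 0$ and $0 < s < 1 - 1/a$ passes the same check, mirroring the role the assumption $s < 1 - 1/a$ plays in the proof.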

APPENDIX C

Proof of Proposition 3

First note that, because of the requirement that beliefs be plausible, it is always optimal for the strong agent to play $\bar{t}$. The remainder of the proof is by induction on $n$. Consider $n = 1$. If the principal plays $\underline{t}$, then it is optimal for $A_s$ to play $\bar{t}$ and for $A_w$ to play $\underline{t}$. Given this, it follows that, if $P_1 > P^*_1$, the principal selects $\bar{t}$, while if $P_1 < P^*_1$, the principal selects $\underline{t}$. If $P_1 = P^*_1$, then the principal is indifferent. We still need to show that in this case, the principal will play $\bar{t}$ with


probability $\hat{R}^p = R^p \equiv 1/[sc + (1-s)a]$. We assumed that $P_2 \neq P^*_2$. Given the requirement that beliefs be plausible, there are only two ways to obtain $P_1 = P^*_1$. Either we have $s + (1-s)P_2 \geq P^*_1$ and at stage 2 the principal plays $\underline{t}$ and the agent plays $\underline{t}$, such that the principal revises his beliefs downwards. But if the agent plays $\underline{t}$, then the principal knows that the agent is weak (recall that $A_s$ always plays $\bar{t}$) and thus revises his assessment to $P_1 = s < P^*_1$. The alternative is to have $s + (1-s)P_2 \leq P^*_1$ and at stage 2 the principal plays $\underline{t}$ and the agent plays $\bar{t}$, such that the principal revises his beliefs upwards to $P_1 = P^*_1$. We will argue that this can only occur if $\hat{R}^p = R^p$.
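The closed form $R^p = 1/[sc + (1-s)a]$ can be sanity-checked as the probability that leaves the weak agent indifferent, assuming the stage-2 trade-off takes the form suggested by the $n = 2$ computation in Appendix B. The parameter values are hypothetical.

```python
# Hypothetical parameters with s < 1 - 1/a (assumptions, not from the paper).
s, c, a = 0.2, 0.5, 2.0
R_p = 1 / (s * c + (1 - s) * a)   # the randomization probability of equation (3)

# (1 - s) * a > 1 guarantees R_p is a proper, interior probability:
assert 0 < R_p < 1

# If the principal plays on time at the next stage with probability r, the weak
# agent's payoff from mimicking the strong type is -1 + r * (s*c + (1 - s)*a),
# against 0 from revealing himself; r = R_p equalizes the two.
payoff_mimic = -1 + R_p * (s * c + (1 - s) * a)
assert abs(payoff_mimic) < 1e-9   # indifference holds exactly at r = R_p
```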

Case 1. Assume $\hat{R}^p > R^p$. In this case, if the principal plays $\underline{t}$ at $n = 2$, then it is optimal for $A_w$ to play $\bar{t}$, rather than to randomize, as was the case when the principal followed his suggested strategy. This implies that the principal learns nothing about the agent's type by observing the agent's choice, as both $A_s$ and $A_w$ play $\bar{t}$ with probability 1. The principal's expected payoff for playing $\underline{t}$ then is:

$$
\underline{V}^p_2(P_2) = b - 1 + V^p_1[s + (1-s)P_2] < V^p_1[s + (1-s)P_2] = \bar{V}^p_2(P_2).
$$

This implies that if $\hat{R}^p > R^p$, then the principal is strictly better off playing $\bar{t}$ rather than $\underline{t}$. Thus we cannot have $\hat{R}^p > R^p$ and the principal playing $\underline{t}$ at $n = 2$.

Case 2. Assume $\hat{R}^p < R^p$. In this case, if the principal plays $\underline{t}$ at $n = 2$, then it is optimal for $A_w$ to play $\underline{t}$. This implies that if the principal observes $\bar{t}$, he knows that the agent is strong, i.e. $P_1 = 1$. But then it is optimal for the principal to play $\bar{t}$ at stage 1 rather than to randomize. This is a deviation from his strategy and implies that we cannot have $\hat{R}^p < R^p$ in a sequential equilibrium.

From these two cases, it follows that we must have $\hat{R}^p = R^p$ as desired. Note that by Bayes' rule this implies:

$$
\frac{P^*_1 - s}{1 - s} = \frac{P_2}{P_2 + (1 - P_2)R^a_2(P_2)},
$$

i.e.

$$
R^a_2(P_2) = \frac{P_2(1 - P^*_1)}{(1 - P_2)(P^*_1 - s)}.
$$
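The Bayes'-rule computation behind this expression can be verified numerically. The belief values below are hypothetical, and the update $s + (1-s)\cdot(\text{posterior})$ reflects the per-period type transition used throughout the proof.

```python
# Hypothetical values (assumptions): transition probability s, stage-2 belief P2,
# and stage-1 threshold P1_star, with s < P1_star.
s = 0.2
P2, P1_star = 0.15, 0.4

# Weak agent's randomization probability from the displayed formula:
R_a = P2 * (1 - P1_star) / ((1 - P2) * (P1_star - s))
assert 0 < R_a < 1

# Posterior that the agent is currently strong after observing an on-time arrival,
# when the strong type always arrives on time and the weak type does so w.p. R_a:
posterior = P2 / (P2 + (1 - P2) * R_a)

# Applying the type transition, the next-period belief lands exactly on the threshold:
assert abs(s + (1 - s) * posterior - P1_star) < 1e-9
```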

Thus we have shown that the uniqueness result holds for $n = 1$. We also showed that if at stage 2 it is optimal for the principal to play $\underline{t}$ and the agent plays $\bar{t}$, then the principal must randomize at stage 1 with probability $R^p$, implying that we must have $P_1 = P^*_1$. This will be part of the inductive hypothesis.

Now consider some stage $n > 1$ and assume that the proposition holds for stages 1 through $n - 1$. Assume that the principal plays $\underline{t}$. Then it is optimal for $A_s$ to play $\bar{t}$. From the inductive assumption and from arguments similar to those in Proposition 2, it follows that if $s + (1-s)P_n > P^*_{n-1}$, then it is optimal for $A_w$ to play $\bar{t}$, while if $s + (1-s)P_n \leq P^*_{n-1}$, then $A_w$ is indifferent between playing $\bar{t}$ and $\underline{t}$. We still need to determine the randomization probability $\hat{R}^a_n(P_n)$ and show that it is equal to $R^a_n(P_n)$ as defined in equation (4). By the inductive assumption, we know that if it is optimal for the principal to play $\underline{t}$ at stage $n$ and the agent plays $\bar{t}$, then $P_{n-1} = P^*_{n-1}$. Using Bayes' rule, we obtain:

$$
\frac{P^*_{n-1} - s}{1 - s} = \frac{P_n}{P_n + (1 - P_n)\hat{R}^a_n(P_n)},
$$

i.e.

$$
\hat{R}^a_n(P_n) = \frac{P_n(1 - P^*_{n-1})}{(1 - P_n)(P^*_{n-1} - s)} = R^a_n(P_n),
$$

as required. Now consider the decision of the principal. From analysis similar to that in the proof of Proposition 2, we know that if $P_n > P^*_n$, then it is optimal for him to play $\bar{t}$, while if $P_n < P^*_n$, then it is optimal for him to play $\underline{t}$. If $P_n = P^*_n$, the principal is indifferent. To conclude the proof, we


still need to show that in this case the principal plays $\bar{t}$ with probability $\hat{R}^p_n = R^p_n$, where $R^p_n$ is as defined in equation (3). Given that $P_{n+1} \leq P^*_{n+1}$, an argument similar to the one for the case $n = 1$ yields that $P_n = P^*_n$ can only occur if at stage $n + 1$ the principal plays $\underline{t}$ and the agent plays $\bar{t}$.

Case 1. Assume $\hat{R}^p_n > R^p_n$. In this case, if the principal plays $\underline{t}$ at stage $n + 1$, it is optimal for both $A_s$ and $A_w$ to play $\bar{t}$, which implies that the principal learns nothing about the agent's type. His expected payoff in this case is $b - 1 + V^p_n[s + (1-s)P_{n+1}]$. By playing $\bar{t}$, he gets $V^p_n[s + (1-s)P_{n+1}]$, which is strictly better. Thus, for $\hat{R}^p_n > R^p_n$, the principal will not play $\underline{t}$, and we cannot have $P_n = P^*_n$.

Case 2. Assume $\hat{R}^p_n < R^p_n$. In this case, if the principal plays $\underline{t}$, it is optimal for $A_s$ to play $\bar{t}$ and for $A_w$ to play $\underline{t}$. This implies that if the principal observes $\bar{t}$, he knows that the agent is strong. His expected payoff equals:

$$
P_{n+1}\{b - 1 + (1 - \hat{R}^p_n)[b - 1 + V^p_{n-1}(1)] + \hat{R}^p_n \bar{V}^p_{n-1}(1)\} + (1 - P_{n+1})[b + V^p_n(s)].
$$

But if $P_{n-1} = 1$, then, by the inductive hypothesis, it is optimal for the principal to play $\bar{t}$, which yields $V^p_{n-1}(1) = \bar{V}^p_{n-1}(1)$. By induction, we obtain $V^p_{n-1}(1) = \bar{V}^p_{n-1}(1) = 0$. The principal's expected payoff reduces to:

$$
P_{n+1}[b - 1 + (1 - \hat{R}^p_n)(b - 1)] + (1 - P_{n+1})[b + V^p_n(s)].
$$

The principal can increase this expected payoff by deviating from his strategy at stage $n$ and selecting $\hat{R}^p_n = 1$. This implies that we cannot have $\hat{R}^p_n < R^p_n$ in a sequential equilibrium. It follows that we must have $\hat{R}^p_n = R^p_n$, which completes the proof. □
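The deviation argument in Case 2 can be illustrated numerically: with $b < 1$, the reduced expected payoff is strictly increasing in the principal's randomization probability, so setting it to 1 dominates any smaller value. All parameter values below are hypothetical, and $V^p_n(s)$ is treated as a fixed constant for the comparison.

```python
# Hypothetical values (assumptions): b < 1, an arbitrary belief P_next = P_{n+1},
# and V^p_n(s) held fixed as a constant v_s for the comparison.
b = 0.8
P_next = 0.3
v_s = 0.0

def payoff(r_hat):
    # Reduced expected payoff after V^p_{n-1}(1) = 0 is substituted in.
    return P_next * (b - 1 + (1 - r_hat) * (b - 1)) + (1 - P_next) * (b + v_s)

# Since b - 1 < 0, the payoff rises with r_hat; r_hat = 1 beats any r_hat < 1,
# which is the profitable deviation used to rule out R_hat^p_n < R^p_n:
assert all(payoff(1.0) > payoff(r) for r in (0.0, 0.25, 0.5, 0.75, 0.99))
```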
