Server Assignment Policies for Maximizing the Steady-State
Throughput of Finite Queueing Systems
Sigrún Andradóttir, Hayriye Ayhan
School of Industrial and Systems Engineering, Georgia Institute of Technology,
Atlanta, GA 30332-0205, U.S.A.
Douglas G. Down
Department of Computing and Software, McMaster University,
Hamilton, Ontario L8S 4L7, Canada
For a system of finite queues, we study how servers should be assigned dynamically to stations in order to obtain optimal (or near-optimal) long-run average throughput. We assume that travel times between different service facilities are negligible, that each server can work on only one job at a time, and that several servers can work together on one job. We show that when the service rates depend only on either the server or the station (and not both), then all non-idling server assignment policies are optimal. Moreover, for a Markovian system with two stations in tandem and two servers, we show that the optimal policy assigns one server to each station unless that station is blocked or starved (in which case the server helps at the other station), and we specify the criterion used for assigning servers to stations. Finally, we propose a simple server assignment policy for tandem systems in which the number of stations equals the number of servers and present numerical results showing that our policy appears to yield near-optimal throughput under general conditions.
(Markov Decision Processes; Markovian Queueing Systems; Tandem Queues; Finite Buffers; Mobile and Cooperating Servers; Preemptive Service; Manufacturing Blocking)
1 Introduction
Consider a queueing network with N ≥ 2 stations and M ≥ 1 servers. Assume that at any given time, there can be at most one job in service at each station and that each server can work on at most one job. Furthermore, assume that the service times of each job at each station i ∈ {1, …, N} are independent and identically distributed with rate μ(i) (where μ(i) is the inverse of the mean service time). We assume, without loss of generality, that μ(i) = μ = 1 for all i ∈ {1, …, N}. Each server i ∈ {1, …, M} works at a deterministic rate 0 ≤ μ_ij < ∞ at each station j ∈ {1, …, N} (i.e., server i is cross-trained to work at all stations j satisfying μ_ij > 0). We assume that several servers can work together on a single job, in which case their service rates are additive. Finally, we assume that the network operates under the manufacturing blocking mechanism.
For queueing systems of the form described in the previous paragraph, we are interested in determining the (dynamic) server assignment policy that maximizes the long-run average throughput. For simplicity, we assume that the travel times required for servers to go from one station to another (including any setup times) are negligible. Also, all of our results are stated under the assumption that the underlying system is a tandem queueing network with an infinite supply of jobs in front of station 1, infinite room for completed jobs after station N, and finite buffers between each pair of successive stations. However, this assumption of an underlying tandem queueing system is made for simplicity, and several of our results can easily be shown to hold for more general network configurations.
A significant amount of literature exists on the static server assignment problem, in which the (permanent) assignment of servers to stations is determined. For an introduction to this area, see Hillier and So (1996), Yamazaki, Sakasegawa, and Shanthikumar (1992), and the references therein. In recent years, several papers have appeared that are concerned with dynamic server assignment policies. Ostalaza, McClain, and Thomas (1990) and McClain, Thomas, and Sox (1992) have studied dynamic line balancing in tandem systems with shared tasks that can be performed at either of two successive stations. This work was continued by Zavadlav, McClain, and Thomas (1996), who study several server assignment policies for systems with fewer servers than machines in which all servers trained to work at a particular station have the same capabilities for working at that station. Assuming that each server has a service rate that does not depend on the task he or she is working on, Bartholdi and Eisenstein (1996) define the "bucket brigades" server assignment policy and show that under this policy, a stable partition of work emerges that yields optimal throughput. Bartholdi, Eisenstein, and Foley (2001) show that the behavior of the bucket brigades policy applied to systems with (discrete) tasks and exponentially distributed task times resembles that of the same policy applied in the deterministic setting with infinitely divisible tasks. Finally, for systems with fewer servers than stations, Bischak (1996) proposes a server assignment policy in which the servers move between stations and presents simulation results showing that for unbuffered systems with high processing time variation, her policy can achieve higher throughput than would be obtained with a single (stationary) server at each station.
Most of the existing work in the area of optimal assignment of servers to queues focuses on parallel queues; see for example Hofri and Ross (1987) and Duenyas and Van Oyen (1995). Van Oyen and Teneketzis (1994) and Iravani, Posner, and Buzacott (1997) have studied the optimal assignment of a single server to multiple interconnected queues. To our knowledge, only a few papers consider the optimal assignment of multiple servers to multiple interconnected stations. Farrar (1993), Pandelis and Teneketzis (1994), and Ahn, Duenyas, and Zhang (1999) all study the optimal assignment of servers to stations for networks with two queues and no arrivals. Similarly, Rosberg, Varaiya, and Walrand (1982) and Hajek (1984) study the optimal assignment of (service) effort in the two-station setting with Poisson arrivals. Unlike our work, all of the papers above minimize holding costs, whereas our objective is to maximize throughput.
The papers discussed above all assume that at most one server can be working on a particular job at any given time. By contrast, we assume that several servers can collaborate on the same job (with additive service rates). Mandelbaum and Reiman (1998) have studied queueing systems in which several servers are pooled into a single server. Similarly, Buzacott (1996) considers situations where several servers work together in a team, but he emphasizes the case where the task completion times of the team are the maximum of the completion times of the subtasks completed by each team member. Finally, Van Oyen, Senturk-Gel, and Hopp (2001) show that the server assignment policy in which all servers work as a team on a single job minimizes the cycle time per job when all servers are identical and complete collaboration of all servers is possible. However, Mandelbaum and Reiman (1998), Buzacott (1996), and Van Oyen et al. (2001) are concerned with the static (permanent) pooling of servers into teams, whereas we are interested in dynamic server assignment policies in which the movement and collaboration of servers change with the state of the underlying system.
This paper is organized as follows. In Section 2, we show that if the service rates μ_ij, where i = 1, …, M and j = 1, …, N, depend only on either the server i or the station j (and not on both the server and the station), then all non-idling policies yield optimal throughput. In Section 3, we assume that all service times are exponentially distributed and translate the original (continuous-time) throughput optimization problem into an equivalent (discrete-time) Markov decision problem. In Section 4, we determine the form of the optimal server assignment policy for Markovian systems of two queueing stations in tandem with two servers. In Section 5, we present a simple server assignment heuristic for tandem systems with an equal number of servers and stations and use numerical results obtained for Markovian networks with three or five stations in tandem to show that this heuristic generally appears to yield near-optimal throughput. Finally, Section 6 contains some concluding remarks.
2 Two special cases
Let Π be the set of server assignment policies under consideration, and for all π ∈ Π and t ≥ 0, let D_π(t) be the number of departures under the policy π by time t and let
T_π = lim_{t→∞} E{D_π(t)}/t
be the long-run average throughput corresponding to the server assignment policy π. We are interested in solving the optimization problem:
max_{π∈Π} T_π.  (1)
In this section, we consider the special cases when the service rates μ_ij depend either on the server i ∈ {1, …, M} or on the station j ∈ {1, …, N} (and not on both). We refer to these two cases as generalist cases because each server is either equally capable of performing each task (when the service rates depend only on the server and not on the station), or the servers are all identical (when the service rates depend only on the station and not on the server). If the servers are not generalists (e.g., if some servers are better at a subset of the tasks than the other servers), then we say that the servers are specialists. For j = 2, …, N, we let B_j < ∞ denote the size of the buffer between stations j − 1 and j.
The following theorem shows that when the servers are generalists, all non-idling policies yield maximal throughput. Although this result is stated under the assumption that the underlying system is a tandem queueing network, it is easy to see from the proof that the optimality of all non-idling policies holds for more general system configurations.
Theorem 2.1 Assume that for each j = 1, …, N, the service times S_{k,j} of job k ≥ 1 at station j are independent and identically distributed with mean one, and that for all t ≥ 0, if there is a job in service at station j at time t, then the expected remaining service time at station j of that job is bounded above by a scalar S̄, where 1 ≤ S̄ < ∞. Moreover, assume that service is either nonpreemptive or preemptive-resume. If μ_ij = μ_i for all i = 1, …, M and j = 1, …, N, then any non-idling server assignment policy π is optimal, with long-run average throughput
T_π = (Σ_{i=1}^M μ_i) / N.  (2)
On the other hand, if μ_ij = μ_j for all i = 1, …, M and j = 1, …, N, then any non-idling server assignment policy π is optimal, with long-run average throughput
T_π = M / (Σ_{j=1}^N 1/μ_j).  (3)
Remark 2.1 Van Oyen, Senturk-Gel, and Hopp (2001) assume that all workers are identical, which corresponds to the assumption that μ_ij = μ_j for all i = 1, …, M and j = 1, …, N. Similarly, the main result on the bucket brigades policy (see Theorem 3 of Bartholdi and Eisenstein, 1996) requires that μ_ij = μ_i for all i = 1, …, M and j = 1, …, N.
Proof: We first consider the case when μ_ij = μ_i for all i = 1, …, M and j = 1, …, N. Let π be a non-idling server assignment policy and let W_{π,p}(t) be the total work performed by time t by all servers under the policy π. Then we have that
W_{π,p}(t) = t Σ_{i=1}^M μ_i,  (4)
because π is a non-idling policy, there is always a job to be served at station 1, and if station 1 is blocked, then there will be a job downstream that may be served.
Let B = Σ_{j=2}^N B_j be the total number of buffer slots in the system and let S_k = Σ_{j=1}^N S_{k,j} be the total service requirement (in the system) of job k for all k ≥ 1. Let W_π(t) = Σ_{k=1}^{D_π(t)+N+B} S_k and let W_{π,r}(t) = W_π(t) − W_{π,p}(t) be the total remaining service requirement (work) at time t for the N + B jobs starting service at station 1 after job D_π(t) starts service at station 1. Since S̄ ≥ 1, the expected (remaining) service time of each of these jobs at any station j ∈ {1, …, N} is bounded above by S̄. We therefore have that
E{W_{π,r}(t)} ≤ (N + B) × S̄ × N,  (5)
implying that lim_{t→∞} E{W_{π,r}(t)}/t = 0, and consequently that
Σ_{i=1}^M μ_i = lim_{t→∞} E{W_{π,p}(t)}/t = lim_{t→∞} E{W_π(t)}/t.  (6)
It remains to relate the term E{W_π(t)} to the system throughput T_π. For all n ≥ 0, let Z_n = (S_{n,1}, …, S_{n,N}). Note that because job D_π(t) must have left the system before job D_π(t) + N + B enters service at station 1, we have that for all n ≥ 0, the event {D_π(t) ≥ n} is completely determined by the random vectors Z_1, …, Z_{n+N+B−1} (and independent of Z_{n+N+B}, Z_{n+N+B+1}, …). This shows that D_π(t) + N + B is a stopping time with respect to the sequence of random vectors {Z_n}. Also, note that for all t ≥ 0, D_π(t) ≤ M(t), where M(t) is the number of jobs departing station 1 by time t if all servers work at station 1 at all times and there is unlimited room for completed jobs after station 1. Since {M(t)} is a non-decreasing process with lim_{t→∞} E{M(t)}/t = Σ_{i=1}^M μ_i < ∞ by the elementary renewal theorem, we have that E{D_π(t)} < ∞ for all t ≥ 0. Thus, Wald's lemma yields
E{W_π(t)} = E{ Σ_{k=1}^{D_π(t)+N+B} S_k } = E{D_π(t) + N + B} × E{S_1} = E{D_π(t)} × N + (N + B) × N.
Equation (6) now yields that
Σ_{i=1}^M μ_i = lim_{t→∞} E{W_π(t)}/t = lim_{t→∞} E{D_π(t)}/t × N = T_π × N,  (7)
which shows that all non-idling policies π yield the long-run average throughput given in equation (2). The optimality of this throughput follows from equations (6) and (7) and the fact that W_{π,p}(t) ≤ t Σ_{i=1}^M μ_i for all t ≥ 0 and all server assignment policies π.
We now outline the proof for the case when μ_ij = μ_j for all i = 1, …, M and j = 1, …, N. Let π be a non-idling server assignment policy and let W_{π,p}(t) be the total busy time by time t of all servers under the policy π. Let S_k = Σ_{j=1}^N S_{k,j}/μ_j be the total time required for serving job k for all k ≥ 1, and let W_π(t) = Σ_{k=1}^{D_π(t)+N+B} S_k. Finally, let W_{π,r}(t) = W_π(t) − W_{π,p}(t) be the total time remaining at time t for serving the N + B jobs starting service at station 1 after job D_π(t) starts service at station 1. The proof is similar to the above proof, except that equation (4) is replaced by W_{π,p}(t) = Mt, equation (5) is replaced by
E{W_{π,r}(t)} ≤ (N + B) × S̄ × (Σ_{j=1}^N 1/μ_j),
and equation (7) is replaced by
M = lim_{t→∞} E{W_π(t)}/t = lim_{t→∞} E{D_π(t)}/t × (Σ_{j=1}^N 1/μ_j) = T_π × (Σ_{j=1}^N 1/μ_j).
This shows that all non-idling policies π yield the long-run average throughput given in equation (3). The optimality of this throughput follows from the fact that W_{π,p}(t) ≤ Mt for all t ≥ 0 and all server assignment policies π. □
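The generalist-server result can be checked numerically. The sketch below is ours, not the paper's (all function names are illustrative): it simulates N = 2 exponential stations and M = 2 servers with μ_ij = μ_i under one particular non-idling policy, namely the policy of Theorem 4.1 (server i at station i, with both servers pooling when station 2 is starved or station 1 is blocked). Theorem 2.1 predicts a throughput of (μ_1 + μ_2)/N for every non-idling policy.

```python
import random

def simulate_tandem(mu1, mu2, buffer_size, num_events, seed=0):
    """Two exponential stations in tandem with manufacturing blocking and an
    infinite job supply at station 1.  Generalist servers (mu_ij = mu_i) with
    additive rates when they pool.  Returns the estimated throughput."""
    rng = random.Random(seed)
    buf = 0            # jobs in the intermediate buffer
    busy2 = False      # station 2 has a job in service
    blocked1 = False   # station 1 holds a finished job it cannot release
    t, departures = 0.0, 0
    for _ in range(num_events):
        if blocked1:                           # both servers help at station 2
            rate1, rate2 = 0.0, mu1 + mu2
        elif not busy2:                        # station 2 starved: both at station 1
            rate1, rate2 = mu1 + mu2, 0.0
        else:                                  # one server at each station
            rate1, rate2 = mu1, mu2
        total = rate1 + rate2
        t += rng.expovariate(total)            # exponential race between stations
        if rng.random() * total < rate1:       # completion at station 1
            if not busy2:
                busy2 = True                   # job goes straight into station 2
            elif buf < buffer_size:
                buf += 1
            else:
                blocked1 = True                # buffer full: station 1 blocked
        else:                                  # completion (departure) at station 2
            departures += 1
            if buf > 0:
                buf -= 1                       # station 2 pulls from the buffer
                if blocked1:
                    buf += 1                   # blocked job enters the freed slot
                    blocked1 = False
            elif blocked1:
                blocked1 = False               # blocked job moves to station 2
            else:
                busy2 = False                  # station 2 starved
    return departures / t
```

For example, simulate_tandem(1.0, 2.0, 2, 200000) should come out near the value (1.0 + 2.0)/2 = 1.5 predicted by equation (2).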
Corollary 2.1 Assume that M = 1, that for each j = 1, …, N, the service times S_{k,j} of job k ≥ 1 at station j are independent and identically distributed with mean one, and that for all t ≥ 0, if there is a job in service at station j at time t, then the expected remaining service time at station j of that job is bounded above by a scalar S̄, where 1 ≤ S̄ < ∞. Then any non-idling server assignment policy π is optimal, with long-run average throughput
T_π = 1 / (Σ_{j=1}^N 1/μ_{1j}).
Corollary 2.2 If the service times at station j are independent and exponentially distributed with mean one, for j = 1, …, N, and if service is either nonpreemptive, preemptive-resume, or preemptive-repeat, then the two results of Theorem 2.1 hold.
Proof: For all j = 1, …, N and k ≥ 1, if the service times S_{k,j} are exponentially distributed with mean one, then preemptive-repeat service is stochastically equivalent to preemptive-resume service, and for all t ≥ 0, if there is a job in service at station j at time t, then by the memoryless property the expected remaining service time at station j of that job is bounded above by S̄ = 1. □
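For M = 1, one non-idling policy is especially easy to simulate: the server picks up a job at station 1 and carries it through all N stations before starting the next job. (The name "pick-and-carry" and the code below are our own illustration, not from the paper.) Corollary 2.1 says that every non-idling policy, including this one, attains T_π = 1/(Σ_{j=1}^N 1/μ_{1j}):

```python
import random

def pick_and_carry_throughput(rates, num_jobs, seed=0):
    """Single server escorting each job through all N exponential stations in
    turn; rates[j] is the rate mu_{1,j+1} of the server at station j+1.
    Each cycle takes a sum of N independent exponential service times."""
    rng = random.Random(seed)
    t = 0.0
    for _ in range(num_jobs):
        for mu in rates:              # serve the job at stations 1, ..., N
            t += rng.expovariate(mu)
    return num_jobs / t
```

With rates (1, 2, 4), the corollary gives 1/(1 + 1/2 + 1/4) ≈ 0.571, and a long simulation run should agree.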
3 Problem formulation
In the remainder of this paper, we assume that the service times of each job at each station are independent and exponentially distributed with rate μ = 1. In this section, we translate the original optimization problem (1) into an equivalent (discrete-time) Markov decision problem. For simplicity, we derive the alternative formulation for a system of finite queues in tandem, but our arguments apply to systems with general configurations.
For all π ∈ Π and t ≥ 0, let X_π(t) = (X_{π,1}(t), …, X_{π,2N−1}(t)), where X_{π,2j}(t) ∈ {0, 1, …, B_{j+1}} denotes the number of jobs in the buffer between stations j and j + 1 at time t under the policy π for j = 1, …, N − 1, and X_{π,2j−1}(t) ∈ {0, 1, 2} denotes the status of station j at time t under the policy π for j = 1, …, N, where 0 refers to the starved status, 1 refers to the operating status, and 2 refers to the blocked status (note, however, that X_{π,2j−1}(t) = 1 does not necessarily imply that there is a server working at station j at time t). From now
on, we assume that the class Π of server assignment policies under consideration consists of all Markovian stationary deterministic policies corresponding to the state space S ⊂ ℕ^{2N−1} of the stochastic processes {X_π(t)}. In other words, the policies in Π specify at what station in the network each server is working as a function of the current state x ∈ S. Note that when M ≥ 2 (i.e., there is more than one server in the network), the policies in Π may involve preemptive service, because a service completion at one station in the network may trigger the movement of servers that are currently working at other stations in the network.
It is clear that for all π ∈ Π, {X_π(t)} is a continuous-time Markov chain and there exists a scalar q_π ≤ Σ_{i=1}^M max_{1≤j≤N} μ_ij < ∞ such that the transition rates {q_π(x, x′)} of {X_π(t)} satisfy Σ_{x′∈S, x′≠x} q_π(x, x′) ≤ q_π for all x ∈ S. This shows that {X_π(t)} is uniformizable for all π ∈ Π. We let {Y_π(k)} be the corresponding discrete-time Markov chain, so that {Y_π(k)} has state space S and transition probabilities p_π(x, x′) = q_π(x, x′)/q_π if x′ ≠ x and p_π(x, x) = 1 − Σ_{x′∈S, x′≠x} q_π(x, x′)/q_π for all x ∈ S. We will use the fact that {X_π(t)} is uniformizable to translate the original optimization problem (1) into an equivalent (discrete-time) Markov decision problem (using uniformization in this manner was proposed originally by Lippman, 1975). In particular, one can generate sample paths of {X_π(t)}, where π ∈ Π, by generating a Poisson process {K_π(t)} with rate q_π and, at the times of the events of {K_π(t)}, generating the next state of {X_π(t)} using the transition probabilities of {Y_π(k)}.
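The uniformization step can be sketched in a few lines of code (our own generic sketch, not from the paper): given a generator matrix Q of a continuous-time Markov chain and any q no smaller than the largest exit rate, the embedded discrete-time chain has transition matrix P = I + Q/q.

```python
def uniformize(Q, q=None):
    """Uniformize the CTMC generator Q (a list of lists whose rows sum to
    zero) with rate q: the embedded DTMC has P = I + Q/q, i.e.
    p(x, x') = q(x, x')/q off the diagonal and
    p(x, x) = 1 - sum_{x' != x} q(x, x')/q on the diagonal."""
    n = len(Q)
    if q is None:
        q = max(-Q[i][i] for i in range(n))   # any q >= max exit rate works
    P = [[(1.0 if i == j else 0.0) + Q[i][j] / q for j in range(n)]
         for i in range(n)]
    return P, q
```

For example, uniformize([[-2.0, 2.0], [3.0, -3.0]]) returns q = 3 and P = [[1/3, 2/3], [1, 0]]; both chains share the stationary distribution (3/5, 2/5).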
For all x, x′ ∈ S, define
R(x, x′) = 1 if x ∈ D and x′ ∈ D_x, and 0 otherwise,
where D = {x ∈ S : x_{2N−1} = 1},
D_x = {(x_1, …, x_{2N−2}, 0)} if x_{2N−3} ≠ 2 and x_{2N−2} = 0,
D_x = {(x_1, …, x_{2N−3}, x_{2N−2} − 1, 1)} if x_{2N−3} ≠ 2 and x_{2N−2} > 0,
D_x = {x′ ∈ S : x′_{2N−3} < 2} if x_{2N−3} = 2,
for all x ∈ D, and D_x = ∅ for all x ∉ D. Note that for all x ∈ D and π ∈ Π, the set D_x ⊂ S contains at most one state x′ with p_π(x, x′) > 0. It is now easy to see that for all π ∈ Π,
T_π = lim_{t→∞} E{ (K_π(t)/t) × (1/K_π(t)) Σ_{k=1}^{K_π(t)} R(Y_π(k−1), Y_π(k)) }.
By the elementary renewal theorem, it is clear that K_π(t)/t → q_π almost surely as t → ∞ for all π ∈ Π. Moreover, it is clear from the strong law of large numbers for Markov chains (see for example Wolff, 1989, page 164) that for all π ∈ Π, the limit lim_{K→∞} Σ_{k=1}^K R(Y_π(k−1), Y_π(k))/K exists almost surely, although the limit may depend on Y_π(0) and it may be random (see also Section 3.8 of Kulkarni, 1995). From the facts that Σ_{k=1}^K R(Y_π(k−1), Y_π(k))/K ≤ 1 for all K ≥ 1 and sup_{t≥0} E{[K_π(t)/t]²} < ∞ (because K_π(t) is a Poisson random variable with mean q_π t), uniform integrability implies that for all π ∈ Π, we have
T_π = q_π E{ lim_{K→∞} (1/K) Σ_{k=1}^K R(Y_π(k−1), Y_π(k)) } = q_π lim_{K→∞} E{ (1/K) Σ_{k=1}^K R(Y_π(k−1), Y_π(k)) }
(see for example the corollary to Theorem 25.12 in Billingsley, 1995). This shows that the optimization problem (1) has the same solution as the optimization problem
max_{π∈Π} q_π lim_{K→∞} E{ (1/K) Σ_{k=1}^K R(Y_π(k−1), Y_π(k)) }.
The strong law of large numbers for Markov chains also gives that for all π ∈ Π,
lim_{K→∞} (1/K) Σ_{k=1}^K R(Y_π(k−1), Y_π(k)) = lim_{K→∞} (1/K) Σ_{k=1}^K R′_π(Y_π(k−1)) a.s.,
where R′_π(x) = Σ_{x′∈D_x} p_π(x, x′) = Σ_{x′∈D_x} q_π(x, x′)/q_π for all x ∈ S (note that both limits may be random and may depend on Y_π(0)). Uniform integrability now gives that
lim_{K→∞} E{ (1/K) Σ_{k=1}^K R(Y_π(k−1), Y_π(k)) } = lim_{K→∞} E{ (1/K) Σ_{k=1}^K R′_π(Y_π(k−1)) },
and hence the optimization problem (1) is equivalent to the optimization problem
max_{π∈Π} q_π lim_{K→∞} E{ (1/K) Σ_{k=1}^K R′_π(Y_π(k−1)) }.
Therefore, if one selects q_π = q for all π ∈ Π (which is always possible in our setting), then the optimization problem (1) has the same solution as
max_{π∈Π} lim_{K→∞} E{ (1/K) Σ_{k=1}^K R′_π(Y_π(k−1)) }.
Finally, it is clear from the above that if R″_π(x) = Σ_{x′∈D_x} q_π(x, x′), the departure rate from state x under the policy π, for all x ∈ S and π ∈ Π, then the optimization problem (1) has the same solution as the Markov decision problem
max_{π∈Π} lim_{K→∞} E{ (1/K) Σ_{k=1}^K R″_π(Y_π(k−1)) }.  (8)
This is the alternative formulation of the optimization problem (1) used in this paper. We have shown that maximizing the steady-state throughput is equivalent to maximizing the steady-state departure rate for the associated embedded (discrete-time) Markov chain.
4 The case with two stations
In this section, we consider the special case of a tandem Markovian queueing network with two stations and two servers (i.e., M = N = 2). The buffer size between stations 1 and 2 is a fixed, finite integer B_2 = B ≥ 0. It is clear that the numbers of possible states and actions are both finite in this setting. Therefore, the existence of an optimal Markovian stationary deterministic policy follows immediately from Theorem 9.1.8 of Puterman (1994). We now specify the server assignment policy that maximizes the long-run average throughput:
Theorem 4.1 For a Markovian system of tandem queues with two stations and two servers, the following hold:
(i) In the class Π of Markovian stationary deterministic policies, the policy that assigns server 1 to station 1 and server 2 to station 2 unless station 1 is blocked or station 2 is starved, and that assigns both servers to station 1 (station 2) when station 2 (station 1) is starved (blocked), is optimal if μ11 μ22 − μ21 μ12 ≥ 0. Moreover, this is the unique optimal policy if the inequality is strict.
(ii) In the class Π, the policy that assigns server 1 to station 2 and server 2 to station 1 unless station 1 is blocked or station 2 is starved, and that assigns both servers to station 1 (station 2) when station 2 (station 1) is starved (blocked), is optimal if μ21 μ12 − μ11 μ22 ≥ 0. Moreover, this is the unique optimal policy if the inequality is strict.
Note that this result shows that the optimal server assignment policy uses the movement and cooperation of servers only to avoid idleness of servers, even though our model assumes that the travel times between the service facilities are negligible and that the service rates of cooperating servers are additive. The optimal policy consists of two components: a primary assignment of servers to stations (which maximizes the product of the service rates of the servers assigned to the different stations) and a contingency plan that specifies what a server should do when there is no work to be done at the station the server is assigned to.
Even though servers move only to avoid idleness under the optimal server assignment policy, it is still possible that server movements will occur with high frequency. This will, for instance, be the case when the server assigned to one station works at a much higher rate than the server assigned to the other station. Therefore, it is clear that the above server assignment policy will not necessarily be optimal in the presence of set-up times or when service is non-preemptive. In such cases, the movement of servers may be so costly that it is preferable to allow servers to idle under some circumstances.
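The two-component structure of the optimal policy translates directly into a decision rule. The sketch below is our own illustration (the state encoding (i, l, j) follows Section 4, but the function and its name are hypothetical); it returns the stations at which servers 1 and 2 should work:

```python
def optimal_assignment(mu, state):
    """Server assignment of Theorem 4.1 for M = N = 2.  mu[i][j] is the rate
    of server i+1 at station j+1 (0-indexed); state = (i, l, j) gives the
    status of station 1 (1 operating, 2 blocked), the buffer content, and
    the status of station 2 (0 starved, 1 operating).
    Returns (station of server 1, station of server 2)."""
    s1, _, s2 = state
    if s2 == 0:        # contingency: station 2 starved, both servers to station 1
        return (1, 1)
    if s1 == 2:        # contingency: station 1 blocked, both servers to station 2
        return (2, 2)
    # Primary assignment: pick the pairing that maximizes the product of the
    # assigned service rates (mu11*mu22 versus mu21*mu12).
    if mu[0][0] * mu[1][1] >= mu[1][0] * mu[0][1]:
        return (1, 2)  # case (i) of Theorem 4.1
    return (2, 1)      # case (ii)
```

For example, with mu = [[2, 1], [1, 3]] (so that μ11 μ22 = 6 > μ21 μ12 = 1), the rule keeps server 1 at station 1 and server 2 at station 2 whenever both stations have work.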
Proof: Suppose first that μ_{1j} = μ_{2j} = 0 for some j ∈ {1, 2} (i.e., there is at least one station where no server is capable of working). Then the long-run average throughput is zero under any policy (and hence the policies described in Theorem 4.1 are optimal). On the other hand, if μ_{i1} = μ_{i2} = 0 for some i ∈ {1, 2} (i.e., server i is incapable of working at both stations), then Corollary 2.2 shows that all non-idling policies, including the policies described in Theorem 4.1, are optimal. This shows that we can assume, without loss of generality, that there exist j_1, j_2 ∈ {1, 2}, j_1 ≠ j_2, such that μ_{1j_1} > 0 and μ_{2j_2} > 0.
For N = 2, the state space of the Markov chain {X_π(t)}, where π ∈ Π, reduces to S = {(1,0,0), (1,0,1), …, (1,B,1), (2,B,1)}, where in state (i, l, j) ∈ S, i refers to the status of station 1, l refers to the number of jobs in the buffer, and j refers to the status of station 2. We will use the notation a_{σ1σ2} for the possible actions, where, for i = 1, 2, σ_i ∈ {I, 1, 2} is the status of server i, with σ_i = I when server i is idling and σ_i = j ∈ {1, 2} when server i is working at station j. Then the set A_s of allowable actions in state s ∈ S is given by
A_s = {a_II, a_I1, a_1I, a_11} for s = (1,0,0),
A_s = {a_II, a_I1, a_I2, a_1I, a_2I, a_11, a_12, a_21, a_22} for s = (1,l,1), where 0 ≤ l ≤ B,
A_s = {a_II, a_I2, a_2I, a_22} for s = (2,B,1).
Note that the set of possible actions in states (1,0,0) and (2,B,1) can be reduced. For example, in state (1,0,0), action a_II is identical to actions a_I2, a_2I, and a_22, action a_I1 is identical to action a_21, and action a_1I is identical to action a_12.
Without loss of generality, it suffices to prove part (i) of Theorem 4.1. Note that under our assumptions on the service rates (i.e., μ11 μ22 − μ21 μ12 ≥ 0 and there exist j_1, j_2 ∈ {1, 2}, j_1 ≠ j_2, such that μ_{1j_1} > 0 and μ_{2j_2} > 0), neither μ11 nor μ22 can be equal to zero. This shows that the optimal policy suggested in part (i) corresponds to an irreducible Markov chain, and consequently that we have a communicating Markov decision process. Therefore, we use the policy iteration algorithm for communicating models (see pages 479 and 480 of Puterman, 1994) to prove the optimality of the policies described in Theorem 4.1.
Since Π is the set of Markovian stationary deterministic policies, we can describe each feasible policy by a (B + 3)-dimensional vector d whose components d(s) ∈ A_s specify which action in A_s should be applied in state s for all s ∈ S. Similarly, let P_d be the (B+3) × (B+3)-dimensional transition probability matrix corresponding to the policy described by d and let r_d be the (B+3)-dimensional reward vector corresponding to d, with r_d(s) denoting the reward earned in state s under the policy corresponding to d, for all s ∈ S. For the purposes of this proof, we use the uniformization constant q = μ11 + μ12 + μ21 + μ22.
In the policy iteration algorithm, we start by choosing
d_0(s) = a_11 for s = (1,0,0),
d_0(s) = a_12 for s = (1,l,1), ∀ 0 ≤ l ≤ B,
d_0(s) = a_22 for s = (2,B,1),
corresponding to the policy described in part (i) of Theorem 4.1. Then
r_{d_0}(s) = 0 for s = (1,0,0),
r_{d_0}(s) = μ22 for s = (1,l,1), ∀ 0 ≤ l ≤ B,
r_{d_0}(s) = μ12 + μ22 for s = (2,B,1),
and
P_{d_0}(s, s′) = (μ11 + μ21)/q for s = (1,0,0), s′ = (1,0,1),
P_{d_0}(s, s′) = (μ12 + μ22)/q for s = s′ = (1,0,0),
P_{d_0}(s, s′) = μ22/q for s = (1,0,1), s′ = (1,0,0),
P_{d_0}(s, s′) = μ22/q for s = (1,l,1), s′ = (1,l−1,1), ∀ 1 ≤ l ≤ B,
P_{d_0}(s, s′) = (μ12 + μ21)/q for s = s′ = (1,l,1), ∀ 0 ≤ l ≤ B,
P_{d_0}(s, s′) = μ11/q for s = (1,l,1), s′ = (1,l+1,1), ∀ 0 ≤ l ≤ B − 1,
P_{d_0}(s, s′) = μ11/q for s = (1,B,1), s′ = (2,B,1),
P_{d_0}(s, s′) = (μ12 + μ22)/q for s = (2,B,1), s′ = (1,B,1),
P_{d_0}(s, s′) = (μ11 + μ21)/q for s = s′ = (2,B,1).
Since the Markov chain {Y_π(k)} under the policy π corresponding to d_0 is irreducible, we find a scalar g_0 and a vector h_0 solving
r_{d_0} − g_0 e + (P_{d_0} − I) h_0 = 0,  (9)
subject to h_0((1,0,0)) = 0. Note that e is a column vector of ones and I is the identity matrix.
g0 =
8<:
(�11+�21)(�12+�22)(�B+2
22��B+2
11)
(�B+222
��B+211
)(�12+�21)+(�B+1
22��B+1
11)�21�12+�
B+3
22��B+3
11
if �11 6= �22,(B+2)�22(�11+�21)(�12+�22)
�11(�12+�22)+(B+1)(�12+�22)(�11+�21)+�11(�11+�21)if �11 = �22,
and h0((1; 0; 0)) = 0,
h0((1; l; 1)) = qg0
Pl�1j=0(j + 1)(�21 + �22)�
l�1�j22 �j11 + (l + 1)�l11
�l11(�11 + �21)
�q�22
Pl�1j=0(j + 1)�l�1�j22 �j11
�l11; 80 � l � B;
h0((2; B; 1)) = qg0
PBj=0(j + 1)(�21 + �22)�
B�j22 �j11 + (B + 2)�B+111
�B+111 (�11 + �21)� q�22
PBj=0(j + 1)�B�j22 �j11
�B+111
(with the convention that summation over an empty set equals zero) constitute a solution
to (9). Note that g0 > 0, regardless of whether �11 6= �22 or �11 = �22.
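Since g_0 is the long-run average reward of the embedded chain under d_0, the closed form above can be cross-checked numerically: build P_{d_0} and r_{d_0}, compute the stationary distribution, and compare π · r_{d_0} with the closed form. The sketch below is our own code, not the paper's (power iteration is used for the stationary distribution), and covers the case μ11 ≠ μ22:

```python
def gain_closed_form(m11, m12, m21, m22, B):
    """Closed-form g0 for mu11 != mu22, transcribed from the text."""
    num = (m11 + m21) * (m12 + m22) * (m22 ** (B + 2) - m11 ** (B + 2))
    den = ((m22 ** (B + 2) - m11 ** (B + 2)) * (m12 + m21)
           + (m22 ** (B + 1) - m11 ** (B + 1)) * m21 * m12
           + m22 ** (B + 3) - m11 ** (B + 3))
    return num / den

def gain_numeric(m11, m12, m21, m22, B, iters=20000):
    """Average reward pi . r_{d0} of the uniformized chain under d0.
    State indices: 0 = (1,0,0); 1..B+1 = (1,l,1) for l = 0..B; B+2 = (2,B,1)."""
    q = m11 + m12 + m21 + m22            # uniformization constant from the proof
    n = B + 3
    P = [[0.0] * n for _ in range(n)]
    P[0][1] = (m11 + m21) / q            # both servers at station 1 finish a job
    P[0][0] = (m12 + m22) / q            # uniformization self-loop
    for l in range(B + 1):
        s = l + 1
        P[s][s - 1] = m22 / q            # departure at station 2
        P[s][s] = (m12 + m21) / q        # self-loop
        P[s][s + 1] = m11 / q            # completion at station 1
    P[B + 2][B + 1] = (m12 + m22) / q    # departure unblocks station 1
    P[B + 2][B + 2] = (m11 + m21) / q    # self-loop
    r = [0.0] + [m22] * (B + 1) + [m12 + m22]
    pi = [1.0 / n] * n
    for _ in range(iters):               # power iteration: pi <- pi P
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return sum(p * rew for p, rew in zip(pi, r))
```

For instance, with μ11 = 2, μ12 = 1, μ21 = 1, μ22 = 3 and B = 1, both routes give g_0 = 19/9 ≈ 2.111; since μ11 μ22 − μ21 μ12 = 5 > 0, this is also the optimal throughput by Theorem 4.1(i).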
For the remainder of the proof, we assume that μ11 ≠ μ22; the proof for μ11 = μ22 is similar. For all s ∈ S and a ∈ A_s, let r(s, a) be the immediate reward obtained when action a is chosen in state s and let p(j|s, a) be the probability of going to state j in one step when action a is chosen in state s. As the next step of the policy iteration algorithm, we choose
d_1(s) ∈ argmax_{a∈A_s} { r(s, a) + Σ_{j∈S} p(j|s, a) h_0(j) },  ∀ s ∈ S,
setting d_1(s) = d_0(s) if possible. We now show that if μ11 μ22 − μ21 μ12 ≥ 0, then d_1(s) = d_0(s) for all s ∈ S. In particular, for all s ∈ S and a ∈ A_s, we will compute the differences
r(s, a) + Σ_{j∈S} p(j|s, a) h_0(j) − [ r(s, d_0(s)) + Σ_{j∈S} p(j|s, d_0(s)) h_0(j) ]  (10)
and show that the differences are non-positive when μ11 μ22 − μ21 μ12 ≥ 0.
For s = (1,0,0), recall that d_0(s) = a_11. We have
r(s, a_II) + Σ_{j∈S} p(j|s, a_II) h_0(j) − [ r(s, a_11) + Σ_{j∈S} p(j|s, a_11) h_0(j) ] = −g_0 < 0,
r(s, a_I1) + Σ_{j∈S} p(j|s, a_I1) h_0(j) − [ r(s, a_11) + Σ_{j∈S} p(j|s, a_11) h_0(j) ] = − μ11 g_0 / (μ11 + μ21) < 0, and
r(s, a_1I) + Σ_{j∈S} p(j|s, a_1I) h_0(j) − [ r(s, a_11) + Σ_{j∈S} p(j|s, a_11) h_0(j) ] = − μ21 g_0 / (μ11 + μ21) ≤ 0.
(Recall that μ11 and μ22 are both positive under our assumptions, as was discussed previously.) Note that in the last equation, the expression is equal to zero only when μ21 = 0, in which case a_11 is identical to a_1I. This shows that d_1(s) = d_0(s) for s = (1,0,0).
For s = (1,l,1), where 0 ≤ l ≤ B, we have that d_0(s) = a_12. Since the set A_s of all possible actions is large, in the interest of space we will specify the difference in (10) only for the actions a_21, a_11, and a_22. We have
r(s, a_21) + Σ_{j∈S} p(j|s, a_21) h_0(j) − [ r(s, a_12) + Σ_{j∈S} p(j|s, a_12) h_0(j) ]
= − (μ11 μ22 − μ21 μ12) [ (μ22^{B+2} − μ11^{B+2}) + μ12 μ22^l (μ22^{B+1−l} − μ11^{B+1−l}) + μ21 μ11^{B−l} (μ22^{l+1} − μ11^{l+1}) ] / [ (μ22^{B+2} − μ11^{B+2})(μ12 + μ21) + (μ22^{B+1} − μ11^{B+1}) μ21 μ12 + μ22^{B+3} − μ11^{B+3} ] ≤ 0,
where the expression is equal to zero only when μ11 μ22 − μ21 μ12 = 0. Similarly,
r(s, a_11) + Σ_{j∈S} p(j|s, a_11) h_0(j) − [ r(s, a_12) + Σ_{j∈S} p(j|s, a_12) h_0(j) ]
= − μ11^{B−l} (μ11 μ22 − μ21 μ12)(μ11 + μ21)(μ22^{l+1} − μ11^{l+1}) / [ (μ22^{B+2} − μ11^{B+2})(μ12 + μ21) + (μ22^{B+1} − μ11^{B+1}) μ21 μ12 + μ22^{B+3} − μ11^{B+3} ] ≤ 0,
where the expression is equal to zero only when μ11 μ22 − μ21 μ12 = 0. Moreover,
r(s, a_22) + Σ_{j∈S} p(j|s, a_22) h_0(j) − [ r(s, a_12) + Σ_{j∈S} p(j|s, a_12) h_0(j) ]
= − μ22^{l+1} (μ11 μ22 − μ21 μ12)(μ12 + μ22)(μ22^{B−l} − μ11^{B−l}) / [ (μ22^{B+2} − μ11^{B+2})(μ12 + μ21) + (μ22^{B+1} − μ11^{B+1}) μ21 μ12 + μ22^{B+3} − μ11^{B+3} ] ≤ 0,
where the expression is equal to zero only when μ11 μ22 − μ21 μ12 = 0.
Finally, consider s = (2,B,1), so that d_0(s) = a_22. We have
r(s, a_II) + Σ_{j∈S} p(j|s, a_II) h_0(j) − [ r(s, a_22) + Σ_{j∈S} p(j|s, a_22) h_0(j) ] = −g_0 < 0,
r(s, a_I2) + Σ_{j∈S} p(j|s, a_I2) h_0(j) − [ r(s, a_22) + Σ_{j∈S} p(j|s, a_22) h_0(j) ]
= − (μ11 + μ21) μ12 (μ22^{B+2} − μ11^{B+2}) / [ (μ22^{B+2} − μ11^{B+2})(μ12 + μ21) + (μ22^{B+1} − μ11^{B+1}) μ21 μ12 + μ22^{B+3} − μ11^{B+3} ] ≤ 0, and
r(s, a_2I) + Σ_{j∈S} p(j|s, a_2I) h_0(j) − [ r(s, a_22) + Σ_{j∈S} p(j|s, a_22) h_0(j) ]
= − (μ11 + μ21) μ22 (μ22^{B+2} − μ11^{B+2}) / [ (μ22^{B+2} − μ11^{B+2})(μ12 + μ21) + (μ22^{B+1} − μ11^{B+1}) μ21 μ12 + μ22^{B+3} − μ11^{B+3} ] < 0,
where the equality in the second equation holds only when μ12 = 0, implying that a_22 = a_I2.
We have shown that if \mu_{11}\mu_{22} - \mu_{21}\mu_{12} \ge 0, then d_1(s) = d_0(s) for all s \in S. By Theorem 9.5.1 of Puterman (1994), this proves that the policy described in Theorem 4.1(i) is optimal. The uniqueness of the optimal policy when \mu_{11}\mu_{22} - \mu_{21}\mu_{12} > 0 follows from the action elimination results. In particular, Proposition 8.5.10 of Puterman (1994) implies that if

r(s, a) - g_0 + \sum_{j \in S} p(j \mid s, a) h_0(j) - h_0(s) < 0, \quad (11)

then any policy that chooses action a in state s cannot be optimal. Since h_0 satisfies equation (9), we have

g_0 + h_0(s) = r(s, d_0(s)) + \sum_{j \in S} p(j \mid s, d_0(s)) h_0(j), \quad \forall s \in S.

It then immediately follows from the above expressions that equation (11) holds for all a \ne d_0(s) if \mu_{11}\mu_{22} - \mu_{21}\mu_{12} > 0, and the proof is complete. □
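The sign conditions above can be spot-checked numerically. The sketch below simply transcribes the three closed-form differences (shared denominator and numerators as displayed above) and samples random rates satisfying μ11μ22 − μ21μ12 > 0; it is an illustrative check, not part of the proof.

```python
import random

def differences(m11, m12, m21, m22, B, l):
    """The three action-comparison differences for the two-station,
    two-server tandem line, with buffer size B and 0 <= l <= B.
    All three share the same denominator D."""
    D = ((m22**(B + 2) - m11**(B + 2)) * (m12 + m21)
         + (m22**(B + 1) - m11**(B + 1)) * m21 * m12
         + m22**(B + 3) - m11**(B + 3))
    d_a11_a12 = -(m11**(B - l) * (m11 * m22 - m21 * m12)
                  * (m11 + m21) * (m22**(l + 1) - m11**(l + 1))) / D
    d_a22_a12 = -(m22**(l + 1) * (m11 * m22 - m21 * m12)
                  * (m12 + m22) * (m22**(B - l) - m11**(B - l))) / D
    d_aI2_a22 = -((m11 + m21) * m12 * (m22**(B + 2) - m11**(B + 2))) / D
    return d_a11_a12, d_a22_a12, d_aI2_a22

random.seed(0)
for _ in range(1000):
    m11, m12, m21, m22 = (random.uniform(0.1, 3.0) for _ in range(4))
    if m11 * m22 <= m21 * m12:   # the proof assumes mu11*mu22 - mu21*mu12 >= 0
        continue
    B = random.randint(1, 6)
    l = random.randint(0, B)
    # every difference (alternative action minus d0(s)) should be nonpositive
    assert all(d <= 1e-12 for d in differences(m11, m12, m21, m22, B, l))
print("all sampled differences are nonpositive")
```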
5 Server assignment policies for larger systems
This section is concerned with server assignment policies for tandem systems with more than
two stations. We use the insights gained in Section 4 to develop three simple (heuristic) server
assignment policies for systems in which the number of stations equals the number of servers.
We then compare the performance of our policies with a number of other server assignment
policies, including the optimal policy, three benchmark policies with stationary servers, and
the (expedite) teamwork policy studied by Van Oyen et al. (2001), using numerical results
and other considerations, and we recommend one of the proposed server assignment policies
over the other policies considered here. The outline of this section is as follows: The various
server assignment policies are described in Section 5.1. In Section 5.2, we present a numerical
comparison of the different policies considered in this section. Finally, in Section 5.3, we
recommend one of the proposed server assignment policies and justify our recommendation.
5.1 Heuristic server assignment policies
In Section 4, we determined the optimal server assignment policy for tandem systems with
two stations and two servers. This policy is very simple, and consists of two parts: a primary
assignment of servers to stations and a contingency plan specifying what servers will do when
there is no work available at the stations they are assigned to. Unfortunately, our experience
indicates that the optimal policy for larger systems (with three or more stations, three or
more servers, or both) is generally not so simple, and may involve servers moving away from
stations even when there is work to be done at the stations they are moving away from.
We now consider policies of the form found to be optimal for tandem networks of two
stations and two servers (consisting of primary assignment and contingency parts) applied to
Markovian networks with arbitrary numbers of finite queues in tandem in which the number
of servers M equals the number of stations N (even though we know that policies of this
form are not necessarily optimal for larger systems). Based on Theorem 4.1, we propose the
following primary assignment criterion for the general M = N ≥ 2 case:

Assign each server i ∈ {1, ..., M} to station j_i ∈ {1, ..., M} in such a manner that {j_1, ..., j_M} = {1, ..., M} (i.e., there is exactly one server assigned to each station) and the product \prod_{i=1}^{M} \mu_{i j_i} is maximized.
This primary assignment criterion appears to balance the line to the extent possible, except in situations where some servers can work at great speed at a subset of the tasks. For example, if N = M = 2, μ11 = x, μ12 = 4, μ21 = 5, and μ22 = 1, then the unbalanced configuration of assigning server i to station i, for i = 1, 2, will only be optimal if x ≥ 20.
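For the small lines considered here, the primary assignment criterion can be computed by enumerating all M! one-to-one assignments. The sketch below does exactly that; the rate matrices are hypothetical instances in the spirit of the example above (with μ12 = 4, μ21 = 5, μ22 = 1, and two illustrative values for μ11), not values from the paper.

```python
from itertools import permutations
from math import prod

def primary_assignment(mu):
    """Return the one-to-one server-to-station assignment (j_0, ..., j_{M-1})
    maximizing the product of mu[i][j_i], by brute-force enumeration.
    mu[i][j] is server i's rate at station j (0-indexed)."""
    M = len(mu)
    return max(permutations(range(M)),
               key=lambda js: prod(mu[i][js[i]] for i in range(M)))

# With mu11 = 25 the product 25 * 1 beats 4 * 5 = 20, so server i is
# assigned to station i; with mu11 = 10 the cross assignment wins.
print(primary_assignment([[25, 4], [5, 1]]))   # (0, 1)
print(primary_assignment([[10, 4], [5, 1]]))   # (1, 0)
```

Brute force is O(M!), which is fine for M = N in {2, 3, 5} as studied here; taking logarithms turns this into a standard linear assignment problem for larger M.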
We consider three different contingency plans. In the first contingency plan, the focus of each server is always on creating work for itself at the station it is assigned to. In particular, when a server is idle, then the server will move downstream when it is blocked (to clear the block at the station it is assigned to) and the server will move upstream when it is starved (to create work for itself at the station it is assigned to). When the server is both blocked and starved, a higher priority is put on clearing the block than on creating new work. More specifically, this first contingency plan involves each server using the following procedure for finding work when there is no work to be done at the station it is assigned to:
At any time when station j ∈ {1, ..., N−1} is blocked, the server assigned to station j will be working downstream at the nearest station k > j where there is work to be done and where there is room for at least one job in the buffer following station k (there will always be such a station k since there is unlimited room following station N).
At any given time when station j ∈ {2, ..., M} is starved but not blocked, the server assigned to station j will be working upstream at the nearest station k < j where there is work to be done (there will always be such a station k since there is an unlimited supply of jobs preceding station 1).
In other words, if a station is blocked, then its assigned server will work downstream at the
nearest station where a service completion does not increase the number of blocked stations,
and if a station is starved but not blocked, then its assigned server will work upstream at the
nearest station where a service completion would move a job towards the starved station.
In the second contingency plan, each server will use the time when there is no work to
be done at its assigned station to push new jobs into the system. More specifically,

At any given time, all servers that have no work to do at the station they are assigned to will be working at the lowest numbered station 1 ≤ k ≤ M that is not blocked.
Finally, in the third contingency plan, each server will use the time when there is no work to be done at its assigned station to pull completed jobs out of the system. More specifically,

At any given time, all servers that have no work to do at the station they are assigned to will be working at the highest numbered station 1 ≤ k ≤ M that is not starved.
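Assuming each server can observe, for every station, whether it currently has work and whether it is blocked or starved, the three contingency rules can be sketched as station-selection functions for an idle server. This is an illustrative sketch (station indices 0-based), not the authors' implementation.

```python
def local_rule(j, has_work, blocked, starved, room_after):
    """Local plan for the server assigned to station j (which has no work):
    if j is blocked, work downstream at the nearest k > j with work and room
    in the buffer after k; if starved (and not blocked), work upstream at
    the nearest k < j with work.  Blocking takes priority over starving."""
    N = len(has_work)
    if blocked[j]:
        for k in range(j + 1, N):
            if has_work[k] and room_after[k]:
                return k
    if starved[j]:
        for k in range(j - 1, -1, -1):
            if has_work[k]:
                return k
    return j  # nothing suitable found; stay at the assigned station

def push_rule(blocked):
    """Push plan: idle servers work at the lowest-numbered non-blocked station."""
    return min(k for k in range(len(blocked)) if not blocked[k])

def pull_rule(starved):
    """Pull plan: idle servers work at the highest-numbered non-starved station."""
    return max(k for k in range(len(starved)) if not starved[k])
```

For the last station, `room_after` should always be true, matching the unlimited room following station N; similarly, station 0 can never be starved, since there is an unlimited supply of jobs preceding station 1.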
For all three contingency plans, each server should ensure that it does not idle when there
is some work in the system that it is capable of performing (this is an issue when some of
the service rates equal zero). Also, it is easy to see that all three contingency plans have the
feature that during each period of time when a station j is blocked or starved, the server
assigned to station j may work at several different stations before returning to station j.
For example, in the first contingency plan, if station j is blocked and its assigned server is working at station k > j, then upon the next service completion at station k, the server assigned to station j will backtrack to the nearest station l < k where there is now work to be done with room in the following buffer. Note that there will be exactly one such station l satisfying j ≤ l < k, and that this station l may satisfy l > j. Similarly (again for the first
contingency plan), if station j is starved (but not blocked) and its assigned server is working
at a station k < j, then a service completion at station k would cause the server assigned to
station j to move from station k to station l = k + 1, and station l may satisfy l < j.
When these three contingency plans are implemented with the primary assignment strat-
egy described above, the resulting heuristics will be referred to as the local, push, and pull
heuristics, respectively. We will compare these three heuristics with several other server
assignment policies, including the optimal policy (i.e., the policy that solves the optimiza-
tion problem (8)) and four benchmark policies, namely the non-moving policy with server i
assigned to station i at all times, the non-moving heuristic using our criterion for assigning
servers to stations, the best non-moving heuristic (with the best primary assignment), and
the teamwork policy of Van Oyen et al. (2001) (where all servers work in a single team that
will follow each job from the first to the last station and only starts work on a new job once
all work on the previous job has been completed). We will also compare the three heuristics
with the best local, push, and pull heuristics that use the best primary assignment of servers
to stations instead of our heuristic primary assignment criterion.
5.2 Numerical results
In this section, we present numerical results that show that the three server assignment
heuristics proposed in Section 5.1 generally appear to yield either optimal or near-optimal throughput for tandem systems of finite queues with M = N. In our first set of numerical experiments, we consider the following five special cases:
The Generalist Balanced Case with Identical Servers: For all i, j = 1, ..., M, let μij = 1.

The Generalist Balanced Case with Different Servers: For all i, j = 1, ..., M, let μij = i.

The Generalist Unbalanced Case: There exists a station j0 ∈ {1, ..., M} such that μij0 = 0.5 for all i = 1, ..., M. For all i, j = 1, ..., M, j ≠ j0, let μij = 1.

The Specialist Balanced Case: For all i, j = 1, ..., M, i ≠ j, let μjj = 2 and μij = 1.

The Specialist Unbalanced Case: There exists a station j0 ∈ {1, ..., M} such that μj0j0 = 1.5. For all j = 1, ..., M, j ≠ j0, let μjj = 2. For all i, j = 1, ..., M, i ≠ j, let μij = 1.
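The five rate matrices can be generated programmatically; a small sketch (the function names and the 0-based `bottleneck` argument for j0 are my own, not the paper's):

```python
def generalist_balanced_identical(M):
    """All servers and stations identical: mu_ij = 1."""
    return [[1.0] * M for _ in range(M)]

def generalist_balanced_different(M):
    """Server-dependent rates: mu_ij = i (servers numbered 1..M)."""
    return [[float(i)] * M for i in range(1, M + 1)]

def generalist_unbalanced(M, bottleneck):
    """One slow station j0 with mu_ij0 = 0.5; all other rates 1."""
    return [[0.5 if j == bottleneck else 1.0 for j in range(M)]
            for _ in range(M)]

def specialist(M, bottleneck=None):
    """Specialist cases: mu_jj = 2 (1.5 at the bottleneck, if any),
    and mu_ij = 1 for i != j."""
    mu = [[1.0] * M for _ in range(M)]
    for j in range(M):
        mu[j][j] = 1.5 if j == bottleneck else 2.0
    return mu
```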
Note that these five classes of test problems include systems with generalist servers and also systems with specialist servers (see Section 2). These test problems also include both balanced and unbalanced tandem lines (the balanced lines do not have a bottleneck station and the unbalanced lines have a single bottleneck station). We consider all five classes of test problems with M = N ∈ {3, 5}. When M = N = 3, we let B2 = B3 = 10, and when M = N = 5, we let B2 = B3 = B4 = B5 = 1. In all unbalanced cases, we consider networks with a single bottleneck at the beginning of the line, middle of the line, and end of the line (i.e., when N = 3, we consider tandem lines with a single bottleneck at station 1, 2, or 3, and when N = 5, we consider tandem lines with a single bottleneck at station 1, 3, or 5).
Note that for four of the above five cases (all except the generalist balanced case with different servers), it is optimal to assign server i to station i for all i = 1, ..., M, and that this is achieved by our primary assignment heuristic. Therefore, it suffices to compare the local, push, and pull heuristics with the optimal, teamwork, and non-moving policies for these four cases. For the generalist balanced case with different servers, we also need to consider the best non-moving heuristic. This is because the primary assignment of server i to station i is not optimal in this case. In fact, the "bowl effect" for tandem lines with non-moving servers (see for example Hillier and Boling, 1966, and Yamazaki et al., 1992) indicates that the fastest servers should be assigned to stations near the center of the line. For example, when M = N = 3, then server 3 should be assigned to station 2.
The results of our numerical experiments are given in Tables 1 through 7, where the columns titled "Throughput" and "WIP" show the long-run average throughput and work in process, respectively, of the different server assignment policies. The optimal policy and corresponding steady-state throughput were obtained by using the policy iteration algorithm for communicating discrete time Markov chains to solve the optimization problem (8), as is done in the proof of Theorem 4.1. The steady-state throughput and WIP corresponding to the other policies (and the steady-state WIP for the optimal policy) were computed directly using the stationary distributions of the underlying discrete time Markov chains. In addition to the numerical results shown in Tables 1 through 7, we have obtained numerical results for a number of systems that were defined by adding 0.1 to (and sometimes also subtracting 0.1 from) some of the service rates in the above five cases. This was done to test the sensitivity of the results to the specific parameter values chosen for the above cases. These numerical results are similar to the results shown in Tables 1 through 7.
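The stationary-distribution step mentioned above amounts to solving π P = π with Σπ = 1 for the embedded discrete time chain. A minimal sketch (the 2-state transition matrix is a toy illustration, not one of the paper's models):

```python
import numpy as np

def stationary_distribution(P):
    """Stationary distribution pi of an irreducible DTMC with transition
    matrix P, found by solving pi (P - I) = 0 together with sum(pi) = 1
    as an overdetermined linear system."""
    n = P.shape[0]
    A = np.vstack([P.T - np.eye(n), np.ones(n)])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

# Toy 2-state chain; balance gives pi proportional to (0.3, 0.2).
P = np.array([[0.8, 0.2],
              [0.3, 0.7]])
pi = stationary_distribution(P)
print(pi.round(3))  # approximately [0.6 0.4]
```

Steady-state throughput then follows by summing π(s) times the departure rate out of each state s, and WIP by summing π(s) times the number of jobs in state s.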
Tables 1 through 4 show that the local, push, and pull heuristics and the teamwork policy
all achieve optimal throughput in the generalist cases. This is consistent with Theorem 2.1
and Corollary 2.2, since these are all non-idling policies. Moreover, the three heuristics and
teamwork policy yield substantial improvements over the non-moving policy (ranging from
Policy              Three Stations            Five Stations
                    Throughput   WIP          Throughput   WIP
Optimal Policy      1.000000     10.312435    1.000000     1.800000
Local Heuristic     1.000000     12.885649    1.000000     6.198983
Push Heuristic      1.000000     15.801848    1.000000     7.823768
Pull Heuristic      1.000000     9.718327     1.000000     3.602937
Teamwork Policy     1.000000     1.000000     1.000000     1.000000
Non-Moving Policy   0.894419     12.871976    0.607583     6.135394

Table 1: The Generalist Balanced Case with Identical Servers.
Policy                     Three Stations            Five Stations
                           Throughput   WIP          Throughput   WIP
Optimal Policy             2.000000     14.366447    3.000000     1.800000
Local Heuristic            2.000000     2.999846     3.000000     3.327466
Push Heuristic             2.000000     8.333508     3.000000     6.434113
Pull Heuristic             2.000000     2.294802     3.000000     2.089633
Teamwork Policy            2.000000     1.000000     3.000000     1.000000
Best Non-Moving Heuristic  0.999999     22.374647    0.988451     8.327375
Non-Moving Policy          0.999878     2.498125     0.929409     2.614003

Table 2: The Generalist Balanced Case with Different Servers.
Policy             Bottleneck at Station 1   Bottleneck at Station 2   Bottleneck at Station 3
                   Throughput   WIP          Throughput   WIP          Throughput   WIP
Optimal Policy     0.750000     6.491690     0.750000     14.868122    0.750000     20.543528
Local Heuristic    0.750000     3.747617     0.750000     12.746915    0.750000     21.487172
Push Heuristic     0.750000     5.476488     0.750000     13.383503    0.750000     22.320113
Pull Heuristic     0.750000     2.493495     0.750000     12.012699    0.750000     20.225886
Teamwork Policy    0.750000     1.000000     0.750000     1.000000     0.750000     1.000000
Non-Moving Policy  0.499939     2.997666     0.499882     12.499844    0.499939     22.053571

Table 3: The Generalist Unbalanced Case with Three Stations.
Policy             Bottleneck at Station 1   Bottleneck at Station 3   Bottleneck at Station 5
                   Throughput   WIP          Throughput   WIP          Throughput   WIP
Optimal Policy     0.833333     1.666667     0.833333     1.833333     0.833333     1.833333
Local Heuristic    0.833333     4.481176     0.833333     6.120387     0.833333     7.558352
Push Heuristic     0.833333     7.079568     0.833333     7.422197     0.833333     8.563677
Pull Heuristic     0.833333     2.365190     0.833333     4.207580     0.833333     4.701035
Teamwork Policy    0.833333     1.000000     0.833333     1.000000     0.833333     1.000000
Non-Moving Policy  0.452899     3.995338     0.438765     5.879080     0.452899     7.904704

Table 4: The Generalist Unbalanced Case with Five Stations.
Policy             Three Stations            Five Stations
                   Throughput   WIP          Throughput   WIP
Optimal Policy     1.932518     12.963370    1.728204     6.488199
Local Heuristic    1.929884     12.869740    1.715618     6.132353
Push Heuristic     1.913000     14.727949    1.665518     7.398706
Pull Heuristic     1.907984     10.947652    1.598731     4.488739
Teamwork Policy    1.333333     1.000000     1.200000     1.000000
Non-Moving Policy  1.788837     12.871976    1.215167     6.135394

Table 5: The Specialist Balanced Case.
Policy             Bottleneck at Station 1   Bottleneck at Station 2   Bottleneck at Station 3
                   Throughput   WIP          Throughput   WIP          Throughput   WIP
Optimal Policy     1.739319     10.023062    1.740004     12.864368    1.739319     15.902414
Local Heuristic    1.714386     6.938373     1.739867     12.842823    1.723097     18.612142
Push Heuristic     1.737466     9.433618     1.718053     13.910340    1.672892     20.114778
Pull Heuristic     1.645108     5.149370     1.717268     11.714313    1.737174     16.377073
Teamwork Policy    1.272727     1.000000     1.272727     1.000000     1.272727     1.000000
Non-Moving Policy  1.485968     6.470765     1.479007     12.737061    1.485968     19.086661

Table 6: The Specialist Unbalanced Case with Three Stations.
Policy             Bottleneck at Station 1   Bottleneck at Station 3   Bottleneck at Station 5
                   Throughput   WIP          Throughput   WIP          Throughput   WIP
Optimal Policy     1.647643     6.398480     1.649783     6.428326     1.647669     6.686640
Local Heuristic    1.617653     5.529211     1.638425     6.135185     1.638803     6.684818
Push Heuristic     1.611912     7.215560     1.588374     7.294791     1.567814     7.871986
Pull Heuristic     1.472112     3.751244     1.533066     4.623911     1.560953     4.733387
Teamwork Policy    1.178571     1.000000     1.178571     1.000000     1.178571     1.000000
Non-Moving Policy  1.131774     5.406879     1.103754     6.070718     1.131774     6.849260

Table 7: The Specialist Unbalanced Case with Five Stations.
about 12% improvement in Table 1 to about 90% improvement in Table 4).
For the specialist cases, Tables 5 through 7 show that the three heuristics always yield
near-optimal throughput that is substantially higher than the throughputs achieved by the
benchmark teamwork and non-moving policies. Among the three heuristics, the local heuris-
tic almost always yields the best performance, and the pull heuristic almost always shows
the worst performance. The only exceptions occur for the specialist unbalanced cases with
three stations and bottlenecks at stations 1 or 3. When the bottleneck is at station 1, then
the push heuristic yields the largest throughput, and when the bottleneck is at station 3,
then the pull heuristic yields the highest throughput (in both cases, the local heuristic is
second best). This is reasonable because the push heuristic always tries to assign all idle
servers to station 1, which is the bottleneck in the former case; similarly, the pull heuristic
always tries to assign all idle servers to station 3, which is the bottleneck in the latter case.
The non-moving policy yields better results than the teamwork policy for the specialist cases
with three stations and the specialist balanced case with five stations. This is because the
teamwork policy does not attempt to assign servers to the tasks that they are good at (i.e.,
where their service rates are large). However, the throughput of the teamwork policy is
slightly better than that of the non-moving policy for the specialist unbalanced case with
five stations. This suggests that as the number of stations increases, it becomes more and
more important to avoid idleness of servers, even at the expense of having servers spend
substantial amounts of time working on tasks that they are not particularly good at.
Tables 1 through 7 also show that the push heuristic always yields larger WIP than the
local and pull heuristics, and the pull heuristic always yields smaller WIP than the other
two heuristics. This is reasonable because the focus of the push heuristic is to push new jobs
into the system, which would be expected to yield higher WIP, whereas the focus of the pull
heuristic is to pull jobs out of the system, and hence reduce the WIP. The teamwork policy
always yields the optimal WIP of one. The WIP achieved by the non-moving policy always
lies between the WIP levels achieved by the push and pull policies and the WIP achieved
by the optimal policy ranges from being quite close to the optimal WIP (see for example
Table 4) to being larger than the WIP achieved by the push policy (see for example Table
3); recall however that for the generalist cases, the optimal policy is not unique.
Note that the five cases studied in Tables 1 through 7 all have an equal number of buffer spaces between any two successive stations. Moreover, the service rates chosen for these cases exhibit quite a bit of symmetry. To compare the performance of the eleven server assignment policies described in the last paragraph of Section 5.1 for more general systems, we considered systems where the service rates of the different servers at the different stations are drawn independently from a uniform distribution with range [0.5, 2.5]. Tables 8 and 9 show the 95% confidence intervals for the steady-state throughput and WIP obtained by each policy for such systems with either three or five stations. To model systems with small buffers, large buffers, and arbitrary buffers, we considered systems with three stations and a common buffer size 1 or 10, systems with three stations and independent and uniformly distributed buffer sizes on the set {1, 2, ..., 10}, systems with five stations and a common buffer size 1, and systems with five stations and independent and uniformly distributed buffer sizes on the set {1, 2, 3, 4}. The results for three stations and either 1 or 10 buffers were obtained from 1,000 replications; the results for three stations and random buffer sizes were obtained from 2,000 replications; the results for five stations and 1 buffer were obtained from 200 replications; and the results for five stations and random buffer sizes were obtained from
250 replications. For systems with five stations and random buffer sizes, the optimal policy
was not determined, due to the prohibitive amount of required computer time.
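The intervals reported in Tables 8 and 9 have the standard form mean ± half-width across independent replications. A sketch of that computation (the normal critical value 1.96 and the sample data are illustrative assumptions; the paper does not state the exact method used):

```python
from math import sqrt
from statistics import mean, stdev

def ci95(samples):
    """95% confidence interval (mean, half-width) for the mean of i.i.d.
    replication results, using the normal approximation."""
    m = mean(samples)
    hw = 1.96 * stdev(samples) / sqrt(len(samples))
    return m, hw

# Hypothetical throughput results from 8 replications.
data = [1.70, 1.72, 1.68, 1.75, 1.71, 1.69, 1.73, 1.70]
m, hw = ci95(data)
print(f"{m:.4f} +/- {hw:.4f}")
```

With hundreds or thousands of replications, as in the tables, the difference between the normal and Student-t critical values is negligible.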
Policy                     Common Buffer Size = 1              Common Buffer Size = 10             Buffer Sizes ~ Uniform{1, ..., 10}
                           Throughput        WIP               Throughput        WIP               Throughput        WIP
Optimal Policy             1.7151 ± 0.0149   3.6623 ± 0.0195   1.7792 ± 0.0162   12.9306 ± 0.1600  1.7581 ± 0.0113   7.7413 ± 0.1000
Best Local Heuristic       1.6823 ± 0.0147   3.6007 ± 0.0257   1.7521 ± 0.0163   13.1217 ± 0.3298  1.7287 ± 0.0112   7.6938 ± 0.1423
Best Push Heuristic        1.6712 ± 0.0145   3.9013 ± 0.0228   1.7394 ± 0.0162   14.8718 ± 0.3043  1.7150 ± 0.0111   8.7517 ± 0.1441
Best Pull Heuristic        1.6626 ± 0.0145   3.2653 ± 0.0274   1.7342 ± 0.0163   11.1935 ± 0.3035  1.7068 ± 0.0111   6.5260 ± 0.1330
Local Heuristic            1.6803 ± 0.0147   3.5870 ± 0.0263   1.7504 ± 0.0164   13.0367 ± 0.3343  1.7269 ± 0.0113   7.6468 ± 0.1425
Push Heuristic             1.6701 ± 0.0146   3.9144 ± 0.0228   1.7373 ± 0.0162   15.1537 ± 0.3064  1.7132 ± 0.0112   8.8471 ± 0.1462
Pull Heuristic             1.6593 ± 0.0146   3.2278 ± 0.0285   1.7302 ± 0.0165   10.7043 ± 0.3155  1.7037 ± 0.0112   6.3455 ± 0.1328
Teamwork Policy            1.4510 ± 0.0127   1.0000            1.4510 ± 0.0127   1.0000            1.4515 ± 0.0091   1.0000
Best Non-Moving Heuristic  1.1549 ± 0.0131   3.5242 ± 0.0305   1.4352 ± 0.0200   13.0674 ± 0.3156  1.3374 ± 0.0125   7.5580 ± 0.1426
Non-Moving Heuristic       1.1461 ± 0.0137   3.5008 ± 0.0350   1.4081 ± 0.0212   12.9661 ± 0.3521  1.3182 ± 0.0133   7.5527 ± 0.1529
Non-Moving Policy          0.8286 ± 0.0164   3.3968 ± 0.0467   0.9794 ± 0.0227   12.1925 ± 0.4053  0.9275 ± 0.0143   7.4377 ± 0.1740

Table 8: The Case with Three Stations and Random Service Rates.
Policy                     Common Buffer Size = 1              Buffer Sizes ~ Uniform{1, ..., 4}
                           Throughput        WIP               Throughput        WIP
Optimal Policy             1.9154 ± 0.0192   6.3414 ± 0.0342   n/a               n/a
Best Local Heuristic       1.8180 ± 0.0185   6.2318 ± 0.0830   1.9003 ± 0.0199   9.3400 ± 0.2399
Best Push Heuristic        1.7828 ± 0.0185   7.5098 ± 0.0511   1.8186 ± 0.0168   11.8993 ± 0.2534
Best Pull Heuristic        1.7305 ± 0.0173   4.3896 ± 0.0692   1.8064 ± 0.0190   6.6933 ± 0.2409
Local Heuristic            1.8123 ± 0.0190   6.1771 ± 0.0849   1.8800 ± 0.0201   9.2703 ± 0.2514
Push Heuristic             1.7782 ± 0.0186   7.5538 ± 0.0529   1.8075 ± 0.0170   12.0079 ± 0.2616
Pull Heuristic             1.7189 ± 0.0179   4.2521 ± 0.0752   1.7708 ± 0.0190   5.9097 ± 0.2184
Teamwork Policy            1.4636 ± 0.0160   1.0000            1.4709 ± 0.0151   1.0000
Best Non-Moving Heuristic  1.1650 ± 0.0168   6.1493 ± 0.0896   1.3703 ± 0.0227   9.1845 ± 0.2295
Non-Moving Heuristic       1.1532 ± 0.0188   6.1507 ± 0.1013   1.3330 ± 0.0234   9.3455 ± 0.2674
Non-Moving Policy          0.6843 ± 0.0259   5.7950 ± 0.1973   0.7876 ± 0.0308   8.7979 ± 0.3849

Table 9: The Case with Five Stations and Random Service Rates.
Tables 8 and 9 show the same overall behavior as Tables 1 through 7. The local heuristic
always gives better throughput than the push heuristic, which in turn gives better results
than the pull heuristic. In all cases, these three heuristics (especially the local one) yield
near-optimal throughput that is substantially better than that of the benchmark non-moving
and teamwork policies. Moreover, the teamwork policy always shows better average behavior
than the three non-moving policies. The average WIP levels for the local heuristic and also
the optimal and non-moving policies lie between the WIP levels of the push and pull policies
in all cases. Moreover, in all cases, the behavior of the local, push, pull, and non-moving
heuristics are very close to those of the best local, push, pull, and non-moving heuristics,
respectively. Therefore, our primary assignment strategy appears to yield very good results.
5.3 Discussion
The numerical results given in Section 5.2 show that although our three heuristics are gener-
ally not optimal (except for generalist cases, see Theorem 2.1), they yield throughput levels
that are extremely close to the optimal achievable throughput in all cases, and substan-
tially better than the throughput levels achieved by the teamwork policy and all policies
with non-moving servers. Thus, our numerical results show that allowing servers to move
between stations and to cooperate on the same job can result in substantial improvements
in throughput relative to policies with stationary servers. Moreover, the vast majority of
the potential bene�t can be obtained by using a simple server assignment policy in which
servers only move to avoid idleness.
Among our three heuristics, the local heuristic gives the highest throughput. The local
heuristic also has the advantages over the push and pull heuristics that servers make decisions
about where to go based only on local information, and generally will not travel long distances
(suggesting that the assumption of negligible travel times is more reasonable for the local
heuristic than for the other two heuristics). Moreover, servers will only move when they
complete work on a job at a particular station (note that this is not true of the push and
pull policies because servers may return to the station they are assigned to before they
complete work on a job at other stations). Finally, both the push and pull policies assign
all idle servers to the same station; this is not the case for the local heuristic. Therefore, the
push and pull heuristics are more likely than the local heuristic to yield situations where
there is a large number of servers working in parallel on the same job. Given that we assume
that the service rates of all servers working together on a single job are additive, and that
this assumption is more likely to be satisfied when a few servers work together on a job than
when a large number of servers are cooperating on one job, the local heuristic has the edge
over the other two heuristics in this respect as well.
Our goal in this section was to develop a simple and easily implementable server as-
signment policy that yields near-optimal throughput for systems of tandem queues with an
equal number of stations and servers. The local heuristic is such a server assignment pol-
icy. However, once the service rates of all servers at all stations are known (i.e., once μij is known for all i = 1, ..., M and j = 1, ..., N), using this information to develop server assignment policies that give (slightly) higher throughput than our policy is often not difficult. In particular, one could move servers assigned to blocked or starved stations to stations
where they are highly capable of working (i.e., server i could be moved to a station j with
a relatively large μij), rather than just moving the servers to nearby stations where there
is work to be done but where they may not be particularly competent to work. Similarly,
our heuristic policy does not take the location of bottlenecks into account when deciding
where idle servers should move to. It seems obvious that (slightly) higher throughput can be
achieved by employing policies that attempt to use the movement of idle servers to increase
the production rate of bottleneck stations. Finally, our (local) heuristic policy uses myopic
rules based on local information to decide what station idle servers should move to, because
such decentralized rules seem more easily implementable than rules that take the state of the
system as a whole into account to decide on the movement of servers. However, taking the
entire state of the system into account obviously has the potential to yield higher throughput
than our myopic policies. For example, when there are very few jobs in the system, it seems
sensible to put priority on pushing jobs into the system, something that our local heuristic
does not do (this suggests that a suitable combination of our three heuristics may yield better
performance than any one of them does individually). Nevertheless, the numerical results
given above show that our local heuristic yields the vast majority of the potential bene�t,
so the potential impact of more complex policies that take the service rates of the servers at
the different stations, the location of bottleneck stations, and the entire state of the system
into account does not appear to be large, at least for the systems considered here.
In Sections 4 and 5, we focused on the case where the number of servers M equals the number of stations N. When M = 1 ≤ N, Corollary 2.1 shows that all non-idling policies are optimal. When M > 1 and M ≠ N, our three contingency plans can still be used, so the main issue to be considered is the (primary) assignment of servers to stations. However, it is not clear that policies of the form considered in this section (comprising primary assignment and contingency components) will be desirable when M ≠ N. For example, when M > N, then it may be desirable to give only N servers primary assignments to stations and allow the remaining M − N servers to move between stations depending on the number of jobs waiting to be processed at the stations. Alternatively, if the number of tasks is not equal to the number of servers, then it may be possible to subdivide or combine tasks to achieve M = N. Note that if several combined tasks can be worked on in parallel by different servers, then our assumption that the service rates of cooperating servers are additive will hold (as long as the number of servers working at a station is not too large).
6 Conclusion
We have studied systems of finite queues in which servers can travel between stations in a
negligible amount of time and several servers can collaborate on a single job (with additive
service rates). For such queueing systems, we have shown that when all servers are generalists
(i.e., the service rates depend only on either the server, or the station, but not both) and the
service times at each station are independent and identically distributed, then all non-idling
policies will yield optimal throughput. For Markovian queueing systems, we have shown
how throughput maximization problems can be reformulated as Markov decision problems
in which the departure rate for the associated embedded (discrete time) Markov chain is
maximized. Using this formulation, we have determined the optimal server assignment policy
for Markovian systems with two finite queues in tandem and two servers. In particular, we
showed that the optimal policy has a primary assignment part in which servers are assigned
to stations and a contingency part in which servers move to other stations to avoid idleness.
Finally, using insights gained from the case with two stations and two servers, we proposed
a simple (heuristic) server assignment policy, presented numerical results that suggest that
our policy generally achieves near-optimal throughput, and discussed the advantages of our
proposed policy relative to other server assignment policies.
Our research yields the following managerial insights:
1. In order to maximize throughput, it is extremely important to ensure that workers do
not idle. This requires that workers be cross-trained to capably perform several tasks.
2. When several workers can effectively collaborate on the same task and little set-up time or expense is involved when workers move to new tasks, then near-optimal throughput can be achieved using very simple rules for assigning workers to tasks, as follows:

• When the processing times for each task are approximately independent of the worker who completes the task, then it suffices to ensure that workers do not idle.

• When each worker works at about the same speed at all tasks, then it again suffices to ensure that workers do not idle.

• When the processing times depend both on the worker and on the task, then it suffices to assign a worker to each task in such a way that the product of the processing rates of the workers at their assigned tasks is maximized, provided that the workers are instructed to avoid idleness by working on tasks that will enable them to get back to work at their assigned task as expeditiously as possible and that the variability in the worker processing rates at the tasks that they are not assigned to is not huge.¹
¹The research of the first author was supported by the National Science Foundation under Grants DMI-9523111 and DMI-0000135. The research of the second author was supported by the National Science Foundation under Grants DMI-9713974, DMI-9908161, and DMI-9984352. The research of the third author was supported by the National Science Foundation under Grant DMI-0000135. The authors thank the Department Editor, Associate Editor, and two anonymous referees for their comments about this paper.
7 References
H.-S. Ahn, I. Duenyas, and R. Zhang. Optimal Stochastic Scheduling of a Two-Stage Tandem
Queue with Parallel Servers. Advances in Applied Probability, 31, 1095-1117, 1999.
J. J. Bartholdi, III, and D. D. Eisenstein. A Production Line that Balances Itself. Operations
Research, 44, 21-34, 1996.
J. J. Bartholdi, III, D. D. Eisenstein, and R. D. Foley. Performance of Bucket Brigades when
Work is Stochastic. To appear in Operations Research, 2001.
P. Billingsley. Probability and Measure, Third Edition. Wiley, New York, NY, 1995.
D. P. Bischak. Performance of a Manufacturing Module with Moving Workers. IIE Trans-
actions, 28, 723-733, 1996.
J. A. Buzacott. Commonalities in Reengineered Business Processes: Models and Issues.
Management Science, 42, 768-782, 1996.
I. Duenyas and M. P. Van Oyen. Stochastic Scheduling of Parallel Queues with Set-up Costs.
Queueing Systems, 19, 421-444, 1995.
T. M. Farrar. Optimal Use of an Extra Server in a Two Station Tandem Queueing Network.
IEEE Transactions on Automatic Control, 38, 1296-1299, 1993.
B. Hajek. Optimal Control of Two Interacting Service Stations. IEEE Transactions on
Automatic Control, 29, 491-499, 1984.
F. S. Hillier and R. W. Boling. The Effect of Some Design Factors on the Efficiency of
Production Lines with Variable Operation Times. J. Indust. Eng., 17, 651-658, 1966.
F. S. Hillier and K. C. So. On the Simultaneous Optimization of Server and Work Allocations
in Production Line Systems with Variable Processing Times. Operations Research, 44,
435-443, 1996.
M. Hofri and K. W. Ross. On the Optimal Control of Two Queues with Server Setup
Times and Its Analysis. SIAM Journal on Computing, 16, 399-420, 1987.
S. M. R. Iravani, M. J. M. Posner, and J. A. Buzacott. A Two-Stage Tandem Queue Attended
by a Moving Server with Holding and Switching Costs. Queueing Systems, 26, 203-228,
1997.
V. G. Kulkarni. Modeling and Analysis of Stochastic Systems. Chapman & Hall, London,
UK, 1995.
S. A. Lippman. Applying a New Device in the Optimization of Exponential Queueing
Systems. Operations Research, 23, 687-710, 1975.
A. Mandelbaum and M. I. Reiman. On Pooling in Queueing Networks. Management Science,
44, 971-981, 1998.
J. O. McClain, L. J. Thomas, and C. Sox. "On-the-Fly" Line Balancing with Very Little WIP.
International Journal of Production Economics, 27, 283-289, 1992.
J. Ostolaza, J. O. McClain, and L. J. Thomas. The Use of Dynamic (State-dependent)
Assembly-Line Balancing to Improve Throughput. J. Mfg. Oper. Mgt., 3, 105-133, 1990.
D. G. Pandelis and D. Teneketzis. Optimal Multiserver Stochastic Scheduling of Two Inter-
connected Priority Queues. Advances in Applied Probability, 26, 258-279, 1994.
M. L. Puterman. Markov Decision Processes. Wiley, New York, NY, 1994.
Z. Rosberg, P. P. Varaiya, and J. C. Walrand. Optimal Control of Service in Tandem Queues.
IEEE Transactions on Automatic Control, 27, 600-609, 1982.
M. P. Van Oyen, E. G. Senturk-Gel, and W. J. Hopp. Performance Opportunity of Workforce
Agility in Collaborative and Noncollaborative Work Systems. To appear in IIE
Transactions, 2001.
M. P. Van Oyen and D. Teneketzis. Optimal Stochastic Scheduling of Forest Networks with
Switching Penalties. Advances in Applied Probability, 26, 474-497, 1994.
R. W. Wolff. Stochastic Modeling and the Theory of Queues. Prentice Hall, Englewood
Cliffs, NJ, 1989.
G. Yamazaki, H. Sakasegawa, and J. G. Shanthikumar. On Optimal Arrangement of Stations
in a Tandem Queueing System with Blocking. Management Science, 38, 137-153, 1992.
E. Zavadlav, J. O. McClain, and L. J. Thomas. Self-buffering, Self-balancing, Self-flushing
Production Lines. Management Science, 42, 1151-1164, 1996.