serial inventory systems with markov-modulated demand ...lc91/more/papers/chen_song_zhang... ·...
TRANSCRIPT
Serial Inventory Systems with Markov-Modulated Demand:
Derivative Bounds, Asymptotic Analysis, and Insights
Li ChenSamuel Curtis Johnson Graduate School of Management, Cornell University, Ithaca, NY 14853
Jing-Sheng SongFuqua School of Business, Duke University, Durham, NC 27708
Yue ZhangFuqua School of Business, Duke University, Durham, NC 27708
In this paper we consider the inventory control problem for serial supply chains with continuous, Markov-
modulated demand (MMD). Our goal is to simplify the computational complexity by resorting to certain
approximation techniques, and, in doing so, to gain a deeper understanding of the problem. To this end, we
analyze the problem in several new ways. We first perform a derivative analysis of the problem’s optimality
equations, and develop general, analytical solution bounds for the optimal policy. Based on the bound results,
we derive a simple procedure for computing near-optimal heuristic solutions for the problem. These simple
solutions reveal a closer relationship with the primitive model parameters. Second, we perform asymptotic
analysis with long replenishment lead time and establish an MMD central limit theorem. We further show
that the relative errors between our heuristics and the optimal solutions converge to zero as the lead time
becomes sufficiently long, with the rate of convergence being the square root of the lead time. Our numerical
results reveal that our heuristic solutions can achieve near-optimal performance even under relatively short
lead times. Third, we show that, by leveraging the Laplace transformation, the optimal policy becomes
computationally tractable under the gamma distribution family. This enables us to numerically compare
various heuristic solutions with the optimal solution, and to demonstrate that our heuristic outperforms
existing heuristics in most cases. Finally, we observe that the internal fill rate and demand variability
propagation in an optimally controlled supply chain under MMD exhibit behaviors different from those
under stationary demand.
Key words : multi-echelon inventory systems, Markov-modulated demand, derivative, asymptotic analysis
History : file version October 8, 2015
1. Introduction
The increasingly open global economy has made it possible for companies to seek the best available
resources and technologies worldwide and to serve new markets. As a result, many supply chains
have been stretched long and thin, often consisting of multiple production and distribution stages
across different countries and continents. This new supply chain configuration brings new challenges
1
2
to managers. For example, demand shocks, which could be political, economic, technological, or
climatic, in one region can quickly propagate to other regions, causing shortages in some places but
oversupplies in others. Thus, companies that traditionally operate their supply chains in a relatively
stable environment, such as within one country, must now learn how to effectively rationalize
inventory along the global supply chain to hedge against greater environmental uncertainties.
To help managers to meet this new challenge, we consider in this paper a serial supply chain
model that involves demand uncertainties driven by dynamic environmental factors. Specifically, we
assume that customer demand in each period is influenced by a “world” state that evolves according
to a discrete-time Markov chain. This demand process is known as the Markov-modulated demand
(MMD) process in the literature (Iglehart and Karlin 1962 and Song and Zipkin 1993). The MMD
process is useful and practical for its structured data requirement and modeling flexibility. To
construct the Markov chain, one can first identify demand scenarios of different phases in a product
development process or in a product life cycle, and then assess the likelihood of each scenario as
well as the transition probabilities between the scenarios, much as in the construction of a decision
tree (Song and Zipkin 1996b).
Chen and Song (2001) showed that a world-state-dependent echelon base-stock policy is optimal
for serial inventory systems with MMD. Computing the optimal policy, however, is a nontrivial task.
Chen and Song (2001) developed an iterative optimization algorithm, which requires solving a set
of integral renewal equations in each iteration. Their algorithm is effective for the discrete demand
case (as the integral renewal equations reduce to linear equations), but remains computationally
challenging for the continuous demand case. Muharremoglu and Tsitsiklis (2008) showed that the
state-dependent echelon base-stock policy continues to be optimal for more general systems with
(exogenous and sequential) stochastic lead times and discrete demand processes (including MMD).
They decomposed the problem into a series of single-unit single-demand problems, and developed
an efficient dynamic programming algorithm for computing the optimal policy. Due to the nature
of their single-unit decomposition approach, their algorithm cannot be applied to the continuous
demand case. In addition, while representing a significant improvement over the standard dynamic
programming algorithms, these exact algorithms work somewhat like a “black box”—numbers in
and numbers out, lacking the transparency between the model parameters (inputs) and the optimal
inventory decisions (outputs).
In this paper we focus on the continuous demand case. Our goal is two-fold. First, we seek to
develop new approximation techniques to simplify the computational complexity. Such solution
techniques enable managers to make swift decisions in choosing appropriate inventory levels along
the supply chain to hedge against environmental fluctuations. Second, we strive to make the black
box more transparent by establishing a direct relationship between the primitive model parameters
3
and the simple solutions. Such relationship sheds light on both the qualitative and quantitative
effects of the environmental uncertainties. It also helps guide managers to invest wisely to obtain
the right information (or problem parameters) and to react quickly in the right direction. To achieve
the above goal, we analyze the problem in the following new ways.
Derivative Bounds and Heuristics
Our first approach is to perform a derivative analysis of the problem’s optimality equations and
develop derivative bounds. The solution bounds obtained from our derivative bounds can be com-
puted without solving the integral renewal equations required in the algorithm of Chen and Song
(2001). The derivative bound expressions also reveal a closer relationship with the primitive model
parameters, which was not apparent in the algorithms of Chen and Song (2001) and Muharremoglu
and Tsitsiklis (2008).
From the derivative analysis, we obtain general, analytical solution bounds for serial inventory
systems with MMD. The existing bounding approaches for problems with nonstationary demand
all build on a simplifying assumption that requires the optimal state-dependent echelon base-stock
level to be achievable in each period (e.g., Dong and Lee 2003 and Shang 2012). This assumption,
however, does not hold in general, and the “lower bound” obtained from these approaches may fail
to bound the optimal solution under MMD. On the contrary, our derivative-based approach does
not require such an assumption, and thus our solution bounds work in general. In addition, our
solution bounds can be viewed as a generalization of those obtained by Shang and Song (2003) for
the independent and identically-distributed (i.i.d.) demand process. When the state space of the
Markov chain degenerates to a singleton, the demand process becomes i.i.d. and our bounds reduce
to theirs, whereas their bounding approach does not extend to the MMD case (see Appendix D
for a detailed discussion).
Computing our derivative-based solution lower bound involves optimization over demand state
permutations, which can be challenging if there is a large number of states. To simplify computation,
we develop easy-to-compute heuristic solutions that only require evaluating a linear combination
of derivatives of the newsvendor cost functions of each demand state. To our knowledge, this
derivative-based heuristic is the first of its kind in approximating the optimal policy for both single-
stage and serial inventory systems with MMD. Moreover, with this heuristic, we can establish a
mapping between the underlying Markov chain transition probabilities and the linear weight of
each demand state, making the solution process more transparent.
Asymptotic Analysis with Long Lead Time
Our second approach is to perform an asymptotic analysis of the problem with long lead time. In
doing so, we first establish an MMD central limit theorem for the lead time demand (i.e., demand
4
aggregated over the lead time periods). We find that, as the lead time becomes sufficiently long,
the lead time demands with different initial states all converge to the same normal distribution.
To prove this result, we use the concept of α-mixing to show that demands far apart in time under
MMD are almost independent (Billingsley 1995). This finding reveals an inherent averaging effect
of aggregation under MMD: While the demand distributions of different states in a single period
may be drastically different, the aggregated (lead time) demand would become more resembling to
that of an i.i.d. process as the aggregation period increases.
Leveraging our derivative analysis and an MMD central limit theorem, we obtain asymptotic
results for the solution bounds under long replenishment lead time. Specifically, we show that the
relative errors between our solution upper and lower bounds converge to zero as the lead time
increases, with the convergence rate being the square root of the lead time. Accordingly, heuristic
solutions that fall between our solution bounds, such as our derivative-based heuristic and its
simplified variants, are guaranteed to converge to the optimal solution at a minimum speed of
square root of lead time. This observation suggests that, when the lead time is sufficiently long, the
seemingly complex supply chain problem with MMD has surprisingly simple solutions. In proving
the above result, we make a new methodological contribution to the literature—one can apply our
analysis approach to establish similar results for other inventory systems.
Exact Evaluation and Observations
While solving integral renewal equations is generally challenging for continuous demand, we dis-
cover an exception. When demand belongs to the gamma distribution family, we find that the
exact algorithm of Chen and Song (2001) becomes tractable with the aid of Laplace transformation
and its inverse (see Appendix C for details). This finding enables us to numerically compare our
derivative-based heuristic solutions with the optimal policy under relatively short lead times, com-
plementing the asymptotic results established under long lead times. From our numerical studies,
we observe that our heuristic achieves near-optimal performance in most cases.
We further compare our derivative-based heuristic with the decoupling heuristic proposed by
Abhyankar and Graves (2001). Their heuristic, designed for a two-stage system with MMD, is based
on the following simplifying assumption. The upstream stage provides 100% internal fill rate, i.e., it
can always fulfill orders from the downstream stage. As a result, the two-stage system is decoupled
into two single-stage systems, with the downstream stage operating under a state-dependent instal-
lation base-stock policy, and the upstream stage operating under a static installation base-stock
policy. Our numerical results show that our derivative-based heuristic outperforms their decoupling
heuristic in most cases, with an average performance improvement of 4.6%. Besides the perfor-
mance difference, we note that our derivative-based heuristic can be applied to systems with any
number of stages, whereas their decoupling heuristic only works for two-stage systems.
5
It has been demonstrated in the literature that, in a serial inventory system with i.i.d. demand,
an optimal internal fill rate is usually much lower than the fill rate at the customer-facing stage
(see Choi et al. 2004, Shang and Song 2006, and references therein). For systems with MMD,
a natural question is whether the internal fill rate would continue to be low, or become much
higher as assumed in the decoupling heuristic of Abhyankar and Graves (2001). To obtain insights
into this question, we conduct numerical experiments to measure the (optimal) internal fill rate
under MMD. We find that, contrary to the observations documented for the i.i.d. demand case,
when the lead time at the upstream stage is long, the internal fill rate can be high under MMD
(around 94-96%; the target fill rate at the customer-facing stage is around 94%), suggesting that
the decoupling heuristic may yield a good approximation to the optimal policy in this case. On the
other hand, when the lead time at the upstream stage is relatively short, the internal fill rate tends
to be low (around 84-93%, lower than the 94% target fill rate at the customer-facing stage). As
a result, the decoupling heuristic yields a poor approximation; and our derivative-based heuristic
has a clear advantage in this case, as it does not require the high internal fill rate assumption.
In addition, we investigate numerically the demand variability propagation effect, or the bullwhip
effect (see Chen and Lee 2009, 2012 and the references therein). Interestingly, we observe that the
bullwhip effect can be significantly dampened under the optimal policy, indicating that the state-
dependent inventory policy under MMD may smooth demand variability propagation in the supply
chain (see Appendix E for details). This contrasts the existing observations under autoregressive
AR(1) demand in the literature, suggesting interesting future research directions.
Finally, we note that our derivative bounds are derived based on the optimality equations of
the exact algorithm. Chen and Song (2001) showed that the exact algorithm can be extended to
systems with a fixed setup cost at the upmost stage as well as the assembly systems. Thus, all our
results can be extended accordingly to those more general systems.
The rest of this paper is organized as follows. We provide a brief literature review in §2. We then
analyze a single-stage inventory problem in §3. This lays the foundation for the analysis of serial
inventory systems in §4. §5 presents the numerical studies, and §6 concludes the paper. All proofs
are presented in Appendix A; and all other supplementary results are included in Appendices B-E.
2. Literature Review
The serial system structure we consider in this paper is the classic multi-echelon inventory model,
first developed and analyzed by Clark and Scarf (1960), and further studied by many researchers
in various dimensions (see Zipkin 2000 for a review). It serves as an important baseline model
and a key building block for more complex supply chain structures. Much of the literature on
this classic model assumes an i.i.d. demand process. Under this assumption, it has been shown
6
that a stationary echelon base-stock policy is optimal. Earlier work focused on the computational
efficiency of the optimal policy; e.g., see Federgruen and Zipkin (1984) and Chen and Zheng (1994).
More recently, researchers have developed simple bounds and heuristics that aim to increase the
transparency of the factors that determine the optimal policy. See, for example, Gallego and Zipkin
(1999), Dong and Lee (2003), Shang and Song (2003), Gallego and Ozer (2005), and Chao and
Zhou (2007). Our work extends their efforts to systems with MMD.
The MMD process, which extends the i.i.d. demand process to incorporate dynamic state evo-
lution, was first introduced by Iglehart and Karlin (1962) to study a single-stage inventory model,
but only gained popularity in the inventory literature in the last few decades; see, for example, Song
and Zipkin (1993), Ozeciki and Parlar (1993), Beyer and Sethi (1997), and Sethi and Cheng (1997).
An important special case of MMD is the cyclic demand model, which various authors explored;
see, for example, Karlin (1960), Zipkin (1989), Aviv and Federgruen (1997), and Kapuscinsky and
Tayur (1998). Several authors also adopted MMD in serial inventory systems, but focused on dif-
ferent aspects of the problem than we do. For example, Song and Zipkin (1992, 1996a, 2009) and
Abhyankar and Graves (2001) analyzed specific types of policies, Parker and Kapukinski (2004)
characterized optimal policy for a two-echelon capacitated system, Huh and Janakiraman (2008)
used a sample-path approach to establish the optimality of the echelon base-stock policies, Angelus
(2011) considered the complexity of incorporating secondary market sales, and Abouee-Mehrizi et
al. (2014) conducted an exact analysis of capacitated two-echelon inventory systems with priorities.
The most closely related works to ours are Chen and Song (2001) and Muharremoglu and Tsitsiklis
(2008). These authors showed that, for a serial inventory system, a state-dependent echelon base-
stock policy is optimal. They also presented exact algorithms to compute the optimal policy. Using
a decomposition approach similar to that in Muharremoglu and Tsitsiklis (2008), Janakiraman and
Muckstadt (2009) developed efficient algorithms for computing the optimal policies for capacitated
and lost-sales serial systems. We do not consider these system features in our paper.
There has also been research in the literature on deriving solution bounds for serial inventory
systems with nonstationary demand processes. For example, Dong and Lee (2003) developed lower
bounds for optimal policies for serial systems with time-correlated demand. The time-series demand
model requires past demand information, whereas the MMD model we consider here focuses on
external factors. Shang (2012) extended the bounding approach of Shang and Song (2003) to derive
solution bounds for serial inventory systems with independent but nonidentical demands (which is
a special case of MMD). As discussed earlier, the bounding approaches employed in these papers
all build on a simplifying assumption that the optimal state-dependent echelon base-stock level is
achievable in each period. This assumption does not hold in general; as a result, the “lower bound”
obtained from these approaches may fail to bound the optimal solution under MMD. Our paper
complements this literature by deriving solution bounds that work in general under MMD.
7
3. Single-Stage Inventory Systems
For expositional purposes, we consider a single-stage inventory problem in this section. This allows
us to clearly introduce the key ideas to be used later in the analysis for serial inventory systems.
Specifically, the demand of a product is met with on-hand inventory in each period; when there
is stockout, we assume the unmet demand is fully backlogged. Inventory is replenished from an
external source with ample supply. The replenishment lead time is a constant of L periods. At the
end of each period, unit inventory holding cost h and unit backlog penalty cost b are charged on the
on-hand inventory and backorders, respectively. The planning horizon is infinite, and the objective
is to minimize the long-run average cost of the inventory system. Because the linear ordering cost
under the long-run average cost is a constant, we can assume the ordering cost is zero without loss
of generality (see Federgruen and Zipkin 1984).
The demand process is nonstationary and has an embedded Markov chain structure (i.e., the
MMD process). Specifically, demand in each period is a nonnegative random variable denoted by
Dk, where k is the demand state that determines the continuous demand distribution with density
function fk(·) and the cumulative density function Fk(·). The demand state in period t follows
a Markov chain W = W (t), t = 1,2, ..., with a total of K states, denoted by 1,2, ...,K. The
Markov chain is time-homogeneous. Let P = (pij) denote the transition probability matrix, where
pij is the one-step transition probability from state i to j. Without loss of generality, we assume the
Markov chain is irreducible, which implies that all states communicate with each other. (When the
Markov chain is reducible, under the long-run average cost criterion, the problem is equivalent to
one involving only the irreducible class of the chain.) Also let Dk[t, t′] denote the total demand in
periods t, ..., t′ with the demand state being k in period t, and let Dk[t, t′) denote the total demand
in periods t, ..., t′− 1.
In each period, the sequence of events is as follows: 1) at the beginning of a period, the demand
state k is observed; 2) an inventory replenishment order is placed with the supplier; 3) a shipment
ordered L periods ago is received from the supplier; 4) demand arrives during the period; and 5)
at the end of the period, inventory holding and backorder costs are assessed.
3.1 Preliminaries
It has been shown that a state-dependent base-stock policy s∗(k), k= 1, ...,K is optimal for the
problem (see Iglehart and Karlin 1962, Beyer and Sethi 1997, Chen and Song 2001). The policy
works as follows: When the demand state is k, if the inventory position is below the base-stock
level s∗(k), order up to this level; otherwise, do not order.
Given the demand state k and inventory position y, the single-period expected cost is
G(k, y) = h ·E
(y−Dk[t, t+L])+
+ b ·E
(Dk[t, t+L]− y)+, (1)
8
where (x)+ = maxx,0. The above function is the well-known newsvendor cost function, which
is convex in y. Its minimizer, denoted by s(k) = arg minyG(k, y), is termed the myopic base-stock
level. Specifically, s(k) solves the following first-order condition:
∂yG(k, y) =−b+ (b+h)Fk,L(y) = 0, or Fk,L(y) =b
b+h, (2)
where we use “∂y” to denote the first-order derivative with respect to y, and Fk,L(·) is the cumulative
distribution of the lead time demand Dk[t, t+L]. We shall also refer to the function ∂yG(k, y) as
the “newsvendor derivative function” throughout the paper. For later usage, we define the smallest
myopic base-stock level among all states and its corresponding demand state as
smin = min1≤k≤K
s(k), kmin = arg min1≤k≤K
s(k).
Although the myopic base-stock level s(k) is easy to compute, it is not necessarily optimal. The
demand state in the next period may result in a (stochastically) much smaller demand distribution;
therefore, stocking up to the myopic base-stock level in the current period may cause overstocking
in the next period. More sophisticated methods are needed to determine the optimal policy.
Exact Algorithm. Chen and Song (2001) developed the following K-iteration algorithm to
compute the optimal policy. The algorithm starts with a partition of the state space of W. In
each iteration i, let V i denote the set that contains the states for which we have not found the
optimal solution yet, and U i the complementary set of V i. Specifically, in the first iteration i= 1,
let V 1 = 1, ...,K and U 1 = φ. Define
G1(k, y) =G(k, y)
as given in (1), solve the single-period problem, and let s1(k) = s(k) denote the solution for state
k. Find the demand state that gives the smallest solution among V 1. Denote such a state as 1∗ =
arg mink∈V 1 s1(k) = arg min1≤k≤Ks(k). Then, the optimal solution for state 1∗ is just s∗(1∗) =
s1(1∗) = smin. Next, update the state sets by V 2 = V 1\1∗ and U 2 = U 1 ∪ 1∗, and proceed to
iteration i= 2. In iteration i+1 (1≤ i <K), for each k ∈ V i+1, solve the cost minimization problem
minyGi+1(k, y), where
Gi+1(k, y) =Gi(k, y) +∑
u∈Ui+1
pkuERi(u, y−Dk), (3)
and for any u∈U i+1,
Ri(u, y) =
∑u′∈Ui+1
puu′ERi(u′, y−Du) if u∈U i,
Gi(i∗, y) +∑
u′∈Ui+1
pi∗u′ERi(u′, y−Di∗) if u= i∗,
(4)
9
with Gi(i∗, y) =Gi(i∗,maxy, si(i∗)). It can be shown that Gi+1(k, y) is convex in y. Let si+1(k) =
arg minyGi+1(k, y). Also let (i+ 1)∗ = arg mink∈V i+1 si+1(k), i.e., the demand state that gives the
smallest si+1(k) among V i+1. Then the optimal solution for state (i+ 1)∗ is given by s∗((i+ 1)∗) =
si+1((i+1)∗). Update the state sets by V i+2 = V i+1\(i+1)∗ and U i+2 =U i+1∪(i+1)∗. Repeat
the above procedure until reaching the final iteration K.
In summary, the above algorithm finds the optimal base-stock level for each demand state by sort-
ing the demand states based on the solutions to the cost minimization problems minyGi+1(k, y)
in each iteration. Although the algorithm is simpler and more efficient than the standard dynamic
programming algorithms, it remains quite complicated because each iteration builds on the results
obtained in the previous iterations. The main difficulty lies in determining the function Ri at each
iteration, as given in (4). For the discrete demand case, computing Ri is relatively easy, as it
involves solving a set of linear equations (see Chen and Song 2001). For the continuous demand
case, however, determining Ri requires solving a set of integral renewal equations.
To illustrate the last point more clearly and to facilitate our subsequent analysis, we introduce
the following matrix notation. Let m = (m1, ...,mK) denote a permutation of the sequence of
demand states 1, ...,K. Define the following sub-matrices of transition probabilities under m: for
i= 1, ...,K − 1,
P(i)m =
pm1m1. . . pm1mi
.... . .
...pmim1
. . . pmimi
. (5)
Let m∗ = (1∗, ...,K∗) denote the optimal demand state sequence determined by the exact algo-
rithm. Let D(i)m∗(x) be the following diagonal matrix involving demand densities for states 1∗, ..., i∗:
D(i)m∗(x) =
f1∗(x) . . . 0...
. . ....
0 . . . fi∗(x)
. (6)
Also let Ri(y) = [Ri(1∗, y),Ri(2∗, y), ...,Ri(i∗, y)]T . Then, we can rewrite (4) in a matrix form as
follows:
Ri(y) =
∫ ∞0
D(i)m∗(x) ·P(i)
m∗ ·Ri(y−x)dx+ ei ·Gi(i∗, y), (7)
where ei is the unit vector with the i-th element being one.
Our Approach. Based on the exact algorithm, it is useful to let [k] denote the iteration in
which the optimal solution s∗(k) for state k is obtained. In other words,
s∗(k) = arg miny≥0
G[k](k, y)
.
Because G[k](k, y) is convex in y, its derivative is increasing in y. If we can find bounding functions
for its derivative, then the roots of the bounding functions become bounds for the optimal solution.
10
Thus, our approach is to develop bounding functions that involve only the primitive model param-
eters, so that the resulting solution bounds will depend only on the primitive model parameters.
To this end, by repeatedly applying (3) and (4) and then taking derivatives, we observe that
∂yG[k](k, y) = ∂yG(k, y) +
[k]−1∑i=1
∑u∈Ui+1
pku∂yERi(u, y−Dk). (8)
The first term is simple, given by (2), so what remains is to find simple functions to bound
∂yERi(u, y−Dk). Note that all functions in the exact algorithm are continuously differentiable
and all the expectations are finite. Therefore, we can interchange the derivative and expectation (or
integral) operators in our subsequent derivations, e.g., ∂yERi(u, y−Dk)=E∂yRi(u, y−Dk).
3.2 Derivative and Solution Bounds
We now derive bounds for ∂yG[k](k, y). Observe that ∂yG
i(i∗, y) = (∂yGi(i∗, y))+. Also, from equa-
tion (4), it is straightforward to show that ∂yRi(·, y) = 0 for y ≤ si(i∗). Using a standard result in
renewal equations (Sgibnev 2001), we obtain
Lemma 1 For any u ∈ U i+1, Ri(u, y) is increasing convex in y. Moreover, for any k = 1, ...,K,
Gi(k, y) is convex in y and Gi(i∗, y) is increasing convex in y.
According to the above Lemma, the second term of (8) is nonnegative, so we immediately have
∂yG[k](k, y)≥ ∂yG(k, y) =−b+ (b+h)Fk,L(y).
Thus, ∂yG(k, y) serves as a simple lower bound function for ∂yG[k](k, y).
We now proceed to derive an upper bound for ∂yG[k](k, y). Because W is irreducible, it follows
that I(i)−P(i)m is invertible for any i= 1, ...,K − 1, where I(i) is the i× i identity matrix and P(i)
m
is defined in (5). Thus, we can define the following parameters: for i= 1, ...,K − 1,
βi(m) =
∣∣I(i−1)−P(i−1)m
∣∣∣∣∣I(i)−P(i)m
∣∣∣ ,
where | · | is the matrix determinant and we adopt the convention that |I(0)−P(0)m |= 1. Taking the
derivative with respect to y on both sides of the equation (7), we have
∂yRi(y) =
∫ ∞0
D(i)m∗(x) ·P(i)
m∗ · ∂yRi(y−x)dx+ ei · ∂yGi(i∗, y)
=
∫ y
0
D(i)m∗(x) ·P(i)
m∗ · ∂yRi(y−x)dx+ ei · ∂yGi(i∗, y)
≤ P(i)m∗ · ∂yRi(y) + ei · ∂yGi(i∗, y),
11
where the inequality follows from the fact that ∂yRi(·, y) is increasing in y and the probability
distribution is less than one. Therefore,(I(i)−P
(i)m∗
)∂yR
i(y)≤ ei · ∂yGi(i∗, y). (9)
Let e be the i-dimensional vector with all elements being one. With some additional argument, we
obtain the following upper bound on ∂yRi(y):
Lemma 2 For i= 1, ...,K − 1,
∂yRi(y)≤
(I(i)−P
(i)m∗
)−1
· ei · ∂yGi(i∗, y)≤ e ·βi(m∗) · ∂yGi(i∗, y).
Applying both Lemmas 1 and 2 to the second term of (8) yields
[k]−1∑i=1
∑u∈Ui+1
pku∂yERi(u, y−Dk) ≤[k]−1∑i=1
∑u∈Ui+1
pkuβi(m∗)∂yG
i(i∗, y)
≤ (1− pkk)K−1∑i=1
βi(m∗)∂yG
i(i∗, y),
where the second inequality relaxes the range of the second summation to make it independent of
the set U i. We can further show that (see the detailed derivation in Appendix A)
K−1∑i=1
βi(m∗)∂yG
i(i∗, y)≤K−1∑i=1
αi(m∗)(∂yG(i∗, y))+ ≤∆(y),
where
αi(m) =K−1∑j=i+1
αj(m) ·φj,i(m) +βi(m), i= 1, ...,K − 1 (10)
with φj,i(m) = [pmjm1, pmjm2
, ..., pmjmi ]T(I(i)−P(i)
m
)−1ei, and
∆(y) = maxm∈S
K−1∑i=1
αi(m) · (∂yG(mi, y))+, (11)
with S being the set of all permutations of 1, ...,K. Thus, we have obtained an upper bound
function for ∂yG[k](k, y) based on a linear combination of simple newsvendor derivative functions.
Note that both βi(m) and αi(m) depend only on the primitive model parameters—the transition
matrix P. The following proposition summarizes the above bound results:
Proposition 1 For any state k= 1, ...,K, the following holds:
∂yG(k, y)≤ ∂yG[k](k, y)≤ ∂yG(k, y) + (1− pkk)∆(y).
12
When K = 1, ∆(y) ≡ 0, so the inequalities in the above proposition becomes binding and the
result reduces to the derivative of the single-period cost function. This is intuitive because when
K = 1, the demand process reduces to an i.i.d. process. With the derivative lower bound shown
above, it follows that the myopic base-stock level s(k) is an upper bound for the optimal solution
s∗(k). This result is implied by the exact algorithm of Chen and Song (2001). Symmetrically, from
the derivative upper bound shown above, we can obtain a lower bound for the optimal solution
s∗(k). Specifically, let s(k) be the solution to the following equation:
∂yG(k, y) + (1− pkk)∆(y) =−b+ (b+h)Fk,L(y) + (1− pkk)∆(y) = 0. (12)
We obtain the following result:
Corollary 1 For any demand state k= 1, ...,K, the following holds:
smin ≤ s(k)≤ s∗(k)≤ s(k).
where s(k) and s(k) are determined by (2) and (12), respectively. The inequalities become equalities
when k= kmin.
While the bounds smin and s(k) have appeared in the literature (see Song and Zipkin 1993),
the lower bound s(k) is new to the literature. This new lower bound is tighter than the existing
one, smin. Note that computing the new lower bound involves optimization over demand state
permutations, which can be challenging if there is a large number of states. To further simplify
computation, we develop easy-to-compute heuristic solutions in the next subsection.
3.3 Heuristics
The insights gained from the derivation of the derivative bounds lead us next to construct heuristics
that fall between the solution lower and upper bounds. To do so, we augment the derivative lower
bound by a positive component that is less than (1−pkk)∆(y). Specifically, we first sort the myopic
base-stock level s(k) in an increasing order and let m = (m1, ..., mK) denote the resulting order
sequence of the states. Suppose that k = mj. Based on Lemma 2, we can make the following
approximation for the second term of ∂yG[k](k, y) in (8):
[k]−1∑i=1
∑u∈Ui+1
pku∂yERi(u, y−Dk) ≈j−1∑i=1
pkmi · ∂yERi(mi, y−Dk)
≈j−1∑i=1
pkmi ·βi(m) ·E
(∂yG(mi, y−Dk))+
≈ 1
2
j−1∑i=1
pkmi ·βi(m) ·Fk(y− s(mi)) (∂yG(mi, y))+.
13
Here, in the first step we approximate m∗ by m and replace U i+1 by mi. The second step follows
from Lemma 2 with m replacing m∗. The last step is based on the following integral approximation:
E
(∂yG(mi, y−Dk))+
=
∫ y−smi
0
(∂yG(mi, y−x))+fk(x)dx≈ 1
2Fk(y− s(mi)) · (∂yG(mi, y))+.
Now let sa(k) denote a heuristic solution determined by the following equation:
∂yG(k, y) +1
2
j−1∑i=1
pkmi ·βi(m) ·Fk(y− s(mi)) (∂yG(mi, y))+
= 0. (13)
Note that equation (13) involves only a linear combination of the newsvendor derivative functions
of different demand states. Thus, solving sa(k) only requires a direct evaluation of the lead time
demand distributions of different demand states. With this heuristic, we also establish a direct
relationship between the underlying Markov chain transition probabilities and the linear weights
in equation (13), making the solution process more transparent. Moreover, we have
Corollary 2 For any demand state k = 1, ...,K, smin ≤ s(k) ≤ sa(k) ≤ s(k). The inequalities
become equalities when k= kmin.
The above Corollary shows that the heuristic solution obtained from (13) indeed falls between
the solution lower and upper bounds. A detailed illustration of our bound and heuristic results
is provided in Appendix B for a two-state example (i.e., K = 2). In the example, we show that,
when it is highly likely to transit from a high demand state to a low demand state, one should
reduce the optimal inventory level for the high demand state, and our heuristic solution moves
in the same direction as the optimal solution. In addition, our heuristic solution provides a closer
approximation to the optimal solution when there is a high chance of transiting from a low demand
state to a high demand state (see Appendix B for details).
Combining Corollaries 1 and 2, we observe that s∗(k) and sa(k) fall in the same interval
[s(k), s(k)]. Therefore, if the interval is tight, the heuristic solution sa(k) becomes a close approx-
imation to the optimal solution s∗(k). In the next subsection, we identify sufficient conditions to
ensure that such an interval is tight.
3.4 Asymptotic Analysis with Long Lead Time
Intuitively, when the replenishment lead time L increases, the lead time demands starting from
different states may converge to the same distribution. As a result, the gap between smin and s(k)
will narrow. Below we formally prove this result by first establishing an MMD central limit theorem
for the lead time demand.
First, consider the case in which W has a stationary distribution π = [π(1), ..., π(K)]. Let Dπ(t)
be the demand in period t under the stationary distribution, that is, Dπ(t) follows the distribution
14
of Dk with probability π(k). Let µk =EDk and σ2k = varDk. The mean and variance of Dπ(t)
are given by
µ=EDπ(t)=K∑k=1
µkπ(k), varDπ(t)=K∑k=1
(σ2k + µ2
k
)π(k), (14)
where µk = µk−µ is the relative difference between µk and µ.
Let Dπ[t, t+L] denote the lead time demand under the stationary distribution π, we have
EDπ[t, t+L]= (L+ 1)µ, covDπ(t),Dπ(t+ l)=K∑k=1
K∑l=1
µkµlp(s)kl π(k), l= 1,2, ..., (15)
where p(s)kl is the s-step transition probability from state k to state l. We can further show (see the
proof of Lemma 3 in Appendix A)
σ2 = limL→∞
varDπ[t, t+L]L+ 1
= varDπ(t)+ 2∞∑l=1
covDπ(t),Dπ(t+ l). (16)
Thus, the variance limit σ2 contains two parts: the single-period demand variance and the covari-
ance across different periods. It is interesting to note that the covariance structure under MMD
depends only on the demand mean of each state. Thus, besides the demand variance at each state,
another major source of variability under MMD comes from the variation of the demand means.
Let N (0,1) denote the standard normal random variable with distribution function Φ(·). We
can establish the following result for the lead time demand distribution under MMD:
Lemma 3 (MMD Central Limit Theorem) Suppose that the Markov chain W has a stationary
distribution and Dk has finite moments for any demand state k. If σ 6= 0, then as L→∞,
Dk[t, t+L]− (L+ 1)µ
σ√L+ 1
dist.−−→N (0,1) ,
where µ and σ are defined in (14) and (16), respectively.
The above Lemma confirms the intuition that the lead time demands with different initial states
converge to the same distribution as the lead time increases. To prove this result, we use the
concept of α-mixing to show that demands far apart in time under MMD are almost independent
(Billingsley 1995). This finding reveals an inherent averaging effect of aggregation under MMD:
While the demand distributions of different states in a single period may be drastically different,
the aggregated (lead time) demand would become more resembling to that of an i.i.d. process as
the aggregation period increases.
Based on the result, when L is sufficiently long, the lead time demand Dk[t, t+L] can be approx-
imated by a normal distribution with mean (L+ 1)µ and variance (L+ 1)σ2, independent of the
15
initial demand state k. Consequently, the myopic base-stock level for state k can be approximated
by the following formula:
s(k)≈ (L+ 1)µ+ z∗σ√L+ 1, k= 1, ...,K, (17)
where z∗ = Φ−1(b/(b+ h)). In other words, all myopic base-stock levels s(k) converge to the same
value, so does smin.
Next, consider a cyclic Markov chain W. In this case W does not have a stationary distribution.
However, due to its cyclic nature, as the lead time increases, the lead time demands starting from
different states share a growing common component. As a result, we can show that, for any two
states k and k′, the lead time demand distributions Fk,L(y) and Fk′,L(y) converge to the same
limit distribution for every y as L goes to infinity. Thus, it continues to hold that all myopic base-
stock levels s(k) converge to the same value in this case. The following proposition summarizes our
discussion above and further determines the convergence rate:
Proposition 2 Suppose that the Markov chain W either has a stationary distribution or is cyclic,
and that Dk has finite moments for any demand state k. For any demand state k= 1, ...,K,
limL→∞
√L+ 1 · s(k)− smin
smin= 0.
The above proposition shows that the relative percentage error between s(k) and smin converges
to zero as L goes to infinity. The speed of convergence is at a rate of o(1/√L+ 1
). Thus, when the
lead time is sufficiently long, we can use the closed-form expression (17) or the more sophisticated
sa(k) (Corollary 2) to approximate s∗(k). In proving the above convergence result, we leverage
the fact that smin = mink∈1,...,K s(k), which is the minimum of the myopic base-stock levels of
different states. Since the myopic base-stock level of a given state depends only on the lead time
demand distribution associated with that state, we can then apply the MMD central limit theorem
to obtain the convergence rate result for our solution bounds.
4. Serial Inventory Systems
In this section we present our main results for the general serial inventory system with N > 1
stages. Specifically, random customer demand arises in every period at Stage 1, Stage 1 orders
from Stage 2, ..., and Stage N orders from an external supplier with ample supply. We also call
the external supplier Stage N + 1. If there is stockout at Stage 1, we assume the unmet demand
is fully backlogged. The production-transportation lead time from Stage n + 1 to Stage n is a
constant of Ln periods. Let L′n =∑n
j=1Lj denote the total lead time from Stage n+ 1 to Stage 1.
At the end of each period, a unit (installation) inventory holding cost Hn is charged for on-hand
16
inventories at Stage n (1 ≤ n ≤ N) and a unit backlog penalty cost b is charged for backlogs at
Stage 1. Following the convention in the literature, we define the echelon inventory holding cost at
Stage n as hn =Hn−Hn+1 > 0, with HN+1 = 0. The demand process at Stage 1 follows the MMD
process as described in §3.
We assume that all the replenishment activities in a period happen at the beginning of the
period after observing the demand state. At Stage n (n> 1), the sequence of events is as follows:
an order from Stage n− 1 is received, an order is placed with Stage n+ 1, a shipment is received
from Stage n+ 1, and then a shipment is sent to downstream Stage n− 1. For Stage 1, an order is
placed at the beginning of the period, a shipment is received from Stage 2, and then the demand
arrives during the period. The planning horizon is infinite and the objective is to minimize the
long-run average cost of the serial inventory system. As in the single-stage problem, we assume the
ordering cost is zero without loss of generality. In what follows, we shall assign a subscript “n” to
the corresponding functions and variables for Stage n.
4.1 Preliminaries
Let ILn(t) denote the echelon inventory level at Stage n by the end of period t, which includes the
total on-hand inventory at Stages 1, ..., n, plus the total inventory in transit to Stages 1, ..., n− 1,
minus the backorders at Stage 1. Also let B(t) denote the backorder level at Stage 1 by the end
of period t. Then, the system inventory holding and penalty cost at the end of period t can be
written as
N∑n=2
Hn[ILn(t)− ILn−1(t)] +H1[IL1(t) +B(t)] + bB(t) =N∑n=1
hnILn(t) + (b+H1)B(t),
where the right-hand side is a sum of a sequence of echelon-related costs from Echelon 1 to N .
Consider first the Echelon 1 cost. Under the long-run average cost criterion, we can charge the cost
of period t+L1 to period t without affecting the cost assessment. Specifically, given the demand
state k and inventory position y at the beginning of period t, the expected cost at the end of period
t+L1 is given by
Eh1IL1(t+L1) + (b+H1)B(t+L1)
= h1 ·E
(y−Dk[t, t+L1])+
+ (b+H2) ·E
(Dk[t, t+L1]− y)+.
Thus, we can define the single-period expected cost function at Echelon 1 as
G11(k, y) = h1 ·E
(y−Dk[t, t+L1])+
+ (b+H2) ·E
(Dk[t, t+L1]− y)+
. (18)
Recall that when N = 1, h1 =H1 and H2 = 0. Therefore, in the single-stage system, G11(k, y) reduces
to G(k, y) as defined in (1). Chen and Song (2001) showed that the echelon decomposition technique
17
of Clark and Scarf (1960) for the i.i.d. demand case can be extended to the MMD process. That
is, the optimal state-dependent echelon base-stock policy can be determined sequentially from
Echelons 1 to N , with the aid of an induced penalty function between the successive echelons.
Exact Algorithm. Chen and Song (2001) developed the following algorithm for computing the
optimal policy. Start with Stage 1. Apply the K-iteration algorithm described in §3.1 to G11(k, y)
defined in (18), and obtain the optimal echelon base-stock levels s∗1(k) for Stage 1. For Stage n≥ 2,
first define the following induced penalty functions from stage n− 1 to stage n: for k= 1, ...,K,
Gn−1,n(k, y) =G[k]n−1(k,miny, s∗n−1(k))−G[k]
n−1(k, s∗n−1(k)), (19)
where G[k]n−1(k, ·) is the function obtained from the K-iteration algorithm for Stage n−1 (see §3.1).
With the induced penalty functions, define the following cost functions for Stage n: for k= 1, ...,K,
G1n(k, y) =Ehn(y−Dk[t, t+Ln]) +Gn−1,n(Wk(t+Ln), y−Dk[t, t+Ln)), (20)
where Wk(t+Ln) is the demand state at the beginning of period t+Ln given the state k in period t.
Now apply the K-iteration algorithm again to G1n(k, y), and obtain the optimal echelon base-stock
levels s∗n(k) for Stage n. Repeat the above procedure until reaching the final Stage N .
Thus, the computation procedure for the N -stage system requires N ×K iterations. Besides the
difficulty in determining the function Rin at each iteration as discussed in §3.1, determining the
induced penalty function Gn−1,n in the objective function (20) presents an additional challenge.
Our Approach. As in §3.2, we seek to derive simple bounds for the optimal policy by performing
a derivative analysis of the two key equations (19) and (20) in the exact algorithm. Analogous to
the expression (8) in the single-stage system, we can show
∂yG[k]n (k, y) = ∂yG
1n(k, y) +
[k]−1∑i=1
∑u∈Ui+1
n
pku∂yERin(u, y−Dk). (21)
For each Stage n and demand state k, we define the following newsvendor cost function parame-
terized by ζ (hn ≤ ζ ≤∑n
i=1 hi):
Gn(k, y|ζ) = ζ ·E
(y−Dk[t, t+L′n])+
+ (b+Hn+1) ·E
(Dk[t, t+L′n]− y)+. (22)
Its derivative function ∂yGn(k, y|ζ) =−(b+Hn+1) + (b+Hn+1 + ζ)Fk,L′n(y) will play an important
role in our bound and heuristic development. Note that when N = 1, ζ ≡ h1 = h, H2 = 0, and the
above newsvendor cost function reduces to (1) .
18
4.2 Derivative and Solution Bounds
For illustrative purposes, consider the cost function for Stage 2. From (19) and Proposition 1, we
have
∂yG1,2(k, y) = ∂yG[k]1 (k,miny, s∗1(k))≥ ∂yG1
1(k,miny, s∗1(k)) = min
0, ∂yG11(k, y)
.
Because H1 ≥H2,
∂yG11(k, y) =−(b+H2) + (b+H1)Fk,L1
(y)≥−(b+H2) + (b+H2)Fk,L1(y).
Observe that the right-hand side of the above inequality is always negative. It follows that
∂yG1,2(k, y)≥−(b+H2) + (b+H2)Fk,L1(y).
With this inequality, we obtain
∂yG12(k, y) = h2 +E ∂yG1,2(Wk(t+L2), y−Dk[t, t+L2))
≥ h2 +E−(b+H2) + (b+H2)FWk(t+L2),L1
(y−Dk[t, t+L2))
= −(b+H3) + (b+H2)Fk,L′2(y) = ∂yG2(k, y|h2).
where the second equality follows from EFWk(t+L2),L1(y−Dk[t, t+L2))= Fk,L′2(y). By an argu-
ment analogous to that used in Proposition 1, we have
∂yG[k]2 (k, y)≥ ∂yG2(k, y|h2). (23)
Repeating the above argument from Stage 2 to Stage n, we can obtain a lower bound for ∂yG[k]n (k, y).
Developing the derivative upper bound for the serial system, however, is more complex, because
we need to bound the induced penalty function Gn−1,n in (20). Nevertheless, we can show the
following result, which generalizes Proposition 1 to the serial system. Denote h[1,n] =∑n
i=1 hi.
Proposition 3 For any stage n= 1, ...,N and any state k= 1, ...,K, the following holds:
(a) ∂yG[k]n (k, y)≥ ∂yGn(k, y|hn),
(b) ∂yG[k]n (k, y)≤ ∂yGn(k, y|h[1,n]) + (1− pkk)∆n(y) +
∑n−1
j=1 ∆j,n(k, y),
where ∆n(y) and ∆j,n(k, y) are recursively defined as follows: for n= 1, ...,N ,
∆n(y) = maxm∈S
K−1∑i=1
αi(m)[∂yGn(k, y|h[1,n]) +
∑n−1
j=1 ∆j,n(mi, y)]+
,
with ∆j,n(k, y) =E
∆j
(y−Dk[t, t+L′n−L′j)
)for 1≤ j ≤ n− 1 and αi(m) given by (10).
19
It is straightforward to verify that ∆1(y) = ∆(y) as defined in (11). Thus, the above result
reduces to that of Proposition 1 when N = 1. When N > 1, there is an extra term∑n−1
j=1 ∆j,n(k, y)
on the right-hand side of the inequality. This term captures the dependence of Echelon n on all
the downstream echelon base-stock levels from Echelons 1 to n− 1.
Based on Proposition 3, let sn(k) be the solution to the equation
∂yGn(k, y|hn) =−(b+Hn+1) + (b+Hn)Fk,L′n(y) = 0, (24)
and let sn(k) be the solution to the equation:
∂yGn(k, y|h[1,n]) + (1− pkk)∆n(y) +∑n−1
j=1 ∆j,n(k, y) = 0. (25)
It follows that
Corollary 3 For any stage n= 1, ...,N and any state k= 1, ...,K, sn(k)≤ s∗n(k)≤ sn(k).
The above result generalizes Corollary 1 to serial inventory systems. To our knowledge, Corollary
3 presents the first general, analytical solution bounds for serial inventory systems with MMD. The
existing bounding approaches for problems with nonstationary demand all build on a simplifying
assumption that the optimal state-dependent echelon base-stock level is achievable in each period
(e.g., Dong and Lee 2003, Theorem 6 and Shang 2012, Theorem 1). This assumption, however, does
not hold in general, and the “lower bound” obtained from these approaches may fail to bound the
optimal solution under MMD. On the contrary, our derivative-based bounding approach does not
require such an assumption and accounts for the dependence on the downstream echelon base-stock
levels through the ∆ terms in Proposition 3. Thus, our solution bounds work in general.
To see the above point more clearly, observe that by ignoring the ∆ terms in the derivative upper
bound in Proposition 3, we obtain a newsvendor derivative function. Let svn(k) be the root of this
function, i.e., the solution to
∂yGn(k, y|h[1,n]) =−(b+Hn+1) + (b+H1)Fk,L′n(y) = 0. (26)
Thus, svn(k) is the “lower bound” for the optimal solution s∗n(k) under the assumption that the
optimal state-dependent echelon base-stock level is achievable in each period (e.g., Dong and Lee
2003 and Shang 2012). However, for Stage 1, we have h[1,1] = h1, and thus, from Proposition 3 and
Corollary 3, sv1(k) = s1(k). Thus, sv1(k) is not a solution lower bound, but instead a solution upper
bound under MMD. For Stage n> 1, it is also not guaranteed that svn(k)≤ s∗n(k) under MMD (see
Table 1 in §5 for the counterexamples marked with a “†” sign).
When the state space of the Markov chain degenerates to a singleton, i.e., K = 1, the demand
process becomes i.i.d. In this case, the optimal state-dependent echelon base-stock level is always
20
achievable in each period, and the ∆ terms in Proposition 3 vanish. As a result, ∂yG[k]n (k, y) is
bounded below and above by simple newsvendor derivative functions and svn(k) becomes a true
lower bound in this case. It is interesting to note that under the i.i.d. demand process, the solution
bounds obtained from our derivative bounds coincide with those obtained by Shang and Song
(2003) through their single-stage approximation approach. However, their bounding approach does
not extend to the more general MMD process, as it also requires the assumption that the optimal
state-dependent echelon base-stock level is achievable in each period (see Appendix C for a detailed
discussion).
Our solution bounds can be computed without solving the integral renewal equations required
in the exact algorithm. However, computing the solution lower bound involves optimization over
demand state permutations, which can be challenging if there is a large number of states. To further
simplify computation, we develop easy-to-compute heuristic solutions in the next subsection.
4.3 Heuristics
Based on the definition of svn(k) given in (26), by Proposition 3, it follows that sn(k)≤ svn(k)≤ sn(k).
Hence, although svn(k) is not guaranteed to be a solution lower bound, it can still serve as a simple
heuristic solution for our problem.
We can further improve this heuristic based on the idea used in the single-stage problem in
§3.3. Specifically, we can sort svn(k) in an increasing order and let mn = (mn,1, ..., mn,K) denote the
resulting order sequence of the states. Suppose that k = mn,j. Based on Lemma 2, we can make
the following approximation (parameterized by ζ) for ∂yG[k]n (k, y) in (21):
∂yG[k]n (k, y) ≈ ∂yGn(k, y|ζ) +
j−1∑i=1
pkmn,i · ∂yERin(mn,i, y−Dk)
≈ ∂yGn(k, y|ζ) +
j−1∑i=1
pkmn,i ·βi(mn) ·E(
∂yGn(mn,i, y−Dk|ζ))+
≈ ∂yGn(k, y|ζ) +1
2
j−1∑i=1
pkmn,i ·βi(mn) ·Fk(y− svn(mn,i))(∂yGn(mn,i, y|ζ)
)+
,
where the parameter ζ takes value in hn ≤ ζ ≤ h[1,n], and the last step follows from the same integral
approximation as in the single-stage problem in §3.3.
Next, let san(k|ζ) denote a heuristic solution determined by the following equation:
∂yGn(k, y|ζ) +1
2
j−1∑i=1
pkmn,i ·βi(mn) ·Fk(y− svn(mn,i))(∂yGn(mn,i, y|ζ)
)+
= 0, (27)
where the above equation involves only a linear combination of the newsvendor derivative func-
tions of different demand states. Thus, solving the heuristic solution san(k|ζ) only requires a direct
21
evaluation of the lead time demand distribution of different demand states. With this heuristic,
we also establish a mapping between the underlying Markov chain transition probabilities and the
linear weights in equation (27), making the solution process more transparent. Moreover, we have
the following result:
Corollary 4 For any stage n= 1, ...,N and any state k= 1, ...,K, sn(k)≤ san(k|ζ)≤ sn(k).
Combining Corollaries 3 and 4, we observe that san(k|ζ) and the optimal solution s∗n(k) fall in the
same interval [sn(k), sn(k)], suggesting that san(k|ζ) can be a close approximation for the optimal
solution s∗n(k) as long as the interval is tight. In the next section we identify sufficient conditions
to ensure that such an interval is tight.
4.4 Asymptotic Analysis with Long Lead Time
Recall from Corollary 1 that in the single-stage system the solution lower bound s(k) is further
bounded below by smin = mink∈1,...,K s(k). However, in the serial inventory system, this property
is no longer available. Instead, we need to work directly with sminn = mink∈1,...,K sn(k) for n≥ 1,
with sn(k) determined from the derivative upper bound expression (25).
From Proposition 3, the derivative upper bound for the serial inventory system is more complex
than that for the single-stage system. Thus, in order to apply the MMD central limit theorem to
derive asymptotic results, we need to further relax the derivative upper bound to obtain a more
workable expression (see the proof of Proposition 4 in Appendix A). Overcoming these technical
difficulties, we can establish the following result, which generalizes Proposition 2 to serial inventory
systems.
Proposition 4 Suppose that the Markov chain W either has a stationary distribution or is cyclic,
and that Dk has finite moments for any demand state k. For any Stage n= 1, ...N and any demand
state k= 1, ...,K,
limsupL′n→∞
√L′n + 1 · sn(k)− sminn
sminn
= c
where c is a nonnegative constant.
The above proposition shows that the relative percentage error between sn(k) and sminn converges
to zero as the lead time L goes to infinity. The convergence rate is O(1/√L′n + 1
), where L′n is
the total lead time from Stage n+ 1 to Stage 1. Therefore, according to Corollary 4, the heuristic
solution san(k|ζ) is guaranteed to be a close approximation to the optimal solution s∗n(k) when
the total lead time L′n is sufficiently long. In other words, when the lead time is sufficiently long,
the seemingly complex supply chain problem with MMD has surprisingly simple solutions. This
is because the aggregated (lead time) demand under MMD becomes more resembling to that of
22
an i.i.d. process as the lead time increases. Moreover, in the serial inventory system, the total
lead time at an upstream stage is always greater than the total lead time at a downstream stage.
Thus, Proposition 4 suggests that the heuristic solution at the upstream stage tends to be a closer
approximation to the optimal solution than does the heuristic solution at the downstream stage.
Finally, in proving the above result, we make a new methodological contribution to the literature—
one can apply our analysis approach to establish similar results for other inventory systems.
5. Numerical Studies
In this section we conduct numerical studies to evaluate our bound and heuristic performance. With
the aid of Laplace transform and its inverse, we find that the optimal policy for our problem becomes
computationally tractable under gamma demand distribution (see Appendix C for details). This
finding enables us to benchmark our bounds and heuristic solutions against the optimal solution.
Specifically, we assume that the single-period demand Dk under demand state k follows a gamma
probability density
f(x|nk, θk) =(θk)
nkxnk−1e−θkx
Γ(nk), with x≥ 0, (28)
where nk is the shape parameter and θk the rate parameter (1/θk is the scale parameter). The
demand mean is µk = nk/θk. It is worth emphasizing that, unlike the optimal policy, our solution
bounds and heuristic solutions can be computed under any demand distributions.
We focus on presenting numerical results of two-stage systems. The numerical insights from
systems with more than two stages are similar to what we report below, and are thus omitted
for brevity. Recall from our asymptotic analysis—as the number of stages increases, the heuristic
solutions at upstream stages tend to be closer approximations to the optimal solution than those
at the downstream stages. Also note that the heuristic solutions (as well as the optimal solution)
are computed echelon by echelon from downstream to upstream stages, according to the echelon
decomposition technique of Clark and Scarf (1960). For each echelon, the solutions are computed
by assuming ample supply at the immediate upstream stage. Thus, the Echelons 1 and 2 solutions
in a two-stage system would be the same as the Echelons 1 and 2 solutions in an N -stage system
(with N ≥ 2). Combining this observation with our asymptotic results, we believe that focusing on
two-stage systems in our numerical studies is sufficient, as the numerical performance observed in
such systems is clearly informative to what to expect in systems with greater number of stages.
5.1 Base Case
We first present results for two demand cases. Both involve three demand states, with each state
following a gamma distribution. The parameters for these gamma distributions are (ni, θi) = (1,1),
(2,2), and (2,0.2), for i = 1,2,3, representing demand means of 1, 1, and 10, respectively. The
transition matrices for these two cases are given below:
23
Cyclic demand Noncyclic demand 0 1 00 0 11 0 0
,
0.3 0.1 0.60.5 0.2 0.30.3 0.5 0.2
.
We set the unit penalty cost to be b= 50, and the unit echelon holding costs to be h1 = 2 and
h2 = 1 for Stages 1 and 2, respectively. This echelon holding cost structure corresponds to unit
installation holding costs of H1 = 3 for Stage 1 and H2 = 1 for Stage 2, representing the value-
adding process of moving product from Stage 2 to Stage 1. Based on these cost parameters, the
implicit fill rate at the customer-facing Stage 1 is 50/(50 + 3) = 94.3%. The lead times are set as
(L1,L2) = (1,1), (1,3), and (3,1).
For each scenario, we compute five solutions: the optimal solution s∗n(k), the lower bound sn(k),
the upper bound sn(k), the newsvendor heuristic svn(k) defined by (26), and the derivative-based
heuristic san(k|ζ) defined by (27). Because we have two stages and three states, we compare the
latter four policies with the optimal solution using the following weighted relative solution error
percentage metric. Specifically, for a given policy sn(k), the weighted relative solution error per-
centage is defined as ∑k,n |sn(k)− s∗n(k)|∑
k,n s∗n(k)
× 100%.
We evaluate the long-run average system costs under the policies of s∗n(k), svn(k), and san(k|ζ).
For each parameter scenario, we simulate the system for 50,000 periods and compute the average
cost over these 50,000 periods. To reduce variance in the simulation results, we use the same
random demand sample path for different policies. For the heuristic policies svn(k) and san(k|ζ), we
compute the relative cost error percentage with respect to the cost under the optimal policy s∗n(k).
For the heuristic san(k|ζ), we find that it performs well in most cases when ζ = h[1,n], so we let
san(k|ζ) = san(k|h[1,n]) in our reporting below.
The detailed numerical results for the three-state demand cases are given in Table 1. From the
table, the foremost observation is that the solution “lower bound” svn(k) obtained by the bounding
approaches of Dong and Lee (2003) and Shang (2012) may be greater than the optimal solution
s∗n(k). We mark all such counterexamples with a “†” sign in the table. The numerical results confirm
our earlier discussion that sv1(k) is actually a solution upper bound and sv2(k) may be greater than
s∗2(k) in many cases.
Both san(k|ζ) and svn(k) are good approximations to the optimal solution, with the derivative-
based heuristic san(k|ζ) outperforming svn(k) in most cases. As the Stage 1 lead time L1 increases
from one to three periods, the echelon base-stock levels for different states become close to each
other, and both heuristics’ approximation to the optimal solution improves, resulting in less than
24
Table 1 Comparison of policies in two-stage systems with three-state demand.
Cyclic demand Noncyclic demand(L1,L2) n k s∗n(k) sn(k) sn(k) svn(k) san(k|ζ) s∗n(k) sn(k) sn(k) svn(k) san(k|ζ)(1, 1) 1 1 4.63 4.63 4.63 4.63 4.63 23.13 21.12 23.40 23.40† 23.22
2 24.79 19.70 26.45 26.45† 26.45 19.03 19.03 19.03 19.03 19.033 22.75 19.74 26.50 26.50† 24.25 29.74 25.49 31.57 31.57† 30.07
2 1 23.99 19.79 31.42 25.09† 25.09 27.60 24.44 36.52 28.60† 28.482 22.39 20.11 31.42 25.09† 25.09 25.22 23.56 33.83 25.94† 25.943 29.24 20.11 31.42 25.09 25.09 36.48 27.08 45.10 36.43 35.08
Solution error % - 18.6% 57.6% 10.5% 8.7% - 12.7% 41.4% 2.4% 2.1%Cost error % 59.43 - - 2.7% 2.1% 73.83 - - 0.7% 0.3%
(1, 3) 1 1 4.63 4.63 4.63 4.63 4.63 23.13 21.12 23.40 23.40† 23.222 24.79 19.70 26.45 26.45† 26.45 19.03 19.03 19.03 19.03 19.033 22.75 19.74 26.50 26.50† 24.25 29.74 25.49 31.57 31.57† 30.07
2 1 33.35 22.82 33.55 27.22 27.22 45.06 35.21 52.76 43.00 42.872 43.76 27.88 48.70 40.96 40.96 42.21 34.30 50.09 40.40 40.403 42.31 28.39 48.74 40.99 39.00 52.36 37.86 60.70 50.43 49.10
Solution error % - 28.2% 38.8% 9.1% 9.0% - 18.2% 30.3% 3.7% 3.6%Cost error % 67.46 - - 2.7% 2.5% 84.79 - - 0.7% 0.8%
(3, 1) 1 1 28.63 28.60 28.63 28.63 28.63 39.32 37.95 39.48 39.48† 39.352 28.58 28.58 28.58 28.58 28.58 36.85 36.85 36.85 36.85 36.853 39.37 34.67 42.95 42.95† 40.94 45.91 41.53 47.48 47.48† 46.08
2 1 27.21 27.22 33.55 27.22† 27.22 42.56 40.18 52.76 43.00† 42.872 38.87 32.05 48.70 40.96† 40.96 39.72 39.47 50.09 40.40† 40.403 41.29 32.05 48.74 40.99 39.00 50.13 42.38 60.70 50.43† 49.10
Solution error % - 10.2% 41.4% 2.9% 2.9% - 6.3% 41.9% 1.2% 0.9%Cost error % 82.35 - - 0.6% 0.1% 101.79 - - 0.5% 0.1%
1% system cost errors. This corroborates the convergence result of Proposition 4 and suggests that
the heuristics can achieve near-optimal performance even under relatively short lead time. It is also
interesting to note that under the cyclic demand with L1 =L2 = 1, the echelon base-stock levels for
sa2(k|ζ) are identical across all states, so do s2(k) and sv2(k). This is because (L′2 + 1) mod K = 0,
which leads to identical lead time demand distributions with different initial states.
As a robustness check, we conduct an additional numerical study for two five-state demand cases.
The numerical insights are essentially the same as in the three-state demand cases (see Appendix
E for details).
5.2 Effect of Transition Probability
To illustrate the effect of transition probability on our bound and heuristic performance, we param-
eterize the Markov chain transition probability matrix as follows: 0 1 00 0 1q 0 1− q
,
where we vary the transition probability q from 0.1 to 1. When q = 1, the transition matrix
corresponds to the cyclic demand case; otherwise, it represents the noncyclic demand case. The
25
cost parameters are set to be the same as the base case. The lead times are set as (L1,L2) = (2,1).
For the highest demand state 3, we plot the optimal solution s∗n(3), the lower bound sn(3), the
upper bound sn(3), the newsvendor heuristic svn(3), and the derivative-based heuristic san(3|ζ) (with
ζ = h[1,n]). The plots for Stages 1 and 2 are shown in Figure 1 (a) and (b), respectively.
0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 125
30
35
40
45
50
55
60
65
70
75
q
Bas
e st
ock
leve
l for
sta
te 3
s∗
1(3)
s1(3)
s1(3)
sv
1(3)
sa
1(3)
0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 125
30
35
40
45
50
55
60
65
70
75
qB
ase
stoc
k le
vel f
or s
tate
3
s∗
2(3)
s2(3)
s2(3)
sv
2(3)
sa
2(3)
(a) Stage 1 (b) Stage 2Figure 1 Impact of transition probability on the solutions in a two-stage system.
From the figure, the optimal echelon base-stock level for the highest demand state decreases as
the transition probability q increases (i.e., the probability of jumping from the highest demand
state to a lower demand state increases). Our bounds and heuristics follow the same decreasing
trend as the optimal solution. Specifically, the derivative-based heuristic san(3|ζ) tracks the optimal
solution closely as the transition probability q increases.
5.3 Comparison with an Alternative Heuristic
In this subsection we compare our derivative-based heuristic with an alternative heuristic proposed
by Abhyankar and Graves (2001). Motivated by a case study of Teradyne Inc. (a major electronic
test equipment manufacturer), Abhyankar and Graves (2001) considered an inventory hedging
problem for a two-stage system with a two-state MMD demand process. Their heuristic is based
on the following decoupling assumption. The upstream stage is assumed to provide 100% internal
fill rate, i.e., it can always fulfill orders from the downstream stage. As a result, the two-stage
system is decoupled into two single-stage systems, with the downstream stage operating under a
state-dependent installation base-stock policy, and the upstream stage operating under a static
installation base-stock policy. The cost objective function considered in Abhyankar and Graves
(2001) is different from ours: They treated backorders as negative inventory, and aimed to minimize
expected inventory holding cost subject to a customer fill rate constraint. Nevertheless, we can still
implement their decoupling heuristic idea in our problem setting, and compare its performance
with that of our derivative-based heuristic.
26
We design the following numerical study to (approximately) match the model parameters of
Abhyankar and Graves (2001). Their demand model is a continuous-time two-state Markov demand
process. The low-state demand rate is 7.5 per time unit, with an expected cycle length of 45
time units. The high-state demand rate is 12.5 per time unit, with an expected cycle length of 45
time units. The total lead time of the system is 30 time units. We convert their continuous-time
model to our discrete-time model as follows. We first assume that each discrete period consists
of 5 time units. As a result, the mean demands for the low and high states are 7.5 × 5 = 37.5
and 12.5 × 5 = 62.5, respectively. The expected cycle length for each state is 45/5 = 9 periods.
To approximate these parameters, we assume that the demand follow a gamma distribution with
parameters (ni, θi) = (3,0.1) and (6,0.1) for i = 1,2, with i = 1 (mean 30) representing the low
demand state and i= 2 (mean 60) representing the high demand state. We also assume that the
transition probability matrix is given by (1− p pp 1− p
).
Under this transition matrix, the system stays in each state for 1/p periods on average. Therefore,
when p= 0.1, the expected cycle length is 10 periods, approximately matching the original cycle
length given in Abhyankar and Graves (2001). To illustrate the effect of the transition probability
p, we let p vary from 0.1 to 0.9. Since each discrete period consists of 5 time units, the total lead
time (30 time units) becomes L1 +L2 = 6 periods. We let L1 vary from 1 to 5 while keeping the
total lead time at a constant of 6 periods. We set the stockout penalty cost b= 50 and installation
holding costs H1 = 3, H2 = 1, which are the same as in the base case. To sum up, we have a total
of 9× 5 = 45 parameter cases for comparing our heuristic with the decoupling heuristic.
To implement the decoupling heuristic idea in our problem setting, we first use our single-stage
heuristic to compute the state-dependent installation base-stock levels for Stage 1. For Stage 2, we
need to compute a static installation base-stock level. Note that the transition probability matrix
under consideration is doubly stochastic. Hence, the stationary distribution is uniform (1/2,1/2).
We use a gamma distribution with parameters (n, θ) = (9,0.2) to approximate the demand under
the stationary distribution (with a matching demand mean 45). Based on this demand distribution,
we use a target fill rate of 97.7%, as suggested in Abhyankar and Graves (2001), to determine the
installation base-stock level for Stage 2. We then convert the installation base-stock levels into the
echelon base-stock levels by a standard technique (Zipkin 2000). We let sdn(k) denote the resulting
echelon base-stock levels, so that we can have a comparison with the optimal echelon base-stock
levels and our heuristic solutions. Recall that our derivative-based heuristic is parametrized by ζ,
with hn ≤ ζ ≤ h[1,n]. We find the heuristic performs well at either of the two extreme values of ζ,
27
so we present both cases san(k|hn) and san(k|h[1,n]) below. We evaluate the long-run average costs
of the two-stage system under the heuristic policies and the optimal policy, and then compute the
percentages of relative solution/cost errors.
When the transition probability p is small (e.g., p= 0.1), the transition between the two states
occurs less frequently. The demand process behaves more like an i.i.d. process, with an occasional
demand state shifting. On the other hand, when p is large (e.g., p = 0.9), the demand process
becomes more dynamic, as it transitions between the two states more frequently. The latter case
is similar to the three-state and five-state demand cases considered earlier (where the probability
of transitioning to a different state is no less than 0.7). Table 2 presents the detailed numerical
results for the two extreme transition probability cases of p= 0.1 and 0.9.
Table 2 Comparison of policies in a two-stage system with two-state demand.
p=0.1 p=0.9(L1,L2) n k s∗n(k) san(k|h[1,n]) san(k|hn) sdn(k) s∗n(k) san(k|h[1,n]) san(k|hn) sdn(k)(1, 5) 1 1 117.9 117.9 117.9 110.1 148.0 148.0 148.0 139.9
2 185.1 180.6 180.6 171.6 156.9 156.9 156.9 148.22 1 438.6 356.1 392.0 458.4 423.8 410.7 447.5 488.2
2 533.3 471.9 512.2 519.9 447.5 433.1 467.2 496.6Solution error % - 11.6% 5.7% 4.3% - 2.3% 3.7% 11.1%
Cost error % 414.0 29.5% 4.8% 2.3% 357.3 1.3% 1.6% 10.1%(2, 4) 1 1 168.3 168.3 168.3 158.8 197.1 197.1 197.1 187.3
2 253.8 247.4 247.4 236.5 227.2 226.9 226.9 216.72 1 434.4 356.1 392.0 455.7 421.9 410.7 447.5 484.3
2 527.7 471.9 512.2 533.4 444.7 433.1 467.2 513.6Solution error % - 10.2% 4.6% 3.9% - 1.8% 3.8% 11.7%
Cost error % 501.8 22.3% 3.6% 2.9% 424.6 0.6% 1.8% 10.4%(3, 3) 1 1 218.6 218.6 218.6 207.7 264.2 264.2 264.2 253.0
2 318.4 310.9 310.9 298.4 276.5 276.5 276.5 264.92 1 428.5 356.1 392.0 452.5 417.7 410.7 447.5 497.8
2 519.9 471.9 512.2 543.2 442.3 433.1 467.2 509.7Solution error % - 8.6% 3.5% 5.3% - 1.2% 3.9% 12.2%
Cost error % 591.3 16.9% 2.6% 4.5% 490.7 0.2% 2.4% 10.9%(4, 2) 1 1 269.0 269.0 269.0 256.8 314.0 314.0 314.0 301.5
2 380.0 371.9 371.9 358.0 339.5 339.3 339.3 326.42 1 420.4 356.1 392.0 448.6 416.0 410.7 447.5 493.2
2 509.7 471.9 512.2 549.8 438.8 433.1 467.2 518.2Solution error % - 7.0% 2.5% 6.5% - 0.7% 4.0% 12.1%
Cost error % 681.3 11.7% 1.6% 6.1% 552.0 0.0% 3.0% 10.9%(5, 1) 1 1 319.6 319.6 319.6 306.3 375.0 375.0 375.0 361.4
2 439.4 430.9 430.9 415.7 389.5 389.5 389.5 375.52 1 403.4 356.1 392.0 443.5 410.2 410.7 447.5 498.6
2 496.7 471.9 512.2 552.9 436.3 433.1 467.2 512.7Solution error % - 4.9% 2.1% 8.0% - 0.2% 4.2% 11.9%
Cost error % 768.1 6.3% 0.7% 7.9% 609.1 0.0% 4.1% 11.5%
28
From Table 2, we observe that, when p is small, our heuristic san(k|ζ) performs better with
ζ = hn. When p is large, our heuristic san(k|ζ) performs better with ζ = h[1,n]. This latter observation
accords with our earlier observations in the three-state and five-state demand cases. These results
indicate that, when the demand process is more dynamic, we should set ζ = h[1,n] in our heuristic;
otherwise, we should use ζ = hn. Comparing our heuristic with the decoupling heuristic, we note
that our heuristic performs better in most cases, with an average performance improvement of
4.6%. The most dramatic improvement occurs in the more dynamic demand case of p= 0.9, where
the improvement can be as high as 11.5%. However, in the less dynamic demand case of p =
0.1, when the Stage 2 lead time is long (i.e., L2 ≥ 4), the decoupling heuristic performs better
than our heuristic, with an improvement of 0.7-2.5%. Performance difference aside, we note that
our derivative-based heuristic can be applied to systems with any number of stages, whereas the
decoupling heuristic only works for two-stage systems.
Figure 2 illustrates the heuristic performance comparison under other p values, where the relative
cost error percentages of the three heuristic policies are plotted for the two extreme lead time cases
of (L1,L2) = (1,5), (5,1). The plots confirm our observations from Table 2.
0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 0%
5%
10%
15%
20%
25%
30%
35%
p
Heu
ristic
Cos
t Erro
rs
san(k |h [ 1 ,n ])
san(k |hn)
sdn(k )
0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 0%
5%
10%
15%
20%
25%
30%
35%
p
Heu
ristic
Cos
t Erro
rs
san(k |h [ 1 ,n ])
san(k |hn)
sdn(k )
(a) (L1,L2) = (1,5) (b) (L1,L2) = (5,1)Figure 2 Relative cost error percentages under various heuristics
It has been demonstrated in the literature that, in a serial inventory system with i.i.d. demand,
an optimal internal fill rate is usually much lower than the fill rate at the customer-facing stage (see
Choi et al. 2004, Shang and Song 2006, and references therein). For systems with MMD, a natural
question is whether the internal fill rate would continue to be low, or becomes much higher as
assumed in the decoupling heuristic. To obtain insights into this question, we compute the system
internal fill rates (at Stage 2) under the optimal policy for the two extreme lead time cases of
(L1,L2) = (1,5), (5,1). The results are plotted in Figure 3.
29
0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 80%
82%
84%
86%
88%
90%
92%
94%
96%
98%
100%
p
Syst
em In
tern
al F
ill R
ate
unde
r Opt
imal
Sol
utio
n
Lead time (1, 5)
Lead time (5, 1)
Figure 3 Internal fill rates under optimal policy
Figure 3 shows that, contrary to the the observations documented for the i.i.d. demand case,
when the lead time at the upstream stage is long (i.e., L2 = 5), the internal fill rate can be high
under MMD (ranging between 94-96%; the target fill rate at the customer-facing stage is 94.3%),
suggesting that the decoupling heuristic may yield a good approximation to the optimal policy.
However, from Table 2, the decoupling heuristic still performs poorly when p is large. This is
because the static installation base-stock level computed under the the decoupling heuristic tends
to be a poor approximation to the optimal policy as the demand process becomes more dynamic.
Figure 3 also shows that, when the lead time at the upstream stage is relatively short (i.e., L2 = 1),
the internal fill rate tends to be low (ranging between 84-93%, lower than the 94.3% target fill rate
at the customer-facing stage). As a result, the decoupling heuristic yields a poor approximation;
and our derivative-based heuristic has a clear advantage in this case, as it does not require the
high internal fill rate assumption. We conduct an additional numerical study on the internal fill
rates under the three-state demand cases. The findings are similar to the above observations (see
Appendix E).
6. Conclusion
We have addressed in this paper supply chain inventory management in a volatile demand envi-
ronment driven by uncertain external factors. Our goal is to develop simple analytical tools and
insights to cope effectively with dynamic demand uncertainties. To this end, we have focused on
simplifying the inventory decision tools for serial inventory systems with MMD.
We make several contributions to the literature. First, we have developed simple solution bounds
and approximations for these systems. Our derivative analysis approach enables us to obtain a
general solution lower bound that does not require the restrictive assumptions used in previous
works. Second, we have performed the first asymptotic analysis for these systems with long lead
30
time. We prove the MMD central limit theorem and characterize the asymptotic mean and variance.
We also show the convergence rate of our bounds and approximations to the optimal policy. These
results help reveal the effect of lead time on the optimal policy and provide analytical assessment of
the effectiveness of our bounds and heuristic policies (i.e., the relative errors). Third, to evaluate the
performance of our solution bounds and heuristic policies, we use the gamma demand distribution
family and show that, under this family, both the optimal solutions and the solution bounds can be
computed easily by leveraging Laplace transformation and its inverse. To our knowledge, existing
algorithms in the literature for serial inventory systems with MMD only address discrete demand
(such as Poisson distribution). With the gamma demand distribution, we have presented the first
set of numerical studies of serial systems under MMD.
We have demonstrated through numerical studies that our approximations are effective. These
results greatly simplify computation and, hence, facilitate implementation. For instance, the bounds
provide a convenient benchmark for the maximum and minimum inventory needed for each possible
demand state, which is useful for budgeting purposes. The heuristics can serve as quick decision
tools. Perhaps, even more importantly, these simple solutions can be obtained by solving equations
that depend on the primitive model parameters only, making them much more transparent than
the exact optimal solution. In particular, they shed light on how the transition probability among
the states (and thus the environmental uncertainty) affects the optimal inventory position.
The MMD demand model is particularly attractive because the flexibility of Markov chains
enables us to account for environmental uncertainties. Advanced information technology, such
as hand-held devices, GPS, and social media, as well as abundant data associated with these
technologies, makes it easier for the construction of MMD. Real-time data also makes it easier to
observe the demand state and, hence, to facilitate the implementation of MMD. For example, one
can detect the predictive seasonality pattern from real-time data and construct a Markov chain
with cyclic demand state transition matrix. For demand processes with more complex dynamics,
such as in the Teradyne case of Abhyankar and Graves (2001), one can construct a discrete-time
Markov chain based on the observed data as we have demonstrated in the numerical study.
Several extensions of our problem merit further investigation. We have observed numerically
that the optimal internal fill rate under MMD may be significantly higher than that observed in
previous studies under i.i.d. demand. It would be interesting to identify analytical conditions that
guarantee a high internal fill rate, such that the serial inventory system can be decoupled into
separate single-stage systems to facilitate the implementation of the decoupling heuristic. Another
extension is to investigate the demand variability propagation effect, or the bullwhip effect (see
Chen and Lee 2009, 2012 and the references therein), in serial inventory systems under MMD. We
have observed numerically that the optimal policy can dampen the bullwhip effect significantly,
31
suggesting that the state-dependent inventory policy under MMD may smooth demand variability
propagation in the supply chain (see Appendix E for details). Providing analytical explanations
for this system behavior can be an interesting future research topic.
References
Abhyankar, H., S. Graves. 2001. Creating an inventory hedge for Markov-modulated Poisson
demand: An application and model. Manufacturing & Service Oper. Management 3(4) 306–320.
Abouee-Mehrizi, H., O. Baron, O. Berman. 2014. Exact analysis of capacitated two-echelon inven-
tory systems with priorities. Manufacturing & Service Oper. Management 16(4) 561-577.
Angelus, A. 2011. A multiechelon inventory problem with secondary market sales. Management
Sci. 57(12) 2145–2162.
Aviv, Y., A. Federgruen. 1997. Stochastic inventory models with limited production capacity and
periodically varying parameters. Probability in the Engineering and Informational Sciences. 11
107–135.
Beyer, D., S. Sethi. 1997. Average cost optimality in inventory models with Markovian demands.
J. Optim. Theory Appl. 92(3) 497–526.
Billingsley P. 1995. Probability and Measure. 3rd Edition, John Wiley & Sons.
Brenner, J. L. 1959. Relations among the minors of a matrix with dominant principal diagonal.
Duke Math. J. 26(4) 563-567.
Chao, X., S. Zhou. 2007. Probabilistic solution and bounds for serial inventory system with dis-
counted and average cost. Naval Res. Logist. 54(6) 623–631.
Chen, F., J.-S. Song. 2001. Optimal policies for multiechelon inventory problems with Markov-
modulated demand. Oper. Res. 49(2) 226–234.
Chen, F., Y.-S. Zheng. 1994. Lower bounds for multi-echelon stochastic inventory systems. Man-
agement Sci. 40(11) 1426–1443.
Chen, L., H.L. Lee. 2009. Information sharing and order variability control under a generalized
demand model. Management Science 55(5) 781-797.
Chen, L., H.L. Lee. 2012. Bullwhip Effect Measurement and Its Implications. Operations Research
60(4) 771-784.
Choi, K.-S., J.G. Dai and J.-S. Song. 2004. On measuring supplier performance under vendor-
managed-inventory programs in capacitated supply chains. Manufacturing & Service Operations
Management 6 (1) 53-72.
32
Clark, A.J., H. Scarf. 1960. Optimal policies for a multi-echelon inventory problem. Management
Sci. 6 475-490.
Dong, L., H.L. Lee. 2003. Optimal policies and approximations for a serial multi-echelon inventory
system with time-correlated demand. Oper. Res. 51(6) 969-980.
Federgruen, A., P. Zipkin. 1984. Computational issues in an infinite horizon multi-echelon inventory
problem with stochastic demand. Oper. Res. 32 818-836.
Gallego, G., O. Ozer. 2005. A new algorithm and a new heuristic for serial supply systems. Oper.
Res. Letters 33(4) 349-362.
Gallego, G., P. Zipkin. 1999. Stock positioning and performance estimation in serial production-
transportation systems. Manufacturing & Service Oper. Management 1(1) 77-88.
Huh, W. T., G. Janakiraman. 2008. A sample-path approach to the optimality of echelon order-
up-to policies in serial inventory systems. Oper. Res.Letters 36(5) 547-550.
Iglehardt, D., S. Karlin. 1962. Optimal policy for dynamic inventory process with nonstationary
stochastic demands. Studies in Applied Probability and Management Science. K. Arrow, S. Kar-
lin, H. Scarf (eds.). Chapter 8, Stanford University Press, Stanford, California.
Janakiraman, G., J.A. Muckstadt. 2009. A decomposition approach for a class of capacitated serial
systems. Oper. Res. 57(6) 1384-1393.
Karlin S. 1960. Dynamic inventory policy with varying stochastic demands. Management Sci. 6(3)
231-258.
Kapuscinski, R., S. Tayur. 1998. A capacitated production-inventory model with periodic demand.
Oper. Res. 46(6) 899-911.
Muharremoglu, A., J.N. Tsitsiklis. 2008. A Single-unit decomposition approach to multiechelon
inventory systems. Oper. Res. 56(5) 1089-1103.
Ozekici, S., M. Parlar. 1993. Periodic-review inventory models in random environments. Working
paper, School of Business, McMaster University, Hamilton, Ontario, Canada.
Parker, R.P., R. Kapuscinski. 2004. Optimal policies for a capacitated two-echelon inventory sys-
tem. Oper. Res. 52(5) 739-755.
Sgibnev M. S. 2001. Stone decomposition for a matrix renewal measure on a half-line. Sbornik:
Mathematics 192(7) 1025-1033.
Sethi, S., F. Cheng. 1997. Optimality of (s, S) policies in inventory models with Markovian demand.
Oper. Res. 45(6) 931-939.
33
Shang, K.H. 2012. Single-stage approximations for optimal policies in serial inventory systems with
non-stationary demand. Manufacturing & Service Oper. Management 14(3) 414-422.
Shang, K.H., J.-S. Song. 2003. Newsvendor bounds and heuristic for optimal policies in serial
supply chains. Management Sci. 49(5) 618-638.
Shang, K. and J.-S. Song. 2006. A closed-form approximation for serial inventory systems and its
application to system design. Manufacturing & Service Oper. Management 8(4) 394-406.
Song, J.-S., P. Zipkin. 1992. Evaluation of base stock policies in multiechelon inventory systems with
state dependent demands part I: State independent policies. Naval Res. Logist. 39(5) 715-728.
Song, J.-S., P. Zipkin. 1993. Inventory control in a fluctuating demand environment. Oper. Res.
41(2) 351-370.
Song, J.-S. and P. Zipkin. 1996a. Evaluation of base-stock policies in multiechelon inventory systems
with state-dependent demands: Part II: state-dependent depot policies. Naval Research Logistics
43 381-396.
Song, J.-S., P. Zipkin. 1996b. Managing inventory with the prospect of obsolescence. Oper. Res.
44(1) 215-222.
Song, J.-S., P. Zipkin. 2009. Inventories with multiple supply sources and networks of queues with
overflow bypasses. Management Sci. 55(3) 362–372.
Zipkin, P. 1989. Critical number policies for inventory models with periodic data. Management
Sci. 35(1) 71-80.
Zipkin, P. 2000. Foundations of inventory management. McGraw-Hill, New York.
34
Appendices for Online Companion
Appendix A: Proofs
Proof (Lemma 1) Prove by induction. For case i= 1, because G1(·, y) is convex in y, it follows
that G1(1∗, y) is increasing convex in y. Thus, by definition (4), we have
∂yR1(1∗, y) = ∂yG
1(1∗, y) + p1∗1∗E∂yR
1(1∗, y−D1∗).
Because the first term on the right-hand side is nonnegative, by a standard result of renewal
equations (see Sgibnev 2001 Theorem 3, p. 1032), it follows that ∂yR1(1∗, y)≥ 0. Taking derivative
on both sides of the equation, by the same argument, we obtain ∂2yR
1(1∗, y) ≥ 0. Therefore,
R1(1∗, y) is increasing convex in y. Next, by the definition of G2(·, y), we have G2(·, y) is convex
in y. Hence, it follows that G2(2∗, y) is increasing convex in y. By repeating the above induction
proof step, we obtain the desired result.
Proof (Lemma 2) We first show ∂yRi(y)≤
(I(i)−P
(i)m∗
)−1
· ei · ∂yGi(i∗, y). When i= 2, (9) can
be written as
(1− p1∗1∗)∂yR2(1∗, y)− p1∗2∗∂yR
2(2∗, y) ≤ 0;
−p2∗1∗∂yR2(1∗, y) + (1− p2∗2∗)∂yR
2(2∗, y) ≤ ∂yG2(2∗, y).
Recall that ∂yRi(·, y) and ∂yG
i(·, y) are always nonnegative. It follows that
∂yR2(1∗, y) ≤ p1∗2∗
(1− p1∗,1∗)(1− p2∗2∗)− p1∗2∗p2∗1∗∂yG
2(2∗, y) =p1∗2∗∣∣∣I(2)−P
(2)m∗
∣∣∣∂yG2(2∗, y);
∂yR2(2∗, y) ≤ 1− p1∗1∗
(1− p1∗,1∗)(1− p2∗2∗)− p1∗2∗p2∗1∗∂yG
2(2∗, y) =1− p1∗1∗∣∣∣I(2)−P
(2)m∗
∣∣∣∂yG2(2∗, y).
On the other hand, it is easy to verify(I(2)−P
(2)m∗
)−1
e2 =∣∣∣I(2)−P
(2)m∗
∣∣∣−1(
p1∗2∗
1− p1∗1∗
).
Therefore, we have ∂yR2(y)≤
(I(2)−P
(2)m∗
)−1
·e2 ·∂yG2(2∗, y). When i > 2, following the same proof
steps, we can show the inequality holds in general. Recall from lemma 1 that ∂yRi(y) is always
nonnegative. This implies that(I(i)−P
(i)m∗
)−1
· ei ≥ 0.
Next, let (a)c denote the minor of an element a in I(i)−P(i)m∗ . From the standard result of matrix
inverse, we have
(I(i)−P
(i)m∗
)−1
· ei =∣∣∣I(i)−P
(i)m∗
∣∣∣−1
(−1)i+1(−pi∗1∗)c(−1)i+2(−pi∗2∗)c
...(1− pi∗i∗)c
.
35
Also note that I(i) − P(i)m∗ is a diagonally dominant matrix (because 1 − pi∗i∗ ≥
∑i−1
j=1 pi∗j∗). By
Brenner (1959, Corollary 1), it follows that(−1)i+1(−pi∗1∗)c(−1)i+2(−pi∗2∗)c
...(1− pi∗i∗)c
≤
(1− pi∗i∗)c(1− pi∗i∗)c
...(1− pi∗i∗)c
,
where (1− pi∗i∗)c =∣∣∣I(i−1)−P
(i−1)m∗
∣∣∣. Therefore, we obtain(I(i)−P
(i)m∗
)−1
ei ≤ e · βi(m∗), and the
desired result follows.
Proof (Proposition 1) The lower bound result is straightforward from Lemma 1. For the upper
bound result, from the discussion preceding Proposition 1, we know
∂yG[k](k, y)≤ ∂yG(k, y) + (1− pkk)
K−1∑i=1
βi(m∗)∂yG
i(i∗, y),
where m∗ is the optimal state sequence. We want to further relax the second term of the right-hand
side to make it independent of the optimal state sequence. Note that, for i= 1, ...,K − 1,
∂yGi(i∗, y) = max0, ∂yGi(i∗, y)
= max
0, ∂yG
1(i∗, y) +i−1∑l=1
l∑k=1
pi∗k∗E∂yR
l(k∗, y−Di∗)
≤(∂yG
1(i∗, y))+
+i−1∑l=1
l∑k=1
pi∗k∗∂yRl(k∗, y)
≤(∂yG
1(i∗, y))+
+i−1∑l=1
φi,l(m∗)∂yG
l(l∗, y), (A1)
where the first inequality follows from Lemma 1 and the second inequality follows from Lemma
2, with φi,l(m∗) = [pm∗im∗1 , ..., pm∗im∗l ]
T(I(l)−P
(l)m∗
)−1
el. Define the following vector and matrix
notations:
∂yG =[∂yG
1(1∗, y), ∂yG2(2∗, y), ..., ∂yG
K−1((K − 1)∗, y)]T,
(∂yG1)+ =
[(∂yG
1(1∗, y))+, (∂yG1(2∗, y))+, ..., (∂yG
1((K − 1)∗, y))+
]T,
Ψ =
0 0 . . . 0
φ2,1(m∗) 0 0φ3,1(m∗) φ3,2(m∗) 0 0
......
. . .. . .
...φK−1,1(m∗) φK−1,2(m∗) . . . φK−1,K−2(m∗) 0
.
Then, the inequalities of (A1) can be written in a matrix form as(I(K−1)−Ψ
)∂yG ≤ (∂yG
1)+.
Note that I(K−1)−Ψ is a lower triangular matrix with ones on the diagonal, and hence its inverse
36
must also be a lower triangular matrix with ones on the diagonal. From the fact that the elements
in Ψ are all nonnegative, it follows that I(K−1)−Ψ is invertible and its inverse has all nonnegative
elements. Therefore, we have ∂yG≤(I(K−1)−Ψ
)−1(∂yG
1)+.
Further define β(m∗) = [β1(m∗), ..., βK−1(m∗)]T
. Then, we have
K−1∑i=1
βi(m∗)∂yG
i(i∗, y) =β(m∗)T∂yG≤β(m∗)T(I(K−1)−Ψ
)−1(∂yG
1)+.
Let α(m∗) = [α1(m∗), ..., αK−1(m∗)] = β(m∗)T(I(K−1)−Ψ
)−1, where αi(m
∗) can be determined
recursively by (10). Therefore, we conclude that
∂yG[k](k, y) ≤ ∂yG(k, y) + (1− pkk)
K−1∑i=1
αi(m∗)(∂yG
1(i∗, y))+
≤ ∂yG(k, y) + (1− pkk)maxm∈S
K−1∑i=1
αi(m) (∂yG(mi, y))+
= ∂yG(k, y) + (1− pkk)∆(y),
where the last equality follows from the definition of ∆(y) given in (11).
Proof (Corollary 1) s(k)≤ s∗(k)≤ s(k) is a direct result from Proposition 1. By the definition
of smin, we have smin ≤ s(k). Note that ∂yG(k, y) is increasing in y. Thus, it follows from the
definition of s(k) that ∂yG(k, smin) ≤ 0. Moreover, note that (∂yG(k, smin))+ = 0 for any k.
Hence, it is straightforward to verify that ∂yG(k, smin) + (1− pkk)∆(smin)≤ 0, which implies that
smin ≤ s(k) by the definition of s(k) in (12).
Proof (Corollary 2) By the definition of αi(m), we have αi(m)≥ βi(m). Therefore,
∂yG(k, y) ≤ ∂yG(k, y) +1
2
j−1∑i=1
pkmi ·βi(m) ·Fk (y− s(mi)) (∂yG(mi, y))+
≤ ∂yG(k, y) + (1− pkk) ·αi(m) (∂yG(mi, y))+ ≤ ∂yG(k, y) + (1− pkk)∆(y),
which implies the desired result.
Proof (Lemma 3) Let πtk = [πtk(1), ..., πtk(K)] denote the distribution of demand states at period
t given an initial state k ∈ 1, ...,K. Then, we have limt→∞πtk = π, where π is the Markov chain
stationary distribution. Let Xtk denote the demand in period t given the initial state k, then
Xtk =
D
(t)1 with prob. πtk(1),
D(t)2 with prob. πtk(2),
...
D(t)K with prob. πtk(K),
37
where D(t)i has the same distribution as Di. Note that D
(t1)i1, ...,D
(tn)in
are all independent of each
other for any given sets of i1, ..., in and t1, ..., tn, as long as tj 6= tj′ for j 6= j′.
Consider first an alternative Markov chain with the same transition matrix but starting with
the stationary distribution. Let Y t denote the demand in each period under this Markov chain.
Then, we have Y t =D(t)i with probability π(i) for any t≥ 1. Hence, E Y t=
∑K
i=1 µiπ(i) = µ. Let
Y t = Y t−µ. Then we have EY t= 0. It is also easy to check that
E
(Y t)2
=K∑k=1
(σ2k +µ2
k−µ2)π(k), and E
Y tY t+s
=
K∑k=1
K∑l=1
(µk−µ)µlp(s)kl π(k),
where p(s)kl is the s-step transition probability from state k to state l.
Since the Markov chain W has a stationary distribution and the state space is finite, by Theorem
8.9 in Billingsley (1995, p. 131), we have, for any k, l and s, where A≥ 0 and 0≤ ρ< 1,∣∣∣p(s)kl −π(l)
∣∣∣≤Aρs. (A2)
Since Y t is stationary, from the above condition, it can be verified that Y t is α-mixing with
an exponential decay rate (see Billingsley 1995, p. 363). Let SL = Y 1 + · · · Y L+1. By Theorem 27.4
of Billingsley (1995, p. 364), it follows that, as L→∞, the following series converge absolutely:
(L+ 1)−1varSL→ E
(Y 1)2
+ 2
∞∑s=1
EY 1Y 1+s
=
K∑k=1
[σ2k +µ2
k−µ2]π(k) + 2
∞∑s=1
K∑k=1
K∑l=1
(µk−µ)µlp(s)kl π(k)
= σ2.
If σ > 0, we further have, as L→∞,
SL
σ√L+ 1
dist.−−→N (0,1) . (A3)
Now consider the original Markov chain. Define Xtk =Xt
k − µ and SLk = X1k + · · · XL+1
k . Given a
fixed y value (y≥ 0), for any N ≥ 0, we can write the following:
P
(XN+1k ...+ XL+1
k
σ√L+ 1
≤ y
)
=K∑
iN+1=1
· · ·K∑
iL+1=1
P
(XN+1k ...+ XL+1
k
σ√L+ 1
≤ y
∣∣∣∣∣XN+1k =DiN+1
, ...,XL+1k =DiL+1
)×P
(XN+1k =DiN+1
, ...,XL+1k =DiL+1
)=
K∑iN+1=1
· · ·K∑
iL+1=1
P
(D
(N+1)iN+1
...+D(L+1)iL+1
− (L−N + 1)µ
σ√L+ 1
≤ y
)· p(N+1)k,iN+1
piN+1,iN+2· · ·piL,iL+1
,
38
and
P
(Y N+1...+ Y L+1
σ√L+ 1
≤ y
)
=K∑
iN+1=1
· · ·K∑
iL+1=1
P
(D
(N+1)iN+1
...+D(L+1)iL+1
− (L−N + 1)µ
σ√L+ 1
≤ y
)·π(iN+1)piN+1,iN+2
· · ·piL,iL+1.
Therefore, comparing the two expressions, we obtain∣∣∣∣∣P(XN+1k ...+ XL+1
k
σ√L+ 1
≤ y
)−P
(Y N+1...+ Y L+1
σ√L+ 1
≤ y
)∣∣∣∣∣≤
K∑iN+1=1
· · ·K∑
iL+1=1
P
(D
(N+1)iN+1
...+D(L+1)iL+1
− (L−N + 1)µ
σ√L+ 1
≤ y
)∣∣∣p(N+1)k,iN+1
−π(iN+1)∣∣∣piN+1,iN+2
· · ·piL,iL+1
≤ AρN+1
K∑iN+1=1
· · ·K∑
iL+1=1
P
(D
(N+1)iN+1
...+D(L+1)iL+1
− (L−N + 1)µ
σ√L+ 1
≤ y
)· piN+1,iN+2
· · ·piL,iL+1
≤ AKρN+1,
where the second inequality follows from the condition (A2) and the last inequality is a result of
relaxing the inner probability term P (·) to be one.
For any ε > 0, there exists Nε such that AKρN ≤ ε/3 for N >Nε. Fix Nε, it is easy to check that
the following results hold:
limL→∞
P
(SL
σ√L+ 1
≤ y
)= lim
L→∞P
(∑Nε
i=1 Yi)
+ Y Nε+1...+ Y L+1
σ√L+ 1
≤ y
= lim
L→∞P
(Y Nε+1...+ Y L+1
σ√L+ 1
≤ y
),
and, similarly,
limL→∞
P
(SLk
σ√L+ 1
≤ y)
= limL→∞
P
(XNε+1k ...+ XL+1
k
σ√L+ 1
≤ y
).
Therefore, from the above limiting results, there exists L(ε,Nε) such that for L>L(ε,Nε),∣∣∣∣∣P(
SL
σ√L+ 1
≤ y
)−P
(Y Nε+1...+ Y L+1
σ√L+ 1
≤ y
)∣∣∣∣∣< ε/3,and∣∣∣∣∣P(
SLkσ√L+ 1
≤ y)−P
(XNε+1k ...+ XL+1
k
σ√L+ 1
≤ y
)∣∣∣∣∣< ε/3.Hence, for L>L(ε,Nε),∣∣∣∣∣P(
SLkσ√L+ 1
≤ y)−P
(SL
σ√L+ 1
≤ y
)∣∣∣∣∣ ≤∣∣∣∣∣P(
SLkσ√L+ 1
≤ y)−P
(XNε+1k ...+ Xk
L
σ√L+ 1
≤ y
)∣∣∣∣∣
39
+
∣∣∣∣∣P(XNε+1k ...+ Xk
L
σ√L+ 1
≤ y
)−P
(Y Nε+1...+ YL
σ√L+ 1
≤ y
)∣∣∣∣∣+
∣∣∣∣∣P(
SL
σ√L+ 1
≤ y
)−P
(Y Nε+1...+ YL
σ√L+ 1
≤ y
)∣∣∣∣∣≤ ε/3 +AKρNε+1 + ε/3
≤ ε.
Since ε is chosen arbitrarily, it follows thatSLk
σ√L+ 1
andSL
σ√L+ 1
converge to the same distribu-
tion as L goes to infinity. From (A3), the desired result follows.
Proof (Proposition 2) Consider first the case when the Markov chain W has a stationary
distribution. Let η = b/(b + h). For any initial state k, the myopic base-stock level s(k) is the
solution of
P(X1k +X2
k + ...+XL+1k ≤ s(k)
)= η.
The above equation can be further written as
P
(X1k +X2
k + ...+XL+1k − (L+ 1)µ√
L+ 1σ≤ s(k)− (L+ 1)µ√
L+ 1σ
)= η.
Let L goes to infinity, since the left-hand side of the inequality inside the parenthesis converges
to N(0,1) in distribution, it follows that limL→∞
s(k)− (L+ 1)µ√L+ 1σ
= Φ−1(η), where Φ(·) is the stan-
dard normal distribution function. Hence, it follows that limL→∞
s(k)− (L+ 1)µ
(L+ 1)σ= 0, or, equivalently,
limL→∞
s(k)
(L+ 1)σ=µ
σ. Now for different initial states k and k′, we have
√L+ 1
s(k)− s(k′)s(k)
=(s(k)− s(k′))/
√L+ 1σ
s(k)/(L+ 1)σ
=(s(k)− (L+ 1)µ)/
√L+ 1σ− (s(k′)− (L+ 1)µ))/
√L+ 1σ
s(k)/(L+ 1)σ,
where the numerator converges to zero and the denominator converges to a positive constant, as
L goes to infinity. Therefore, we conclude that, for any k and k′,
limL→∞
√L+ 1 · s(k)− s(k′)
s(k)= 0,
which implies limL→∞
√L+ 1 · s(k)− smin
smin= 0.
When the Markov chain W is cyclic, for any given lead time L, we can write L=mK+ l, where
l = L mod K. The first mK periods of aggregated demand of Dk(L) and Dk′(L) are exactly the
same. As L increases to infinity, the residual part is negligible compared with the first mK periods
40
of aggregated demand. Therefore, we can show that, for any states k and k′, the lead time demand
distributions Fk,L(y) and Fk′,L(y) converge to the same limit distribution for every y as L goes to
infinity. Hence, the same convergence rate result can be obtained. We omit the details here.
Proof (Proposition 3) By repeatedly applying the procedure we used to obtain (23), we can
prove by induction that,
∂yG[k]n (k, y)≥ ∂yG1
n(k, y)≥−(b+Hn+1) + (b+Hn)Fk,L′n(y) = ∂yGn(k, y|hn).
By applying (3) repeatedly, taking derivative with respect to y, and applying Lemma 2, we have
∂yG[k]n (k, y) = ∂yG
1n(k, y) +
[k]−1∑i=1
∑u∈Ui+1
n
pku∂yERin(u, y−Dk)
≤ ∂yG
1n(k, y) +
[k]−1∑i=1
∑u∈Ui+1
n
pkuβi(m∗n)∂yG
in (i∗, y)
≤ ∂yG1(k, y) +
K−1∑i=1
(1− pkk)βi(m∗n)∂yGin (i∗, y) , (A4)
where the last inequality utilizes the facts that [k]≤K, transition probabilities pku sum up to 1,
and ∂yRin(u, y) is increasing and nonnegative. For ease of exposition, denote vj =Wk(t+L′n−L′j−1).
By using the derivatives of (19) and (20), we have
∂yG1n(k, y) = hn +E ∂yGn−1,n (vn, y−Dk[t, t+Ln))
= hn +E∂yG
[vn]n−1
(vn,min
y−Dk[t, t+Ln), s
[vn]n−1(vn)
)− ∂yG[vn]
n−1
(vn, s
[vn]n−1(vn)
)= hn +E
∂yG
[vn]n−1
(vn,min
y−Dk[t, t+Ln), s
[vn]n−1(vn)
)≤ hn +E
∂yG
[vn]n−1 (vn, y−Dk[t, t+Ln))
, (A5)
where the inequality utilizes the fact that the derivative of ∂yG[vn]n−1 (vn, y) is increasing in y (Chen
and Song 2001). By applying (3) to (A5), we have
∂yG1n(k, y) ≤ hn +E
∂yG
1n−1 (vn, y−Dk[t, t+Ln))
+E
[vn]−1∑i=1
∑u∈Ui+1
n−1
pvnuEDvn[∂yR
in−1 (u, y−Dk[t, t+Ln)−Dvn)
]≤ hn +E
∂yG
1n−1 (vn, y−Dk[t, t+Ln))
+E
[vn]−1∑i=1
∑u∈Ui+1
n−1
pvnu∂yRin−1 (u, y−Dk[t, t+Ln))
, (A6)
41
where the last inequality follows from Lemma 1.
Recall that ∂yGn(k, y|h[1,n]) =−(b+Hn+1)+(b+H1)Fk,L′n(y). By repeatedly applying the above
techniques and using Lemma 2, we have
∂yG1n(k, y)
≤ hn +hn−1 + ...+h2 +E∂yG
11 (v2, y−Dk[t, t+L′n−L1))
+E
n−1∑j=1
[vj+1]−1∑i=1
∑u∈Ui+1
j
pvj+1u∂yRij
(u, y−Dk[t, t+L′n−L′j)
)= ∂yGn(k, y|h[1,n]) +E
n−1∑j=1
[vj+1]−1∑i=1
∑u∈Ui+1
j
pvj+1u∂yRij
(u, y−Dk[t, t+L′n−L′j)
)≤ ∂yGn(k, y|h[1,n]) +E
n−1∑j=1
[vj+1]−1∑i=1
∑u∈Ui+1
j
pvj+1uβi(m∗j)∂yG
ij
(i∗, y−Dk[t, t+L′n−L′j)
)≤ ∂yGn(k, y|h[1,n]) +E
n−1∑j=1
K−1∑i=1
βi(m∗j)∂yG
ij
(i∗, y−Dk[t, t+L′n−L′j)
), (A7)
where the equality follows from v2 =Wk(t+L′n−L1) and Fv2,L1(y−Dk[t, t+L′n−L1)) = Fk,L′n(y).
Combining (A4) and (A7) leads to
∂yG[k]n (k, y)≤ ∂yGn(k, y|h[1,n]) +Mn(k, y),
where
Mn(k, y) =n−1∑j=1
K−1∑i=1
βi(m∗j )E
∂yG
ij
(i∗, y−Dk[t, t+L′n−L′j)
)+K−1∑i=1
(1− pkk)βi(m∗n)∂yGin(i∗, y),
with m∗j being the optimal sequence in the jth stage. With the same analysis as in the proof of
Proposition 1, we can further relax Mn(k, y) as follows.
Mn(k, y) ≤n−1∑j=1
E
K−1∑i=1
αi(m∗j) (∂yG
1j
(i∗, y−Dk[t, t+L′n−L′j)
))+
+(1− pkk)K−1∑i=1
αi (m∗n)(∂yG
1n (i∗, y)
)+, (A8)
where αi(m∗j ) can be determined recursively by (10). From (A7) and Lemma 2, we also have
(∂yG
1j(i∗, y)
)+ ≤ max
0, gj(i
∗, y) +
j−1∑s=1
E
K−1∑l=1
βl(m∗s)∂yG
ls
(l∗, y−Di∗ [t, t+L′j −L′s)
)
≤
(gj(i
∗, y) +
j−1∑s=1
E
K−1∑l=1
αl (m∗s)(∂yG
1s
(l∗, y−Di∗ [t, t+L′j −L′s)
))+
)+
.
(A9)
42
Combining (A8) and (A9), we can verify that
Mn(k, y)≤ (1− pkk)∆n(y) +∑n−1
j=1 ∆j,n(k, y)
where ∆n(y) and ∆j,n(k, y) are defined in Proposition 3. We arrive at the desired result.
Proof (Corollary 3) The proof follows the same argument as in the proof for Corollary 1 and is
omitted here.
Proof (Corollary 4) The proof follows the same argument as in the proof for Corollary 2 and is
omitted here.
Proof (Proposition 4) For ease of exposition, we present a proof for N = 2. The general case
with N > 2 can be shown following the same argument.
Consider first the case when the Markov chain W has a stationary distribution. When L′1 goes
to infinity, we can apply Proposition 2 to obtain the result for Stage 1. Thus, we only need to check
Stage 2. When L′2 going to infinity, either L1 or L2 goes to infinity. For brevity, we shall consider
the case of L1 going to infinity and L2 staying finite below. The other cases can be shown similarly.
By Proposition 3, s2(k) is the solution to −(b+H3) + (b+H2)Fk,L′2(y) = 0, or equivalently,
Fk,L′2(s2(k)) =b+H3
b+H2
.
With a similar argument to that used in the proof of Proposition 2, we have
limL′2→∞
s2(k)− (L′2 + 1)µ√L′2 + 1σ
= Φ−1(η),
where η= (b+H3)/(b+H2). We next establish a further relaxation of the derivative upper bound
given in Proposition 3. Define the following functions recursively:
∆n(y) = αK∑k=1
((b+H1)Fk,L′n(y) +
∑n−1
j=1 ∆j,n(k, y))
∆j,n(k, y) = E
∆j
(y−Dk[t, t+L′n−L′j)
),
where α = maxi∈1,...,K−1,m∈S αi(m). It is easy to verify that ∆2(y) ≤ ∆2(y) and ∆1,2(k, y) ≤∆1,2(k, y). Then, we have
∂yG[k]2 (k, y) ≤ ∂yG2(k, y|h[1,2]) + (1− pkk)∆2(y) + ∆1,2(k, y).
Therefore, we obtain a relaxed solution lower bound, denoted by s′2(k), from solving the equation:
∂yG2(k, y|h[1,2]) + (1− pkk)∆2(y) + ∆1,2(k, y) = 0.
43
Expanding the terms ∆2(y) and ∆1,2(k, y) based on their definitions yields
∆1(y) = αK∑k=1
(b+H1)Fk,L1(y),
∆1,2(k, y) = E
∆1 (y−Dk[t, t+L′2−L′1))
= α(b+H1)E
K∑l=1
Fl,L1(y−Dk[t, t+L2))
= α(b+H1)K∑l=1
P (Dl[t, t+L1] +Dk[t, t+L2)≤ y) ,
∆2(y) = αK∑k=1
((b+H1)Fk,L′2(y) + ∆1,2(k, y)
)= α(b+H1)
K∑k=1
Fk,L′2(y) + α2(b+H1)K∑k=1
K∑l=1
P (Dl[t, t+L1] +Dk[t, t+L2)≤ y) .
Therefore, s′2(k) is the solution to the following equation:
(b+H1)Fk,L′2(y) + (1− pkk)α(b+H1)K∑k=1
Fk,L′2(y)
+(1− pkk)α2(b+H1)K∑k=1
K∑l=1
P (Dl[t, t+L1] +Dk[t, t+L2)≤ y)
+α(b+H1)K∑l=1
P (Dl[t, t+L1] +Dk[t, t+L2)≤ y) = b+H3,
where the left-hand side involves a linear combination of the lead time demand distributions with
the same time length. From Lemma 3, we have
Dk[t, t+L′2]− (L′2 + 1)µ√L′2 + 1σ
d−−−−→L′2→∞
N(0,1).
Also recall that L1 goes to infinity and L2 stays finite, as L′2 goes to infinity. It is straightforward
to verify thatDl[t, t+L1] +Dk[t, t+L2)− (L′2 + 1)µ√
L′2 + 1σ
d−−−−→L′2→∞
N(0,1).
With a similar argument to that used in the proof of Proposition 2, we have
limL′2→∞
s′2(k)− (L′2 + 1)µ√L′2 + 1σ
= Φ−1(η′),
where η′ = (b+H3)/[(b+H1)(1 + αK(2− pkk) + α2K2(1− pkk))]. It is easy to verify that η′ < η.
Therefore, we have, for any k and k′,
√L′2 + 1 · s2(k)− s′2(k′)
s′2(k′)=
√L′2 + 1 ·
(s2(k)− s′2(k′))/√L′2 + 1σ
s′2(k′)/√L′2 + 1σ
44
=(s2(k)− (L′2 + 1)µ)/
√L′2 + 1σ− (s′2(k′)− (L′2 + 1)µ)/
√L′2 + 1σ
(L′2 + 1)−1/2
[(s′2(k′)− (L′2 + 1)µ)/
√L′2 + 1σ
]+µ/σ
−−−−→L′2→∞
Φ−1(η)−Φ−1(η′)
µ/σ.
This implies that
limL′2→∞
√L′2 + 1
s2(k)−mink′ s′2(k′)
mink′ s′2(k′)=
Φ−1(η)−Φ−1(η′)
µ/σ.
Since s2(k)≥ s′2(k), by the definition of smin2 , we have
0≤ s2(k)− smin2
smin2
≤ s2(k)−mink′ s′2(k′)
mink′ s′2(k′).
Hence,
0≤ limsupL′2→∞
√L′2 + 1
s2(k)− smin2
smin2
≤ limL′2→∞
√L′2 + 1
s2(k)−mink′ s′2(k′)
mink′ s′2(k′)=
Φ−1(η)−Φ−1(η′)
µ/σ.
Therefore, we arrive at
limsupL′2→∞
√L′2 + 1
s2(k)− smin2
smin2
= c,
where c is a nonnegative constant.
When the Markov chain W is cyclic, we can apply the same argument as used in the proof of
Proposition 2 to obtain the convergence result. We omit the details here.
45
Appendix B: A Two-State Example for the Single-Stage System
To illustrate our bounds and heuristic solutions, let us consider a single-stage system with a two-
state MMD demand process. For ease of exposition, we write the transition matrix of W as
P =
(1− p pq 1− q
).
Its stationary distribution is π = (q/(p+q), p/(p+q)). According to Corollary 1, the solution upper
bounds are just the myopic base-stock levels s(k), which solves (2). Without loss of generality, we
assume s(1)< s(2). Thus, the optimal solution for state 1 is given by s∗(1) = s(1) = smin = s(1) =
sa(1). It remains to determine the solution lower bound for state 2, s(2), and the heuristic solution
sa(k).
Given the two demand states, there are two possible state sequences: m1 = (1,2) and m2 = (2,1).
It is easy to verify that P(1)m1
= 1−p and P(1)m2
= 1−q, so that β1(m1) = α1(m1) = 1/p and β1(m2) =
α1(m2) = 1/q. From Corollary 1, s(2) is the solution of (12), which in this case reduces to
q · ∂yG(1, y) + p · ∂yG(2, y) = 0.
This is equivalent to
q
p+ q·F1,L(y) +
p
p+ q·F2,L(y) =
b
b+hor Fπ,L(y) =
b
b+h.
Comparing the above equation with (2), we can see that s(2) is a myopic base-stock level with an
“effective” lead time demand distribution, which is a weighted average of the lead time demand
distributions of states 1 and 2 (or the “stationary” lead time demand distribution).
Next, we illustrate how our heuristic solution works. With the assumption s(1) < s(2), the
heuristic sequence is m = (1,2). Therefore, by equation (13), sa(2) is given by
∂yG(2, y) +q
2p·F2(y− s(1)) (∂yG(1, y))
+= 0,
which is equivalent to
F2,L(y) +q
2p·F2(y− s(1))
(F1,L(y)− b
b+h
)+
=b
b+h. (A10)
To gain further insight, we next assume the demand distributions for state k is exponential
with mean 1/θk, with θ1 > θ2. We also assume L= 0 to allow for explicit expressions. This yields
Fk(y) = Fk,0(y) = 1− e−θky, ∂yG(k, y) = h− (h+ b)e−θky, s(k) = ln(1 + b/h)/θk for k = 1,2, and
s(1)< s(2). Thus, s∗(1) = s(1) = s(1) = sa(1) = ln(1 + b/h)/θ1.
46
To determine the optimal solution for state 2, note that s∗(2)> s∗(1). From (3), using Laplace
transform technique described in Appendix C, we can show that s∗(2) is determined by the following
first-order condition:
∂yG[2](2, y) = h− (h+ b)e−θ2y +
qh
p·(
1− pθ1e−θ2(y−s∗(1))− θ2e
−pθ1(y−s∗(1))
pθ1− θ2
)= 0. (A11)
From (A10), we can show that the heuristic sa(2) solves the following equation: for y≥ s∗(1),
h− (h+ b)e−θ2y +qh
p·(1− e−θ1(y−s∗(1))
) (1− e−θ2(y−s∗(1))
)2
= 0. (A12)
The difference between the above two first-order conditions lies in the last terms, which can be
approximated respectively as follows by using Taylor expansion to the second order:
1− (pθ1e−θ2(y−s∗(1))− θ2e
−pθ1(y−s∗(1)))
pθ1− θ2
≈ 1
2pθ1θ2(y− s∗(1))2 + o((y− s∗(1))2),
and (1− e−θ1(y−s∗(1))
) (1− e−θ2(y−s∗(1))
)2
≈ 1
2θ1θ2(y− s∗(1))2 + o((y− s∗(1))2).
Therefore, sa(2) is a close approximation to s∗(2) to the second order if p is close to one. This
condition implies that there is a high likelihood that the demand will stay in the high state—even
if it transitions back to state 1 in one period, it will jump to state 2 with a high probability in the
next period. The following proposition further characterizes how sa(2) and s∗(2) are affected by
the Markov chain transition probability q:
Proposition 5 Under a two-state MMD with exponential density and zero lead time, both the
optimal solution s∗(2) and the heuristic sa(2) are decreasing in transition probability q.
Proof Denote s∗ = s∗(1) for simplification. Because s∗(2) ≥ s∗, we can introduce the following
change of variable: y= s∗+x with x≥ 0. Substituting this into (A11), we obtain
∂xG2(2, s∗+x) =
p+ q
ph−
(h+ b+h
qθ1
pθ1− θ2
eθ2s∗)e−θ2(x+s∗) +h
qθ2 · epθ1s∗
p2θ1− pθ2
e−pθ1(x+s∗)
=p+ q
ph− (h+ b)e−θ2(x+s∗)−h qθ1
pθ1− θ2
e−θ2x +hqθ2
p2θ1− pθ2
e−pθ1x
By substituting s∗ = ln(1 + b/h)/θ1 into the above equation we have
∂xG2(2, s∗+x) =
p+ q
ph− (h+ b)e−θ2x
(h
h+ b
)θ2/θ1−h qθ1
pθ1− θ2
e−θ2x +hqθ2
p2θ1− pθ2
e−pθ1x.
Further take derivative with respect to q of the above expression. We obtain
∂q(∂xG
2(2, s∗+x))
=h
p−h θ1
pθ1− θ2
e−θ2x +hθ2
p2θ1− pθ2
e−pθ1x.
47
Note that ∂q (∂xG2(2, s∗+x)) |x=0 = 0, ∂q (∂xG
2(2, s∗+x)) |x=+∞ = h/p> 0, and
∂x(∂q(∂xG
2(2, s∗+x)))
= hθ1θ2
pθ1− θ2
(e−θ2x− e−pθ1x
)≥ 0 for x≥ 0.
Therefore we must have ∂q (∂xG2(2, s∗+x))≥ 0 for x≥ 0, which implies that s2(2∗) decreases as q
increases.
Next we examine our heuristic. Since sa(2)≥ s∗, we can also apply the above change of variable
to (A12). Therefore, sa(2) is given by the solution to
h− (h+ b)e−θ2(x+s∗) +qh
p· (1− e
−θ1x) (1− e−θ2x)2
= 0.
Clearly the left-hand side of the above equation is increasing in q, which implies that our heuristic
solution for state 2 also decreases as q increases.
48
Appendix C: Gamma Demand Distribution and Laplace Transform
Recall from (3) and (4) that the exact algorithm requires convolutions involving the demand
distributions Fk. In this section, we assume the demand distributions under different demand states
all belong to the gamma distribution family, with the probability density function given by (28).
We show that, in this case, the optimal policy can be computed relatively easily by leveraging
Laplace transformation and its inverse.
Denote Laplace transform of ∂yRi(k, y) by
ri(k,λ) =
∫ ∞0
e−λydRi(k, y),
and denote Laplace transform of ∂yGi(i∗, y) by
gi(i∗, λ) =
∫ ∞0
e−λydGi(i∗, y).
Given the optimal demand state sequence m∗ = (1∗, ...,K∗), define ri(λ) = [ri(1∗, λ), ..., ri(i∗, λ)]T ,
gi(λ) = [0, ...,0, gi(i∗, λ)]T . Because Laplace transform of a gamma density f(x|nk, θk) is given by
θnkk /(λ+ θk)
nk , Laplace transform of D(i)m∗(x) defined in (6) is given by
d(i)m∗(λ) =
θn1∗1∗
(λ+θ1∗ )n1∗ . . . 0...
. . ....
0 . . .θni∗i∗
(λ+θi∗ )ni∗
.
Under Laplace transform, and noting that I(i)−d(i)m∗(λ)P
(i)m∗ is strictly diagonal dominant, (7) can
be simplified to
ri(λ) =(I(i)−d
(i)m∗(λ)P
(i)m∗
)−1
gi(λ).
Here, because d(i)m∗(λ) has a simple form, the inverse Laplace transform of ri(λ) can be fairly easily
determined by Laplace transform table or by standard software packages such as Matlab. Thus,
we can obtain ∂yRi(k, y) from the inverse Laplace transform, and substitute it into (3) to compute
the optimal policy.
It is worth emphasizing here that the computation of the solution bounds and heuristic in Corol-
lary 2 does not require the function of ∂yRi(k, y); it only involves evaluation of simple newsvendor
derivative functions. Thus, unlike the optimal policy, the solution bounds and heuristic proposed
in this paper can be computed for other general demand distributions.
49
Appendix D: Single-Stage Approximation under Stationary Demand
Shang and Song (2003) established solution bounds for a serial inventory system with i.i.d.
demand, which is a special case of our problem. Our results relate to theirs as follows. Let ΩN =
(N, (hi,Li)Ni=1, b,D) denote the original N -stage serial system. Shang and Song (2003) showed
that under i.i.d. demand, the echelon n problem is equivalent to an n-stage serial system Ωn =(n, (hi,Li)
ni=1, b+
∑N
i=n+1 hi,D)
. That is, the echelon n problem is the same as the original N -
stage system truncated at stage n, except that the backorder cost rate is modified to be b +∑N
i=n+1 hi. They also showed that (the cost of) Ωn is bounded by a lower bound system Ωln =(
n, (hi,Li)ni=1, b+
∑N
i=n+1 hi,D)
, with hn = hn and hi = 0 for i < n, and an upper bound system
Ωun =
(n, (hi,Li)
ni=1, b+
∑N
i=n+1 hi,D)
, with hn =∑n
i=1 hi and hi = 0 for i < n. They further showed
that the optimal solution to the lower (upper) bound system is an upper (lower) bound for the
optimal echelon base-stock solution of the original system. Because the two bounding systems are
equivalent to single-stage systems, where the myopic base-stock level is optimal, their solution
bounds are simple single-stage myopic base-stock levels, and one can regard these solutions as
single-stage approximations.
We can adapt the same two bounding systems to our problem with MMD. Specifically, the
derivative of the single-period cost function for demand state k in the lower bound system Ωln is
given by
∂yGn(k, y|Ωln) = −
(b+∑N
i=n+1 hi
)+(b+∑N
i=n+1 hi +hn
)Fk,L′n(y)
= −(b+Hn+1) + (b+Hn)Fk,L′n(y). (A13)
And its counterpart in the upper bound system Ωun is given by
∂yGn(k, y|Ωun) = −
(b+∑N
i=n+1 hi
)+(b+∑N
i=n+1 hi + hn
)Fk,L′n(y)
= −(b+Hn+1) + (b+H1)Fk,L′n(y). (A14)
Observe that (A13) is exactly the same as the derivative lower bound in Proposition 3 (a). On the
other hand, (A14) is only the first term in the derivative upper bound in Proposition 3 (b). Under
the i.i.d. demand process, i.e., K = 1, the ∆ terms in Proposition 3(b) vanish, and we recover
the solution bound results in Shang and Song (2003). However, their bounding approach cannot
be fully extended to the general MMD process: While the myopic base-stock level to the lower
bound system remains an upper bound for the optimal solution of the original system, the myopic
base-stock level to the upper bound system is no longer guaranteed to be a lower bound (See Table
1 for the counterexamples marked with a “†” sign).
50
Appendix E: Additional Numerical Results
Five-State Demand Cases
As a robustness check, we conduct an additional numerical study for two five-state demand cases.
Each of the five states follows a gamma distribution, with parameters given by (ni, θi) = (1,1),
(2,2), (3,3), (2,0.2), and (2,0.2) for i= 1,2,3,4,5, representing demand means of 1, 1, 1, 10, and
10, respectively. We set the demand distributions for states 4 and 5 to be the same in these two
cases, to illustrate the fact that the optimal echelon base-stock levels for these two states may
be different (because the lead time demand distributions starting from these two states may be
different under MMD). The transition matrices for these two cases are given as follows:
Cyclic demand Noncyclic demand0 1 0 0 00 0 1 0 00 0 0 1 00 0 0 0 11 0 0 0 0
,
0.10 0.10 0.20 0.40 0.200.35 0.10 0.20 0.10 0.250.10 0.20 0.20 0.10 0.400.50 0.05 0.25 0.15 0.050.30 0.05 0.25 0.30 0.10
.
We set the cost parameters to be the same as in our base case. Table 3 below presents the detailed
numerical results for the two five-state demand cases. As we can see, the numerical insights are
essentially the same as those observed in the three-state demand cases (Table 1).
Internal Fill Rates
We design an additional numerical study to shed light on the internal fill rate under the optimal and
heuristic policies (i.e., s∗n(k), svn(k), and san(k|h[1,n])). For this purpose, we simulate the two-stage
inventory system for 50,000 periods to measure the realized fill rates at both stages. In each period,
we measure the fill rate (= fulfilled order quantity/placed order quantity) at each stage and then
average the fill rates across 50,000 periods. We present the results for the base case of three-state
MMD in Table 4. The numerical insights from the five-state demand cases are essentially the same,
and we omit them for brevity.
From Table 4, we observe that the fill rates under the enhanced heuristic san(k|h[1,n]) are close to
those under the optimal policy s∗n(k), except for the cyclic demand case with lead times (L1,L2) =
(1,1). In this case, as can be seen from Table 1, the Stage 2 heuristic policy is the same across
different states (because the lead time demand distributions are the same across different states),
whereas the optimal policy varies depending on the state. This explains why the Stage 2 fill rate
under the heuristic policies is significantly lower than that under the optimal policy.
We observe that, under the optimal policy, the Stage 2 fill rate need not be high, especially
when the Stage 1 lead time is relatively long. However, the difference between the internal and
external fill rates is not as significant as what is observed under the i.i.d. demand. The behavior
51
Table 3 Comparison of policies in a two-stage system under the five-state demand cases.
Cyclic demand Noncyclic demand(L1,L2) n k s∗n(k) sn(k) sn(k) svn(k) san(k) s∗n(k) sn(k) sn(k) svn(k) san(k)(1, 1) 1 1 4.47 4.03 4.63 4.63† 4.49 23.12 21.39 23.40 23.40† 23.20
2 3.88 3.88 3.88 3.88 3.88 20.01 20.01 20.01 20.01 20.013 26.44 13.17 26.44 26.44 26.44 22.03 20.92 22.22 22.22† 22.054 34.21 23.74 40.85 40.85† 38.84 30.15 24.06 31.58 31.58† 30.485 20.65 13.21 26.50 26.50† 24.25 32.91 24.98 35.04 35.04† 33.61
2 1 6.14 5.09 6.50 5.40 5.40 28.97 24.83 36.92 28.98† 28.912 25.02 6.50 31.35 25.03† 25.03 28.18 24.47 35.25 27.31 27.313 34.15 7.37 46.58 38.85† 38.85 29.63 24.75 36.57 28.60 28.514 37.45 7.41 46.63 38.89† 36.90 38.27 25.96 46.71 38.02 36.765 27.34 6.72 31.42 25.09 22.88 39.98 26.09 49.05 40.13† 38.75
Soln error % - 58.5% 20.5% 9.9% 8.5% - 19.0% 14.8% 2.2% 2.0%Cost error % 66.46 - - 4.3% 2.2% 79.23 - - 0.6% 0.6%(1, 3) 1 1 4.47 4.03 4.63 4.63† 4.49 23.12 21.39 23.40 23.20† 23.40
2 3.88 3.88 3.88 3.88 3.88 20.01 20.01 20.01 20.01 20.013 26.44 13.17 26.44 26.44 26.44 22.03 20.92 22.22 22.22† 22.054 34.21 23.74 40.85 40.85† 38.84 30.15 24.06 31.58 31.58† 30.485 20.65 13.21 26.50 26.50† 24.25 32.91 24.98 35.04 35.04† 33.61
2 1 36.35 27.14 48.69 40.94† 40.94 47.85 34.15 54.72 44.87† 44.772 42.87 27.44 48.69 40.94 40.94 46.07 33.61 52.79 42.98 42.983 48.41 27.44 48.69 40.94 40.94 47.73 33.99 54.18 44.32 44.234 47.81 27.44 48.69 40.94 40.94 55.72 35.49 63.27 52.90 51.795 43.07 27.14 48.69 40.94 40.94 57.58 35.62 65.52 54.97 53.67
Soln error % - 36.8% 12.2% 11.6% 10.1% - 25.8% 10.3% 4.9% 4.9%Cost error % 72.90 - - 7.7% 6.2% 89.63 - - 1.0% 1.2%(3, 1) 1 1 28.57 28.57 28.57 28.57 28.57 40.79 39.58 40.93 40.93† 40.82
2 42.53 31.96 42.90 42.90† 42.90 38.81 38.81 38.81 38.81 38.813 41.77 31.97 42.94 42.94† 42.94 40.22 39.39 40.32 40.32† 40.204 40.17 31.97 42.95 42.95† 40.94 47.86 41.83 49.22 49.22† 48.085 28.57 28.57 28.57 28.57 28.57 49.79 42.30 51.57 51.57† 50.24
2 1 40.83 33.20 48.69 40.94† 40.94 45.13 42.09 54.72 44.87 44.772 40.48 33.20 48.69 40.94† 40.94 43.33 41.73 52.79 42.98 42.983 39.41 33.45 48.69 40.94† 40.94 45.20 42.01 54.18 44.32 44.234 44.62 33.48 48.69 40.94 40.94 52.96 43.06 63.27 52.90 51.795 43.91 33.44 48.69 40.94 40.94 54.65 43.18 65.52 54.97† 53.67
Soln error % - 18.2% 9.9% 3.3% 2.8% - 9.8% 11.5% 1.1% 1.0%Cost error % 87.90 - - 0.6% 0.6% 108.59 - - 0.4% 0.1%
of the system also differs depending on whether W is cyclic. In the noncyclic demand case with
lead times (L1,L2) = (3,1), the realized fill rate at Stage 2 is only 78%. However, when Stage 2 has
longer lead time than Stage 1, i.e., (L1,L2) = (1,3), the realized fill rate at Stage 2 is 98% under
cyclic demand and 93% under noncyclic demand. This observation suggests that the high internal
fill rate assumption under the decoupling heuristic holds when the downstream lead time is shorter
than the upstream lead time (e.g., Abhyankar and Graves 2001). However, this assumption may
be problematic when the downstream lead time is longer than the upstream lead time.
52
Table 4 Fill rates in a two-stage system under the three-state demand cases.
Cyclic demand Noncyclic demand(L1,L2) n s∗n(k) svn(k) san(k|h[1,n]) s∗n(k) svn(k) san(k|h[1,n])(1, 1) 1 88% 89% 89% 89% 90% 89%
2 88% 60% 77% 88% 85% 86%(1, 3) 1 90% 90% 90% 89% 89% 88%
2 98% 87% 96% 93% 90% 89%(3, 1) 1 91% 92% 91% 92% 92% 92%
2 80% 77% 81% 78% 74% 75%
The Bullwhip Effect
The bullwhip effect, or the amplification of demand variability propagating from the downstream
to the upstream stages of a supply chain, has been extensively studied in the literature (see Chen
and Lee 2009, 2012 and the references therein). Most studies of the bullwhip effect have focused on
either a single-stage system or a two-stage system with decentralized decision making. In a serial
supply chain with centralized decision making, it is known that a stationary echelon base-stock
policy is optimal under i.i.d. demand (Clark and Scarf 1960). As a result, the replenishment order
at each stage in each period is simply the replacement of the demand in the previous period, so
there is no amplification of demand variability in such a system. However, when demand follows an
MMD process, the optimal inventory policy is state-dependent, and it is not clear whether demand
variability would be amplified or dampened in such a system.
Specifically, we simulate the two-stage inventory system for 50,000 periods to measure the order
variability amplification ratio (i.e., the bullwhip effect ratio) at both stages. We first calculate the
sample variances of external demand, orders placed by Stage 1, and orders placed by Stage 2. Then,
the bullwhip effect ratio for Stage 1 is determined by the ratio of Stage 1 order variance over the
external demand variance, and the bullwhip effect ratio for Stage 2 (or more precisely echelon 2) is
determined by the ratio of Stage 2 order variance over the external demand variance. We present
the results of the base case of three-state MMD in Table 5. The numerical insights of the five-state
demand cases are essentially the same, and we omit them for brevity.
Table 5 Bullwhip effect ratio in a two-stage system under the three-state demand cases.
Cyclic demand Noncyclic demand(L1,L2) n s∗n(k) svn(k) san(k|h[1,n]) s∗n(k) svn(k) san(k|h[1,n])(1, 1) 1 0.85 0.98 1.09 0.63 0.69 0.64
2 0.64 1.00 1.00 0.70 0.67 0.63(1, 3) 1 1.15 1.02 1.19 0.63 0.68 0.63
2 0.75 0.78 0.91 0.64 0.65 0.62(3, 1) 1 0.68 0.77 0.66 0.63 0.67 0.63
2 0.49 0.76 0.89 0.65 0.65 0.62
53
From Table 5, we observe that the bullwhip effect ratios under both heuristics closely track
those under the optimal policy for the noncyclic demand. When the demand is cyclic, the bullwhip
effect ratios under both heuristics tend to be higher than those under the optimal policy. This
observation accords with our earlier observations that the heuristics tend to perform better when
demand is noncyclic.
An interesting observation from the above table is that the bullwhip effect ratio under the optimal
policy is less than one in most cases, indicating a sweeping demand variability dampening effect.
Specifically, for noncyclic demand, the bullwhip effect ratios for Stages 1 and 2 are consistently
below 0.70. To further investigate this phenomenon, we compute the autocorrelation of sequential
demand under the noncyclic demand. We find that ρ(1) = −0.122, ρ(2) = −0.002, ρ(3) = 0.010,
ρ(4) =−0.003, and ρ(5) = 0.000, where ρ(i) = CorrD(t),D(t+ i). Thus, there exists only a weak
negative correlation among sequential demands, which cannot fully account for the large magnitude
of the variability dampening effect. Chen and Lee (2012) showed that the existence of system
capacity constraints can help dampen the bullwhip effect. In our system with MMD, there is no
capacity constraint, yet we still observe that the bullwhip effect can be significantly dampened
under the optimal policy. This suggests that the state-dependent inventory policy under MMD
may have an inherent smoothing effect on the order variability. Finding the root cause for this
intriguing observation is beyond the scope of this paper, and we shall leave it for future research.