
Serial Inventory Systems with Markov-Modulated Demand:

Derivative Bounds, Asymptotic Analysis, and Insights

Li Chen

Samuel Curtis Johnson Graduate School of Management, Cornell University, Ithaca, NY 14853

[email protected]

Jing-Sheng Song

Fuqua School of Business, Duke University, Durham, NC 27708

[email protected]

Yue Zhang

Fuqua School of Business, Duke University, Durham, NC 27708

[email protected]

In this paper we consider the inventory control problem for serial supply chains with continuous, Markov-

modulated demand (MMD). Our goal is to simplify the computational complexity by resorting to certain

approximation techniques, and, in doing so, to gain a deeper understanding of the problem. To this end, we

analyze the problem in several new ways. We first perform a derivative analysis of the problem’s optimality

equations, and develop general, analytical solution bounds for the optimal policy. Based on the bound results,

we derive a simple procedure for computing near-optimal heuristic solutions for the problem. These simple

solutions reveal a closer relationship with the primitive model parameters. Second, we perform asymptotic

analysis with long replenishment lead time and establish an MMD central limit theorem. We further show

that the relative errors between our heuristics and the optimal solutions converge to zero as the lead time

becomes sufficiently long, with the rate of convergence being the square root of the lead time. Our numerical

results reveal that our heuristic solutions can achieve near-optimal performance even under relatively short

lead times. Third, we show that, by leveraging the Laplace transformation, the optimal policy becomes

computationally tractable under the gamma distribution family. This enables us to numerically compare

various heuristic solutions with the optimal solution, and to demonstrate that our heuristic outperforms

existing heuristics in most cases. Finally, we observe that the internal fill rate and demand variability

propagation in an optimally controlled supply chain under MMD exhibit behaviors different from those

under stationary demand.

Key words : multi-echelon inventory systems, Markov-modulated demand, derivative, asymptotic analysis

History : file version October 8, 2015

1. Introduction

The increasingly open global economy has made it possible for companies to seek the best available

resources and technologies worldwide and to serve new markets. As a result, many supply chains

have been stretched long and thin, often consisting of multiple production and distribution stages

across different countries and continents. This new supply chain configuration brings new challenges


to managers. For example, demand shocks, which could be political, economic, technological, or

climatic, in one region can quickly propagate to other regions, causing shortages in some places but

oversupplies in others. Thus, companies that traditionally operate their supply chains in a relatively

stable environment, such as within one country, must now learn how to effectively rationalize

inventory along the global supply chain to hedge against greater environmental uncertainties.

To help managers meet this new challenge, we consider in this paper a serial supply chain

model that involves demand uncertainties driven by dynamic environmental factors. Specifically, we

assume that customer demand in each period is influenced by a “world” state that evolves according

to a discrete-time Markov chain. This demand process is known as the Markov-modulated demand

(MMD) process in the literature (Iglehart and Karlin 1962 and Song and Zipkin 1993). The MMD

process is useful and practical for its structured data requirement and modeling flexibility. To

construct the Markov chain, one can first identify demand scenarios of different phases in a product

development process or in a product life cycle, and then assess the likelihood of each scenario as

well as the transition probabilities between the scenarios, much as in the construction of a decision

tree (Song and Zipkin 1996b).

Chen and Song (2001) showed that a world-state-dependent echelon base-stock policy is optimal

for serial inventory systems with MMD. Computing the optimal policy, however, is a nontrivial task.

Chen and Song (2001) developed an iterative optimization algorithm, which requires solving a set

of integral renewal equations in each iteration. Their algorithm is effective for the discrete demand

case (as the integral renewal equations reduce to linear equations), but remains computationally

challenging for the continuous demand case. Muharremoglu and Tsitsiklis (2008) showed that the

state-dependent echelon base-stock policy continues to be optimal for more general systems with

(exogenous and sequential) stochastic lead times and discrete demand processes (including MMD).

They decomposed the problem into a series of single-unit single-demand problems, and developed

an efficient dynamic programming algorithm for computing the optimal policy. Due to the nature

of their single-unit decomposition approach, their algorithm cannot be applied to the continuous

demand case. In addition, while representing a significant improvement over the standard dynamic

programming algorithms, these exact algorithms work somewhat like a “black box”—numbers in

and numbers out, lacking the transparency between the model parameters (inputs) and the optimal

inventory decisions (outputs).

In this paper we focus on the continuous demand case. Our goal is two-fold. First, we seek to

develop new approximation techniques to simplify the computational complexity. Such solution

techniques enable managers to make swift decisions in choosing appropriate inventory levels along

the supply chain to hedge against environmental fluctuations. Second, we strive to make the black

box more transparent by establishing a direct relationship between the primitive model parameters


and the simple solutions. Such a relationship sheds light on both the qualitative and quantitative

effects of the environmental uncertainties. It also helps guide managers to invest wisely to obtain

the right information (or problem parameters) and to react quickly in the right direction. To achieve

the above goal, we analyze the problem in the following new ways.

Derivative Bounds and Heuristics

Our first approach is to perform a derivative analysis of the problem’s optimality equations and

develop derivative bounds. The solution bounds obtained from our derivative bounds can be computed

without solving the integral renewal equations required in the algorithm of Chen and Song

(2001). The derivative bound expressions also reveal a closer relationship with the primitive model

parameters, which was not apparent in the algorithms of Chen and Song (2001) and Muharremoglu

and Tsitsiklis (2008).

From the derivative analysis, we obtain general, analytical solution bounds for serial inventory

systems with MMD. The existing bounding approaches for problems with nonstationary demand

all build on a simplifying assumption that requires the optimal state-dependent echelon base-stock

level to be achievable in each period (e.g., Dong and Lee 2003 and Shang 2012). This assumption,

however, does not hold in general, and the “lower bound” obtained from these approaches may fail

to bound the optimal solution under MMD. In contrast, our derivative-based approach does

not require such an assumption, and thus our solution bounds work in general. In addition, our

solution bounds can be viewed as a generalization of those obtained by Shang and Song (2003) for

the independent and identically-distributed (i.i.d.) demand process. When the state space of the

Markov chain degenerates to a singleton, the demand process becomes i.i.d. and our bounds reduce

to theirs, whereas their bounding approach does not extend to the MMD case (see Appendix D

for a detailed discussion).

Computing our derivative-based solution lower bound involves optimization over demand state

permutations, which can be challenging if there is a large number of states. To simplify computation,

we develop easy-to-compute heuristic solutions that only require evaluating a linear combination

of derivatives of the newsvendor cost functions of each demand state. To our knowledge, this

derivative-based heuristic is the first of its kind in approximating the optimal policy for both single-

stage and serial inventory systems with MMD. Moreover, with this heuristic, we can establish a

mapping between the underlying Markov chain transition probabilities and the linear weight of

each demand state, making the solution process more transparent.

Asymptotic Analysis with Long Lead Time

Our second approach is to perform an asymptotic analysis of the problem with long lead time. In

doing so, we first establish an MMD central limit theorem for the lead time demand (i.e., demand


aggregated over the lead time periods). We find that, as the lead time becomes sufficiently long,

the lead time demands with different initial states all converge to the same normal distribution.

To prove this result, we use the concept of α-mixing to show that demands far apart in time under

MMD are almost independent (Billingsley 1995). This finding reveals an inherent averaging effect

of aggregation under MMD: While the demand distributions of different states in a single period

may be drastically different, the aggregated (lead time) demand increasingly resembles

that of an i.i.d. process as the aggregation period increases.

Leveraging our derivative analysis and an MMD central limit theorem, we obtain asymptotic

results for the solution bounds under long replenishment lead time. Specifically, we show that the

relative errors between our solution upper and lower bounds converge to zero as the lead time

increases, with the convergence rate being the square root of the lead time. Accordingly, heuristic

solutions that fall between our solution bounds, such as our derivative-based heuristic and its

simplified variants, are guaranteed to converge to the optimal solution at a rate of at least the

square root of the lead time. This observation suggests that, when the lead time is sufficiently long, the

seemingly complex supply chain problem with MMD has surprisingly simple solutions. In proving

the above result, we make a new methodological contribution to the literature—one can apply our

analysis approach to establish similar results for other inventory systems.

Exact Evaluation and Observations

While solving integral renewal equations is generally challenging for continuous demand, we discover

an exception. When demand belongs to the gamma distribution family, we find that the

exact algorithm of Chen and Song (2001) becomes tractable with the aid of Laplace transformation

and its inverse (see Appendix C for details). This finding enables us to numerically compare our

derivative-based heuristic solutions with the optimal policy under relatively short lead times, complementing

the asymptotic results established under long lead times. From our numerical studies,

we observe that our heuristic achieves near-optimal performance in most cases.

We further compare our derivative-based heuristic with the decoupling heuristic proposed by

Abhyankar and Graves (2001). Their heuristic, designed for a two-stage system with MMD, is based

on the following simplifying assumption. The upstream stage provides 100% internal fill rate, i.e., it

can always fulfill orders from the downstream stage. As a result, the two-stage system is decoupled

into two single-stage systems, with the downstream stage operating under a state-dependent installation

base-stock policy, and the upstream stage operating under a static installation base-stock

policy. Our numerical results show that our derivative-based heuristic outperforms their decoupling

heuristic in most cases, with an average performance improvement of 4.6%. Besides the performance

difference, we note that our derivative-based heuristic can be applied to systems with any

number of stages, whereas their decoupling heuristic only works for two-stage systems.


It has been demonstrated in the literature that, in a serial inventory system with i.i.d. demand,

an optimal internal fill rate is usually much lower than the fill rate at the customer-facing stage

(see Choi et al. 2004, Shang and Song 2006, and references therein). For systems with MMD,

a natural question is whether the internal fill rate would continue to be low, or become much

higher as assumed in the decoupling heuristic of Abhyankar and Graves (2001). To obtain insights

into this question, we conduct numerical experiments to measure the (optimal) internal fill rate

under MMD. We find that, contrary to the observations documented for the i.i.d. demand case,

when the lead time at the upstream stage is long, the internal fill rate can be high under MMD

(around 94-96%; the target fill rate at the customer-facing stage is around 94%), suggesting that

the decoupling heuristic may yield a good approximation to the optimal policy in this case. On the

other hand, when the lead time at the upstream stage is relatively short, the internal fill rate tends

to be low (around 84-93%, lower than the 94% target fill rate at the customer-facing stage). As

a result, the decoupling heuristic yields a poor approximation; and our derivative-based heuristic

has a clear advantage in this case, as it does not require the high internal fill rate assumption.

In addition, we investigate numerically the demand variability propagation effect, or the bullwhip

effect (see Chen and Lee 2009, 2012 and the references therein). Interestingly, we observe that the

bullwhip effect can be significantly dampened under the optimal policy, indicating that the state-

dependent inventory policy under MMD may smooth demand variability propagation in the supply

chain (see Appendix E for details). This contrasts with the existing observations under autoregressive

AR(1) demand in the literature, suggesting interesting future research directions.

Finally, we note that our derivative bounds are derived based on the optimality equations of

the exact algorithm. Chen and Song (2001) showed that the exact algorithm can be extended to

systems with a fixed setup cost at the most upstream stage, as well as to assembly systems. Thus, all our

results can be extended accordingly to those more general systems.

The rest of this paper is organized as follows. We provide a brief literature review in §2. We then

analyze a single-stage inventory problem in §3. This lays the foundation for the analysis of serial

inventory systems in §4. §5 presents the numerical studies, and §6 concludes the paper. All proofs

are presented in Appendix A; and all other supplementary results are included in Appendices B-E.

2. Literature Review

The serial system structure we consider in this paper is the classic multi-echelon inventory model,

first developed and analyzed by Clark and Scarf (1960), and further studied by many researchers

in various dimensions (see Zipkin 2000 for a review). It serves as an important baseline model

and a key building block for more complex supply chain structures. Much of the literature on

this classic model assumes an i.i.d. demand process. Under this assumption, it has been shown


that a stationary echelon base-stock policy is optimal. Earlier work focused on the computational

efficiency of the optimal policy; e.g., see Federgruen and Zipkin (1984) and Chen and Zheng (1994).

More recently, researchers have developed simple bounds and heuristics that aim to increase the

transparency of the factors that determine the optimal policy. See, for example, Gallego and Zipkin

(1999), Dong and Lee (2003), Shang and Song (2003), Gallego and Ozer (2005), and Chao and

Zhou (2007). Our work extends their efforts to systems with MMD.

The MMD process, which extends the i.i.d. demand process to incorporate dynamic state evolution,

was first introduced by Iglehart and Karlin (1962) to study a single-stage inventory model,

but only gained popularity in the inventory literature in the last few decades; see, for example, Song

and Zipkin (1993), Ozekici and Parlar (1993), Beyer and Sethi (1997), and Sethi and Cheng (1997).

An important special case of MMD is the cyclic demand model, which various authors explored;

see, for example, Karlin (1960), Zipkin (1989), Aviv and Federgruen (1997), and Kapuscinski and

Tayur (1998). Several authors also adopted MMD in serial inventory systems, but focused on different

aspects of the problem than we do. For example, Song and Zipkin (1992, 1996a, 2009) and

Abhyankar and Graves (2001) analyzed specific types of policies, Parker and Kapuscinski (2004)

characterized the optimal policy for a two-echelon capacitated system, Huh and Janakiraman (2008)

used a sample-path approach to establish the optimality of the echelon base-stock policies, Angelus

(2011) considered the complexity of incorporating secondary market sales, and Abouee-Mehrizi et

al. (2014) conducted an exact analysis of capacitated two-echelon inventory systems with priorities.

The most closely related works to ours are Chen and Song (2001) and Muharremoglu and Tsitsiklis

(2008). These authors showed that, for a serial inventory system, a state-dependent echelon base-

stock policy is optimal. They also presented exact algorithms to compute the optimal policy. Using

a decomposition approach similar to that in Muharremoglu and Tsitsiklis (2008), Janakiraman and

Muckstadt (2009) developed efficient algorithms for computing the optimal policies for capacitated

and lost-sales serial systems. We do not consider these system features in our paper.

There has also been research in the literature on deriving solution bounds for serial inventory

systems with nonstationary demand processes. For example, Dong and Lee (2003) developed lower

bounds for optimal policies for serial systems with time-correlated demand. The time-series demand

model requires past demand information, whereas the MMD model we consider here focuses on

external factors. Shang (2012) extended the bounding approach of Shang and Song (2003) to derive

solution bounds for serial inventory systems with independent but nonidentical demands (which is

a special case of MMD). As discussed earlier, the bounding approaches employed in these papers

all build on a simplifying assumption that the optimal state-dependent echelon base-stock level is

achievable in each period. This assumption does not hold in general; as a result, the “lower bound”

obtained from these approaches may fail to bound the optimal solution under MMD. Our paper

complements this literature by deriving solution bounds that work in general under MMD.


3. Single-Stage Inventory Systems

For expositional purposes, we consider a single-stage inventory problem in this section. This allows

us to clearly introduce the key ideas to be used later in the analysis for serial inventory systems.

Specifically, the demand of a product is met with on-hand inventory in each period; when there

is stockout, we assume the unmet demand is fully backlogged. Inventory is replenished from an

external source with ample supply. The replenishment lead time is a constant of L periods. At the

end of each period, unit inventory holding cost h and unit backlog penalty cost b are charged on the

on-hand inventory and backorders, respectively. The planning horizon is infinite, and the objective

is to minimize the long-run average cost of the inventory system. Because the linear ordering cost

under the long-run average cost is a constant, we can assume the ordering cost is zero without loss

of generality (see Federgruen and Zipkin 1984).

The demand process is nonstationary and has an embedded Markov chain structure (i.e., the

MMD process). Specifically, demand in each period is a nonnegative random variable denoted by

Dk, where k is the demand state that determines the continuous demand distribution with density

function fk(·) and cumulative distribution function Fk(·). The demand state in period t follows

a Markov chain W = {W(t), t = 1, 2, ...}, with a total of K states, denoted by 1, 2, ..., K. The

Markov chain is time-homogeneous. Let P = (pij) denote the transition probability matrix, where

pij is the one-step transition probability from state i to j. Without loss of generality, we assume the

Markov chain is irreducible, which implies that all states communicate with each other. (When the

Markov chain is reducible, under the long-run average cost criterion, the problem is equivalent to

one involving only the irreducible class of the chain.) Also let Dk[t, t′] denote the total demand in

periods t, ..., t′ with the demand state being k in period t, and let Dk[t, t′) denote the total demand

in periods t, ..., t′− 1.
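As an illustration of the demand model above, the following minimal Python sketch simulates one sample path of an MMD process and aggregates demand over a window. It is not part of the paper's method. The function names `simulate_mmd` and `lead_time_demand`, the gamma demand family, and all numerical values are assumptions chosen for the example, and states are indexed from 0 rather than 1.

```python
import random

def simulate_mmd(P, shapes, scales, k0, T, rng):
    """Simulate T periods of Markov-modulated demand starting in state k0.

    P[i][j] is the one-step transition probability from state i to state j.
    Demand in state k is gamma(shapes[k], scales[k]) (an illustrative choice).
    Returns the list of visited states and the list of period demands.
    """
    K = len(P)
    states, demands = [], []
    k = k0
    for _ in range(T):
        states.append(k)                                   # state observed in this period
        demands.append(rng.gammavariate(shapes[k], scales[k]))
        k = rng.choices(range(K), weights=P[k])[0]         # move to next period's state
    return states, demands

def lead_time_demand(demands, t, L):
    """Total demand over periods t, ..., t+L (inclusive), i.e. D_k[t, t+L]."""
    return sum(demands[t:t + L + 1])

# Hypothetical two-state example: state 0 is "low" demand, state 1 is "high".
rng = random.Random(1)
P = [[0.9, 0.1], [0.3, 0.7]]
states, demands = simulate_mmd(P, shapes=[2.0, 8.0], scales=[1.0, 1.0], k0=0, T=10, rng=rng)
print(states, round(lead_time_demand(demands, t=0, L=3), 3))
```

Repeating the simulation from a fixed initial state k gives draws from the distribution of Dk[t, t+L], which is the quantity that enters the cost function introduced next.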

In each period, the sequence of events is as follows: 1) at the beginning of a period, the demand

state k is observed; 2) an inventory replenishment order is placed with the supplier; 3) a shipment

ordered L periods ago is received from the supplier; 4) demand arrives during the period; and 5)

at the end of the period, inventory holding and backorder costs are assessed.

3.1 Preliminaries

It has been shown that a state-dependent base-stock policy {s∗(k), k = 1, ..., K} is optimal for the

problem (see Iglehart and Karlin 1962, Beyer and Sethi 1997, Chen and Song 2001). The policy

works as follows: When the demand state is k, if the inventory position is below the base-stock

level s∗(k), order up to this level; otherwise, do not order.
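The policy and the sequence of events described above can be sketched as a short simulation that estimates the long-run average cost of any given set of state-dependent base-stock levels. This is only a sketch under stated assumptions: the function name `average_cost`, the gamma demand family, the 0-based state indexing, and the parameter values are hypothetical and are not taken from the paper.

```python
import random
from collections import deque

def average_cost(P, shapes, scales, s, L, h, b, T=20000, seed=0):
    """Estimate the long-run average cost of a state-dependent base-stock policy.

    s[k] is the order-up-to level used when the observed demand state is k.
    Demand in state k is gamma(shapes[k], scales[k]) (an illustrative choice).
    """
    rng = random.Random(seed)
    state = 0
    net = s[state]                    # on-hand inventory minus backorders
    pipeline = deque([0.0] * L)       # outstanding orders, oldest first
    total = 0.0
    for _ in range(T):
        # Steps 1-2: observe the state; order up to s[state] if the position is below it.
        position = net + sum(pipeline)
        pipeline.append(max(s[state] - position, 0.0))
        # Step 3: receive the order placed L periods ago.
        net += pipeline.popleft()
        # Step 4: demand arrives; unmet demand is backlogged.
        net -= rng.gammavariate(shapes[state], scales[state])
        # Step 5: end-of-period holding and backorder costs.
        total += h * max(net, 0.0) + b * max(-net, 0.0)
        state = rng.choices(range(len(P)), weights=P[state])[0]
    return total / T

# Hypothetical two-state example with lead time L = 2.
P = [[0.9, 0.1], [0.3, 0.7]]
print(average_cost(P, [2.0, 8.0], [1.0, 1.0], s=[12.0, 28.0], L=2, h=1.0, b=9.0))
```

Such a simulation can be used to compare candidate base-stock levels, for example the myopic levels and the heuristic levels developed later in this section.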

Given the demand state k and inventory position y, the single-period expected cost is

$$G(k, y) = h \cdot \mathrm{E}\big[(y - D_k[t, t+L])^+\big] + b \cdot \mathrm{E}\big[(D_k[t, t+L] - y)^+\big], \tag{1}$$


where $(x)^+ = \max\{x, 0\}$. The above function is the well-known newsvendor cost function, which

is convex in y. Its minimizer, denoted by $\bar{s}(k) = \arg\min_y G(k, y)$, is termed the myopic base-stock

level. Specifically, $\bar{s}(k)$ solves the following first-order condition:

$$\partial_y G(k, y) = -b + (b+h) F_{k,L}(y) = 0, \quad \text{or} \quad F_{k,L}(y) = \frac{b}{b+h}, \tag{2}$$

where we use “∂y” to denote the first-order derivative with respect to y, and Fk,L(·) is the cumulative

distribution of the lead time demand Dk[t, t+L]. We shall also refer to the function ∂yG(k, y) as

the “newsvendor derivative function” throughout the paper. For later usage, we define the smallest

myopic base-stock level among all states and its corresponding demand state as

$$s_{\min} = \min_{1 \le k \le K} \bar{s}(k), \qquad k_{\min} = \arg\min_{1 \le k \le K} \bar{s}(k).$$

Although the myopic base-stock level $\bar{s}(k)$ is easy to compute, it is not necessarily optimal. The

demand state in the next period may result in a (stochastically) much smaller demand distribution;

therefore, stocking up to the myopic base-stock level in the current period may cause overstocking

in the next period. More sophisticated methods are needed to determine the optimal policy.
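To make the first-order condition (2) concrete, the sketch below estimates the myopic base-stock level for a given initial state as the empirical b/(b+h) quantile of simulated lead time demand. It is a Monte Carlo illustration, not the paper's computational method. The function names, the gamma demand family, the sample size, and the 0-based state indexing are all assumptions made for the example.

```python
import random

def sample_lead_time_demand(P, shapes, scales, k, L, rng):
    """One draw of D_k[t, t+L]: demand summed over L+1 periods, starting in state k."""
    total, state = 0.0, k
    for _ in range(L + 1):
        total += rng.gammavariate(shapes[state], scales[state])
        state = rng.choices(range(len(P)), weights=P[state])[0]
    return total

def myopic_base_stock(P, shapes, scales, k, L, h, b, n=20000, seed=0):
    """Empirical b/(b+h) quantile of lead time demand, approximating the root of (2)."""
    rng = random.Random(seed)
    draws = sorted(sample_lead_time_demand(P, shapes, scales, k, L, rng) for _ in range(n))
    idx = min(n - 1, int(n * b / (b + h)))   # index of the target quantile
    return draws[idx]

# Hypothetical two-state example: the low-demand state gets the smaller myopic level.
P = [[0.9, 0.1], [0.3, 0.7]]
levels = [myopic_base_stock(P, [2.0, 8.0], [1.0, 1.0], k, L=2, h=1.0, b=9.0) for k in range(2)]
print(levels, "smallest myopic level:", min(levels))
```

Because the quantile is estimated from samples, the result carries sampling error; a closed-form or numerically inverted lead time demand distribution would replace the simulation where one is available.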

Exact Algorithm. Chen and Song (2001) developed the following K-iteration algorithm to

compute the optimal policy. The algorithm starts with a partition of the state space of W. In

each iteration i, let $V^i$ denote the set that contains the states for which we have not found the

optimal solution yet, and $U^i$ the complementary set of $V^i$. Specifically, in the first iteration i = 1,

let $V^1 = \{1, \ldots, K\}$ and $U^1 = \emptyset$. Define

$$G^1(k, y) = G(k, y)$$

as given in (1), solve the single-period problem, and let $s^1(k) = \bar{s}(k)$ denote the solution for state

k. Find the demand state that gives the smallest solution among $V^1$. Denote such a state as

$1^* = \arg\min_{k \in V^1} s^1(k) = \arg\min_{1 \le k \le K} \bar{s}(k)$. Then, the optimal solution for state $1^*$ is just

$s^*(1^*) = s^1(1^*) = s_{\min}$. Next, update the state sets by $V^2 = V^1 \setminus \{1^*\}$ and $U^2 = U^1 \cup \{1^*\}$, and proceed to

iteration i = 2. In iteration i + 1 (1 ≤ i < K), for each $k \in V^{i+1}$, solve the cost minimization problem

$\min_y G^{i+1}(k, y)$, where

$$G^{i+1}(k, y) = G^i(k, y) + \sum_{u \in U^{i+1}} p_{ku}\, \mathrm{E}\big[R^i(u, y - D_k)\big], \tag{3}$$

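and for any $u \in U^{i+1}$,

$$R^i(u, y) = \begin{cases} \displaystyle\sum_{u' \in U^{i+1}} p_{uu'}\, \mathrm{E}\big[R^i(u', y - D_u)\big] & \text{if } u \in U^i, \\[2ex] \displaystyle \bar{G}^i(i^*, y) + \sum_{u' \in U^{i+1}} p_{i^* u'}\, \mathrm{E}\big[R^i(u', y - D_{i^*})\big] & \text{if } u = i^*, \end{cases} \tag{4}$$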


with $\bar{G}^i(i^*, y) = G^i(i^*, \max\{y, s^i(i^*)\})$. It can be shown that $G^{i+1}(k, y)$ is convex in y. Let

$s^{i+1}(k) = \arg\min_y G^{i+1}(k, y)$. Also let $(i+1)^* = \arg\min_{k \in V^{i+1}} s^{i+1}(k)$, i.e., the demand state that gives the

smallest $s^{i+1}(k)$ among $V^{i+1}$. Then the optimal solution for state $(i+1)^*$ is given by

$s^*((i+1)^*) = s^{i+1}((i+1)^*)$. Update the state sets by $V^{i+2} = V^{i+1} \setminus \{(i+1)^*\}$ and $U^{i+2} = U^{i+1} \cup \{(i+1)^*\}$. Repeat

the above procedure until reaching the final iteration K.

In summary, the above algorithm finds the optimal base-stock level for each demand state by sorting

the demand states based on the solutions to the cost minimization problems $\min_y G^{i+1}(k, y)$

in each iteration. Although the algorithm is simpler and more efficient than the standard dynamic

programming algorithms, it remains quite complicated because each iteration builds on the results

obtained in the previous iterations. The main difficulty lies in determining the function Ri at each

iteration, as given in (4). For the discrete demand case, computing Ri is relatively easy, as it

involves solving a set of linear equations (see Chen and Song 2001). For the continuous demand

case, however, determining Ri requires solving a set of integral renewal equations.

To illustrate the last point more clearly and to facilitate our subsequent analysis, we introduce

the following matrix notation. Let m = (m1, ..., mK) denote a permutation of the sequence of

demand states 1, ..., K. Define the following sub-matrices of transition probabilities under m: for

i = 1, ..., K − 1,

$$\mathbf{P}^{(i)}_{m} = \begin{pmatrix} p_{m_1 m_1} & \cdots & p_{m_1 m_i} \\ \vdots & \ddots & \vdots \\ p_{m_i m_1} & \cdots & p_{m_i m_i} \end{pmatrix}. \tag{5}$$

Let m∗ = (1∗, ..., K∗) denote the optimal demand state sequence determined by the exact algorithm.

Let $\mathbf{D}^{(i)}_{m^*}(x)$ be the following diagonal matrix involving demand densities for states 1∗, ..., i∗:

$$\mathbf{D}^{(i)}_{m^*}(x) = \begin{pmatrix} f_{1^*}(x) & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & f_{i^*}(x) \end{pmatrix}. \tag{6}$$

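Also let $\mathbf{R}^i(y) = [R^i(1^*, y), R^i(2^*, y), \ldots, R^i(i^*, y)]^{T}$. Then, we can rewrite (4) in a matrix form as

follows:

$$\mathbf{R}^i(y) = \int_0^{\infty} \mathbf{D}^{(i)}_{m^*}(x) \cdot \mathbf{P}^{(i)}_{m^*} \cdot \mathbf{R}^i(y - x)\, dx + \mathbf{e}_i \cdot \bar{G}^i(i^*, y), \tag{7}$$

where $\mathbf{e}_i$ is the unit vector with the i-th element being one.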

Our Approach. Based on the exact algorithm, it is useful to let [k] denote the iteration in

which the optimal solution s∗(k) for state k is obtained. In other words,

$$s^*(k) = \arg\min_{y \ge 0} G^{[k]}(k, y).$$

Because G[k](k, y) is convex in y, its derivative is increasing in y. If we can find bounding functions

for its derivative, then the roots of the bounding functions become bounds for the optimal solution.


Thus, our approach is to develop bounding functions that involve only the primitive model parameters,

so that the resulting solution bounds will depend only on the primitive model parameters.

To this end, by repeatedly applying (3) and (4) and then taking derivatives, we observe that

$$\partial_y G^{[k]}(k, y) = \partial_y G(k, y) + \sum_{i=1}^{[k]-1} \sum_{u \in U^{i+1}} p_{ku}\, \partial_y \mathrm{E}\big[R^i(u, y - D_k)\big]. \tag{8}$$

The first term is simple, given by (2), so what remains is to find simple functions to bound

$\partial_y \mathrm{E}[R^i(u, y - D_k)]$. Note that all functions in the exact algorithm are continuously differentiable

and all the expectations are finite. Therefore, we can interchange the derivative and expectation (or

integral) operators in our subsequent derivations, e.g., $\partial_y \mathrm{E}[R^i(u, y - D_k)] = \mathrm{E}[\partial_y R^i(u, y - D_k)]$.

3.2 Derivative and Solution Bounds

We now derive bounds for $\partial_y G^{[k]}(k, y)$. Observe that $\partial_y \bar{G}^i(i^*, y) = (\partial_y G^i(i^*, y))^+$. Also, from

equation (4), it is straightforward to show that $\partial_y R^i(\cdot, y) = 0$ for $y \le s^i(i^*)$. Using a standard result in

renewal equations (Sgibnev 2001), we obtain

Lemma 1 For any $u \in U^{i+1}$, $R^i(u, y)$ is increasing convex in y. Moreover, for any k = 1, ..., K,

$G^i(k, y)$ is convex in y and $\bar{G}^i(i^*, y)$ is increasing convex in y.

According to the above Lemma, the second term of (8) is nonnegative, so we immediately have

$$\partial_y G^{[k]}(k, y) \ge \partial_y G(k, y) = -b + (b+h) F_{k,L}(y).$$

Thus, ∂yG(k, y) serves as a simple lower bound function for ∂yG[k](k, y).

We now proceed to derive an upper bound for $\partial_y G^{[k]}(k, y)$. Because W is irreducible, it follows

that $\mathbf{I}^{(i)} - \mathbf{P}^{(i)}_{m}$ is invertible for any i = 1, ..., K − 1, where $\mathbf{I}^{(i)}$ is the i × i identity matrix and $\mathbf{P}^{(i)}_{m}$

is defined in (5). Thus, we can define the following parameters: for i = 1, ..., K − 1,

$$\beta_i(m) = \frac{\big|\mathbf{I}^{(i-1)} - \mathbf{P}^{(i-1)}_{m}\big|}{\big|\mathbf{I}^{(i)} - \mathbf{P}^{(i)}_{m}\big|},$$

where | · | is the matrix determinant and we adopt the convention that $|\mathbf{I}^{(0)} - \mathbf{P}^{(0)}_{m}| = 1$. Taking the

derivative with respect to y on both sides of equation (7), we have

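$$\begin{aligned} \partial_y \mathbf{R}^i(y) &= \int_0^{\infty} \mathbf{D}^{(i)}_{m^*}(x) \cdot \mathbf{P}^{(i)}_{m^*} \cdot \partial_y \mathbf{R}^i(y - x)\, dx + \mathbf{e}_i \cdot \partial_y \bar{G}^i(i^*, y) \\ &= \int_0^{y} \mathbf{D}^{(i)}_{m^*}(x) \cdot \mathbf{P}^{(i)}_{m^*} \cdot \partial_y \mathbf{R}^i(y - x)\, dx + \mathbf{e}_i \cdot \partial_y \bar{G}^i(i^*, y) \\ &\le \mathbf{P}^{(i)}_{m^*} \cdot \partial_y \mathbf{R}^i(y) + \mathbf{e}_i \cdot \partial_y \bar{G}^i(i^*, y), \end{aligned}$$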


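where the inequality follows from the fact that $\partial_y R^i(\cdot, y)$ is increasing in y and the cumulative

distribution function is at most one. Therefore,

$$\big(\mathbf{I}^{(i)} - \mathbf{P}^{(i)}_{m^*}\big)\, \partial_y \mathbf{R}^i(y) \le \mathbf{e}_i \cdot \partial_y \bar{G}^i(i^*, y). \tag{9}$$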

Let e be the i-dimensional vector with all elements being one. With some additional argument, we

obtain the following upper bound on ∂yRi(y):

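Lemma 2 For i = 1, ..., K − 1,

$$\partial_y \mathbf{R}^i(y) \le \big(\mathbf{I}^{(i)} - \mathbf{P}^{(i)}_{m^*}\big)^{-1} \cdot \mathbf{e}_i \cdot \partial_y \bar{G}^i(i^*, y) \le \mathbf{e} \cdot \beta_i(m^*) \cdot \partial_y \bar{G}^i(i^*, y).$$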

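Applying both Lemmas 1 and 2 to the second term of (8) yields

$$\begin{aligned} \sum_{i=1}^{[k]-1} \sum_{u \in U^{i+1}} p_{ku}\, \partial_y \mathrm{E}\big[R^i(u, y - D_k)\big] &\le \sum_{i=1}^{[k]-1} \sum_{u \in U^{i+1}} p_{ku}\, \beta_i(m^*)\, \partial_y \bar{G}^i(i^*, y) \\ &\le (1 - p_{kk}) \sum_{i=1}^{K-1} \beta_i(m^*)\, \partial_y \bar{G}^i(i^*, y), \end{aligned}$$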

where the second inequality relaxes the range of the second summation to make it independent of

the set $U^i$. We can further show that (see the detailed derivation in Appendix A)

$$\sum_{i=1}^{K-1} \beta_i(m^*)\, \partial_y \bar{G}^i(i^*, y) \le \sum_{i=1}^{K-1} \alpha_i(m^*)\, (\partial_y G(i^*, y))^+ \le \Delta(y),$$

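where

$$\alpha_i(m) = \sum_{j=i+1}^{K-1} \alpha_j(m) \cdot \phi_{j,i}(m) + \beta_i(m), \quad i = 1, \ldots, K-1 \tag{10}$$

with $\phi_{j,i}(m) = [p_{m_j m_1}, p_{m_j m_2}, \ldots, p_{m_j m_i}]^{T} \big(\mathbf{I}^{(i)} - \mathbf{P}^{(i)}_{m}\big)^{-1} \mathbf{e}_i$, and

$$\Delta(y) = \max_{m \in S} \sum_{i=1}^{K-1} \alpha_i(m) \cdot (\partial_y G(m_i, y))^+, \tag{11}$$

with S being the set of all permutations of {1, ..., K}. Thus, we have obtained an upper bound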

function for ∂yG[k](k, y) based on a linear combination of simple newsvendor derivative functions.

Note that both βi(m) and αi(m) depend only on the primitive model parameters—the transition

matrix P. The following proposition summarizes the above bound results:

Proposition 1 For any state k = 1, ..., K, the following holds:

$$\partial_y G(k, y) \le \partial_y G^{[k]}(k, y) \le \partial_y G(k, y) + (1 - p_{kk})\, \Delta(y).$$
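Since the weights in the upper bound depend only on the transition matrix, they can be computed directly from the definitions of βi(m), (10), and (11). The following is a minimal pure-Python sketch under stated assumptions: the helper names (`det`, `solve`, `i_minus_p`, `beta`, `alpha`, `delta`) are hypothetical, states are indexed from 0, the permutation m is passed as a tuple, and the newsvendor derivative is supplied by the caller as a function `dG(k, y)`.

```python
from itertools import permutations

def det(A):
    """Determinant by Gaussian elimination with partial pivoting (1.0 for the empty matrix)."""
    A = [row[:] for row in A]
    n, d = len(A), 1.0
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(A[r][c]))
        if abs(A[piv][c]) < 1e-14:
            return 0.0
        if piv != c:
            A[c], A[piv] = A[piv], A[c]
            d = -d
        d *= A[c][c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            for j in range(c, n):
                A[r][j] -= f * A[c][j]
    return d

def solve(A, rhs):
    """Solve A x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [A[r][:] + [rhs[r]] for r in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                for j in range(c, n + 1):
                    M[r][j] -= f * M[c][j]
    return [M[r][n] / M[r][r] for r in range(n)]

def i_minus_p(P, m, i):
    """The i x i matrix I - P_m, built from the first i states of permutation m."""
    return [[(1.0 if r == c else 0.0) - P[m[r]][m[c]] for c in range(i)] for r in range(i)]

def beta(P, m, i):
    """beta_i(m): ratio of determinants, with the convention det(empty) = 1."""
    return det(i_minus_p(P, m, i - 1)) / det(i_minus_p(P, m, i))

def alpha(P, m):
    """alpha_1, ..., alpha_{K-1} from the backward recursion (10), as a dict keyed by i."""
    K, a = len(P), {}
    for i in range(K - 1, 0, -1):
        x = solve(i_minus_p(P, m, i), [0.0] * (i - 1) + [1.0])   # (I - P_m)^{-1} e_i
        total = beta(P, m, i)
        for j in range(i + 1, K):
            phi = sum(P[m[j - 1]][m[c]] * x[c] for c in range(i))  # phi_{j,i}(m)
            total += a[j] * phi
        a[i] = total
    return a

def delta(P, dG, y):
    """Delta(y) in (11): maximize the weighted sum over all permutations of the states."""
    K, best = len(P), 0.0
    for m in permutations(range(K)):
        a = alpha(P, m)
        best = max(best, sum(a[i] * max(dG(m[i - 1], y), 0.0) for i in range(1, K)))
    return best

P = [[0.5, 0.3, 0.2], [0.2, 0.6, 0.2], [0.1, 0.4, 0.5]]   # hypothetical transition matrix
print(alpha(P, (0, 1, 2)))
```

The enumeration in `delta` visits all K! permutations, which illustrates why the lower bound becomes expensive to compute when the number of states is large.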


When K = 1, ∆(y) ≡ 0, so the inequalities in the above proposition become binding and the

result reduces to the derivative of the single-period cost function. This is intuitive because when

K = 1, the demand process reduces to an i.i.d. process. With the derivative lower bound shown

above, it follows that the myopic base-stock level $\bar{s}(k)$ is an upper bound for the optimal solution

s∗(k). This result is implied by the exact algorithm of Chen and Song (2001). Symmetrically, from

the derivative upper bound shown above, we can obtain a lower bound for the optimal solution

s∗(k). Specifically, let $\underline{s}(k)$ be the solution to the following equation:

$$\partial_y G(k, y) + (1 - p_{kk})\, \Delta(y) = -b + (b+h) F_{k,L}(y) + (1 - p_{kk})\, \Delta(y) = 0. \tag{12}$$

We obtain the following result:

Corollary 1 For any demand state k = 1, ..., K, the following holds:

$$s_{\min} \le \underline{s}(k) \le s^*(k) \le \bar{s}(k),$$

where $\bar{s}(k)$ and $\underline{s}(k)$ are determined by (2) and (12), respectively. The inequalities become equalities

when k = kmin.

While the bounds smin and $\bar{s}(k)$ have appeared in the literature (see Song and Zipkin 1993),

the lower bound $\underline{s}(k)$ is new to the literature. This new lower bound is tighter than the existing

one, smin. Note that computing the new lower bound involves optimization over demand state

permutations, which can be challenging if there is a large number of states. To further simplify

computation, we develop easy-to-compute heuristic solutions in the next subsection.

3.3 Heuristics

The insights gained from the derivation of the derivative bounds lead us next to construct heuristics

that fall between the solution lower and upper bounds. To do so, we augment the derivative lower

bound by a positive component that is less than $(1-p_{kk})\Delta(y)$. Specifically, we first sort the myopic
base-stock levels $\bar{s}(k)$ in increasing order and let $m = (m_1, \dots, m_K)$ denote the resulting order

sequence of the states. Suppose that k = mj. Based on Lemma 2, we can make the following

approximation for the second term of ∂yG[k](k, y) in (8):

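\[
\begin{aligned}
\sum_{i=1}^{[k]-1} \sum_{u \in U^{i+1}} p_{ku}\, \partial_y \mathrm{E} R^i(u, y - D_k)
&\approx \sum_{i=1}^{j-1} p_{k m_i} \cdot \partial_y \mathrm{E} R^i(m_i, y - D_k) \\
&\approx \sum_{i=1}^{j-1} p_{k m_i} \cdot \beta_i(m) \cdot \mathrm{E}\big[(\partial_y G(m_i, y - D_k))^+\big] \\
&\approx \frac{1}{2} \sum_{i=1}^{j-1} p_{k m_i} \cdot \beta_i(m) \cdot F_k(y - \bar{s}(m_i))\, (\partial_y G(m_i, y))^+ .
\end{aligned}
\]

Here, in the first step we approximate $m^*$ by $m$ and replace $U^{i+1}$ by $m_i$. The second step follows
from Lemma 2 with $m$ replacing $m^*$. The last step is based on the following integral approximation:

\[
\mathrm{E}\big[(\partial_y G(m_i, y - D_k))^+\big]
= \int_0^{y - \bar{s}(m_i)} (\partial_y G(m_i, y - x))^+ f_k(x)\, dx
\approx \frac{1}{2} F_k(y - \bar{s}(m_i)) \cdot (\partial_y G(m_i, y))^+ .
\]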

Now let sa(k) denote a heuristic solution determined by the following equation:

\[
\partial_y G(k,y) + \frac{1}{2} \sum_{i=1}^{j-1} p_{k m_i} \cdot \beta_i(m) \cdot F_k(y - \bar{s}(m_i))\, (\partial_y G(m_i, y))^+ = 0. \tag{13}
\]

Note that equation (13) involves only a linear combination of the newsvendor derivative functions

of different demand states. Thus, solving sa(k) only requires a direct evaluation of the lead time

demand distributions of different demand states. With this heuristic, we also establish a direct

relationship between the underlying Markov chain transition probabilities and the linear weights

in equation (13), making the solution process more transparent. Moreover, we have

Corollary 2 For any demand state k = 1, ..., K, $s^{\min} \le \underline{s}(k) \le s^a(k) \le \bar{s}(k)$. The inequalities

become equalities when k= kmin.

The above Corollary shows that the heuristic solution obtained from (13) indeed falls between

the solution lower and upper bounds. A detailed illustration of our bound and heuristic results

is provided in Appendix B for a two-state example (i.e., K = 2). In the example, we show that,

when it is highly likely to transit from a high demand state to a low demand state, one should

reduce the optimal inventory level for the high demand state, and our heuristic solution moves

in the same direction as the optimal solution. In addition, our heuristic solution provides a closer

approximation to the optimal solution when there is a high chance of transiting from a low demand

state to a high demand state (see Appendix B for details).

Combining Corollaries 1 and 2, we observe that $s^*(k)$ and $s^a(k)$ fall in the same interval
$[\underline{s}(k), \bar{s}(k)]$. Therefore, if the interval is tight, the heuristic solution $s^a(k)$ becomes a close
approximation to the optimal solution $s^*(k)$. In the next subsection, we identify sufficient conditions to

ensure that such an interval is tight.

3.4 Asymptotic Analysis with Long Lead Time

Intuitively, when the replenishment lead time L increases, the lead time demands starting from

different states may converge to the same distribution. As a result, the gap between $s^{\min}$ and $\bar{s}(k)$

will narrow. Below we formally prove this result by first establishing an MMD central limit theorem

for the lead time demand.

First, consider the case in which W has a stationary distribution π = [π(1), ..., π(K)]. Let Dπ(t)

be the demand in period t under the stationary distribution, that is, Dπ(t) follows the distribution


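of $D_k$ with probability $\pi(k)$. Let $\mu_k = \mathrm{E}[D_k]$ and $\sigma_k^2 = \mathrm{var}[D_k]$. The mean and variance of $D_\pi(t)$
are given by

\[
\mu = \mathrm{E}[D_\pi(t)] = \sum_{k=1}^{K} \mu_k \pi(k), \qquad
\mathrm{var}[D_\pi(t)] = \sum_{k=1}^{K} \big(\sigma_k^2 + \tilde{\mu}_k^2\big) \pi(k), \tag{14}
\]

where $\tilde{\mu}_k = \mu_k - \mu$ is the deviation of $\mu_k$ from the overall mean $\mu$.

Let $D_\pi[t, t+L]$ denote the lead time demand under the stationary distribution $\pi$. Then

\[
\mathrm{E}[D_\pi[t, t+L]] = (L+1)\mu, \qquad
\mathrm{cov}[D_\pi(t), D_\pi(t+s)] = \sum_{k=1}^{K} \sum_{l=1}^{K} \tilde{\mu}_k \tilde{\mu}_l\, p^{(s)}_{kl}\, \pi(k), \quad s = 1, 2, \dots, \tag{15}
\]

where $p^{(s)}_{kl}$ is the s-step transition probability from state k to state l. We can further show (see the
proof of Lemma 3 in Appendix A)

\[
\sigma^2 = \lim_{L \to \infty} \frac{\mathrm{var}[D_\pi[t, t+L]]}{L+1}
= \mathrm{var}[D_\pi(t)] + 2 \sum_{s=1}^{\infty} \mathrm{cov}[D_\pi(t), D_\pi(t+s)]. \tag{16}
\]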

Thus, the variance limit σ2 contains two parts: the single-period demand variance and the covari-

ance across different periods. It is interesting to note that the covariance structure under MMD

depends only on the demand mean of each state. Thus, besides the demand variance at each state,

another major source of variability under MMD comes from the variation of the demand means.
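As an illustration of equations (14) to (16), the following sketch computes the stationary distribution, the mean µ, the single-period variance, and a truncated version of the variance limit σ². It borrows the noncyclic transition matrix and the gamma parameters of the base case in §5.1. The function names, the power-iteration solver, and the truncation at 200 lags are assumptions made for this example, not part of the paper.

```python
# Sketch of equations (14)-(16) for Markov-modulated demand.
# Assumption: the chain is irreducible and aperiodic, so power iteration converges.

def stationary_distribution(P, iters=500):
    """Approximate pi solving pi = pi P by repeated multiplication."""
    K = len(P)
    pi = [1.0 / K] * K
    for _ in range(iters):
        pi = [sum(pi[k] * P[k][l] for k in range(K)) for l in range(K)]
    return pi

def mat_mul(A, B):
    """Plain matrix product of two K x K lists of lists."""
    K = len(A)
    return [[sum(A[i][m] * B[m][j] for m in range(K)) for j in range(K)]
            for i in range(K)]

def mmd_moments(P, means, variances, max_lag=200):
    K = len(P)
    pi = stationary_distribution(P)
    mu = sum(means[k] * pi[k] for k in range(K))                  # mean in (14)
    dev = [means[k] - mu for k in range(K)]                       # mu_k - mu
    var_single = sum((variances[k] + dev[k] ** 2) * pi[k]
                     for k in range(K))                           # variance in (14)
    cov_sum = 0.0
    Ps = [row[:] for row in P]                                    # s-step matrix, s = 1
    for _ in range(max_lag):
        cov_sum += sum(dev[k] * dev[l] * Ps[k][l] * pi[k]
                       for k in range(K) for l in range(K))       # lag-s term of (15)
        Ps = mat_mul(Ps, P)
    sigma2 = var_single + 2.0 * cov_sum                           # (16), truncated sum
    return pi, mu, var_single, sigma2

# Noncyclic base case of Section 5.1: gamma demand with shape n_k and rate theta_k.
P = [[0.3, 0.1, 0.6], [0.5, 0.2, 0.3], [0.3, 0.5, 0.2]]
shapes, rates = [1.0, 2.0, 2.0], [1.0, 2.0, 0.2]
means = [n / th for n, th in zip(shapes, rates)]
variances = [n / th ** 2 for n, th in zip(shapes, rates)]
pi, mu, var_single, sigma2 = mmd_moments(P, means, variances)
print(pi, mu, var_single, sigma2)
```

The covariance terms use only the state means, which mirrors the observation above that the covariance structure under MMD depends only on the demand mean of each state.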

Let N (0,1) denote the standard normal random variable with distribution function Φ(·). We

can establish the following result for the lead time demand distribution under MMD:

Lemma 3 (MMD Central Limit Theorem) Suppose that the Markov chain W has a stationary

distribution and $D_k$ has finite moments for any demand state k. If $\sigma \neq 0$, then as $L \to \infty$,

\[
\frac{D_k[t, t+L] - (L+1)\mu}{\sigma \sqrt{L+1}} \xrightarrow{\ \text{dist.}\ } N(0,1),
\]

where µ and σ are defined in (14) and (16), respectively.

The above Lemma confirms the intuition that the lead time demands with different initial states

converge to the same distribution as the lead time increases. To prove this result, we use the

concept of α-mixing to show that demands far apart in time under MMD are almost independent

(Billingsley 1995). This finding reveals an inherent averaging effect of aggregation under MMD:

While the demand distributions of different states in a single period may be drastically different,

the aggregated (lead time) demand increasingly resembles that of an i.i.d. process as

the aggregation period increases.

Based on the result, when L is sufficiently long, the lead time demand $D_k[t, t+L]$ can be approximated
by a normal distribution with mean $(L+1)\mu$ and variance $(L+1)\sigma^2$, independent of the
initial demand state k. Consequently, the myopic base-stock level for state k can be approximated
by the following formula:

\[
\bar{s}(k) \approx (L+1)\mu + z^* \sigma \sqrt{L+1}, \quad k = 1, \dots, K, \tag{17}
\]

where $z^* = \Phi^{-1}(b/(b+h))$. In other words, all myopic base-stock levels $\bar{s}(k)$ converge to the same
value, and so does $s^{\min}$.
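Formula (17) is straightforward to evaluate once µ and σ are known. The sketch below uses Python's standard-library normal distribution for Φ⁻¹. The function name and the numerical inputs are hypothetical and serve only to illustrate the calculation.

```python
from math import sqrt
from statistics import NormalDist

def normal_base_stock(mu, sigma, L, b, h):
    """Normal approximation (17): (L+1)*mu + z* * sigma * sqrt(L+1),
    with z* = Phi^{-1}(b / (b + h))."""
    z_star = NormalDist().inv_cdf(b / (b + h))
    return (L + 1) * mu + z_star * sigma * sqrt(L + 1)

# Hypothetical inputs for illustration only (not taken from the paper).
print(normal_base_stock(mu=4.0, sigma=5.0, L=8, b=50.0, h=3.0))
```

The result does not depend on the initial state k, which is the point of the approximation: for long lead times a single number replaces K state-dependent levels.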

Next, consider a cyclic Markov chain W. In this case W does not have a stationary distribution.

However, due to its cyclic nature, as the lead time increases, the lead time demands starting from

different states share a growing common component. As a result, we can show that, for any two

states k and k′, the lead time demand distributions Fk,L(y) and Fk′,L(y) converge to the same

limit distribution for every y as L goes to infinity. Thus, it continues to hold that all myopic base-

stock levels s(k) converge to the same value in this case. The following proposition summarizes our

discussion above and further determines the convergence rate:

Proposition 2 Suppose that the Markov chain W either has a stationary distribution or is cyclic,

and that Dk has finite moments for any demand state k. For any demand state k= 1, ...,K,

\[
\lim_{L \to \infty} \sqrt{L+1} \cdot \frac{\bar{s}(k) - s^{\min}}{s^{\min}} = 0.
\]

The above proposition shows that the relative percentage error between $\bar{s}(k)$ and $s^{\min}$ converges
to zero as L goes to infinity, at a rate of $o(1/\sqrt{L+1})$. Thus, when the

lead time is sufficiently long, we can use the closed-form expression (17) or the more sophisticated

sa(k) (Corollary 2) to approximate s∗(k). In proving the above convergence result, we leverage

the fact that $s^{\min} = \min_{k \in \{1,\dots,K\}} \bar{s}(k)$, which is the minimum of the myopic base-stock levels of

different states. Since the myopic base-stock level of a given state depends only on the lead time

demand distribution associated with that state, we can then apply the MMD central limit theorem

to obtain the convergence rate result for our solution bounds.

4. Serial Inventory Systems

In this section we present our main results for the general serial inventory system with N > 1

stages. Specifically, random customer demand arises in every period at Stage 1, Stage 1 orders

from Stage 2, ..., and Stage N orders from an external supplier with ample supply. We also call

the external supplier Stage N + 1. If there is stockout at Stage 1, we assume the unmet demand

is fully backlogged. The production-transportation lead time from Stage n + 1 to Stage n is a

constant of $L_n$ periods. Let $L'_n = \sum_{j=1}^{n} L_j$ denote the total lead time from Stage n + 1 to Stage 1.

At the end of each period, a unit (installation) inventory holding cost Hn is charged for on-hand


inventories at Stage n (1 ≤ n ≤ N) and a unit backlog penalty cost b is charged for backlogs at

Stage 1. Following the convention in the literature, we define the echelon inventory holding cost at

Stage n as hn =Hn−Hn+1 > 0, with HN+1 = 0. The demand process at Stage 1 follows the MMD

process as described in §3.

We assume that all the replenishment activities in a period happen at the beginning of the

period after observing the demand state. At Stage n (n> 1), the sequence of events is as follows:

an order from Stage n− 1 is received, an order is placed with Stage n+ 1, a shipment is received

from Stage n+ 1, and then a shipment is sent to downstream Stage n− 1. For Stage 1, an order is

placed at the beginning of the period, a shipment is received from Stage 2, and then the demand

arrives during the period. The planning horizon is infinite and the objective is to minimize the

long-run average cost of the serial inventory system. As in the single-stage problem, we assume the

ordering cost is zero without loss of generality. In what follows, we shall assign a subscript “n” to

the corresponding functions and variables for Stage n.

4.1 Preliminaries

Let ILn(t) denote the echelon inventory level at Stage n by the end of period t, which includes the

total on-hand inventory at Stages 1, ..., n, plus the total inventory in transit to Stages 1, ..., n− 1,

minus the backorders at Stage 1. Also let B(t) denote the backorder level at Stage 1 by the end

of period t. Then, the system inventory holding and penalty cost at the end of period t can be

written as

\[
\sum_{n=2}^{N} H_n [IL_n(t) - IL_{n-1}(t)] + H_1 [IL_1(t) + B(t)] + b B(t)
= \sum_{n=1}^{N} h_n IL_n(t) + (b + H_1) B(t),
\]

where the right-hand side is a sum of a sequence of echelon-related costs from Echelon 1 to N.
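The identity between the installation-based and echelon-based cost expressions can be checked numerically. The following sketch evaluates both sides for arbitrary inputs; the function names and the sample inventory values are assumptions chosen for illustration, while the cost parameters are those of the base case in §5.1.

```python
import random

def installation_cost(H, IL, B, b):
    """Left-hand side: installation holding costs plus backlog penalty.
    H[n-1] is H_n and IL[n-1] is the echelon inventory level IL_n."""
    cost = H[0] * (IL[0] + B) + b * B
    cost += sum(H[n] * (IL[n] - IL[n - 1]) for n in range(1, len(H)))
    return cost

def echelon_cost(H, IL, B, b):
    """Right-hand side: echelon holding costs h_n = H_n - H_{n+1}, with H_{N+1} = 0."""
    N = len(H)
    h = [H[n] - (H[n + 1] if n + 1 < N else 0.0) for n in range(N)]
    return sum(h[n] * IL[n] for n in range(N)) + (b + H[0]) * B

# Base-case costs of Section 5.1; the inventory levels are made up for the example.
H, b = [3.0, 1.0], 50.0
IL, B = [-2.0, 3.0], 2.0   # two units backlogged at Stage 1, five units upstream
print(installation_cost(H, IL, B, b), echelon_cost(H, IL, B, b))

# The identity is purely algebraic, so it also holds for random inputs.
rng = random.Random(0)
for _ in range(100):
    IL_rand = [rng.uniform(-10, 10), rng.uniform(-10, 20)]
    B_rand = rng.uniform(0, 10)
    assert abs(installation_cost(H, IL_rand, B_rand, b)
               - echelon_cost(H, IL_rand, B_rand, b)) < 1e-9
```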

Consider first the Echelon 1 cost. Under the long-run average cost criterion, we can charge the cost

of period t+L1 to period t without affecting the cost assessment. Specifically, given the demand

state k and inventory position y at the beginning of period t, the expected cost at the end of period

t+L1 is given by

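\[
\mathrm{E}\big[h_1 IL_1(t+L_1) + (b+H_1) B(t+L_1)\big]
= h_1 \cdot \mathrm{E}\big[(y - D_k[t, t+L_1])^+\big] + (b+H_2) \cdot \mathrm{E}\big[(D_k[t, t+L_1] - y)^+\big].
\]

Thus, we can define the single-period expected cost function at Echelon 1 as

\[
G_1^1(k, y) = h_1 \cdot \mathrm{E}\big[(y - D_k[t, t+L_1])^+\big] + (b+H_2) \cdot \mathrm{E}\big[(D_k[t, t+L_1] - y)^+\big]. \tag{18}
\]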
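The penalty coefficient b + H2 in (18), rather than b + H1, may look surprising at first. A short derivation, assuming the standard relations that the echelon inventory level at the end of period t + L1 equals y minus the lead time demand and that the backlog is the positive part of the shortfall, makes the step explicit (D abbreviates the lead time demand):

```latex
\begin{align*}
IL_1(t+L_1) &= y - D, \qquad B(t+L_1) = (D - y)^+, \qquad D := D_k[t, t+L_1], \\
h_1\, IL_1(t+L_1) + (b+H_1)\, B(t+L_1)
  &= h_1 \big[(y-D)^+ - (D-y)^+\big] + (b+H_1)(D-y)^+ \\
  &= h_1 (y-D)^+ + (b + H_1 - h_1)(D-y)^+ \\
  &= h_1 (y-D)^+ + (b + H_2)(D-y)^+ ,
\end{align*}
```

where the last step uses h1 = H1 − H2. Taking expectations gives the right-hand side of (18).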

Recall that when N = 1, h1 =H1 and H2 = 0. Therefore, in the single-stage system, G11(k, y) reduces

to G(k, y) as defined in (1). Chen and Song (2001) showed that the echelon decomposition technique


of Clark and Scarf (1960) for the i.i.d. demand case can be extended to the MMD process. That

is, the optimal state-dependent echelon base-stock policy can be determined sequentially from

Echelons 1 to N , with the aid of an induced penalty function between the successive echelons.

Exact Algorithm. Chen and Song (2001) developed the following algorithm for computing the

optimal policy. Start with Stage 1. Apply the K-iteration algorithm described in §3.1 to G11(k, y)

defined in (18), and obtain the optimal echelon base-stock levels s∗1(k) for Stage 1. For Stage n≥ 2,

first define the following induced penalty functions from stage n− 1 to stage n: for k= 1, ...,K,

\[
G_{n-1,n}(k, y) = G_{n-1}^{[k]}\big(k, \min\{y, s_{n-1}^*(k)\}\big) - G_{n-1}^{[k]}\big(k, s_{n-1}^*(k)\big), \tag{19}
\]

where G[k]n−1(k, ·) is the function obtained from the K-iteration algorithm for Stage n−1 (see §3.1).

With the induced penalty functions, define the following cost functions for Stage n: for k= 1, ...,K,

\[
G_n^1(k, y) = \mathrm{E}\big[h_n (y - D_k[t, t+L_n]) + G_{n-1,n}\big(W_k(t+L_n),\, y - D_k[t, t+L_n)\big)\big], \tag{20}
\]

where Wk(t+Ln) is the demand state at the beginning of period t+Ln given the state k in period t.

Now apply the K-iteration algorithm again to G1n(k, y), and obtain the optimal echelon base-stock

levels s∗n(k) for Stage n. Repeat the above procedure until reaching the final Stage N .

Thus, the computation procedure for the N -stage system requires N ×K iterations. Besides the

difficulty in determining the function Rin at each iteration as discussed in §3.1, determining the

induced penalty function Gn−1,n in the objective function (20) presents an additional challenge.

Our Approach. As in §3.2, we seek to derive simple bounds for the optimal policy by performing

a derivative analysis of the two key equations (19) and (20) in the exact algorithm. Analogous to

the expression (8) in the single-stage system, we can show

\[
\partial_y G_n^{[k]}(k, y) = \partial_y G_n^1(k, y) + \sum_{i=1}^{[k]-1} \sum_{u \in U_n^{i+1}} p_{ku}\, \partial_y \mathrm{E} R_n^i(u, y - D_k). \tag{21}
\]

For each Stage n and demand state k, we define the following newsvendor cost function parameterized
by ζ ($h_n \le \zeta \le \sum_{i=1}^{n} h_i$):

\[
G_n(k, y|\zeta) = \zeta \cdot \mathrm{E}\big[(y - D_k[t, t+L'_n])^+\big] + (b + H_{n+1}) \cdot \mathrm{E}\big[(D_k[t, t+L'_n] - y)^+\big]. \tag{22}
\]

Its derivative function ∂yGn(k, y|ζ) =−(b+Hn+1) + (b+Hn+1 + ζ)Fk,L′n(y) will play an important

role in our bound and heuristic development. Note that when N = 1, ζ ≡ h1 = h, H2 = 0, and the

above newsvendor cost function reduces to (1) .


4.2 Derivative and Solution Bounds

For illustrative purposes, consider the cost function for Stage 2. From (19) and Proposition 1, we

have

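\[
\partial_y G_{1,2}(k, y) = \partial_y G_1^{[k]}\big(k, \min\{y, s_1^*(k)\}\big)
\ge \partial_y G_1^1\big(k, \min\{y, s_1^*(k)\}\big) = \min\big\{0, \partial_y G_1^1(k, y)\big\}.
\]

Because $H_1 \ge H_2$,

\[
\partial_y G_1^1(k, y) = -(b+H_2) + (b+H_1) F_{k,L_1}(y) \ge -(b+H_2) + (b+H_2) F_{k,L_1}(y).
\]

Observe that the right-hand side of the above inequality is never positive. It follows that

\[
\partial_y G_{1,2}(k, y) \ge -(b+H_2) + (b+H_2) F_{k,L_1}(y).
\]

With this inequality, we obtain

\[
\begin{aligned}
\partial_y G_2^1(k, y) &= h_2 + \mathrm{E}\big[\partial_y G_{1,2}\big(W_k(t+L_2),\, y - D_k[t, t+L_2)\big)\big] \\
&\ge h_2 + \mathrm{E}\big[-(b+H_2) + (b+H_2) F_{W_k(t+L_2),L_1}\big(y - D_k[t, t+L_2)\big)\big] \\
&= -(b+H_3) + (b+H_2) F_{k,L'_2}(y) = \partial_y G_2(k, y|h_2),
\end{aligned}
\]

where the second equality follows from $\mathrm{E}\big[F_{W_k(t+L_2),L_1}(y - D_k[t, t+L_2))\big] = F_{k,L'_2}(y)$. By an argument
analogous to that used in Proposition 1, we have

\[
\partial_y G_2^{[k]}(k, y) \ge \partial_y G_2(k, y|h_2). \tag{23}
\]

Repeating the above argument from Stage 2 to Stage n, we can obtain a lower bound for $\partial_y G_n^{[k]}(k, y)$.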

Developing the derivative upper bound for the serial system, however, is more complex, because

we need to bound the induced penalty function Gn−1,n in (20). Nevertheless, we can show the

following result, which generalizes Proposition 1 to the serial system. Denote $h_{[1,n]} = \sum_{i=1}^{n} h_i$.

Proposition 3 For any stage n= 1, ...,N and any state k= 1, ...,K, the following holds:

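(a) $\partial_y G_n^{[k]}(k, y) \ge \partial_y G_n(k, y|h_n)$,

(b) $\partial_y G_n^{[k]}(k, y) \le \partial_y G_n(k, y|h_{[1,n]}) + (1 - p_{kk}) \Delta_n(y) + \sum_{j=1}^{n-1} \Delta_{j,n}(k, y)$,

where $\Delta_n(y)$ and $\Delta_{j,n}(k, y)$ are recursively defined as follows: for n = 1, ..., N,

\[
\Delta_n(y) = \max_{m \in S} \Big\{ \sum_{i=1}^{K-1} \alpha_i(m) \Big[ \partial_y G_n(k, y|h_{[1,n]}) + \sum_{j=1}^{n-1} \Delta_{j,n}(m_i, y) \Big]^+ \Big\},
\]

with $\Delta_{j,n}(k, y) = \mathrm{E}\big[\Delta_j\big(y - D_k[t, t + L'_n - L'_j)\big)\big]$ for $1 \le j \le n-1$ and $\alpha_i(m)$ given by (10).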


It is straightforward to verify that ∆1(y) = ∆(y) as defined in (11). Thus, the above result

reduces to that of Proposition 1 when N = 1. When N > 1, there is an extra term $\sum_{j=1}^{n-1} \Delta_{j,n}(k, y)$

on the right-hand side of the inequality. This term captures the dependence of Echelon n on all

the downstream echelon base-stock levels from Echelons 1 to n− 1.

Based on Proposition 3, let $\bar{s}_n(k)$ be the solution to the equation

\[
\partial_y G_n(k, y|h_n) = -(b + H_{n+1}) + (b + H_n) F_{k,L'_n}(y) = 0, \tag{24}
\]

and let $\underline{s}_n(k)$ be the solution to the equation

\[
\partial_y G_n(k, y|h_{[1,n]}) + (1 - p_{kk}) \Delta_n(y) + \sum_{j=1}^{n-1} \Delta_{j,n}(k, y) = 0. \tag{25}
\]

It follows that

Corollary 3 For any stage n = 1, ..., N and any state k = 1, ..., K, $\underline{s}_n(k) \le s_n^*(k) \le \bar{s}_n(k)$.

The above result generalizes Corollary 1 to serial inventory systems. To our knowledge, Corollary

3 presents the first general, analytical solution bounds for serial inventory systems with MMD. The

existing bounding approaches for problems with nonstationary demand all build on a simplifying

assumption that the optimal state-dependent echelon base-stock level is achievable in each period

(e.g., Dong and Lee 2003, Theorem 6 and Shang 2012, Theorem 1). This assumption, however, does

not hold in general, and the “lower bound” obtained from these approaches may fail to bound the

optimal solution under MMD. In contrast, our derivative-based bounding approach does not

require such an assumption and accounts for the dependence on the downstream echelon base-stock

levels through the ∆ terms in Proposition 3. Thus, our solution bounds work in general.

To see the above point more clearly, observe that by ignoring the ∆ terms in the derivative upper

bound in Proposition 3, we obtain a newsvendor derivative function. Let $s_n^v(k)$ be the root of this
function, i.e., the solution to

\[
\partial_y G_n(k, y|h_{[1,n]}) = -(b + H_{n+1}) + (b + H_1) F_{k,L'_n}(y) = 0. \tag{26}
\]

Thus, $s_n^v(k)$ is the “lower bound” for the optimal solution $s_n^*(k)$ under the assumption that the
optimal state-dependent echelon base-stock level is achievable in each period (e.g., Dong and Lee
2003 and Shang 2012). However, for Stage 1, we have $h_{[1,1]} = h_1$, so equations (24) and (26) coincide
and, from Proposition 3 and Corollary 3, $s_1^v(k) = \bar{s}_1(k)$. Hence, $s_1^v(k)$ is not a solution lower bound,
but instead a solution upper bound under MMD. For Stage n > 1, it is also not guaranteed that
$s_n^v(k) \le s_n^*(k)$ under MMD (see Table 1 in §5 for the counterexamples marked with a “†” sign).
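Equations (24) and (26) are newsvendor conditions, so each reduces to a critical fractile of the lead time demand distribution: (b + H_{n+1})/(b + H_n) for (24) and (b + H_{n+1})/(b + H_1) for (26). The sketch below tabulates both for the base-case cost parameters of §5.1; the function name is an assumption introduced for this example.

```python
def newsvendor_fractiles(H, b):
    """Critical fractiles implied by (24) and (26) for stages n = 1..N.
    H[n-1] is the installation holding cost H_n, and H_{N+1} = 0."""
    N = len(H)
    fractiles = []
    for n in range(1, N + 1):
        H_next = H[n] if n < N else 0.0
        upper_bound_fractile = (b + H_next) / (b + H[n - 1])   # root of (24)
        newsvendor_fractile = (b + H_next) / (b + H[0])        # root of (26)
        fractiles.append((upper_bound_fractile, newsvendor_fractile))
    return fractiles

# Base case of Section 5.1: b = 50, H1 = 3, H2 = 1.
for n, (f24, f26) in enumerate(newsvendor_fractiles([3.0, 1.0], 50.0), start=1):
    print(n, round(f24, 4), round(f26, 4))
```

At Stage 1 the two fractiles are equal, which is the reason the newsvendor solution coincides with the upper bound there. At Stage 2 the fractile of (26) is the smaller one, so the newsvendor solution lies below the upper bound.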

When the state space of the Markov chain degenerates to a singleton, i.e., K = 1, the demand

process becomes i.i.d. In this case, the optimal state-dependent echelon base-stock level is always


achievable in each period, and the ∆ terms in Proposition 3 vanish. As a result, ∂yG[k]n (k, y) is

bounded below and above by simple newsvendor derivative functions and svn(k) becomes a true

lower bound in this case. It is interesting to note that under the i.i.d. demand process, the solution

bounds obtained from our derivative bounds coincide with those obtained by Shang and Song

(2003) through their single-stage approximation approach. However, their bounding approach does

not extend to the more general MMD process, as it also requires the assumption that the optimal

state-dependent echelon base-stock level is achievable in each period (see Appendix C for a detailed

discussion).

Our solution bounds can be computed without solving the integral renewal equations required

in the exact algorithm. However, computing the solution lower bound involves optimization over

demand state permutations, which can be challenging if there is a large number of states. To further

simplify computation, we develop easy-to-compute heuristic solutions in the next subsection.

4.3 Heuristics

Based on the definition of $s_n^v(k)$ given in (26), by Proposition 3, it follows that $\underline{s}_n(k) \le s_n^v(k) \le \bar{s}_n(k)$.

Hence, although svn(k) is not guaranteed to be a solution lower bound, it can still serve as a simple

heuristic solution for our problem.

We can further improve this heuristic based on the idea used in the single-stage problem in

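§3.3. Specifically, we can sort $s_n^v(k)$ in increasing order and let $m_n = (m_{n,1}, \dots, m_{n,K})$ denote the
resulting order sequence of the states. Suppose that $k = m_{n,j}$. Based on Lemma 2, we can make
the following approximation (parameterized by ζ) for $\partial_y G_n^{[k]}(k, y)$ in (21):

\[
\begin{aligned}
\partial_y G_n^{[k]}(k, y)
&\approx \partial_y G_n(k, y|\zeta) + \sum_{i=1}^{j-1} p_{k m_{n,i}} \cdot \partial_y \mathrm{E} R_n^i(m_{n,i}, y - D_k) \\
&\approx \partial_y G_n(k, y|\zeta) + \sum_{i=1}^{j-1} p_{k m_{n,i}} \cdot \beta_i(m_n) \cdot \mathrm{E}\big[(\partial_y G_n(m_{n,i}, y - D_k|\zeta))^+\big] \\
&\approx \partial_y G_n(k, y|\zeta) + \frac{1}{2} \sum_{i=1}^{j-1} p_{k m_{n,i}} \cdot \beta_i(m_n) \cdot F_k(y - s_n^v(m_{n,i}))\, (\partial_y G_n(m_{n,i}, y|\zeta))^+ ,
\end{aligned}
\]

where the parameter ζ takes values in $h_n \le \zeta \le h_{[1,n]}$, and the last step follows from the same integral
approximation as in the single-stage problem in §3.3.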

Next, let $s_n^a(k|\zeta)$ denote a heuristic solution determined by the following equation:

\[
\partial_y G_n(k, y|\zeta) + \frac{1}{2} \sum_{i=1}^{j-1} p_{k m_{n,i}} \cdot \beta_i(m_n) \cdot F_k(y - s_n^v(m_{n,i}))\, (\partial_y G_n(m_{n,i}, y|\zeta))^+ = 0. \tag{27}
\]

Note that equation (27) involves only a linear combination of the newsvendor derivative functions
of different demand states. Thus, solving for the heuristic solution $s_n^a(k|\zeta)$ only requires a direct
evaluation of the lead time demand distributions of different demand states. With this heuristic,
we also establish a mapping between the underlying Markov chain transition probabilities and the
linear weights in equation (27), making the solution process more transparent. Moreover, we have
the following result:

Corollary 4 For any stage n = 1, ..., N and any state k = 1, ..., K, $\underline{s}_n(k) \le s_n^a(k|\zeta) \le \bar{s}_n(k)$.

Combining Corollaries 3 and 4, we observe that $s_n^a(k|\zeta)$ and the optimal solution $s_n^*(k)$ fall in the
same interval $[\underline{s}_n(k), \bar{s}_n(k)]$, suggesting that $s_n^a(k|\zeta)$ can be a close approximation for the optimal
solution $s_n^*(k)$ as long as the interval is tight. In the next subsection we identify sufficient conditions
to ensure that such an interval is tight.

4.4 Asymptotic Analysis with Long Lead Time

Recall from Corollary 1 that in the single-stage system the solution lower bound $\underline{s}(k)$ is further
bounded below by $s^{\min} = \min_{k \in \{1,\dots,K\}} \bar{s}(k)$. However, in the serial inventory system, this property
is no longer available. Instead, we need to work directly with $s_n^{\min} = \min_{k \in \{1,\dots,K\}} \underline{s}_n(k)$ for n ≥ 1,
with $\underline{s}_n(k)$ determined from the derivative upper bound expression (25).

From Proposition 3, the derivative upper bound for the serial inventory system is more complex

than that for the single-stage system. Thus, in order to apply the MMD central limit theorem to

derive asymptotic results, we need to further relax the derivative upper bound to obtain a more

workable expression (see the proof of Proposition 4 in Appendix A). Overcoming these technical

difficulties, we can establish the following result, which generalizes Proposition 2 to serial inventory

systems.

Proposition 4 Suppose that the Markov chain W either has a stationary distribution or is cyclic,

and that $D_k$ has finite moments for any demand state k. For any Stage n = 1, ..., N and any demand
state k = 1, ..., K,

\[
\limsup_{L'_n \to \infty} \sqrt{L'_n + 1} \cdot \frac{\bar{s}_n(k) - s_n^{\min}}{s_n^{\min}} = c,
\]

where c is a nonnegative constant.

The above proposition shows that the relative percentage error between $\bar{s}_n(k)$ and $s_n^{\min}$ converges
to zero as the total lead time $L'_n$ goes to infinity. The convergence rate is $O(1/\sqrt{L'_n + 1})$, where $L'_n$ is

the total lead time from Stage n+ 1 to Stage 1. Therefore, according to Corollary 4, the heuristic

solution san(k|ζ) is guaranteed to be a close approximation to the optimal solution s∗n(k) when

the total lead time L′n is sufficiently long. In other words, when the lead time is sufficiently long,

the seemingly complex supply chain problem with MMD has surprisingly simple solutions. This

is because the aggregated (lead time) demand under MMD increasingly resembles that of
an i.i.d. process as the lead time increases. Moreover, in the serial inventory system, the total

lead time at an upstream stage is always greater than the total lead time at a downstream stage.

Thus, Proposition 4 suggests that the heuristic solution at the upstream stage tends to be a closer

approximation to the optimal solution than does the heuristic solution at the downstream stage.

Finally, the proof of the above result makes a methodological contribution to the literature:
one can apply our analysis approach to establish similar results for other inventory systems.

5. Numerical Studies

In this section we conduct numerical studies to evaluate our bound and heuristic performance. With
the aid of the Laplace transform and its inverse, we find that the optimal policy for our problem becomes
computationally tractable under gamma demand distributions (see Appendix C for details). This

finding enables us to benchmark our bounds and heuristic solutions against the optimal solution.

Specifically, we assume that the single-period demand Dk under demand state k follows a gamma

probability density

\[
f(x|n_k, \theta_k) = \frac{(\theta_k)^{n_k} x^{n_k - 1} e^{-\theta_k x}}{\Gamma(n_k)}, \quad \text{with } x \ge 0, \tag{28}
\]

where nk is the shape parameter and θk the rate parameter (1/θk is the scale parameter). The

demand mean is $\mu_k = n_k/\theta_k$. It is worth emphasizing that, unlike the optimal policy, our solution
bounds and heuristic solutions can be computed under any demand distribution.

We focus on presenting numerical results of two-stage systems. The numerical insights from

systems with more than two stages are similar to what we report below, and are thus omitted

for brevity. Recall from our asymptotic analysis that, as the number of stages increases, the heuristic

solutions at upstream stages tend to be closer approximations to the optimal solution than those

at the downstream stages. Also note that the heuristic solutions (as well as the optimal solution)

are computed echelon by echelon from downstream to upstream stages, according to the echelon

decomposition technique of Clark and Scarf (1960). For each echelon, the solutions are computed

by assuming ample supply at the immediate upstream stage. Thus, the Echelons 1 and 2 solutions

in a two-stage system would be the same as the Echelons 1 and 2 solutions in an N -stage system

(with N ≥ 2). Combining this observation with our asymptotic results, we believe that focusing on

two-stage systems in our numerical studies is sufficient, as the numerical performance observed in
such systems is clearly informative of what to expect in systems with more stages.

5.1 Base Case

We first present results for two demand cases. Both involve three demand states, with each state

following a gamma distribution. The parameters for these gamma distributions are (ni, θi) = (1,1),

(2,2), and (2,0.2), for i = 1,2,3, representing demand means of 1, 1, and 10, respectively. The

transition matrices for these two cases are given below:

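\[
\text{Cyclic demand: }
\begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix},
\qquad
\text{Noncyclic demand: }
\begin{pmatrix} 0.3 & 0.1 & 0.6 \\ 0.5 & 0.2 & 0.3 \\ 0.3 & 0.5 & 0.2 \end{pmatrix}.
\]
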

We set the unit penalty cost to be b= 50, and the unit echelon holding costs to be h1 = 2 and

h2 = 1 for Stages 1 and 2, respectively. This echelon holding cost structure corresponds to unit

installation holding costs of H1 = 3 for Stage 1 and H2 = 1 for Stage 2, representing the value-

adding process of moving product from Stage 2 to Stage 1. Based on these cost parameters, the

implicit fill rate at the customer-facing Stage 1 is 50/(50 + 3) = 94.3%. The lead times are set as

(L1,L2) = (1,1), (1,3), and (3,1).
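To make the base case concrete, the sketch below solves the Stage 1 newsvendor equation (24) for the cyclic demand case with L1 = 1 and initial state 1. It assumes that the lead time demand then spans two periods, one in state 1 (gamma with shape 1 and rate 1) and one in state 2 (gamma with shape 2 and rate 2). The closed-form distribution function of their sum is derived here by direct convolution and is not taken from the paper; the bisection routine and all names are likewise assumptions for illustration.

```python
from math import exp

def lead_time_cdf(y):
    """CDF of Exp(1) + Gamma(shape 2, rate 2), obtained by convolving the
    two densities (an assumption of this sketch, not a formula from the paper)."""
    if y <= 0.0:
        return 0.0
    return 1.0 - 4.0 * exp(-y) + (3.0 + 2.0 * y) * exp(-2.0 * y)

def bisect_root(fn, lo, hi, tol=1e-10):
    """Root of an increasing function fn on [lo, hi] by bisection."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if fn(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

b, H1, H2 = 50.0, 3.0, 1.0

def stage1_derivative(y):
    """Newsvendor derivative in (24) for n = 1: -(b + H2) + (b + H1) F(y)."""
    return -(b + H2) + (b + H1) * lead_time_cdf(y)

s_upper = bisect_root(stage1_derivative, 0.0, 50.0)
print(round(s_upper, 2))
```

Under these assumptions the root is approximately 4.63, which agrees with the Stage 1 entry that Table 1 reports for state 1 under cyclic demand with (L1, L2) = (1, 1).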

For each scenario, we compute five solutions: the optimal solution $s_n^*(k)$, the lower bound $\underline{s}_n(k)$,
the upper bound $\bar{s}_n(k)$, the newsvendor heuristic $s_n^v(k)$ defined by (26), and the derivative-based
heuristic $s_n^a(k|\zeta)$ defined by (27). Because we have two stages and three states, we compare the
latter four policies with the optimal solution using the following weighted relative solution error
percentage metric. Specifically, for a given policy $s_n(k)$, the weighted relative solution error percentage
is defined as

\[
\frac{\sum_{k,n} |s_n(k) - s_n^*(k)|}{\sum_{k,n} s_n^*(k)} \times 100\%.
\]
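The metric can be evaluated directly from the reported base-stock levels. As a small worked example, the sketch below applies it to the noncyclic demand case with (L1, L2) = (1, 1), using the values listed in Table 1 for the optimal policy and the two heuristics. The function name is an assumption for this illustration.

```python
def weighted_relative_error(policy, optimal):
    """Sum over (k, n) of |s_n(k) - s*_n(k)|, divided by the sum of s*_n(k), in percent."""
    numerator = sum(abs(p - o) for p, o in zip(policy, optimal))
    return 100.0 * numerator / sum(optimal)

# Noncyclic demand, (L1, L2) = (1, 1), from Table 1.
# Order: Stage 1 states 1-3, then Stage 2 states 1-3.
s_opt = [23.13, 19.03, 29.74, 27.60, 25.22, 36.48]
s_newsvendor = [23.40, 19.03, 31.57, 28.60, 25.94, 36.43]
s_derivative = [23.22, 19.03, 30.07, 28.48, 25.94, 35.08]

print(round(weighted_relative_error(s_newsvendor, s_opt), 1))   # 2.4
print(round(weighted_relative_error(s_derivative, s_opt), 1))   # 2.1
```

Both figures match the solution error percentages reported in Table 1 for this scenario.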

We evaluate the long-run average system costs under the policies of s∗n(k), svn(k), and san(k|ζ).

For each parameter scenario, we simulate the system for 50,000 periods and compute the average

cost over these 50,000 periods. To reduce variance in the simulation results, we use the same

random demand sample path for different policies. For the heuristic policies svn(k) and san(k|ζ), we

compute the relative cost error percentage with respect to the cost under the optimal policy s∗n(k).

For the heuristic san(k|ζ), we find that it performs well in most cases when ζ = h[1,n], so we let

san(k|ζ) = san(k|h[1,n]) in our reporting below.
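The demand side of such a simulation can be sketched as follows. The code generates one Markov-modulated demand sample path with gamma demands; fixing the seed reproduces the same path, so that different policies can be evaluated on common random numbers. The function name, the seeding scheme, and the zero-based state labels are assumptions of this sketch, and the inventory dynamics themselves are not modeled here.

```python
import random

def simulate_mmd_path(P, shapes, rates, periods, start_state=0, seed=0):
    """Return (states, demands) for one sample path of Markov-modulated demand.
    States are zero-based; the state is observed before demand is drawn."""
    rng = random.Random(seed)
    states, demands = [], []
    k = start_state
    for _ in range(periods):
        states.append(k)
        # random.gammavariate takes (shape, scale); the scale is 1 / rate.
        demands.append(rng.gammavariate(shapes[k], 1.0 / rates[k]))
        # Draw the next state from row k of the transition matrix.
        k = rng.choices(range(len(P)), weights=P[k])[0]
    return states, demands

cyclic = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
shapes, rates = [1.0, 2.0, 2.0], [1.0, 2.0, 0.2]

states, demands = simulate_mmd_path(cyclic, shapes, rates, periods=6, seed=42)
print(states)   # the cyclic chain visits 0, 1, 2, 0, 1, 2
```

Calling the function twice with the same seed returns identical demand paths, which is what allows a like-for-like cost comparison across policies.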

The detailed numerical results for the three-state demand cases are given in Table 1. From the

table, the foremost observation is that the solution “lower bound” svn(k) obtained by the bounding

approaches of Dong and Lee (2003) and Shang (2012) may be greater than the optimal solution

s∗n(k). We mark all such counterexamples with a “†” sign in the table. The numerical results confirm

our earlier discussion that sv1(k) is actually a solution upper bound and sv2(k) may be greater than

s∗2(k) in many cases.

Both $s_n^a(k|\zeta)$ and $s_n^v(k)$ are good approximations to the optimal solution, with the derivative-based
heuristic $s_n^a(k|\zeta)$ outperforming $s_n^v(k)$ in most cases. As the Stage 1 lead time L1 increases
from one to three periods, the echelon base-stock levels for different states become close to each
other, and both heuristics’ approximation to the optimal solution improves, resulting in less than
1% system cost errors. This corroborates the convergence result of Proposition 4 and suggests that
the heuristics can achieve near-optimal performance even under relatively short lead times. It is also
interesting to note that under the cyclic demand with L1 = L2 = 1, the echelon base-stock levels for
$s_2^a(k|\zeta)$ are identical across all states, as are those for $\bar{s}_2(k)$ and $s_2^v(k)$. This is because $(L'_2 + 1) \bmod K = 0$,
which leads to identical lead time demand distributions across initial states.

Table 1 Comparison of policies in two-stage systems with three-state demand.
Columns: optimal solution s*, lower bound, upper bound, newsvendor heuristic s^v, derivative-based heuristic s^a.

                  Cyclic demand                                Noncyclic demand
(L1,L2)     n  k         s*    lower    upper      s^v      s^a       s*    lower    upper      s^v      s^a
(1,1)       1  1       4.63     4.63     4.63     4.63     4.63    23.13    21.12    23.40   23.40†    23.22
               2      24.79    19.70    26.45   26.45†    26.45    19.03    19.03    19.03    19.03    19.03
               3      22.75    19.74    26.50   26.50†    24.25    29.74    25.49    31.57   31.57†    30.07
            2  1      23.99    19.79    31.42   25.09†    25.09    27.60    24.44    36.52   28.60†    28.48
               2      22.39    20.11    31.42   25.09†    25.09    25.22    23.56    33.83   25.94†    25.94
               3      29.24    20.11    31.42    25.09    25.09    36.48    27.08    45.10    36.43    35.08
Solution error %          -    18.6%    57.6%    10.5%     8.7%        -    12.7%    41.4%     2.4%     2.1%
Cost error %          59.43        -        -     2.7%     2.1%    73.83        -        -     0.7%     0.3%

(1,3)       1  1       4.63     4.63     4.63     4.63     4.63    23.13    21.12    23.40   23.40†    23.22
               2      24.79    19.70    26.45   26.45†    26.45    19.03    19.03    19.03    19.03    19.03
               3      22.75    19.74    26.50   26.50†    24.25    29.74    25.49    31.57   31.57†    30.07
            2  1      33.35    22.82    33.55    27.22    27.22    45.06    35.21    52.76    43.00    42.87
               2      43.76    27.88    48.70    40.96    40.96    42.21    34.30    50.09    40.40    40.40
               3      42.31    28.39    48.74    40.99    39.00    52.36    37.86    60.70    50.43    49.10

(3,1)       1  1      28.63    28.60    28.63    28.63    28.63    39.32    37.95    39.48   39.48†    39.35
               2      28.58    28.58    28.58    28.58    28.58    36.85    36.85    36.85    36.85    36.85
               3      39.37    34.67    42.95   42.95†    40.94    45.91    41.53    47.48   47.48†    46.08
            2  1      27.21    27.22    33.55   27.22†    27.22    42.56    40.18    52.76   43.00†    42.87
               2      38.87    32.05    48.70   40.96†    40.96    39.72    39.47    50.09   40.40†    40.40
               3      41.29    32.05    48.74    40.99    39.00    50.13    42.38    60.70   50.43†    49.10

As a robustness check, we conduct an additional numerical study for two five-state demand cases.

The numerical insights are essentially the same as in the three-state demand cases (see Appendix

E for details).

5.2 Effect of Transition Probability

To illustrate the effect of transition probability on our bound and heuristic performance, we parameterize
the Markov chain transition probability matrix as follows:

\[
\begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ q & 0 & 1-q \end{pmatrix},
\]

where we vary the transition probability q from 0.1 to 1. When q = 1, the transition matrix

corresponds to the cyclic demand case; otherwise, it represents the noncyclic demand case. The


cost parameters are set to be the same as the base case. The lead times are set as (L1,L2) = (2,1).

For the highest demand state 3, we plot the optimal solution $s_n^*(3)$, the lower bound $\underline{s}_n(3)$, the
upper bound $\bar{s}_n(3)$, the newsvendor heuristic $s_n^v(3)$, and the derivative-based heuristic $s_n^a(3|\zeta)$ (with

ζ = h[1,n]). The plots for Stages 1 and 2 are shown in Figure 1 (a) and (b), respectively.

Figure 1 Impact of transition probability on the solutions in a two-stage system. Panels: (a) Stage 1; (b) Stage 2. Horizontal axis: q, from 0.1 to 1. Vertical axis: base-stock level for state 3, from 25 to 75. Curves in panel n: $s_n^*(3)$, $\underline{s}_n(3)$, $\bar{s}_n(3)$, $s_n^v(3)$, and $s_n^a(3)$.

From the figure, the optimal echelon base-stock level for the highest demand state decreases as

the transition probability q increases (i.e., the probability of jumping from the highest demand

state to a lower demand state increases). Our bounds and heuristics follow the same decreasing

trend as the optimal solution. Specifically, the derivative-based heuristic san(3|ζ) tracks the optimal

solution closely as the transition probability q increases.
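
As a side calculation (not part of the original study), the role of q in the parameterized three-state chain can be made explicit. Assuming 0 < q ≤ 1 and writing π = (π1, π2, π3) for the stationary distribution of the transition matrix of this subsection, the balance equations πP = π give:

\[
\pi_1 = q\,\pi_3, \qquad \pi_2 = \pi_1, \qquad \pi_3 = \pi_2 + (1-q)\,\pi_3
\quad\Longrightarrow\quad
\pi = \frac{1}{1+2q}\,(q,\; q,\; 1).
\]

The chain leaves state 3 with probability q in each period, so the mean sojourn in the highest demand state is 1/q periods, and the stationary weight on state 3 falls from 1/1.2 ≈ 0.83 at q = 0.1 to 1/3 at q = 1.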

5.3 Comparison with an Alternative Heuristic

In this subsection we compare our derivative-based heuristic with an alternative heuristic proposed

by Abhyankar and Graves (2001). Motivated by a case study of Teradyne Inc. (a major electronic

test equipment manufacturer), Abhyankar and Graves (2001) considered an inventory hedging

problem for a two-stage system with a two-state MMD demand process. Their heuristic is based

on the following decoupling assumption. The upstream stage is assumed to provide 100% internal

fill rate, i.e., it can always fulfill orders from the downstream stage. As a result, the two-stage

system is decoupled into two single-stage systems, with the downstream stage operating under a

state-dependent installation base-stock policy, and the upstream stage operating under a static

installation base-stock policy. The cost objective function considered in Abhyankar and Graves

(2001) is different from ours: They treated backorders as negative inventory, and aimed to minimize

expected inventory holding cost subject to a customer fill rate constraint. Nevertheless, we can still

implement their decoupling heuristic idea in our problem setting, and compare its performance

with that of our derivative-based heuristic.

We design the following numerical study to (approximately) match the model parameters of

Abhyankar and Graves (2001). Their demand model is a continuous-time two-state Markov demand

process. The low-state demand rate is 7.5 per time unit, with an expected cycle length of 45

time units. The high-state demand rate is 12.5 per time unit, with an expected cycle length of 45

time units. The total lead time of the system is 30 time units. We convert their continuous-time

model to our discrete-time model as follows. We first assume that each discrete period consists

of 5 time units. As a result, the mean demands for the low and high states are 7.5 × 5 = 37.5

and 12.5 × 5 = 62.5, respectively. The expected cycle length for each state is 45/5 = 9 periods.

To approximate these parameters, we assume that the demand follows a gamma distribution with parameters (ni, θi) = (3,0.1) and (6,0.1) for i = 1,2, with i = 1 (mean 30) representing the low demand state and i = 2 (mean 60) representing the high demand state. We also assume that the transition probability matrix is given by

\[
\begin{pmatrix} 1-p & p \\ p & 1-p \end{pmatrix}.
\]

Under this transition matrix, the system stays in each state for 1/p periods on average. Therefore,

when p= 0.1, the expected cycle length is 10 periods, approximately matching the original cycle

length given in Abhyankar and Graves (2001). To illustrate the effect of the transition probability

p, we let p vary from 0.1 to 0.9. Since each discrete period consists of 5 time units, the total lead

time (30 time units) becomes L1 +L2 = 6 periods. We let L1 vary from 1 to 5 while keeping the

total lead time at a constant of 6 periods. We set the stockout penalty cost b= 50 and installation

holding costs H1 = 3, H2 = 1, which are the same as in the base case. To sum up, we have a total

of 9× 5 = 45 parameter cases for comparing our heuristic with the decoupling heuristic.
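
The conversion described above is simple arithmetic, and a minimal sketch of it follows. The function names and the `grid` variable are hypothetical and introduced only for illustration. The sketch also assumes that θ in the gamma parameters (n, θ) is a rate, so that the mean is n/θ, which is the reading consistent with the stated means of 30 and 60.

```python
from fractions import Fraction

UNITS_PER_PERIOD = 5  # time units per discrete period, as stated in the text


def per_period(rate_per_unit, sojourn_units):
    """Return (mean demand per period, expected sojourn length in periods)."""
    mean = Fraction(rate_per_unit) * UNITS_PER_PERIOD
    sojourn = Fraction(sojourn_units, UNITS_PER_PERIOD)
    return mean, sojourn


def gamma_mean(shape, rate):
    """Mean of a gamma distribution, reading theta as a rate (an assumption)."""
    return Fraction(shape) / Fraction(rate)


def expected_sojourn(p):
    """Mean number of periods in a state that is left with probability p per period."""
    return 1 / Fraction(p)


low = per_period(Fraction(15, 2), 45)    # low state: 7.5 per time unit
high = per_period(Fraction(25, 2), 45)   # high state: 12.5 per time unit

# Parameter grid: p in {0.1, ..., 0.9} and L1 in {1, ..., 5} with L1 + L2 = 6.
grid = [(Fraction(j, 10), l1, 6 - l1) for j in range(1, 10) for l1 in range(1, 6)]
```

Here `low` and `high` reproduce the per-period means 37.5 and 62.5 with a 9-period sojourn, `expected_sojourn(Fraction(1, 10))` gives the 10-period sojourn used to approximate it, and `len(grid)` is the number of parameter cases.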

To implement the decoupling heuristic idea in our problem setting, we first use our single-stage

heuristic to compute the state-dependent installation base-stock levels for Stage 1. For Stage 2, we

need to compute a static installation base-stock level. Note that the transition probability matrix

under consideration is doubly stochastic. Hence, the stationary distribution is uniform (1/2,1/2).

We use a gamma distribution with parameters (n, θ) = (9,0.2) to approximate the demand under

the stationary distribution (with a matching demand mean 45). Based on this demand distribution,

we use a target fill rate of 97.7%, as suggested in Abhyankar and Graves (2001), to determine the

installation base-stock level for Stage 2. We then convert the installation base-stock levels into the

echelon base-stock levels by a standard technique (Zipkin 2000). We let sdn(k) denote the resulting

echelon base-stock levels, so that we can have a comparison with the optimal echelon base-stock

levels and our heuristic solutions. Recall that our derivative-based heuristic is parametrized by ζ,

with hn ≤ ζ ≤ h[1,n]. We find the heuristic performs well at either of the two extreme values of ζ,

so we present both cases san(k|hn) and san(k|h[1,n]) below. We evaluate the long-run average costs

of the two-stage system under the heuristic policies and the optimal policy, and then compute the

percentages of relative solution/cost errors.

When the transition probability p is small (e.g., p= 0.1), the transition between the two states

occurs less frequently. The demand process behaves more like an i.i.d. process, with an occasional

demand state shifting. On the other hand, when p is large (e.g., p = 0.9), the demand process

becomes more dynamic, as it transitions between the two states more frequently. The latter case

is similar to the three-state and five-state demand cases considered earlier (where the probability

of transitioning to a different state is no less than 0.7). Table 2 presents the detailed numerical

results for the two extreme transition probability cases of p= 0.1 and 0.9.

Table 2 Comparison of policies in a two-stage system with two-state demand.

                   p = 0.1                                     p = 0.9
(L1,L2)   n   k    s∗n(k)  san(k|h[1,n])  san(k|hn)  sdn(k)    s∗n(k)  san(k|h[1,n])  san(k|hn)  sdn(k)
(1,5)     1   1    117.9   117.9          117.9      110.1     148.0   148.0          148.0      139.9
              2    185.1   180.6          180.6      171.6     156.9   156.9          156.9      148.2
          2   1    438.6   356.1          392.0      458.4     423.8   410.7          447.5      488.2
              2    533.3   471.9          512.2      519.9     447.5   433.1          467.2      496.6
Solution error %   -       11.6%          5.7%       4.3%      -       2.3%           3.7%       11.1%
Cost error %       414.0   29.5%          4.8%       2.3%      357.3   1.3%           1.6%       10.1%
(2,4)     1   1    168.3   168.3          168.3      158.8     197.1   197.1          197.1      187.3
              2    253.8   247.4          247.4      236.5     227.2   226.9          226.9      216.7
          2   1    434.4   356.1          392.0      455.7     421.9   410.7          447.5      484.3
              2    527.7   471.9          512.2      533.4     444.7   433.1          467.2      513.6
Solution error %   -       10.2%          4.6%       3.9%      -       1.8%           3.8%       11.7%
Cost error %       501.8   22.3%          3.6%       2.9%      424.6   0.6%           1.8%       10.4%
(3,3)     1   1    218.6   218.6          218.6      207.7     264.2   264.2          264.2      253.0
              2    318.4   310.9          310.9      298.4     276.5   276.5          276.5      264.9
          2   1    428.5   356.1          392.0      452.5     417.7   410.7          447.5      497.8
              2    519.9   471.9          512.2      543.2     442.3   433.1          467.2      509.7
Solution error %   -       8.6%           3.5%       5.3%      -       1.2%           3.9%       12.2%
Cost error %       591.3   16.9%          2.6%       4.5%      490.7   0.2%           2.4%       10.9%
(4,2)     1   1    269.0   269.0          269.0      256.8     314.0   314.0          314.0      301.5
              2    380.0   371.9          371.9      358.0     339.5   339.3          339.3      326.4
          2   1    420.4   356.1          392.0      448.6     416.0   410.7          447.5      493.2
              2    509.7   471.9          512.2      549.8     438.8   433.1          467.2      518.2
Solution error %   -       7.0%           2.5%       6.5%      -       0.7%           4.0%       12.1%
Cost error %       681.3   11.7%          1.6%       6.1%      552.0   0.0%           3.0%       10.9%
(5,1)     1   1    319.6   319.6          319.6      306.3     375.0   375.0          375.0      361.4
              2    439.4   430.9          430.9      415.7     389.5   389.5          389.5      375.5
          2   1    403.4   356.1          392.0      443.5     410.2   410.7          447.5      498.6
              2    496.7   471.9          512.2      552.9     436.3   433.1          467.2      512.7
Solution error %   -       4.9%           2.1%       8.0%      -       0.2%           4.2%       11.9%
Cost error %       768.1   6.3%           0.7%       7.9%      609.1   0.0%           4.1%       11.5%

From Table 2, we observe that, when p is small, our heuristic san(k|ζ) performs better with

ζ = hn. When p is large, our heuristic san(k|ζ) performs better with ζ = h[1,n]. This latter observation

accords with our earlier observations in the three-state and five-state demand cases. These results

indicate that, when the demand process is more dynamic, we should set ζ = h[1,n] in our heuristic;

otherwise, we should use ζ = hn. Comparing our heuristic with the decoupling heuristic, we note

that our heuristic performs better in most cases, with an average performance improvement of

4.6%. The most dramatic improvement occurs in the more dynamic demand case of p= 0.9, where

the improvement can be as high as 11.5%. However, in the less dynamic demand case of p =

0.1, when the Stage 2 lead time is long (i.e., L2 ≥ 4), the decoupling heuristic performs better

than our heuristic, with an improvement of 0.7-2.5%. Performance difference aside, we note that

our derivative-based heuristic can be applied to systems with any number of stages, whereas the

decoupling heuristic only works for two-stage systems.
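
The ranges quoted above for the two tabulated transition probabilities can be checked directly against the cost error rows of Table 2. The sketch below is only a bookkeeping aid: the dictionary and function names are hypothetical, the numbers are copied from the table, and the choice of ζ follows the rule stated in the text (ζ = hn for p = 0.1, ζ = h[1,n] for p = 0.9).

```python
# Cost error percentages copied from Table 2, keyed by p and then by (L1, L2).
# Tuple order: (s^a with zeta = h[1,n], s^a with zeta = h_n, decoupling s^d).
COST_ERR = {
    0.1: {(1, 5): (29.5, 4.8, 2.3), (2, 4): (22.3, 3.6, 2.9),
          (3, 3): (16.9, 2.6, 4.5), (4, 2): (11.7, 1.6, 6.1),
          (5, 1): (6.3, 0.7, 7.9)},
    0.9: {(1, 5): (1.3, 1.6, 10.1), (2, 4): (0.6, 1.8, 10.4),
          (3, 3): (0.2, 2.4, 10.9), (4, 2): (0.0, 3.0, 10.9),
          (5, 1): (0.0, 4.1, 11.5)},
}


def gap_vs_decoupling(p):
    """Decoupling cost error minus ours, in percentage points, per lead time case.

    A positive value means the derivative-based heuristic has the lower cost error.
    """
    col = 1 if p == 0.1 else 0  # zeta = h_n for p = 0.1, zeta = h[1,n] for p = 0.9
    return {lt: round(row[2] - row[col], 1) for lt, row in COST_ERR[p].items()}


gaps_low_p = gap_vs_decoupling(0.1)
gaps_high_p = gap_vs_decoupling(0.9)
```

With these inputs, the negative entries of `gaps_low_p` are the cases with L2 ≥ 4, and the largest entry of `gaps_high_p` is the (5,1) case. The 4.6% average quoted in the text is taken over all 45 parameter cases and cannot be reproduced from the ten tabulated ones.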

Figure 2 illustrates the heuristic performance comparison under other p values, where the relative

cost error percentages of the three heuristic policies are plotted for the two extreme lead time cases

of (L1,L2) = (1,5), (5,1). The plots confirm our observations from Table 2.

[Figure 2: two panels, (a) (L1,L2) = (1,5) and (b) (L1,L2) = (5,1). Horizontal axis: p, from 0.1 to 0.9; vertical axis: heuristic cost errors, from 0% to 35%. Plotted series: san(k|h[1,n]), san(k|hn), and sdn(k).]

Figure 2 Relative cost error percentages under various heuristics

It has been demonstrated in the literature that, in a serial inventory system with i.i.d. demand,

an optimal internal fill rate is usually much lower than the fill rate at the customer-facing stage (see

Choi et al. 2004, Shang and Song 2006, and references therein). For systems with MMD, a natural

question is whether the internal fill rate would continue to be low, or become much higher as

assumed in the decoupling heuristic. To obtain insights into this question, we compute the system

internal fill rates (at Stage 2) under the optimal policy for the two extreme lead time cases of

(L1,L2) = (1,5), (5,1). The results are plotted in Figure 3.

[Figure 3: horizontal axis: p, from 0.1 to 0.9; vertical axis: system internal fill rate under the optimal solution, from 80% to 100%. Plotted series: lead time (1, 5) and lead time (5, 1).]

Figure 3 Internal fill rates under optimal policy

Figure 3 shows that, contrary to the observations documented for the i.i.d. demand case,

when the lead time at the upstream stage is long (i.e., L2 = 5), the internal fill rate can be high

under MMD (ranging between 94-96%; the target fill rate at the customer-facing stage is 94.3%),

suggesting that the decoupling heuristic may yield a good approximation to the optimal policy.

However, from Table 2, the decoupling heuristic still performs poorly when p is large. This is

because the static installation base-stock level computed under the decoupling heuristic tends

to be a poor approximation to the optimal policy as the demand process becomes more dynamic.

Figure 3 also shows that, when the lead time at the upstream stage is relatively short (i.e., L2 = 1),

the internal fill rate tends to be low (ranging between 84-93%, lower than the 94.3% target fill rate

at the customer-facing stage). As a result, the decoupling heuristic yields a poor approximation;

and our derivative-based heuristic has a clear advantage in this case, as it does not require the

high internal fill rate assumption. We conduct an additional numerical study on the internal fill

rates under the three-state demand cases. The findings are similar to the above observations (see

Appendix E).

6. Conclusion

We have addressed in this paper supply chain inventory management in a volatile demand envi-

ronment driven by uncertain external factors. Our goal is to develop simple analytical tools and

insights to cope effectively with dynamic demand uncertainties. To this end, we have focused on

simplifying the inventory decision tools for serial inventory systems with MMD.

We make several contributions to the literature. First, we have developed simple solution bounds

and approximations for these systems. Our derivative analysis approach enables us to obtain a

general solution lower bound that does not require the restrictive assumptions used in previous

works. Second, we have performed the first asymptotic analysis for these systems with long lead

time. We prove the MMD central limit theorem and characterize the asymptotic mean and variance.

We also show the convergence rate of our bounds and approximations to the optimal policy. These

results help reveal the effect of lead time on the optimal policy and provide analytical assessment of

the effectiveness of our bounds and heuristic policies (i.e., the relative errors). Third, to evaluate the

performance of our solution bounds and heuristic policies, we use the gamma demand distribution

family and show that, under this family, both the optimal solutions and the solution bounds can be

computed easily by leveraging Laplace transformation and its inverse. To our knowledge, existing

algorithms in the literature for serial inventory systems with MMD only address discrete demand

(such as Poisson distribution). With the gamma demand distribution, we have presented the first

set of numerical studies of serial systems under MMD.

We have demonstrated through numerical studies that our approximations are effective. These

results greatly simplify computation and, hence, facilitate implementation. For instance, the bounds

provide a convenient benchmark for the maximum and minimum inventory needed for each possible

demand state, which is useful for budgeting purposes. The heuristics can serve as quick decision

tools. Perhaps, even more importantly, these simple solutions can be obtained by solving equations

that depend on the primitive model parameters only, making them much more transparent than

the exact optimal solution. In particular, they shed light on how the transition probability among

the states (and thus the environmental uncertainty) affects the optimal inventory position.

The MMD demand model is particularly attractive because the flexibility of Markov chains

enables us to account for environmental uncertainties. Advanced information technologies, such as hand-held devices, GPS, and social media, together with the abundant data they generate, make it easier to construct MMD models. Real-time data also make it easier to observe the demand state and, hence, to implement MMD-based policies. For example, one

can detect the predictive seasonality pattern from real-time data and construct a Markov chain

with cyclic demand state transition matrix. For demand processes with more complex dynamics,

such as in the Teradyne case of Abhyankar and Graves (2001), one can construct a discrete-time

Markov chain based on the observed data as we have demonstrated in the numerical study.

Several extensions of our problem merit further investigation. We have observed numerically

that the optimal internal fill rate under MMD may be significantly higher than that observed in

previous studies under i.i.d. demand. It would be interesting to identify analytical conditions that

guarantee a high internal fill rate, such that the serial inventory system can be decoupled into

separate single-stage systems to facilitate the implementation of the decoupling heuristic. Another

extension is to investigate the demand variability propagation effect, or the bullwhip effect (see

Chen and Lee 2009, 2012 and the references therein), in serial inventory systems under MMD. We

have observed numerically that the optimal policy can dampen the bullwhip effect significantly,

suggesting that the state-dependent inventory policy under MMD may smooth demand variability

propagation in the supply chain (see Appendix E for details). Providing analytical explanations

for this system behavior can be an interesting future research topic.

References

Abhyankar, H., S. Graves. 2001. Creating an inventory hedge for Markov-modulated Poisson

demand: An application and model. Manufacturing & Service Oper. Management 3(4) 306–320.

Abouee-Mehrizi, H., O. Baron, O. Berman. 2014. Exact analysis of capacitated two-echelon inven-

tory systems with priorities. Manufacturing & Service Oper. Management 16(4) 561-577.

Angelus, A. 2011. A multiechelon inventory problem with secondary market sales. Management

Sci. 57(12) 2145–2162.

Aviv, Y., A. Federgruen. 1997. Stochastic inventory models with limited production capacity and

periodically varying parameters. Probability in the Engineering and Informational Sciences. 11

107–135.

Beyer, D., S. Sethi. 1997. Average cost optimality in inventory models with Markovian demands.

J. Optim. Theory Appl. 92(3) 497–526.

Billingsley, P. 1995. Probability and Measure. 3rd Edition, John Wiley & Sons.

Brenner, J. L. 1959. Relations among the minors of a matrix with dominant principal diagonal.

Duke Math. J. 26(4) 563-567.

Chao, X., S. Zhou. 2007. Probabilistic solution and bounds for serial inventory system with dis-

counted and average cost. Naval Res. Logist. 54(6) 623–631.

Chen, F., J.-S. Song. 2001. Optimal policies for multiechelon inventory problems with Markov-

modulated demand. Oper. Res. 49(2) 226–234.

Chen, F., Y.-S. Zheng. 1994. Lower bounds for multi-echelon stochastic inventory systems. Man-

agement Sci. 40(11) 1426–1443.

Chen, L., H.L. Lee. 2009. Information sharing and order variability control under a generalized demand model. Management Sci. 55(5) 781-797.

Chen, L., H.L. Lee. 2012. Bullwhip effect measurement and its implications. Oper. Res. 60(4) 771-784.

Choi, K.-S., J.G. Dai, J.-S. Song. 2004. On measuring supplier performance under vendor-managed-inventory programs in capacitated supply chains. Manufacturing & Service Oper. Management 6(1) 53-72.

Clark, A.J., H. Scarf. 1960. Optimal policies for a multi-echelon inventory problem. Management

Sci. 6 475-490.

Dong, L., H.L. Lee. 2003. Optimal policies and approximations for a serial multi-echelon inventory

system with time-correlated demand. Oper. Res. 51(6) 969-980.

Federgruen, A., P. Zipkin. 1984. Computational issues in an infinite horizon multi-echelon inventory

problem with stochastic demand. Oper. Res. 32 818-836.

Gallego, G., O. Ozer. 2005. A new algorithm and a new heuristic for serial supply systems. Oper.

Res. Letters 33(4) 349-362.

Gallego, G., P. Zipkin. 1999. Stock positioning and performance estimation in serial production-

transportation systems. Manufacturing & Service Oper. Management 1(1) 77-88.

Huh, W. T., G. Janakiraman. 2008. A sample-path approach to the optimality of echelon order-up-to policies in serial inventory systems. Oper. Res. Letters 36(5) 547-550.

Iglehart, D., S. Karlin. 1962. Optimal policy for dynamic inventory process with nonstationary stochastic demands. Studies in Applied Probability and Management Science. K. Arrow, S. Karlin, H. Scarf (eds.). Chapter 8, Stanford University Press, Stanford, California.

Janakiraman, G., J.A. Muckstadt. 2009. A decomposition approach for a class of capacitated serial

systems. Oper. Res. 57(6) 1384-1393.

Karlin, S. 1960. Dynamic inventory policy with varying stochastic demands. Management Sci. 6(3)

231-258.

Kapuscinski, R., S. Tayur. 1998. A capacitated production-inventory model with periodic demand.

Oper. Res. 46(6) 899-911.

Muharremoglu, A., J.N. Tsitsiklis. 2008. A Single-unit decomposition approach to multiechelon

inventory systems. Oper. Res. 56(5) 1089-1103.

Ozekici, S., M. Parlar. 1993. Periodic-review inventory models in random environments. Working

paper, School of Business, McMaster University, Hamilton, Ontario, Canada.

Parker, R.P., R. Kapuscinski. 2004. Optimal policies for a capacitated two-echelon inventory sys-

tem. Oper. Res. 52(5) 739-755.

Sgibnev, M. S. 2001. Stone decomposition for a matrix renewal measure on a half-line. Sbornik:

Mathematics 192(7) 1025-1033.

Sethi, S., F. Cheng. 1997. Optimality of (s, S) policies in inventory models with Markovian demand.

Oper. Res. 45(6) 931-939.

Shang, K.H. 2012. Single-stage approximations for optimal policies in serial inventory systems with

non-stationary demand. Manufacturing & Service Oper. Management 14(3) 414-422.

Shang, K.H., J.-S. Song. 2003. Newsvendor bounds and heuristic for optimal policies in serial

supply chains. Management Sci. 49(5) 618-638.

Shang, K.H., J.-S. Song. 2006. A closed-form approximation for serial inventory systems and its

application to system design. Manufacturing & Service Oper. Management 8(4) 394-406.

Song, J.-S., P. Zipkin. 1992. Evaluation of base stock policies in multiechelon inventory systems with

state dependent demands part I: State independent policies. Naval Res. Logist. 39(5) 715-728.

Song, J.-S., P. Zipkin. 1993. Inventory control in a fluctuating demand environment. Oper. Res.

41(2) 351-370.

Song, J.-S., P. Zipkin. 1996a. Evaluation of base-stock policies in multiechelon inventory systems with state-dependent demands part II: State-dependent depot policies. Naval Res. Logist. 43 381-396.

Song, J.-S., P. Zipkin. 1996b. Managing inventory with the prospect of obsolescence. Oper. Res.

44(1) 215-222.

Song, J.-S., P. Zipkin. 2009. Inventories with multiple supply sources and networks of queues with

overflow bypasses. Management Sci. 55(3) 362–372.

Zipkin, P. 1989. Critical number policies for inventory models with periodic data. Management

Sci. 35(1) 71-80.

Zipkin, P. 2000. Foundations of inventory management. McGraw-Hill, New York.

Appendices for Online Companion

Appendix A: Proofs

Proof (Lemma 1) Prove by induction. For case $i = 1$, because $G^1(\cdot, y)$ is convex in $y$, it follows that $G^1(1^*, y)$ is increasing convex in $y$. Thus, by definition (4), we have

\[
\partial_y R^1(1^*, y) = \partial_y G^1(1^*, y) + p_{1^*1^*}\, E\,\partial_y R^1(1^*, y - D_{1^*}).
\]

Because the first term on the right-hand side is nonnegative, by a standard result of renewal equations (see Sgibnev 2001, Theorem 3, p. 1032), it follows that $\partial_y R^1(1^*, y) \ge 0$. Taking derivative on both sides of the equation, by the same argument, we obtain $\partial_y^2 R^1(1^*, y) \ge 0$. Therefore, $R^1(1^*, y)$ is increasing convex in $y$. Next, by the definition of $G^2(\cdot, y)$, we have $G^2(\cdot, y)$ is convex in $y$. Hence, it follows that $G^2(2^*, y)$ is increasing convex in $y$. By repeating the above induction proof step, we obtain the desired result.

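Proof (Lemma 2) We first show $\partial_y R^i(y) \le \left(I^{(i)} - P^{(i)}_{m^*}\right)^{-1} \cdot e_i \cdot \partial_y G^i(i^*, y)$. When $i = 2$, (9) can be written as

\[
\begin{aligned}
(1 - p_{1^*1^*})\,\partial_y R^2(1^*, y) - p_{1^*2^*}\,\partial_y R^2(2^*, y) &\le 0;\\
-p_{2^*1^*}\,\partial_y R^2(1^*, y) + (1 - p_{2^*2^*})\,\partial_y R^2(2^*, y) &\le \partial_y G^2(2^*, y).
\end{aligned}
\]

Recall that $\partial_y R^i(\cdot, y)$ and $\partial_y G^i(\cdot, y)$ are always nonnegative. It follows that

\[
\begin{aligned}
\partial_y R^2(1^*, y) &\le \frac{p_{1^*2^*}}{(1 - p_{1^*1^*})(1 - p_{2^*2^*}) - p_{1^*2^*}p_{2^*1^*}}\,\partial_y G^2(2^*, y) = \frac{p_{1^*2^*}}{\left|I^{(2)} - P^{(2)}_{m^*}\right|}\,\partial_y G^2(2^*, y);\\
\partial_y R^2(2^*, y) &\le \frac{1 - p_{1^*1^*}}{(1 - p_{1^*1^*})(1 - p_{2^*2^*}) - p_{1^*2^*}p_{2^*1^*}}\,\partial_y G^2(2^*, y) = \frac{1 - p_{1^*1^*}}{\left|I^{(2)} - P^{(2)}_{m^*}\right|}\,\partial_y G^2(2^*, y).
\end{aligned}
\]

On the other hand, it is easy to verify

\[
\left(I^{(2)} - P^{(2)}_{m^*}\right)^{-1} e_2 = \left|I^{(2)} - P^{(2)}_{m^*}\right|^{-1} \begin{pmatrix} p_{1^*2^*} \\ 1 - p_{1^*1^*} \end{pmatrix}.
\]

Therefore, we have $\partial_y R^2(y) \le \left(I^{(2)} - P^{(2)}_{m^*}\right)^{-1} \cdot e_2 \cdot \partial_y G^2(2^*, y)$. When $i > 2$, following the same proof steps, we can show the inequality holds in general. Recall from Lemma 1 that $\partial_y R^i(y)$ is always nonnegative. This implies that $\left(I^{(i)} - P^{(i)}_{m^*}\right)^{-1} \cdot e_i \ge 0$.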

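Next, let $(a)^c$ denote the minor of an element $a$ in $I^{(i)} - P^{(i)}_{m^*}$. From the standard result of matrix inverse, we have

\[
\left(I^{(i)} - P^{(i)}_{m^*}\right)^{-1} \cdot e_i = \left|I^{(i)} - P^{(i)}_{m^*}\right|^{-1}
\begin{pmatrix} (-1)^{i+1}(-p_{i^*1^*})^c \\ (-1)^{i+2}(-p_{i^*2^*})^c \\ \vdots \\ (1 - p_{i^*i^*})^c \end{pmatrix}.
\]

Also note that $I^{(i)} - P^{(i)}_{m^*}$ is a diagonally dominant matrix (because $1 - p_{i^*i^*} \ge \sum_{j=1}^{i-1} p_{i^*j^*}$). By Brenner (1959, Corollary 1), it follows that

\[
\begin{pmatrix} (-1)^{i+1}(-p_{i^*1^*})^c \\ (-1)^{i+2}(-p_{i^*2^*})^c \\ \vdots \\ (1 - p_{i^*i^*})^c \end{pmatrix}
\le
\begin{pmatrix} (1 - p_{i^*i^*})^c \\ (1 - p_{i^*i^*})^c \\ \vdots \\ (1 - p_{i^*i^*})^c \end{pmatrix},
\]

where $(1 - p_{i^*i^*})^c = \left|I^{(i-1)} - P^{(i-1)}_{m^*}\right|$. Therefore, we obtain $\left(I^{(i)} - P^{(i)}_{m^*}\right)^{-1} e_i \le e \cdot \beta_i(m^*)$, and the desired result follows.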
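
The $i = 2$ step of this proof can be checked numerically. The sketch below uses hypothetical transition probabilities among the first two states of the sequence (they are not taken from the paper) and exact rational arithmetic; the function name is likewise introduced only for illustration.

```python
from fractions import Fraction as F


def last_column_of_inverse(p11, p12, p21, p22):
    """Return ((I - P)^{-1} e_2, det(I - P)) for a 2x2 substochastic block P."""
    det = (1 - p11) * (1 - p22) - p12 * p21
    return (p12 / det, (1 - p11) / det), det


# Hypothetical transition probabilities; each row of the block sums to less than 1.
p11, p12, p21, p22 = F(1, 10), F(3, 10), F(2, 5), F(1, 5)
col, det = last_column_of_inverse(p11, p12, p21, p22)

# Multiply (I - P) by the returned column; the product should be e_2 = (0, 1).
check = ((1 - p11) * col[0] - p12 * col[1],
         -p21 * col[0] + (1 - p22) * col[1])
```

In this instance both entries of `col` are nonnegative and the first entry does not exceed the second, which is the two-dimensional case of the componentwise bound obtained from diagonal dominance.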

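Proof (Proposition 1) The lower bound result is straightforward from Lemma 1. For the upper bound result, from the discussion preceding Proposition 1, we know

\[
\partial_y G^{[k]}(k, y) \le \partial_y G(k, y) + (1 - p_{kk}) \sum_{i=1}^{K-1} \beta_i(m^*)\,\partial_y G^i(i^*, y),
\]

where $m^*$ is the optimal state sequence. We want to further relax the second term of the right-hand side to make it independent of the optimal state sequence. Note that, for $i = 1, \ldots, K-1$,

\[
\begin{aligned}
\partial_y G^i(i^*, y) &= \max\left\{0,\ \partial_y G^i(i^*, y)\right\}\\
&= \max\left\{0,\ \partial_y G^1(i^*, y) + \sum_{l=1}^{i-1}\sum_{k=1}^{l} p_{i^*k^*}\, E\,\partial_y R^l(k^*, y - D_{i^*})\right\}\\
&\le \left(\partial_y G^1(i^*, y)\right)^+ + \sum_{l=1}^{i-1}\sum_{k=1}^{l} p_{i^*k^*}\,\partial_y R^l(k^*, y)\\
&\le \left(\partial_y G^1(i^*, y)\right)^+ + \sum_{l=1}^{i-1} \phi_{i,l}(m^*)\,\partial_y G^l(l^*, y), \qquad \text{(A1)}
\end{aligned}
\]

where the first inequality follows from Lemma 1 and the second inequality follows from Lemma 2, with $\phi_{i,l}(m^*) = \left[p_{m^*_i m^*_1}, \ldots, p_{m^*_i m^*_l}\right]^{T} \left(I^{(l)} - P^{(l)}_{m^*}\right)^{-1} e_l$. Define the following vector and matrix notations: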

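\[
\begin{aligned}
\partial_y \mathbf{G} &= \left[\partial_y G^1(1^*, y),\ \partial_y G^2(2^*, y),\ \ldots,\ \partial_y G^{K-1}((K-1)^*, y)\right]^T,\\
(\partial_y \mathbf{G}^1)^+ &= \left[(\partial_y G^1(1^*, y))^+,\ (\partial_y G^1(2^*, y))^+,\ \ldots,\ (\partial_y G^1((K-1)^*, y))^+\right]^T,\\
\Psi &= \begin{pmatrix}
0 & 0 & \cdots & & 0\\
\phi_{2,1}(m^*) & 0 & & & 0\\
\phi_{3,1}(m^*) & \phi_{3,2}(m^*) & 0 & & 0\\
\vdots & \vdots & \ddots & \ddots & \vdots\\
\phi_{K-1,1}(m^*) & \phi_{K-1,2}(m^*) & \cdots & \phi_{K-1,K-2}(m^*) & 0
\end{pmatrix}.
\end{aligned}
\]

Then, the inequalities of (A1) can be written in a matrix form as

\[
\left(I^{(K-1)} - \Psi\right)\partial_y \mathbf{G} \le (\partial_y \mathbf{G}^1)^+.
\]

Note that $I^{(K-1)} - \Psi$ is a lower triangular matrix with ones on the diagonal, and hence its inverse must also be a lower triangular matrix with ones on the diagonal. From the fact that the elements in $\Psi$ are all nonnegative, it follows that $I^{(K-1)} - \Psi$ is invertible and its inverse has all nonnegative elements. Therefore, we have $\partial_y \mathbf{G} \le \left(I^{(K-1)} - \Psi\right)^{-1} (\partial_y \mathbf{G}^1)^+$.

Further define $\beta(m^*) = [\beta_1(m^*), \ldots, \beta_{K-1}(m^*)]^T$. Then, we have

\[
\sum_{i=1}^{K-1} \beta_i(m^*)\,\partial_y G^i(i^*, y) = \beta(m^*)^T \partial_y \mathbf{G} \le \beta(m^*)^T \left(I^{(K-1)} - \Psi\right)^{-1} (\partial_y \mathbf{G}^1)^+.
\]

Let $\alpha(m^*) = [\alpha_1(m^*), \ldots, \alpha_{K-1}(m^*)] = \beta(m^*)^T \left(I^{(K-1)} - \Psi\right)^{-1}$, where $\alpha_i(m^*)$ can be determined recursively by (10). Therefore, we conclude that

\[
\begin{aligned}
\partial_y G^{[k]}(k, y) &\le \partial_y G(k, y) + (1 - p_{kk}) \sum_{i=1}^{K-1} \alpha_i(m^*) \left(\partial_y G^1(i^*, y)\right)^+\\
&\le \partial_y G(k, y) + (1 - p_{kk}) \max_{m \in S} \sum_{i=1}^{K-1} \alpha_i(m) \left(\partial_y G(m_i, y)\right)^+\\
&= \partial_y G(k, y) + (1 - p_{kk})\,\Delta(y),
\end{aligned}
\]

where the last equality follows from the definition of $\Delta(y)$ given in (11).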
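
The nonnegativity of the inverse used in this proof can also be seen directly. As a short worked step (a standard linear-algebra fact, added here for illustration rather than taken from the original argument): $\Psi$ is strictly lower triangular of order $K-1$, hence nilpotent with $\Psi^{K-1} = 0$, so the Neumann series terminates:

\[
\left(I^{(K-1)} - \Psi\right)^{-1} = \sum_{j=0}^{K-2} \Psi^{j} = I^{(K-1)} + \Psi + \Psi^2 + \cdots + \Psi^{K-2}.
\]

Every term on the right-hand side is a product of nonnegative matrices, and every power $\Psi^j$ with $j \ge 1$ has zeros on the diagonal, which gives both the nonnegativity and the unit diagonal of the inverse.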

Proof (Corollary 1) $\underline{s}(k) \le s^*(k) \le \bar{s}(k)$ is a direct result from Proposition 1. By the definition of $s_{\min}$, we have $s_{\min} \le \bar{s}(k)$. Note that $\partial_y G(k, y)$ is increasing in $y$. Thus, it follows from the definition of $\bar{s}(k)$ that $\partial_y G(k, s_{\min}) \le 0$. Moreover, note that $(\partial_y G(k, s_{\min}))^+ = 0$ for any $k$. Hence, it is straightforward to verify that $\partial_y G(k, s_{\min}) + (1 - p_{kk})\Delta(s_{\min}) \le 0$, which implies that $s_{\min} \le \underline{s}(k)$ by the definition of $\underline{s}(k)$ in (12).

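Proof (Corollary 2) By the definition of $\alpha_i(m)$, we have $\alpha_i(m) \ge \beta_i(m)$. Therefore,

\[
\begin{aligned}
\partial_y G(k, y) &\le \partial_y G(k, y) + \frac{1}{2}\sum_{i=1}^{j-1} p_{k m_i} \cdot \beta_i(m) \cdot F_k\left(y - s(m_i)\right) \left(\partial_y G(m_i, y)\right)^+\\
&\le \partial_y G(k, y) + (1 - p_{kk}) \cdot \alpha_i(m) \left(\partial_y G(m_i, y)\right)^+ \le \partial_y G(k, y) + (1 - p_{kk})\,\Delta(y),
\end{aligned}
\]

which implies the desired result.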

Proof (Lemma 3) Let $\pi^t_k = [\pi^t_k(1), \ldots, \pi^t_k(K)]$ denote the distribution of demand states at period $t$ given an initial state $k \in \{1, \ldots, K\}$. Then, we have $\lim_{t\to\infty} \pi^t_k = \pi$, where $\pi$ is the Markov chain stationary distribution. Let $X^t_k$ denote the demand in period $t$ given the initial state $k$, then

\[
X^t_k = \begin{cases}
D^{(t)}_1 & \text{with prob. } \pi^t_k(1),\\
D^{(t)}_2 & \text{with prob. } \pi^t_k(2),\\
\ \vdots\\
D^{(t)}_K & \text{with prob. } \pi^t_k(K),
\end{cases}
\]

where $D^{(t)}_i$ has the same distribution as $D_i$. Note that $D^{(t_1)}_{i_1}, \ldots, D^{(t_n)}_{i_n}$ are all independent of each other for any given sets of $i_1, \ldots, i_n$ and $t_1, \ldots, t_n$, as long as $t_j \ne t_{j'}$ for $j \ne j'$.

Consider first an alternative Markov chain with the same transition matrix but starting with the stationary distribution. Let $Y^t$ denote the demand in each period under this Markov chain. Then, we have $Y^t = D^{(t)}_i$ with probability $\pi(i)$ for any $t \ge 1$. Hence, $E\,Y^t = \sum_{i=1}^{K} \mu_i \pi(i) = \mu$. Let $\tilde{Y}^t = Y^t - \mu$. Then we have $E\,\tilde{Y}^t = 0$. It is also easy to check that

\[
E\,(\tilde{Y}^t)^2 = \sum_{k=1}^{K} \left(\sigma_k^2 + \mu_k^2 - \mu^2\right)\pi(k), \quad\text{and}\quad
E\,\tilde{Y}^t \tilde{Y}^{t+s} = \sum_{k=1}^{K}\sum_{l=1}^{K} (\mu_k - \mu)\,\mu_l\, p^{(s)}_{kl}\,\pi(k),
\]

where $p^{(s)}_{kl}$ is the $s$-step transition probability from state $k$ to state $l$.

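Since the Markov chain $W$ has a stationary distribution and the state space is finite, by Theorem 8.9 in Billingsley (1995, p. 131), we have, for any $k$, $l$ and $s$, where $A \ge 0$ and $0 \le \rho < 1$,

\[
\left|p^{(s)}_{kl} - \pi(l)\right| \le A\rho^s. \qquad \text{(A2)}
\]

Since $\tilde{Y}^t$ is stationary, from the above condition, it can be verified that $\tilde{Y}^t$ is $\alpha$-mixing with an exponential decay rate (see Billingsley 1995, p. 363). Let $S^L = \tilde{Y}^1 + \cdots + \tilde{Y}^{L+1}$. By Theorem 27.4 of Billingsley (1995, p. 364), it follows that, as $L \to \infty$, the following series converge absolutely:

\[
\begin{aligned}
(L+1)^{-1}\,\mathrm{var}\,S^L \to\ & E\,(\tilde{Y}^1)^2 + 2\sum_{s=1}^{\infty} E\,\tilde{Y}^1 \tilde{Y}^{1+s}\\
=\ & \sum_{k=1}^{K} \left[\sigma_k^2 + \mu_k^2 - \mu^2\right]\pi(k) + 2\sum_{s=1}^{\infty}\sum_{k=1}^{K}\sum_{l=1}^{K} (\mu_k - \mu)\,\mu_l\, p^{(s)}_{kl}\,\pi(k)\\
=\ & \sigma^2.
\end{aligned}
\]

If $\sigma > 0$, we further have, as $L \to \infty$,

\[
\frac{S^L}{\sigma\sqrt{L+1}} \xrightarrow{\ \text{dist.}\ } N(0,1). \qquad \text{(A3)}
\]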

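Now consider the original Markov chain. Define $\tilde{X}^t_k = X^t_k - \mu$ and $S^L_k = \tilde{X}^1_k + \cdots + \tilde{X}^{L+1}_k$. Given a fixed $y$ value ($y \ge 0$), for any $N \ge 0$, we can write the following:

\[
\begin{aligned}
& P\left(\frac{\tilde{X}^{N+1}_k + \cdots + \tilde{X}^{L+1}_k}{\sigma\sqrt{L+1}} \le y\right)\\
&= \sum_{i_{N+1}=1}^{K} \cdots \sum_{i_{L+1}=1}^{K} P\left(\frac{\tilde{X}^{N+1}_k + \cdots + \tilde{X}^{L+1}_k}{\sigma\sqrt{L+1}} \le y \,\middle|\, X^{N+1}_k = D_{i_{N+1}}, \ldots, X^{L+1}_k = D_{i_{L+1}}\right)\\
&\qquad \times P\left(X^{N+1}_k = D_{i_{N+1}}, \ldots, X^{L+1}_k = D_{i_{L+1}}\right)\\
&= \sum_{i_{N+1}=1}^{K} \cdots \sum_{i_{L+1}=1}^{K} P\left(\frac{D^{(N+1)}_{i_{N+1}} + \cdots + D^{(L+1)}_{i_{L+1}} - (L-N+1)\mu}{\sigma\sqrt{L+1}} \le y\right) \cdot p^{(N+1)}_{k,i_{N+1}}\, p_{i_{N+1},i_{N+2}} \cdots p_{i_L,i_{L+1}},
\end{aligned}
\]

and

\[
\begin{aligned}
& P\left(\frac{\tilde{Y}^{N+1} + \cdots + \tilde{Y}^{L+1}}{\sigma\sqrt{L+1}} \le y\right)\\
&= \sum_{i_{N+1}=1}^{K} \cdots \sum_{i_{L+1}=1}^{K} P\left(\frac{D^{(N+1)}_{i_{N+1}} + \cdots + D^{(L+1)}_{i_{L+1}} - (L-N+1)\mu}{\sigma\sqrt{L+1}} \le y\right) \cdot \pi(i_{N+1})\, p_{i_{N+1},i_{N+2}} \cdots p_{i_L,i_{L+1}}.
\end{aligned}
\]

Therefore, comparing the two expressions, we obtain

\[
\begin{aligned}
& \left| P\left(\frac{\tilde{X}^{N+1}_k + \cdots + \tilde{X}^{L+1}_k}{\sigma\sqrt{L+1}} \le y\right) - P\left(\frac{\tilde{Y}^{N+1} + \cdots + \tilde{Y}^{L+1}}{\sigma\sqrt{L+1}} \le y\right) \right|\\
&\le \sum_{i_{N+1}=1}^{K} \cdots \sum_{i_{L+1}=1}^{K} P\left(\frac{D^{(N+1)}_{i_{N+1}} + \cdots + D^{(L+1)}_{i_{L+1}} - (L-N+1)\mu}{\sigma\sqrt{L+1}} \le y\right) \left| p^{(N+1)}_{k,i_{N+1}} - \pi(i_{N+1}) \right| p_{i_{N+1},i_{N+2}} \cdots p_{i_L,i_{L+1}}\\
&\le A\rho^{N+1} \sum_{i_{N+1}=1}^{K} \cdots \sum_{i_{L+1}=1}^{K} P\left(\frac{D^{(N+1)}_{i_{N+1}} + \cdots + D^{(L+1)}_{i_{L+1}} - (L-N+1)\mu}{\sigma\sqrt{L+1}} \le y\right) \cdot p_{i_{N+1},i_{N+2}} \cdots p_{i_L,i_{L+1}}\\
&\le AK\rho^{N+1},
\end{aligned}
\]

where the second inequality follows from the condition (A2) and the last inequality is a result of relaxing the inner probability term $P(\cdot)$ to be one.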

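For any $\varepsilon > 0$, there exists $N_\varepsilon$ such that $AK\rho^N \le \varepsilon/3$ for $N > N_\varepsilon$. Fix $N_\varepsilon$, it is easy to check that the following results hold:

\[
\begin{aligned}
\lim_{L\to\infty} P\left(\frac{S^L}{\sigma\sqrt{L+1}} \le y\right)
&= \lim_{L\to\infty} P\left(\frac{\left(\sum_{i=1}^{N_\varepsilon} \tilde{Y}^i\right) + \tilde{Y}^{N_\varepsilon+1} + \cdots + \tilde{Y}^{L+1}}{\sigma\sqrt{L+1}} \le y\right)\\
&= \lim_{L\to\infty} P\left(\frac{\tilde{Y}^{N_\varepsilon+1} + \cdots + \tilde{Y}^{L+1}}{\sigma\sqrt{L+1}} \le y\right),
\end{aligned}
\]

and, similarly,

\[
\lim_{L\to\infty} P\left(\frac{S^L_k}{\sigma\sqrt{L+1}} \le y\right) = \lim_{L\to\infty} P\left(\frac{\tilde{X}^{N_\varepsilon+1}_k + \cdots + \tilde{X}^{L+1}_k}{\sigma\sqrt{L+1}} \le y\right).
\]

Therefore, from the above limiting results, there exists $L(\varepsilon, N_\varepsilon)$ such that for $L > L(\varepsilon, N_\varepsilon)$,

\[
\left| P\left(\frac{S^L}{\sigma\sqrt{L+1}} \le y\right) - P\left(\frac{\tilde{Y}^{N_\varepsilon+1} + \cdots + \tilde{Y}^{L+1}}{\sigma\sqrt{L+1}} \le y\right) \right| < \varepsilon/3,
\]

and

\[
\left| P\left(\frac{S^L_k}{\sigma\sqrt{L+1}} \le y\right) - P\left(\frac{\tilde{X}^{N_\varepsilon+1}_k + \cdots + \tilde{X}^{L+1}_k}{\sigma\sqrt{L+1}} \le y\right) \right| < \varepsilon/3.
\]

Hence, for $L > L(\varepsilon, N_\varepsilon)$,

\[
\begin{aligned}
\left| P\left(\frac{S^L_k}{\sigma\sqrt{L+1}} \le y\right) - P\left(\frac{S^L}{\sigma\sqrt{L+1}} \le y\right) \right|
\le\ & \left| P\left(\frac{S^L_k}{\sigma\sqrt{L+1}} \le y\right) - P\left(\frac{\tilde{X}^{N_\varepsilon+1}_k + \cdots + \tilde{X}^{L+1}_k}{\sigma\sqrt{L+1}} \le y\right) \right|\\
& + \left| P\left(\frac{\tilde{X}^{N_\varepsilon+1}_k + \cdots + \tilde{X}^{L+1}_k}{\sigma\sqrt{L+1}} \le y\right) - P\left(\frac{\tilde{Y}^{N_\varepsilon+1} + \cdots + \tilde{Y}^{L+1}}{\sigma\sqrt{L+1}} \le y\right) \right|\\
& + \left| P\left(\frac{S^L}{\sigma\sqrt{L+1}} \le y\right) - P\left(\frac{\tilde{Y}^{N_\varepsilon+1} + \cdots + \tilde{Y}^{L+1}}{\sigma\sqrt{L+1}} \le y\right) \right|\\
\le\ & \varepsilon/3 + AK\rho^{N_\varepsilon+1} + \varepsilon/3\\
\le\ & \varepsilon.
\end{aligned}
\]

Since $\varepsilon$ is chosen arbitrarily, it follows that $\frac{S^L_k}{\sigma\sqrt{L+1}}$ and $\frac{S^L}{\sigma\sqrt{L+1}}$ converge to the same distribution as $L$ goes to infinity. From (A3), the desired result follows.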
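
To make the asymptotic variance in this proof concrete, the sketch below evaluates a truncated version of the series for σ² on the two-state gamma example of Section 5.3. It is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, θ is read as a rate so that state i has mean n_i/θ_i and variance n_i/θ_i², and the infinite sum over s is cut off after a fixed number of terms, which is harmless here because the terms decay geometrically.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def asymptotic_variance(P, pi, mu, var, terms=200):
    """Truncated sigma^2 = E(Y~^2) + 2 * sum_{s>=1} E(Y~^1 Y~^{1+s})."""
    K = len(pi)
    mean = sum(mu[k] * pi[k] for k in range(K))
    sigma2 = sum((var[k] + mu[k] ** 2 - mean ** 2) * pi[k] for k in range(K))
    Ps = [row[:] for row in P]  # s-step transition matrix, starting at s = 1
    for _ in range(terms):
        sigma2 += 2 * sum((mu[k] - mean) * mu[l] * Ps[k][l] * pi[k]
                          for k in range(K) for l in range(K))
        Ps = mat_mul(Ps, P)
    return sigma2


def two_state_example(p):
    """sigma^2 for the symmetric two-state chain with gamma(3, 0.1) and gamma(6, 0.1)."""
    P = [[1 - p, p], [p, 1 - p]]
    pi = [0.5, 0.5]            # uniform stationary distribution
    mu = [30.0, 60.0]          # means n / theta
    var = [300.0, 600.0]       # variances n / theta^2 (theta read as a rate)
    return asymptotic_variance(P, pi, mu, var)
```

Under these assumptions the lag-s covariance term equals 225(1 − 2p)^s, so the series can be summed by hand as a cross-check: σ² = 675 + 450(1 − 2p)/(2p), which is 2475 for p = 0.1 and 475 for p = 0.9. A persistent chain therefore inflates the long-run variance, while a rapidly alternating chain reduces it.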

Proof (Proposition 2) Consider first the case when the Markov chain $W$ has a stationary distribution. Let $\eta = b/(b+h)$. For any initial state $k$, the myopic base-stock level $\bar{s}(k)$ is the solution of

\[
P\left(X^1_k + X^2_k + \cdots + X^{L+1}_k \le \bar{s}(k)\right) = \eta.
\]

The above equation can be further written as

\[
P\left(\frac{X^1_k + X^2_k + \cdots + X^{L+1}_k - (L+1)\mu}{\sqrt{L+1}\,\sigma} \le \frac{\bar{s}(k) - (L+1)\mu}{\sqrt{L+1}\,\sigma}\right) = \eta.
\]

Letting $L$ go to infinity, since the left-hand side of the inequality inside the parentheses converges to $N(0,1)$ in distribution, it follows that $\lim_{L\to\infty} \frac{\bar{s}(k) - (L+1)\mu}{\sqrt{L+1}\,\sigma} = \Phi^{-1}(\eta)$, where $\Phi(\cdot)$ is the standard normal distribution function. Hence, it follows that $\lim_{L\to\infty} \frac{\bar{s}(k) - (L+1)\mu}{(L+1)\sigma} = 0$, or, equivalently, $\lim_{L\to\infty} \frac{\bar{s}(k)}{(L+1)\sigma} = \frac{\mu}{\sigma}$. Now for different initial states $k$ and $k'$, we have

\[
\sqrt{L+1}\;\frac{\bar{s}(k) - \bar{s}(k')}{\bar{s}(k)}
= \frac{(\bar{s}(k) - \bar{s}(k'))/(\sqrt{L+1}\,\sigma)}{\bar{s}(k)/((L+1)\sigma)}
= \frac{(\bar{s}(k) - (L+1)\mu)/(\sqrt{L+1}\,\sigma) - (\bar{s}(k') - (L+1)\mu)/(\sqrt{L+1}\,\sigma)}{\bar{s}(k)/((L+1)\sigma)},
\]

where the numerator converges to zero and the denominator converges to a positive constant, as $L$ goes to infinity. Therefore, we conclude that, for any $k$ and $k'$,

\[
\lim_{L\to\infty} \sqrt{L+1} \cdot \frac{\bar{s}(k) - \bar{s}(k')}{\bar{s}(k)} = 0,
\]

which implies $\lim_{L\to\infty} \sqrt{L+1} \cdot \frac{\bar{s}(k) - s_{\min}}{s_{\min}} = 0$.

When the Markov chain W is cyclic, for any given lead time L, we can write L=mK+ l, where

l = L mod K. The first mK periods of aggregated demand of Dk(L) and Dk′(L) are exactly the

same. As L increases to infinity, the residual part is negligible compared with the first mK periods

of aggregated demand. Therefore, we can show that, for any states k and k′, the lead time demand

distributions Fk,L(y) and Fk′,L(y) converge to the same limit distribution for every y as L goes to

infinity. Hence, the same convergence rate result can be obtained. We omit the details here.
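The limit used in this proof can be read as a large-lead-time approximation, $s(k) \approx (L+1)\mu + \Phi^{-1}(\eta)\,\sigma\sqrt{L+1}$, which no longer depends on the initial state. The following is a minimal Python sketch of that approximation. The function name and the numerical inputs are hypothetical, and $\mu$ and $\sigma$ are assumed to be the normalizing constants of the MMD central limit theorem.

```python
from math import sqrt
from statistics import NormalDist


def normal_approx_base_stock(mu, sigma, lead_time, b, h):
    """Approximate s(k) by (L+1)*mu + Phi^{-1}(eta)*sigma*sqrt(L+1)."""
    eta = b / (b + h)                  # critical ratio eta = b/(b+h)
    z = NormalDist().inv_cdf(eta)      # standard normal quantile Phi^{-1}(eta)
    periods = lead_time + 1            # lead time demand spans L+1 periods
    return periods * mu + z * sigma * sqrt(periods)


if __name__ == "__main__":
    # Hypothetical inputs: mu = 4, sigma = 3, b = 9, h = 1.
    mu, sigma = 4.0, 3.0
    for L in (1, 10, 100, 10_000):
        s = normal_approx_base_stock(mu, sigma, L, 9.0, 1.0)
        # s / ((L+1)*sigma) should drift toward mu / sigma as L grows.
        print(L, round(s, 2), round(s / ((L + 1) * sigma), 4))
```

Under this approximation the state-dependent part of $s(k)$ is of smaller order than $\sqrt{L+1}$, which is what drives the relative gap to zero.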

Proof (Proposition 3) By repeatedly applying the procedure we used to obtain (23), we can prove by induction that

\[
\partial_y G_n^{[k]}(k,y) \ge \partial_y G_n^{1}(k,y) \ge -(b+H_{n+1}) + (b+H_n)F_{k,L'_n}(y) = \partial_y G_n(k,y \mid h_n).
\]

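By applying (3) repeatedly, taking the derivative with respect to $y$, and applying Lemma 2, we have

\[
\begin{aligned}
\partial_y G_n^{[k]}(k,y)
&= \partial_y G_n^{1}(k,y) + \sum_{i=1}^{[k]-1} \sum_{u\in U_n^{i+1}} p_{ku}\, \partial_y E R_n^{i}(u, y - D_k) \\
&\le \partial_y G_n^{1}(k,y) + \sum_{i=1}^{[k]-1} \sum_{u\in U_n^{i+1}} p_{ku}\, \beta_i(m_n^*)\, \partial_y G_n^{i}(i^*, y) \\
&\le \partial_y G_n^{1}(k,y) + \sum_{i=1}^{K-1} (1-p_{kk})\, \beta_i(m_n^*)\, \partial_y G_n^{i}(i^*, y), \tag{A4}
\end{aligned}
\]

where the last inequality utilizes the facts that $[k] \le K$, the transition probabilities $p_{ku}$ sum up to 1, and $\partial_y R_n^{i}(u,y)$ is increasing and nonnegative. For ease of exposition, denote $v_j = W_k(t + L'_n - L'_{j-1})$.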

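By using the derivatives of (19) and (20), we have

\[
\begin{aligned}
\partial_y G_n^{1}(k,y)
&= h_n + E\Big\{ \partial_y G_{n-1,n}\big(v_n,\, y - D_k[t,t+L_n)\big) \Big\} \\
&= h_n + E\Big\{ \partial_y G_{n-1}^{[v_n]}\big(v_n,\, \min\{y - D_k[t,t+L_n),\, s_{n-1}^{[v_n]}(v_n)\}\big) - \partial_y G_{n-1}^{[v_n]}\big(v_n,\, s_{n-1}^{[v_n]}(v_n)\big) \Big\} \\
&= h_n + E\Big\{ \partial_y G_{n-1}^{[v_n]}\big(v_n,\, \min\{y - D_k[t,t+L_n),\, s_{n-1}^{[v_n]}(v_n)\}\big) \Big\} \\
&\le h_n + E\Big\{ \partial_y G_{n-1}^{[v_n]}\big(v_n,\, y - D_k[t,t+L_n)\big) \Big\}, \tag{A5}
\end{aligned}
\]

where the inequality utilizes the fact that the derivative $\partial_y G_{n-1}^{[v_n]}(v_n, y)$ is increasing in $y$ (Chen and Song 2001). By applying (3) to (A5), we have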

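\[
\begin{aligned}
\partial_y G_n^{1}(k,y)
&\le h_n + E\Big\{ \partial_y G_{n-1}^{1}\big(v_n,\, y - D_k[t,t+L_n)\big) \Big\} \\
&\quad + E\Big\{ \sum_{i=1}^{[v_n]-1} \sum_{u\in U_{n-1}^{i+1}} p_{v_n u}\, E_{D_{v_n}}\big[ \partial_y R_{n-1}^{i}\big(u,\, y - D_k[t,t+L_n) - D_{v_n}\big) \big] \Big\} \\
&\le h_n + E\Big\{ \partial_y G_{n-1}^{1}\big(v_n,\, y - D_k[t,t+L_n)\big) \Big\} \\
&\quad + E\Big\{ \sum_{i=1}^{[v_n]-1} \sum_{u\in U_{n-1}^{i+1}} p_{v_n u}\, \partial_y R_{n-1}^{i}\big(u,\, y - D_k[t,t+L_n)\big) \Big\}, \tag{A6}
\end{aligned}
\]

where the last inequality follows from Lemma 1.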

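Recall that $\partial_y G_n(k,y \mid h_{[1,n]}) = -(b+H_{n+1}) + (b+H_1)F_{k,L'_n}(y)$. By repeatedly applying the above techniques and using Lemma 2, we have

\[
\begin{aligned}
\partial_y G_n^{1}(k,y)
&\le h_n + h_{n-1} + \dots + h_2 + E\Big\{ \partial_y G_1^{1}\big(v_2,\, y - D_k[t,t+L'_n-L_1)\big) \Big\} \\
&\quad + E\Big\{ \sum_{j=1}^{n-1} \sum_{i=1}^{[v_{j+1}]-1} \sum_{u\in U_j^{i+1}} p_{v_{j+1}u}\, \partial_y R_j^{i}\big(u,\, y - D_k[t,t+L'_n-L'_j)\big) \Big\} \\
&= \partial_y G_n(k,y \mid h_{[1,n]}) + E\Big\{ \sum_{j=1}^{n-1} \sum_{i=1}^{[v_{j+1}]-1} \sum_{u\in U_j^{i+1}} p_{v_{j+1}u}\, \partial_y R_j^{i}\big(u,\, y - D_k[t,t+L'_n-L'_j)\big) \Big\} \\
&\le \partial_y G_n(k,y \mid h_{[1,n]}) + E\Big\{ \sum_{j=1}^{n-1} \sum_{i=1}^{[v_{j+1}]-1} \sum_{u\in U_j^{i+1}} p_{v_{j+1}u}\, \beta_i(m_j^*)\, \partial_y G_j^{i}\big(i^*,\, y - D_k[t,t+L'_n-L'_j)\big) \Big\} \\
&\le \partial_y G_n(k,y \mid h_{[1,n]}) + E\Big\{ \sum_{j=1}^{n-1} \sum_{i=1}^{K-1} \beta_i(m_j^*)\, \partial_y G_j^{i}\big(i^*,\, y - D_k[t,t+L'_n-L'_j)\big) \Big\}, \tag{A7}
\end{aligned}
\]

where the equality follows from $v_2 = W_k(t+L'_n-L_1)$ and $F_{v_2,L_1}(y - D_k[t,t+L'_n-L_1)) = F_{k,L'_n}(y)$.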

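Combining (A4) and (A7) leads to

\[
\partial_y G_n^{[k]}(k,y) \le \partial_y G_n(k,y \mid h_{[1,n]}) + M_n(k,y),
\]

where

\[
M_n(k,y) = \sum_{j=1}^{n-1} \sum_{i=1}^{K-1} \beta_i(m_j^*)\, E\Big\{ \partial_y G_j^{i}\big(i^*,\, y - D_k[t,t+L'_n-L'_j)\big) \Big\} + \sum_{i=1}^{K-1} (1-p_{kk})\, \beta_i(m_n^*)\, \partial_y G_n^{i}(i^*, y),
\]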

with m∗j being the optimal sequence in the jth stage. With the same analysis as in the proof of

Proposition 1, we can further relax Mn(k, y) as follows.

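\[
M_n(k,y) \le \sum_{j=1}^{n-1} E\Big\{ \sum_{i=1}^{K-1} \alpha_i(m_j^*) \Big( \partial_y G_j^{1}\big(i^*,\, y - D_k[t,t+L'_n-L'_j)\big) \Big)^{+} \Big\} + (1-p_{kk}) \sum_{i=1}^{K-1} \alpha_i(m_n^*) \Big( \partial_y G_n^{1}(i^*, y) \Big)^{+}, \tag{A8}
\]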

where αi(m∗j ) can be determined recursively by (10). From (A7) and Lemma 2, we also have

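\[
\begin{aligned}
\Big( \partial_y G_j^{1}(i^*, y) \Big)^{+}
&\le \max\Big\{ 0,\; g_j(i^*, y) + \sum_{s=1}^{j-1} E\Big[ \sum_{l=1}^{K-1} \beta_l(m_s^*)\, \partial_y G_s^{l}\big(l^*,\, y - D_{i^*}[t,t+L'_j-L'_s)\big) \Big] \Big\} \\
&\le \Big( g_j(i^*, y) + \sum_{s=1}^{j-1} E\Big[ \sum_{l=1}^{K-1} \alpha_l(m_s^*) \Big( \partial_y G_s^{1}\big(l^*,\, y - D_{i^*}[t,t+L'_j-L'_s)\big) \Big)^{+} \Big] \Big)^{+}. \tag{A9}
\end{aligned}
\]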

Combining (A8) and (A9), we can verify that

\[
M_n(k,y) \le (1-p_{kk})\,\Delta_n(y) + \sum_{j=1}^{n-1} \Delta_{j,n}(k,y),
\]

where $\Delta_n(y)$ and $\Delta_{j,n}(k,y)$ are defined in Proposition 3. We arrive at the desired result.

Proof (Corollary 3) The proof follows the same argument as in the proof for Corollary 1 and is

omitted here.

Proof (Corollary 4) The proof follows the same argument as in the proof for Corollary 2 and is

omitted here.

Proof (Proposition 4) For ease of exposition, we present a proof for N = 2. The general case

with N > 2 can be shown following the same argument.

Consider first the case when the Markov chain W has a stationary distribution. When L′1 goes to infinity, we can apply Proposition 2 to obtain the result for Stage 1. Thus, we only need to check Stage 2. When L′2 goes to infinity, either L1 or L2 goes to infinity. For brevity, we consider below the case in which L1 goes to infinity and L2 stays finite. The other cases can be shown similarly.

By Proposition 3, $s_2(k)$ is the solution to $-(b+H_3) + (b+H_2)F_{k,L'_2}(y) = 0$, or equivalently,

\[
F_{k,L'_2}(s_2(k)) = \frac{b+H_3}{b+H_2}.
\]

With a similar argument to that used in the proof of Proposition 2, we have

\[
\lim_{L'_2\to\infty} \frac{s_2(k) - (L'_2+1)\mu}{\sqrt{L'_2+1}\,\sigma} = \Phi^{-1}(\eta),
\]

where η= (b+H3)/(b+H2). We next establish a further relaxation of the derivative upper bound

given in Proposition 3. Define the following functions recursively:

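\[
\begin{aligned}
\bar{\Delta}_n(y) &= \alpha \sum_{k=1}^{K} \Big( (b+H_1)F_{k,L'_n}(y) + \sum_{j=1}^{n-1} \bar{\Delta}_{j,n}(k,y) \Big), \\
\bar{\Delta}_{j,n}(k,y) &= E\Big\{ \bar{\Delta}_j\big(y - D_k[t,t+L'_n-L'_j)\big) \Big\},
\end{aligned}
\]

where $\alpha = \max_{i\in\{1,\dots,K-1\},\, m\in S} \alpha_i(m)$. It is easy to verify that $\Delta_2(y) \le \bar{\Delta}_2(y)$ and $\Delta_{1,2}(k,y) \le \bar{\Delta}_{1,2}(k,y)$. Then, we have

\[
\partial_y G_2^{[k]}(k,y) \le \partial_y G_2(k,y \mid h_{[1,2]}) + (1-p_{kk})\,\bar{\Delta}_2(y) + \bar{\Delta}_{1,2}(k,y).
\]

Therefore, we obtain a relaxed solution lower bound, denoted by $s'_2(k)$, from solving the equation:

\[
\partial_y G_2(k,y \mid h_{[1,2]}) + (1-p_{kk})\,\bar{\Delta}_2(y) + \bar{\Delta}_{1,2}(k,y) = 0.
\]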

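Expanding the terms $\bar{\Delta}_2(y)$ and $\bar{\Delta}_{1,2}(k,y)$ based on their definitions yields

\[
\begin{aligned}
\bar{\Delta}_1(y) &= \alpha \sum_{k=1}^{K} (b+H_1)F_{k,L_1}(y), \\
\bar{\Delta}_{1,2}(k,y) &= E\Big\{ \bar{\Delta}_1\big(y - D_k[t,t+L'_2-L'_1)\big) \Big\} \\
&= \alpha(b+H_1)\, E\Big\{ \sum_{l=1}^{K} F_{l,L_1}\big(y - D_k[t,t+L_2)\big) \Big\} \\
&= \alpha(b+H_1) \sum_{l=1}^{K} P\big( D_l[t,t+L_1] + D_k[t,t+L_2) \le y \big), \\
\bar{\Delta}_2(y) &= \alpha \sum_{k=1}^{K} \Big( (b+H_1)F_{k,L'_2}(y) + \bar{\Delta}_{1,2}(k,y) \Big) \\
&= \alpha(b+H_1) \sum_{k=1}^{K} F_{k,L'_2}(y) + \alpha^2(b+H_1) \sum_{k=1}^{K}\sum_{l=1}^{K} P\big( D_l[t,t+L_1] + D_k[t,t+L_2) \le y \big).
\end{aligned}
\]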

Therefore, $s'_2(k)$ is the solution to the following equation:

\[
\begin{aligned}
&(b+H_1)F_{k,L'_2}(y) + (1-p_{kk})\,\alpha(b+H_1) \sum_{k=1}^{K} F_{k,L'_2}(y) \\
&\quad + (1-p_{kk})\,\alpha^2(b+H_1) \sum_{k=1}^{K}\sum_{l=1}^{K} P\big( D_l[t,t+L_1] + D_k[t,t+L_2) \le y \big) \\
&\quad + \alpha(b+H_1) \sum_{l=1}^{K} P\big( D_l[t,t+L_1] + D_k[t,t+L_2) \le y \big) = b+H_3,
\end{aligned}
\]

where the left-hand side involves a linear combination of the lead time demand distributions with the same time length. From Lemma 3, we have

\[
\frac{D_k[t,t+L'_2] - (L'_2+1)\mu}{\sqrt{L'_2+1}\,\sigma} \xrightarrow[L'_2\to\infty]{d} N(0,1).
\]

Also recall that $L_1$ goes to infinity and $L_2$ stays finite, as $L'_2$ goes to infinity. It is straightforward to verify that

\[
\frac{D_l[t,t+L_1] + D_k[t,t+L_2) - (L'_2+1)\mu}{\sqrt{L'_2+1}\,\sigma} \xrightarrow[L'_2\to\infty]{d} N(0,1).
\]

With a similar argument to that used in the proof of Proposition 2, we have

\[
\lim_{L'_2\to\infty} \frac{s'_2(k) - (L'_2+1)\mu}{\sqrt{L'_2+1}\,\sigma} = \Phi^{-1}(\eta'),
\]

where $\eta' = (b+H_3)/[(b+H_1)(1 + \alpha K(2-p_{kk}) + \alpha^2K^2(1-p_{kk}))]$. It is easy to verify that $\eta' < \eta$.

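Therefore, we have, for any $k$ and $k'$,

\[
\begin{aligned}
\sqrt{L'_2+1}\cdot\frac{s_2(k) - s'_2(k')}{s'_2(k')}
&= \sqrt{L'_2+1}\cdot\frac{(s_2(k) - s'_2(k'))/(\sqrt{L'_2+1}\,\sigma)}{s'_2(k')/(\sqrt{L'_2+1}\,\sigma)} \\
&= \frac{(s_2(k) - (L'_2+1)\mu)/(\sqrt{L'_2+1}\,\sigma) - (s'_2(k') - (L'_2+1)\mu)/(\sqrt{L'_2+1}\,\sigma)}{(L'_2+1)^{-1/2}\big[(s'_2(k') - (L'_2+1)\mu)/(\sqrt{L'_2+1}\,\sigma)\big] + \mu/\sigma} \\
&\xrightarrow[L'_2\to\infty]{} \frac{\Phi^{-1}(\eta) - \Phi^{-1}(\eta')}{\mu/\sigma}.
\end{aligned}
\]

This implies that

\[
\lim_{L'_2\to\infty} \sqrt{L'_2+1}\cdot\frac{s_2(k) - \min_{k'} s'_2(k')}{\min_{k'} s'_2(k')} = \frac{\Phi^{-1}(\eta) - \Phi^{-1}(\eta')}{\mu/\sigma}.
\]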

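Since $s_2(k) \ge s'_2(k)$, by the definition of $s_2^{\min}$, we have

\[
0 \le \frac{s_2(k) - s_2^{\min}}{s_2^{\min}} \le \frac{s_2(k) - \min_{k'} s'_2(k')}{\min_{k'} s'_2(k')}.
\]

Hence,

\[
0 \le \limsup_{L'_2\to\infty} \sqrt{L'_2+1}\cdot\frac{s_2(k) - s_2^{\min}}{s_2^{\min}} \le \lim_{L'_2\to\infty} \sqrt{L'_2+1}\cdot\frac{s_2(k) - \min_{k'} s'_2(k')}{\min_{k'} s'_2(k')} = \frac{\Phi^{-1}(\eta) - \Phi^{-1}(\eta')}{\mu/\sigma}.
\]

Therefore, we arrive at

\[
\limsup_{L'_2\to\infty} \sqrt{L'_2+1}\cdot\frac{s_2(k) - s_2^{\min}}{s_2^{\min}} = c,
\]

where $c$ is a nonnegative constant.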

When the Markov chain W is cyclic, we can apply the same argument as used in the proof of

Proposition 2 to obtain the convergence result. We omit the details here.
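To make the constant in the two-stage bound concrete, the following Python sketch evaluates $\eta$, $\eta'$, and the limit $(\Phi^{-1}(\eta) - \Phi^{-1}(\eta'))/(\mu/\sigma)$ from the formulas in this proof. All numerical inputs and function names are hypothetical, and $H_n = h_n + \dots + h_N$ is assumed as in Appendix D.

```python
from statistics import NormalDist


def critical_ratios(b, H1, H2, H3, alpha, K, p_kk):
    """Return (eta, eta_prime) as defined in the proof for N = 2."""
    eta = (b + H3) / (b + H2)
    weight = 1 + alpha * K * (2 - p_kk) + alpha**2 * K**2 * (1 - p_kk)
    eta_prime = (b + H3) / ((b + H1) * weight)
    return eta, eta_prime


def gap_limit(eta, eta_prime, mu, sigma):
    """(Phi^{-1}(eta) - Phi^{-1}(eta')) / (mu / sigma), the upper limit on c."""
    phi_inv = NormalDist().inv_cdf
    return (phi_inv(eta) - phi_inv(eta_prime)) / (mu / sigma)


if __name__ == "__main__":
    # Hypothetical data: b = 9, h1 = 1, h2 = 0.5, so H1 = 1.5, H2 = 0.5, H3 = 0.
    eta, eta_prime = critical_ratios(9.0, 1.5, 0.5, 0.0, alpha=0.5, K=3, p_kk=0.4)
    print(eta, eta_prime, gap_limit(eta, eta_prime, mu=4.0, sigma=3.0))
```

Because $H_1 \ge H_2$ and the weight multiplying $(b+H_1)$ is at least one, $\eta'$ never exceeds $\eta$, so the computed limit is nonnegative.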

Appendix B: A Two-State Example for the Single-Stage System

To illustrate our bounds and heuristic solutions, let us consider a single-stage system with a two-state MMD process. For ease of exposition, we write the transition matrix of W as

\[
P = \begin{pmatrix} 1-p & p \\ q & 1-q \end{pmatrix}.
\]

Its stationary distribution is $\pi = (q/(p+q),\, p/(p+q))$. According to Corollary 1, the solution upper bounds are just the myopic base-stock levels $s(k)$, which solve (2). Without loss of generality, we assume $s(1) < s(2)$. Thus, the optimal solution for state 1 is given by $s^*(1) = s(1) = s^{\min} = \underline{s}(1) = s^a(1)$. It remains to determine the solution lower bound for state 2, $\underline{s}(2)$, and the heuristic solution $s^a(k)$.

Given the two demand states, there are two possible state sequences: $m_1 = (1,2)$ and $m_2 = (2,1)$. It is easy to verify that $P^{(1)}_{m_1} = 1-p$ and $P^{(1)}_{m_2} = 1-q$, so that $\beta_1(m_1) = \alpha_1(m_1) = 1/p$ and $\beta_1(m_2) = \alpha_1(m_2) = 1/q$. From Corollary 1, $\underline{s}(2)$ is the solution of (12), which in this case reduces to

\[
q\cdot\partial_y G(1,y) + p\cdot\partial_y G(2,y) = 0.
\]

This is equivalent to

\[
\frac{q}{p+q}\cdot F_{1,L}(y) + \frac{p}{p+q}\cdot F_{2,L}(y) = \frac{b}{b+h} \quad \text{or} \quad F_{\pi,L}(y) = \frac{b}{b+h}.
\]

Comparing the above equation with (2), we can see that s(2) is a myopic base-stock level with an

“effective” lead time demand distribution, which is a weighted average of the lead time demand

distributions of states 1 and 2 (or the “stationary” lead time demand distribution).
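The stationary weights and the "effective" distribution equation above can be checked numerically. The sketch below is illustrative only: it assumes exponential lead time demand distributions (the $L = 0$ case discussed later in this appendix), and the function names and parameter values are hypothetical.

```python
from math import exp, log


def stationary_two_state(p, q):
    """Stationary distribution of P = [[1-p, p], [q, 1-q]]."""
    return (q / (p + q), p / (p + q))


def solve_increasing(f, lo, hi, tol=1e-10):
    """Bisection root of an increasing function f with f(lo) < 0 <= f(hi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def lower_bound_state2(p, q, cdf1, cdf2, b, h):
    """Solve pi_1*F_1(y) + pi_2*F_2(y) = b/(b+h) for y."""
    pi1, pi2 = stationary_two_state(p, q)
    target = b / (b + h)
    return solve_increasing(
        lambda y: pi1 * cdf1(y) + pi2 * cdf2(y) - target, 0.0, 1e3
    )


if __name__ == "__main__":
    # Hypothetical inputs: p = 0.3, q = 0.2, theta1 = 1, theta2 = 0.5, b = 9, h = 1.
    cdf1 = lambda y: 1 - exp(-1.0 * y)
    cdf2 = lambda y: 1 - exp(-0.5 * y)
    print(stationary_two_state(0.3, 0.2))
    print(lower_bound_state2(0.3, 0.2, cdf1, cdf2, 9.0, 1.0))
```

With these inputs the computed lower bound falls between the two myopic levels $\ln(1+b/h)/\theta_1$ and $\ln(1+b/h)/\theta_2$, as expected for a mixture of the two distributions.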

Next, we illustrate how our heuristic solution works. With the assumption $s(1) < s(2)$, the heuristic sequence is $m = (1,2)$. Therefore, by equation (13), $s^a(2)$ is given by

\[
\partial_y G(2,y) + \frac{q}{2p}\cdot F_2(y - s(1))\,\big(\partial_y G(1,y)\big)^{+} = 0,
\]

which is equivalent to

\[
F_{2,L}(y) + \frac{q}{2p}\cdot F_2(y - s(1)) \left( F_{1,L}(y) - \frac{b}{b+h} \right)^{+} = \frac{b}{b+h}. \tag{A10}
\]

To gain further insight, we next assume that the demand distribution for state $k$ is exponential with mean $1/\theta_k$, with $\theta_1 > \theta_2$. We also assume $L = 0$ to allow for explicit expressions. This yields $F_k(y) = F_{k,0}(y) = 1 - e^{-\theta_k y}$, $\partial_y G(k,y) = h - (h+b)e^{-\theta_k y}$, $s(k) = \ln(1+b/h)/\theta_k$ for $k = 1,2$, and $s(1) < s(2)$. Thus, $s^*(1) = s(1) = \underline{s}(1) = s^a(1) = \ln(1+b/h)/\theta_1$.

To determine the optimal solution for state 2, note that $s^*(2) > s^*(1)$. From (3), using the Laplace transform technique described in Appendix C, we can show that $s^*(2)$ is determined by the following first-order condition:

\[
\partial_y G^{[2]}(2,y) = h - (h+b)e^{-\theta_2 y} + \frac{qh}{p}\cdot\left( 1 - \frac{p\theta_1 e^{-\theta_2(y-s^*(1))} - \theta_2 e^{-p\theta_1(y-s^*(1))}}{p\theta_1 - \theta_2} \right) = 0. \tag{A11}
\]

From (A10), we can show that the heuristic $s^a(2)$ solves the following equation: for $y \ge s^*(1)$,

\[
h - (h+b)e^{-\theta_2 y} + \frac{qh}{p}\cdot\frac{\big(1 - e^{-\theta_1(y-s^*(1))}\big)\big(1 - e^{-\theta_2(y-s^*(1))}\big)}{2} = 0. \tag{A12}
\]

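The difference between the above two first-order conditions lies in the last terms, which can be approximated respectively as follows by using Taylor expansion to the second order:

\[
1 - \frac{p\theta_1 e^{-\theta_2(y-s^*(1))} - \theta_2 e^{-p\theta_1(y-s^*(1))}}{p\theta_1 - \theta_2} \approx \frac{1}{2}\,p\theta_1\theta_2\,(y - s^*(1))^2 + o\big((y - s^*(1))^2\big),
\]

and

\[
\frac{\big(1 - e^{-\theta_1(y-s^*(1))}\big)\big(1 - e^{-\theta_2(y-s^*(1))}\big)}{2} \approx \frac{1}{2}\,\theta_1\theta_2\,(y - s^*(1))^2 + o\big((y - s^*(1))^2\big).
\]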

Therefore, sa(2) is a close approximation to s∗(2) to the second order if p is close to one. This

condition implies that there is a high likelihood that the demand will stay in the high state—even

if it transitions back to state 1 in one period, it will jump to state 2 with a high probability in the

next period. The following proposition further characterizes how sa(2) and s∗(2) are affected by

the Markov chain transition probability q:

Proposition 5 Under a two-state MMD with exponential density and zero lead time, both the

optimal solution s∗(2) and the heuristic sa(2) are decreasing in transition probability q.

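Proof Denote $s^* = s^*(1)$ for simplicity. Because $s^*(2) \ge s^*$, we can introduce the following change of variable: $y = s^* + x$ with $x \ge 0$. Substituting this into (A11), we obtain

\[
\begin{aligned}
\partial_x G^{[2]}(2, s^*+x)
&= \frac{p+q}{p}h - \left( h + b + h\,\frac{q\theta_1}{p\theta_1-\theta_2}\,e^{\theta_2 s^*} \right) e^{-\theta_2(x+s^*)} + h\,\frac{q\theta_2\, e^{p\theta_1 s^*}}{p^2\theta_1 - p\theta_2}\, e^{-p\theta_1(x+s^*)} \\
&= \frac{p+q}{p}h - (h+b)e^{-\theta_2(x+s^*)} - h\,\frac{q\theta_1}{p\theta_1-\theta_2}\,e^{-\theta_2 x} + h\,\frac{q\theta_2}{p^2\theta_1 - p\theta_2}\,e^{-p\theta_1 x}.
\end{aligned}
\]

By substituting $s^* = \ln(1+b/h)/\theta_1$ into the above equation, we have

\[
\partial_x G^{[2]}(2, s^*+x) = \frac{p+q}{p}h - (h+b)e^{-\theta_2 x}\left(\frac{h}{h+b}\right)^{\theta_2/\theta_1} - h\,\frac{q\theta_1}{p\theta_1-\theta_2}\,e^{-\theta_2 x} + h\,\frac{q\theta_2}{p^2\theta_1 - p\theta_2}\,e^{-p\theta_1 x}.
\]

Taking the derivative of the above expression with respect to $q$, we obtain

\[
\partial_q\big(\partial_x G^{[2]}(2, s^*+x)\big) = \frac{h}{p} - h\,\frac{\theta_1}{p\theta_1-\theta_2}\,e^{-\theta_2 x} + h\,\frac{\theta_2}{p^2\theta_1 - p\theta_2}\,e^{-p\theta_1 x}.
\]

Note that $\partial_q\big(\partial_x G^{[2]}(2, s^*+x)\big)\big|_{x=0} = 0$, $\partial_q\big(\partial_x G^{[2]}(2, s^*+x)\big)\big|_{x=+\infty} = h/p > 0$, and

\[
\partial_x\Big(\partial_q\big(\partial_x G^{[2]}(2, s^*+x)\big)\Big) = \frac{h\theta_1\theta_2}{p\theta_1-\theta_2}\left( e^{-\theta_2 x} - e^{-p\theta_1 x} \right) \ge 0 \quad \text{for } x \ge 0.
\]

Therefore we must have $\partial_q\big(\partial_x G^{[2]}(2, s^*+x)\big) \ge 0$ for $x \ge 0$, which implies that $s^*(2)$ decreases as $q$ increases.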

Next we examine our heuristic. Since $s^a(2) \ge s^*$, we can also apply the above change of variable to (A12). Therefore, $s^a(2)$ is given by the solution to

\[
h - (h+b)e^{-\theta_2(x+s^*)} + \frac{qh}{p}\cdot\frac{(1 - e^{-\theta_1 x})(1 - e^{-\theta_2 x})}{2} = 0.
\]

Clearly the left-hand side of the above equation is increasing in q, which implies that our heuristic

solution for state 2 also decreases as q increases.
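The two first-order conditions (A11) and (A12) are easy to solve numerically, which gives a quick way to see the monotonicity in $q$. The following Python sketch uses bisection on $[s(1), s(2)]$, where both left-hand sides change sign. The parameter values and function names are hypothetical, and the code assumes $p\theta_1 \ne \theta_2$.

```python
from math import exp, log


def s_myopic(theta, b, h):
    """Myopic level ln(1 + b/h)/theta for exponential demand and L = 0."""
    return log(1 + b / h) / theta


def foc_optimal(y, p, q, th1, th2, b, h):
    """Left-hand side of (A11), valid for y >= s*(1)."""
    x = y - s_myopic(th1, b, h)
    extra = 1 - (p * th1 * exp(-th2 * x) - th2 * exp(-p * th1 * x)) / (p * th1 - th2)
    return h - (h + b) * exp(-th2 * y) + (q * h / p) * extra


def foc_heuristic(y, p, q, th1, th2, b, h):
    """Left-hand side of (A12), valid for y >= s*(1)."""
    x = y - s_myopic(th1, b, h)
    extra = (1 - exp(-th1 * x)) * (1 - exp(-th2 * x)) / 2
    return h - (h + b) * exp(-th2 * y) + (q * h / p) * extra


def solve_state2(foc, p, q, th1, th2, b, h, tol=1e-10):
    """Bisection on [s(1), s(2)]; both conditions are increasing in y there."""
    lo, hi = s_myopic(th1, b, h), s_myopic(th2, b, h)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if foc(mid, p, q, th1, th2, b, h) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Hypothetical inputs: p = 0.9, theta1 = 1, theta2 = 0.5, b = 9, h = 1.
    for q in (0.1, 0.3, 0.6, 0.9):
        opt = solve_state2(foc_optimal, 0.9, q, 1.0, 0.5, 9.0, 1.0)
        heu = solve_state2(foc_heuristic, 0.9, q, 1.0, 0.5, 9.0, 1.0)
        print(q, round(opt, 4), round(heu, 4))
```

Printing both roots over a grid of $q$ values should show both columns decreasing, in line with Proposition 5.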

Appendix C: Gamma Demand Distribution and Laplace Transform

Recall from (3) and (4) that the exact algorithm requires convolutions involving the demand

distributions Fk. In this section, we assume the demand distributions under different demand states

all belong to the gamma distribution family, with the probability density function given by (28).

We show that, in this case, the optimal policy can be computed relatively easily by leveraging

Laplace transformation and its inverse.

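Denote the Laplace transform of $\partial_y R^i(k,y)$ by

\[
r^i(k,\lambda) = \int_0^\infty e^{-\lambda y}\, dR^i(k,y),
\]

and denote the Laplace transform of $\partial_y G^i(i^*,y)$ by

\[
g^i(i^*,\lambda) = \int_0^\infty e^{-\lambda y}\, dG^i(i^*,y).
\]

Given the optimal demand state sequence $m^* = (1^*, \dots, K^*)$, define $r^i(\lambda) = [r^i(1^*,\lambda), \dots, r^i(i^*,\lambda)]^T$ and $g^i(\lambda) = [0, \dots, 0, g^i(i^*,\lambda)]^T$. Because the Laplace transform of a gamma density $f(x \mid n_k, \theta_k)$ is given by $\theta_k^{n_k}/(\lambda+\theta_k)^{n_k}$, the Laplace transform of $D^{(i)}_{m^*}(x)$ defined in (6) is given by

\[
d^{(i)}_{m^*}(\lambda) =
\begin{pmatrix}
\dfrac{\theta_{1^*}^{n_{1^*}}}{(\lambda+\theta_{1^*})^{n_{1^*}}} & \cdots & 0 \\
\vdots & \ddots & \vdots \\
0 & \cdots & \dfrac{\theta_{i^*}^{n_{i^*}}}{(\lambda+\theta_{i^*})^{n_{i^*}}}
\end{pmatrix}.
\]

Under the Laplace transform, and noting that $I^{(i)} - d^{(i)}_{m^*}(\lambda)P^{(i)}_{m^*}$ is strictly diagonally dominant, (7) can be simplified to

\[
r^i(\lambda) = \left( I^{(i)} - d^{(i)}_{m^*}(\lambda)P^{(i)}_{m^*} \right)^{-1} g^i(\lambda).
\]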

Here, because d(i)m∗(λ) has a simple form, the inverse Laplace transform of ri(λ) can be fairly easily

determined by Laplace transform table or by standard software packages such as Matlab. Thus,

we can obtain ∂yRi(k, y) from the inverse Laplace transform, and substitute it into (3) to compute

the optimal policy.

It is worth emphasizing here that the computation of the solution bounds and heuristic in Corollary 2 does not require the function $\partial_y R^i(k,y)$; it only involves evaluation of simple newsvendor derivative functions. Thus, unlike the optimal policy, the solution bounds and heuristic proposed in this paper can be computed for other general demand distributions.
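The closed form $\theta^{n}/(\lambda+\theta)^{n}$ for the Laplace transform of a gamma density can be sanity-checked by numerical integration. The sketch below uses a composite Simpson rule on a truncated interval; the truncation point, step count, and function names are illustrative assumptions, and the shape $n$ is assumed to be a positive integer as in the numerical examples.

```python
from math import exp, gamma


def gamma_pdf(x, n, theta):
    """Gamma density with integer shape n >= 1 and rate theta."""
    if x < 0:
        return 0.0
    return theta**n * x ** (n - 1) * exp(-theta * x) / gamma(n)


def laplace_numeric(n, theta, lam, upper=60.0, steps=60_000):
    """Composite Simpson approximation of int_0^upper e^{-lam*x} f(x) dx."""
    width = upper / steps              # steps must be even for Simpson's rule
    total = 0.0
    for j in range(steps + 1):
        x = j * width
        weight = 1 if j in (0, steps) else (4 if j % 2 else 2)
        total += weight * exp(-lam * x) * gamma_pdf(x, n, theta)
    return total * width / 3


def laplace_closed_form(n, theta, lam):
    """theta^n / (lam + theta)^n."""
    return (theta / (lam + theta)) ** n


if __name__ == "__main__":
    for n, theta in [(1, 1.0), (2, 2.0), (3, 3.0), (2, 0.2)]:
        lam = 0.5
        print(n, theta, laplace_numeric(n, theta, lam), laplace_closed_form(n, theta, lam))
```

The truncation at `upper` is adequate only when $e^{-(\lambda+\theta)\cdot\text{upper}}$ is negligible, so smaller rates would need a longer interval.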

Appendix D: Single-Stage Approximation under Stationary Demand

Shang and Song (2003) established solution bounds for a serial inventory system with i.i.d. demand, which is a special case of our problem. Our results relate to theirs as follows. Let $\Omega_N = (N, (h_i, L_i)_{i=1}^{N}, b, D)$ denote the original N-stage serial system. Shang and Song (2003) showed that under i.i.d. demand, the echelon n problem is equivalent to an n-stage serial system $\Omega_n = (n, (h_i, L_i)_{i=1}^{n}, b + \sum_{i=n+1}^{N} h_i, D)$. That is, the echelon n problem is the same as the original N-stage system truncated at stage n, except that the backorder cost rate is modified to be $b + \sum_{i=n+1}^{N} h_i$. They also showed that (the cost of) $\Omega_n$ is bounded by a lower bound system $\Omega_n^{l} = (n, (h_i^{l}, L_i)_{i=1}^{n}, b + \sum_{i=n+1}^{N} h_i, D)$, with $h_n^{l} = h_n$ and $h_i^{l} = 0$ for $i < n$, and an upper bound system $\Omega_n^{u} = (n, (h_i^{u}, L_i)_{i=1}^{n}, b + \sum_{i=n+1}^{N} h_i, D)$, with $h_n^{u} = \sum_{i=1}^{n} h_i$ and $h_i^{u} = 0$ for $i < n$. They further showed that the optimal solution to the lower (upper) bound system is an upper (lower) bound for the optimal echelon base-stock solution of the original system. Because the two bounding systems are equivalent to single-stage systems, where the myopic base-stock level is optimal, their solution bounds are simple single-stage myopic base-stock levels, and one can regard these solutions as single-stage approximations.

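We can adapt the same two bounding systems to our problem with MMD. Specifically, the derivative of the single-period cost function for demand state $k$ in the lower bound system $\Omega_n^{l}$ is given by

\[
\begin{aligned}
\partial_y G_n(k,y \mid \Omega_n^{l}) &= -\Big( b + \sum_{i=n+1}^{N} h_i \Big) + \Big( b + \sum_{i=n+1}^{N} h_i + h_n^{l} \Big) F_{k,L'_n}(y) \\
&= -(b+H_{n+1}) + (b+H_n)F_{k,L'_n}(y).
\end{aligned} \tag{A13}
\]

And its counterpart in the upper bound system $\Omega_n^{u}$ is given by

\[
\begin{aligned}
\partial_y G_n(k,y \mid \Omega_n^{u}) &= -\Big( b + \sum_{i=n+1}^{N} h_i \Big) + \Big( b + \sum_{i=n+1}^{N} h_i + h_n^{u} \Big) F_{k,L'_n}(y) \\
&= -(b+H_{n+1}) + (b+H_1)F_{k,L'_n}(y).
\end{aligned} \tag{A14}
\]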

Observe that (A13) is exactly the same as the derivative lower bound in Proposition 3 (a). On the

other hand, (A14) is only the first term in the derivative upper bound in Proposition 3 (b). Under

the i.i.d. demand process, i.e., K = 1, the ∆ terms in Proposition 3(b) vanish, and we recover

the solution bound results in Shang and Song (2003). However, their bounding approach cannot

be fully extended to the general MMD process: While the myopic base-stock level to the lower

bound system remains an upper bound for the optimal solution of the original system, the myopic

base-stock level to the upper bound system is no longer guaranteed to be a lower bound (See Table

1 for the counterexamples marked with a “†” sign).
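Setting (A13) and (A14) to zero gives two newsvendor fractile equations, $F_{k,L'_n}(y) = (b+H_{n+1})/(b+H_n)$ and $F_{k,L'_n}(y) = (b+H_{n+1})/(b+H_1)$. The Python sketch below solves both for a user-supplied lead time demand CDF. The function names and the exponential CDF in the demo are hypothetical stand-ins, not the paper's demand model.

```python
from math import exp, log


def echelon_cost_sums(h):
    """Return H with H[n] = h_n + ... + h_N (1-based n) and H[N+1] = 0."""
    N = len(h)
    H = [0.0] * (N + 2)
    for n in range(N, 0, -1):
        H[n] = H[n + 1] + h[n - 1]
    return H


def fractile(cdf, ratio, hi=1e4, tol=1e-10):
    """Bisection solve of cdf(y) = ratio for an increasing cdf on [0, hi]."""
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid) < ratio:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def bound_system_levels(h, b, n, cdf):
    """Myopic levels of the (A13) and (A14) systems for one demand state."""
    H = echelon_cost_sums(h)
    level_a13 = fractile(cdf, (b + H[n + 1]) / (b + H[n]))
    level_a14 = fractile(cdf, (b + H[n + 1]) / (b + H[1]))
    return level_a13, level_a14


if __name__ == "__main__":
    # Hypothetical data: h = (h_1, h_2) = (1, 0.5), b = 9, echelon n = 2,
    # exponential lead time demand with rate 0.1.
    print(bound_system_levels([1.0, 0.5], 9.0, 2, lambda y: 1 - exp(-0.1 * y)))
```

Because $H_1 \ge H_n$, the (A14) level never exceeds the (A13) level; under MMD only the (A13) level keeps its interpretation as a valid bound, as noted above.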

Appendix E: Additional Numerical Results

Five-State Demand Cases

As a robustness check, we conduct an additional numerical study for two five-state demand cases.

Each of the five states follows a gamma distribution, with parameters given by (ni, θi) = (1,1),

(2,2), (3,3), (2,0.2), and (2,0.2) for i= 1,2,3,4,5, representing demand means of 1, 1, 1, 10, and

10, respectively. We set the demand distributions for states 4 and 5 to be the same in these two

cases, to illustrate the fact that the optimal echelon base-stock levels for these two states may

be different (because the lead time demand distributions starting from these two states may be

different under MMD). The transition matrices for these two cases are given as follows:

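Cyclic demand:

\[
\begin{pmatrix}
0 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1 \\
1 & 0 & 0 & 0 & 0
\end{pmatrix},
\]

Noncyclic demand:

\[
\begin{pmatrix}
0.10 & 0.10 & 0.20 & 0.40 & 0.20 \\
0.35 & 0.10 & 0.20 & 0.10 & 0.25 \\
0.10 & 0.20 & 0.20 & 0.10 & 0.40 \\
0.50 & 0.05 & 0.25 & 0.15 & 0.05 \\
0.30 & 0.05 & 0.25 & 0.30 & 0.10
\end{pmatrix}.
\]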
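As a quick consistency check on the five-state setup, the following Python sketch recomputes the state-wise demand means $n_i/\theta_i$, verifies that each row of the two transition matrices sums to one, and estimates the long-run state frequencies of the noncyclic chain by power iteration. The helper names and the iteration count are illustrative choices.

```python
GAMMA_PARAMS = [(1, 1.0), (2, 2.0), (3, 3.0), (2, 0.2), (2, 0.2)]  # (n_i, theta_i)

CYCLIC = [
    [0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 1],
    [1, 0, 0, 0, 0],
]

NONCYCLIC = [
    [0.10, 0.10, 0.20, 0.40, 0.20],
    [0.35, 0.10, 0.20, 0.10, 0.25],
    [0.10, 0.20, 0.20, 0.10, 0.40],
    [0.50, 0.05, 0.25, 0.15, 0.05],
    [0.30, 0.05, 0.25, 0.30, 0.10],
]


def gamma_means(params):
    """Mean of a gamma distribution with shape n and rate theta is n/theta."""
    return [n / theta for n, theta in params]


def step(dist, P):
    """One step of the chain: row vector dist times matrix P."""
    K = len(P)
    return [sum(dist[i] * P[i][j] for i in range(K)) for j in range(K)]


def stationary(P, iters=1_000):
    """Power iteration from the uniform distribution (fine for the noncyclic chain)."""
    dist = [1.0 / len(P)] * len(P)
    for _ in range(iters):
        dist = step(dist, P)
    return dist


if __name__ == "__main__":
    means = gamma_means(GAMMA_PARAMS)
    pi = stationary(NONCYCLIC)
    print(means)
    print([round(x, 4) for x in pi])
    print(sum(w * m for w, m in zip(pi, means)))  # long-run mean demand per period
```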

We set the cost parameters to be the same as in our base case. Table 3 below presents the detailed

numerical results for the two five-state demand cases. As we can see, the numerical insights are

essentially the same as those observed in the three-state demand cases (Table 1).

Internal Fill Rates

We design an additional numerical study to shed light on the internal fill rate under the optimal and

heuristic policies (i.e., s∗n(k), svn(k), and san(k|h[1,n])). For this purpose, we simulate the two-stage

inventory system for 50,000 periods to measure the realized fill rates at both stages. In each period,

we measure the fill rate (= fulfilled order quantity/placed order quantity) at each stage and then

average the fill rates across 50,000 periods. We present the results for the base case of three-state

MMD in Table 4. The numerical insights from the five-state demand cases are essentially the same,

and we omit them for brevity.
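The fill-rate measurement described above reduces to averaging a per-period ratio. A minimal Python sketch follows; the function name is hypothetical, and skipping periods in which no order is placed is an assumption made here to avoid division by zero, not something stated in the study.

```python
def average_fill_rate(placed, fulfilled):
    """Average over periods of fulfilled order quantity / placed order quantity.

    Periods with a zero placed quantity are skipped (an assumption of this sketch).
    """
    ratios = [f / q for q, f in zip(placed, fulfilled) if q > 0]
    return sum(ratios) / len(ratios) if ratios else float("nan")


if __name__ == "__main__":
    # Hypothetical three-period record for one stage.
    print(average_fill_rate([10.0, 20.0, 0.0], [10.0, 10.0, 0.0]))
```

Note that this is an average of per-period ratios, which generally differs from the ratio of total fulfilled to total placed quantity.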

From Table 4, we observe that the fill rates under the enhanced heuristic san(k|h[1,n]) are close to

those under the optimal policy s∗n(k), except for the cyclic demand case with lead times (L1,L2) =

(1,1). In this case, as can be seen from Table 1, the Stage 2 heuristic policy is the same across

different states (because the lead time demand distributions are the same across different states),

whereas the optimal policy varies depending on the state. This explains why the Stage 2 fill rate

under the heuristic policies is significantly lower than that under the optimal policy.

We observe that, under the optimal policy, the Stage 2 fill rate need not be high, especially when the Stage 1 lead time is relatively long. However, the difference between the internal and external fill rates is not as significant as what is observed under the i.i.d. demand. The behavior of the system also differs depending on whether W is cyclic. In the noncyclic demand case with lead times (L1,L2) = (3,1), the realized fill rate at Stage 2 is only 78%. However, when Stage 2 has a longer lead time than Stage 1, i.e., (L1,L2) = (1,3), the realized fill rate at Stage 2 is 98% under cyclic demand and 93% under noncyclic demand. This observation suggests that the high internal fill rate assumption under the decoupling heuristic holds when the downstream lead time is shorter than the upstream lead time (e.g., Abhyankar and Graves 2001). However, this assumption may be problematic when the downstream lead time is longer than the upstream lead time.

Table 3 Comparison of policies in a two-stage system under the five-state demand cases. In each demand case, the columns are s*n(k), the lower and upper solution bounds sn(k), svn(k), and san(k).

                Cyclic demand                           | Noncyclic demand
(L1,L2) n  k  s*n(k)   lower   upper  svn(k)  san(k)  | s*n(k)   lower   upper  svn(k)  san(k)
(1, 1)  1  1    4.47    4.03    4.63    4.63†   4.49  |  23.12   21.39   23.40   23.40†  23.20
        1  2    3.88    3.88    3.88    3.88    3.88  |  20.01   20.01   20.01   20.01   20.01
        1  3   26.44   13.17   26.44   26.44   26.44  |  22.03   20.92   22.22   22.22†  22.05
        1  4   34.21   23.74   40.85   40.85†  38.84  |  30.15   24.06   31.58   31.58†  30.48
        1  5   20.65   13.21   26.50   26.50†  24.25  |  32.91   24.98   35.04   35.04†  33.61
        2  1    6.14    5.09    6.50    5.40    5.40  |  28.97   24.83   36.92   28.98†  28.91
        2  2   25.02    6.50   31.35   25.03†  25.03  |  28.18   24.47   35.25   27.31   27.31
        2  3   34.15    7.37   46.58   38.85†  38.85  |  29.63   24.75   36.57   28.60   28.51
        2  4   37.45    7.41   46.63   38.89†  36.90  |  38.27   25.96   46.71   38.02   36.76
        2  5   27.34    6.72   31.42   25.09   22.88  |  39.98   26.09   49.05   40.13†  38.75
Soln error %       -   58.5%   20.5%    9.9%    8.5%  |      -   19.0%   14.8%    2.2%    2.0%
Cost error %   66.46       -       -    4.3%    2.2%  |  79.23       -       -    0.6%    0.6%
(1, 3)  1  1    4.47    4.03    4.63    4.63†   4.49  |  23.12   21.39   23.40   23.20†  23.40
        1  2    3.88    3.88    3.88    3.88    3.88  |  20.01   20.01   20.01   20.01   20.01
        1  3   26.44   13.17   26.44   26.44   26.44  |  22.03   20.92   22.22   22.22†  22.05
        1  4   34.21   23.74   40.85   40.85†  38.84  |  30.15   24.06   31.58   31.58†  30.48
        1  5   20.65   13.21   26.50   26.50†  24.25  |  32.91   24.98   35.04   35.04†  33.61
        2  1   36.35   27.14   48.69   40.94†  40.94  |  47.85   34.15   54.72   44.87†  44.77
        2  2   42.87   27.44   48.69   40.94   40.94  |  46.07   33.61   52.79   42.98   42.98
        2  3   48.41   27.44   48.69   40.94   40.94  |  47.73   33.99   54.18   44.32   44.23
        2  4   47.81   27.44   48.69   40.94   40.94  |  55.72   35.49   63.27   52.90   51.79
        2  5   43.07   27.14   48.69   40.94   40.94  |  57.58   35.62   65.52   54.97   53.67
Soln error %       -   36.8%   12.2%   11.6%   10.1%  |      -   25.8%   10.3%    4.9%    4.9%
Cost error %   72.90       -       -    7.7%    6.2%  |  89.63       -       -    1.0%    1.2%
(3, 1)  1  1   28.57   28.57   28.57   28.57   28.57  |  40.79   39.58   40.93   40.93†  40.82
        1  2   42.53   31.96   42.90   42.90†  42.90  |  38.81   38.81   38.81   38.81   38.81
        1  3   41.77   31.97   42.94   42.94†  42.94  |  40.22   39.39   40.32   40.32†  40.20
        1  4   40.17   31.97   42.95   42.95†  40.94  |  47.86   41.83   49.22   49.22†  48.08
        1  5   28.57   28.57   28.57   28.57   28.57  |  49.79   42.30   51.57   51.57†  50.24
        2  1   40.83   33.20   48.69   40.94†  40.94  |  45.13   42.09   54.72   44.87   44.77
        2  2   40.48   33.20   48.69   40.94†  40.94  |  43.33   41.73   52.79   42.98   42.98
        2  3   39.41   33.45   48.69   40.94†  40.94  |  45.20   42.01   54.18   44.32   44.23
        2  4   44.62   33.48   48.69   40.94   40.94  |  52.96   43.06   63.27   52.90   51.79
        2  5   43.91   33.44   48.69   40.94   40.94  |  54.65   43.18   65.52   54.97†  53.67
Soln error %       -   18.2%    9.9%    3.3%    2.8%  |      -    9.8%   11.5%    1.1%    1.0%
Cost error %   87.90       -       -    0.6%    0.6%  | 108.59       -       -    0.4%    0.1%

Table 4 Fill rates in a two-stage system under the three-state demand cases.

            Cyclic demand                       | Noncyclic demand
(L1,L2) n   s*n(k)  svn(k)  san(k|h[1,n])       | s*n(k)  svn(k)  san(k|h[1,n])
(1, 1)  1     88%     89%     89%               |   89%     90%     89%
        2     88%     60%     77%               |   88%     85%     86%
(1, 3)  1     90%     90%     90%               |   89%     89%     88%
        2     98%     87%     96%               |   93%     90%     89%
(3, 1)  1     91%     92%     91%               |   92%     92%     92%
        2     80%     77%     81%               |   78%     74%     75%

The Bullwhip Effect

The bullwhip effect, or the amplification of demand variability propagating from the downstream

to the upstream stages of a supply chain, has been extensively studied in the literature (see Chen

and Lee 2009, 2012 and the references therein). Most studies of the bullwhip effect have focused on

either a single-stage system or a two-stage system with decentralized decision making. In a serial

supply chain with centralized decision making, it is known that a stationary echelon base-stock

policy is optimal under i.i.d. demand (Clark and Scarf 1960). As a result, the replenishment order

at each stage in each period is simply the replacement of the demand in the previous period, so

there is no amplification of demand variability in such a system. However, when demand follows an

MMD process, the optimal inventory policy is state-dependent, and it is not clear whether demand

variability would be amplified or dampened in such a system.

Specifically, we simulate the two-stage inventory system for 50,000 periods to measure the order

variability amplification ratio (i.e., the bullwhip effect ratio) at both stages. We first calculate the

sample variances of external demand, orders placed by Stage 1, and orders placed by Stage 2. Then,

the bullwhip effect ratio for Stage 1 is determined by the ratio of Stage 1 order variance over the

external demand variance, and the bullwhip effect ratio for Stage 2 (or more precisely echelon 2) is

determined by the ratio of Stage 2 order variance over the external demand variance. We present

the results of the base case of three-state MMD in Table 5. The numerical insights of the five-state

demand cases are essentially the same, and we omit them for brevity.
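The bullwhip effect ratio defined above is a ratio of two sample variances, which can be sketched in a few lines of Python. The function name and the toy series are hypothetical; in the study, the inputs would be the simulated order and demand series over 50,000 periods.

```python
from statistics import variance


def bullwhip_ratio(orders, demand):
    """Sample variance of a stage's orders over sample variance of external demand."""
    return variance(orders) / variance(demand)


if __name__ == "__main__":
    # Hypothetical toy series: orders that double every demand quadruple the variance.
    demand = [1.0, 2.0, 3.0, 4.0]
    print(bullwhip_ratio([2.0 * d for d in demand], demand))
    print(bullwhip_ratio(demand, demand))
```

A ratio above one indicates amplification of demand variability at that stage, and a ratio below one indicates dampening.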

Table 5 Bullwhip effect ratio in a two-stage system under the three-state demand cases.

            Cyclic demand                       | Noncyclic demand
(L1,L2) n   s*n(k)  svn(k)  san(k|h[1,n])       | s*n(k)  svn(k)  san(k|h[1,n])
(1, 1)  1    0.85    0.98    1.09               |  0.63    0.69    0.64
        2    0.64    1.00    1.00               |  0.70    0.67    0.63
(1, 3)  1    1.15    1.02    1.19               |  0.63    0.68    0.63
        2    0.75    0.78    0.91               |  0.64    0.65    0.62
(3, 1)  1    0.68    0.77    0.66               |  0.63    0.67    0.63
        2    0.49    0.76    0.89               |  0.65    0.65    0.62

From Table 5, we observe that the bullwhip effect ratios under both heuristics closely track

those under the optimal policy for the noncyclic demand. When the demand is cyclic, the bullwhip

effect ratios under both heuristics tend to be higher than those under the optimal policy. This

observation accords with our earlier observations that the heuristics tend to perform better when

demand is noncyclic.

An interesting observation from the above table is that the bullwhip effect ratio under the optimal

policy is less than one in most cases, indicating a sweeping demand variability dampening effect.

Specifically, for noncyclic demand, the bullwhip effect ratios for Stages 1 and 2 are consistently

below 0.70. To further investigate this phenomenon, we compute the autocorrelation of sequential

demand under the noncyclic demand. We find that ρ(1) = −0.122, ρ(2) = −0.002, ρ(3) = 0.010,

ρ(4) = −0.003, and ρ(5) = 0.000, where ρ(i) = Corr(D(t), D(t+i)). Thus, there exists only a weak

negative correlation among sequential demands, which cannot fully account for the large magnitude

of the variability dampening effect. Chen and Lee (2012) showed that the existence of system

capacity constraints can help dampen the bullwhip effect. In our system with MMD, there is no

capacity constraint, yet we still observe that the bullwhip effect can be significantly dampened

under the optimal policy. This suggests that the state-dependent inventory policy under MMD

may have an inherent smoothing effect on the order variability. Finding the root cause for this

intriguing observation is beyond the scope of this paper, and we shall leave it for future research.
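The demand autocorrelations ρ(i) reported above can be estimated from a simulated demand path. The Python sketch below simulates a Markov-modulated gamma demand process and computes sample autocorrelations. It is illustrative only: the five-state noncyclic matrix from this appendix is used as a stand-in because the three-state base-case parameters are not given in this section, the timing convention (draw demand, then transition) is an assumption, and the function names are hypothetical, so the output is not expected to reproduce the reported values.

```python
import random

GAMMA_PARAMS = [(1, 1.0), (2, 2.0), (3, 3.0), (2, 0.2), (2, 0.2)]  # (n_i, theta_i)
NONCYCLIC = [
    [0.10, 0.10, 0.20, 0.40, 0.20],
    [0.35, 0.10, 0.20, 0.10, 0.25],
    [0.10, 0.20, 0.20, 0.10, 0.40],
    [0.50, 0.05, 0.25, 0.15, 0.05],
    [0.30, 0.05, 0.25, 0.30, 0.10],
]


def simulate_mmd(P, params, periods, seed=0, state=0):
    """Assumed timing: demand is drawn from the current state, then the state moves."""
    rng = random.Random(seed)
    demand = []
    for _ in range(periods):
        n, theta = params[state]
        demand.append(rng.gammavariate(n, 1.0 / theta))  # second argument is the scale 1/theta
        state = rng.choices(range(len(P)), weights=P[state])[0]
    return demand


def autocorrelation(series, lag):
    """Sample estimate of Corr(D(t), D(t+lag))."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    cov = sum((series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag))
    return cov / var


if __name__ == "__main__":
    d = simulate_mmd(NONCYCLIC, GAMMA_PARAMS, 50_000)
    for lag in range(1, 6):
        print(lag, round(autocorrelation(d, lag), 3))
```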
