optimal stopping problems in operations a …such problems are pervasive in the areas of operations...

OPTIMAL STOPPING PROBLEMS IN OPERATIONS

MANAGEMENT

A DISSERTATION

SUBMITTED TO THE DEPARTMENT OF MANAGEMENT

SCIENCE AND ENGINEERING

AND THE COMMITTEE ON GRADUATE STUDIES

OF STANFORD UNIVERSITY

IN PARTIAL FULFILLMENT OF THE REQUIREMENTS

FOR THE DEGREE OF

DOCTOR OF PHILOSOPHY

Sechan Oh

March 2010

http://creativecommons.org/licenses/by-nc/3.0/us/

This dissertation is online at: http://purl.stanford.edu/yx778cf5733

© 2010 by Se Chan Oh. All Rights Reserved.

Re-distributed by Stanford University under license with the author.

This work is licensed under a Creative Commons Attribution-Noncommercial 3.0 United States License.

ii



http://purl.stanford.edu/yx778cf5733

I certify that I have read this dissertation and that, in my opinion, it is fully adequatein scope and quality as a dissertation for the degree of Doctor of Philosophy.

Ali Ozer, Primary Adviser


Warren Hausman, Co-Adviser


Benjamin Van Roy

Approved for the Stanford University Committee on Graduate Studies.

Patricia J. Gumport, Vice Provost Graduate Education

This signature page was generated electronically upon submission of this dissertation in electronic format. An original signed hard copy of the signature page is on file inUniversity Archives.

iii

Abstract

Optimal stopping problems determine the time to terminate a process to maximize ex-

pected rewards. Such problems are pervasive in the areas of operations management,

marketing, statistics, finance, and economics. This dissertation provides a method

that characterizes the structure of the optimal stopping policy for a general class of

optimal stopping problems. It also studies two important optimal stopping problems

arising in Operations Management.

In the first part of the dissertation, we provide a method to characterize the

structure of the optimal stopping policy for the class of discrete-time optimal stop-

ping problems. Our method characterizes the structure of the optimal policy for some

stopping problems for which conventional methods fail. Our method also simplifies

the analysis of some existing results. Using the method, we determine sufficient condi-

tions that yield threshold or control-band type optimal stopping policies. The results

also help characterize parametric monotonicity of optimal thresholds and provide

bounds for them.

In the second part of the dissertation, we first generalize the Martingale Model

of Forecast Evolution to account for multiple forecasters who forecast demand for

the same product. The result enables us to consistently model the evolution of fore-

casts generated by two forecasters who have asymmetric demand information. Using

the forecast evolution model, we next study a supplier’s problem of eliciting credible

forecast information from a manufacturer when both parties obtain asymmetric de-

mand information over multiple periods. For better capacity planning, the supplier

designs and offers a screening contract that ensures the manufacturer’s credible in-

formation sharing. By delaying to offer this incentive mechanism, the supplier can

iv

obtain more information. This delay, however, may increase (resp., or decrease) the

degree of information asymmetry between the two firms, resulting in a higher (resp.,

or lower) cost of screening. The delay may also increase capacity costs. Considering

all such trade-offs, the supplier has to determine how to design a mechanism to elicit

credible forecast information from the manufacturer and when to offer this incentive

mechanism.

In the last part of the dissertation, we study a manufacturer’s problem of deter-

mining the time to introduce a new product to the market. Conventionally, manu-

facturing firms determine the time to introduce a new product to the market long

before launching the product. The timing decision involves considerable risk because

manufacturing firms are uncertain about competing firms’ market entry timing and

the outcome of production process development activities at the time when they make

the decision. As a solution for reducing such risk, we propose a dynamic market entry

strategy under which the manufacturer makes decisions about market entry timing

and process improvements in response to the evolution of uncertain factors. We show

that the manufacturer can reduce profit variability and increase average profit by em-

ploying this dynamic strategy. Our study also characterizes the industry conditions

under which the dynamic strategy is most effective.

v

Acknowledgements

I would like first thank my advisor Prof. Ozalp Ozer for his insightful advices through-

out my doctoral studies. He has been a great mentor in every aspects of my graduate

life, and this dissertation would have not been possible without his dedicated teaching.

I would also like to thank Prof. Warren H. Hausman for his great support. He has

provided valuable personal advices during difficult times. His own doctoral research

was the most important reference for the forecast evolution model that is presented

in the third chapter of my dissertation.

I am also very grateful to Prof. Benjamin Van Roy, Prof. Sunil Kumar, and Prof.

Thomas A. Weber for serving in the committee of my dissertation defense. They

provided great suggestions to substantially improve this dissertation. They have

also taught me courses such as dynamic programming, economic theory, and revenue

management in the early stage of my graduate study. I also wish to acknowledge and

thank Samsung Scholarship Foundation for financial support.

Graduate school would have been very difficult without my friends. I give special

thanks to my MS&E and Korean friends who have filled my graduate life with joy.

I was very fortunate to have met such brilliant and fun friends in the course of my

graduate study.

Last but not least, I would like to thank my family for their unconditional love

and support. All my accomplishments to this point are the outcome of my parents’

sacrifice and dedication.

vi

Contents

Abstract iv

Acknowledgements vi

1 Introduction 1

2 Characterizing Optimal Stopping Policy 5

2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5

2.2 Optimal Stopping Problem and the Two-Step Method . . . . . . . . . 10

2.3 Two Example Applications . . . . . . . . . . . . . . . . . . . . . . . . 13

2.3.1 Time-to-Market Model . . . . . . . . . . . . . . . . . . . . . . 13

2.3.2 American-Asian Option . . . . . . . . . . . . . . . . . . . . . 15

2.4 Conditions that Imply the Structure of the Optimal Policy . . . . . . 16

2.4.1 Univariate Benefit Functions . . . . . . . . . . . . . . . . . . . 17

2.4.2 Multivariate Benefit Functions with a Partially Dependent State

Transition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19

2.4.3 Multivariate Benefit Functions with a Dependent State Transition 20

2.4.4 Monotonicity Results and Bounds for Optimal Thresholds . . 21

2.5 Optimal Stopping Problems with Additional Decisions . . . . . . . . 23

2.6 Infinite-Horizon Optimal Stopping Problems . . . . . . . . . . . . . . 26

2.7 Example Applications . . . . . . . . . . . . . . . . . . . . . . . . . . 29

2.7.1 Time-to-Market Model . . . . . . . . . . . . . . . . . . . . . . 29

2.7.2 Option Pricing Problems . . . . . . . . . . . . . . . . . . . . . 30

2.7.3 Dynamic Quality Control Problem . . . . . . . . . . . . . . . 32

vii

2.7.4 Organ Transplantation Problem . . . . . . . . . . . . . . . . . 33

2.7.5 The Secretary Problem . . . . . . . . . . . . . . . . . . . . . . 34

2.8 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

3 Capacity Planning in Two-level Supply Chain 36

3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36

3.2 The Martingale Model of Forecast Evolution for Multiple Decision

Makers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41

3.2.1 The General MMFE . . . . . . . . . . . . . . . . . . . . . . . 42

3.2.2 The Multiplicative MMFE . . . . . . . . . . . . . . . . . . . . 43

3.2.3 Collaborative Forecasting, Delayed Information and Informa-

tion Sharing . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46

3.2.4 Asymmetric Forecast Evolution and Information Sharing . . . 48

3.3 Determining the Optimal Time to Offer an Optimal Mechanism . . . 51

3.4 Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53

3.4.1 The First Stage Problem . . . . . . . . . . . . . . . . . . . . . 54

3.4.2 The Second Stage Problem . . . . . . . . . . . . . . . . . . . . 55

3.5 Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56

3.5.1 Optimal Capacity Reservation Contract . . . . . . . . . . . . 56

3.5.2 Optimal Time to Offer the Contract . . . . . . . . . . . . . . 60

3.6 Centralized Supply Chain . . . . . . . . . . . . . . . . . . . . . . . . 62

3.7 Numerical Study . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64

3.7.1 Optimal Capacity Reservation Contract . . . . . . . . . . . . 65

3.7.2 Optimal Time to Offer the Contract . . . . . . . . . . . . . . 66

3.7.3 Value of Determining the Optimal Time . . . . . . . . . . . . 68

3.8 Extensions and Generalizations . . . . . . . . . . . . . . . . . . . . . 70

3.8.1 Endogenous Wholesale Price . . . . . . . . . . . . . . . . . . . 70

3.8.2 Forecast Update Costs . . . . . . . . . . . . . . . . . . . . . . 71

3.8.3 State Dependent Reservation Profit . . . . . . . . . . . . . . . 72

3.8.4 Dynamic Mechanism Design under Non-Commitment . . . . . 73

3.8.5 MMFE for J > 2 decision makers . . . . . . . . . . . . . . . . 74

viii

3.9 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74

4 Dynamic Market Entry Strategy 76

4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76

4.2 Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80

4.3 Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84

4.3.1 The First-Stage Problem . . . . . . . . . . . . . . . . . . . . . 84

4.3.2 The Second-Stage Problem . . . . . . . . . . . . . . . . . . . . 86

4.4 Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86

4.4.1 Optimal Production and Pricing Decisions . . . . . . . . . . . 86

4.4.2 Optimal Market Entry Policy . . . . . . . . . . . . . . . . . . 90

4.5 Numerical Study . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94

4.5.1 Optimal Market Entry and Process Improvement Decisions . . 95

4.5.2 Measuring the Value of the Dynamic Strategy . . . . . . . . . 99

4.5.3 Effectiveness of the Dynamic Strategy under Various Industrial

Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101

4.6 Conclusion and Discussion . . . . . . . . . . . . . . . . . . . . . . . . 106

A Chapter 2 Appendices 108

A.1 Stochastic Monotonicities of the State Transition . . . . . . . . . . . 108

A.2 Proofs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110

B Chapter 3 Appendices 115

B.1 Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115

B.2 Additive Case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116

B.2.1 The Additive MMFE . . . . . . . . . . . . . . . . . . . . . . . 117

B.2.2 Determining the Optimal Time to Offer an Optimal Mechanism 118

B.3 Proofs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122

C Chapter 4 Appendices 135

C.1 Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135

C.2 Proofs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136

ix

List of Tables

3.1 Expected Profits When the Supplier Offers the Contract at Period n . 66

3.2 Optimal Thresholds of Decentralized and Centralized Supply Chains . 67

3.3 Percentage Improvements in Expected Profits . . . . . . . . . . . . . 69

4.1 Expected Value and Coefficient of Variation of Profits . . . . . . . . 100

4.2 Key Factors that Determine the Value of the Dynamic Strategy . . . 101

x

List of Figures

2.1 Value function and the benefit function of the time-to-market problem 14

3.1 Information Structure of the MMFE . . . . . . . . . . . . . . . . . . 45

3.2 Information Structure of the multiplicative MMFE . . . . . . . . . . 47

3.3 Optimal Capacity Reservation Contract . . . . . . . . . . . . . . . . . 65

4.1 Expected Values of Market Potential . . . . . . . . . . . . . . . . . . 95

4.2 Optimal Investment and Stopping Policy of Base Numerical Setting . 96

4.3 Knowledge-Level-Based Upper Threshold Policy . . . . . . . . . . . . 97

4.4 Market-Potential-Based Lower Threshold Policy . . . . . . . . . . . . 98

4.5 Probability of Stopping at Period 3 . . . . . . . . . . . . . . . . . . . 99

4.6 Impact of Uncertainties in Learning . . . . . . . . . . . . . . . . . . . 102

4.7 Impact of Reducible Unit Production Cost . . . . . . . . . . . . . . . 102

4.8 Impact of the Cost of Expedited Learning . . . . . . . . . . . . . . . 103

4.9 Impact of Uncertainties in Market Potential Change . . . . . . . . . . 104

4.10 Impact of Demand Uncertainty . . . . . . . . . . . . . . . . . . . . . 105

4.11 Impact of the Size of the Salvage Market . . . . . . . . . . . . . . . . 106

B.1 Information Structure of the additive MMFE . . . . . . . . . . . . . . 118

xi

Chapter 1

Introduction

Optimal stopping problems determine the time to stop a process in order to maximize

expected rewards. Such problems appear frequently in the areas of economics, finance,

statistics, marketing and operations management. For example, a stock option holder

faces the problem of determining the time to exercise the option in order to maximize

the expected income. As another example, employers face the problem of determining

the time to stop interviewing job candidates in order to hire the best candidate. This

dissertation studies two important optimal stopping problems arising in Operations

Management. It also provides a method that characterizes the structure of the optimal

stopping policy for a general class of optimal stopping problems.

Optimal stopping problems often have simple threshold or control-band type op-

timal stopping policies. For example, for an American put option, a threshold policy

under which the option holder exercises the stock option if the current stock price

is below a certain threshold is optimal. Such structural properties of the optimal

stopping policy are important for three reasons. First, knowing the structure of the

optimal policy provides managerial insights. They provide actionable policies that a

decision maker can follow to optimize her objective. Second, structural results also

enable one to develop efficient numerical algorithms to solve optimal stopping prob-

lems. Finally, structural results are important when the optimal stopping problem

is part of a higher-level and/or larger scale optimization problem. For these reasons,

1

CHAPTER 1. INTRODUCTION 2

most research papers that study optimal stopping problems provide structural prop-

erties of the optimal stopping policy if such structures exist (see, e.g., Chen 1970, Yao

and Zheng 1999b, Ben-Ameur et al. 2002, Alagoz et al. 2004).

In Chapter 2, we provide a method that characterizes the structure of the optimal

stopping policy for the class of discrete-time optimal stopping problems. Our method

is based on structural properties of the benefit function, which we define as the dif-

ference between the reward of continuing the process and the reward of stopping the

process. To characterize its structure, we establish the benefit function’s recursive

relation with the one-step benefit function, which we define as the difference between

the rewards of stopping at the next period and the current period. Using this recur-

sive relation and the stochastic monotonicity of state-transition, we determine several

sufficient conditions that yield threshold or control-band type optimal stopping poli-

cies. We show that our method can characterize the structure of the optimal policy

of some stopping problems for which conventional methods fail. We also show that

the method simplifies the analysis of some existing results.

Next, in Chapter 3, we study an optimal stopping problem faced by a supplier who

invests in new capacity. For a timely delivery, the supplier has to secure component

capacity prior to receiving a firm order from a product manufacturer. The supplier

relies on the demand forecast for his capacity decision. However, the manufacturer

often has other forward-looking information because of her superior relationship with

or proximity to the market and expert opinion about her own product. To elicit the

manufacturer’s private information, the supplier needs to design and offer a screening

contract. As the sales season approaches, both the supplier and the manufacturer can

update their demand forecasts over time. Hence, by delaying to offer the screening

contract, the supplier can reduce the demand uncertainty that he faces. However,

delaying the capacity decision is not always beneficial for the supplier. For example,

the delay may increase the degree of information asymmetry between the two firms

if the manufacturer obtains more information than the supplier over time. Capacity

costs may also increase as the supplier delays the capacity decision because of a tighter

deadline for building capacity. By considering all such trade-offs, the supplier needs to

determine when to offer a screening contract and how to design the screening contract


to maximize his profit. In Chapter 3, we develop an optimal stopping problem to solve

this problem.

The supplier’s decision problem consists of two stages. The first stage is an optimal

stopping problem that determines the optimal time to offer a screening contract, and

the second stage is a mechanism design problem for forecast information sharing.

Using the method that we develop in Chapter 2, we establish the optimality of a

control band policy that prescribes when to offer an optimal incentive mechanism.

Under this policy, the supplier offers a menu of contracts if the supplier’s demand

forecast falls within the control band. We also provide structural properties of the

optimal screening contract and explicitly show how the optimal contract depends

on the demand forecast and how the timing decision affects the mechanism design

problem. Through numerical studies, we characterize the environment in which the

supplier should offer the contract late or early. By comparing the profits of the

dynamic strategy with those of a static one in which the supplier offers a contract

in a fixed period, we show that the supplier can significantly improve his profit by

optimally determining the time to offer a contract. However, the results also show

that this dynamic strategy can reduce the total supply chain efficiency.

Modeling the aforementioned stopping problem requires a forecast evolution model

that describes forecast sequences made by two decision makers. To develop such a

model, we extend the Martingale Model of Forecast Evolution (MMFE) framework to

the cases with multiple decision makers in Chapter 3. The MMFE is a general model

that describes the evolution of forecasts arising from many statistical and judgment-

based forecasting methods. Due to its descriptive power and generality, researchers

have used the MMFE in many studies that involve dynamic forecast updates such as

inventory control and production planning (e.g., Heath and Jackson 1994, Aviv 2001,

Gallego and Ozer 2001, Toktay and Wein 2001, Altug and Muharremoglu 2009, Iida

and Zipkin 2009, Schoenmeyr and Graves 2009). Our extension enables the MMFE

framework to model several plausible forecast evolution scenarios that involve multiple

decision makers in a consistent way.

Finally, in Chapter 4, we study an optimal stopping problem faced by a manufac-

turer who introduces a new product to the market. When determining the timing for


introducing the new product, the manufacturer takes into consideration the trade-off

between the time-to-market and the completeness of the production processes. On

the one hand, the manufacturer can attain a large market share by entering the mar-

ket early. On the other hand, the manufacturer can improve the production process

for the new product by investing more time in process design, which results in a

reduction of production costs. However, the manufacturer are uncertain about both

the timing of the competitors’ market entry and the outcome of production process

development activities. For this reason, the manufacturer needs to dynamically make

the timing decision depending on the competitors’ movements and the readiness of

his own production process. We formulate the manufacturer’s problem as an optimal

stopping problem.

The manufacturer’s decision process also consists of two stages. The first stage is

an optimal stopping problem that determines the optimal time to introduce a new

product to the market and optimal investment decisions to improve the production

process. The second stage is a production and pricing decision problem that deter-

mines the production quantity and the sales prices for the new product. Using the

method that we develop in Chapter 2, we establish the optimality of threshold-type

market entry policies that prescribe the optimal time to introduce the new prod-

uct. We also characterize structural properties of the optimal production and pricing

decisions. By comparing to a static market entry decision, we show that the dy-

namic market entry decision yields a higher and less variable profit. Our study also

characterizes when the value of the dynamic market entry is the greatest.

Chapter 2

Characterizing the Structure of the

Optimal Stopping Policy

2.1. Introduction

Optimal stopping problems are determining the time to terminate a process to max-

imize expected rewards given the initial state of the process. Such problems appear

frequently in the operations, marketing, finance and economics literature. Some ex-

amples are the problem of determining the time to exercise a stock option, to sell

or purchase an asset, and to introduce a new product. Optimal stopping problems

are rarely solvable in a closed form and they require computational methods. Hence,

researchers often try to characterize the structure of the optimal stopping policy that

determines when to stop the process based on the state of the process at each decision

epoch. When possible, researchers also provide monotonicity results (comparative

statics) regarding the optimal policy parameters. Such structural results enable ac-

tionable policies that a decision maker can follow to maximize rewards. They also

help develop efficient numerical algorithms to solve the problem. This chapter pro-

vides a method to characterize the structure of an optimal stopping policy for the

class of discrete-time optimal stopping problems. This chapter also determines suf-

ficient conditions that yield simple threshold or control-band type stopping policies.

These conditions are presented in eight propositions that can be used to characterize

5

CHAPTER 2. CHARACTERIZING OPTIMAL STOPPING POLICY 6

optimal stopping policy for various application areas.

Structural properties of the optimal stopping policy are helpful for three reasons.

First, knowing the structure of the optimal policy provides managerial insights. They

provide actionable policies that a decision maker can follow to optimize her objective.

For example, exercising an American put option (stopping the process) is optimal

when the current stock price (current state) is below a certain threshold. Another ex-

ample is from Alagoz et al. (2007a,b) who establish an optimal organ-transplantation

policy for a patient with end-stage liver disease. They show that given the current

health condition of the patient, transplanting an organ is optimal if the quality1 of

the offered organ is above a certain level. Second, structural results also enable one

to develop efficient numerical algorithms to solve optimal stopping problems (as in

Yao and Zheng 1999b, Ben-Ameur et al. 2002, Wu and Fu 2003). For example, Wu

and Fu (2003) first establish the optimality of a threshold-type exercise policy for

an American-Asian option. Using this structure, they develop a computationally

efficient simulation-based algorithm. Finally, structural results are important when

the optimal stopping problem is part of a higher-level and/or larger scale optimiza-

tion problem (Terwiesch and Xu 2004, Anily and Grosfeld-Nir 2006). For example,

Terwiesch and Xu (2004) study a manufacturer’s problem of pricing prototypes and

the final product. To do so, they first model a customer’s purchasing decision as an

optimal stopping problem. They show that the customer’s optimal purchase policy

has a threshold structure. Using this result, the authors formulate the manufac-

turer’s optimal pricing problem. For aforementioned reasons, several other papers

also characterize structural properties of optimal stopping policies for various stop-

ping problems (such as Chow et al. 1964, Cox et al. 1979, Chen et al. 1998, and

Boyaci and Ozer 2009).

The above observations motivated us to identify a universal method that can help

determine the structure of optimal stopping policy for problems arising in various

fields. The method also helps us to identify sufficient conditions that yield simple

optimal policies. Before discussing our new method, we describe the two approaches

currently used in the literature. The first approach is to verify whether the problem

1Quality of an organ is measured, for example, by the age of the donor (Alagoz et al. 2007b).


satisfies the monotone-case condition (Chow et al. 1971). This approach first deter-

mines the set of states at which the reward of immediate stopping is greater than the

expected reward of stopping at the next period. The policy that stops the process

if the current state is in this set is known as the one-step look-ahead policy. Chow

et al. (1971) show that if this set is closed almost surely2, then the one-step look-

ahead policy is optimal and call such a case the monotone-case. Since the one-step

look-ahead policy is easy to compute and implement, researchers often try to verify

whether the problem satisfies the monotone-case condition (see, for example, Stadje

1991, Hui et al. 2008). However, most optimal stopping problems do not satisfy the

monotone-case condition because the one-step look-ahead policy, a myopic policy, is

not optimal in general. The second and the most common approach is based on the

dynamic programming formulation of the optimal stopping problem. This approach

determines structural properties of the value function of the dynamic program to

characterize the optimal stopping policy (see, for example, Wu and Fu 2003, Babich

and Sobel 2004, and Alagoz et al. 2007a). As we will illustrate in §2.3, these two ap-

proaches, although helpful, do not always yield the structure of the optimal stopping

policy.

This chapter provides a different approach to characterize the structure of the

optimal stopping policy. Our method is based on structural properties of the bene-

fit function, which we define as the difference between the reward of continuing the

process and the reward of stopping the process. To characterize its structure, we

establish the benefit function’s recursive relation with the one-step benefit function,

which we define as the difference between the rewards of stopping at the next pe-

riod and the current period. Next, we determine sufficient conditions that yield a

threshold or control-band type optimal stopping policy using this recursive relation

and the stochastic monotonicity of state-transition. We show that our method can

characterize the structure of the optimal policy of some optimal stopping problems

for which the two approaches discussed above fail to do so. We also show that the

2That is, when the current state is in this set, the future states will be in this set with probabilityone.


method simplifies the analysis of existing results. One can use the method to char-

acterize the structure of the optimal policy before numerically solving the optimal

stopping problem. The results also make the sufficient conditions that yield simple

optimal stopping policies transparent and easier to determine.

For the analysis of our method, we use stochastic monotonicities of parameter-

ized random variables. Stochastic monotonicities are widely used in the analysis of

stochastic objective functions (Shaked and Shanthikumar 2007). For example, Athey

(2000) uses them to characterize the structure of the objective functions that arise

in economics. Smith and McCardle (2002) use them to characterize the structure of

the value function of a Markov decision process (MDP). Even though optimal stop-

ping problems are a sub-class of general MDPs, our method is not related to those

in Smith and McCardle (2002). Our method for characterizing the structure of the

optimal policy is based on the benefit function. The benefit function is the difference

between the two reward functions: the expected reward of continuing the process and

the reward of stopping the process. In contrast, the value function is the maximum

of the two. Hence, the benefit function is different from the value function of MDPs.

We show that the properties of the benefit function directly imply the structure of

the optimal stopping policy when the structure of the value function and the result

of Smith and McCardle (2002) do not. In addition, the benefit function and the

value function have different recursions as we discuss in §2.2. Our approach is ef-

fective because there are only two actions for optimal stopping problems. Despite

its effectiveness and simplicity, this method has been neglected (see §2.3.2 for some

examples). Our research formalizes this method in a general framework.

There has also been extensive research on the monotonicity of optimal control.

Several researchers have provided sufficient conditions that yield monotone optimal

policies (Altman and Stidham 1995 and Glasserman and Yao 1994). However, the

monotonicity of the optimal policies in this stream of research is based on the theory

of ordered optimal solutions by Topkis (1978) and its generalizations. For example,

Altman and Stidham (1995) show that a threshold policy is optimal for stationary

Markov decision processes with two actions when the reward function is submodular in

state and action and the state transition is stochastically monotone. We remark that


none of the discrete-time optimal stopping problems satisfies Altman and Stidham

(1995)’s assumptions set forth for binary Markov decision problems. Our method

and sufficient conditions are based on the zero-crossing points of the benefit function

and do not require the submodularity of reward functions. Hence, the results of this

chapter can be applied to the problems that do not satisfy the sufficient conditions

provided by the literature on monotone optimal control. We refer to Glasserman and

Yao (1994) for additional references on this literature.

Due to the importance of stopping problems, extensive research has been done

(Chow et al. 1971, Shiryaev 1978, Tsitsiklis and Van Roy 1999, Peskir and Shiryaev

2006). This line of research characterizes optimal stopping times and optimal re-

wards under various general assumptions. However, this literature has not focused on

characterizing the structure of the optimal stopping policy. Our study, in contrast,

focuses on a method to characterize the structure of optimal stopping policies and

determines sufficient conditions to obtain them.

The rest of this chapter is organized as follows. In §2.2, we define the optimal

stopping problem and propose the method to characterize the structure of the op-

timal stopping policy. In §2.3, we provide an example for which only our method

can characterize the structure of the optimal policy and another example for which

our method simplifies the analysis of an existing result. In §2.4, we provide sufficient

conditions that yield a threshold or control-band type optimal stopping policy and

provide monotonicity results and bounds for the optimal policy. In §2.5, we consider

optimal stopping problems with additional decisions other than the stopping deci-

sion. In §2.6, we consider infinite-horizon optimal stopping problems. In §2.7, we

provide example applications to facilitate the use of the proposed method. In §2.8,

we conclude. The objective of this chapter is to introduce an easy and useful method

to researchers in broad application areas. Hence, in Appendix A.1, we provide a

theorem and examples that one can use to verify stochastic monotonicities of state

transition models. Proofs not provided right after the propositions are deferred to

Appendix A.2.


2.2. Optimal Stopping Problem and the Two-Step

Method

Let xt|t = 1, 2, . . . , T be a Markov process that evolves in a state space X ⊂ Rd,

defined on a probability space (Ω,F ,P). We denote the σ-algebra generated by

x1, x2, . . . , xt by Ft ⊂ F . A stopping time τ is a random variable that takes values

in 1, 2, . . . , T and satisfies ω ∈ Ω|τ(ω) ≤ t ∈ Ft for all t ≤ T . We denote the set

of all such stopping times by UT .

At each period t ∈ 1, 2, . . . , T, a decision maker observes the state xt and decides

whether to continue or stop a process. If the decision maker decides to continue, she

attains a reward Ct(xt) and the state evolves. If she stops, then the decision maker

attains a reward St(xt), and the problem is terminated. Without loss of generality, we

assume that stopping is a forced decision at period T .3 In addition, we assume that

reward functions Ct : Rd → R and St : Rd → R are integrable, i.e., E|St(xt)| < ∞and E|Ct(xt)| <∞ for every t <∞. The objective is to determine the optimal time

to stop the process in order to maximize the total discounted rewards with a discount

factor α ∈ (0, 1]. This problem can be formulated as

V ∗(x) ≡ supτ∈UT

E

[τ−1∑t=1

αt−1Ct(xt) + ατ−1Sτ (xτ )∣∣x1 = x

]. (2.1)

The optimal value function V ∗(x) corresponds to the total expected reward when

the optimal stopping time achieves the supremum in (2.1), and the initial state is

x. Note that the optimal stopping time and the optimal value function are well-

defined. However, this formulation does not help characterize an actionable policy

that a decision maker can follow to maximize her expected reward. Hence, we provide

a dynamic programming (DP) recursion that specifies an optimal action for each state

at each period.

3In some problems, not stopping and receiving a reward of 0 at the end of the decision horizon isan option for the decision maker. For such problems, one can introduce a fictitious period t = T + 1with ST+1(xT+1) = 0, and enforce the stopping at period t = T + 1. Therefore, the forced stoppingassumption is not restrictive.


Let UTt be the set of stopping times that satisfy τ ∈ [t, T ]. Then, we define

Vt(x) ≡ supτ∈UTt

E

[τ−1∑u=t

αu−tCu(xu) + ατ−tSτ (xτ )∣∣xt = x

],

which indicates the total expected rewards from period t when the process has

not yet stopped and the current state is x. If the decision maker stops the pro-

cess at period t, the reward is St(x). If she continues, the expected reward is

Ct(x) + αE[Vt+1(xt+1)|xt = x]. Therefore, Vt(x) satisfies the following DP recursion

(Theorem 3.2 of Chow et al. 1971):

Vt(xt) = maxSt(xt), Ct(xt) + αE[Vt+1(xt+1(xt))], t < T, (2.2)

where VT (xT ) = ST (xT ). We denote E[Vt+1(xt+1)|xt] by E[Vt+1(xt+1(xt))] to empha-

size its functional dependence on xt. Note that Vt(x) is the value function of the DP

recursion, and the optimal value function satisfies V ∗(x) = V1(x). An optimal policy

stops the process at period t if St(xt) ≥ Ct(xt) + αE[Vt+1(xt+1(xt))].

To date, the most common approach for characterizing the structure of the optimal

stopping policy is based on characterizing structural properties of the corresponding

value function. However, this approach is not always useful in characterizing the

structure of the optimal stopping policy. Note that the optimal stopping decision is

based on the relative values of St(xt) and Ct(xt) + αE[Vt+1(xt+1(xt))]. For example,

the information that Vt+1(xt+1) is increasing in xt+1 is not useful in characterizing the

structure of the optimal policy, when both St(xt) and Ct(xt) +αE[Vt+1(xt+1(xt))] are

increasing in xt. In contrast, the structural properties of

Bt(xt) ≡ αE[Vt+1(xt+1(xt))] + Ct(xt)− St(xt)

help directly characterize the optimal stopping policy because the optimal policy at

each period t < T is to stop the process if Bt(xt) ≤ 0 and to continue, otherwise. We

refer to this function as the benefit function. It is the expected benefit of delaying the

stopping decision at period t. For optimal stopping problems, determining structural


properties of the benefit function provides an easier way to characterize the optimal

policy than determining structural properties of the value function. This approach is

possible because optimal stopping problems have only two options to choose at each

period: stop or continue.

We also define the one-step look-ahead, or in short, the one-step benefit function

Mt(xt) ≡ αE[St+1(xt+1(xt))] + Ct(xt)− St(xt),

which indicates the expected benefit of delaying the stopping decision at period t

without considering the possible benefit of delaying the decision beyond period t +

1. These two functions are closely related. The following recursion formalizes this

relationship:

Bt(xt) = αE[Vt+1(xt+1(xt))] + Ct(xt)− St(xt)

= αE[maxSt+1(xt+1(xt)), Bt+1(xt+1(xt)) + St+1(xt+1(xt))]

+Ct(xt)− St(xt)

= Mt(xt) + αE[max0, Bt+1(xt+1(xt))], t < T − 1, (2.3)

BT−1(xT−1) = MT−1(xT−1).

This recursive relationship highlights two important observations. First, struc-

tural properties of the benefit function is closely related to that of the one-step ben-

efit function. Second, these properties also depend on the functional form of the

state transition xt+1(xt) and the corresponding transition probabilities. Naturally,

establishing structural properties of Bt(xt) from this recursive relationship involves

an inductive argument. Suppose Mt(xt) has a certain structural property for every

t. Then BT−1(xT−1) inherits its structural property by definition. We will show that

when the state transition xt+1(xt) has an appropriate stochastic monotonicity, Bt(xt)

also has the same property with Mt(xt) for all other periods. The two-step method

is based on this idea. In particular,

First, we determine structural properties of Mt(xt) from Ct(xt), St(xt)


and xt+1(xt). Next, we verify whether xt+1(xt) has the stochastic mono-

tonicity that enables Bt(xt) to inherit the structural property of Mt(xt).

Then, structural properties of Bt(xt) directly imply the structure of the optimal stop-

ping policy. For example, when Mt(xt) is increasing4 in xt, a stochastically increasing

state transition xt+1(xt) carries the increasing property to Bt(xt). Hence, a threshold

policy is optimal. We provide such sufficient conditions in §2.4, which corresponds to

the second step. We provide several examples to illustrate the first step in §2.7.

2.3. Two Example Applications

We introduce two examples that support the importance of our method discussed

in the previous section. The first example illustrates that the conventional value-

function-based approach for characterizing the structure of the optimal stopping pol-

icy does not always work. This example also does not satisfy the monotone-case

condition. Hence, the conventional methods discussed previously fail to characterize

the optimal policy for this first example. However, the proposed method successfully

characterizes the structure of the optimal policy. The second example illustrates how

the two-step method substantially simplifies the analysis of an existing result.

2.3.1 Time-to-Market Model

Consider a firm that decides when to introduce a new product. When the firm

introduces the product earlier than competitors, it captures a larger market share.

However, an early introduction results in high production costs and low profit margins

due to low manufacturing yields. Hence, the firm needs to determine the optimal

time to enter the market. Suppose that the total market demand D is deterministic.

There are T periods in which the firm can introduce the new product. At each period

t ∈ 1, 2, . . . , T, the firm decides whether to delay the market entry depending on

the number of competitors who are already in the market, xt ∈ 0, 1, . . .. Let v(xt)

4We use the terms increasing and decreasing in the weak sense; i.e., increasing means non-decreasing.


be the market share of the firm when she enters the market after the xtth competitor.

It is decreasing concave in xt. That is, as more competitors enter the market the firm

loses more market share. Let p be the sales price of the product and ct be the unit

production cost when the firm enters the market at period t. The discounted profit

margin αt−1(p− ct) increases with t due to higher manufacturing yields5. If the firm

enters the market at period t, she attains a reward of St(xt) = v(xt)(p− ct)D. If she

delays the market entrance at period t, then ξt more competitors enter the market,

and the state evolves as xt+1(xt) = xt + ξt. The random variable ξt is independent of

xt.

This problem can be formulated as an optimal stopping problem. The value

function is derived as Vt(x) = supτ∈UTt E[ατ−1Sτ (xτ )

∣∣xt = x]

and the benefit function

is derived as Bt(xt) = αE[Vt+1(xt+1(xt))] − St(xt). The DP recursion is given by

Vt(xt) = maxSt(xt), αE[Vt+1(xt+1(xt))] with VT (xT ) = ST (xT ). Figure 2.1 shows

an example of Vt(xt), Bt(xt), and Mt(xt) for t = 13. The problem setting for this

example is T = 15, α = 1, v(x) = 0.9− 0.85e0.1(x−15), p = 5, ct = 4− 0.1t− 0.005t2,

D = 10, and ξt is a Bernoulli r.v. with 0.7.

Figure 2.1: Value function and the benefit function of the time-to-market problem

First note from Figure 2.1 that Mt(x) and Bt(x) do not cross the zero line at the

same point. The monotone-case condition of Chow et al. (1971) is also not satisfied.

5This example is a simplified version of a problem faced by Hitachi GST, a global provider ofhard disk drives, as discussed in Ozer and Uncu (2008).


Hence, the one-step look ahead policy is not optimal. Second, note that although

the reward function St(xt) is decreasing concave in xt for every t, the value function

Vt(xt) is not concave in xt. The decreasing property of Vt(xt) in xt does not provide

any information about the optimal stopping policy, either. The value function is

the maximum of the reward of stopping and the reward of continuing, which are

both decreasing in xt. Hence, structural properties of the value function (and hence

the methods of Smith and McCardle 2002) cannot characterize the structure of the

optimal stopping policy in this case. However, by using the two-step method, we can

easily verify the decreasing property of Bt(xt), which establishes the optimality of a

threshold policy; i.e., the firm should enter the market at period t if xt ≥ xt for a

certain threshold xt. We provide the complete analysis in §2.7.

2.3.2 American-Asian Option

Our second example is the problem of pricing an American-Asian option studied

in Ben-Ameur et al. (2002) and Wu and Fu (2003). A stock option is a financial

derivative security that promises the option holder a payoff when the holder exercises

the option. The payoff depends on the future prices of an underlying stock and the

agreed upon strike price K. A holder of the American-Asian option can exercise

the stock option at fixed periods t ∈ 1, 2, . . . , T + 1.6 The payoff depends on the

average price of the underlying stock, where the average is taken for the stock prices at

periods 1, 2, . . . , t. Let xt,1 be the current stock price at period t and xt,2 be the average

stock price at period t. If the option holder exercises the option at period t ≤ T , she

receives a reward St(xt) = xt,2−K. If the option holder does not exercise the option at

period t, the stock price changes as xt+1,1(xt) = ξxt,1, where ξ is a log-normal random

variable. Accordingly, the average stock price is updated as xt+1,2(xt) = txt,2+ξxt,1t+1

. If

the option is not exercised in any periods, then ST+1(·) = 0. By using the risk-neutral

measure (Black and Scholes 1973, Harrison and Kreps 1979) of ξ, the option pricing

problem can be formulated as an optimal stopping problem. The price of the option

is the optimal value function of the stopping problem, and the optimal exercise policy

6Period T + 1 is a fictitious period, which gives the option holder the right not to exercise theoption at period T .


is the optimal stopping policy. The discount factor is α = e−r, where r is the risk-free

interest rate. The value function is derived as Vt(x) = supτ∈UTt E[ατ−tSτ (xτ )

∣∣xt = x]

and the benefit function is derived as Bt(xt) = αE[Vt+1(xt+1(xt))]− St(xt). The DP

recursion is given by Vt(xt) = maxSt(xt), αE[Vt+1(xt+1(xt))] with VT+1(xT+1) = 0.

Ben-Ameur et al. (2002) and Wu and Fu (2003) independently determine the struc-

ture of the optimal policy based on the properties of the value function. For example,

Proposition 1 (in Ben-Ameur et al. 2002) verifies that St(xt) and αE[Vt+1(xt+1(xt))]

are increasing and convex in xt,1 and xt,2. However, this information does not charac-

terize the structure of the optimal policy. Instead, both of these papers establish the

optimality of a state-dependent threshold policy by verifying that the increasing rate

of αE[Vt+1(xt+1(xt))] in xt,2 is less than or equal to 1. Although elegant, establishing

the structure of the optimal policy using the value function requires a lengthy anal-

ysis that involves three propositions and one lemma (§4 in Ben-Ameur et al. 2002).

However, as we will show in §2.7, the two-step method can be used to easily verify

that the benefit function Bt(xt) is decreasing in xt,2. This result directly implies the

optimality of a state-dependent threshold policy. Under this policy, the option holder

optimally exercises the option at period t if xt,2 ≥ xt,2(xt,1) for certain thresholds

xt,2(xt,1). We provide the complete analysis in §2.7.

2.4. Conditions that Imply the Structure of the

Optimal Policy

We provide sufficient conditions on Mt(xt) and xt+1(xt) that together imply the struc-

ture of the benefit function Bt(xt) and the optimal stopping policy. We also establish

some monotonicity results and bounds for the optimal policy parameters. The result

of this section corresponds to the second step of the two-step method.7

7We remark that the results of this section are not related to those in in Chow et al. (1971) andthey do not imply the optimality of the one-step look-ahead policy. This policy calls for stopping theprocess when the reward of instant stopping is greater than the expected reward of stopping at thenext period, i.e., when Mt(xt) ≤ 0. The one-step look-ahead policy is generally sub-optimal. Chowet al. (1971) provide sufficient conditions under which the one-step look-ahead policy is optimal andrefer to it as the monotone-case. In the monotone-case, x : Mt(x) ≤ 0 = x : Bt(x) ≤ 0. The


For the case of a multi-dimensional state space, we denote the ith element of the

vector xt by xt,i and the d− 1 dimensional vector excluding the element xt,i from xt

by xt,−i. Similarly, xt,−(i,j) denotes the d − 2 dimensional vector excluding elements

xt,i and xt,j from xt. We write (x′t,i, xt,−i) for the state (xt,1, xt,2, . . . , x

′t,i, . . . , xt,d) and

write (x′t,i, x

′′t,j, xt,−(i,j)) for the state (xt,1, xt,2, . . . , x

′t,i, . . . , x

′′t,j, . . . , xt,d).

We also define the stopping set of period t as x ∈ X : Bt(x) ≤ 0. It is the

set of states for which the optimal policy stops the process at period t. For a single-

dimensional state space problem, we define xt ≡ supx ∈ X : Bt(x) ≤ 0 and

xt ≡ infx ∈ X : Bt(x) ≤ 0. Similarly, for a multi-dimensional state space problem,

we define xt,i(xt,−i) ≡ supxt,i : Bt(xt,i, xt,−i) ≤ 0, xt ∈ X and xt,i(xt,−i) ≡ infxt,i :

Bt(xt,i, xt,−i) ≤ 0, xt ∈ X.

2.4.1 Univariate Benefit Functions

Consider optimal stopping problems with a single-dimensional state space and an

increasing or decreasing one-step benefit function. Before stating the first proposi-

tion, we define the stochastic monotonicity necessary for the analysis. We follow the

definitions in Shaked and Shanthikumar (2007) for all stochastic monotonicities.

Definition 2.1. A set of random variables x(θ), θ ∈ R is stochastically increasing

in θ if E[u(x(θ))] is increasing in θ for all increasing functions u.

We note that many common Markov process models have this property. In Ap-

pendix A.1, we provide a theorem and examples that one can use to verify stochastic

monotonicities of state transitions. Using this definition, we provide the first sufficient

condition.

Proposition 2.1. When Mt(xt) is increasing [resp., decreasing] in xt, and xt+1(xt)

is stochastically increasing in xt for every t, then the following statements are true

for every t:

1. Bt(xt) is increasing [resp., decreasing] in xt.

conditions we provide in this section do not imply that these two sets are identical.


2. A threshold policy that stops the process if xt ≤ xt [resp., xt ≥ xt] is optimal.

Proof. The proof is based on an induction argument. Consider the increasing one-

step benefit function case. At period t = T − 1, we have BT−1(x) = MT−1(x).

Hence, Bt(x) is increasing in x for t = T − 1. Next assume for the induction

argument that Bt+1(xt+1) is increasing in xt+1. The composition of an increasing

function and max0, x is also increasing, hence, max0, Bt+1(x) is an increasing

function. Because the state transition xt+1(xt) is stochastically increasing in xt,

E[max0, Bt+1(xt+1(xt))] is increasing in xt. Because the increasing property is

closed under summation, Bt(xt) = Mt(xt) + αE[max0, Bt+1(xt+1(xt))] is increas-

ing in xt, which concludes the induction hypothesis and the proof of Part 1 for the

increasing Mt(x) case.

To prove Part 2, we define two sets Q = x ∈ X : x ≤ xt and Q∗ = x ∈X : Bt(x) ≤ 0. We prove that Q = Q∗. When the state space is discrete or

Bt(xt) is a continuous function, xt satisfies Bt(xt) = 0. Then, every x ∈ Q satisfies

Bt(x) ≤ Bt(xt) = 0 because Bt(xt) is increasing. Hence, Q ⊂ Q∗. Conversely, for

every x ∈ Q∗, we have x ≤ xt by the definition of xt. Hence, Q∗ ⊂ Q, which implies

that Q = Q∗. Therefore, the optimal stopping policy stops the process at period t

if xt ∈ Q, i.e., if xt ≤ xt. Note that when the state space is continuous and Bt(xt)

has a discontinuous point, it is possible that Bt(xt) > 0. In this case, the optimal

policy stops the process if xt < xt instead of xt ≤ xt. However, the structural result

does not change. Hence, we assume throughout this chapter that Bt(xt) is continuous

when the state space is continuous. The decreasing case can be proved in a similar

way.

Recall that the recursive relationship in Equation (2.3) has max0, ·. In general,

max0, f(x) does not preserve structural properties of f(x). Increasing, decreasing

and convex properties are among the few properties that are preserved under this

operation. Even concavity is not preserved. Other properties, such as subadditivity

in xt are preserved, but those properties are not useful in characterizing the structure

of the optimal stopping policy. Note that the structure of the optimal policy does

not depend on the maximum point of Bt(xt), but depends on the pattern of Bt(xt)


crossing 0. Hence, we focus on increasing, decreasing, and convex properties.

Next, consider optimal stopping problems with a single-dimensional state space

and a convex one-step benefit function. For the next result, we need a stochastically

convex state transition.

Definition 2.2. A set of random variables x(θ), θ ∈ R is stochastically convex in

θ if E[u(x(θ))] is convex in θ for all convex functions u.

Using this definition, we provide the second sufficient condition.

Proposition 2.2. When Mt(xt) is convex in xt, and xt+1(xt) is stochastically convex

in xt for every t, then the following statements are true for every t:

1. Bt(xt) is convex in xt.

2. A control-band policy that stops the process if xt ∈ [xt, xt] is optimal.

2.4.2 Multivariate Benefit Functions with a Partially Depen-

dent State Transition

Consider optimal stopping problems with a d-dimensional state space. We consider a

partially dependent state transition in which there exists an element i such that the

state transition xt+1,−i(xt) is independent of xt,i. In other words, state xt,i affects only

the ith element of the next period’s state. In this case, the state transition can be ex-

pressed as xt+1(xt) = (xt+1,i(xt), xt+1,−i(xt,−i)). Note that xt+1,i(xt) can still depend

on xt,−i in addition to depending xt,i. Note also that a special case of the partially

dependent state transition is the fully independent state transition case, in which the

state transition can be expressed as xt+1(xt) = (xt+1,1(xt,1), xt+1,2(xt,2), . . . , xt+1,d(xt,d)).

To better understand this case, recall the American-Asian option pricing problem

discussed in §2.3.2. The state transition of the current stock price is xt+1,1(xt) = ξxt,1.

This update is stochastically increasing in xt,1 but is independent of xt,2. However, the

state transition of the average stock price xt+1,2(xt) = txt,2+ξxt,1t+1

is stochastically in-

creasing in both xt,1 and xt,2. Next, we provide sufficient conditions for the optimality

of a state-dependent threshold or control-band policy.


Proposition 2.3. When Mt(xt) is increasing [resp., decreasing] in xt,i, xt+1,i(xt) is

stochastically increasing in xt,i and xt+1,−i(xt) is independent of xt,i for every t, then

the following statements are true for every t:

1. Bt(xt) is increasing [resp., decreasing] in xt,i.

2. A state-dependent threshold policy that stops the process when xt,i ≤ xt,i(xt,−i)

[resp., xt,i ≥ xt,i(xt,−i)] is optimal for each xt,−i.

Proposition 2.4. When Mt(xt) is convex in xt,i, xt+1,i(xt) is stochastically convex

in xt,i and xt+1,−i(xt) is independent of xt,i for every t, then the following statements

are true for every t:

1. Bt(xt) is convex in xt,i.

2. A state-dependent control-band policy that stops the process when

xt,i ∈ [xt,i(xt,−i), xt,i(xt,−i)] is optimal for each xt,−i.

2.4.3 Multivariate Benefit Functions with a Dependent State

Transition

Elements of the state transition xt+1(xt) are dependent on each other when the ith

element of the next period state xt+1,i(xt) depends on the ith element of the current

state and also on the other elements of the current state. An interesting analysis can

be applied to the case where the state transition xt+1(xt) is stochastically increasing

in xt. Note that the state xt has multiple dimensions in this case. Hence, we need a

more general definition of the stochastically increasing property.

Definition 2.3. A set of random vectors x(θ), θ ∈ Rd of dimension m is stochas-

tically increasing in θ ∈ Rd if E[u(x(θ))] is increasing for all increasing functions

u : Rm → R.

Given this definition, we have the following result.

Proposition 2.5. When Mt(xt) is increasing[resp., decreasing] in each xt,i and xt+1(xt)

is stochastically increasing in xt ∈ X for every t, then the following statements are

true for every t.


1. Bt(xt) is increasing [resp., decreasing] in each xt,i.

2. A state-dependent threshold policy that stops the process when xt,i ≤ xt,i(xt,−i)

[resp., xt,i ≥ xt,i(xt,−i)] is optimal for each i.

At first sight, the conditions that the one-step benefit function is increasing in xt,i

for all elements i and that the state transition is stochastic increasing appear to be

somewhat restrictive. Consider, for example, a two-dimensional state space in which

a higher value of xt,2 leads to a lower value of xt,1 in the next period. In this case,

one can redefine the state variable as yt,2 = −xt,1 and yt,1 = xt,1, which makes the

state transition yt,1(yt) stochastically increasing in both yt,1 and yt,2. Similarly, if

an initial formulation of Mt(xt) is increasing in xt,1 and decreasing in xt,2, a similar

transformation makes the one-step benefit function Mt(yt) an increasing function.

Therefore, the increasing benefit-function condition is not too restrictive. In general,

one element of the current state may impact some elements of the next period state

but not all of them. Suppose xt+1,i(xt) is independent of xt,j for some j 6= i. Then, by

definition xt+1,i(xt) is stochastically increasing in xt,j. Therefore, the stochastically

increasing property of multi-dimensional state transition can also be satisfied easily.

Cases in which the state transition depends only on parts of the state space can

be analyzed following the analyses given in this and the previous subsections. For

example, consider the case in which the state transition of a three-dimensional state

space xt+1(xt) can be separated into xt,(1,2)(xt,(1,2)) and xt,3(xt,3), where the random

transition xt,(1,2)(xt,(1,2)) is stochastically increasing in xt,(1,2). If the one-step benefit

function Mt(xt) is increasing in both xt,1 and xt,2, we can apply a slight modification

of Proposition 2.5 to xt,(1,2) for each fixed xt,3. We omit the proposition and the

analysis to avoid repetition.

2.4.4 Monotonicity Results and Bounds for Optimal Thresh-

olds

We discuss two types of monotonicity results for the optimal thresholds. The first

one is the parametric monotonicity of the state-dependent optimal thresholds. The


second one is the time-monotonicity of optimal thresholds. We also provide bounds

for the optimal thresholds. The result of this subsection is useful for developing

efficient numerical algorithms. They also help characterize how policy parameters

respond to the changes in the environment.

Proposition 2.6. The following statements are true for every t:

1. If Bt(xt) is increasing [resp., decreasing] in both xt,i and xt,j for i 6= j, then

xt,i(xt,−i) [resp., xt,i(xt,−i)] is decreasing in xt,j and xt,j(xt,−j) [resp., xt,j(xt,−j)]

is also decreasing in xt,i.

2. If Bt(xt) is increasing in xt,i and decreasing in xt,j for i 6= j, then xt,i(xt,−i) is

increasing in xt,j and xt,j(xt,−j) is also increasing in xt,i.

3. If Bt(xt) is increasing [resp., decreasing] in xt,i and convex in xt,j for i 6= j,

then xt,j(xt,−j) is increasing [resp., decreasing] in xt,i and xt,j(xt,−j) is decreasing

[resp., increasing] in xt,i.

Next we consider time-monotonicity of optimal thresholds in stationary optimal

stopping problems. An optimal stopping problem is stationary if the Markov process

xt is time-homogeneous and the reward functions C(xt) and S(xt) are time-invariant.

It is a well-known result that the value function Vt(x) is decreasing in t for every

x in such problems. See, for example, §4.4 of Bertsekas (2005). As t increases, the

decision maker has less opportunity to delay the stopping decision, hence the value

function Vt(x) decreases in t. This property directly implies that Bt(x) is decreasing

in t, which in turn implies the following proposition.

Proposition 2.7. For stationary optimal stopping problems, the following statements

are true:

1. xt is increasing in t and xt is decreasing in t.

2. xt,i(xt,−i) is increasing in t and xt,i(xt,−i) is decreasing in t for every xt,−i.

Finally, we provide bounds for the optimal thresholds. By definition, Bt(xt) ≥Mt(xt) for every xt. Hence, if the one-step look-ahead policy continues the process at


period t with xt, i.e., Mt(xt) > 0, then the optimal policy also continues the process

at period t, i.e, Bt(xt) > 0. This property implies the following proposition.

Proposition 2.8. The optimal thresholds have the following bounds:

1. xt ≤ supx ∈ X : Mt(x) ≤ 0 and xt ≥ infx ∈ X : Mt(x) ≤ 0.

2. xt,i(xt,−i) ≤ supxt,i : Mt(xt,i, xt,−i) ≤ 0, xt ∈ X and xt,i(xt,−i) ≥ infxt,i :

Mt(xt,i, xt,−i) ≤ 0, xt ∈ X.

Note that determining the x that satisfies supx ∈ X : Mt(x) ≤ 0 is simple

and does not involve recursive computation. Often it can be derived in a closed

form. Hence, these bounds together with the monotonicity results considerably help

reduce the computational time required to determine the optimal thresholds and

resulting expected profit by reducing the search region. They also provide qualitative

understanding of a decision process modeled as an optimal stopping problem.

2.5. Optimal Stopping Problems with Additional

Decisions

The general theory of optimal stopping has focused primarily on problems in which

stopping time is the only decision to make. Yet, it is possible to have optimal

stopping problems with additional decisions. In particular, at each decision period

t ∈ 1, 2, . . . , T, a decision maker observes the state xt ∈ X of a process and decides

whether to stop or continue the process. When stopping the process, the decision

maker attains a reward of St(xt) and the process is terminated. When continuing the

process, the decision maker takes an action at ∈ At and attains a reward of Ct(at, xt).

Then, the state evolves. The state transition depends on both xt and at, and we de-

note it by xt+1(at, xt). The action set At is independent of the state. Let π be a policy

that specifies both the action to take for every t and every xt and the stopping time

τ that satisfy τ ≤ T . Let ΠT be the set of all admissible policies. Then, the decision

maker’s optimal stopping problem is to determine the optimal action to take and the


optimal time stop the process in order to maximize the total discounted rewards. We

can formulate this problem as

V ∗(x) ≡ supπ∈ΠT

E

[τ−1∑t=1

αt−1Ct(at, xt) + ατ−1Sτ (xτ )∣∣x1 = x

].

The following DP specifies the optimal action for each state at each period.

Vt(xt) = maxSt(xt), supat∈At

[Ct(at, xt) + αE[Vt+1(xt+1(at, xt))]], t < T, (2.4)

where VT (xT ) = ST (xT ). The optimal value function satisfies V ∗(x) = V1(x). An op-

timal policy stops at period t if St(xt) ≥ supat∈At [Ct(at, xt) + αE[Vt+1(xt+1(at, xt))]].

The optimal action when continuing the process at period t is the maximizer of the

function inside sup[·]. Note that the optimal action depends on the state. Then, we

define the one-step benefit function and the benefit function as

Mt(at, xt) ≡ αE[St+1(xt+1(at, xt))] + Ct(at, xt)− St(xt),

Bt(at, xt) ≡ αE[Vt+1(xt+1(at, xt))] + Ct(at, xt)− St(xt).

Additionally, we define the maximal benefit function as Bt(xt) ≡ supat∈At Bt(at, xt).

We have:

Bt(at, xt) = αE[Vt+1(xt+1(at, xt))] + Ct(at, xt)− St(xt)

= αE[max

0, Bt+1(xt+1(at, xt))

+ St+1(xt+1(at, xt))

]+ Ct(at, xt)− St(xt)

= Mt(at, xt) + αE[max

0, Bt+1(xt+1(at, xt))

], t < T − 1, (2.5)

BT−1(aT−1, xT−1) = MT−1(aT−1, xT−1).

Note that the optimal policy stops the process at period t when Bt(xt) ≤ 0. Hence,

we need to determine structural properties of Bt(xt) to characterize the structure of

the optimal stopping policy. However, even when Bt(at, xt) has a certain structural

property in xt for each fixed at, Bt(xt) is not guaranteed to have the same property,

because the optimal action depends on xt. Fortunately, increasing, decreasing and


convex properties are preserved under maximization. Following the two-step method,

we provide the following sufficient conditions for the case of a single-dimensional state

space. We define xt ≡ supx ∈ X : Bt(x) ≤ 0 and xt ≡ infx ∈ X : Bt(x) ≤ 0.

Proposition 2.9. When Mt(at, xt) is increasing [resp., decreasing] in xt, and xt+1(at, xt)

is stochastically increasing in xt ∈ X for every at and every t, then the following state-

ments are true for every t:

1. Bt(xt) is increasing [resp., decreasing] in xt.

2. A threshold policy that stops the process if xt ≤ xt [resp., xt ≥ xt] is optimal.

Proof. We prove the first part; then the second part follows Proposition 2.1 Part 2.

The proof is based on an induction argument. Consider the increasing one-step bene-

fit function case. At period t = T−1, we have BT−1(at, x) = MT−1(at, x). Let a∗t (x) =

arg maxat∈At Bt(at, x). For any x1 ≤ x2, we have BT−1(x1) = BT−1(a∗T−1(x1), x1) ≤BT−1(a∗T−1(x1), x2) ≤ BT−1(a∗T−1(x2), x2), where the first inequality is from the fact

that BT−1(a, x) is increasing in x, and the second inequality is by the definition of

a∗T−1(x). Next assume for the induction argument that Bt+1(xt+1) is increasing in

xt+1. The composition of an increasing function and max0, x is also increasing,

hence, max0, Bt+1(x) is an increasing function of x. Because the state transition

xt+1(at, xt) is stochastically increasing in xt, E[max0, Bt+1(xt+1(at, xt))] is increas-

ing in xt for each at. Because the increasing property is preserved under summation,

the benefit function Bt(at, xt) = Mt(at, xt)+αE[max0, Bt+1(at, xt+1(xt))] is increas-

ing in xt for each at. By applying the same argument that we applied on BT−1(x),

we can verify that Bt(xt) = supat∈At Bt(at, xt) is increasing in xt, which concludes the

induction hypothesis and the proof of the proposition.

Proposition 2.10. When Mt(at, xt) is convex in xt, and xt+1(at, xt) is stochastically

convex in xt ∈ X for every at and every t, then the following statements are true for

every t:

1. Bt(xt) is convex in xt.

2. A control-band policy that stops the process if xt ∈ [xt, xt] is optimal.


We can derive a similar result for the case of the multi-dimensional state space as

in §2.4. We omit the discussion here to avoid repetition.

2.6. Infinite-Horizon Optimal Stopping Problems

Here, we show that the results of this chapter can be applied to infinite-horizon

optimal stopping problems. First, we define the infinite-horizon optimal stopping

problem. As before, xt|t = 1, 2, . . . is a Markov process that evolves in a state space

X ⊂ Rd. A stopping time τ is a random variable that takes values in 1, 2, . . . ,∞and satisfies ω ∈ Ω|τ(ω) ≤ t ∈ Ft for all finite t. We denote the set of all such

stopping times by U . In the infinite-horizon case, we limit our interest to problems

with a time-homogeneous Markov process and reward functions that are integrable.

Unlike in the finite-horizon case, the stopping time can have an infinite value; hence,

we need to agree on S(xτ ) for τ =∞. Clearly, if limt→∞ S(xt) exists, then it is natural

to set S(xτ ) to this value. Otherwise, we can set S(xτ ) to be a fixed value, such as

0, or set S(xτ ) = lim supt→∞ S(xt), but this choice depends on the specific problem

setup. Our results are valid regardless of this choice. As before, the decision maker’s

optimal stopping problem is the problem of determining the optimal time to stop the

process in order to maximize the total discounted rewards. We can formulate this

problem as

V ∗(x) ≡ supτ∈U

E

[τ−1∑t=1

αt−1C(xt) + ατ−1S(xτ )∣∣x1 = x

]. (2.6)

Unlike in the finite-horizon case, a DP recursion is not possible because the backward

recursion requires a well-defined final period. To apply the two-step method, we

consider a T -period optimal stopping problem that has the same Markov process and

the reward functions as the infinite-horizon problem. We use the notation (·|T ) to

emphasize that the functions we consider are the finite horizon counter parts of the

infinite horizon problem. For example, we denote the optimal value function of the

T -period problem by V ∗(x|T ). Because the infinite-horizon problem can be seen as

a finite-horizon problem with a long decision horizon, one can conjecture that the


optimal value function of the infinite horizon problem is the limit of the sequence of

optimal value functions of finite-horizon problems, i.e., V ∗(x) = limT→∞ V∗(x|T ). In

many cases, the optimal value function also satisfies the Bellman equation V ∗(x) =

maxS(x), C(x) + αE[V ∗(x(x))], where x(x) denotes the one-step state transition,

and a stationary policy that stops the process if S(x) ≥ C(x) + αE[V ∗(x(x))] is

optimal. Although these properties do not always hold, researchers have found several

sufficient conditions that guarantee these properties. For example, optimal stopping

problems with negative reward functions satisfy these properties as noted in §3.1 of

Bertsekas (2007). Shiryaev (1978) also provides several such conditions. The two-

step method can be applied to all infinite-horizon problems with these properties.

Therefore, instead of providing a sufficient condition, we directly assume the following

properties for all infinite-horizon problems we consider:

Assumption 2.1. 1. V ∗(x) = maxS(x), C(x)+αE[V ∗(x(x))], and a stationary

policy that stops the process if S(x) ≥ C(x) + αE[V ∗(x(x))] is optimal.

2. limT→∞ V∗(x|T ) = V ∗(x).

As before, we define the one-step benefit function and the benefit function as

M(x) ≡ αE[S(x(x))] +C(x)−S(x) and B∗(x) ≡ αE[V ∗(x(x))] +C(x)−S(x). From

Part 1 of Assumption 2.1, a stationary policy that stops the process if B∗(x) ≤ 0 is

optimal. Then, the following proposition constructs the relationship between B∗(x)

and B1(x|T ) and the optimal thresholds. Before stating the proposition, we define

x ≡ supx ∈ X : B∗(x) ≤ 0 and x ≡ infx ∈ X : B∗(x) ≤ 0 for the case of a

single-dimensional state space and define xi(x−i) ≡ supxi : B∗(xi, x−i) ≤ 0, x ∈ Xand xi(x−i) ≡ infxi : B∗(xi, x−i) ≤ 0, x ∈ X for the case of a multi-dimensional

state space. Similarly, we denote the thresholds of the T-period problem by xt|T , xt|T ,

xt,i|T (x−i) and xt,i|T (x−i).

Proposition 2.11. For infinite-horizon problems that satisfy Assumption 2.1, the

following statements are true:

1. B1(x|T ) ↑ B∗(x) as T →∞.

2. x1|T ↓ x, x1|T ↑ x, x1,i|T (x−i) ↓ xi(x−i) and x1,i|T (x−i) ↑ xi(x−i) as T →∞.


Increasing, decreasing and convex properties are preserved under limit. Hence,

when B1(x|T ) has any of these structural properties for every T , B∗(x) also has the

same structural property. Therefore, we can apply the results of previous sections to

infinite-horizon problems as follows: First, fix a finite time horizon T . Next, determine

a structural property of B1(x|T ) using the results of previous sections. Verify these

properties are preserved under limit. If so, the structure of the optimal policy follows

from the finite horizon counter part. Consider, for example, the case in which the

state space is single-dimensional, M(x) is increasing in x and x(x) is stochastically

increasing in x. Proposition 2.1 implies that B1(x|T ) is increasing in x for every finite

T . Then, from Proposition 2.11, B∗(x) = limT→∞B1(x|T ), which implies that B∗(x)

is also increasing in x. When B∗(x) is increasing in x, it is optimal to stop the process

if x ≤ x. Hence, a threshold policy that stops the process if x ≤ x is optimal. We

can apply similar arguments to other cases.

Part 2 of Proposition 2.11 is useful when running the value iteration algorithm

(Bertsekas 2005 Section 7) to solve the infinite horizon problem. The value iteration

algorithm is based on Part 2 of Assumption 2.1. It recursively computes V ∗(x|T ) =

V1(x|T ), which converges to V ∗(x) as T → ∞.8 From the time-homogeneity of the

state transition and the reward functions, we have Vt(x|T ) = Vt+1(x|T + 1), which

implies V1(x|T + 1) = maxS(x), C(x) + αE[V2(x(x)|T + 1)] = maxS(x), C(x) +

αE[V1(x(x)|T )]. Hence, the value iteration recursion is as follows:

V1(x|T + 1) = maxS(x), C(x) + αE[V1(x(x)|T )], (2.7)

where V1(x|1) = S(x). Then, consider the case in which B1(x|T ) is increasing in x.

From the proposition, we have x1|T ≥ x1|T+1. By the definition of x1|T+1, we have

S(x) < C(x) + αE[V2(x(x)|T + 1)] for every x > x1|T+1. Because x1|T ≥ x1|T+1 and

V2(x|T + 1) = V1(x|T ), we have S(x) < C(x) + αE[V1(x(x)|T )] for every x > x1|T .

Hence, we need not compute the value of S(x) in (2.7) for x > x1|T . One can apply

similar arguments to the other cases.

8We do not need to compute B1(x|T ) to determine the optimal policy. The benefit function isan auxiliary function to determine the structure of the optimal policy.


2.7. Example Applications

We apply the two-step method to some optimal stopping problems from the literature

including those discussed in §2.3. The objective of this section is three-fold: (1) We

illustrate how to use the two-step method to obtain structural results. (2) We show

that this method can be used for a variety of applications that arise in fields such

as Finance, Marketing and Operations. Hence, the approach can be used for broad

application areas. (3) We illustrate that the method makes it easy and transparent

to characterize structural results. This transparency also allows one to obtain some

results that were not reported in the original papers.

In what follows, we do not state all the assumptions, model elements and results

from each paper but instead focus on the basic model and the optimal policy. For

each example, we first show how to obtain structural properties of the one-step benefit

function. Next, we characterize the structure of the optimal policy using the results

of §2.4-2.6. We refer the reader to Appendix A.1 for the stochastic monotonicities of

state transitions.

2.7.1 Time-to-Market Model

Recall the time-to-market model from §2.3.1. The reward function and the state

transition of this problem have the following properties:

1. St(xt) = v(xt)(p− ct)D.

2. Ct(xt) = 0.

3. xt+1(xt) = xt + ξt, where ξt ≥ 0 is independent of xt.

The one-step benefit function can be expressed as

Mt(xt) = αE[St+1(xt+1(xt))]− St(xt)

= αE[v(xt + ξt)− v(xt)](p− ct+1)D + v(xt)(α(p− ct+1)− p+ ct)D.

The first term is decreasing in xt because E[v(xt + ξt) − v(xt)] is decreasing in xt

due to concavity of v(x). The second term is also decreasing in xt because v(xt) is


decreasing in xt and p − ct ≤ α(p − ct+1). Therefore, Mt(xt) is decreasing in xt.

Because the state transition is stochastically increasing, a threshold policy is optimal

from Proposition 2.1. Under this policy, the firm should enter the market at period

t if xt ≥ xt.

2.7.2 Option Pricing Problems

We discuss two option pricing problems. First recall the American-Asian option

pricing problem from §2.3.2. The reward function and the state transition are as

follows:

1. St(xt) = xt,2 −K for t ≤ T and ST+1(x) = 0.

2. Ct(xt) = 0.

3. xt+1,1(xt) = ξxt,1 and xt+1,2(xt) = txt,2+ξxt,1t+1

, where ξ is a log-normal random

variable and is independent of xt.

Discounting factor is α = e−r, where r is the risk-free rate. Note that period T + 1

is a fictitious period to apply the forced stopping restriction at the last period. The

one-step benefit function is

Mt(xt) = αE[txt,2 + ξxt,1

t+ 1−K]− (xt,2 −K)

=

(αt− (t+ 1)

t+ 1xt,2 +

αE[ξ]

t+ 1xt,1 + (1− α)K

), t < T,

MT (xT ) = −(xT,2 −K).

For every t, Mt(xt) is decreasing in xt,2. The state transition xt+1,2(xt) is stochastically

increasing in xt,2, and xt+1,1(xt) is independent of xt,2. Therefore, Proposition 2.3

establishes the optimality of the state-dependent threshold policy that exercises the

option if xt,2 ≥ xt,2(xt,1).

Next we consider an American call option pricing problem. An option holder can

exercise the stock option at periods t = 1, 2, . . . , T at a fixed strike price K. Let xt

be the price of the underlying stock at period t. If the option holder exercises the


option at period t, she receives a reward of xt − K. It is a well-known result that

early exercise is never optimal for this option. We can prove this fact by using the

two-step method. To do so, we use the binomial valuation approach from Cox et al.

(1979). The reward function and the state transition are as follows:

1. St(x) = x−K for t ≤ T and ST+1(x) = 0.

2. Ct(x) = 0.

3. xt+1(xt) = ξxt, where ξ = u with probability p = r−du−d and ξ = d with 1− p.

Note that the probability measure of ξ is the risk-neutral measure of Harrison and

Kreps (1979), where r is the risk-free rate. The discount factor is α = 1r. For t < T−1,

the one-step benefit function is

Mt(xt) = α [(uxt −K)p+ (dxt −K)(1− p)]− (xt −K) = (1− α)K,

which is always strictly positive. Because Bt(xt) ≥Mt(xt) > 0 for every xt, stopping

is never optimal at period t < T − 1.

Early exercising can be optimal if the underlying stock pays a dividend during the

life of the call option. Suppose that the stock pays a dividend of δxm at period m < T

with certain values of δ and m. the reward functions are identical to the no-dividend

case, but the state transition is xt+1(xt) = ξxt for t 6= m−1 and xt+1(xt) = ξ(1−δ)xtfor t = m − 1. Hence, the one-step benefit function is identical to the no dividend

case when t 6= m− 1. When t = m− 1,

Mt(xt) = α [(u(1− δ)xt −K)p+ (d(1− δ)xt −K)(1− p)]−(xt−K) = (1−α)K−δxt,

which is a strictly decreasing function of xt. From Proposition 2.1, exercising the

option is optimal if xt ≥ xt for a threshold xt. The structure of the optimal exercise

policy of the American option is also studied in Chen (1970) and Chapter 1 of Ross

(1983) under different modeling assumptions.


2.7.3 Dynamic Quality Control Problem

Consider the dynamic quality control problem studied in Chen et al. (1998) and

Yao and Zheng (1999b). The objective is to determine the optimal time to stop an

inspection process to minimize the total inspection and warranty costs. For a batch

of T units, period t indicates that exactly t units have been inspected and repaired

if defective. The state variable xt denotes the total number of defective units among

t units inspected. Each unit in a batch is defective with probability Θ, which is an

unknown random variable, and can be estimated with the observation xt and a prior

distribution fΘ. Let Θt(xt) be the best estimate of Θ at period t, with the observation

xt. The decision maker incurs an inspection cost ci per each unit inspected and a

repair cost cr per each unit repaired. When the decision is to stop the inspection at

period t and the defective rate is θ, the expected warranty cost is φ(t, θ). We note that

the expected cost depends on the state xt through an intermediate random variable

Θt(xt). The properties of the profit functions and state transition are as follows:

1. St(xt) = E[φ(t,Θt(xt))], where φ(t, θ) is K-submodular9 in (t, θ) with K = cr.

2. Ct(xt) = ci + crE[Θt(xt)].

3. xt+1(xt) = xt +D(xt), where D(xt) is a Bernoulli random variable with param-

eter E[Θt(xt)].

The objective is to minimize the total cost instead of maximizing the total profit,

hence St(xt) and Ct(xt) are defined as cost functions. The random variable Θt(xt) is

stochastically increasing in xt, so is xt+1(xt). In addition Θt+1(xt+1(xt)) has the same

distribution with Θt(xt). Then, the one-step penalty of delaying can be derived as

Mt(xt) = E[St+1(xt+1(xt))] + Ct(xt)− St(xt)

= E[φ(t+ 1,Θt+1(xt+1(xt)))] + E[ci + crΘt(xt)]− E[φ(t,Θt(xt))]

= E[φ(t+ 1,Θt(xt))− φ(t,Θt(xt)) + ci + crΘt(xt)].

9Chen et al. (1998) define a function φ(t, θ) as K-submodular in (t, θ) if [φ(t+ 1, θ) + φ(t, θ′)]−[φ(t+ 1, θ′) + φ(t, θ)] ≥ K(θ′ − θ) for all t and θ′ ≥ θ.


The one-step penalty function Mt(xt) is decreasing in xt because φ(t+1, θ)−φ(t, θ)+

crθ is decreasing in θ from K-submodularity, and Θt(xt) is stochastically increasing

in xt. Hence, from Proposition 2.1, a threshold policy is optimal.

Yao and Zheng (1999a) also study a two-stage dynamic quality control problem.

In this problem, each unit in a batch can possibly have two types of defects. The batch

is processed in two inspection stages. Type 1 defects can be inspected only in the

first-stage, and type 2 defects can be inspected only in the second-stage. The second

stage problem is a good example of a two-dimensional optimal stopping problem. In

their second stage problem, the number of units with type 1 defects, xt,1, and the

number of units of type 2 defects, xt,2, are the state variables. The one-step benefit

function Mt(xt,1, xt,2) is increasing in both xt,1 and xt,2, and the state transition is

stochastically increasing in each variable. From Proposition 2.5, a state-dependent

threshold policy that stops the process if xt,1 ≤ xt,1(xt,2) or xt,2 ≤ xt,2(xt,1), is optimal.

The two-step method enables us to determine a new result that is not reported in

Yao and Zheng (1999a). Because Bt(xt) is increasing in both xt,1 and xt,2, from

Proposition 2.6, we have:

Proposition 2.12. The threshold xt,1(xt,2) is decreasing in xt,2 and xt,2(xt,1) is de-

creasing in xt,1.

2.7.4 Organ Transplantation Problem

Alagoz et al. (2007b) study the organ transplantation decision problem faced by

patients with end-stage liver disease. In this problem, a patient has to decide whether

to accept an allocated organ at each period t or to wait for another organ. The

state xt,1 ∈ 1, . . . , H + 1 indicates the condition of the patient, and the state

xt,2 ∈ 1, . . . , L + 1 indicates the quality of the organ allocated at period t. Higher

values of xt,1 and xt,2 indicate worse conditions. The reward S(xt) that the patient

receives when he accepts the organ at period t is decreasing in both xt,1 and xt,2.

When the patients waits for another organ, there is a continuing reward C(xt), which

is decreasing in xt,1 and independent of xt,2. The state of the organ that will be

allocated in the next period xt+1,2(xt) is independent of the state of current allocated


organ, xt,2, but depends on the current health condition of the patient, xt,1. The

authors consider an infinite-horizon problem with α < 1. Alagoz et al. (2007b) provide

qualitative properties of the reward function and the state transition as follows:

1. S(xt) is decreasing in both xt,1 and xt,2

2. C(xt) is decreasing in xt,1 and independent of xt,2

3. xt+1(xt) is independent of xt,2.

To apply the two-step method, we first consider the T -period problem that has

the same Markov process and the reward functions as the infinite-horizon problem.

Because xt+1(xt) is independent of xt,2, the one-step benefit function Mt(xt|T ) =

αE[S(xt+1(xt))] + C(xt) − S(xt) is increasing in xt,2 for each fixed xt,1. From the

same reason, αE[max0, Bt+1(xt+1(xt)|T )] is independent of xt,2. Hence, Bt(xt|T ) =

Mt(xt|T ) + αE[max0, Bt+1(xt+1(xt)|T )] is increasing in xt,2. Finally, from Propo-

sition 2.11, B∗(x) is increasing in xt,2, which establishes the optimality of the state-

dependent threshold policy that stops the process at period t if xt,2 ≤ x2(xt,1).

2.7.5 The Secretary Problem

Consider an extension of the classical secretary problem studied in Chow et al. (1964)

and Freeman (1983). A decision maker interviews T candidates for a single position

in a random order. The decision maker can accept only the current candidate he is

interviewing. The utility of accepting ith best candidate among T candidates is T − i.At period t, let state xt denote the relative rank of the current candidate among all

t candidates the decision maker has interviewed. When the decision maker accepts

the xtth best candidate among t, the expected utility is T − T+1t+1

xt. The properties

of the profit functions and state transition for this problem are as follows:

1. St(xt) = T − T+1t+1

xt.

2. Ct(xt) = 0.

3. xt+1(xt) is uniformly distributed from 1 to t+ 1 and independent of xt.


Because xt+1 is independent of xt, the one-step benefit functionM(xt) = αE[S(xt+1(xt))]−S(xt) is increasing in xt. From the same reason, αE[max0, Bt+1(xt+1(xt))] is inde-

pendent of xt. Therefore, Bt(xt) = Mt(xt)+αE[max0, Bt+1(xt+1(xt))] is increasing

in xt, which establishes the optimality of the state-dependent threshold policy that

accepts a candidate at period t if xt ≤ xt.

2.8. Conclusion

This section has proposed a two-step method to characterize the structure of the op-

timal stopping policy for a general class of optimal stopping problems. The method

first characterizes structural properties of the one-step benefit function, which de-

pends on the reward functions and the state transition over two periods. The method

then verifies the stochastic monotonicity of the state transition that enables the ben-

efit function to inherit the structural property of the one-step benefit function, which

characterizes the structure of the optimal stopping policy. We have also considered

optimal stopping problems with additional decisions other than the stopping deci-

sion. By applying the proposed method to examples from several fields, we have

illustrated how to use the method; why it is needed to obtain structural results where

other conventional methods fail; and how it simplifies the analysis. Our hope is that

the method and the propositions will also help researchers to easily determine spe-

cific conditions needed and the resulting optimal policy structure for their optimal

stopping problems appearing in broad application areas.

Chapter 3

Mechanism Design for Capacity

Planning under Dynamic Evolution

of Asymmetric Demand Forecasts

3.1. Introduction

This chapter studies a supplier’s problem of eliciting credible forecast information

from a manufacturer when both parties obtain forecast information over time. The

supplier relies on the demand forecast for his capacity decision. Both parties obtain

information and update their forecasts over time. However, the manufacturer often

has other forward-looking information because of her superior relationship with or

proximity to the market and expert opinion about her own product. Hence, firms have

asymmetric information, which changes over time. In such a dynamic environment,

what is the optimal mechanism/contract that maximizes supplier’s profit by enabling

credible forecast information sharing? What is the right time for the supplier to

offer this mechanism? Does time play an important role? If so, how? This chapter

addresses these questions. In doing so, this chapter also characterizes the supplier’s

optimal mechanism/contract and shows how this mechanism changes over time as

a function of forecast updates. This chapter also rigorously models the information

evolution process for multiple decision makers who forecast the same object. To do

36

CHAPTER 3. CAPACITY PLANNING IN TWO-LEVEL SUPPLY CHAIN 37

so, this chapter generalizes the Martingale Model of Forecast Evolution (MMFE) to

account for multiple decision makers who forecast demand for the product.

The MMFE successfully describes the evolution of forecasts arising from many

statistical and judgment-based forecasting methods. There are two variants of the

MMFE: the multiplicative and the additive models. Both variants frequently appear

in the literature. Hausman (1969) first develops the multiplicative MMFE and veri-

fies the model using actual data from several independent forecast-revision processes

(agricultural supply forecasts, financial forecasts, and forecasts of seasonal sales of

apparel). Hausman and Peterson (1972) extend the model to a multiple products

system. Graves et al. (1986) develop the additive MMFE for a single-product system.

Heath and Jackson (1994) generalize both variants of MMFE by allowing the corre-

lation of demands of different time periods. Sapra and Jackson (2009) develop the

continuous-time analog of the MMFE. Due to its descriptive power and generality,

researchers have used the MMFE in several studies that involve dynamic forecast up-

dates, for example, to develop effective production, inventory, capacity management

methods that are responsive to forecast updates (Heath and Jackson 1994, Graves

et al. 1998, Gallego and Ozer 2001, Toktay and Wein 2001, Iida and Zipkin 2006,

Altug and Muharremoglu 2009, Schoenmeyr and Graves 2009). This model is also

used to understand and quantify the value of information sharing (Chen and Lee 2009

and Iida and Zipkin 2009) and collaborative forecasting (Aviv 2001, 2002, 2007).

This chapter provides a framework to model evolutions of forecasts for multiple

decision makers in a consistent and rigorous manner. With the exception of Aviv

(2001) and Iida and Zipkin (2009), the literature focuses on forecast models from a

single forecaster’s perspective. As pointed out by Aviv, the consideration of multiple

forecasters and ensuring consistency among forecast evolution across multiple fore-

casters is an important and nontrivial process. We provide a framework that ensures

consistency, and that can be used to model several plausible forecast evolution sce-

narios, such as collaborative forecasting and delayed information among forecasters.

Consider, for example, multiple forecasters who have different capability and speed in

learning information about the demand for a product. These forecasters would have

asymmetric demand information that would change over time. Depending on how


much information each party obtains, information asymmetry among forecasters may

get larger or smaller. We model this scenario using our framework and refer to it as

the Martingale Model of Asymmetric Forecast Evolution (MMAFE). During the last

decade, Operations Management research has been focusing increasingly on problems

with multiple decision makers who may employ different forecasters to obtain infor-

mation about the same object (such as demand). Our hope is that the framework

provided in this chapter will also enable researchers to consider and revisit, for ex-

ample, the performance of supply chain contracts in dynamic environments. In this

chapter, we provide one such study.

Using the MMAFE for multiple decision makers, we revisit the incentive problem

observed in forecast information sharing when firms need to share this information for

better planning. In particular, we study a supplier’s capacity planning problem. For a

timely delivery, the supplier has to secure component capacity prior to receiving a firm

order from a product manufacturer. The supplier makes the capacity decision based

on demand forecasts. The manufacturer possesses more demand information than

the supplier due to, for example, her proximity to the market and her expert opinion

about the final product. Hence, the supplier can better plan for capacity by using the

manufacturer’s forecast information. However, without a proper incentive mechanism,

when asked for this information, the manufacturer may inflate her forecast so that the

supplier secures more capacity. Being aware of this situation, the supplier may not

find the information credible. This interaction may result in forecast manipulation1

that reduces both parties’ expected profits.

Forecast manipulation has been widely observed in industry and its adverse effects

are also well-documented (see, for example, Files 2001 and Clark 2007). To address

this problem, researchers have recently started to provide some analytical remedies.

For example, Cachon and Lariviere (2001) provide some properties of incentive mech-

anisms, and Ozer and Wei (2006) design explicit contracts to ensure credible forecast

information sharing. The literature that provides analytical remedies for credible

1This manipulation could be either due to the manufacturer who inflates her forecast report orthe supplier who “corrects” the forecast information thinking that the manufacturer inflated herforecast.


forecast or cost information sharing in a strategic setting is relatively recent (e.g, Cor-

bett and de Groote 2000, Ha 2001, Lovejoy 2006). Chen (2003) provides a review

of this literature in strategic supply chain settings. To date, the mechanism design

literature and the supply chain contracting literature have primarily focused on static

environments and designed incentive mechanisms for them. For example, the sup-

plier (principal) can design and offer a menu of contracts to the manufacturer (agent)

and screen her private forecast information while maximizing his objective function.

Ozer and Wei (2006) show that such a mechanism results in a capacity reservation

contract where the manufacturer pays a fee to reserve capacity (instead of directly

sharing her private forecast). This literature assumes that the information is static

and not updated over time. It also assumes that information asymmetry between the

principal and the agent is static. The MMAFE framework help us to address this

forecast sharing problem when both parties update their forecasts over time.

The supplier2 and the manufacturer can obtain demand information and update

their forecasts as the sales season approaches. Hence, by delaying to offer an incentive

mechanism, the supplier (and the manufacturer) can obtain more information, which

reduces demand uncertainty and increases expected profits. This delay, however, may

increase (resp., or decrease) the degree of information asymmetry between the two

decision makers, resulting in higher (resp., or lower) cost of screening. The capacity

cost may also increase as the supplier delays the capacity decision because of tighter

deadline for building capacity. Considering all such trade-offs, the supplier has to

determine (i) when to offer an incentive mechanism/contract to elicit credible forecast

information and (ii) how to design the mechanism so as to maximize his profit while

ensuring that the manufacturer participates and credibly reveals her forecast.

The aforementioned timing, contract and capacity decisions are closely linked be-

cause the optimal contract and the capacity decision depend on the supplier’s forecast

information and the information asymmetry, both of which change over time. Hence,

we formulate this problem as a two-stage closely-embedded stochastic decision pro-

cess. The first-stage determines the optimal time to offer a capacity reservation

2To be consistent with the literature, we use the supplier-manufacturer narrative, which couldalso be described as a supplier-retailer interaction.


contract. The solution of this problem depends on the solution of the second-stage

problem, which is about designing an optimal mechanism for capacity planning. Sim-

ilarly, the solution of the second stage depends on the timing decision obtained from

the first-stage. We establish the optimality of a control band policy that prescribes

when to offer an optimal incentive mechanism. Under this policy, the supplier offers a

menu of contracts if the supplier’s demand forecast falls within the control band. We

also establish properties of this optimal stopping policy. Next we provide structural

properties of the optimal incentive mechanism (which is interpreted as a capacity

reservation contract) and explicitly show how the optimal mechanism depends on the

demand forecast and how the timing decision affects the mechanism design problem.

Using this framework, we also solve the problem from a centralized decision maker’s

perspective. In this case, the decision maker has access to all relevant information

and forecast updates and determines the optimal time to decide and build capacity.

Through numerical studies, we characterize the environment in which the supplier

should offer the contract late or early. By comparing the profits of the dynamic

strategy with those of a static one in which the supplier offers a contract in a fixed

period, we show that the supplier can significantly improve his profit by optimally

determining the time to offer a contract. However, the results also show that this

dynamic strategy can reduce the total supply chain efficiency.

The literature on mechanism design for a dynamic framework is sparse. Plambeck

and Zenios (2000), Zhang and Zenios (2008), Lutze and Ozer (2008) and Akan et al.

(2009) are among the few exceptions. These authors study a principal’s problem

of designing a long-term contract when the agent takes actions over multiple peri-

ods. The principal in their model offers a contract at the beginning of the planning

horizon. In contrast, the principal in our model dynamically determines the time to

design and offer a contract and the agent takes a single action. Note that when such a

dynamic system is managed by a centralized decision maker (i.e., the integrated firm

solution), the problem reduces to determining when the firm should make a decision

given information updates. The centralized decision maker does not face the problem

of information asymmetry. For example, Boyaci and Ozer (2009) study a supplier’s


problem of determining when to stop advance selling under demand information up-

dates. Ulu and Smith (2009), and Wang and Tomlin (2009) also consider the optimal

timing decisions under multiple information updates. Although it is not a central

part of our study, the present chapter also determines the integrated firm’s optimal

time to determine capacity when the firm obtains multiple forecast updates over a

capacity planning horizon.

The rest of the chapter is organized as follows. In §3.2, we develop the MMFE for

multiple decision makers and introduce the MMAFE. In §3.3 and §3.4, we describe the

basic elements of our model and provide a formulation to solve the problem. In §3.5,

we provide structural properties of the optimal stopping policy. We also characterize

an optimal contract that enables credible forecast information sharing. In §3.6, we

provide the integrated firm solution. In §3.7, we present numerical studies. In §3.8,

we discuss extensions. In §3.9, we conclude. We defer all proofs to the Appendix.

3.2. The Martingale Model of Forecast Evolution

for Multiple Decision Makers

This section develops the MMFE for multiple decision makers who forecast demand

for the same product. When several decision makers forecast demand for the same

product, their forecast revisions are likely to be positively correlated. The correlation

may occur inter-temporally, because the decision makers may obtain demand infor-

mation with a time delay. For example, the supplier and the retailer of a product

can use past sales data to update demand forecasts, but the supplier may obtain

this information later than the retailer (Lee et al. 2000). The forecasting model for

multiple decision makers should successfully describe all possible interactions between

decision makers’ forecast sequences. The forecast sequences should also be consistent

in such a way that they converge to the same value, i.e., the actual demand. Hence,

we aim to develop a descriptive framework that characterizes the dynamics of general

forecasting processes across multiple decision makers while being consistent.

In §3.2.1, we discuss the general MMFE for multiple decision makers and provide


its properties. These properties help us to better understand the two important

variants of the model: multiplicative and additive forecast revisions. When the size of

forecast revisions are small compared to the size of the forecast, the difference between

the two models is negligible. However, the forecasts are often made long before the

beginning of the sales season; hence, forecast revisions are likely to be large. Similarly,

several researchers point out that the multiplicative model fit empirical data better

than the additive model does (see, for example, Hausman 1969, Heath and Jackson

1994, Chod and Rudi 2006 and Wang and Tomlin 2009). Therefore, in §3.2.2, we

focus on the multiplicative case, and defer the development of the additive MMFE

for multiple decision makers to the Appendix. In §§3.2.3, 3.2.4, we apply the model

to some forecasting scenarios discussed in the literature that involve multiple decision

makers.

3.2.1 The General MMFE

ConsiderN periods during which each decision maker independently forecasts demand

for a single product. We denote the sales season by period N + 1. Demand for

the product is XN+1, which is a random variable prior to the sales season. At the

beginning of each period n ∈ 1, . . . N, demand information available to decision

maker i is given by the set F in, which is a σ-field. The demand forecast of decision

maker i at the beginning of period n is X in ≡ E[XN+1|F in]; i.e., the expected demand

given information F in. We denote the differences between subsequent forecasts by

∆in ≡ X i

n+1 − X in. In Appendix A, we provide a glossary of notation for an easy

reference.

Definition 3.1. The forecast evolution X in constructed by (XN+1,F in) is an MMFE

if it satisfies the properties (a) XN+1 is square-integrable, (b) F in ⊆ F in+1 for every n,

and (c) σ(XN+1) ⊆ F iN+1.

Condition (a) is required to define the Martingale differences and is identical to the

condition that XN+1 has a finite variance. Condition (b) implies that the decision

makers do not lose information over time. Condition (c) implies that demand is


revealed to decision maker i during the sales period. From this definition, we have

the following theorem.

Theorem 3.1. If X in constructed by (XN+1,F in) is an MMFE, then we have the

following properties for every n:

(a) X in is a Martingale adapted to F in.

(b) E[XN+1|F in] = E[XN+1|X in] = X i

n.

(c) E[X in+l|F in] = E[X i

n+l|X in] = X i

n for every l ≥ 0.

(d) E[∆in] = 0 and ∆i

l is uncorrelated with F in for every l ≥ n.

Theorem 3.1 is first discussed in Heath and Jackson (1994) without a proof. We

provide a formal proof in the Appendix. Part (a) verifies that the MMFE is indeed

a Martingale. Part (b) implies that in forecasting XN+1, the value of X in is sufficient

information for decision maker i. Part (c) implies that the forecast is unbiased.

Finally, Part (d) implies that X in is indeed the best forecast for decision maker i.

Note that if ∆in is correlated with any past information, the decision maker can use

the correlation to improve the forecast.3

3.2.2 The Multiplicative MMFE

We denote the ratio of successive forecasts by δin ≡ X in+1/X

in for n < N and δiN ≡

XN+1/XiN for each i ∈ s,m. We consider only two decision makers for the sake

of brevity and without loss of generality. We refer to these decision makers as the

supplier (s) and the manufacturer (m) because we apply the model to a forecast

sharing problem between these two decision makers in §3.3.

The classical MMFE assumes that the multiplicative forecast update for each

decision maker, i.e., δin, is log-normally distributed for every n. Part (c) and (d) of

3From the tower property of conditional expectation, E[XN+1|F in] = E[XN+1|Xin] =

E[E[XN+1|F in+1]|Xin] = E[Xi

n+1|Xin] = Xi

n + E[∆in|Xi

n]. If ∆in is correlated to Xi

n, the value ofE[∆i

n|Xin] may not be 0. In this case, decision maker i’s best demand forecast at period t would be

Xin + E[∆i

n|Xin] rather than Xi

n.


Theorem 3.1 imply that δin is independent of X in and has a mean value of 1. Hence,

the initial forecast X i1 and the variances of log(δin) fully characterize the evolution of

X in for each decision maker i4. However, the variance of log(δin) is not sufficient to

characterize the interaction between the two forecast processes Xsn and Xm

n .

One may determine the correlation coefficient between log(δsns) and log(δmnm) for

every ns and nm to characterize the interaction between the two forecast sequences.

However, this approach can lead to inconsistency. We provide one such example.

Suppose that the correlation coefficient between log(δs1) and log(δm1 ) is 1 and the

correlation coefficient between log(δs2) and log(δm1 ) is also 1. Then, by obtaining the

value of δs1, the supplier obtains the full information of δm1 , which also contains the

full information of δs2. Hence, the property that δs1 and δs2 are independent does not

hold.

We propose a different approach to model the interaction between Xsn and Xm

n ,

which does not suffer from any inconsistency such as the one discussed above. Deci-

sion makers update their forecasts by obtaining information about events that affect

demand. Following Hausman (1969), suppose there are in total K such events and

let ej be the random variable that models the impact of event j. According to the

theory of proportional effect (Aitchison and Brown 1957), the change in the forecast

by each event is proportional to the size of the current forecast. In other words, after

obtaining the information of event j, decision maker i updates the forecast from X in to

X inej. Following this explanation, we first express demand by XN+1 =

∏Kj=1 ej. Next,

we divide the set of all events into (N + 1) × (N + 1) sets by the time at which the

information is obtained by each decision maker. More specifically, we define Ens,nm

as the set of events whose information is obtained by the supplier during period ns

and by the manufacturer during period nm.

We define δns,nm ≡∏

j∈Ens,nmej, which indicates the total information obtained

by the supplier at period ns and by the manufacturer at period nm. We assume that

each δns,nm is log-normally distributed and has a mean value of 1 except δ0,05. When

4The assumption that E[δin] = 1 for a log-normal random variable δin implies E[log(δin)] =−V ar(log(δin))/2. Therefore, the variance of log(δin) is sufficient to characterize δin.

5By taking the logarithm of δns,nm , we get log(δns,nm) =∑j∈Ens,nm

log(ej). When the number ofevents in Ens,nm

becomes large,∑j∈Ens,nm

log(ej) will be asymptotically normal from the central


Ens,nm is an empty set, i.e., when no information is obtained by the supplier at period

ns and by the manufacturer at period nm, then δns,nm = 1. Note that by construction

a distinct piece of information is contained in one event set, hence δns,nm forms an

independent set of random variables.

Given this construction, we can express demand as XN+1 =∏N

ns=0

∏Nnm=0 δns,nm .

The supplier’s information set at the beginning of period n is

F sn ≡ σ([δ0,0, . . . , δ0,N ], . . . , [δn−1,0, . . . , δn−1,N ]).

Then, the supplier’s demand forecast is Xsn = E[XN+1|F sn] =

∏n−1ns=0

∏Nnm=0 δns,nm ,

and the ratio of successive forecasts is δsn =∏N

nm=0 δn,nm . Because the multipli-

cation of log-normal random variables is also a log-normal random variable, δsn is

also log-normally distributed. Therefore, from the supplier’s perspective, the fore-

cast evolution is consistent with the classical MMFE. The manufacturer’s forecast

can be expressed in a similar way. Figure 3.1 illustrates the information structure

of the MMFE for two decision makers. During each period, the supplier obtains all

information given in the row corresponding to that period, whereas the manufacturer

obtains all information given in the corresponding column.

Figure 3.1: Information Structure of the MMFE

n 0 1 · · · N (s)0 δ0,0 × δ0,1 × · · · × δ0,N Xs

1

× × × × ×1 δ1,0 × δ1,1 × · · · × δ1,N δs1

× × × × ×...

... × ... × . . . × ......

× × × × ×N δN,0 × δN,1 × · · · × δN,N δsN

(m) Xm1 × δm1 × . . . × δmN XN+1

From this construction, we can fully characterize the evolution of Xsn and Xm

n by

limit theorem. Because both decision makers have the information δ0,0 before the beginning ofthe forecast horizon, we assume that δ0,0 is a deterministic value. When E[δns,nm

] 6= 1 for some(ns, nm), we can push this information to δ0,0 and normalize δns,nm

by δns,nm/E[δns,nm

], hence theassumption E[δns,nm ] = 1 is without loss of generality.


determining the value of δ0,0 and the variances of log(δns,nm).

Note that the demand and the forecast revisions have the following relationship;

XN+1 = X i1

∏Nk=1 δ

ik for each decision maker i. At the beginning of period n, deci-

sion maker i has the information X i1, δ

i1, . . . , δ

in−1, but does not have the information

δin, δin+1, . . . , δ

iN . Hence, the multiplication

∏Nk=n δ

ik represents the demand uncertainty

faced by decision maker i at the beginning of period n.

3.2.3 Collaborative Forecasting, Delayed Information and In-

formation Sharing

Consider the collaborative forecasting process discussed in Aviv (2001, 2002, and

2007). When the two decision makers collaborate to forecast demand, they share all

available information. Based on Definition 3.1, we define the collaborative information

set as F cfn ≡ F sn ∪ Fmn . Because the union of two σ-fields is also a σ-field, F cfn is a

well-defined information set. Then, the collaborative forecast (CF) of the two decision

makers is Xcfn ≡ E[XN+1|F cfn ]. Next we derive the most important property of the

CF.

Theorem 3.2. The CF has a smaller mean-squared-error than the forecast of a single

decision maker, i.e., E[(XN+1 −Xcfn )2] ≤ E[(XN+1 −X i

n)2] for every i ∈ s,m.

This result states that two decision makers who collaborate can predict demand more

accurately.

For the case of multiplicative MMFE, the collaborative information set F cfn in-

cludes all δns,nm such that ns ≤ n or nm ≤ n. Hence, the initial forecast is Xcf0 =

δ0,0

∏Nns=1 δns,0

∏Nnm=1 δ0,nm , and the ratio of successive forecasts are

δcfn = δn,n

N∏ns=n+1

δns,n

N∏nm=n+1

δn,nm .

Figure 3.2(a) illustrates the information structure available to decision makers under

a collaborative forecasting scheme. Because each δcfn is a log-normal random variable

with the mean value of 1, the collaborative forecast is also a multiplicative MMFE.


Figure 3.2: Information Structure of the multiplicative MMFE

(a) Collaborative Forecast (b) Asymmetric Forecast Evolution

Aviv (2001) also uses the MMFE to describe the forecast sequences of the two

decision makers. He first models the forecast sequence of each decision maker as an

MMFE with the initial forecast of X i1 = µδi0 and the multiplicative forecast revisions

of δin. Then, he assumes that V ar(log(δin)) = (ηiσn)2 for every n = 0, . . . , N − 1,

and V ar(log(δiN)) = σ2−∑N−1

n=0 (ηiσn)2. The value of σ represents the degree of total

demand uncertainty, and the value of ηi represents the forecasting power of decision

maker i. He models the interaction between the two forecast sequences by assuming

that the correlation coefficient between log(δsns) and log(δmnm) is ρ for ns = nm, and

0 for ns 6= nm. In other words, the forecast revisions of two decision makers are

correlated, but not inter-temporally6.

In the MMFE for multiple decision makers, we construct a single demand model,

which automatically constructs the forecast sequence of each decision maker. By

construction, our model does not suffer from inconsistency. In contrast, Aviv (2001)

constructs the forecast sequence of each decision maker separately and then models

the interaction between them. This approach may lead to inconsistency. For example,

when ηs = ηm = 1 and σN−1 = σ, both decision makers obtain all demand information

during period N − 1, hence the correlation coefficient ρ should be exactly 1 for the

demand models to be consistent. For this reason, Aviv (2001) provides a sufficient

6Iida and Zipkin (2009) also use the MMFE to describe the forecast sequences of two decisionmakers in a similar way to Aviv (2001).


condition on (ρ, ηs, ηm) that guarantees consistency.

As mentioned above, Aviv (2001) assumes the inter-temporal independence be-

tween the two decision makers’ forecast revisions. However, some important forecast-

ing scenarios do not follow this assumption. We discuss two such examples (delayed

information and asymmetric forecast evolution) shortly. The MMFE for multiple

decision makers covers the general case including that of Aviv (2001). Here, we de-

scribe his model with our framework. The inter-temporal independence means that

δns,nm = 1 unless ns = nm or ni = N for i ∈ s,m. Hence, the information obtained

by the two decision makers at period n consists of three parts, δn,n, δn,N and δN,n.

They correspond to the three corners of a single block in Figure 3.2(a). The correlated

part of δsn and δmn corresponds to δn,n, hence we can set V ar(log(δn,n)) = ρηsηmσ2n.

Similarly, the uncorrelated parts of δsn and δmn correspond to δn,N and δN,n, hence we

can set V ar(log(δn,N)) = (ηsσn)2−ρηsηmσ2n and V ar(log(δN,n)) = (ηmσn)2−ρηsηmσ2

n.

Next consider the delayed information scenario discussed by Chen (1999). In this

case, the manufacturer observes demand of a product at each period and makes a

replenishment order to the supplier. The supplier of the product receives the man-

ufacturer’s order with the delay of l periods. The decision makers use the demand

history to update forecasts. Therefore, the information sets of the two decision mak-

ers are identical with l periods of delay. In other words, we have F sn+l = Fmn for every

n. Then, the supplier and the manufacturer have the same sequence of forecasts with

a delay of l periods. In the multiplicative MMFE, the delayed information can be

represented by δn,k = 1 for every k + l 6= n.

3.2.4 Asymmetric Forecast Evolution and Information Shar-

ing

Consider the scenario in which the manufacturer has all information that the sup-

plier has at each period; i.e., F sn ⊆ Fmn for every n. Both decision makers obtain

new demand information from a third-party research firm (such as Gartner; see

http://www.gartner.com) over time, but the manufacturer has additional informa-

tion because of her proximity to the market and the insider information about her


own product. In other words, the manufacturer has private demand information that

is not available to the other decision maker. We refer to this forecast evolution model

as the Martingale Model of Asymmetric Forecast Evolution (MMAFE). We denote

the difference between the two decision makers’ forecasts by An ≡ Xmn −Xs

n. Then,

we have the following properties of the MMAFE:

Theorem 3.3. If (Xsn, X

mn ) constructed by (XN+1,F sn,Fmn ) is an MMAFE, then we

have the following properties for every n:

(a) E[Xmn+l|F sn] = E[Xm

n+l|Xsn] = Xs

n for every l ≥ 0.

(b) E[XN+1|Xsn, An] = E[XN+1|Fm

n ] = Xmn .

(c) E[An] = 0 and An is uncorrelated with F sn.

Part (a) implies that the supplier’s estimate of the manufacturer’s forecast is the same

as his own forecast. Part (b) implies that by knowing the value of An, the supplier can

make the best forecast. Part (c) implies that An is uncorrelated with the supplier’s

information set, F sn.

For the case of multiplicative MMFE, we can model the asymmetric information

scenario by setting δns,nm = 1 for every nm > ns. In other words, the supplier obtains

no information earlier than the manufacturer. Hence, the information obtained by

the supplier at period n consists of δn,0, δn,1, . . . , δn,n, where each δn,nm has already

been obtained or is being obtained at the same time by the manufacturer. We refer

this case by the multiplicative Martingale Model of Asymmetric Forecast Evolution

(m-MMAFE) and we will use this model in the second part of the chapter. The

information structure of the m-MMAFE is provided in Figure 3.2(b). We refer the

reader to this figure for better understanding of the following discussion.

The manufacturer’s private demand information represents the information asym-

metry between the two decision makers. The manufacturer’s demand uncertainty is

also the demand uncertainty faced by the system. Recall that the multiplication of

δmn , δmn+1, . . . , δ

mN represents the demand uncertainty faced by the manufacturer at the

beginning of period n, and we denote it by εn ≡∏N

k=n δmk . From the manufacturer’s

perspective, demand is XN+1 = Xmn εn, where Xm

n is her current forecast, which is


deterministically known to her. The remaining market uncertainty εn is resolved over

periods n to N as the manufacturer obtains information, i.e. the forecast updates.

In contrast, the demand uncertainty faced by the supplier at the beginning of

period n is∏N

k=n δsk. The manufacturer has already obtained part of this information.

To distinguish the known part, we rewrite

N∏k=n

δsk =N∏k=n

(n−1∏nm=0

δk,nm

N∏nm=n

δk,nm

)

=

(N∏k=n

n−1∏nm=0

δk,nm

)︸︷︷︸

ξn

(N∏k=n

N∏nm=n

δk,nm

)︸︷︷︸

εn

. (3.1)

The first part of Equation (3.1) represents the demand information that is already

obtained by the manufacturer. The second part represents the demand information

that is not yet obtained by the manufacturer. Because δns,nm = 1 for nm > ns, the

second part of (3.1) is

N∏k=n

N∏nm=n

δk,nm =N∏

nm=n

N∏k=n

δk,nm =N∏

nm=n

N∏k=0

δk,nm =N∏

nm=n

δmnm ,

which is equal to εn. The first part of (3.1) is the manufacturer’s private information,

and we denote it by ξn. Then, demand can be represented as XN+1 = Xsnξnεn.

From the supplier’s perspective, Xsn is deterministic, ξn and εn are uncertain. By

construction, Xsn, ξn and εn are independent. Note also that Xm

n = Xsnξn. The

supplier obtains only part of the information of εn and ξn during period n and he

obtains the full information of εn and ξn over periods n to N .

For notational simplicity, we denote the standard deviation of log(Z) of a log-

normal random variable Z as σZ throughout this chapter. The value of σεn represents

the degree of demand uncertainty of the system at period n, and the value of σξn

represents the degree of information asymmetry between the supplier and the man-

ufacturer at period n. By construction, σεn always decreases in n. In contrast, σξn

can either increase or decrease in n depending on the values of σδsn and σδmn . When


the supplier obtains more information than the manufacturer during period n, i.e.,

σδsn > σδmn , we have σξn+1 < σξn , and vice versa.

3.3. Determining the Optimal Time to Offer an

Optimal Mechanism

Consider a supplier who sells a key component to a manufacturer at a wholesale price.

Because of the long leadtime required in building capacity, the supplier determines

the production capacity before receiving a firm order from the manufacturer. The

supplier has some flexibility on when to decide and build capacity but delaying the

production capacity decision may require the supplier to incur expediting costs or

overtime labor. We refer to the time window during which the supplier sets capac-

ity as the capacity planning horizon. Both decision makers are uncertain about the

product demand during the capacity planning horizon. Hence, the supplier relies on

the demand forecast for his capacity decision. During the capacity planning horizon,

the supplier and the manufacturer may obtain new demand information and update

their forecasts. The new information includes changes in the general economic con-

ditions, competitor’s sales data, and market research done by a third-party research

firm, but the two decision makers may also obtain different information. The man-

ufacturer, for example, often has other forward-looking information because of her

superior relationship with or proximity to the market and expert opinion about her

own product. Such a forecast process for two decision makers; i.e, (Xsn, X

mn ) can be

cast as an m-MMAFE as defined in §3.2.4.

The supplier can use the manufacturer’s private information for better capacity

planning. However, when asked for this information, the manufacturer has an in-

centive to inflate her forecast so that the supplier secures more capacity. Credible

information sharing requires an appropriate mechanism. This incentive problem has

recently been studied by various researchers. (see, for example, Cachon and Lariviere

2001, Ozer and Wei 2006). Being aware of this incentive, the supplier can design and

offer a mechanism to screen the manufacturer’s forecast information truthfully. The


screening contract in this case can be interpreted as a capacity reservation contract.

This interpretation will become clearer once we solve for the supplier’s optimal screen-

ing contract. In contrast to the previous literature on credible forecast information

sharing, here we study the timing of when the supplier should offer the contract and

its connection with the optimal screening contract. By delaying the capacity decision,

the supplier can reduce the demand uncertainty that he faces. At the same time, the

manufacturer also obtain more demand information. Depending on the information

that the two decision makers obtain, the degree of information asymmetry can either

increase or decrease over time. Even when it increases, the supplier might be better off

by delaying because he can use the manufacturer’s more-accurate demand forecast for

better capacity planning. This delay may change the capacity cost due to the afore-

mentioned reasons. Considering all these trade-offs, the supplier needs to address the

two questions: (1) When is the optimal time to offer a capacity reservation contract

to the manufacturer? and (2) What is the optimal capacity reservation contract that

maximizes the profit? Because the optimal capacity reservation contract depends on

the time when the supplier offers the contract and the demand forecasts at that time,

these two questions are closely coupled.

The sequence of events is as follows. At the beginning of period n ∈ 1, 2, . . . , Nof the capacity planning horizon, the supplier has demand forecast, Xs

n, and decides

whether to offer a capacity reservation contract to the manufacturer. If he does not

offer a contract, the supplier obtains the forecast update δsn and the manufacturer

obtains a forecast update δmn during period n. Otherwise, the supplier offers a menu

of contracts K(ξn), P (ξn). Both capacity K(·) and corresponding payment P (·)are functions of the manufacturer’s private information ξn

7. Given this menu, the

manufacturer chooses a particular contract (K(ξ), P (ξ)) to maximize her profit if her

expected profit from the contract is larger than her reservation profit πm. By doing

so, she announces her forecast information to be ξ, which could differ from her true

forecast information. The supplier commits to not offering another contract if an offer

7Recall from Section 3.2.4 that demand and forecasts have the following relationship; XN+1 =Xmn εn = Xs

nξnεn, where εn ≡∏Nk=n δ

mk represents the demand uncertainty of the system and

ξn ≡∏Nk=n δ

sk/εn represents the manufacturer’s private information.


is declined, and thus the manufacturer chooses a contract and commits to it when

the supplier offers a menu of contracts for the first time8. The supplier builds K(ξ)

units of capacity, and charges P (ξ). The supplier builds capacity at a unit cost of

cn and a fixed cost of Cn. The unit capacity cost cn and the fixed capacity cost Cn

change over time9. During the sales period N+1, the manufacturer observes demand,

XN+1, and places an order at a unit wholesale price of w. The supplier produces the

component at a unit production cost of c and fulfils the order to the extent possible

given the reserved capacity. Then, the manufacturer sells the final product to the

market at a unit retail price of r.10 Unmet demand is lost, and unsold components

have zero salvage value. We denote the c.d.f. and the p.d.f. of εn by Gn(.) and gn(.)

and those of ξn by Fn(.) and fn(.). We assume that r > w > c + cn holds for every

n. Because of the long leadtime for capacity construction, the capacity decision is

irreversible. Therefore, the supplier has a single opportunity to offer a contract to

the manufacturer, and we do not consider the possibility of renegotiation after the

capacity decision is made.

3.4. Formulation

To solve the aforementioned problem, we formulate a two-stage dynamic program.

The first-stage problem is an optimal stopping problem to determine the time to offer

an optimal menu of contracts. The second-stage problem solves the mechanism design

problem. These two stages are nested optimization problems; i.e., the solution of one

stage depends on the solution of the other.

8Alternatively, the manufacturer is not aware of the supplier’s search of the time to offer contracts,and hence she does not defer accepting a contract to later time periods.

9Although it is natural to assume that cn and Cn are increasing in n, we do not make thisassumption to provide a more general result.

10The manufacturer may carry out some value added operations that cost, say, m per unit. Shesells at a unit price r′ > 0. Her effective sales price is r = r′ −m. Hence, without loss of generality,we assume m = 0.


3.4.1 The First Stage Problem

At the beginning of period n, the supplier’s demand forecast is Xsn. Given this

information, the supplier decides whether to offer a menu of contracts or to delay this

offer to the next period;

un(Xsn) =

us, offer a contract,

ud, delay to offer a contract,

for n < N . If the supplier offers the menu of contracts at the beginning of period n,

then the state is updated to indicate that the process has already been stopped. To

do so, we define the termination state t. If the supplier delays to offer the menu of

contracts to the next period, he obtains demand information δsn during period n, and

updates his forecast as Xsn+1 = Xs

nδsn. Hence, the state transition is

Xsn+1 =

t, if Xs

n = t, or Xsn 6= t and un(Xs

n) = us,

Xsnδ

sn, otherwise.

We denote the supplier’s expected profit when he offers the optimal menu of contracts

at period n by πn(Xsn). This profit depends on the optimal mechanism offered in the

second stage. We will explicitly define this function in the next section. The reward

function hn(Xsn) is given by

hn(Xsn) =

πn(Xs

n), if Xsn 6= t and un(Xs

n) = us,

0, if Xsn = t or un(Xs

n) = ud.

Let P = u1(Xs1), . . . , uN(Xs

N) represent a policy that determines when to offer a

contract. Then, the optimal stopping problem is given as

maxP

E

[N∑n=1

hn(Xsn)

],


where the maximization is taken over all admissible policies. The following dynamic

programming algorithm provides the solution to this problem:

VN(Xsn) =

πN(Xs

n), if Xsn 6= t,

0, if Xsn = t,

Vn(Xsn) =

maxπn(Xs

n), E[Vn+1(Xsn+1)|Xs

n], if Xsn 6= t for n < N ,

0, if Xsn = t for n < N .

Notice that it is optimal for the supplier to offer a capacity reservation contract at

the beginning of period n when πn(Xsn) ≥ E[Vn+1(Xs

n+1)|Xsn]; otherwise, it is optimal

to delay to offer a contract.

3.4.2 The Second Stage Problem

Suppose the supplier offers a menu of contracts K(ξ), P (ξ) at the beginning of pe-

riod n. The manufacturer chooses a specific contract (K(ξ), P (ξ)) that maximizes her

expected profit while implying that her forecast information is ξ. Recall that this fore-

cast information could be different from her true information. Hence, by choosing a

contract, the manufacturer defines the supplier’s expected profit, her expected profit,

and the total supply chain’s expected profit. As a function of the manufacturer’s

private forecast information ξn, these profits are defined as follows:

Πsn(K(ξ), P (ξ), ξn, X

sn) ≡ (w − c)Eεn [min(Xs

nξnεn, K(ξ))] + P (ξ)− (cnK(ξ) + Cn),

Πmn (K(ξ), P (ξ), ξn, X

sn) ≡ (r − w)Eεn [min(Xs

nξnεn, K(ξ))]− P (ξ), (3.2)

Πtotn (K(ξ), ξn, X

sn) ≡ (r − c)Eεn [min(Xs

nξnεn, K(ξ))]− (cnK(ξ) + Cn).

The supplier’s objective is to design a menu of contracts that maximizes his ex-

pected profit among all possible contracts. From the revelation principle (Myerson

1982), the supplier can limit the search for the optimal menu of contracts to the class

of incentive-compatible, direct-revelation contracts. Under this type of contract, the

manufacturer credibly reports her private information ξn. In this case, the supplier

can screen the true value of ξn from the manufacturer’s contract choice. Hence,


the supplier’s expected profit from offering an incentive-compatible, direct-revelation

contract (K(ξ), P (ξ)) is Πsn(K(ξn), P (ξn), ξn, X

sn). To identify an optimal menu of

contracts, the supplier solves

πn(Xsn) = max

K(.),P (.)Eξn [Πs

n(K(ξn), P (ξn), ξn, Xsn)] subject to (3.3)

(IC) : Πmn (K(ξ), P (ξ), ξ,Xs

n) ≥ Πmn (K(ξ), P (ξ), ξ,Xs

n) for every ξ 6= ξ

(PC) : Πmn (K(ξ), P (ξ), ξ,Xs

n) ≥ πm for every ξ.

The first constraint is the incentive compatibility constraint (IC), which ensures the

manufacturer’s credible information sharing. The second constraint is the participa-

tion constraint (PC), which ensures the manufacturer’s participation on the contract.

Note that the solution of this problem is a function of the supplier’s forecast infor-

mation, which depends on his timing decision.

3.5. Analysis

To solve the first-stage optimal stopping problem, we need the optimal profit obtained

in the second-stage, which is πn(Xsn). Hence, we solve the mechanism design problem

first and embed its solution to the optimal stopping problem to determine the optimal

time to offer the optimal menu of contracts.

3.5.1 Optimal Capacity Reservation Contract

Once the supplier decides to offer a capacity reservation contract at period n, he

determines the optimal contract given his forecast informationXsn, market uncertainty

εn, and the belief about the manufacturer’s private forecast information ξn. When the

forecast model follows an m-MMAFE, the random variables εn and ξn are log-normally

distributed. The result of this subsection also holds for other random variables.

Hence, we do not assume that εn and ξn are log-normally distributed in this subsection.

We denote the optimal menu of contracts that solves the optimization problem in (3.3)

by Kdcn (ξ), P dc

n (ξ), where εn ∈ [εn, εn] and ξn ∈ [ξn, ξn].


Lemma 3.1. (a) A menu of contracts K(ξ), P (ξ) is feasible for (3.3) if and only

if it satisfies the following conditions for all ξ ∈ [ξn, ξn]:

(i) Πmn (K(ξ), P (ξ), ξ,Xs

n) = πm +

∫ ξ

ξn

[(r − w)Xs

n

∫ K(x)xXsn

εn

ygn(y)dy

]dx,

(ii) K(ξ) is increasing in ξ.

(b) The optimization problem (3.3) has the following equivalent formulation:

πn(Xsn) = max

K(.)Eξn

[(r − c)Eεn [min(Xs

nξnεn, K(ξn))]− (cnK(ξn) + Cn)

−1− Fn(ξn)

fn(ξn)(r − w)Xs

n

∫ K(ξn)ξnX

sn

εn

ygn(y)dy]− πm (3.4)

s.t. K(ξ)is increasing.

After determining the optimal capacity reservation function Kdcn (.) from (3.4), we

derive the corresponding payment as:

P dcn (ξ) = (r − w)Eεn [min(Xs

nξεn, Kdcn (ξ))]

−∫ ξ

ξn

(r − w)Xsn

∫ Kdcn (x)

xXsn

εn

ygn(y)dy

dx− πm. (3.5)

Note that the optimal menu of contracts depends on the forecast, Xsn. This

dependency does not result in the analytical complexity for the structural properties

of the optimal mechanism and the optimal stopping policy. However, to numerically

solve the optimal stopping problem, one needs to determine the optimal capacity

reservation contract for every Xsn to derive πn(Xs

n). To reduce the computational


complexity, we introduce a normalized version of (3.4), which we define as follows:

πn ≡ maxK(.)

Eξn

[(r − c)Eεn [min(ξnεn, K(ξn))]− cnK(ξn)

−1− Fn(ξn)

fn(ξn)(r − w)

∫ K(ξn)ξn

εn

ygn(y)dy]

s.t. K(ξ)is increasing. (3.6)

We denote the optimal solution of (3.6) by Kdcn (.), and we define

P dcn (ξ) ≡ (r − w)Eεn [min(ξεn, K

dcn (ξ))]−

∫ ξ

ξn

(r − w)

∫ Kdcn (x)

x

εn

ygn(y)dy

dx.Then, we have the following properties of the optimal contract and the expected

profit:

Theorem 3.4. The following statements are true for all n:

(a) Kdcn (ξ) = Xs

nKdcn (ξ).

(b) P dcn (ξ) = Xs

nPdcn (ξ)− πm.

(c) πn(Xsn) = Xs

nπn − Cn − πm.

These properties and results show that the optimal menu of contracts is state-

dependent. The menu depends on the supplier’s forecast information at the time the

menu is offered to the manufacturer. The multiplicative nature of the results is due to

the multiplicative nature of forecast updates. If we ignore the fixed parts πm and Cn,

the forecast Xsn scales the problem (3.4). However, the supplier needs to guarantee πm

to ensure the manufacturer’s participation regardless of Xsn. Hence, in Part (a) and

(b) of the theorem, the optimal capacity reservation and prices are linearly increasing

in Xsn, where the capacity prices have the y-intercept of −πm. Note that the unit

capacity reservation price is P dcn (ξ)Kdcn (ξ)

= P dcn (ξ)

Kdcn (ξ)− πm

XsnK

dcn (ξ)

. Without the reservation profit,

the supplier would charge a unit capacity reservation price of P dcn (ξ)/Kdc

n (ξ) to the

manufacturer for all Xsn. However, to ensure the participation of the manufacturer,


the supplier discounts πm

XsnK

dcn (ξ)

from the unit reservation price. The supplier also

incurs the fixed capacity cost Cn regardless of Xsn. Hence, in Part (c), the supplier’s

optimal expected profit is linearly increasing in Xsn with the y-intercept of −Cn−πm.

This result also implies that one needs to solve only N optimization problems to

derive πn(Xsn) for every decision period n to numerically solve the optimal stopping

problem.

Next, we discuss how to solve (3.6), which is a function-space optimization prob-

lem. Without the constraint that K(ξ) is increasing, we can derive the optimal

function K(.) by choosing the value K that maximizes the inner function of (3.6) for

each ξ. If the optimal K(.) of the unconstrained problem is increasing in ξ, then it

is the optimal solution of (3.6). We prove that this approach works when εn and ξn

have increasing generalized failure rates (IGFR)11. Note that the log-normal random

variables have IGFRs, hence the following result holds when the demand forecast is

an m-MMAFE.

Theorem 3.5. If εn and ξn have IGFRs, then the following properties hold:

(a) Kdcn (ξ) is the unique solution of the first-order condition

(r − c)(1−Gn(K

ξ))− cn −

1− Fn(ξ)

ξfn(ξ)(r − w)

K

ξgn(

K

ξ) = 0. (3.7)

(b) Both Kdcn (ξ) and P dc

n (ξ) are increasing in ξ.

(c) Both Πsn(Kdc

n (ξ), P dcn (ξ), ξ,Xs

n) and Πmn (Kdc


n) are increasing in

ξ.

(d) P dcn (Kdc

n ) is an increasing concave function of Kdcn .

Together with Theorem 3.4 and Equation (3.5), Part (a) fully characterizes the

optimal menu of contracts. Part (b) implies that the manufacturer reserves more ca-

pacity and pays more when her forecast is larger. We can represent the manufacturer’s

11The generalized failure rate of a random variable is defined as xf(x)1−F (x) , where f(x) and F (x) are

the p.d.f. and the c.d.f. of the random variable. All random variables with increasing failure rateshave IGFRs, and other common classes of random variables have IGFRs, including Log-normal,Gamma and Weibull.


expected profit under the optimal capacity reservation contract as

Πmn (Kdc


n) = πm + (Πmn (Kdc


n)− πm).

The first term is the manufacturer’s reservation profit, and the second term is the

informational rent of the manufacturer. This rent is the price that the supplier has to

pay to learn the manufacturer’s private information. When the manufacturer has no

private information, i.e., σξn = 0, the informational rent is 0. Then, Part (c) implies

that the manufacturer’s informational rent increases in ξ. Because both Kdcn (ξ) and

P dcn (ξ) are increasing in ξ, we can map the capacity Kdc

n to the price P dcn directly

without ξ. This mapping is denoted by P dcn (Kdc

n ). For this reason, Ozer and Wei

(2006) referred to this contract as the capacity reservation contract12. To reserve K

units of capacity, the manufacturer has to pay P dcn (K) to the supplier. The capacity

reservation contract, P dcn (K), is a concave increasing function as shown in Part (d).

The concavity implies that the marginal reservation price is decreasing in the capacity

level, hence the supplier provides more incentive to a manufacturer who has a higher

forecast.

3.5.2 Optimal Time to Offer the Contract

The previous section reveals that primarily three factors affect the timing decision:

cost of demand uncertainty, cost of asymmetric information (i.e., cost of screening),

and cost of capacity. When the supplier offers the contract at period n, his resulting

expected profit is given in Theorem 3.4(c). This profit consists of the variable part

Xsnπn and the fixed part −Cn − πm. The normalized profit πn in Equation (3.6)

is determined by demand uncertainty εn, information asymmetry ξn, and the unit

capacity cost cn. The impact of these factors on the supplier’s profit is proportional

to his demand forecast. In contrast, the impact of the fixed capacity cost Cn is

independent of the demand forecast. The first two terms in Equation (3.6) represent

12We remark that Ozer and Wei (2006) studied the static mechanism design problem for the ad-ditive uncertainty. This chapter considers the dynamic version of this problem for the multiplicativeuncertainty that is resolved through multiple forecast updates. In the Appendix, we provide thedynamic version of the problem for the additive case.


the total supply chain profit, which is increasing with reduced demand uncertainty.

The last term can be interpreted as the cost of screening13. If the supplier offers the

menu of contracts in the next period instead of the current period, his profit increases

when the benefit of reduced demand uncertainty outweighs the change in the cost of

asymmetric information and the change in the capacity cost. The optimal stopping

formulation considers these trade-offs in determining the optimal time to offer the

menu of contracts. The following theorem characterizes the optimal stopping policy.

Theorem 3.6. A control band policy that offers a capacity reservation contract

at period n if Xsn ∈ [Ln, Un] is optimal, and the optimal thresholds are given as

Un ≡ supXsn : πn(Xs

n) ≥ E[Vn+1(Xsn+1)|Xs

n], and Ln ≡ infXsn : πn(Xs

n) ≥E[Vn+1(Xs

n+1)|Xsn].

Theorem 3.6 establishes the optimality of a control band policy. Under this policy,

the supplier offers the capacity reservation contract at period n if the current forecast

falls within the control band, i.e., Xsn ∈ [Ln, Un]. When the demand forecast is larger

than Un, the benefit of reducing demand uncertainty and asymmetric information

is significant and outweighs the cost of delaying. Recall that demand uncertainty is

multiplicative. Hence, the supplier can significantly increase his profit by reducing the

degree of demand uncertainty when the forecast is large. In contrast, the potential

increase in the fixed capacity cost is constant regardless of the forecast level. In this

case, delaying to offer the contract is optimal when Xsn > Un. When the demand

forecast is smaller than Ln, the cost of delaying to offer the contract (e.g., possible

increase in the unit capacity cost) is negligible. In contrast, the potential decrease

in the fixed capacity cost is again constant regardless of the forecast level. In this

case, delaying to offer the contract is optimal. The following set of results further

characterizes the optimal stopping policy and sheds some light onto these trade-offs.

Theorem 3.7. The following statements are true for all n:

(a) When Cn+1 > Cn for all n, the lower threshold, Ln, is 0 for all n. Hence, an

13Without this last term, the supplier’s objective would be the same as that of a centralizedsystem. This term is the information rent that the supplier (and also the total supply chain) has toforgo to enable credible information sharing.


upper threshold policy that offers the capacity reservation contract at period n

if Xsn ≤ Un is optimal.

(b) Let n∗ ≡ arg maxnπn. When Cn+1 = Cn for all n, the lower and upper thresholds

satisfy that Ln = Un = 0 for n 6= n∗, and that Ln = 0 and Un =∞ for n = n∗.

Hence, a state-independent stopping policy that offers the capacity reservation

contract at period n∗ is optimal.

When the supplier always incurs more fixed capacity costs by delaying the capacity

decision, he should optimally offer the contract when his demand forecast is small.

As the forecast level converges to 0, the benefit of reduced demand uncertainty and

information asymmetry vanishes, whereas the loss in the fixed capacity cost remains

constant. Hence, the supplier should delay to offer the contract only when the de-

mand forecast is large, which implies that the lower threshold is 0 and that an upper

threshold policy is optimal. When the fixed capacity cost is constant over time, the

optimal time to offer the contract is fully determined by the trade-offs among the

demand uncertainty, information asymmetry, and the unit capacity cost, whose im-

pacts on the supplier’s profit are all proportional to the forecast level. Hence, if the

expected profit of offering the contract at period n is greater than the expected profit

of offering the contract at period m, then it is true regardless of the forecast level.

Thus, the optimal policy is to always stop at the optimal stopping period, n∗, at

which the normalized expected profit is the greatest.

3.6. Centralized Supply Chain

In a centralized system, a single decision maker determines both decisions: (1) the

optimal time to decide and build capacity and (2) the optimal capacity level that

maximizes the total supply chain profit. Note that the demand forecast for the

centralized decision maker is Xmn because the centralized decision maker would have

access to all information; i.e., private information concept is irrelevant in this case.

At the beginning of each period n, the centralized decision maker decides whether to

set the capacity level or delay the decision to the next period. This problem can be


formulated as a two-stage dynamic program similar to that in §3.4. However, in this

case, the second-stage is a simple newsvendor-type optimization problem instead of

a mechanism design problem.

First we consider the second-stage problem. The centralized decision maker de-

termines the optimal capacity level that maximizes the total supply chain expected

profit as:

πcsn (Xmn ) ≡ max

K(r − c)Eεn [min(Xm

n εn, K)]− (cnK + Cn).

We define the normalized expected profit πcsn ≡ maxK(r − c)Eεn [min(εn, K)] − cnK,

and then we can represent the optimal expected profit as πcsn (Xmn ) = Xm

n πcsn − Cn.

The optimal capacity level is given as Kcsn (Xm

n ) ≡ Xmn G

−1n ( r−c−cn

r−c ).

Next we consider the first-stage optimal stopping problem. The formulation is

similar to that of the decentralized supplier’s problem in §3.4.1, hence we focus on

the differences between the two problems. In the centralized case, the state variable

is given by Xmn instead of Xs

n. If the centralized decision maker delays the capacity

decision, she obtains the forecast update δmn and updates the forecast as Xmn+1 =

Xmn δ

mn . If she stops and sets the capacity, the reward of stopping at period n is

πcsn (Xmn ). Hence, to solve the problem for a centralized system, we replace Xs

n with

Xmn and πn(Xs

n) with πcsn (Xmn ) in the formulation of §3.4.1. We denote the value-to-go

function of the centralized case by V csn (Xm

n ). Then, we have the following results:

Theorem 3.8. For the centralized supply chain, the following statements are true for

all n:

(a) A control band policy that determines the capacity level at period n if Xmn ∈

[Lcsn , Ucsn ] is optimal, and the optimal thresholds are given as U cs

n ≡ supXmn :

πcsn (Xmn ) ≥ E[V cs

n+1(Xmn+1)|Xm

n ], and

Lcsn ≡ infXmn : πcsn (Xm

n ) ≥ E[V csn+1(Xm

n+1)|Xmn ].

(b) When Cn+1 > Cn for all n, an upper threshold policy that determines the ca-

pacity level at period n if Xmn ≤ U cs

n is optimal.

(c) When Cn+1 = Cn for all n, a state-independent policy that determines the


capacity level at period ncs is optimal, where the optimal stopping period is

given as ncs ≡ arg minnπcsn .

(d) The capacity level of the centralized supply chain is equal to or greater than the

capacity level of the decentralized supply chain, i.e., Kcsn (ξ) ≥ Kdc

n (ξ).

The centralized decision maker’s optimal stopping policy has the same structure

as the supplier’s policy discussed in the previous section. Note, however, that this

theorem does not imply that the optimal stopping thresholds for the centralized

and decentralized systems are the same. The decentralized supplier may determine

the capacity level at a different time than the centralized decision maker, which

is one source of channel inefficiency. Part (d) of the theorem also shows that the

decentralized supplier builds less capacity than the centralized decision maker, which

is another source of channel inefficiency. We quantify these differences through a

numerical study.

3.7. Numerical Study

The purpose of this section is four-fold. First we illustrate some of our results through

numerical examples, such as the form of optimal capacity reservation contract. Sec-

ond, we compare the capacity levels and profits of the centralized supply chain with

those of decentralized supply chain. This comparison enables us to quantify the

cost of suboptimal timing and capacity decisions on profits; i.e., inefficiency due to

decentralization. We also compare the optimal thresholds of the decentralized and

centralized systems. These comparisons characterize the environments in which the

supplier’s self-interested strategy deviates largely from the socially optimal action.

Third, we quantify the impact of the three factors - market uncertainty, information

asymmetry and capacity cost - on the expected profits as well as the optimal stopping

thresholds. These comparisons help us to characterize when the supplier should offer

the contract early or late. Finally, we compare our model with a static model in

which the supplier offers the menu of contracts at a fixed time to quantify the value

of determining the optimal time to offer the capacity reservation contract.


3.7.1 Optimal Capacity Reservation Contract

(a) Optimal Menu of Contract (b) Capacity Levels for Centralized and De-centralized Systems

Figure 3.3: Optimal Capacity Reservation Contract

Figure 3.3(a) illustrates the optimal capacity reservation contracts for three differ-

ent levels of information asymmetry, which is measured by the standard deviation of

log(ξn), i.e., σξn . The supply chain setting for this test is σξn ∈ 0.05, 0.2, 0.5, r = 15,

c = 3, w = 10, cn = 2, Cn = 0, πm = 0, σεn = 1, and Xsn = 1. As shown in Theo-

rem 3.5 Part (e), the optimal capacity reservation contract P dcn (K) is an increasing

concave function of K. This figure also shows that the supplier charges less to reserve

capacity as the degree of information asymmetry, σξn increases. In other words, the

supplier’s cost of screening the manufacturer’s private information decreases with the

degree of information asymmetry. Figure 3.3(b) illustrates the capacity level deter-

mined by the optimal capacity reservation contract and the optimal capacity level

of the centralized system when σξn = 0.5. In both cases, the decision makers builds

more capacity as the forecast increases, but the supplier always builds less capacity

than a centralized decision maker.

When the supplier delays to offer a capacity reservation contract, the expected

profits of each decision maker change due to the changes in the three values: the

degree of the demand uncertainty σεn , the degree of information asymmetry σξn , and

the capacity cost (cn, Cn). Table 3.1 shows the expected profits and the total supply


chain profit for various values.

Table 3.1: Expected Profits When the Supplier Offers the Contract at Period n

σεn Πsn Πm

n Πtotn σξn Πs

n Πmn Πtot

n cn Πsn Πm

n Πtotn

1.00 35.74 19.49 55.22 0.80 33.21 21.16 54.37 2.0 35.74 19.49 55.220.95 37.68 19.78 57.46 0.75 33.54 20.94 54.48 2.3 32.26 18.64 50.900.90 39.63 20.06 59.69 0.70 33.90 20.70 54.60 2.6 29.07 17.88 46.960.85 41.59 20.32 61.91 0.65 34.29 20.44 54.73 2.9 26.14 17.20 43.340.80 43.56 20.55 64.11 0.60 34.72 20.15 54.88 3.2 23.42 16.58 40.000.75 45.54 20.75 66.28 0.55 35.20 19.84 55.04 3.5 20.90 16.01 36.910.70 47.52 20.91 68.43 0.50 35.74 19.49 55.22 3.8 18.54 15.49 34.03

cn = 2, σξn = 0.5 cn = 2, σεn = 1 σεn = 1, σξn = 0.5r = 15, c = 3, w = 10, Cn = 0, πm = 10, Xs

n = 10

The first section of Table 3.1 shows that the expected profits of both decision

makers increase as the demand uncertainty decreases. The second section of Table 3.1

shows that the supplier’s profit and the total supply chain profit increase but the

manufacturer’s expected profit decreases as the degree of information asymmetry

decreases. In other words, the manufacturer’s informational rent decreases with the

degree of information asymmetry. The third section of Table 3.1 shows that the

increased capacity cost reduces the expected profits of both decision makers.

3.7.2 Optimal Time to Offer the Contract

The basic supply chain setting for this and the next subsection is N = 6, r = 15, c = 3,

w = 10, πm = 10, cn = 2, and Cn = 20+(n−1)∆C . The term ∆C indicates the rate of

increase in the fixed cost of capacity. We set the parameters for the forecast evolution

as σδmN = 1, σδsN =√

1.6, σδmn = σm and σδsn = σs for n = 1, 2, . . . , N − 1. The value of

σi quantifies the amount of information that decision maker i obtains at each period

of the capacity planning horizon, and the value of σδiN indicates the degree of residual

demand uncertainty of decision maker i at the end of the capacity planning horizon.

When σm < σs, the supplier obtains more information than the manufacturer during

the capacity planning horizon. We set the values of σs and σm such that the forecast

model follows an m-MMAFE, i.e., we maintain (n−1)(σs)2+σ2δsN≥ (n−1)(σm)2+σ2

δmN


for every n = 1, 2, . . . , N .

Table 3.2 shows the optimal thresholds Un of the decentralized supplier and the

optimal thresholds U csn of the centralized decision maker for several values of σs, σm,

and ∆C . Recall that when Cn is increasing in n, the optimal stopping policy is

an upper threshold policy. Hence, Un is sufficient to describe the optimal stopping

decision. The difference between Un and U csn indicates how much the decentralized

supplier’s strategy deviates from the socially optimal action14. When the optimal

Table 3.2: Optimal Thresholds of Decentralized and Centralized Supply Chains

(σm)2 (σs)2 ∆C U1 U3 U5 U cs1 U cs

3 U cs5

(a) 0.1 0.1 8 39.49 41.12 44.44 30.13 31.49 34.30(b) 0.1 0.2 8 31.39 33.25 36.83 30.13 31.49 34.30(c) 0.2 0.1 8 31.44 26.97 26.11 15.38 16.03 17.57(d) 0.1 0.1 4 19.75 20.56 22.22 15.07 15.75 17.15

threshold of period n is large, the optimal policy delays to offer a contract at period

n only when the forecast is large. Therefore, high threshold levels indicate that the

decision maker tends to offer a contract early and low threshold levels indicate the

opposite.

First we focus on the optimal thresholds of the decentralized supplier. Case (a)

of Table 3.2 is the basis of our experiment. In case (b), the supplier obtains more

information than he does in case (a). Therefore, the supplier waits longer in case

(b) than in case (a) in order to obtain more demand information. In case (c), the

manufacturer obtains more information than she does in case (a), but the supplier’s

obtains the same amount of information in both cases. Even in this case, the supplier

waits longer than in case (a) so that the manufacturer obtains more demand informa-

tion. It is a surprising result because the delay of offering the contract increases the

degree of information asymmetry in case (c). Therefore, the increase of the degree

of information asymmetry is not necessarily bad for the supplier if by offering the

contract later, the supplier can reduce the demand uncertainty through screening the

manufacturer who obtains more information in the current period. Finally, the result

14Even though the stopping decision of the decentralized supplier is made based on Xsn and the

centralized decision maker on Xmn , the forecasts Xs

n and Xmn are the same in expectation.


of case (d) implies that the supplier offers a contract early when the capacity cost

increases rapidly. To summarize, the supplier waits longer when the decision makers

obtain more demand information over the capacity planning horizon and when the

capacity cost increase less rapidly.

Next, we focus on the difference between Un and U csn . The centralized system of

case (b) is identical to that of case (a) because in both cases the centralized system has

the same information. In case (a), the information asymmetry between the two deci-

sion makers remains constant over time. In this case, the decentralized supplier stops

the process earlier than the centralized decision maker, which implies that the sup-

plier’s self-interested strategy keeps the system from waiting until the socially optimal

time and hence reduces channel efficiency. In case (b), the information asymmetry

between the two decision makers decreases over time. Therefore, the decentralized

supplier waits longer because he can reduce the degree of demand uncertainty and

information asymmetry at the same time. Even though the supplier’s optimal pol-

icy is close to the socially optimal policy, this results not from altruism but from

the supplier’s self-interest. In case (c), the information asymmetry between the two

decision makers increases over time. Therefore, the supplier offers a contract much

earlier than the socially optimal point to prevent the information asymmetry from

growing too much. Finally, case (d) shows that both the decentralized supplier and

the centralized decision maker commit early when the capacity cost increases rapidly.

From this result, we can conclude that the supplier offers a contract earlier than the

socially optimal time when the manufacturer obtains much information.

3.7.3 Value of Determining the Optimal Time

In this subsection, we assess the value of determining the optimal time to offer a

capacity reservation contract. Clearly, the supplier is better off by optimizing the

time to offer a contract. However, it is unclear whether or not this strategy benefits

the manufacturer and the total supply chain. To clarify this point, we compare the

profits from our model with those from a static model in which the supplier offers

a contract at a fixed period regardless of the forecast level. We consider N static


models, which offer the contract at each fixed period n ∈ 1, 2, . . . , N, and report

the average gain and the lowest gain in the expected profits. We denote the percentage

improvement in the supplier’s expected profit of using the dynamic model compared

to profit from the static model that offers the contract at period n by

Isn ≡V1(Xs

1)− E[πn(Xsn)|Xs

1 ]

E[πn(Xsn)|Xs

1 ]× 100%.

Similarly, we denote the percentage improvement in the manufacturer’s profit and

the total supply chain profit by Imn and I totn . For each j ∈ s,m, tot, we define

Ijave ≡∑Nn=1 I

jn

Nand Ijmin ≡ minIj1 , . . . , I

jN.

Table 3.3 shows these values for several supply chain settings. The basic setting

is the same as that we use in Table 3.2. The improvement in the supplier’s profit is

Table 3.3: Percentage Improvements in Expected Profits

(σm)2 (σs)2 ∆C Xs1 Isave Ismin Imave Immin I totave I totmin

(a) 0.1 0.1 8 41.12 4.21 0.63 1.10 -5.43 3.12 2.66(b) 0.1 0.1 8 45.00 3.94 2.01 2.15 -4.55 3.32 1.82(c) 0.2 0.1 8 26.97 14.85 0.00 -22.55 -38.00 -1.02 -5.70(d) 0.2 0.1 8 35.00 7.57 3.23 15.45 -9.18 9.31 -0.37(e) 0.1 0.2 8 33.25 8.27 1.19 0.49 -2.09 5.31 2.15(f) 0.1 0.2 8 37.00 7.39 3.25 0.89 -1.76 5.07 3.64(g) 0.1 0.1 4 20.56 5.79 0.85 0.88 -4.61 3.55 3.03(h) 0.1 0.1 4 23.00 5.09 3.03 2.01 -3.74 3.80 1.84

always positive, and we observe Ismin = 0 only in case (c). In this case, the initial

forecast Xs1 is below the optimal threshold U s

1 . Therefore, the supplier stops at the

initial period, and the expected profit from the dynamic strategy is the same as

that from the static model that stops at period 1. The average improvements in the

supplier’s profit range from 3.94% to 14.85% in our experiment. Therefore, we can

conclude that the supplier can significantly improve his expected profit by optimally

determining the time to offer a contract. However, this strategy may not lead to an

improvement in the total supply chain profit. In cases (c) and (d), I totmin is negative, and

in case (c), even I totave is negative. As discussed in the previous subsection, the supplier

in cases (c) and (d) offers the contract very early to prevent the manufacturer from


obtaining too much information. This example verifies that the mechanism designer’s

self-interested strategy to determine the optimal time can decrease the total supply

chain profit. The impact of this strategy is even worse on the manufacturer. In

Table 3.3, Immin is always negative, and Imave is much smaller than Isave except for case

(d).

3.8. Extensions and Generalizations

In this section, we discuss some extensions and generations of our model. We discuss

how the structural properties of the capacity planning problem would change in each

of the generalized models. We also discuss the MMFE for more than two decision

makers.

3.8.1 Endogenous Wholesale Price

In previous sections, we have assumed that the wholesale price is determined be-

fore the beginning of the capacity planning horizon. In this subsection, we con-

sider the case in which the supplier determines the wholesale price w when he offers

the screening contract. In this case, the supplier includes w(ξ) in his menu of con-

tracts, i.e., he offers K(ξ), P (ξ), w(ξ) to the manufacturer. As before, the manu-

facturer chooses a specific contract (K(ξ), P (ξ), w(ξ)) that maximizes her expected

profit given this menu. We denote the optimal menu of contracts at period n by

Kdcn (ξ), P dc

n (ξ), wdcn (ξ).

Theorem 3.9. The optimal menu of contracts is given as wdcn (ξ) = r, Kdcn (ξ) =

XsnξG

−1n ( r−c−cn

r−c ), and P dcn (ξ) = −πm. The supplier’s optimal expected profit is given

as πsn(Xsn) = Xs

nπcsn − Cn − πn.

It is important to note that the manufacturer’s unit profit margin is 0 under the

optimal menu of contracts because w = r. This result implies that with the ability

to determine the wholesale price w, the supplier obtains the manufacturer’s business

in exchange for the manufacturer’s reservation profit. Hence, the capacity level de-

termined by the optimal contract is the same as the optimal capacity level of the


centralized supply chain, and the supplier’s optimal expected profit has the same

structure as in the original problem except that πsn is now replaced by πcsn . Because

the structure of the optimal stopping policy does not depend on the value of πsn in

the original problem, Theorems 3.6 and 3.7 hold for the endogenous wholesale price

case.

3.8.2 Forecast Update Costs

Previously, the decision makers obtain new demand information over time without

incurring an additional cost specific to each update (as in Ozer et al. 2007 and all

forecast-related papers referenced in this chapter except for Taylor and Xiao 2009,

and Ulu and Smith 2009). Firms update forecasts as they get closer to the sales period

because they obtain information about, for example, the overall economy, consumer

tastes or past sales data. Often forecast related costs are sunk because firms invest

in forecasting upfront regardless of whether they obtain information updates. Hence,

most managerial decisions and related literature treats these costs as exogenous to

the decision problem. However, there may be cases in which firms may incur some

fixed cost to obtain more information (as in Taylor and Xiao 2009, and Ulu and Smith

2009). Let κsn be the cost to obtain the demand information δsn for the supplier, and

let κmn be the fixed cost to obtain the demand information δmn for the manufacturer.

Because the manufacturer incurs extra costs to update demand information, the man-

ufacturer’s reservation profit is now πmn = πm1 +∑n−1

k=1 κmk . In this case, the fixed part

of the expected profit πn(Xsn) becomes −Cn − πmn instead of −Cn − πm. In addition,

the reward function in §3.4.1 is now

hn(Xsn) =

πn(Xs

n), if Xsn 6= t and un(Xs

n) = us,

−κsn, if Xsn = t or un(Xs

n) = ud.

The optimal stopping policy is also a control-band policy. However, now the fixed

loss of delaying to offer a contract at period n is κsn + (Cn+1 + πmn+1 − Cn − πmn ) =

κsn+κmn + (Cn+1−Cn). Because κsn+κmn + (Cn+1−Cn) ≥ Cn+1−Cn ≥ 0, the optimal

stopping policy is still the upper threshold policy when Cn+1 > Cn for all n. However,


even when Cn+1 = Cn, the fixed loss of delaying is non-negative, hence Part (b) of

Theorem 3.7 does not hold any more.

3.8.3 State Dependent Reservation Profit

In some cases, the manufacturer’s reservation profit is proportionally increasing in

her demand forecast Xmn , i.e., the reservation profit is given as Xm

n πm = Xs

nξnπm. In

this case, the mechanism design problem (3.3) now has the following participation

constraint: PC : Πmn (K(ξ), P (ξ), ξ,Xs

n) ≥ Xsnξπ

m for every ξ. In this case, the struc-

ture of the optimal stopping policy remains the same. To prove this result, we define

K(.) = XsnK(.) and P (.) = Xs

nP (.). Then, the supplier’s and the manufacturer’s

profit in Equation (3.2) can be expressed as XsnΠs

n(K(ξ), P (ξ), ξ, 1) +Cn−Cn and

XsnΠm

n (K(ξ), P (ξ), ξ, 1), respectively. Using these properties, the supplier’s optimiza-

tion problem can be reformulated as

πn(Xsn) = Xs

n

(max

K(.),P (.)

Eξn [Πs

n(K(ξn), P (ξn), ξn, 1)] + Cn

)− Cn (3.8)

s.t. (IC) : Πmn (K(ξ), P (ξ), ξ, 1) ≥ Πm

n (K(ξ), P (ξ), ξ, 1) for every ξ 6= ξ

(PC) : Πmn (K(ξ), P (ξ), ξ, 1) ≥ ξπm for every ξ.

Because all constraints and the term inside of . in (3.8) are independent of Xsn,

we have πn(Xsn) = Xs

nπn − Cn, where πn ≡ πn(1) + Cn. Hence, following the proofs

of Theorems 3.6 and 3.7, one can verify that the structure of the optimal stopping

policy remains the same.

Although the optimal stopping policy remains the same, the optimal mechanism

cannot be derived in a simple form any more. The key property that establishes

Lemma (3.1) is the fact that the participation constraint is binding only at ξn. When

the reservation profit is increasing in ξ, this property does not hold in general, and

thus we cannot obtain a simpler formulation for the mechanism design problem as in

Lemma (3.1).


3.8.4 Dynamic Mechanism Design under Non-Commitment

Previously, we have assumed that the supplier commits to not offering another menu

of contracts if the manufacturer declines an offer. This assumption has been com-

monly made in the auction literature (McAfee and McMillan 1987), and similar take-

it-or-leave-it strategies are prevalent in various markets. Riley and Zeckhauser (1983)

proved that such a commitment is indeed the best strategy for the seller of a product

who faces a single buyer. However, in some cases, the agent (manufacturer) may not

find the principal’s (supplier’s) commitment of not offering another contract credible.

For example, consider the case in which the supplier and the manufacturer would

never be involved in the same business after the current business, and the manu-

facturer is the only one who purchases the component that the supplier currently

provides. In this case, if the manufacturer declines an offer from the supplier, the

supplier has an incentive to provide a new menu of contracts instead of committing to

not offering another menu. Such cases are called non-commitment in the economics

literature (Laffont and Tirole 1988, Skreta 2006).

Deriving an optimal mechanism under non-commitment in a dynamic framework

is a very challenging problem. Solving our capacity planning problem under non-

commitment is especially challenging for three reasons. First, it has been shown by

Laffont and Tirole (1988) that the revelation principle is not valid when the principal

offers multiple short-term contracts over time under non-commitment except for the

very last period. Hence, the supplier’s mechanism design problem cannot be derived

in a simple form as in Equation (3.3) except for period N . Second, the supplier can

acquire some information about the manufacturer’s private information by observing

the manufacturer’s refusal of offered contracts. Hence, the supplier needs to design

the optimal mechanism by taking this information into account. The probability

distribution function of the manufacturer’s private information given such information

is not that of a log-normal random variable any more. It is important to recall that

the IGFR property of log-normal random variables is critical for deriving the simple

optimality condition given in Theorem 3.5. For this reason, we cannot easily derive the

optimal mechanism even for period N at which the revelation principle is valid. Third,

the manufacturer’s private information evolves over time in our model. Because of


the complexity of fully deriving an optimal mechanism for a two-period problem

with static private information, Laffont and Tirole (1988) provide some important

properties of special classes of equilibria instead of characterizing the optimal contract.

Although our problem setting is different from that of Laffont and Tirole (1988), the

dynamic nature of the private information makes our problem more complex than

theirs.

3.8.5 MMFE for J > 2 decision makers

When we construct the MMFE for two decision makers, we divide the total informa-

tion into (N+1)2 groups by the time that each decision maker obtains the information.

Similarly, for J > 2 decision makers, we divide the information sets into (N + 1)J

groups such that each δn1,...,nJ represents demand information that is obtained by

J decision makers at the specified period. By determining the standard deviations

of each log(δn1,...,nJ ), one can similarly describe the forecast evolution for J decision

makers in a consistent manner.

3.9. Conclusion

In this chapter, we have provided a framework to generalize the MMFE for multiple

decision makers who forecast demand for the same product. We have shown that

this framework is consistent and can be used to model several forecast evolution

scenarios when multiple decision makers employ different forecasters. We model the

scenario in which forecasters have asymmetric demand information that changes over

time and referred to this model as the Martingale Model of Asymmetric Forecast

Evolution. Using this model, we have studied a supplier’s problem of determining

the optimal time to offer a capacity reservation contract to a manufacturer. We have

characterized structural properties of the optimal time to offer the contract, and the

optimal capacity reservation contract. We have also established how these decisions

are linked. We have provided managerial insights through numerical studies. For

example, we have shown that the supplier can significantly improve his profit by


optimally determining the time to offer a contract. Even though our focus has been

on the capacity reservation contract, we have also discussed how the framework can

be used to determine the optimal time to design a mechanism for other problems

with asymmetric information and dynamic information updates. For example, the

seller of a product can delay to design a selling mechanism to update his knowledge

of the customers’ valuation on the product by observing more early sales data. The

consumer of an airline ticket may delay to bid for a name-your-own-price ticket to

update her valuation of the product as the traveling date approaches (Courty and Li

2000 and Akan et al. 2009).

Chapter 4

A Dynamic Strategy to Optimize

Market Entry Timing and Process

Improvement Decisions

4.1. Introduction

Manufacturing firms determine the timing for introducing a new product to the mar-

ket in the presence of two major uncertainties. First, manufacturing firms are un-

certain about competing firms’ market entry timing. Entering the market later than

competitors results in a drastic reduction of profit in a highly competitive environ-

ment. For example, Kumar and McCaffrey (2003) estimate the penalty of being late

to market by one quarter in the hard disk drive industry at $106 million, which corre-

sponds to fifty percent of gross profit. Second, manufacturing firms are also uncertain

about whether they can complete the development of the new production process by

the product launch time because the outcomes of manufacturing process development

activities are often highly unpredictable. Product launch with an ill-prepared pro-

duction process significantly reduces profit. In 2005, Microsoft launched the Xbox360

one year ahead of the competing game consoles’ market entry. The early market

entry resulted in a huge number of failing units, which cost Microsoft $1.15 billion

for repairs (Taub 2007). In the presence of such uncertainties, optimizing the market

76

CHAPTER 4. DYNAMIC MARKET ENTRY STRATEGY 77

entry timing is a challenging problem. The market entry decision also involves con-

siderable risk, because the decision is irreversible. This chapter seeks for a solution

for optimizing the market entry timing in consideration of risk.

Conventionally, manufacturing firms determine the market entry timing long be-

fore launching the product. When determining the market entry timing, firms take

into consideration the trade-off between the time-to-market and the completeness of

the production processes. On the one hand, manufacturing firms can attain a large

market share by entering the market early. On the other hand, manufacturing firms

can improve the production process for the new product by investing more time in

process design, which results in a reduction of production costs. When determining

the market entry timing, manufacturing firms are uncertain about the timing of the

competitors’ market entry. Hence, they often make the timing decision based on

their estimation of the competitors’ market entry timing. The decision may have a

poor outcome if the actual timing of the competitors’ market entry deviates signif-

icantly from the estimation. In addition, manufacturing firms frequently encounter

failures of production process development activities for a new product (Pisano 1996),

i.e., uncertainties reside in the learning activities for process design. Hence, without

effectively adjusting the market entry timing depending on the outcome of process

development activities, manufacturing firms may introduce a new product with an

ill-understood production process. As a remedy for these failures, manufacturing

firms can adopt a strategy that determines the market entry timing dynamically in

response to the evolution of uncertain factors.

For the dynamic market entry strategy to be effective, the coordination between

the market entry timing and the development of the production process is important.

Consider, for example, the case in which competitors of a manufacturer have intro-

duced their products earlier than expected. In this case, the manufacturer would want

to accelerate his market entry in order to avoid a significant loss in the market share.

However, to complete the new production process by the accelerated market entry

timing, the manufacturer has to invest more capital in process design to expedite the

learning rate. As another example, consider the case in which the manufacturer has

encountered several failures of manufacturing process development projects during


the process design. In this case, the manufacturer may suffer from a low production

yield and a large production cost unless he delays the market entry timing. With-

out flexible management of process development, the dynamic market entry strategy

cannot be effective.

The production and the pricing decisions for a new product also have to be coordi-

nated with the market entry decision. For example, if a manufacturing firm introduces

a new product much earlier than competitors, the firm can set a high monopolist price

for the new product. The firm also needs to produce a large number of products to

fulfil high demand. On the other hand, when manufacturing firms determine the mar-

ket entry timing, they take into consideration the production and pricing decisions

for the new product. In other words, manufacturing firms need to jointly optimize

the prior-market-entry decisions, i.e., market entry timing and process improvement

decisions, and post-market-entry decisions, i.e., production and pricing decisions, for

a new product.

In this chapter, we consider the problem of a manufacturer who employs a dynamic

strategy to optimize the decisions about market entry timing and process improve-

ment. The manufacturer faces a two-stage stochastic decision process, which consists

of the process design stage and the production and sales stage. During the process

design stage, the manufacturer first determines whether to continue process design

or stop it. When the manufacturer continues process design, he makes investment

decisions to improve the production process for the new product. However, the size

of the potential market for the new product decreases as the manufacturer delays

market entry. The decision process proceeds to the production and sales stage when

the manufacturer stops process design. In this stage, the manufacturer determines

the production quantity and the sales prices for the new product in the presence of

demand uncertainty.

We establish the optimality of several threshold-type market entry policies that

prescribe the optimal time to introduce the new product to the market. Under a

threshold-type market entry policy, the manufacturer stops process design if the state

of the problem exceeds a threshold and continues it otherwise. For example, we

prove the optimality of a knowledge-level-based threshold policy under which the


manufacturer enters the market if the level of cumulative knowledge regarding the

production process exceeds a certain threshold. We also characterize monotonicities

of the optimal thresholds. We next provide properties of the optimal production and

pricing decisions. To do so, we derive the solution of the second-stage problem as

functions of the outcome of the first-stage problem. These functions enable us to

quantify the trade-off between the time-to-market and process improvements in the

process design stage.

Using our modeling framework, we develop two measures that assess the value

of the dynamic strategy. The measures are based on comparisons of the dynamic

strategy to a conventional static strategy. The first measure evaluates the profit gain

that the dynamic strategy provides, and the second measure evaluates the reduction

in the variability of profit achieved by the dynamic strategy. Our numerical study

shows that the manufacturer can increase profit, while reducing the variability of

profit by employing the dynamic strategy. Our numerical study also characterizes

industry characteristics under which the dynamic strategy is the most effective.

The trade-off between the time-to-market and the completeness of development in

new product introduction has been studied extensively. By developing quantitative

models that assess this trade-off, researchers have determined optimal market entry

decisions under various industry characteristics (e.g., Kalish and Lilien 1986, Cohen

et al. 1996, Bayus 1997, Bayus et al. 1997, and Ulku et al. 2005). Researchers have

also studied market entry decisions in game theoretic frameworks (e.g., Klastorin and

Tsai (2004) and Savin and Terwiesch (2005)). However, the existing research on time-

to-market decisions has focused on static market entry strategies. One exceptional

example is the work by Ozer and Uncu (2008) who studied a supplier’s problem of

dynamically determining the time to apply for product qualification. Our study is

different from Ozer and Uncu (2008) in four dimensions. First, we consider a general

manufacturing firm which does not face a qualification procedure, whereas they con-

sider a supplier who have to pass qualification tests to sell their products. Second, we

take into consideration the uncertainties in the development of the production pro-

cess. The dynamic strategy we propose enables manufacturing firms to respond to

the realization of unexpected events driven by such uncertainties. Third, in addition


to the market entry decision, we consider the decisions about process improvements.

As mentioned above, the coordination between the market entry decision and pro-

cess improvement decisions are crucial for the effectiveness of the dynamic strategy.

Fourth, we investigate the risk reduction benefit of the dynamic strategy, which has

been neglected by Ozer and Uncu (2008). We refer the reader to Krishnan and Ulrich

(2001) and Shane and Ulrich (2004) for further references regarding time-to-market

decisions.

A group of researchers have addressed the problem of managing process improve-

ment decisions during a product introduction stage (e.g., Terwiesch and Bohn 2001,

Terwiesch and Xu 2004, and Carrillo and Franza 2006). In contrast to the existing

studies on this literature, we develop a stochastic learning model that incorporates

the uncertainties in the outcome of process improvement activities. Fine and Porteus

(1989) have also developed a stochastic learning model for process improvements, but

they consider process improvements in the middle of a product’s life cycle. In our

model, process improvement decisions have to be jointly optimized with the mar-

ket entry decision, whereas no market entry decision needs to be made in Fine and

Porteus (1989)’s model.

The rest of the chapter is organized as follows. In §4.2, we introduce our model and

notation. In §4.3, we formulate a two-stage dynamic program to solve the problem.

In §4.4, we provide structural properties of the optimal market entry policy. We also

characterize the optimal production and pricing decisions. In §4.5, we present the

results of our numerical study and generate managerial insights regarding the dynamic

strategy. In §4.6, we conclude and provide possible future research directions. All

proofs are in the Appendix.

4.2. Model

We consider a manufacturer who dynamically determines the timing for introducing

a new product to the market. Prior to introducing the product, the manufacturer

can improve the production process for the new product by investing in learning ac-

tivities such as adjustments of the process recipe, development of faster inspection


methods, and reduction of defect rates (Terwiesch and Bohn 2001)1. After launching

the product, the manufacturer makes production and pricing decisions for the new

product in the presence of demand uncertainty. We model the manufacturer’s prob-

lem as a two-stage stochastic decision process. The first stage is the process design

stage during which the manufacturer dynamically determines (i) learning activities to

improve the production process and (ii) the time to stop process design and introduce

the product. The second stage is the production and sales stage, during which the

manufacturer determines (i) the production quantity and (ii) the sales prices for the

new product. All revenues and costs are discounted by α ∈ (0, 1].

The process design stage consists of T periods, indexed from 1 to T . Let xt be the

manufacturer’s level of cumulative knowledge related to the production process at the

beginning of period t of the process design stage. At the beginning of each period t,

the manufacturer first determines whether to stop process design or continue it. If the

manufacturer decides to continue process design, then he chooses a learning activity

it from the set of available options It and invests in it. If the manufacturer invests in

option it, the manufacturer’s knowledge level increases2 by a random amount kt(it),

which is independent of xt3. Learning activities incur investment costs, which we

denote by ci(it).

As the manufacturer continues process design, the expected size of the potential

market for the new product decreases for two reasons. First, the delay in market entry

increases the chance of competitors’ launching competing products earlier than the

manufacturer, which reduces the manufacturer’s market share (Savin and Terwiesch

2005, Ozer and Uncu 2008). Second, by delaying the market entry, the manufacturer

loses a fixed amount of time for selling the new product because the life cycle (or

the time window) of one generation of products is often determined exogenously

(Cohen et al. 1996, Kumar and McCaffrey 2003, Klastorin and Tsai 2004). We model

the dynamics of the market potential following Kalyanaram and Krishnan (1997)

1Such learning activities are also called learning-before-doing (Pisano 1996, Carrillo and Gaimon2000, Terwiesch and Xu 2004).

2We use the terms increasing and decreasing in the weak sense; i.e., increasing means non-decreasing.

3Similar additive learning models have been used by Fine and Porteus (1989), Cohen et al. (1996),and Terwiesch and Xu (2004).


and Ulku et al. (2005). To incorporate uncertainties in the dynamics of the market

potential, we extend Ulku et al. (2005)’s model. Let st be the size of the potential

market for the new product when the manufacturer introduces the product at period t.

If the manufacturer continues process design, then the market potential decreases by

a random amount wt(st), which is independent of kt(it). The random variable wt(st)

models uncertainties residing in external factors such as competing firms’ market

entry timing. We assume that st+1 = st−wt(st) is stochastically increasing4 in st. In

other words, if the market potential at period t is larger, then the market potential

at period t + 1 is also larger. We do not make any additional assumptions on the

dynamics of the market potential, and thus this model can describe various plausible

cases of market potential changes including that of Ulku et al. (2005).

When the manufacturer stops process design, the problem proceeds to the pro-

duction and sales stage, which consists of a regular sales period r and a salvage period

s. Demand during each period n ∈ r, s depends on the market potential st and

price pn, and has the following form:

Dn(st, pn) = stAnp−bn .

The parameter b > 1 is the price elasticity of demand, and An is a random variable

that models demand uncertainty. We denote the support of An by [An, An] for each

n ∈ r, s. At the beginning of the regular sales period, the manufacturer first

determines the number of products to produce, Q, and then determines the regular

sales price, pr. During the regular sales period, random demand Dr(st, pr) is realized,

and the problem proceeds to the salvage period. At the beginning of the salvage

period, the manufacturer determines the salvage price, ps, to sell remaining products

from the regular sales period. During the salvage period, random demand Ds(st, ps)

is realized, and unsold products have no zero value after the salvage period. Revenue

from the salvage period is discounted by β ∈ (0, 1].

The unit production cost for the new product is determined by the manufacturer’s

knowledge level xt at the time when he stops process design. If the manufacturer stops

4A parameterized random variable x(θ) is stochastically increasing [resp., decreasing] in θ ifE[u(x(θ))] is increasing in θ for all increasing [resp., decreasing] functions u.


process design with knowledge level xt, the unit production cost of the new product is

cp(xt) ≡ δ0 + δ1e−γxt , which consists of the irreducible cost δ0 and the reducible cost

δ1e−γxt (Savin and Terwiesch 2005). The parameter γ indicates the rate of return

of knowledge. The unit production cost decreases in xt, but the marginal benefit

becomes smaller as the knowledge level becomes higher. There is well-documented

evidence for diminishing returns in learning (see, for example, Zangwill and Kantor

1998 and Lapre et al. 2000). Because the learning activities we consider are not

intended to improve the product but to improve the production process, the attributes

and quality of the new product are independent of the knowledge level. Hence, the

knowledge level does not directly affect demand for the new product. However, it has

an indirect impact on demand because the manufacturer’s pricing decision depends

on the unit production cost, which in turn depends on the knowledge level.

The sequence of events is as follows. At the beginning of period t ∈ 1, 2, . . . , Tof the process design stage, the manufacturer first determines whether to continue

process design or stop it, depending on the market potential, st, and the knowledge

level, xt. The manufacturer only knows the distribution of wt(st) and kt(it) at this

point. If the manufacturer decides to continue process design, he makes an investment

decision it ∈ It to improve the production process, which incurs an investment cost

ci(it). At the end of period t, kt(it) and wt(st) are realized, and the states are updated

as xt+1 = xt + kt(it) and st+1 = st − wt(st). Then, the problem proceeds to period

t + 1. If the manufacturer decides to stop process design at the beginning of period

t, the problem proceeds to the production and sales stage. The manufacturer has

to stop process design at the beginning of period T + 1 (or at the end of period T ).

Before the beginning of the regular sales period of the production and sales stage, the

manufacturer produces Q units of products at the unit production cost of cp(xt). The

manufacturer then determines the regular sales price, pr. During the regular sales

period, Ar is realized, and the manufacturer collects revenue pr minQ, stArp−br and

loses unmet demand (Q − stArp−br )+. Then, the problem proceeds to the salvage

period. At the beginning of the salvage period, the manufacturer determines the

salvage price ps to sell remaining products. During the salvage period, As is realized,

and the manufacturer collects revenue. Appendix A provides a glossary of notation


for easy reference.

4.3. Formulation

This section describes a two-stage dynamic program that we have formulated to solve

the manufacturer’s problem. The first-stage problem is an optimal stopping problem

with additional decisions to determine the time to introduce the new product and

investment decisions for process improvement. The second-stage problem is a pricing

and production decision problem. The solution of each stage depends on the solution

of the other stage, i.e., the two stages are nested.

4.3.1 The First-Stage Problem

At the beginning of period t ∈ 1, . . . , T, the manufacturer’s knowledge level is xt

and the market potential is st. Given this information, the manufacturer decides

whether to continue process design or to stop it;

ut =

us, stop process design

uc, continue process design.

If the manufacturer stops process design at period t, then the state is updated to

indicate that the process has already been stopped. To do so, we define the terminal

state st = S. If the manufacturer decides to continue process design at period t, he

invests in a learning activity it ∈ It to improve the production process. Then, the

knowledge level is updated as

xt+1 = xt + kt(it),

and the market potential is updated as

st+1 =

S, if st = S, or st 6= S and ut = us

st − wt(st), otherwise.


The manufacturer has to stop process design at the beginning of period T+1 if he has

not done so before. We denote the manufacturer’s optimal expected profit when he

stops process design at period t by Πt(st, xt). We explicitly define this profit function

in the next subsection. The reward that the manufacturer attains at period t ≤ T is

given as

gt(st, xt, it) =

−ci(it), if st 6= S and ut = uc,

Πt(st, xt), if st 6= S and ut = us,

0, otherwise,

and the reward at period t = T + 1 is given as

gT+1(sT+1, xT+1) =

ΠT+1(sT+1, xT+1), if sT+1 6= S

0, otherwise.

Let P = u1(s1, x1), i1(s1, x1), . . . , uT (sT , xT ), iT (sT , xT ) indicate a policy that de-

termines investment decisions and the stopping decision. Then, the manufacturer’s

first-stage problem is given as

maxP

E

[T∑t=1

αt−1gt(st, xt, it) + αTgT+1(sT+1, xT+1)

], (4.1)

where the maximum is taken for all admissible policies.

We can use the following dynamic programming algorithm to solve this problem:

VT+1(sT+1, xT+1) = ΠT+1(sT+1, xT+1)

Vt(st, xt) = max

Πt(st, xt),max

it∈It(−ci(it) + αE [Vt+1(st − wt(st), xt + kt(it))])

.

It is optimal to stop process design at period t if

Πt(st, xt) ≥ maxit∈It

(−ci(it) + αE [Vt+1(st − wt(st), xt + kt(it))]) , (4.2)

and it is optimal otherwise to continue process design by investing in

i∗t (st, xt) ≡ arg maxit∈It [−ci(it) + αE[Vt+1(st − wt(st), xt + kt(it))]] . (4.3)


4.3.2 The Second-Stage Problem

The second stage problem has two decision points: the beginning of the regular

sales period and the beginning of the salvage period. We formulate the second-stage

problem by a backward induction. Suppose that the manufacturer has Qs units of

unsold products before the salvage period. The manufacturer determines the salvage

price to maximize the expected revenue as

Js(st, Qs) ≡ maxps

psE[minQs, stAsp−bs ]. (4.4)

Next we suppose that the manufacturer has Q units of products before the regular

sales period. The manufacturer determines the regular sales price to maximize the

revenue-to-go as

Jr(st, Q) ≡ maxpr

prE[minQ, stArp−br ] + βE[Js(st, [Q− stArp−br ]+)]. (4.5)

Finally, the optimal production quantity is determined to maximize the total profit

as

Πt(st, xt) = maxQ

Jr(st, Q)− cp(xt)Q. (4.6)

4.4. Analysis

To solve the first-stage problem, we need the profit function Πt(st, xt) from the second-

stage problem. Hence, we solve the second-stage problem first, and then embed the

solution in the first-stage problem.

4.4.1 Optimal Production and Pricing Decisions

We first derive the optimal salvage price, p∗s(st, Qs) as a function of the market po-

tential st and the number of remaining products Qs. To do so, we transform the op-

timization problem (4.4) into an optimization problem that is independent of states

by changing the decision variable. We define zs ≡ Qsstp−bs

, which is called the stocking


factor (Petruzzi and Dada 1999). The stocking factor indicates the number of stan-

dard deviations that stocking quantity deviates from the expected demand. Hence,

a higher stocking level results in a higher demand fill-rate. We refer the reader to

Petruzzi and Dada (1999) and Monahan et al. (2004) for further discussions of the

stocking factor. Then, by replacing ps by(stzsQs

)1/b

in (4.4), we can derive

Js(st, Qs) = maxps

psE[minQs, stAsp−bs ]

= st

(Qs

st

)1−1/b

maxzs

E[minz1/bs , Asz

−1+1/bs ], (4.7)

which decouples the states st and Qs from the optimization problem. We denote

the optimal solution and the resulting value of the decoupled problem by z∗s ≡arg maxzsE[minz1/b

s , Asz−1+1/bs ] and J∗s ≡ E[min(z∗s)1/b, As(z

∗s)−1+1/b]. Then, the

following theorem characterizes the optimal salvage price and the optimal revenue-

to-go function.

Theorem 4.1. The optimal stocking factor always satisfies z∗s ∈ [As, As]. If, in

addition, As has an increasing generalized failure rate (IGFR)5, z∗s is the unique

solution of the equation

zProb(As > z) = (1− 1

b)E[minz, As]. (4.8)

Then, the optimal salvage price is given as p∗s(st, Qs) =(stz∗sQs

)1/b

, and the optimal

revenue-to-go function is given as Js(st, Qs) = st

(Qsst

)1−1/b

J∗s .

The optimality of z∗s that satisfies the equation (4.8) is shown by Monahan et al.

(2004).

Theorem 4.1 shows that the optimal salvage price increases in the market potential

and decreases in the number of remaining products. As the size of the potential

5The generalized failure rate of a random variable is defined as xf(x)1−F (x) , where f(x) and F (x) are

the p.d.f. and the c.d.f. of the random variable. All random variables with increasing failure rateshave IGFRs, and other common classes of random variables have IGFRs, including Log-normal,Gamma and Weibull.


market becomes larger, i.e., as st becomes larger, the manufacturer charges a larger

salvage price to maximize revenue. However, the salvage price decreases as the number

of remaining products increases because the manufacture would want to salvage most

remaining products during the salvage period. The theorem also shows that the

optimal revenue-to-go function increases in both the market potential and the number

of remaining products.

We next derive the optimal regular sales price, p∗r(st, xt), and the optimal produc-

tion quantity, Q∗(st, xt), as functions of the market potential st and the knowledge

level xt. To do so, we first transform the optimization problem (4.5) into a state-

independent optimization problem as before. We define the stocking factor of the

regular sales period as zr ≡ Q

stp−br

. Then, by replacing pr by(stzrQ

)1/b

in (4.5), we can

derive

Jr(st, Q) = maxpr

prE[minQ, stArp−br ] + βE[Js(st, [Q− stArp−br ]+)]

= max

pr

prE[minQ, stArp−br ] + βE

[st

([Q− stArp−br ]+

st

)1−1/b

J∗s

]

= st

(Q

st

)1−1/b

maxzr

E[minz1/b, Arz

−1+1/b] + βJ∗s z−1+1/bE[((z − Ar)+)1−1/b]

,

which again decouples the states from the optimization problem. For notational

simplicity, we define f(z) ≡ E[minz1/b, Arz−1+1/b] + Asz

−1+1/bE[((z − Ar)+)1−1/b],

and denote the optimal solution and the resulting value of the decoupled problem by

z∗r ≡ arg maxzf(z) and J∗r ≡ f(z∗r ).

Theorem 4.2. The optimal stocking factor z∗r is either the unique solution of the

equation

βJ∗sE[Ar(z − Ar)−1/b] = E[Ar] (4.9)

or arg maxz∈[Ar,Ar]f(z). Then, the optimal revenue-to-go function is give as Jr(st, Q) =

stJ∗r

(Qst

)1−1/b

, and the optimal regular sales price and the optimal production quantity


are given as

p∗r(st, xt) =(b− 1)J∗r

bcp(xt)(z∗r )1/b

Q∗(st, xt) = st

((b− 1)J∗rbcp(xt)

)b.

Finally, the optimal expected profit that the manufacturer can attain by introducing

the product at period t is

Πt(st, xt) = stcp(xt)1−bπ∗, (4.10)

where π∗ ≡ 1b−1

((b−1)J∗r

b

)b.

The optimal stocking factor for the regular sales period is either in the set [Ar, Ar]

or greater than Ar. Because f(z) is not a unimodal function in general, we cannot

derive the optimal z∗r from a simple equation as before. However, for z > Ar, this

function is quasi-concave and the maximum is at the point that satisfies the equation

(4.9). Hence, to determine the optimal stocking factor z∗r , one needs to evaluate

f(z) for every z ∈ [Ar, Ar] and for the single point that satisfies the equation (4.9).

Although numerically determining such z∗r is computationally difficult, the optimal

stocking factor depends on neither st nor Q. Hence, by solving only one optimization

problem, we can derive the optimal production quantity, regular sales price, and the

revenue-to-go function for all states.

Theorem 4.2 shows that the optimal production quantity is linearly increasing in

the market potential and also increasing in the knowledge level. A higher knowledge

level means a lower unit production cost. Hence, this result implies that the manufac-

turer produces more products when the unit production cost is smaller. The optimal

regular sales price is increasing in the knowledge level and is independent of the mar-

ket potential. It is important to note that the market potential affects neither the

price elasticity of demands nor the coefficient of variation of demands. The market

potential only scales the size of the second-stage problem. Hence, the optimal regular

price is independent of it. Finally, the optimal expected profit is increasing in the


market potential and the knowledge level. When the manufacturer continues process

design, the market potential and the knowledge level change in opposite directions,

i.e., there is a trade-off between the market potential and the knowledge level.

4.4.2 Optimal Market Entry Policy

When determining whether to continue process design or to stop it, the manufacturer

takes three factors into account: investment cost, the profit gain achieved by improved

knowledge, and the profit loss incurred by decrease in the market potential. Among

the three factors, investment cost depends neither on the current knowledge level

nor on the current market potential, whereas the other two factors depend on both.

For example, when the current knowledge level is very high, additional knowledge

does not improve the manufacturer’s profit. The benefit of improved knowledge is

also negligible when the size of the potential market is small. By investigating such

dependencies, we establish the optimality of threshold-type optimal market entry

policies. A threshold-type market entry policy, for example, has the following form:

stop process design if the current knowledge level exceeds a certain threshold and

continue otherwise. Such threshold-type control policies are easy to implement and

are also useful for numerically solving the optimal stopping problem.

Before discussing optimal market entry policies, we define three functions and four

thresholds that facilitate the analysis of the optimal market entry policy. We first

define the one-step benefit function as

Mt(st, xt, it) ≡ −ci(it) + αE[Πt+1(st − wt(st), xt + kt(it))]− Πt(st, xt),

which indicates the myopic benefit of investing in learning activity it at period t. The

one-step benefit function assesses the benefit of investing in it by taking only one

period into account. Similarly, we define the benefit function as

Bt(st, xt, it) ≡ −ci(it) + αE[Vt+1(st − wt(st), xt + kt(it))]− Πt(st, xt),

which indicates the true benefit of investing in learning activity it at period t. We


also define the maximal benefit function as Bt(st, xt) ≡ Bt(st, xt, i∗t (st, xt)), which

indicates the benefit of continuing process design at period t.6 Then, a market entry

policy that stops process design if Bt(st, xt) ≤ 0 is optimal. The set of states at

which stopping process design is optimal is given as (st, xt) : Bt(st, xt) ≤ 0. We

denote the boundaries of this set by xt(st) ≡ infx : Bt(st, x) ≤ 0, xt(st) ≡ supx :

Bt(st, x) ≤ 0, st(xt) ≡ infs : Bt(s, xt) ≤ 0, and st(xt) ≡ sups : Bt(s, xt) ≤ 0,which are the optimal thresholds for the market entry policies that we derive shortly.

First, we establish the optimality of a knowledge-level-based lower threshold pol-

icy.

Theorem 4.3. Let t be the first period at which xt ≥ − 1γ

log( δ0(b−1)δ1

) holds for every

realization of xt. Then, (i) Bt(st, xt) is decreasing in xt, and (ii) a knowledge-level-

based lower threshold policy that stops process design if xt ≥ xt(st) is optimal for

every t ≥ t.

The manufacturer’s expected profit, Πt(st, xt), is increasing in xt with an increasing

return for small values of xt, and is increasing in xt with a decreasing return for large

values of xt.7 In other words, the manufacturer’s expected profit increases rapidly as

the knowledge level increases from a small value, but the benefit of additional knowl-

edge diminishes as the knowledge is accumulated. At periods t ≥ t, the knowledge

level is sufficiently high such that additional knowledge does not improve the manu-

facturer’s profit much. In contrast, investment cost and the market potential loss are

independent of the knowledge level. In this case, the benefit of continuing process

design, i.e., Bt(st, xt), decreases as the manufacturer’s knowledge level increases, and

thus the manufacturer should stop process design if the knowledge level exceeds a

certain threshold.

For periods earlier than period t, the structure of the optimal market entry policy

is generally unknown. However, the knowledge-level-based lower threshold policy is

likely to be optimal for those periods as well. If the knowledge level is low at early

6Note from (4.3) that arg maxit∈ItBt(st, xt, it) = arg maxit∈It

− ci(it)+αE[Vt+1(st−wt(st), xt+kt(it))] = i∗t (st, xt).

7Πt(st, xt) is proportional to 1/cp(xt)b−1, which is convex increasing in xt for xt < − 1γ log( δ0

(b−1)δ1)

and concavely increasing in xt for xt ≥ − 1γ log( δ0

(b−1)δ1)


periods, the manufacturer would want to continue process design for multiple periods

to sufficiently reduce the unit production cost. On the other hand, if the knowledge

level is high at early periods, the benefit of continuing process design decreases in xt

for the same reason as in periods t ≥ t. In such a case, the knowledge-level-based

lower threshold policy is optimal.

The optimal market entry policy may have a reversed structure if the manufacturer

does not have enough time to sufficiently improve the production process. If the

manufacturer’s knowledge level can never reach a certain point, the profit gain that

additional knowledge provides always increases as the knowledge level increases. In

this case, continuing process design is more beneficial when the knowledge level is

higher, and thus a knowledge-level-based upper threshold policy that stops process

design if xt ≤ xt(st) is optimal. In §4.5.1, we provide one extreme example under

which such a policy is optimal. However, this case can hardly arise in practice.

Manufacturing firms usually allocate enough time for process design to sufficiently

reduce the unit production cost.

We next establish the optimality of a market-potential-based upper threshold

policy.

Theorem 4.4. Suppose that the condition

dE[wt(st)]

dst≤ 1−min

it∈It

cp(xt)1−b

αE[cp(xt + kt(it))1−b](4.11)

holds for every t. Then, (i) Bt(st, xt) is increasing in st, and (ii) a market-potential-

based upper threshold policy that stops process design if st ≤ st(xt) is optimal for

every t.

The left-hand-side of (4.11) indicates the sensitivity of the market potential reduc-

tion in the current market potential, and the right-hand-side of (4.11) indicates the

minimum relative profit improvement achieved by additional knowledge. Hence, the

condition (4.11) holds, for example, when investing in learning activities sufficiently

improves the production process. The condition also holds when the history of the

market potential does not provide much information about future changes in the


market potential, i.e., when wt(st) is insensitive to st. We have mentioned above

that three factors affect the manufacturer’s market entry decision: investment cost,

the profit gain achieved by improved knowledge, and the profit loss incurred by de-

crease in the market potential. Among them, investment cost is independent of the

market potential, and the profit gain achieved by improved knowledge is proportion-

ally increasing in the current market potential. The profit loss incurred by market

potential reduction can either increase or decrease in the current market potential

depending on the sensitivity of wt(st) in st. When the change in the market po-

tential is sufficiently insensitive to the current market potential, i.e., when dE[wt(st)]dst

is sufficiently small, or when additional knowledge substantially improves profit, i.e.,

when 1−minit∈Itcp(xt)1−b

αE[cp(xt+kt(it))1−b]is large, the benefit of continuing process design in-

creases as the market potential increases, and thus the market-potential-based upper

threshold policy is optimal.

When the large value of the current market potential signals a big market potential

loss for the upcoming time period, a reversed market-potential-based threshold policy

is optimal.

Theorem 4.5. Suppose that the condition

dE[wt(st)]

dst≥ 1−max

it∈It

cp(xt)1−b

αE[cp(xt + kt(it))1−b](4.12)

holds for every t. Then, (i) Bt(st, xt) is increasing in st, and (ii) a market-potential-

based lower threshold policy that stops process design if st ≥ st(xt) is optimal for every

t.

If the expected market potential loss, E[wt(st)], increases in the current market po-

tential more rapidly than the relative profit improvement, 1 − cp(xt)1−b

αE[cp(xt+kt(it))1−b], the

benefit of investing in learning activities decreases as the current market potential

increases. In this case, the manufacturer should stop process design if the market

potential is larger than a certain threshold. In §4.5.1, we provide an example under

which the market-potential-based lower threshold policy is optimal, but the condition

(4.12) rarely holds in general.


Theorem 4.6. (a) If Bt(st, xt) is decreasing [resp., increasing] in xt and increasing

[resp., decreasing] in st, then xt(st) [resp., xt(st)] is increasing in st and st(xt) [resp.,

st(xt)] is increasing in xt.

(b) If Bt(st, xt) is increasing [resp., decreasing] in both xt and st, then xt(st) [resp.,

xt(st)] is decreasing in st and st(xt) [resp., st(xt)] is decreasing in xt.

Consider the case in which Bt(st, xt) is decreasing in xt and increasing in st in Theo-

rem 4.6(a). The increasing property of xt(st) implies that when the market potential

is larger, continuing process design is optimal for a larger region of knowledge levels.

Similarly, the increasing property of st(xt) implies that when the knowledge level is

higher, continuing process design is optimal for a smaller region of the market po-

tential. Such monotonicities of the optimal thresholds help us to better understand

how the optimal decision responds to the changes in the environment and are also

useful for numerically solving the problem. The other cases of Theorem 4.6 can also

be explained in a similar way.

4.5. Numerical Study

This section presents the result of our numerical studies. The purpose of this section

is three-fold. First, we illustrate some examples of the optimal market entry and

process improvement decisions. The examples further generate managerial insights

regarding the market entry and process improvement decisions. Second, we develop

two measures that quantify the value of the dynamic strategy. The two measures

respectively assess the profit improvement benefit and the risk reduction benefit of the

dynamic strategy. Third, by evaluating the two measures under various conditions,

we characterize industry characteristics under which the dynamic strategy is the most

effective.

For our numerical study, we model the dynamics of process improvement, mar-

ket potential changes, and demand uncertainty as follows. At each period t ∈1, 2, . . . , T of the process design stage, the manufacturer has two investment op-

tions for process improvement: regular learning, r, and expedited learning, e. Regular

learning incurs investment costs of ci(r) and increases the manufacturer’s knowledge


level by kt(r), which is uniformly distributed from k to k. Expedited learning incurs

investment costs of ci(e), and increases the knowledge level by kt(e), which is uni-

formly distributed from 2k to 2k, i.e., expedited learning doubles the learning rate of

regular learning at additional costs of ci(e) − ci(r). The market potential decreases

by wt at each period, where wt is normally distributed and independent of st. We

denote the coefficient of variation of wt by cvw. The demand uncertainty Ar is also

normally distributed, and we denote the coefficient of variation of Ar by cvA. We

assume As is deterministic in the numerical studies.

The base numerical setting for our studies is as follows: T = 4, α = 1, β = 1,

k = 3, k = 13, ci(r) = 0, ci(e) = 4, cvw = 0.2, E[Ar] = 1, cvA = 0.2, As = 0.1,

b = 1.5, δ0 = 1, δ1 = 10, and γ = 0.075. We illustrate the values of E[st] and E[wt]

in Figure 4.1. The investment cost for regular learning is sunk, i.e., ci(r) = 0.

Figure 4.1: Expected Values of Market Potential

50

100

150

200

250

1 2 3 4 5

Exp

ect

ed

Mar

ket

Po

ten

tial

(E[

st])

Time (t)

E[w1]

E[w2]

E[w3]

E[w4]

4.5.1 Optimal Market Entry and Process Improvement De-

cisions

Figure 4.2 illustrates the optimal market entry and process improvement decisions

for periods 2 and 3 under the base numerical setting. Because the base numerical

setting satisfies the sufficient condition of Theorem 4.4, the market-potential-based

upper threshold policy is optimal for both periods, i.e., stopping process design is


Figure 4.2: Optimal Investment and Stopping Policy of Base Numerical Setting

(a) Period 2

3

13

23

33

43

117 137 157 177

Kn

ow

led

ge L

eve

l (X

t)

Market Potential (St)

R1. Expedited Learning

R4: Stopping

R3: Expedited Learning

R2: Regular Learning

S2

(b) Period 3

6

16

26

36

46

56

73 123 173

Kn

ow

led

ge L

eve

l (X

t)


R1: Expedited Learning


R3: Stopping

optimal if the market potential is smaller than the optimal threshold. In the base

setting, a larger value of the current market potential does not imply a bigger loss in

the market potential, and thus the benefit of continuing process design increases as

the current market potential increases.

Although some possible realizations of x2 and x3 do not satisfy the condition xt ≥− 1γ

log( δ0(b−1)δ1

) in Theorem 4.3, Figure 4.2 shows that the knowledge-level-based lower

threshold policy is optimal for both periods, i.e., stopping process design is optimal if

the current knowledge level exceeds the optimal threshold. As we discussed above, the

knowledge-level-based market entry policy can have a reversed structure only when

the manufacturer does not have enough time to sufficiently improve the production

process. To verify this argument, we examine the optimal market entry decision

when the manufacturer has only one period to improve the production process, i.e.,

when T = 1, and report the result in Figure 4.3. The other numerical settings are

the same as in the base numerical setting except b = 2.5, ci(r) = 2, ci(e) = 8, and

E[w1] = 20. In this case, the manufacturer cannot increase his knowledge level to

the point above which benefit of additional knowledge diminishes as the knowledge

level increases, because he has only one period to improve the process. Except for in

such an extreme case, the knowledge-level-based lower threshold policy is optimal for


Figure 4.3: Knowledge-Level-Based Upper Threshold Policy

0

5

10

15

20

190 195 200 205 210

Kn

ow

led

ge L

eve

l (X

t)



R1: Stopping

every period. We have tested 18 variations of the base numerical setting8, and the

knowledge-level-based lower threshold turns out to be optimal for every period in all

tests.

When a large value of the current market potential signals a big market potential

loss for the upcoming time period, the market-potential-based market entry policy

can also have a reversed structure. We consider the case in which T = 1, b = 2,

w1(s1) = s1 − 120, and other settings are the same as in the base numerical setting.

In this case, the market potential of period 2 is always s2 = s1 − w1(s1) = 120

regardless of the market potential of period 1, which implies that the manufacturer

completely loses the benefit of a large market potential by delaying the market entry.

Hence, continuing process design is less beneficial when the market potential is larger.

In this case, the market-potential-based lower threshold policy is optimal as shown

in Figure 4.4.

We next discuss the optimal process improvement decisions in Figure 4.2. The

optimal investment policy at period 3 is to invest in expedited learning when the

knowledge level is low (Region 1) and invest in regular learning when the knowledge

level is high (Region 2). Intuitively, a low knowledge level may significantly delay the

market entry unless the manufacturer expedite the learning rate, and thus the man-

ufacturer should invest more capital in learning when the knowledge level is lower.

8We refer the reader to §4.5.3 for the details of the test settings.


Figure 4.4: Market-Potential-Based Lower Threshold Policy

0

5

10

15

20

190 195 200 205 210

Kn

ow

led

ge L

eve

l (X

t)



R2: Stopping

However, the optimal process improvement decision of period 2 shows a different

structure. In Figure 4.2(a), investing in expedited learning is optimal not only when

the knowledge level is very low (Region 1), but also when the knowledge level is rea-

sonably high (Region 3). To explain the reasoning behind this structure, we measure

the probability that the manufacturer stops process design at period 3 for different

states of period 2. For each (s2, x2) in the set Π ≡ (s2, x2) : s2 = 192, x2 ∈ [3, 27],9

we first calculate the probability distribution of (s3, x2) when the manufacturer fol-

lows the optimal process improvement decision at period 2. Using this probability

distribution, we next calculate Prob(B3(s3, x3) ≤ 0|s2, x2), which indicates the prob-

ability that the manufacturer stops process design at period 3 for each given (s2, x2).

Figure 4.5 illustrates this probability for every (s2, x2) ∈ Π as a function of x2. For

all states across Regions 1 and 2, the probability of stopping at period 3 is almost 0,

which means that the manufacturer will introduce the product either at period 4 or

5. In contrast, when the knowledge level falls in Region 3, the manufacturer intro-

duces the product at period 3 with a substantial probability. When the knowledge

level is sufficiently high, the manufacturer can effectively accelerate the market entry

timing by expediting the learning rate accordingly. This result shows the importance

of coordinating process improvement decisions with the market entry decision. By

dynamically adjusting the learning rate depending on states, the manufacturer can

9The set Π corresponds to the gray box at the bottom right corner of Figure 4.2(a).


Figure 4.5: Probability of Stopping at Period 3

0

0.2

0.4

0.6

0.8

1

3 8 13 18 23

Pro

bo

f St

op

pin

g at

Pe

rio

d 3

Knowledge Level (x2)

Expedited Learning

Regular Learning

successfully control the market entry timing.

4.5.2 Measuring the Value of the Dynamic Strategy

In this subsection, we propose two measures that assess the value of the dynamic

strategy. Compared to a static strategy under which the manufacturer determines

market entry timing and process improvement decisions upfront, the dynamic strat-

egy increases the manufacturer’s expected profit. In addition, the dynamic strategy

also reduces the variability of profit by enabling the manufacturer to respond to an

unexpectedly low knowledge level or unusual changes in market potential. The two

measures quantify these benefits respectively.

We first explicitly define the static strategy using our modeling framework. We

define period 0 as the period at which the manufacturer makes the market entry timing

and process improvement decisions. At this time, the manufacturer is uncertain

about the initial knowledge level x1 and the initial market potential s1. However,

the manufacturer has the information that the initial knowledge level x1 is uniformly

distributed from 0 to 23 and the initial market potential is normally distributed with

a mean value of 200 and the coefficient of variation of 0.5. The market entry timing

and process improvement decisions under the static strategy are state-independent,

i.e., the decisions are static. For example, the manufacturer may decide to invest in


regular learning at period 1 and enter the market at period 2 regardless of the states.

Among all such static decisions, the manufacturer chooses the optimal static decision

that maximizes his expected profit. Mathematically, this problem is identical to the

stochastic optimization problem (4.1) except that now the maximum is taken over all

static policies.

It is important to note that the optimal policy under the static strategy is an

admissible policy for the original problem (4.1). Hence, the dynamic strategy always

yields a higher expected profit than the static strategy. Let Wd be the profit of the

dynamic strategy, and Ws be the profit of the static strategy. Then, the percent-

age difference in the expected profit, I ≡ E[Wd]−E[Ws]E[Ws]

× 100%, measures the profit

improvement benefit of the dynamic strategy. By enabling the manufacturer to re-

spond to the realization of uncertain states, the dynamic strategy also reduces the

variability of profit. The percentage difference in the coefficient of variation of profit,

R ≡√V ar(Ws)/E[Ws]−

√V ar(Wd)/E[Wd]√

V ar(Ws)/E[Ws]×100%, measures this risk reduction benefit of the

dynamic strategy. The variability of profit is a widely used measure of risks involved

in managerial decisions (e.g., Martinez-de Albeniz and Simchi-Levi 2003)10.

Table 4.1 shows the expected profits, coefficient of variations of profits, I, and R

under the base numerical setting. We have computed V ar(Wd) and V ar(Ws) using

40, 000 independent samples of Wd and Ws. Under the base numerical setting, the

Table 4.1: Expected Value and Coefficient of Variation of Profits

E[Wd] E[Ws] I(%)

√V ar(Wd)

E[Wd]

√V ar(Ws)

E[Ws]R(%)

34.44 33.50 2.836 0.2061 0.2288 9.911

expected profit of the dynamic strategy is 2.836% larger than that of the static strat-

egy. In addition, the profit of the dynamic strategy has a 9.911% smaller coefficient

of variation than that of the static strategy. In other words, the dynamic strategy

yields a higher and less variable profit for the manufacturer.

10For fair comparison, we use the coefficient of variation of profit instead of the standard deviationof profit, because the dynamic strategy and the static strategy yield different expected profits.


4.5.3 Effectiveness of the Dynamic Strategy under Various

Industrial Conditions

We next evaluate the two measures, I and R, under various numerical settings. The

objective of this test to identify when the dynamic strategy becomes the most effective.

In particular, we examine the impact of the six factors illustrated in Table 4.2 on the

value of the dynamic strategy. For each factor, we first explain the test setting and

Table 4.2: Key Factors that Determine the Value of the Dynamic Strategy

Factors of Interest

Process Characteristicsuncertainty in learning

cost of expedited learningreducible unit production cost

Market Characteristicsuncertainty in market potential changes

demand uncertaintysize of the salvage market

then report I and R that we have computed via simulation.

Uncertainty in learning: We first evaluate the impact of the uncertainty in learn-

ing, i.e., the impact of the variability of kt(it). To change the variability of kt(it), we

change the difference k − k while holding the mean of k and k to be as in the base

numerical setting. The larger difference between k and k means a more variable out-

come of the learning activities. Because we hold the average of k and k to be constant,

the average process improvement rate remains constant for all test settings.

Figure 4.6 illustrates I and R for several values of k− k. As the degree of the un-

certainty in learning increases, both I and R increase rapidly. This result implies that

the dynamic strategy is effective when the outcome of process improvement activities

is highly uncertain. For example, the dynamic strategy is promising for industries

in which failures of manufacturing process development projects are pervasive. The

dynamic strategy enables manufacturing firms to effectively adjust the market entry

timing depending on the outcome of R&D activities for process improvement.


Figure 4.6: Impact of Uncertainties in Learning

4

6

8

10

12

1.5

2

2.5

3

3.5

6 8 10 12

Re

du

ctio

n in

C.V

. (R

)

Imp

rove

me

nts

in P

rofi

t (I

)

Width of the Support of kt

I

R

Reducible unit production cost: We next consider the impact of the reducible

unit production cost, δ1. Recall that the unit production cost, cp(xt), consists of the

irreducible cost, δ0, and the reducible cost, δ1e−γxt . As the manufacturer’s knowledge

level increases, the reducible production cost decreases, whereas the irreducible cost

remains constant regardless of the knowledge level. Hence, process improvement

decisions are more important when the reducible cost is larger. For this reason, the

dynamic strategy, which optimally controls process improvement decisions, becomes

more effective as δ1 becomes larger as shown in Figure 4.7. When the knowledge

Figure 4.7: Impact of Reducible Unit Production Cost

6

12

18

2

3

4

5

10 12 14 16

Re

du

ctio

n in

C.V

. (R

)

Imp

rove

me

nts

in P

rofi

t (I

)

Reducible Unit Cost (δ1)

I R

regarding the production process is the key determinant of the production cost such


as in the hard disk drive industry (Bohn and Terwiesch 1999), the dynamic strategy

can significantly improve the firm’s profit.

Cost of expedited learning: We next evaluate the impact of the cost of expedited

learning, ci(e). Recall from §4.5 that the regular learning cost is sunk, i.e., ci(r) = 0.

Hence, ci(e) indicates the additional cost that the manufacturer has to pay to expedite

the learning rate. In Figure 4.8, we illustrate the value of the dynamic strategy for

several values of ci(e). The dynamic strategy provides the flexibility of determining

Figure 4.8: Impact of the Cost of Expedited Learning

0

6

12

18

2

3

4

5

1 2 3 4

Re

du

ctio

n in

C.V

. (R

)

Imp

rove

me

nts

in P

rofi

t (I

)

Cost of Expedited Learning (c1i(e))

I R

the learning rate depending on the realization of uncertain states. However, when

expedited learning incurs a substantially higher cost than regular learning, i.e., when

ci(e) is large, the manufacturer does not often invest in expedited learning, and thus

the value of the flexible management is low. From this reason, the profit improvement

achieved by the dynamic strategy decreases as the cost of expedited learning increases.

In Figure 4.8, the variability reduction benefit shows a non-monotonic pattern as

the cost increases from 1 to 2. When ci(e) = 1, the cost difference between regular

learning and expedited learning is small, and hence the manufacturer who employs

the static strategy invests in expedited learning at period 1, whereas he invests in

regular learning at period 1 when ci(e) = 2, 3, 4. For this reason, when ci(e) = 1,

the manufacturer enters the market earlier than the other cases, which substantially

reduces the variability of profit. Such drastic changes in the optimal static policy


can yield a non-monotonic pattern of R as in Figure 4.8, but the profit improvement

benefit, I, always shows a monotonic pattern in all our numerical studies. Hence, we

focus on I when discussing the impact of environments on the value of the dynamic

strategy.

Uncertainty in market potential changes: We next examine the impact of the

uncertainty in market potential changes, i.e., the impact of the variability of wt. We

fix the expected value E[wt] as in the base setting and change the coefficient of vari-

ation of wt, i.e., cvw, for this test. Figure 4.9 illustrates I and R for several values

of cvw. As the coefficient of variation increases, the dynamic strategy becomes more

Figure 4.9: Impact of Uncertainties in Market Potential Change

8

9

10

11

2

2.5

3

3.5

0 0.1 0.2 0.3

Re

du

ctio

n in

C.V

. (R

)

Imp

rove

me

nts

in P

rofi

t (I

)

Coefficient of Variation of wt

I R

effective. The market potential changes are highly uncertain when, for example, com-

petitors’ strategies are difficult to anticipate or economic conditions are volatile. In

such cases, the market entry timing decision that has been made before the realiza-

tion of uncertain states may force the manufacturer to wait until he loses a great

amount of market potential. In contrast, the market entry decision that has been

made based on the realization of states enables the manufacturer to respond to such

events. Hence, the value of the dynamic strategy is greater when the changes in the

market potential are more uncertain.


Demand uncertainty: We next evaluate the impact of demand uncertainty, i.e.,

the impact of cvA. In Figure 4.10, the value of the dynamic strategy increases as the

Figure 4.10: Impact of Demand Uncertainty

6

9

12

2

2.5

3

3.5

0.1 0.2 0.3 0.4

Re

du

ctio

n in

C.V

. (R

)

Imp

rove

me

nts

in P

rofi

t (I

)

Coefficient of Variation of Ar

I R

degree of demand uncertainty increases. The expected profit that the manufacturer

can attain during the production and sales stage is smaller when the demand is more

uncertain. On the other hand, the optimal control of pre-introduction decisions -

market entry and process improvement decisions - is more critical when the expected

profit during the production and sales season is smaller. Hence, the value of the

dynamic strategy increases as the demand becomes more uncertain.

Size of the salvage market: Finally, we consider the impact of the size of the

salvage market, As. Figure 4.11 illustrates the value of the dynamic strategy for

several values of As. The manufacturer’s expected profit during the production and

sales stage increases as the size of the salvage market increases, which implies that

the size of the salvage market and the degree of demand uncertainty have opposite

effects on the value of the dynamic strategy. From this reason, both I and R decrease

as the size of the salvage market increases.


Figure 4.11: Impact of the Size of the Salvage Market

6

9

12

2

2.5

3

3.5

0 0.1 0.2 0.3

Re

du

ctio

n in

C.V

. (R

)

Imp

rove

me

nts

in P

rofi

t (I

)

Size of Salvage Market (As)

I R

4.6. Conclusion and Discussion

In this chapter, we have studied a manufacturer’s problem of dynamically optimiz-

ing the market entry timing and process improvement decisions for a new product.

In contrast to the existing studies on new product introduction decisions, we have

considered a dynamic strategy under which the manufacturer of the new product

makes the market entry decision depending on the realization of two major uncer-

tainties: competitors’ movements and the readiness of the production process. We

have developed a two-stage stochastic decision process for the manufacturer’s prob-

lem. The first-stage is an optimal stopping problem that determines the timing for

introducing a new product to the market, and process improvement decisions for the

new product. The second-stage is a production and pricing decision problem for the

new product. To solve this problem, we have formulated a two-stage dynamic pro-

gram from which we establish the optimality of several threshold-type market entry

policies and characterize optimal production and pricing decisions. Via numerical

studies, we have shown that the dynamic strategy can increase the manufacturer’s

profit while substantially reducing the variability of profit. By examining the value of

the dynamic strategy under various environments, we have shown that the dynamic

strategy is effective when (i) the outcome of learning activities involve a great amount

of uncertainties, (ii) the manufacturer can significantly reduce the production cost by


improving the production process, (iii) flexible management of learning activities is

possible, (iv) changes in the market potential are difficult to anticipate, (v) demand

is highly uncertain, and (vi) the size of salvage market is small.

We have studied the market entry decision from a single firm’s perspective. When

multiple firms employ the same dynamic market entry strategies, the equilibrium

outcome would be different from the conventional equilibrium outcome. Although

the dynamic market entry strategy may trigger a more severe competition among

competing firms, the impact of the dynamic strategy is not always negative for man-

ufacturing firms. For example, when a firm is aware of the competitors’ abilities to

respond to its market entry decision, the firm may not want to enter the market with

an ill-understood production process to simply beat the competition. Investigating

the equilibrium outcome when multiple manufacturing firms adopt dynamic market

entry strategies would be an interesting research problem.

Our study also highlights the importance of managing risks in new product in-

troduction. Although balancing the trade-off between the time-to-market and the

completeness of the production process has been the main subject of many studies,

the risk involved in new product introduction decisions has received little attention.

We have shown that the dynamic market entry strategy significantly reduces the vari-

ability of profit, i.e., the risk involved in new product introduction decisions, while

increasing profit. We hope that our research paves a new line of research that inves-

tigates new product introduction decisions that take risk into consideration.

Appendix A

Chapter 2 Appendices

A.1. Stochastic Monotonicities of the State Tran-

sition

In this section, we provide some results that can be used to prove stochastic mono-

tonicities of state transition models. We first provide a theorem.

Theorem A.1. Suppose that x(θ) = φ(θ, ξ), where φ is a deterministic function and

ξ is a random vector.

1. If φ is linear in θ for every realization of ξ, then x(θ) is stochastically convex

in θ.

2. If φ is increasing in θ for every realization of ξ, then x(θ) is stochastically

increasing in θ.

Proof. For the first part, suppose that for any realization ξ of ξ, φ(θ, ξ) is a deter-

ministic linear function of θ. Then, for any convex function u, u(φ(θ, ξ)) is convex

in θ, because the composition of a convex function and a linear function is convex.

Then, E[u(φ(θ, ξ))] is convex in θ, because taking expectation preserves the convex-

ity. Therefore, x(θ) is stochastically convex. Part 2 can be proved in a similar way.

Suppose for any realization ξ of ξ, φ(θ, ξ) is a deterministic increasing function of

108

APPENDIX A. CHAPTER 2 APPENDICES 109

θ. Then, for any increasing function of u, u(φ(θ, ξ)) is increasing in θ for each ξ.

Therefore, E[u(φ(θ, ξ))] is increasing in θ, which concludes the theorem.

This theorem can be used to prove stochastic monotonicities of a commonly used

set of state transition models listed below. With the exception of the first three, the

other parameterized random variables are from Shaked and Shanthikumar (2007).

Additional results can be found in Muller and Stoyan (2002).

1. xt+1(xt) = ξ + xt is stochastically convex and stochastically increasing.

2. xt+1(xt) = ξxt for a positive random variable ξ is stochastically convex and

stochastically increasing.

3. xt+1(xt) = xt + (1 − xt)ξ for xt ∈ [0, 1] and ξ ∈ [0, 1] is stochastically convex

and stochastically increasing.

4. Suppose xt+1,1(xt,1, xt,2) is a normal random variable with mean xt,1 and stan-

dard deviation xt,2. Then it is stochastically increasing in xt,1 and stochastically

convex in xt,2.

5. Suppose xt+1(xt) is a Poisson random variable with mean xt. Then it is stochas-

tically increasing in xt.

6. Suppose xt+1,1(xt,1, xt,2) is a binomial random variable with mean xt,1xt,2 and

variance xt,1xt,2(1− xt,2). Then it is stochastically increasing in xt.

7. Suppose xt+1(xt) is uniformly distributed on [0, xt]. Then it is stochastically

increasing in xt > 0.

8. Suppose xt+1(xt) is uniformly distributed on 0, 1, . . . , xt − 1. Then it is

stochastically increasing in xt > 0.

9. Let Y (m), m = 1, 2, . . ., be a sequence of non-negative i.i.d. random variables.

Then xt+1(xt) =∑xt

m=1 Y (m) is stochastically increasing in xt.


10. Let Yi(j) for i = 1, 2, . . . , d and j = 1, 2, . . ., be a sequence of non-negative i.i.d.

random variables. Then xt+1(xt) = (∑xt,1

j=1 Y1(j), . . . ,∑xt,d

j=1 Yd(j)) is stochasti-

cally increasing in xt.

A.2. Proofs

Proof of Proposition 2.2. The proof is based on an induction argument. At period

t = T − 1, we have BT−1(xT−1) = MT−1(xT−1), which is convex in xT−1. Next

assume for the induction argument that Bt+1(x) is convex in x. The composition

max0, Bt+1(x) is convex in x because function max0, x is convex increasing. The

stochastic convexity of the state transition implies that E[max0, Bt+1(xt+1(xt))] is

also convex in xt. Because convexity is preserved under addition, the benefit function

Bt(xt) = Mt(xt) + αE[max0, Bt+1(xt+1(xt))] is convex in xt, which concludes the

induction argument and hence the proof of Part 1. Part 2 can be proved in a similar

way to the proof of Proposition 2.1 Part 2.

Proof of Proposition 2.3. The proof is based on an induction argument. We consider

the increasing case. At period t = T − 1, we have BT−1(xT−1) = MT−1(xT−1), which

is increasing in xT−1,i. Next assume for the induction argument that Bt+1(xt+1)

is increasing in xt+1,i. The composition max0, Bt+1(xt+1) is also increasing in

xt+1,i. For a given xt,−i, define f(xt+1,i) ≡ E[max0, Bt+1(xt+1,i, xt+1,−i(xt,−i))].Because taking expectation preserves the increasing property, f(xt+1,i) is increas-

ing in xt+1,i. Then the stochastic increasing property of xt+1,i(xt) in xt,i implies that

E[max0, Bt+1(xt+1,i(xt), xt+1,−i(xt,−i))] = E[f(xt+1,i(xt)] is increasing in xt,i. Be-

cause increasing property is preserved under addition,

Bt(xt) = Mt(xt) + αE[max0, Bt+1(xt+1(xt))]

is also increasing in xt,i, which concludes the induction argument, hence the proof of

the first part.



t = T−1, we have BT−1(xT−1) = MT−1(xT−1), which is convex in xT−1,i. Next assume

for the induction argument that Bt+1(xt+1) is convex in xt+1,i. Then the composi-

tion max0, Bt+1(xt+1) is also convex in xt+1,i. For a given xt,−i, define f(xt+1,i) ≡E[max0, Bt+1(xt+1,i, xt+1,−i(xt,−i))]. Because taking expectation preserves convex-

ity, f(xt+1,i) is convex in xt+1,i. Then the stochastic convex property of xt+1,i(xt) in xt,i

implies that E[max0, Bt+1(xt+1(xt))] = E[max0, Bt+1(xt+1,i(xt), xt+1,−i(xt,−i))] =

E[f(xt+1,i(xt)] is convex in xt,i. Then, Bt(xt) = Mt(xt) +αE[max0, Bt+1(xt+1(xt))]is convex in xt,i, which concludes the induction argument, hence the proof of the first

part.

Proof of Proposition 2.5. The proof is based on an induction argument. At period t =

T − 1, we have BT−1(xT−1) = MT−1(xT−1), which is increasing in xT−1. Next assume

for the induction argument that Bt+1(xt+1) is increasing in each element of xt+1.

The composition of an increasing function with max0, x is also increasing, hence,

max0, Bt+1(xt+1) is increasing in xt+1. Because the state transition xt+1(xt) is

stochastically increasing in xt, E[max0, Bt+1(xt+1(xt))] is increasing in xt. Because

the multi-dimensional increasing property is preserved under addition, the benefit

function Bt(xt) = Mt(xt)+αE[max0, Bt+1(xt+1(xt))] is also an increasing function,

which concludes the induction argument and the proof of the first part.

Proof of Proposition 2.6. We define a set At(xt,−i) ≡ xt,i : Bt(xt,i, xt,−i) ≤ 0, xt ∈X. To prove Part 1, first consider the case when Bt(xt) is increasing in both xt,i and

xt,j for i 6= j. Then for given yt and zt such that zt,−(i,j) = yt,−(i,j) and yt,j ≥ zt,j, the

following inequality holds:

Bt(xt,i(yt,−i), zt,j, zt,−(i,j)) = Bt(xt,i(yt,−i), zt,j, yt,−(i,j))

≤ Bt(xt,i(yt,−i), yt,j, yt,−(i,j)) = 0.

The inequality stems from the fact that Bt(xt) is increasing in xt,j, and the last

equality stems from the definition of xt,i(yt,−i). Therefore, xt,i(yt,−i) ∈ At(zt,−i),

which in turn implies that xt,i(zt,−i) ≥ xt,i(yt,−i). This concludes the proof of Part 1

for the increasing case. The decreasing case can be proved in a similar way.


To prove Part 2, note that for a given yt and zt such that yt,−(i,j) = zt,−(i,j) and

yt,j ≤ zt,j, the following inequality holds:

Bt(xt,i(yt,−i), zt,j, zt,−(i,j)) = Bt(xt,i(yt,−i), zt,j, yt,−(i,j))

≤ Bt(xt,i(yt,−i), yt,j, yt,−(i,j)) = 0.

The inequality stems from the fact that Bt(xt) is decreasing in xt,j, and the equal-

ity stems from the definition of xt,i(yt,−i). Therefore, xt,i(yt,−i) ∈ At(zt,−i), hence,

xt,i(yt,−i) ≤ xt,i(zt,−i) ≡ supAt(zt,−i). The argument for the increasing property of

xt,j(xt,−j) in xt,i can be proved in a similar way.

To prove part 3, first consider the case when Bt(xt) is increasing in xt,i and con-

vex in xt,j. Then for a given yt and zt such that yt,−(i,j) = zt,−(i,j) and yt,i ≤ zt,i, the

following inequality holds: Bt(yt,i, xt,j(zt,−i), yt,−(i,j)) = Bt(yt,i, xt,j(zt,−i), zt,−(i,j)) ≤Bt(zt,i, xt,j(zt,−i), zt,−(i,j)) = 0. The inequality stems from the fact that Bt(xt) is in-

creasing in xt,i, and the last equality stems from the definition of xt,i(zt,−i). Therefore,

xt,i(zt,−i) ∈ At(yt,−i), hence, xt,i(zt,−i) ≤ xt,i(yt,−i). Similarly, Bt(yt,i, xt,j(zt,−i), yt,−(i,j)) =

Bt(yt,i, xt,j(zt,−i), zt,−(i,j)) ≤ Bt(zt,i, xt,j(zt,−i), zt,−(i,j)) = 0. Therefore, xt,i(zt,−i) ∈At(yt,−i), hence, xt,i(zt,−i) ≥ xt,i(yt,−i), which concludes Part 3.

Proof of Proposition 2.7. For the first part, we define At ≡ x ∈ X : Bt(x) ≤ 0.The decreasing property of Bt(x) in t implies that At ⊂ At+1 for every t. There-

fore, xt = supAt ≤ supAt+1 = xt+1, and xt = inf At ≥ inf At+1 = xt+1. For

the second part, we define At(xt,−i) ≡ xt,i : Bt(xt,i, xt,−i) ≤ 0, xt ∈ X. The de-

creasing property of Bt(x) in t implies that At(xt,−i) ⊂ At+1(xt,−i) for every t and

every xt,−i. Therefore, xt(xt,−i) = supAt(xt,−i) ≤ supAt+1(xt,−i) = xt+1(xt,−i), and

xt(xt,−i) = inf At(xt,−i) ≥ inf At+1(xt,−i) = xt+1(xt,−i).

Proof of Proposition 2.8. For the first part, we define two sets At ≡ x ∈ X : Bt(x) ≤0 and Ot ≡ x ∈ X : Mt(x) ≤ 0. By definition, Bt(x) ≥ Mt(x) for every x, which

implies that At ⊂ Ot. Therefore, xt = supAt ≤ supOt, and xt = inf At ≥ inf Ot. For

the second part, we define two sets At(xt,−i) ≡ xt,i : Bt(xt,i, xt,−i) ≤ 0, xt ∈ X and

Ot(xt,−i) ≡ xt,i : Mt(xt,i, xt,−i) ≤ 0, xt ∈ X. By definition, Bt(x) ≥Mt(x) for every


x, which implies that At(xt,−i) ⊂ Ot(xt,−i). Therefore, xt(xt,−i) = supAt(xt,−i) ≤supOt(xt,−i), and xt(xt,−i) = inf At(xt,−i) ≥ inf Ot(xt,−i).


t = T −1, we have BT−1(at, x) = MT−1(at, x). Let a∗t (x) = arg maxat∈At Bt(at, x). For

any x1, x2, and β ∈ (0, 1), we have BT−1(βx1 + (1− β)x2) = BT−1(a∗T−1(βx1 + (1−β)x2), βx1 +(1−β)x2) ≤ βBT−1(a∗T−1(βx1 +(1−β)x2), x1)+(1−β)BT−1(a∗T−1(βx1 +

(1 − β)x2), x2) ≤ βBT−1(a∗T−1(x1), x1) + (1 − β)BT−1(a∗T−1(x2), x2), where the first

inequality is from the convexity of BT−1(a, x), and the second inequality is by the def-

inition of a∗T−1(x). Next assume for the induction argument that the benefit function

Bt+1(xt+1) is convex in xt+1. The composition of a convex function and max0, x is

also convex, hence, max0, Bt+1(x) is a convex function. Because the state transi-

tion xt+1(at, xt) is stochastically convex in xt, E[max0, Bt+1(xt+1(at, xt))] is convex

in xt for each fixed at. Because the convexity is preserved under summation, the

benefit function Bt(at, xt) = Mt(at, xt) + αE[max0, Bt+1(at, xt+1(xt))] is convex in

xt for each at. By applying the same argument that we applied on BT−1(x), we con-

clude that Bt(xt) = supat∈At Bt(at, xt) is convex in xt, which concludes the induction

hypothesis and the proof of the Proposition.

Proof of Proposition 2.11. For Part 1, we first prove that Vt(x|T ) is increasing in

T for every t by an induction argument. Note that Vt(x|t + 1) = maxS(x), T (x) +

αE[Vt+1(x(x)|t+1)] ≥ S(x) = Vt(x|t). Assume for an induction argument Vt(x|T ) ≥Vt(x|T − 1). Because x(x) is time-homogeneous and the reward functions are time-

invariant, Vt(x|T ) = Vt+1(x|T + 1), which implies

Vt(x|T + 1) = maxS(x), T (x) + αE[Vt+1(x(x)|T + 1)]

= maxS(x), T (x) + αE[Vt(x(x)|T )]. (A.1)

Therefore, we have Vt(x|T+1) = maxS(x), T (x)+αE[Vt(x(x)|T )] ≥ maxS(x), T (x)+

αE[Vt(x(x)|T − 1)] = Vt(x|T ) for every t, which concludes the induction argument.

If V2(x|T ) is increasing in T , B1(x|T ) = αE[V2(x(x)|T )] + C(x) − S(x) is also

increasing in T for every x. Therefore, limT→∞B1(x|T ) is well-defined if we allow


it to take an infinite value. To conclude Part 1, we need to prove that the limit of

B1(x|T ) is indeed B∗(x). To do so, we first prove that

limT→∞

E[V2(x(x)|T )] = E[ limT→∞

V2(x(x)|T )] = E[V ∗(x(x))], (A.2)

under Assumption 2.1. For notational convenience, we denote V2(x(x)|T ) by YT and

V ∗(x(x)) by Y ∗. The increasing property of Vt(x|T ) in T and Assumption 2.1 im-

ply that YT ↑ Y ∗ almost surely. However, we cannot directly apply the monotone

convergence theorem (Durrett 1996 p. 464) to (A.2) because the monotone conver-

gence theorem is applicable to non-negative random variables. Hence, we instead

consider YT = (YT − Y2) + Y2. Because YT ≥ Y2 for every T ≥ 2, YT − Y2 ≥ 0

almost surely. In addition, Y2 = V2(x(x)|2) = S(x(x)) is integrable from the as-

sumption that E|S(xt)| < ∞. When (YT − Y2) ≥ 0 and Y2 is integrable, E[YT ] =

E[(YT −Y2) +Y2] = E[YT −Y2] +E[Y2] (Durrett 1996 p. 455). From the same reason,

E[Y ∗] = E[Y ∗ − Y2] + E[Y2]. Therefore, (A.2) can be derived as limT→∞E[YT ] =

limT→∞E[(YT−Y2)+Y2] = limT→∞E[YT−Y2]+E[Y2] = E[limT→∞(YT−Y2)]+E[Y2] =

E[Y ∗−Y2]+E[Y2] = E[Y ∗], where the third equality is by the monotone convergence

theorem. Finally, we have limT→∞B1(x|T ) = limT→∞ αE[V2(x(x)|T )]+C(x)−S(x) =

αE[V ∗(x(x))] + C(x)− S(x) = B∗(x), which concludes the proof of Part 1.

Next we consider Part 2. We prove x1|T ↓ x as T → ∞, then the other cases

can be proved in a similar way. From Part 1, B1(x1|T+1|T ) ≤ B1(x1|T+1|T + 1) = 0,

which implies x1|T ≥ x1|T+1 for every T . Similarly, B1(x|T ) ≤ B∗(x) = 0, which

implies x1|T ≥ x for every T . Therefore, limT→∞ x1|T ≥ x. Now suppose y ≡limT→∞ x1|T > x. This inequality implies that B∗(y) > 0. Because B1(y|T ) converges

to B∗(y) as T →∞, there exits W <∞ such that B1(y|W ) > 0, which implies that

y = limT→∞ x1|T > x1|W . This inequality contradicts the decreasing property of x1|T

in T . Therefore, limT→∞ x1|T = x, which concludes the proof.

Appendix B


B.1. Notation

We denote the supplier by (s), the manufacturer by (m), and the centralized decision

maker by (c).

Demand and ForecastXN+1 : demand during the sales periodej : random variable representing the impact of event j on demandEns,nm : set of events whose information is obtained by (s) at period ns and by (m)at period nmδns,nm =

∏j∈Ens,nm

ej : random variable representing the total information obtained

by (s) at period ns and by (m) at period nmXsn : (s)’s demand forecast

Xmn : (m)’s demand forecast

∆sn = Xs

n+1 −Xsn : difference between (s)’s subsequent forecasts

∆mn = Xm

n+1 −Xmn : difference between (m)’s subsequent forecasts

An = Xmn −Xs

n : difference between the (s) and (m)’s forecastsδsn = Xs

n+1/Xsn : random variable representing the ratio of (s)’s successive forecasts

δmn = Xmn+1/X

mn : random variable representing the ratio of (m)’s successive forecasts

εn =∏N

k=n δmk : random variable representing demand uncertainty at period n

ξn =∏N

k=n δsk/εn : random variable representing information asymmetry at period n

XN+1 = Xmn εn = Xs

nξnεnσZ : standard deviation of log(Z) of the log-normal random variable ZGn(.), gn(.) : c.d.f. and p.d.f. of εnFn(.), fn(.) : c.d.f. and p.d.f. of ξn

115

APPENDIX B. CHAPTER 3 APPENDICES 116

Cost Parametersr : per unit retail pricew : per unit wholesale pricec : per unit production costcn : per unit capacity cost at period nCn : fixed capacity cost at period nπm : (m)’s reservation profit(Optimal) Decision variablesun : (s)’s stopping decisionK(.), P (.) : menu of contracts(Kdc

n , Pdcn ) : optimal contract

[Ln, Un] : (s)’s optimal control-bandn∗ = arg minnπnKcsn : (c)’s optimal capacity level

[Lcsn , Ucsn ] : (c)’s optimal control-band

ncs ≡ arg minnπcsn

Profit FunctionsΠsn(K(ξ), P (ξ), ξ,Xs

n) : (s)’s expected profitΠmn (K(ξ), P (ξ), ξ,Xs

n) : (m)’s expected profitΠtotn (K(ξ), P (ξ), ξ,Xs

n) : total expected profitΠcsn (K,Xm

n ) : (c)’s expected profitOptimal Profit Functionsπn(Xs

n) : (s)’s optimal expected profitπn : (s)’s normalized expected profitVn(Xs

n) : (s)’s optimal value-to-go functionπcsn (Xm

n ) :(c)’s optimal expected profitπcsn : (c)’s normalized expected profitV csn (Xm

n ) : (c)’s optimal value-to-go function

B.2. Additive Case

In Chapter 3, we have used the multiplicative MMFE for the capacity planning prob-

lem. Although the multiplicative MMFE fits actual data better than the additive

MMFE (Hausman 1969, Heath and Jackson 1994), the additive model has also been

used in the literature. In this subsection, we first extend the additive MMFE to the

cases of multiple decision makers, and then investigate how the capacity planning

problem changes when we employ the additive MMFE.


B.2.1 The Additive MMFE

We develop the additive MMFE for multiple decision makers. We define the difference

between successive forecasts as δin ≡ X in+1−X i

n, for n < N and δiN ≡ XN+1−X iN . As

before, we assume that there are in total K events that affect demand, and let ej be

the random variable that models the impact of event j. Unlike in the multiplicative

case, now we assume that the change in the forecast due to each event is independent

of the current forecast. In other words, after obtaining the information of event j,

decision maker i updates his forecast from X in to X i

n+ej. Following this explanation,

we first express demand as XN+1 =∑K

j=1 ej. Next, we define Ens,nm as the set

of events whose information is obtained by the supplier during period ns and by the

manufacturer during period nm. Using this set, we define δns,nm ≡∑

j∈Ens,nmej, which

indicates the total demand information obtained by the supplier at period ns and by

the manufacturer at period nm. We assume that each δns,nm is normally distributed

has a mean value of 0 except δ0,01. When Ens,nm is an empty set, δns,nm = 0.

Given this construction, we can express demand as XN+1 =∑N

ns=0

∑Nnm=0 δns,nm .

The supplier’s information set at the beginning of period n is

F sn ≡ σ([δ0,0, . . . , δ0,N ], . . . , [δn−1,0, . . . , δn−1,N ]).

Then, the supplier’s demand forecast is Xsn = E[XN+1|F sn] =

∑n−1ns=0

∑Nnm=0 δns,nm ,

and the difference between successive forecasts is δsn =∑N

nm=0 δn,nm . Because the

sum of normal random variables is also a normal random variable, δsn is also normally

distributed. The manufacturer’s demand forecast can be expressed in a similar way.

Figure B.1 illustrates the information structure of the additive MMFE for two decision

makers. From this construction, we can fully characterize the evolution of Xsn and

Xmn by determining the value of δ0,0 and the variances of δns,nm . The two variants of

the MMFE are identical except that multiplication operators are replaced by addition

operators and log-normal random variables are replaced by normal random variables

1Both decision makers have the information δ0,0 before the beginning of the forecast horizon.Hence, δ0,0 is a deterministic value. Note also that when E[δns,nm

] 6= 0 for some (ns, nm), we canpush this information to δ0,0 and normalize δns,nm

by δns,nm− E[δns,nm

]. Hence, the assumptionE[δns,nm ] = 0 is without loss of generality.


Figure B.1: Information Structure of the additive MMFE

n 0 1 · · · N (s)0 δ0,0 + δ0,1 + · · · + δ0,N Xs

1

+ + + + +1 δ1,0 + δ1,1 + · · · + δ1,N δs1

+ + + + +...

... +... +

. . . +...

...+ + + + +

N δN,0 + δN,1 + · · · + δN,N δsN(m) Xm

1 + δm1 + . . . + δmN XN+1

in the additive case.

We can define the additive Martingale Model of Asymmetric Forecast Evolution

(a-MMAFE) in a similar way to the m-MMAFE. Because the supplier obtains no

information strictly earlier than the manufacturer, we have δns,nn = 0 for every nm >

ns. In this case, the manufacturer’s demand uncertainty at the beginning of period

n is given as εn ≡∑N

k=n δmk , and the manufacturer’s private information is given as

ξn ≡∑N

k=n δsk − εn =

∑Nk=n

∑n−1nm=0 δk,nm . Finally, demand and forecasts have the

following relation: XN+1 = Xmn + εn = Xs

n + ξn + εn.

B.2.2 Determining the Optimal Time to Offer an Optimal

Mechanism

We next revisit the capacity planning problem with the a-MMAFE. The problem set-

ting is the same as before except that now demand forecasts follow an a-MMAFE. We

first consider the second-stage mechanism design problem. Suppose that the supplier

has decided to offer a screening contract at period n. Then, the supplier has to deter-

mine the optimal screening contract given demand forecast Xsn, demand uncertainty

εn, and the manufacturer’s private information ξn. When demand forecasts follow an

a-MMAFE, the random variables εn and ξn are normally distributed.

As before, to determine the optimal screening contract, the supplier solves (3.3).

However, because the demand and the forecast have the relationship; XN+1 = Xsn +


ξn + εn, now the supplier’s expected profit is

Πsn(K(ξ), P (ξ), ξn, X

sn) ≡ (w−c)Eεn [min(Xs

n+ξn+εn, K(ξ))]+P (ξ)− (cnK(ξ)+Cn),

and the manufacturer’s expected profit is

Πmn (K(ξ), P (ξ), ξn, X

sn) ≡ (r − w)Eεn [min(Xs

n + ξn + εn, K(ξ))]− P (ξ).

Ozer and Wei (2006) solve this problem in a static setting, but they provide structural

properties of the optimal contract when ξn has an increasing probability density

function. Because ξn does not have an increasing probability density in our model,

we extend their result to the class of random variables ξn that have increasing failure

rates (IFR).2

To solve the supplier’s problem, we first introduce an equivalent formulation of

(3.3), and then define a normalized problem. From Lemma 1 of Ozer and Wei (2006),

the optimization problem (3.3) has the following equivalent formulation:

πn(Xsn) ≡ max

K(.)Eξn

[(r − c)E[min(Xs

n + ξn + εn, K(ξn))]− (cnK(ξn) + Cn)

−1− Fn(ξn)

fn(ξn)(r − w)Gn(K(ξn)− ξn −Xs

n)]− πm (B.1)

s.t. K(ξ) is increasing.

After determining the optimal capacity reservation function Kdcn (.) from (B.1), we

derive the corresponding payment function as

P dcn (ξ) = (r − w)Eεn [min(Xs

n + ξ + εn, Kdcn (ξ))]

−∫ ξ

ξn

(r − w)Gn(Kdcn (x)− x−Xs

n)dx− πm. (B.2)

2The failure rate of a random variable is defined as f(x)1−F (x) , where f(x) and F (x) are the p.d.f.

and the c.d.f. of the random variable. Normal random variables have IFRs.


Next, we define the normalized version of (B.1) as follows:

πsn ≡ maxK(.)

Eξn

[(r − c)Eεn [min(ξn + εn, K(ξn))]− cnK(ξn) (B.3)

−1− Fn(ξn)

fn(ξn)(r − w)Gn(K(ξn)− ξn)

]s.t. K(ξ) is increasing.

We denote the optimal solution of (B.3) by Kdcn (.), and then we can derive the nor-

malized payment function as

P dcn (ξ) = (r − w)Eεn [min(ξn + εn, K

dcn (ξ))]−

∫ ξ

ξn

(r − w)Gn(Kdcn (x)− x)dx− πm.

Based on these functions, we derive the following structural properties of the

optimal contract and the expected profit:

Theorem B.1. The following statements are true for all n:

(a) Kdcn (ξ) = Xs

n + Kdcn (ξ).

(b) P dcn (ξ) = Xs

n(r − w) + P dcn (ξ)− πm.

(c) πn(Xsn) = Xs

n(r − c− cn) + πsn − Cn − πm.

Theorem 3.4 and Theorem B.1 show the major difference between the multiplicative

and the additive models. When demand uncertainty is additive to demand forecast,

the impact of εn and ξn are independent of the forecast level, Xsn. Hence, the nor-

malized capacity reservation and prices, (Kdcn , P

dcn ), are additive to Xs

n. Similarly, the

normalized expected profit πsn is also additive to Xsn.

Next, we discuss how to solve the optimization problem (B.3). The solution

approach is the same as before. We first relax the constraint that K(.) is increasing,

and then verify that the optimal solution of the unconstraint problem is increasing

in ξ.

Theorem B.2. If εn and ξn have IFRs, then the following properties hold:


(a) Kdcn (ξ) is the unique solution of the first-order condition

(r − c)(1−Gn(K − ξ))− cn −1− Fn(ξ)

fn(ξ)(r − w)gn(K − ξ) = 0.

(b) Both Kdcn (ξ) and P dc

n (ξ) are increasing in ξ.

(c) Both Πsn(Kdc


n) and Πmn (Kdc


n) are increasing in

ξ.

(d) P dcn (Kdc

n ) is an increasing concave function of Kdcn .

The first-order condition in Part (a) is slightly different from that of Theorem 3.5, but

other structural properties of the optimal contract remain the same in the additive

case.

We next consider the first-stage optimal stopping problem. The formulation of

the first-stage problem is also the same as before except that now the state variable

is updated Xsn+1 = δsn + Xs

n, and the reward of stopping at period n is πn(Xsn) =

Xsn(r − c − cn) + πsn − Cn − πm. The following theorem characterizes the optimal

stopping policy:

Theorem B.3. The following statements are true for all n:

(a) A control band policy that offers a capacity reservation contract at period n if

Xsn ∈ [Ln, Un], is optimal.

(b) When cn+1 > cn for all n, the upper threshold, Un, is ∞ for all n. Hence, a

lower threshold policy that offers the capacity reservation contract at period n if

Xsn ≥ Ln, is optimal.

(c) Let n∗ ≡ arg maxnπn − Cn. When cn+1 = cn for all n, the lower and upper

thresholds satisfy that Ln = Un = 0 for n 6= n∗, and that Ln = 0 and Un =∞ for

n = n∗. Hence, a state-independent policy that offers the capacity reservation

contract at period n∗ is optimal.


For both the additive and multiplicative cases, a control band policy is optimal.

However, when both cn and Cn are strictly increasing in n, the optimal thresholds of

the two models have different limits: the multiplicative model satisfies Ln = 0 and

the additive model satisfies Un =∞. When cn is increasing in n, the supplier incurs

more unit capacity costs by delaying the capacity decision, and the additional costs

are proportionally increasing in the forecast level. In contrast, the impact of εn and

ξn on profit is independent of the forecast level. Hence, delaying the capacity decision

becomes less beneficial as the demand forecast increases, which implies the optimality

of the lower threshold policy. Although both variants of the MMFE have been widely

used in the literature, the multiplicative model is superior over the additive model in

terms of consistency with empirical data (Hausman 1969, Heath and Jackson 1994).

Hence, the multiplicative MMFE would be more appropriate to use for the capacity

planning problem.

B.3. Proofs

Proof of Theorem 3.1. For Part (a), we first prove that Xn is integrable. This prop-

erty holds directly from the square-integrability ofXN+1. Then, we have E[X in+1|F in] =

E[E[XN+1|F in+1]|F in] = E[XN+1|F in] = X in, where the second equality is from the

tower property of conditional expectation. Therefore, Part (a) is true. For Part

(b), first note that σ(X in) ⊆ F in. Then, again from the tower property, we have

E[XN+1|X in] = E[E[XN+1|F i

n]|X in] = E[X i

n|X in] = X i

n. For Part (c), note that

σ(X in) ⊆ F in+l for every l ≥ 0. Then, from the tower property, we have E[X i

n+l|X in] =

E[E[XN+1|F in+l]|X in] = E[XN+1|X i

n] = X in. Finally, for Part (d), we have E[∆i

n] =

E[X in+1−X i

n] = E[X in+1]−E[X i

n] = E[E[XN+1|F in+1]]−E[E[XN+1|F in]] = E[XN+1]−E[XN+1] = 0. In addition, for any random variable Y that is measurable in F in,

we have E[∆ilY ] = E[E[∆i

lY |F in]] = E[Y E[∆il|F in]] = E[Y E[X i

l+1 − X il |F in]] =

E[Y (X in − X i

n)] = 0. Therefore, ∆il is uncorrelated with F in for every l ≥ n, which

concludes the proof.

Proof of Theorem 3.2. By definition, we have F in ⊆ F cf

n for every i ∈ s,m and n.


Then, we have the following inequality:

E[(XN+1 −X in)2|F cfn ] = E[(XN+1)2|F cfn ]− 2E[XN+1X

in|F cfn ] + E[(X i

n)2|F cfn ]

= E[(XN+1)2|F cfn ]− 2X inE[XN+1|F cfn ] + (X i

n)2

= E[(XN+1)2|F cfn ]− 2X inX

cfn + (X i

n)2

≥ E[(XN+1)2|F cfn ]− (Xcfn )2 = E[(XN+1 −Xcf

n )2|F cfn ],

where the second equality is from the fact that X in is measurable on F cfn , the inequality

is from (Xcfn )2−2X i

nXcfn +(X i

n)2 ≥ 0, and the last equality is from E[XN+1Xcfn |F cfn ] =

Xcfn E[XN+1|F cfn ] = (Xcf

n )2. By taking expectation on both sides, we have E[(XN+1−X in)2] ≥ E[(XN+1 −Xcf

n )2], which concludes the proof.

Proof of Theorem 3.3. For Part (a), we first note that σ(Xsn) ⊆ F sn ⊆ Fmn+l for every

l ≥ 0. Then, from the tower property, we have E[Xmn+l|Xs

n] = E[E[XN+1|Fmn+l]|Xsn] =

E[XN+1|Xsn] = Xs

n. For Part (b), first note that Xmn = Xs

n+An ∈ σ(Xsn, An). Because

both Xmn and Xs

n are measurable in Fmn , and we also have σ(Xsn, An) ⊆ Fmn . Then, we

have E[XN+1|Xsn, An] = E[E[XN+1|Fm

n ]|Xsn, An] = E[Xm

n |Xsn, An] = Xm

n . For Part

(c), we need to note that (F s1 , ..., F

sn, F

mn ) forms a filtration. Then, Part (c) follows

Part (d) of Theorem 3.1.

Proof of Lemma 3.1. To prove Part (a), we first show that we can replace PC with

PC′ : Πmn (K(ξ

n), P (ξ

n), ξ

n, Xs

n) = πm

under IC. For any ξ1 < ξ2, we have

Πmn (K(ξ1), P (ξ1), ξ1, Xs

n) ≤ Πmn (K(ξ1), P (ξ1), ξ2, Xs

n)

≤ Πmn (K(ξ2), P (ξ2), ξ2, Xs

n),

where the first inequality is from the fact that the profit increases with ξn for a fixed

(K,P ) and the second inequality is from IC. Therefore, PC for ξ > ξn

is redundant

once it is satisfied for ξn. In addition, the supplier can increase P (.) uniformly with-

out breaking IC until Πmn (K(ξ

n), P (ξ

n), ξ

n, Xs

n) becomes πm. Therefore, (IC,PC′) is


equivalent to (IC,PC) for this optimization problem.

Next, we show that (IC,PC′) implies the two conditions of Part (a) for (3.3). IC

implies that maxξΠmn (K(ξ), P (ξ), ξ,Xs

n) = Πmn (K(ξ), P (ξ), ξ,Xs

n). Hence, by apply-

ing the envelope theorem on Πmn (K(ξ), P (ξ), ξ,Xs

n), we can derive

dΠmn (K(ξ), P (ξ), ξ,Xs

n)

dξ=∂Πm

n (K(ξ), P (ξ), ξ,Xsn)

∂ξ

∣∣∣∣ξ=ξ

= (r − w)Xsn

∫ K(ξ)ξXsn

εn

ygn(y)dy.

By integrating both sides from ξn

to ξ with the boundary condition of PC′, we can

derive Condition (i).

Next we prove that (IC,PC′) implies Condition (ii). First, note that we have the

following equation for every ξ and ξ:

Πmn (K(ξ), P (ξ), ξ,Xs

n) =

∫ ξ

ξn

∂Πmn (K(ξ), P (ξ), x,Xs

n)

∂xdx+ Πm

n (K(ξ), P (ξ), ξn, Xs

n)

= Πmn (K(ξ), P (ξ), ξ, Xs

n) +

∫ ξ

ξ

[(r − w)Xs

n

∫ K(ξ)xXsn

εn

ygn(y)dy

]dx

= Πmn (K(ξ), P (ξ), ξ,Xs

n)−∫ ξ

ξ

[(r − w)Xs

n

∫ K(x)xXsn

εn

ygn(y)dy

]dx

+

∫ ξ

ξ

[(r − w)Xs

n

∫ K(ξ)xXsn

εn

ygn(y)dy

]dx


n)

+

∫ ξ

ξ

[(r − w)Xs

n

(∫ K(ξ)xXsn

εn

ygn(y)dy −∫ K(x)

xXsn

εn

ygn(y)dy

)]dx


n) +

∫ ξ

ξ

[(r − w)Xs

n

(−∫ K(x)

xXsn

K(ξ)xXsn

ygn(y)dy

)]dx.

Then, IC implies that∫ ξξ

[(r − w)Xs

n

(−∫ K(x)xXsnK(ξ)xXsn

ygn(y)dy

)]dx ≤ 0 for every ξ < ξ,

which in turn implies that∫ K(ξ)ξXsnK(ξ)ξXsn

ygn(y)dy ≥ 0 for every ξ < ξ. Because ygn(y) ≥ 0,

this inequality holds only if K(.) is increasing. Therefore, (IC,PC′) implies the two


conditions of Part (a).

For the converse, we prove that the two conditions of Part (a) imply (IC,PC′).

First note that Condition (i) with ξ = ξn

directly implies PC′. For IC, we recall the

equation

Πmn (K(ξ), P (ξ), ξ,Xs

n)− Πmn (K(ξ), P (ξ), ξ,Xs

n)

=

∫ ξ

ξ

[(r − w)Xs

n

(∫ K(ξ)xXsn

εn


xXsn

εn

ygn(y)dy

)]dx (B.4)

= −∫ ξ

ξ

[(r − w)Xs

n

(∫ K(ξ)xXsn

εn


xXsn

εn

ygn(y)dy

)]dx. (B.5)

If ξ > ξ, the integrand of (B.4) is non-positive, because Condition (ii) implies that

K(.) is increasing. Similarly, if ξ < ξ, the integrand of (B.5) is non-negative, which

implies that Πmn (K(ξ), P (ξ), ξ,Xs

n) ≤ Πmn (K(ξ), P (ξ), ξ,Xs

n) for every ξ and ξ. There-

fore, (IC,PC) and the two conditions of Part (a) are equivalent for the optimization

problem (3.3).

To prove Part (b), we first derive Eξn [Πsn(K(ξn), P (ξn), ξn, X

sn)] as

Eξn

[Πtotn (K(ξn), P (ξn), ξn, X

sn)−

∫ ξn

ξn

[(r − w)Xs

n

∫ K(x)xXsn

εn

ygn(y)dy

]dx

]− πm.

Finally, we can derive (3.4) by applying integration by parts and Condition (ii) to

this equation, which concludes the proof.

Proof of Theorem 3.4. We define K(.) = K(.)/Xsn. By construction, K(.) is increas-

ing if and only if K(.) is increasing. Using K(.), we can derive the objective function

of (3.4) as

Xsn

[(r − c)E[min(ξnεn, K(ξn))]− cnK(ξn)− 1− Fn(ξn)

fn(ξn)(r − w)

∫ K(ξn)ξn

εn

ygn(y)dy

]−Cn − πm.

Because the term inside of [.] is the objective function of (3.6), Part (a) and Part (c)


hold. Finally, by applying Part (a) to (3.5), we can prove Part (b).

Proof of Theorem 3.5. For Part (a), we first define

H(K, ξ) ≡ (r − c)E[min(εnξ,K)]− cnK −1− Fn(ξ)

fn(ξ)(r − w)

∫ K(ξ)ξ

εn

ygn(y)dy.

Without the constraint that K(.) is increasing, the objective function (3.6) can be

maximized with the values of K that maximizes H(K, ξ) for each ξ. If the maximizer

of the unconstraint problem is increasing, it is indeed the optimal solution for (3.6).

We prove that this approach works when εn and ξn have IGFRs. We first prove that

H(K, ξ) is quasi-concave in K and have a finite maximizer. The first-order derivative

is

∂H(K, ξ)

∂K= (r − c)(1−Gn(

K

ξ))− cn −

1− Fn(ξ)

ξfn(ξ)(r − w)

K

ξgn(

K

ξ)

= (1−Gn(K

ξ))

(r − c− 1− Fn(ξ)

ξfn(ξ)(r − w)

Kξgn(K

ξ)

1−Gn(Kξ

)

)− cn.

Then, the second order derivative at the point in which ∂H(K,ξ)∂K

= 0 can be derived as

∂2H(K, ξ)

∂K2

∣∣∣∣∂H(K,ξ)∂K

=0

= −1

ξgn(

K

ξ)

(cn

1−Gn(Kξ

)

)

+ (1−Gn(K

ξ))

(−1− Fn(ξ)

ξfn(ξ)(r − w)

d

dK

(Kξgn(K

ξ)

1−Gn(Kξ

)

))< 0.

The inequality is from the fact thatKξgn(K

ξ)

1−Gn(Kξ

)is increasing in K due to the IGFR

assumption. The inequality is strict because gn(Kξ

)

(cn

1−Gn(Kξ

)

)is strictly positive. In

other words, H(K, ξ) is quasi-concave and the slope of the fuction is strictly negative

at the point it crosses 0. Finally, we have ∂H(K,ξ)∂K|K<ξ

n= r − c − cn > 0, and

limK→∞∂H(K,ξ)∂K

= −cn < 0. Therefore, there exists a finite solution of ∂H(K,ξ)∂K

= 0.

In addition, because ∂H(K,ξ)∂K

is strictly decreasing at ∂H(K,ξ)∂K

= 0, there is only one

solution that satisfies the first-order condition.


Next we prove that the function K(.) that satisfies the first-order conditions is

increasing in ξ. Note that

∂2H(K, ξ)

∂K∂ξ

∣∣∣∣∂H(K,ξ)∂K

=0

=1

ξ2gn(

K

ξ)

(cn

1−Gn(Kξ

)

)

+ (1−Gn(K

ξ))

(− d

dξ

[1− Fn(ξ)

ξfn(ξ)

](r − w)

(Kξgn(K

ξ)

1−Gn(Kξ

)

))

+ (1−Gn(K

ξ))

(−1− Fn(ξ)

ξfn(ξ)(r − w)

d

dξ

(Kξgn(K

ξ)

1−Gn(Kξ

)

))> 0,

where the inequality is from the fact that both 1−Fn(ξ)ξfn(ξ)

andKξgn(K

ξ)

1−Gn(Kξ

)are decreasing in

ξ due to the IGFR assumption. Therefore, Kdcn (ξ) is increasing in ξ, which concludes

Part (a).

We already proved the increasing property of Kdcn (.) of Part (b) in the proof of

Part (a). The increasing property of P dcn (.) is from

dP dcn (ξ)

dξ= (r − w)(1−Gn(

Kdcn (ξ)

ξ))dKdc

n (ξ)

dξ> 0,

which concludes Part (b).

The increasing property of Πmn (Kdc


n) in Part (c) stems directly

from Part (a) of Lemma 3.1. Note that the supplier’s expected profit is

Πsn(Kdc

n (ξ), P dcn (ξ), ξ, 1) = (r − c)E[min(εnξ,K

dcn (ξ))]− cnKdc

n (ξ)

−∫ ξ

ξn

(r − w)

∫ Kdcn (x)

x

εn

ygn(y)dy

dx− πm,


from which we can derive

dΠsn(Kdc

n (ξ), P dcn (ξ), ξ, 1)

dξ

= (w − c)∫ Kdcn (ξ)

ξ

εn

ygn(y)dy +dKdc

n (ξ)

dξ

((r − c)(1−Gn(

Kdcn (ξ)

ξ))− cn

)

= (w − c)∫ Kdcn (ξ)

ξ

εn

ygn(y)dy +dKdc

n (ξ)

dξ

1− Fn(ξ)

ξfn(ξ)(r − w)

Kdcn (ξ)

ξgn(

Kdcn (ξ)

ξ) > 0,

where the second equality is from Part (a). This inequality concludes Part (c).

Finally, for Part (d), we first derive

dP dcn

dKdcn

=dP dc

n /dξ

dKdcn /dξ

= (r − w)(1−Gn(Kdcn (ξ)

ξ)).

Next, we prove that ddξ

(Kdcn (ξ)ξ

)≥ 0. For notational simplicity, we define A(ξ) =

Kdcn (ξ)ξ

. From Part (a), we have

(1−Gn(A(ξ)))

(r − c− 1− Fn(ξ)

ξfn(ξ)(r − w)

A(ξ)gn(A(ξ))

1−Gn(A(ξ))

)− cn = 0.

By taking derivative on this equation, we can derive

dA(ξ)

dξ

− gn(A(ξ))

(cn

1−Gn(A(ξ))

)+ (1−Gn(A(ξ)))

(−1− Fn(ξ)

ξfn(ξ)(r − w)

[A(ξ)gn(A(ξ))

1−Gn(A(ξ))

]′)= (1−Gn(A(ξ)))

d

dξ

(1− Fn(ξ)

ξfn(ξ)

)(r − w)

A(ξ)gn(A(ξ))

1−Gn(A(ξ)).

Because the term inside . and the R.H.S. are negative, we have dA(ξ)dξ

> 0. Therefore,

we have ddξ

(dP dcndKdc

n

)= −(r − w)gn(K

dcn (ξ)ξ

) ddξ

(A(ξ)) ≤ 0. Finally, we have

d2P dcn

(dKdcn )2

=d( dP

dcn

dKdcn

)/dξ

dKdcn /dξ

≤ 0,


which concludes the proof of theorem.

Proof of Theorem 3.6. To determine the structure of the optimal stopping policy, we

use the two-step method that we have proposed in Chapter 2. As before, we define

Mn(Xsn) ≡ E[πn+1(Xs

n+1)|Xsn]− πn(Xs

n),

and and

Bn(Xsn) ≡ E[Vn+1(Xs

n+1)|Xsn]− πn(Xs

n).

Then, Part (c) of Theorem 3.4 implies that

Mn(Xsn) = E[Xs

n+1πn+1 − Cn+1 − πm|Xsn]− (Xs

nπn − Cn − πm) (B.6)

= Xsn(πn+1 − πn)− (Cn+1 − Cn),

which is a linear function of Xsn. Because every linear function is convex, and the

state transition is stochastically convex, a control-band policy is optimal from Propo-

sition 2.2.

Proof of Theorem 3.7. We first prove Part (a). From the proof of Theorem 3.6,

Bn(Xsn) is convex inXs

n. We will prove that Bn(0) ≤ 0 if Cn < Cn+1 for every n. When

Xsn = 0, Xs

l = 0 almost surely for every l ≥ n. Therefore, Vn(0) = −Cn − πm, which

implies that Bn(0) = −(Cn+1 − Cn) ≤ 0. If a convex function satisfies Bn(0) ≤ 0,

then Bn(Xsn) can cross 0 at most once from below to above in (0,∞). Therefore, the

lower threshold, Ln, is 0, and the upper threshold policy is optimal.

For Part (b), we first define ηn ≡ maxm>n πm. We prove by induction that

Bn(Xsn) = (ηn − πn)Xs

n for all n. For period n = N − 1, we have

Bn(Xsn) = Mn(Xs

n) = Xsn(πn+1 − πn)− (Cn+1 − Cn) = (πn+1 − πn)Xs

n,

where ηn = πn+1 by definition. Next assume for an induction argument thatBn+1(X) =


(ηn+1 − πn+1)Xsn. If ηn+1 ≥ πn+1, then ηn = ηn+1 and

Bn(Xsn) = E[max0, Bn+1(Xs

n+1)|Xsn] +Mn(Xs

n)

= (ηn+1 − πn+1)Xsn +Mn(Xs

n) = (ηn+1 − πn+1)Xsn + (πn+1 − πn)Xs

n

= (ηn+1 − πn)Xsn = (ηn − πn)Xs

n.

In contrast, if ηn+1 < πn+1, then ηn = πn+1 and


n+1)|Xsn] +Mn(Xs

n)

= 0 +Mn(Xsn) = (πn+1 − πn)Xs

n = (ηn − πn)Xsn,

which concludes the induction argument.

We next prove that the optimal policy always stops at period n∗. For n < n∗,

ηn = πn∗ , hence ηn > πn by the definition of n∗. In this case, Bn(Xsn) ≥ 0 for all Xs

n,

and it is always optimal to continue the process. For n = n∗, we have ηn∗ ≤ πn, which

implies that Bn(Xsn) ≤ 0. Hence, the optimal policy always stops at period n∗.

Proof of Theorem 3.8. For Parts (a) to (c), we can apply exactly the same method

as in the proofs of Theorems 3.6 and 3.7. In the problem formulation, the only two

changes are the replacement of πn by πcsn and the replacement of Xsn by Xm

n . Because

the proofs of Theorems 3.6 and 3.7 do not depend on the values of πn, the replacement

of πn do not change the proof. In addition, the only required property of Xsn for the

proofs is the Martingale property, which Xmn also satisfies. Therefore, Parts (a) to

(c) stem directly from the two theorems.

For Part (d), recall from the proof of Theorem 3.5 thatKdcn (ξ) maximizesH(K, ξ) ≡

(r − c)E[min(εnξ,K)] − cnK − 1−Fn(ξ)fn(ξ)

(r − w)∫ K(ξ)

ξ

εnygn(y)dy. The optimal Kcs

n sat-

isfies the following first-order condition: (r − c)(1 − Gn(Kξ

)) − cn = 0. Then, we

have ∂H(K,ξ)∂K|K=Kcs

n (ξ) = −1−Fn(ξ)ξfn(ξ)

(r − w)Kξgn(K

ξ) < 0, which implies that Kdc

n (ξ) ≤Kcsn (ξ).

Proof of Theorem 3.9. We first prove the optimality of the contract wdcn (ξ) = r,

Kdcn (ξ) = Xs

nξG−1n ( r−c−cn

r−c ), and P dcn (ξ) = −πm by showing that the supplier cannot


attain a higher expected profit than the expected profit from the proposed contract.

The expected profits of the supplier, manufacturer and the total supply chain under

the contract K(ξ), P (ξ), w(ξ) are

Πsn(K(ξ), P (ξ), w(ξ), ξn, X

sn) ≡ (w(ξ)− c)Eεn [min(Xs

nξnεn, K(ξ))] + P (ξ)

−(cnK(ξ) + Cn),

Πmn (K(ξ), P (ξ), w(ξ), ξn, X

sn) ≡ (r − w(ξ))Eεn [min(Xs

nξnεn, K(ξ))]− P (ξ),

Πtotn (K(ξ), ξn, X

sn) ≡ (r − c)Eεn [min(Xs

nξnεn, K(ξ))]− (cnK(ξ) + Cn),

where (K(ξ), P (ξ), w(ξ)) is the manufacturer’s choice of the contract.

Note that Πmn (K(ξ), P (ξ), w(ξ), ξn, X

sn) ≥ πm from the participation constraint.

We also have that Πtotn (K(ξ), ξn, X

sn) ≤ πcsn (Xs

nξn), because πcsn (Xsnξn) is the optimal

expected profit of the centralized supply chain. Hence, the supplier’s expected profit

under any contract is bounded above as


sn) = Πtot

n (K(ξ), ξn, Xsn)− Πm

n (K(ξ), P (ξ), w(ξ), ξn, Xsn)

≤ πcsn (Xsnξn)− πm.

We show that the proposed contract achieves this upper bound for every ξn.

We first fix w(ξ) = r and P (ξ) = −πm. Then, the supplier’s profit is


sn) = Πtot

n (K(ξ), ξn, Xsn)− πm,

and the manufacturer’s profit is Πmn (K(ξ), P (ξ), w(ξ), ξn, X

sn) = πm. In this case,

the manufacturer always attains her reservation profit πm regardless of her report ξ,

hence she has no incentive to inflate her demand forecast, i.e., the supplier would

report her forecast truthfully. Then, the supplier determines the capacity level to

maximize Πtotn (K(ξn), ξn, X

sn)−πm, which is the same as the objective function of the

centralized supply chain minus πm. Hence, the optimal K(ξ) when we fix w(ξ) =

r and P (ξ) = −πm is K(ξ) = Kcsn (Xs

nξ) = XsnξG

−1n ( r−c−cn

r−c ), which implies that

Πsn(K(ξ), P (ξ), w(ξ), ξ,Xs

n) = πcsn (Xsnξ)− πm, which is the same as the upper bound.


Therefore, the proposed contract is the optimal contract, and the supplier’s optimal

expected profit at period n is give as

πsn(Xsn) = Eξn [πcsn (Xs

nξn)− πm] = Xsnπ

csn − Cn − πm.

Proof of Theorem B.1. We define K(.) = K(.) − Xsn. By construction, K(.) is in-

creasing if and only if K(.) is increasing.Using K(.), we can derive (B.1) as

(r − c)E[min(Xsn + ξn + εn, K(ξn))]− (cnK(ξn) + Cn)

−1− Fn(ξn)

fn(ξn)(r − w)Gn(K(ξn)− ξn −Xs

n)

= (r − c)E[min(ξn + εn, K(ξn))]− cnK(ξn)− 1− Fn(ξn)


+(r − c− cn)Xsn − Cn

=

[(r − c)E[min(ξn + εn, K(ξn))]− cnK(ξn)− 1− Fn(ξn)


]+(r − c− cn)Xs

n − Cn, (B.7)

where the term inside of [.] is identical to the objective function of (B.3), which

instantly implies Part (a) and (c). Finally, by applying K(.) to (B.2), we can prove

that Part (b) also holds.

Proof of Theorem B.2. We first define a functionH(K, ξ) ≡ (r−c)E[min(εn+ξn), K)]−cnK− 1−Fn(ξ)

fn(ξ)(r−w)Gn(K− ξ). For Part (a), we prove that H(K, ξ) is quasi-concave

in K and have a finite maximizer. First, we have

∂H(K, ξ)

∂K= (r − c)(1−Gn(K − ξ))− cn −

1− Fn(ξ)

fn(ξ)(r − w)gn(K − ξ)

= (1−Gn(K − ξ))(r − c− 1− Fn(ξ)

fn(ξ)(r − w)

gn(K − ξ)1−Gn(K − ξ)

)− cn,



∂2H(K, ξ)

∂K2

∣∣∣∣∂H(K,ξ)∂K

=0

= −gn(K − ξ)(

cn1−Gn(K − ξ)

)+ (1−Gn(K − ξ))

(−1− Fn(ξ)

fn(ξ)(r − w)

d

dK

(gn(K − ξ)

1−Gn(K − ξ)

))< 0.

The inequality is from the fact that gn(K−ξ)1−Gn(K−ξ) is increasing in K from the IFR as-

sumption. The inequality is strict because gn(K − ξ)(

cn1−Gn(K−ξ)

)is strictly pos-

itive. In other words, H(K, ξ) is quasi-concave where the slope when it crosses

zero is strictly negative. Finally, we have ∂H(K,ξ)∂K|K<ξ

n= r − c − cn > 0, and

limK→∞∂H(K,ξ)∂K

= −cn < 0. Therefore, there exists a finite solution of ∂H(K,ξ)∂K

= 0.

In addition, because ∂H(K,ξ)∂K

is strictly decreasing at ∂H(K,ξ)∂K

= 0, there is only one

solution that satisfies the first-order condition.

Next we prove that the function K(.) that satisfies the first-order conditions is

increasing in ξ. We take second-order derivative on H, which gives

∂2H(K, ξ)

∂K∂ξ

∣∣∣∣∂H(K,ξ)∂K

=0

= gn(K − ξ)(

cn1−Gn(K − ξ)

)+ (1−Gn(K − ξ))

(− d

dξ

1− Fn(ξ)

fn(ξ)(r − w)

(gn(K − ξ)

1−Gn(K − ξ)

))+ (1−Gn(K − ξ))

(−1− Fn(ξ)

fn(ξ)(r − w)

d

dξ

(gn(K − ξ)

1−Gn(K − ξ)

))> 0.

The inequality is from the fact that both 1−Fn(ξ)fn(ξ)

and gn(K−ξ)1−Gn(K−ξ) are decreasing in ξ

from the IFR assumption. Therefore, the optimal K(.) of the unconstraint problem is

increasing in ξ, hence it is the optimal Kdcn (.). Therefore, Part (a) and the increasing

property of Kdcn in Part (c) are true. For the increasing property of P dc

n (ξ) in Part

(c), we take the derivative on the equation (B.2), which gives

dP dcn (ξ)

dξ= (r − w)(1−Gn(Kdc

n (ξ)− ξ))dKdcn (ξ)

dξ> 0.

The increasing property of Πmn (Kdc


n) in Part (d) is directly from Part


(a) of Lemma 1 of Ozer and Wei (2006). The rest of the proof is the same as the

proof of Theorem 1 of Ozer and Wei (2006).

Proof of Theorem B.3. In the additive case, we have Mn(Xsn) = Xs

n(cn − cn+1) −(Cn+1−Cn) + (πn+1− πn), which is also a linear function of Xs

n. Because every linear

function is convex, Part (a) holds from Proposition 2.2.

For Part (b), first note that Mn(Xsn) is decreasing in Xs

n if cn < cn+1 for every n.

Then, Proposition 2.1 implies Part (b).

For Part (c), we first define ηn ≡ maxm>n πm − Cm. We prove by induction that

Bn(Xsn) = ηn − πn + Cn for all n. For period n = N − 1, we have

Bn(Xsn) = Mn(Xs

n) = (πn+1 − Cn+1)− πn + Cn,

where ηn = πn+1 − Cn+1 by definition. Next assume for an induction argument that

Bn+1(X) = ηn+1 − πn+1 + Cn+1. If ηn+1 ≥ πn+1 − Cn+1, then ηn = ηn+1 and


n+1)|Xsn] +Mn(Xs

n)

= ηn+1 − πn+1 + Cn+1 +Mn(Xsn)

= ηn+1 − πn+1 + Cn+1 + πn+1 − Cn+1 − πn + Cn

= ηn+1 − πn + Cn = ηn − πn + Cn.

In contrast, if ηn+1 < πn+1 − Cn+1, then ηn = πn+1 − Cn+1 and


n+1)|Xsn] +Mn(Xs

n)

= 0 +Mn(Xsn) = πn+1 − Cn+1 − πn + Cn = ηn − πn + Cn,

which concludes the induction argument.

We next prove that the optimal policy always stops at period n∗. For n < n∗,

ηn = πn∗ −Cn∗ , hence ηn > πn−Cn by the definition of n∗. In this case, Bn(Xsn) > 0

for all Xsn, and it is always optimal to continue the process. For n = n∗, we have

ηn∗ ≤ πn − Cn, which implies that Bn(Xsn) ≤ 0. Hence, the optimal policy always

stops at period n∗.

Appendix C


C.1. Notation

Decision Variables State Variables and Updatesut : stopping decision st : market potentialit : investment decision wt(st) : reduction in market potentialQ : production quantity st+1 = st − wt(st)z : stocking factor xt : knowledge levelpr : regular sales price kt(it) : knowledge improvementps : salvage price xt+1 = xt + kt(it)Cost and Demand Parametersci(it) : cost of investment option itcp(xt) = δ0 + δ1e

−γxt : unit production costb : price elasticity of demandDn(st, pn) = stAnp

−bn : demand during the sales period n ∈ r, s

[An, An] : support of AnProfit FunctionsJr(st, Q) : revenue-to-go function of the regular sales periodJs(st, Qs) : revenue-to-go function of the salvage periodΠt(st, xt) : expected profit of stopping at period tVt(st, xt) : value-to-go function of period tMt(st, xt, it) : one-step benefit functionBt(st, xt, it) : benefit functionBt(st, xt) : maximal benefit function

135

APPENDIX C. CHAPTER 4 APPENDICES 136

C.2. Proofs

Proof of Theorem 4.1. The first part is from Proposition 2 of Monahan et al. (2004).

Next, because zs is defined as Qsstp−bs

, we have p∗s(st, Qs) =(stz∗sQs

)1/b

. Finally, by the

definition of J∗s and Equation (4.7), we have Js(st, Qs) = st

(Qsst

)1−1/b

J∗s .

Proof of Theorem 4.2. We first prove that z∗r ≥ Ar. When z ≤ Ar, the objective

function is given as f(z) = z1/b, because z ≤ Ar ≤ Ar. In this case, f(z) is strictly

increasing in z, which implies that z∗r ≥ Ar.

We next prove that f(z) is quasi-concave in z for z > Ar. When z > Ar, the

objective function f(z) is given as

f(z) = z−1+1/b(E[Ar] + βJ∗sE[(z − Ar)1−1/b]

).

We can derive the first-order derivative of f(z) for z > Ar as

f ′(z) =

(1− 1

b

)z−2+1/b

(−E[Ar] + βJ∗sE[Ar(z − Ar)−1/b]

),

and the second-order derivative of f(z) at the point that satisfies that f ′(z) = 0 as

f ′′(z)|f ′(z)=0 = −1

b

(1− 1

b

)z−2+1/bβJ∗sE[Ar(z − Ar)−1−1/b],

which is strictly smaller than 0. Therefore, f(z) is quasi-concave in z for z > Ar, and

the curvature of f(z) at the mode is strictly concave. Hence, the optimal z∗r is either

the unique solution that satisfies βJ∗sE[Ar(z − Ar)−1/b] = E[Ar] or the maximizer of

f(z) in [Ar, Ar].

By embedding Jr(st, Q) = st

(Qst

)1−1/b

J∗r in (4.6), we can derive

Πt(st, xt) = st

(Q

st

)1−1/b

J∗r − cp(xt)Q, (C.1)

from which we can determine the optimal production quantity asQ∗(st, xt) = st

((b−1)J∗rbcp(xt)

)b.


Then, because zr = Q

stp−br

, the optimal regular sales price is given as p∗r(st, xt) =(b−1)J∗r

bcp(xt)(z∗r )1/b . Finally, by embedding Q∗(st, xt) in (C.1), we can derive Πt(st, xt) =

stcp(xt)1−bπ∗.

Proof of Theorem 4.3. We prove that Bt(st, xt) is decreasing in xt at periods t ≥ t.

To do so, we first show that Mt(st, xt, it) is decreasing in xt at periods t ≥ t. We can

derive Mt(st, xt, it) as

Mt(st, xt, it) = −ci(it) + αE[Πt+1(st+1 − wt(st), xt + kt(it))]− Πt(st, xt)

= −ci(it) + αE[st − wt(st)]E[(cp(xt + kt(it)))1−b]π∗ − stcp(xt)1−bπ∗

= −ci(it) + αE[(st − wt(st))]− stE[(cp(xt + kt(it)))1−b]π∗ (C.2)

+stE[cp(xt + kt(it))

1−b]− cp(xt)1−b π∗.Because αE[(st − wt(st))] ≤ st and cp(xt + kt(it)) is increasing in xt for every real-

ization of kt(it), the second term of (C.2) is decreasing in xt. At periods t ≥ t, the

function cp(xt)1−b is concave in xt, which implies that E[cp(xt + kt(it))

1−b]− cp(xt)1−b

is decreasing in xt. Hence, Mt(st, xt, it) is decreasing in xt for every (st, it).

Using the decreasing property of Mt(st, xt, it), we next show that Bt(st, xt) is

decreasing in xt for periods t ≥ t. This proof is based on an induction argument. The

benefit function and the one-step benefit function satisfy the following recursion:

BT (sT , xT , iT ) = MT (sT , xT , iT )

Bt(st, xt, it) = −ci(it) + αE[Vt+1(st+1 − wt(st), xt + kt(it))]− Πt(st, xt)

= αE[max

0, Bt+1(st+1 − wt(st), xt + kt(it))

+ Πt+1(st+1 − wt(st), xt + kt(it))

]−Πt(st, xt)− ci(it)

= Mt(st, xt, it) + αE[max

0, Bt+1(st+1 − wt(st), xt + kt(it))

], for t < T. (C.3)

Hence, at period t = T , we have BT (sT , xT , iT ) = MT (sT , xT , iT ), which is decreas-

ing in xT . Then, for any x1 ≤ x2, we have BT (sT , x2) = BT (sT , x

2, i∗T (sT , x2)) ≤

BT (sT , x1, i∗T (sT , x

2)) ≤ BT (sT , x1, i∗T (sT , x

1)) = BT (sT , x1), where the first inequal-

ity is from the fact that BT (sT , xT , iT ) is decreasing in xT , and the second inequality


is by the definition of i∗T (sT , xT ). This result implies that BT (sT , xT ) is decreasing in

xT .

Next we assume for the induction argument that Bt+1(st+1, xt+1) is decreasing in

xt+1. Because the composition of a decreasing function and max0, x is also de-

creasing, max0, Bt+1(st+1, x) is a decreasing function of x. Then, the independence

between kt(it) and xt implies that E[max0, Bt+1(st−wt(st), xt + kt(it))] is decreas-

ing in xt. Finally, the benefit function Bt(st, xt, it) is also decreasing in xt, because

Mt(st, xt, it) is decreasing in xt for periods t ≥ t. If Bt(st, xt, it) is decreasing in xt,

then so is Bt(st, xt) = supit∈It Bt(st, xt, it) based on the same argument that we have

applied to BT (sT , xT ). Hence, Bt(st, xt) is decreasing in xt for periods t ≥ t.

When Bt(st, xt) is decreasing in xt, Bt(st, xt) ≤ 0 if and only if xt ≥ infx :

Bt(st, x) ≤ 0 = xt(st) for each given st. Hence, at periods t ≥ t, it is optimal to

stop process design if xt ≥ xt(st), and it is optimal otherwise to continue process

design.

Proof of Theorem 4.4. We first prove that if

dE[wt(st)]

dst≤ 1−min

it∈It

cp(xt)1−b

αE[cp(xt + kt(it))1−b],

then Mt(st, xt, it) is increasing in st for every (xt, it). Recall that Mt(st, xt, it) is given

as

Mt(st, xt, it) = −ci(it) + αE[(st − wt(st))]E[(cp(xt + kt(it)))1−b]π∗ − stcp(xt)1−bπ∗,


∂Mt(st, xt, it)

∂st= α(1− dE[wt(st)]

dst)E[(cp(xt + kt(it)))

1−b]π∗ − cp(xt)1−bπ∗.

Hence, when the sufficient condition (4.11) holds, Mt(st, xt, it) is increasing in st for

every (xt, it).

Using the increasing property of Mt(st, xt, it) in st, we next show that Bt(st, xt)

is increasing in st for all periods. The proof is based on an induction argument.


At period t = T , we have BT (sT , xT , iT ) = MT (sT , xT , iT ), which is increasing

in sT . Then, for any s1 ≤ s2, we have BT (s1, xT ) = BT (s1, xT , i∗T (s1, xT )) ≤

BT (s2, xT , i∗T (s1, xT )) ≤ BT (s2, xT , i

∗T (s2, xT )) = BT (s2, xT ), where the first inequality

is from the fact that BT (sT , xT , iT ) is increasing in sT , and the second inequality is

by the definition of i∗T (sT , xT ). This result implies that BT (sT , xT ) is increasing in

sT .

Next assume for the induction argument that Bt+1(st+1, xt+1) is increasing in

st+1. Because the composition of an increasing function and max0, x is also in-

creasing, max0, Bt+1(s, xt+1) is an increasing function of s. Then, the stochas-

tic increasing property of st+1 = st − wt(st) in st implies that E[max0, Bt+1(st −wt(st), xt + kt(it), it)] is increasing in st. Finally, the benefit function Bt(st, xt, it) =

Mt(st, xt, it) + αE[max0, Bt+1(st − wt(st), xt + kt(it), it)

]is increasing in st, be-

cause Mt(st, xt, it) is also increasing in st. If Bt(st, xt, it) is increasing in st, then so

is Bt(st, xt) based on the same argument that we have applied to BT (sT , xT ). Hence,

Bt(st, xt) is increasing in st for all periods.

When Bt(st, xt) is increasing in st, Bt(st, xt) ≤ 0 if and only if st ≤ sups :

Bt(s, xt) ≤ 0 = st(xt) for each given xt. Hence, at each period t, it is optimal to

stop process design if st ≤ st(xt), and it is optimal otherwise to continue process

design.

Proof of Theorem 4.5. The proof of this theorem is similar to that of Theorem 4.4.

We first prove that if

dE[wt(st)]

dst≥ 1−max

it∈It

cp(xt)1−b

αE[cp(xt + kt(it))1−b],

then Mt(st, xt, it) is increasing in st for every (xt, it). Recall from the proof of Theo-

rem 4.4 that

∂Mt(st, xt, it)

∂st= α(1− dE[wt(st)]

dst)E[(cp(xt + kt(it)))

1−b]π∗ − cp(xt)1−bπ∗.

Hence, when the sufficient condition (4.12) holds, Mt(st, xt, it) is increasing in st for

every (xt, it).


Using the decreasing property of Mt(st, xt, it) in st, we next show that Bt(st, xt)

is decreasing in st for all periods. The proof is based on an induction argument.

At period t = T , we have BT (sT , xT , iT ) = MT (sT , xT , iT ), which is decreasing

in sT . Then, for any s1 ≤ s2, we have BT (s2, xT ) = BT (s2, xT , i∗T (s2, xT )) ≤

BT (s1, xT , i∗T (s2, xT )) ≤ BT (s1, xT , i

∗T (s1, xT )) = BT (s1, xT ), where the first inequality

is from the fact that BT (sT , xT , iT ) is decreasing in sT , and the second inequality is

by the definition of i∗T (sT , xT ). This result implies that BT (sT , xT ) is decreasing in

sT .

Next assume for the induction argument that Bt+1(st+1, xt+1) is decreasing in

st+1. Because the composition of a decreasing function and max0, x is also de-

creasing, max0, Bt+1(s, xt+1) is a decreasing function of s. Then, the stochas-

tic increasing property of st+1 = st − wt(st) in st implies that E[max0, Bt+1(st −wt(st), xt + kt(it), it)] is decreasing in st. Finally, the benefit function Bt(st, xt, it) =

Mt(st, xt, it) + αE[max0, Bt+1(st − wt(st), xt + kt(it), it)

]is decreasing in st, be-

cause Mt(st, xt, it) is also decreasing in st. If Bt(st, xt, it) is decreasing in st, then so

is Bt(st, xt) based on the same argument that we have applied to BT (sT , xT ). Hence,

Bt(st, xt) is decreasing in st for all periods.

When Bt(st, xt) is decreasing in st, Bt(st, xt) ≤ 0 if and only if st ≥ infs :

Bt(s, xt) ≤ 0 = st(xt) for each given xt. Hence, at each period t, it is optimal to

stop process design if st ≥ st(xt), and it is optimal otherwise to continue process

design.

Proof of Theorem 4.6. The proof is directly from Proposition 2.6

Bibliography

Aitchison, J., J. A. C. Brown. 1957. The Lognormal Distribution. Cambridge University

Press, Cambridge, UK.

Akan, M., B. Ata, J. Dana. 2009. Revenue management by sequential screening. Working

Paper. Northwestern University, Evanston, IL.

Alagoz, O., L. M. Maillart, A. J. Schaefer, M. S. Roberts. 2004. The optimal timing of

living-donor liver transplantation. Management Sci. 50(10) 1420–1430.

Alagoz, O., L. M. Maillart, A. J. Schaefer, M. S. Roberts. 2007a. Choosing among living-

donor and cadaveric livers. Management Sci. 53(11) 1702–1715.

Alagoz, O., L. M. Maillart, A. J. Schaefer, M. S. Roberts. 2007b. Determining the acceptance

of cadaveric livers using an implicit model of the waiting list. Oper. Res. 55(1) 24–36.

Altman, E., S. Stidham. 1995. Optimality of monotonic policies for two-action markovian

decision processes. Queueing Systems 21 267–291.

Altug, M. S., A. Muharremoglu. 2009. Supply chain management with advance supply

information. Working Paper. Columbia University, New York, NY.

Anily, S., A. Grosfeld-Nir. 2006. An optimal lot-sizing and offline inspection policy in the

case of nonrigid demand. Oper. Res. 54(2) 311–323.

Athey, S. 2000. Characterizing properties of stochastic objective functions. Working Paper.

MIT, Boston, MA.

Aviv, Y. 2001. The effect of collaborative forecasting on supply chain performance. Man-

agement Sci. 47(10) 1326–1343.

Aviv, Y. 2002. Gaining benefits from joint forecasting and replenishment processes: The

case of auto-correlated demand. Manufacturing Service Oper. Management 4(1) 55–74.

141

BIBLIOGRAPHY 142

Aviv, Y. 2007. On the benefits of collaborative forecasting partnerships between retailers

and manufacturers. Management Sci. 53(5) 777–794.

Babich, V., M. J. Sobel. 2004. Pre-ipo operational and financial decisions. Management

Sci. 50(7) 935–948.

Bayus, B. L. 1997. Speed-to-market and new product performance trade-offs. J. of Product

Innovation Management 14(6) 485–497.

Bayus, B. L., S. Jain, A. G. Rao. 1997. Too little, too early: Introduction timing and new

product performance in the personal digital assistant industry. J. of Marketing Res.

34(1) 50–63.

Ben-Ameur, H., M. Breton, P. L’Ecuyer. 2002. A dynamic programming procedure for

pricing american-style asian options. Management Sci. 48(5) 625–643.

Bertsekas, D. P. 2005. Dynamic Programming and Optimal Control, Vol. 1 . Athena Scien-

tific, Belmont, MA.

Bertsekas, D. P. 2007. Dynamic Programming and Optimal Control, Vol. 2 . Athena Scien-

tific, Belmont, MA.

Black, F., M. Scholes. 1973. The pricing of options and corporate liabilities. J. Political

Econom. 3 637–654.

Bohn, R. E., C. Terwiesch. 1999. The economics of yield-driven processes. J. Oper. Man-

agement 18 41–59.

Boyaci, T., O. Ozer. 2009. Information acquisition via pricing and advance selling for

capacity planning: When to stop and act? Oper. Res. Forthcoming.

Cachon, G. P., M. A. Lariviere. 2001. Contracting to assure supply: How to share demand

forecasts in a supply chain. Management Sci. 47(5) 629–646.

Carrillo, J. E., R. M. Franza. 2006. Investing in product development and production

capabilities: The crucial linkage between time-to-market and ramp-up time. Eur. J.

Oper. Res. 171 536–556.

Carrillo, J. E., C. Gaimon. 2000. Improving manufacturing performance through process

change and knowledge creation. Management Sci. 46(2) 265–288.

Chen, A. H. Y. 1970. A model of warrant pricing in a dynamic market. J. Finance 25(5)

1041–1059.

BIBLIOGRAPHY 143

Chen, F. 1999. Decentralized supply chains subject to information delays. Management

Sci. 45(8) 1076–1090.

Chen, F. 2003. Information sharing and supply chain coordination. S. Graves, T. de Kok,

eds., Handbook of Operations Research and Management Science: Supply Chain Man-

agement . Elsevier, Amsterdam, The Netherlands, 341–421.

Chen, J., D. D. Yao, S. Zheng. 1998. Quality control for products supplied with warranty.

Oper. Res. 46(1) 107–115.

Chen, L., H. L. Lee. 2009. Information sharing and order variability control under a gener-

alized demand model. Management Sci. 55(5) 781–797.

Chod, J., N. Rudi. 2006. Strategic investments, trading, and pricing under forecast updat-

ing. Management Sci. 52(12) 1913–1929.

Chow, Y. S., S. Moriguti, H. Robbins, S. M. Samuels. 1964. Optimal selection based on

relative rank. Israel Journal of Mathematics 2(2) 81–90.

Chow, Y. S., H. Robbins, D. Siegmund. 1971. Great Expectations: the Theory of Optimal

Stopping . Houghton Mifflin Company, Boston, MA.

Clark, N. 2007. U.P.S. walks away from A380 order. The New York Times (March 3).

Cohen, M. A., J. Eliashberg, T. Ho. 1996. New product development: the performance and

time-to-market tradeoff. Management Sci. 42(2) 173–186.

Corbett, C. J., X. de Groote. 2000. A supplier’s optimal quantity discount policy under

asymmetric information. Management Sci. 46(3) 444–450.

Courty, P., H. Li. 2000. Sequential screening. Review of Economic Studies 67 697–717.

Cox, J. C., S. A. Ross, M. Rubinstein. 1979. Option pricing: A simplified approach. J.

Financial Econom. 7 229–263.

Durrett, R. 1996. Probability: Theory and Examples. Duxbury Press, Belmont, CA.

Files, J. 2001. Economic downturn leaves Cisco with stacks of excess inventory. San Jose

Mercury News (April 27).

Fine, C. H., E. L. Porteus. 1989. Dynamic process improvement. Management Sci. 37(4)

580–591.

Freeman, P. R. 1983. The secretary problem and its extensions: A review. Internat.

Statistical Rev. 51(2) 189–206.

BIBLIOGRAPHY 144

Gallego, G., O. Ozer. 2001. Integrating replenishment decisions with advance demand

information. Management Sci. 47(10) 1344–1360.

Glasserman, P., D. D. Yao. 1994. Monotone Structure in Discrete Event Systems. Wiley,

New York, NY.

Graves, S. C., D. B. Kletter, W. B. Hetzel. 1998. A dynamic model for requirements

planning with application to supply chain optimization. Oper. Res. 46(3) 35–49.

Graves, S. C., H. C. Meal, S. Dasu, Y. Qiu. 1986. Two-stage production planning in a dy-

namic environment. S. Axater, C. Schneeweiss, E. Silver, eds., Multi-Stage Production

Planning and Control. Lecture Notes in Economics and Mathematical Systems, 26 .

Springer-Verlag, Berlin, Germany, 9–43.

Ha, A. 2001. Supplier-buyer contracting: Asymmetric cost information and cutoff level

policy for buyer participation. Naval Res. Logist. 48 41–64.

Harrison, J. M., D. M. Kreps. 1979. Martingales and arbitrage in multiperiod securities

markets. J. Econom. Theory 20 381–408.

Hausman, W. H. 1969. Sequential decision problems: A model to exploit existing forecasters.

Management Sci. 16(2) B93–B111.

Hausman, W. H., R. Peterson. 1972. Multiproduct production scheduling for style goods

with limited capacity, forecast revisions and terminal delivery. Management Sci. 18(7)

370–383.

Heath, D. C., P. L. Jackson. 1994. Modeling the evolution of demand forecasts with appli-

cation to safety stock analysis in production/distribution systems. IIE Trans. 26(3)

17–30.

Hui, S. K., J. Eliashberg, E. I. George. 2008. Modeling dvd preorder and sales: An optimal

stopping approach. Oper. Res. 27(6) 1097–1110.

Iida, T., P. H. Zipkin. 2006. Approximate solutions of a dynamic forecast-inventory model.

Manufacturing Service Oper. Management 8(4) 407–425.

Iida, T., P. H. Zipkin. 2009. Competition and cooperation in a two-stage supply chain with

demand forecasts. Oper. Res. Forthcoming.

Kalish, S, G. L. Lilien. 1986. A market entry timing model for new techniques. Management

Sci. 32(2) 194–205.

BIBLIOGRAPHY 145

Kalyanaram, G., V. Krishnan. 1997. Deliberate product definition: Customizing the prod-

uct definition process. J. of Marketing Res. 34(2) 276–285.

Klastorin, T., W. Tsai. 2004. New product introduction: Timing, design, and pricing.

Manufacturing Service Oper. Management 6(4) 302–320.

Krishnan, V., K. T. Ulrich. 2001. Product development decisions: A review of the literature.

Management Sci. 47(1) 1–21.

Kumar, S., T. R. McCaffrey. 2003. Engineering economics at a hard disk drive manufacturer.

Technovation 23 749–755.

Laffont, J., J. Tirole. 1988. The dynamics of incentive contracts. Econometrica 56(5)

1153–1175.

Lapre, M. A., A. S. Mukherjee, L. N. Van Wassenhove. 2000. Behind the learning curve:

Linking learning activities to waste reduction. Management Sci. 46(5) 597–611.

Lee, H. L., K. C. So, C. S. Tang. 2000. The value of information sharing in a two-level

supply chain. Management Sci. 46(5) 626–643.

Lovejoy, W. S. 2006. Optimal mechanisms with finite agent types. Management Sci. 52(5)

788–803.

Lutze, H., O. Ozer. 2008. Promised lead-time contracts under asymmetric information.

Oper. Res. 56(4) 898–915.

Martinez-de Albeniz, V., D. Simchi-Levi. 2003. Mean-variance trade-offs in supply contracts.

Naval Res. Logist. 53(7) 603–616.

McAfee, R. P., J. McMillan. 1987. Auctions and bidding. J. of Economic Literature 25(2)

699–738.

Monahan, G. E., N. C. Petruzzi, W. Zhao. 2004. The dynamic pricing problem from a

newsvendor’s perspective. Manufacturing Service Oper. Management 6(1) 73–91.

Muller, A., D. Stoyan. 2002. Comparison Methods for Stochastic Models and Risks. John

Wiley and Sons, Chichester, UK.

Myerson, R.B. 1982. Optimal coordination mechanisms in generalized principal-agent prob-

lems. Journal of Mathematical Economics 10 67–81.

Ozer, O., O. Uncu. 2008. An integrated framework to optimize stochastic, dynamic quali-

fication timing and production decisions. Working Paper.

BIBLIOGRAPHY 146

Ozer, O., O. Uncu, W. Wei. 2007. Selling to the newsvendor with a forecast update: Analysis

of a dual purchase contract. Eur. J. Oper. Res. 182 1150–1176.

Ozer, O., W. Wei. 2006. Strategic commitments for an optimal capacity decision under

asymmetric forecast information. Management Sci. 52(8) 1239–1258.

Peskir, G., A. Shiryaev. 2006. Optimal Stopping and Free-Boundary Problems. Birkhauser,

Basel, Switzerland.

Petruzzi, N. C., M. Dada. 1999. Pricing and the news vendor problem: a review with

extensions. Oper. Res. 47(2) 183–194.

Pisano, G. P. 1996. Learning-before-doing in the development of new process technology.

Research Policy 25(7) 1097–1119.

Plambeck, E. L., S. A. Zenios. 2000. Performance-based incentives in a dynamic principal-

agent model. Manufacturing Service Oper. Management 2(3) 240–263.

Riley, J., R. Zeckhauser. 1983. Optimal selling strategies: When to haggle, when to hold

firm. Quarterly J. of Economics 98(2) 267–289.

Ross, S. 1983. Introduction to Stochastic Dynamic Programming . Academic Press Inc., New

York, NY.

Sapra, A., P. L. Jackson. 2009. On the equilibrium behavior of a supply chain market

for capacity. Working Paper. Indian Institute of Management Bangalore, Bangalore,

India.

Savin, S., C. Terwiesch. 2005. Optimal product launch times in a duopoly: Balancing

life-cycle revenues with product cost. Oper. Res. 53(1) 26–47.

Schoenmeyr, T., S. C. Graves. 2009. Strategic safety stocks in supply chains with evolving

forecasts. Manufacturing Service Oper. Management 11(4) 657–673.

Shaked, M., J. G. Shanthikumar. 2007. Stochastic Orders. Springer, New York, NY.

Shane, S. A., K. T. Ulrich. 2004. Technological innovation, product development, and

entrepreneurship in Management Science. Management Sci. 50(2) 133–144.

Shiryaev, A. N. 1978. Optimal Stopping Rules. Springer-Verlag, New York, NY.

Skreta, V. 2006. Sequentially optimal mechanisms. Review of Economic Stud. 73(4) 1085–

1111.

Smith, J. E., K. F. McCardle. 2002. Structural properties of stochastic dynamic programs.

Oper. Res. 50(5) 796–809.

BIBLIOGRAPHY 147

Stadje, W. 1991. Optimal stopping in a cumulative damage model. OR Spectrum 13(1)

31–35.

Taub, E. 2007. Microsoft to spend $1.15 billion for xbox repairs. The New York Times,

July 6.

Taylor, T. A., W. Xiao. 2009. Incentives for retailer forecasting: Rebates vs. returns.

Management Sci. 55(10) 1654–1669.

Terwiesch, C., R. E. Bohn. 2001. Learning and process improvement during production

ramp-up. Internat. J. Production Econom. 70(1) 1–19.

Terwiesch, C., Y. Xu. 2004. The copy-exactly ramp-up strategy: Trading-off learning with

process change. IEEE Trans. Engrg. Management 51(1).

Toktay, L. B., L. M. Wein. 2001. Analysis of a forecasting-production-inventory system

with stationary demand. Management Sci. 47(9) 1268–1281.

Topkis, D. M. 1978. Minimizing a submodular function on a lattice. Oper. Res. 26 305–321.

Tsitsiklis, J. N., B. Van Roy. 1999. Optimal stopping of markov processes: Hilbert space

theory, approximation algorithms, and an application to pricing high-dimensional fi-

nancial derivatives. IEEE Trans. Automatic Control 44 1840–1851.

Ulku, S., L. B. Toktay, E. Yucesan. 2005. The impact of outsourced manufacturing on

timing of entry in uncertain markets. Production Oper. Management 14(3) 301–314.

Ulu, C., J. E. Smith. 2009. Uncertainty, information acquisition, and technology adoption.

Oper. Res. 57(3) 740–752.

Wang, Y., B. Tomlin. 2009. To wait or not to wait: Optimal ordering under lead time

uncertainty and forecast updating. Naval Res. Logist. 56(8) 766–779.

Wu, R., M. C. Fu. 2003. Optimal exercise policies and simulation-based valuation for

american-asian options. Oper. Res. 51(1) 52–66.

Yao, D. D., S. Zheng. 1999a. Coordinated quality control in a two-stage system. IEEE

Trans. Automatic Control 44(6) 1166–1179.

Yao, D. D., S. Zheng. 1999b. Sequential inspection under capacity constraints. Oper. Res.

47(3) 410–421.

Zangwill, W. I., P. B. Kantor. 1998. Toward a theory of continuous improvement and the

learning curve. Management Sci. 44(7) 910–920.

BIBLIOGRAPHY 148

Zhang, H., S. Zenios. 2008. A dynamic principal-agent model with hidden information:

Sequential optimality through truthful state revelation. Oper. Res. 56(3) 681–696.

optimal stopping problems in operations a …such problems are pervasive in the areas of operations...

Documents