Post on 19-Dec-2015
Retail Inventories and Consumer Choice
Siddharth Mahajan
Garrett J. van Ryzin
Overview
• Modeling Consumer Choice
– Attribute Models and Utility Models
– Binary Probit, Binary Logit, Multinomial Logit
• Inventory Management Under Static Choice
– Smith and Agrawal Model
– van Ryzin and Mahajan Models: Independent Population and Trend-Following Population
• Inventory Management Under Dynamic Choice
– Noonan Model
– Mahajan and van Ryzin Model: Sample Path Gradient Algorithm
• Maximum Likelihood Estimation of the MNL
Introduction
• Faced with limited choices, customers are often willing to substitute rather than go home empty handed.
• Customers are heterogeneous in taste, and they are often willing to pay a higher price for products with attributes closer to their desired attributes.
• Therefore, a retailer has an incentive to offer a broad variety of products to better cover the possible range of consumer tastes.
• There are few direct costs of variety for a retailer, but the indirect costs of stockouts and overstocking impose an implicit cost on variety. Trade-off: “breadth vs. depth” of assortment.
Choice Processes
• Given two alternatives, a choice corresponds to an expression of preference of one alternative over another.
• Let X be a set of alternatives and let $\prec$ be a binary relation on the set X. For alternatives $x, y \in X$, $x \prec y$ denotes that y is strictly preferred to x.
• Definition: A binary relation $\prec$ on a set X is called a preference relation if it is asymmetric and negatively transitive.
• Theorem: If X is a finite set, a binary relation $\prec$ is a preference relation if and only if there exists a function $u: X \to \mathbb{R}$ (called a utility function) such that
$x \prec y$ iff $u(x) < u(y)$
Two approaches for modeling choice:
1.) Construct preference relations directly
2.) Construct utilities and then apply utility maximization
Lexicographic Model
• A product is made up of binary attributes. The consumer ranks all attributes and then eliminates alternatives which do not possess the most important attribute. If more than one alternative remains, the next most important attribute is chosen as a criterion for elimination of alternatives, and so on.
• This model implies attributes strictly dominate each other.
• Equivalent utility maximization model: There are n alternatives that have m attributes (1 = highest, m = lowest). Let $a_{jk} = 1$ if alternative j possesses attribute k, and $a_{jk} = 0$ otherwise. Then a utility satisfying the theorem above is the binary number
$U_j = a_{j1} a_{j2} \dots a_{jm}$
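The binary-number utility can be checked with a small sketch. The alternatives and attribute vectors below are hypothetical; the only thing taken from the slide is that attribute 1 is the most significant bit.

```python
def lexicographic_utility(attributes):
    """Read a list of 0/1 attributes as a binary number.

    attributes[0] is the most important attribute (most significant bit).
    """
    utility = 0
    for a in attributes:
        utility = 2 * utility + int(bool(a))
    return utility


def lexicographic_choice(alternatives):
    """Return the name of the alternative with the largest binary-number utility."""
    return max(alternatives, key=lambda name: lexicographic_utility(alternatives[name]))


# Hypothetical alternatives with m = 3 binary attributes each.
alternatives = {"A": [1, 0, 1], "B": [1, 1, 0], "C": [0, 1, 1]}
choice = lexicographic_choice(alternatives)
```

Here A and B both possess the top-ranked attribute, and B wins on the second one, which is what the elimination procedure would also produce.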
Address Model
• We have n alternatives, each of which has m attributes that take on real values. Therefore, alternatives can be represented as n points, $z_1, \dots, z_n$, in $\mathbb{R}^m$, which is called the attribute space.
• Each consumer has an ideal point (address) $y \in \mathbb{R}^m$, and chooses the product closest to it in attribute space, where distance is defined by a metric $\rho$ on $\mathbb{R}^m \times \mathbb{R}^m$.
• These distances define a preference relation: $i \prec j$ if and only if $\rho(z_i, y) > \rho(z_j, y)$.
• Equivalent utility maximization model:
$U_j(y) = c - \rho(z_j, y)$
• Randomization of the deterministic lexicographic rule.
• Process:
1.) Delete attributes common to all alternatives.
2.) Select one of the remaining attributes with probability proportional to a specified (deterministic) utility value.
3.) Eliminate alternatives not having the selected attribute.
4.) If a single alternative remains, choose it. If several alternatives remain, repeat the above process. If all remaining alternatives have the same attributes, choose one randomly.
Tversky Model
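The four-step elimination process can be sketched as follows. This is a minimal illustration, not Tversky's own formulation: the data layout (a dict of attribute sets and a dict of attribute weights) and the function name are assumptions.

```python
import random


def eliminate_by_aspects(alternatives, weights, rng):
    """alternatives: dict name -> set of attributes.
    weights: dict attribute -> positive (deterministic) utility value.
    rng: a random.Random instance supplying the randomness.
    """
    remaining = dict(alternatives)
    while len(remaining) > 1:
        # Step 1: drop attributes shared by every remaining alternative.
        common = set.intersection(*remaining.values())
        candidates = sorted(set.union(*remaining.values()) - common)
        if not candidates:
            # All remaining alternatives have the same attributes: pick one at random.
            return rng.choice(sorted(remaining))
        # Step 2: select an attribute with probability proportional to its weight.
        picked = rng.choices(candidates, weights=[weights[a] for a in candidates])[0]
        # Step 3: eliminate alternatives that lack the selected attribute.
        remaining = {n: attrs for n, attrs in remaining.items() if picked in attrs}
    # Step 4: a single alternative remains.
    return next(iter(remaining))
```

Because the selected attribute is held by at least one but not all remaining alternatives, every pass strictly shrinks the set, so the loop terminates.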
• Alternatives can be represented as elements $z_1, \dots, z_n$ of $\mathbb{R}^m$, and each individual has an ideal point (address) $y \in \mathbb{R}^m$.
• Let $g(y)$, $y \in \mathbb{R}^m$, denote the density function for y.
• $U_j(y) = c - \|z_j - y\|$, where c is the utility of the ideal point y and the distance term represents the disutility due to deviations from y.
• The market space of variant j is given by
$M_j = \{ y \in \mathbb{R}^m : U_j(y) \ge U_i(y),\ i = 1, \dots, n \}$
• The probability that a consumer buys variant j is given by
$P_j = \int_{M_j} g(y)\, dy$
Randomized Address Model
• Let the n alternatives be denoted j = 1,…,n. A consumer associates a utility with alternative j, denoted Uj.
• This utility is decomposed into two parts: a representative component $u_j$ that is deterministic and a random component $\varepsilon_j$ with mean zero:
$U_j = u_j + \varepsilon_j$
• The probability that an individual selects alternative j is
$q_j = P(U_j = \max\{U_i : i = 1, \dots, n\})$
• This probability depends on the joint distribution of the random components $\varepsilon_j$. Common versions are the binary probit, binary logit, and multinomial logit.
Random Utility Models
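• Two alternatives, and the error terms $\varepsilon_j$, j = 1, 2, are iid $N(0, \sigma^2)$.
• Probability that Variant 1 is chosen:
$q_1 = P(U_1 = \max\{U_1, U_2\})$
$= P(U_1 \ge U_2)$
$= P(u_1 + \varepsilon_1 \ge u_2 + \varepsilon_2)$
$= P(\varepsilon_2 - \varepsilon_1 \le u_1 - u_2)$
$= \Phi\left(\frac{u_1 - u_2}{\sqrt{2}\,\sigma}\right)$, where $\Phi$ is the standard normal cdf
• No closed form solution.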
Binary Probit
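Although the normal cdf has no closed form, the probit probability is easy to evaluate numerically. The sketch below assumes the setup on this slide (iid $N(0, \sigma^2)$ errors, so $\varepsilon_2 - \varepsilon_1$ has standard deviation $\sqrt{2}\sigma$); the function names are illustrative.

```python
import math


def standard_normal_cdf(x):
    """Phi(x), computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def binary_probit(u1, u2, sigma):
    """Probability that variant 1 is chosen when the errors are iid N(0, sigma^2)."""
    return standard_normal_cdf((u1 - u2) / (math.sqrt(2.0) * sigma))
```

With equal systematic utilities the two variants are chosen with probability 1/2 each, and the probabilities of the two variants always sum to one.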
• Two alternatives, and the error term $\varepsilon = \varepsilon_1 - \varepsilon_2$ has a logistic cumulative distribution, i.e.
$F(x) = \frac{1}{1 + e^{-x/\mu}}$
where $\mu > 0$ is a scale parameter and $-\infty < x < \infty$
• $E(\varepsilon) = 0$ and $\mathrm{Var}(\varepsilon) = \mu^2 \pi^2 / 3$
• The logistic distribution provides a good approximation to the normal distribution, but with “fatter tails.”
Binary Logit
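• Probability that Variant 1 is chosen:
$q_1 = P(U_1 = \max\{U_1, U_2\})$
$= P(U_1 \ge U_2)$
$= P(u_1 + \varepsilon_1 \ge u_2 + \varepsilon_2)$
$= P(\varepsilon_2 - \varepsilon_1 \le u_1 - u_2)$
$= \frac{e^{u_1/\mu}}{e^{u_1/\mu} + e^{u_2/\mu}}$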
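• $\varepsilon_j$ are iid random variables with a Gumbel (or double exponential) distribution, with cdf
$F(x) = P(\varepsilon_j \le x) = e^{-e^{-(x/\mu + \gamma)}}$
where $\gamma$ is Euler’s constant (= 0.5772…) and $\mu$ is a scale parameter, $E[\varepsilon_j] = 0$ and $\mathrm{Var}[\varepsilon_j] = \mu^2 \pi^2 / 6$
• The probability that an alternative j is chosen from a set $S \subseteq \{1, 2, \dots, n\}$ that contains j is given by
$P_S(j) = \frac{e^{u_j/\mu}}{\sum_{i \in S} e^{u_i/\mu}}$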
Multinomial Logit (MNL)
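The MNL choice probability $P_S(j) = e^{u_j/\mu} / \sum_{i \in S} e^{u_i/\mu}$ can be computed as below. The function name and the dict layout are illustrative; subtracting the largest utility before exponentiating is a standard numerical safeguard and does not change the probabilities.

```python
import math


def mnl_probabilities(utilities, mu=1.0):
    """utilities: dict alternative -> systematic utility u_j for the alternatives in S.
    Returns dict alternative -> MNL choice probability with scale parameter mu.
    """
    largest = max(utilities.values())
    weights = {j: math.exp((u - largest) / mu) for j, u in utilities.items()}
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}
```

With two alternatives this reduces to the binary logit probability $1 / (1 + e^{-(u_1 - u_2)/\mu})$.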
• For all $S \subseteq N$ and $T \subseteq N$ such that $S \subseteq T$, and for all $i, j \in S$,
$\frac{P_S(i)}{P_S(j)} = \frac{P_T(i)}{P_T(j)}$
• The ratio of choice probabilities for i and j is independent of the choice set containing these alternatives. This property is not realistic if the choice set contains alternatives that can be grouped such that alternatives within a group are more similar than alternatives outside the group, because adding a new alternative reduces the probability of choosing similar alternatives more than dissimilar alternatives.
Independence from Irrelevant Alternatives Property
• Suppose the individual selects to travel by car or by bus with equal probability. Let the set S = {car, bus}. Then
PS(car) = PS(bus) = 1/2
• Introduce a new bus. Let the set T denote {car, blue bus, red bus}. It can be shown that the MNL predicts
PT(car) = PT(blue bus) = PT(red bus) = 1/3
• However, a more natural outcome is
PT(car) = 1/2
PT(blue bus) = PT(red bus) = 1/4
Blue bus/red bus paradox
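The paradox can be reproduced numerically. The sketch assumes equal preference values for all three alternatives and treats the original bus in S as the blue bus; both are assumptions made only for illustration.

```python
def mnl_choice_probs(preferences, choice_set):
    """preferences: dict alternative -> v_i = exp(u_i / mu). Returns MNL probabilities on choice_set."""
    total = sum(preferences[i] for i in choice_set)
    return {i: preferences[i] / total for i in choice_set}


# Equal preferences: the traveler is indifferent between car and either bus.
prefs = {"car": 1.0, "blue bus": 1.0, "red bus": 1.0}
P_S = mnl_choice_probs(prefs, ["car", "blue bus"])
P_T = mnl_choice_probs(prefs, ["car", "blue bus", "red bus"])

# IIA: the car-to-blue-bus ratio is the same in S and in T.
ratio_S = P_S["car"] / P_S["blue bus"]
ratio_T = P_T["car"] / P_T["blue bus"]
```

The ratio stays at 1 in both sets, which forces the car share down from 1/2 to 1/3 once the red bus is added.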
• The retailer’s decision problem is to determine which subset S from a universe of N possible variants should be included in the assortment and how much to stock of each variant.
• Static choice models: Customer choices do not depend on the transient inventory status of the variants in the assortment, but only on S.
– Examples: catalog retailer, shoe store
Inventory Management Under Static Choice
(A1) The initial choice of a variant is independent of the inventory status of the variants in S.
(A2) If a customer selects a variant in S and the store does not have it in stock, the customer does not undertake a second choice, and the sale is lost.
Assumptions of Static Choice Models
N The set {1,2,…,n} of all variants available in the market
S The subset of variants stocked by the retailer
xj Initial inventory of variant j (decision variable)
x Inventory level vector, $\{x_j : j \in N\}$
Y Total number of customers, or store traffic (r.v.)
$\lambda$ Mean of Y
Yj Demand for the jth variant (a function of S, r.v.)
qj(S) Probability that variant j is chosen by an arriving customer
Notation
• Demand Model:
– Each customer has an initial preference for variant $j \in N$ with probability $f_j$.
– If $j \in S$, the customer chooses to purchase variant j.
– If $j \notin S$, the customer will substitute with another variant $i \in N$ with probability $s_{ji}$.
– A customer chooses not to substitute with probability $L_j = 1 - \sum_{i \in N} s_{ji}$
• Probability that variant j is chosen from a set S (for $j \in S$):
$q_j(S) = f_j + \sum_{i \notin S} f_i s_{ij}$
Smith and Agrawal Model
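A short sketch of the choice probability $q_j(S) = f_j + \sum_{i \notin S} f_i s_{ij}$. The preference and substitution values are hypothetical, and returning 0 for a variant that is not stocked is an assumption that follows from (A2), since an unstocked variant cannot be sold.

```python
def choice_probability(j, S, f, s):
    """q_j(S): first-choice demand for j plus demand substituted from unstocked variants.

    f: dict variant -> initial preference probability f_i.
    s: dict of dicts, s[i][j] = probability of substituting from i to j.
    """
    if j not in S:
        return 0.0  # assumption: a variant outside the assortment is never chosen
    return f[j] + sum(f[i] * s[i][j] for i in f if i not in S)


# Hypothetical example: 4 variants, equal preferences, each customer substitutes to
# any particular other variant with probability 0.2 (so L_j = 0.4 for every j).
variants = [1, 2, 3, 4]
f = {i: 0.25 for i in variants}
s = {i: {j: (0.0 if i == j else 0.2) for j in variants} for i in variants}
q1 = choice_probability(1, {1, 2}, f, s)
```

Stocking only variants 1 and 2 gives each of them 0.25 of first-choice demand plus 0.05 from each of the two unstocked variants.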
• Total store traffic: Y is modeled as a negative binomial r.v. Let $\psi$ represent the pmf for aggregate demand:
$\psi(y) = \binom{k + y - 1}{k - 1} t^k (1 - t)^y$
where y = 0,1,2,… and k and t are parameters of the negative binomial distribution, and
$\lambda = \frac{k(1 - t)}{t}$
• Given the total demand Y = y and the subset of variants S, $Y_j$ has a binomial distribution with parameters y and $q_j(S)$. Unconditioning over aggregate demand, $Y_j$ also has a negative binomial distribution.
• Let $\psi_j(y_j, S)$ represent the pmf for the preference for each variant j given S. Then
$\psi_j(y_j, S) = \binom{k + y_j - 1}{k - 1} r_j^k (1 - r_j)^{y_j}$
where
$r_j = \frac{t}{t + q_j(S)(1 - t)}$
• Denote the cdf of $Y_j$ by
$\Psi_j(y_j, S) = \sum_{d=0}^{y_j} \psi_j(d, S)$
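• Substitution Probability Models (matrices $[s_{ij}]$ shown for n = 4):
1.) Random Substitution
$\begin{pmatrix} 0 & \frac{1-L}{3} & \frac{1-L}{3} & \frac{1-L}{3} \\ \frac{1-L}{3} & 0 & \frac{1-L}{3} & \frac{1-L}{3} \\ \frac{1-L}{3} & \frac{1-L}{3} & 0 & \frac{1-L}{3} \\ \frac{1-L}{3} & \frac{1-L}{3} & \frac{1-L}{3} & 0 \end{pmatrix}$
2.) Adjacent Substitution
$\begin{pmatrix} 0 & 1-L & 0 & 0 \\ \frac{1-L}{2} & 0 & \frac{1-L}{2} & 0 \\ 0 & \frac{1-L}{2} & 0 & \frac{1-L}{2} \\ 0 & 0 & 1-L & 0 \end{pmatrix}$
3.) One Variant Substitution
$\begin{pmatrix} 0 & 0 & 1-L & 0 \\ 0 & 0 & 1-L & 0 \\ 0 & 0 & 1-L & 0 \\ 0 & 0 & 1-L & 0 \end{pmatrix}$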
• Profit Function:
– Single-period, lost sales (newsvendor) model
– Notation
kj Fixed cost to stock variant j
mj Unit profit margin obtained from selling variant j
coj Overage cost for variant j
cuj Underage cost for variant j
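– The profit obtained from stocking $x_j$ units of variant j, denoted $\pi_j(S, x_j)$, is
$\pi_j(S, x_j) = m_j E[Y_j] - c_{oj} \sum_{d=0}^{x_j} (x_j - d)\,\psi_j(d, S) - c_{uj} \sum_{d=x_j}^{\infty} (d - x_j)\,\psi_j(d, S) - k_j$
where $\psi_j(d, S)$ is the pmf of $Y_j$ given S.
– The total profit, denoted $\pi(S, x)$, is given by
$\pi(S, x) = \sum_{j \in S} \pi_j(S, x_j)$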
• Assortment Optimization
– Smith and Agrawal propose a non-linear integer programming formulation to solve for the optimal subset S* and the optimal stocking quantities x*.
– Binary variable $z_j$ indicates whether item j is included in S ($z_j = 1$ if $j \in S$).
– Possible constraints on x and S:
$\sum_{j \in S} t_j z_j x_j \le T_1$
$\sum_{j \in S} z_j \le T_2$
– From the static choice assumptions (A1) and (A2), given S, the optimal stocking decision for each variant, $x_j^*$, is determined by solving a simple, single-item newsvendor problem:
$\Psi_j(x_j^* - 1, S) < \frac{c_{uj}}{c_{uj} + c_{oj}} \le \Psi_j(x_j^*, S)$
where $\Psi_j(\cdot, S)$ is the cdf of $Y_j$ given S.
– Substituting these values into the profit function yields a discrete optimization problem with a nonlinear objective function.
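The single-item newsvendor step can be sketched for a negative binomial demand pmf. The parameter values are hypothetical, and `math.comb` restricts this sketch to integer k, which is an assumption made for simplicity.

```python
from math import comb


def negative_binomial_pmf(y, k, r):
    """P(Y_j = y) = C(k + y - 1, k - 1) * r^k * (1 - r)^y, for integer k."""
    return comb(k + y - 1, k - 1) * r ** k * (1 - r) ** y


def newsvendor_stock(pmf, underage, overage, max_x=10_000):
    """Smallest x whose cdf reaches the critical ratio c_u / (c_u + c_o)."""
    ratio = underage / (underage + overage)
    cdf = 0.0
    for x in range(max_x + 1):
        cdf += pmf(x)
        if cdf >= ratio:
            return x
    raise ValueError("critical ratio not reached within max_x")


# Hypothetical variant: k = 3, r_j = 0.5, underage cost 2, overage cost 1.
x_star = newsvendor_stock(lambda y: negative_binomial_pmf(y, 3, 0.5), 2.0, 1.0)
```

In this example the cdf is 0.65625 at x = 3 and about 0.773 at x = 4, so the critical ratio 2/3 is first reached at x = 4.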
• Numerical Results:
1.) The optimal profit for the No Substitution model is always lower than the substitution model at any given level of variety.
2.) Random Substitution generally requires the largest optimal assortment, Adjacent Substitution a somewhat smaller optimal assortment, and under One Variant Substitution profit is often maximized with only one variant (the “universal substitute”).
3.) As L increases, the number of variants in the optimal assortment increases.
• MNL Choice Model:
– Each customer associates a random utility $U_j$ with the variants $j \in S$.
– A no-purchase option, denoted j = 0 with associated utility $U_0$, is introduced.
– Customers choose the variant with the highest utility among the set $\{U_j : j \in S \cup \{0\}\}$.
– Each variant has an identical retail price p and has an identical unit cost c. No fixed costs.
van Ryzin and Mahajan Models
• Given S, the probability that variant j is selected is given by
$q_j(S) = P(U_j = \max\{U_i : i \in S \cup \{0\}\}) = \frac{v_j}{\sum_{i \in S} v_i + v_0}$
where
$v_j = e^{u_j/\mu}, \quad j \in S \cup \{0\}$
• The quantities vj are called preferences, because the values are increasing in the systematic component of utility, uj.
• Each customer assigns utilities to the variants in the subset S based on independent samples of the MNL model.
• The utility Uij that customer i assigns to variant j is given by
$U_{ij} = u_j + \varepsilon_{ij}$
where the $\{\varepsilon_{ij}, i \ge 1\}$ are iid random variables.
$\lambda$ = mean number of customers arriving during the season
• The number of customers selecting variant j, $Y_j$, is normally distributed with mean $\lambda q_j(S)$ and standard deviation $\sigma (\lambda q_j(S))^{\beta}$, where $\sigma > 0$ and $0 \le \beta < 1$.
Independent Population Model
• c = cost for each unit not sold (no salvage value)
• p – c = the opportunity cost for each unit short (loss in margin)
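• The maximum expected profit given S and v is:
$\pi^I(S, v) = \sum_{j \in S} \max_{x_j \ge 0} E[\,p \min\{Y_j, x_j\} - c x_j\,]$
• The optimal stocking level of variant j is given by
$x_j^* = \lambda q_j(S) + z \sigma (\lambda q_j(S))^{\beta}$, where $z = \Phi^{-1}\left(1 - \frac{c}{p}\right)$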
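A sketch of the independent population stocking rule $x_j^* = \lambda q_j(S) + z \sigma (\lambda q_j(S))^{\beta}$ with $z = \Phi^{-1}(1 - c/p)$, writing $\lambda$ for the mean number of customers. All parameter values below are hypothetical.

```python
from statistics import NormalDist


def mnl_choice_probability(j, S, v, v0):
    """q_j(S) = v_j / (sum of v_i over S + v_0)."""
    return v[j] / (v0 + sum(v[i] for i in S))


def independent_population_stock(j, S, v, v0, lam, sigma, beta, p, c):
    """Newsvendor stocking level for normally distributed demand for variant j."""
    z = NormalDist().inv_cdf(1.0 - c / p)
    mean = lam * mnl_choice_probability(j, S, v, v0)
    return mean + z * sigma * mean ** beta


# Hypothetical data: two variants plus the no-purchase option.
v = {1: 2.0, 2: 1.0}
stock_low_margin = independent_population_stock(1, [1, 2], v, 1.0, 100.0, 1.0, 0.5, 2.0, 1.0)
stock_high_margin = independent_population_stock(1, [1, 2], v, 1.0, 100.0, 1.0, 0.5, 4.0, 1.0)
```

When p = 2c the critical ratio is 1/2, so z = 0 and the stocking level equals mean demand; a higher margin pushes z, and hence the safety stock, above zero.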
• A fixed number of customers visits the store. Each customer has identical valuations of the utilities for the variants; these utilities are determined by a single sample of the MNL model.
• $\varepsilon_{ij} = \varepsilon_{1j}$ for all $i > 1$
• The common utility values are not observable to the retailer prior to making assortment decisions.
• Demand for variant j, denoted $Y_j$, is a scaled Bernoulli r.v.
$P(Y_j = y) = q_j(S)$ if $y = \lambda$; $\ 1 - q_j(S)$ if $y = 0$; $\ 0$ otherwise
Trend-Following Population Model
• Expected profit from variant j given xj is E[p min{Yj, xj} – cxj].
• The optimal expected profit given S and v is:
$\pi^T(S, v) = \lambda \sum_{j \in S} (p\, q_j(S) - c)^+$
• The optimal stocking level of variant j is given by
$x_j^* = \lambda$ if $p\, q_j(S) > c$; $\ x_j^* = 0$ if $p\, q_j(S) \le c$
• We can formulate the optimal assortment selection problem by solving
$\max_{S \subseteq N} \pi(S, v)$
• Let the variants be ordered as $v_1 \ge v_2 \ge \dots \ge v_n$.
• Theorem: Let $A_i = \{1, 2, \dots, i\}$ for $1 \le i \le n$. Then for each of the assortment problems defined above, there exists an $S^* \in \{A_1, \dots, A_n\}$ that maximizes store profits.
Choose the best i variants, where $1 \le i \le n$.
Assortment Optimization
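The theorem reduces the search from $2^n$ subsets to the n nested sets $A_1, \dots, A_n$. The sketch below applies this to the trend-following profit $\lambda \sum_{j \in S} (p\,q_j(S) - c)^+$; the preference values, price, cost, and $\lambda$ are hypothetical.

```python
def trend_following_profit(S, v, v0, p, c, lam):
    """lam * sum over j in S of max(p * q_j(S) - c, 0), with MNL probabilities q_j(S)."""
    total = v0 + sum(v[j] for j in S)
    return lam * sum(max(p * v[j] / total - c, 0.0) for j in S)


def best_nested_assortment(v, v0, p, c, lam):
    """Search only the 'popular' sets A_i = the i variants with the largest preferences."""
    order = sorted(range(len(v)), key=lambda j: -v[j])
    candidates = [order[:i] for i in range(1, len(order) + 1)]
    best = max(candidates, key=lambda S: trend_following_profit(S, v, v0, p, c, lam))
    return best, trend_following_profit(best, v, v0, p, c, lam)


# Hypothetical instance with three variants (indexed from 0).
v = [4.0, 2.0, 1.0]
best_set, best_profit = best_nested_assortment(v, v0=1.0, p=3.0, c=1.0, lam=10.0)
```

In this instance the single most popular variant is optimal: adding variants dilutes each choice probability faster than it wins customers away from the no-purchase option.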
• Theorem: For all $n > i \ge 1$,
a) $\pi(A_{i+1}, v) > \pi(A_i, v)$ for sufficiently high selling price p
b) $\pi(A_{i+1}, v) < \pi(A_i, v)$ for sufficiently low no-purchase preference $v_0$
c) $\pi(A_{i+1}, v) > \pi(A_i, v)$ for sufficiently high store volume $\lambda$ (independent population case only)
• Dynamic choice models: Customer choices do depend on the transient inventory status of the variants in the assortment. A consumer may have a preferred variant, but upon finding that variant out of stock, he may decide to substitute a different variant.
– Examples: grocery items, soft drinks
Inventory Management Under Dynamic Choice
• Dynamic choice models are more realistic, but they are less tractable so we must rely more heavily on computational studies to understand the problem.
• Single period inventory model of a merchandise category made up of multiple product variants, each having a different unit selling price and unit procurement cost.
• Customers have a first choice and a second choice and demand is generated in two stages:
1.) Primary demand is realized and satisfied as much as possible with available inventory.
2.) Unfilled demand is converted to secondary demand for products based on deterministic proportions.
Noonan Model
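The two-stage demand process can be sketched for one demand realization. The function name and the example numbers are hypothetical; `s[i][j]` is the fixed proportion of unfilled demand for variant i that transfers to variant j.

```python
def two_stage_sales(x, demand, s):
    """Return total sales per variant after primary and secondary demand.

    x: initial inventory levels, demand: primary demand, s: substitution proportions.
    """
    n = len(x)
    # Stage 1: satisfy primary demand as far as inventory allows.
    primary = [min(x[j], demand[j]) for j in range(n)]
    unfilled = [demand[j] - primary[j] for j in range(n)]
    leftover = [x[j] - primary[j] for j in range(n)]
    # Stage 2: unfilled demand becomes secondary demand in fixed proportions.
    secondary_demand = [sum(unfilled[i] * s[i][j] for i in range(n) if i != j) for j in range(n)]
    secondary = [min(leftover[j], secondary_demand[j]) for j in range(n)]
    return [primary[j] + secondary[j] for j in range(n)]


sales = two_stage_sales([5, 10], [8, 6], [[0.0, 0.5], [0.5, 0.0]])
```

Variant 1 stocks out with 3 units unfilled; half of that becomes secondary demand for variant 2, which still has 4 units left after its own primary demand.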
N The set {1,2,…,n} of all variants available in the market
xj Initial inventory of variant j (decision variable)
x Inventory level vector, $\{x_j : j \in N\}$
cj Unit procurement cost for variant j
pj Unit selling price for variant j
sij Fixed proportion of unfilled demand for variant i that is transferred to variant j after stockouts
Notation
$R_{\text{none}}$ Region of demand where neither variant stocks out
$R_{\text{both}}$ Region of demand where both variants stock out on original demand
$R_1$ Variant 1 stocks out on original demand, but the substitution demand is satisfied using Variant 2
$R_{12}$ Variant 1 stocks out on original demand and its substitution demand is large enough to stock out Variant 2
$R_2$ Variant 2 stocks out on original demand, but the substitution demand is satisfied using Variant 1
$R_{21}$ Variant 2 stocks out on original demand and its substitution demand is large enough to stock out Variant 1
Definition of Demand Regions, n = 2
• Let $P_A$ denote the probability that the demand vector lies in region $R_A$. The expression for expected profit can be written by integrating over the different regions. Differentiating leads to the following first-order necessary conditions:
$c_1 = p_1 [P_{\text{both}} + P_1 + P_{12} + P_{21}] - p_2 s_{12} P_1$
$c_2 = p_2 [P_{\text{both}} + P_2 + P_{21} + P_{12}] - p_1 s_{21} P_2$
Analysis of Demand Regions
• The merchandise category consists of n substitutable variants:
$p_j$ = selling price of variant j
$c_j$ = procurement cost of variant j
• Single-period (newsvendor-like) inventory model in which the retailer’s only decision is the vector of initial inventory levels x for each of the variants.
Mahajan and van Ryzin Model
• Dynamic choice version of the assortment problem using a general random utility model.
• T = number of customers on a sample path
• $x_t = (x_t^1, \dots, x_t^n)$ is the vector of inventory levels observed by customer t (t = 1,…,T)
• $x_1 = x$, the initial stocking decision
• $S(y) = \{j \in N : y_j > 0\} \cup \{0\}$ for any real inventory vector y
– S(y) is the set of in-stock variants together with the no-purchase option
– Customer t can only make a choice $j \in S(x_t)$
– $S(x_{t+1}) \subseteq S(x_t)$, since inventory levels are nonincreasing over time
Sample Path Analysis
• $U_t = (U_t^0, U_t^1, \dots, U_t^n)$ is the vector of utilities assigned by customer t
• Based on $x_t$ and $U_t$, customer t makes the choice $d(x_t, U_t)$ that maximizes his utility:
$d(x_t, U_t) = \arg\max_{j \in S(x_t)} \{U_t^j\}$
• Let $\omega = \{U_t : t = 1, \dots, T\}$ denote a sample path from some probability space $(\Omega, \mathcal{F}, P)$.
– The retailer does not know the particular realization $\omega$ but does know the probability measure P, so we think of P as characterizing the retailer’s knowledge of future demand.
– The retailer’s objective is to choose x to maximize total expected profit.
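$\eta_j(x, \omega)$ = the number of sales of variant j made on the sample path $\omega$ given initial inventory levels x
$\eta(x, \omega) = (\eta_1(x, \omega), \dots, \eta_n(x, \omega))$
$\pi(x, \omega)$ = sample path profit
$\pi(x, \omega) = p^T \eta(x, \omega) - c^T x$
• Retailer’s objective is to solve
$\max_{x \ge 0} E[\pi(x, \omega)]$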
Total Profit
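Sample path sales and profit can be computed by replaying the customers in order. In this sketch each customer buys exactly one unit, and each utility vector lists the no-purchase utility first; the function names and numbers are hypothetical.

```python
def simulate_sales(x, utilities):
    """x: initial integer inventory levels for variants 1..n.
    utilities: one list per customer, [U0, U1, ..., Un], with U0 the no-purchase utility.
    Returns the number of units sold of each variant on this sample path.
    """
    inventory = list(x)
    sales = [0] * len(x)
    for U in utilities:
        # Choice set: the no-purchase option plus every variant still in stock.
        available = [0] + [j for j in range(1, len(x) + 1) if inventory[j - 1] > 0]
        choice = max(available, key=lambda j: U[j])
        if choice != 0:
            inventory[choice - 1] -= 1
            sales[choice - 1] += 1
    return sales


def sample_path_profit(x, utilities, prices, costs):
    """Revenue from sales minus procurement cost of the initial inventory."""
    sales = simulate_sales(x, utilities)
    revenue = sum(p * q for p, q in zip(prices, sales))
    return revenue - sum(c * xi for c, xi in zip(costs, x))
```

For example, with one unit of each of two variants and three customers who all rank variant 1 first, the second customer substitutes to variant 2 and the third leaves without buying.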
• Inventory is viewed as a fluid. Each customer t requires a certain quantity of fluid, Qt, which could vary from customer to customer and could be non-integral.
• The customer drains the inventory of the most preferred fluid. If this fluid runs out, the customer drains the inventory of the second most preferred fluid and so on.
• This process continues until either the customer’s requirement is met or the inventory of all fluids valued higher than the no-purchase utility is exhausted.
A Fluid Model Relaxation
• $\nabla \pi(x, \omega)$ is the sample path gradient.
• Requires an initial starting inventory level y and a sequence of step sizes $\{a_k\}$ with the following properties:
$\sum_{k=0}^{\infty} a_k = \infty$ and $\sum_{k=0}^{\infty} a_k^2 < \infty$
• Sample Path Gradient Algorithm:
1. Initialize: k = 0 and $y_0 = y$
2. At iteration k
i) Generate a new sample path $\omega_k$
ii) Calculate $\nabla \pi(y_k, \omega_k)$ and $a_k$
iii) Update the starting inventory level for the next iteration, using the equation $y_{k+1} = y_k + a_k \nabla \pi(y_k, \omega_k)$
3. Set k = k + 1 and go to Step 2
Sample Path Gradient Algorithm
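The iteration can be illustrated on a deliberately simplified problem. The slides do not give the gradient formula for the multi-variant fluid model, so this sketch swaps in a single-variant newsvendor, whose sample path derivative is p when demand exceeds y and 0 otherwise, minus c. The step sizes $a_k = a/(k+1)$ satisfy both step-size conditions; the uniform demand, the constant a = 50, and the projection onto $y \ge 0$ are assumptions.

```python
import random


def newsvendor_sample_gradient(y, demand, price, cost):
    """Derivative of price * min(demand, y) - cost * y with respect to y on one sample."""
    return (price if demand > y else 0.0) - cost


def sample_path_gradient_ascent(y0, price, cost, draw_demand, iterations, step_scale):
    """Stochastic gradient ascent with step sizes a_k = step_scale / (k + 1)."""
    y = y0
    for k in range(iterations):
        demand = draw_demand()           # generate a new sample path
        a_k = step_scale / (k + 1)       # sum a_k diverges, sum a_k^2 converges
        gradient = newsvendor_sample_gradient(y, demand, price, cost)
        y = max(0.0, y + a_k * gradient)  # keep the inventory level nonnegative
    return y


rng = random.Random(0)
y_final = sample_path_gradient_ascent(
    10.0, 4.0, 1.0, lambda: rng.uniform(0.0, 100.0), 20000, 50.0
)
```

For uniform demand on [0, 100] with p = 4 and c = 1, the newsvendor optimum solves P(D > y) = 1/4, so the iterates should settle near y = 75.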
• The MNL model was used to generate the sequence of utilities {Ut}. The systematic component of utility is broken down as
uj = y + aj – pj and u0 = y + a0
where y stands for consumer income, aj is a quality index and pj is the price for variant j.
• The examples have n = 10 variants, with linearly decreasing quality indices
aj = 12.25 – 0.5(j – 1) j = 1,…,10
and a0 = 4.0.
Numerical Experiments – Assumptions
• The procurement cost for a unit is set at cj = 3.0 for all j, and the price is set at pj = p ( j = 1,…,10), where p takes values in the range 3–9. This simplification facilitates comparison with other heuristic policies.
• The number of customers in the sequence, T, is a Poisson r.v. with mean 30, and each customer demands exactly one unit of product (Qt = 1 for all t).
Numerical Experiments – Assumptions (cont’d)
• The error terms $\varepsilon_t^j$ are iid, Gumbel distributed with parameter $\mu$ = 1.5, so the variance of $\varepsilon_t^j$ is 1.18.
Heuristic Policies
• Independent Newsvendor
• Pooled Newsvendor: The entire category is aggregated and viewed as a single variant. An aggregate newsvendor inventory level is then calculated for the category based on the probability the category will be chosen over the no-purchase option. This aggregate inventory is then allocated proportionally to the preference values vj.
• Naïve Gradient: At each iteration, we decrease the inventory level of all variants that are left in stock at the end of the sample path and increase the inventory level of all variants that are out of stock at the end of the sample path.
Price                     5.0      8.0
Sample Path Gradient      47.82    98.15
Independent Newsvendor    46.31    95.40
Pooled Newsvendor         47.35    96.92
Naïve Gradient            42.48    92.74
Comparison of Gross Profit
[Figure: Comparison of Stocking Levels, p = 8.0. Inventory level (0 to 9) by variant (1 to 10) for the Sample Path Gradient and Independent Newsvendor policies.]
Explanation
• The demand for a given variant will be higher than the independent newsvendor predicts because each variant receives some additional substitute demand from other variants that are out of stock. This effect increases the level of demand, which provides an incentive to increase inventory.
• The unit underage cost is lower than the independent newsvendor predicts because an underage in one variant does not always result in a lost sale. This reduces the effective underage cost, and creates an incentive to decrease inventory.
• The independent newsvendor is biased; it stocks too little of the popular variants and too much of the less popular variants.
[Figure: Comparison of Stocking Levels, p = 8.0. Inventory level (0 to 9) by variant (1 to 10) for the Sample Path Gradient and Pooled Newsvendor policies.]
[Figure: Comparison of Stocking Levels, p = 8.0. Inventory level (0 to 10) by variant (1 to 10) for the Sample Path Gradient and Naïve Gradient policies.]
Explanation
• When inventory of a particular variant is left over at the end of the sample path, both the naïve gradient and the sample path gradient method choose a gradient direction which decreases the inventory level of that variant.
• When a variant is sold out at the end of the sample path, the naïve gradient always increases the inventory level of that variant.
• However, the sample path gradient method does not always increase the inventory level, because an incremental unit may only result in a substitution rather than an additional sale.
• Therefore, the naïve gradient method overestimates the inventory level.
• We must estimate the systematic components of utility $u_j$ required for computing the choice probabilities of the MNL. Assume utilities are scaled so that $\mu = 1$.
• Linear in Attributes Model:
– Let each variant j = 1,…,n be associated with an m-vector of attribute values, $y_j = (y_{1j}, \dots, y_{mj})$.
– Let $\beta = (\beta_1, \dots, \beta_m)$ be a vector of coefficients which determine how attribute values are weighted to obtain $u_j$.
– Then $u_j = \beta^T y_j$, j = 1,…,n
Maximum Likelihood Estimation of the MNL
• We will assume that we have data describing T customer choice decisions. Let St denote the set of alternatives available to Customer t and define Customer t’s decision by the values
$z_{jt} = 1$ if alternative j is chosen by Customer t, and $z_{jt} = 0$ otherwise
• The data consists of the values of {zjt : t = 1,…,T, j = 1,…,n} together with the choice sets {St : t = 1,…,T}.
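• Maximize the log-likelihood function, $L(\beta)$, over $\beta$:
$L(\beta) = \sum_{t=1}^{T} \left( \sum_{j \in S_t} z_{jt}\, \beta^T y_j - \ln \sum_{i \in S_t} e^{\beta^T y_i} \right)$
Estimating $\beta$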
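The log-likelihood is concave in $\beta$, so a simple gradient ascent is enough for a small illustration. The sketch below is one possible estimator, not the authors' procedure; the fixed step size, the iteration count, and the toy data are assumptions. Each observation is a pair (chosen alternative, choice set).

```python
import math


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def choice_probs(beta, choice_set, attributes):
    """MNL probabilities with u_j = beta^T y_j and scale parameter mu = 1."""
    weights = {j: math.exp(dot(beta, attributes[j])) for j in choice_set}
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}


def log_likelihood(beta, observations, attributes):
    """Sum over customers of the log probability of the alternative actually chosen."""
    return sum(
        math.log(choice_probs(beta, choice_set, attributes)[chosen])
        for chosen, choice_set in observations
    )


def fit_mnl(observations, attributes, step=0.1, iterations=500):
    """Gradient ascent on the log-likelihood, starting from beta = 0."""
    m = len(next(iter(attributes.values())))
    beta = [0.0] * m
    for _ in range(iterations):
        grad = [0.0] * m
        for chosen, choice_set in observations:
            probs = choice_probs(beta, choice_set, attributes)
            for k in range(m):
                expected = sum(probs[j] * attributes[j][k] for j in choice_set)
                grad[k] += attributes[chosen][k] - expected
        beta = [b + step * g for b, g in zip(beta, grad)]
    return beta


# Toy data: one attribute; alternative 1 is chosen in 3 of 4 observations.
attributes = {1: [1.0], 2: [0.0]}
observations = [(1, [1, 2])] * 3 + [(2, [1, 2])]
beta_hat = fit_mnl(observations, attributes)
```

In this toy case the estimate can be checked by hand: the fitted probability of alternative 1 must equal its observed share of 3/4, which gives $\beta = \ln 3$.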
Conclusions
• Consumer choice does affect inventory management, and choice behavior has a significant impact on both stocking decisions and profits.
• In particular, under dynamic substitution one needs to stock relatively more of popular variants and relatively less of unpopular variants.
• Both static and dynamic substitution models suggest narrower assortments are optimal if there is a higher level of substitution among variants in a category.