
Retail Inventories and Consumer Choice

Siddharth Mahajan

Garrett J. van Ryzin

Overview

• Modeling Consumer Choice
  – Attribute Models and Utility Models
  – Binary Probit, Binary Logit, Multinomial Logit

• Inventory Management Under Static Choice
  – Smith and Agrawal Model
  – van Ryzin and Mahajan Models
    • Independent Population and Trend-Following Population

• Inventory Management Under Dynamic Choice
  – Noonan Model
  – Mahajan and van Ryzin Model
    • Sample Path Gradient Algorithm

• Maximum Likelihood Estimation of the MNL

Introduction

• Faced with limited choices, customers are often willing to substitute rather than go home empty handed.

• Customers are heterogeneous in taste, and they are often willing to pay a higher price for products with attributes closer to their desired attributes.

• Therefore, a retailer has an incentive to offer a broad variety of products to better cover the possible range of consumer tastes.

• There are few direct costs of variety for a retailer, but the indirect costs of stockouts and overstocking impose an implicit cost on variety. Trade-off: “breadth vs. depth” of assortment.

Choice Processes

• Given two alternatives, a choice corresponds to an expression of preference of one alternative over another.

• Let X be a set of alternatives and let ≺ be a binary relation on the set X. For alternatives x, y ∈ X, x ≺ y denotes that y is strictly preferred to x.

• Definition: A binary relation ≺ on a set X is called a preference relation if it is asymmetric and negatively transitive.

• Theorem: If X is a finite set, a binary relation ≺ is a preference relation if and only if there exists a function u: X → ℝ (called a utility function) such that

x ≺ y iff u(x) < u(y)

Two approaches for modeling choice:

1.) Construct preference relations directly

2.) Construct utilities and then apply utility maximization

Lexicographic Model

• This model implies attributes strictly dominate each other.

• Equivalent utility maximization model: There are n alternatives that have m attributes (1 = highest, m = lowest). Let ajk = 1 if alternative j possesses attribute k. Then a utility satisfying Theorem 1 is the binary number

Uj = aj1aj2…ajm

• A product is made up of binary attributes. The consumer ranks all attributes and then eliminates alternatives which do not possess the most important attribute. If more than one alternative remains, the next most important attribute is chosen as a criterion for elimination of alternatives, and so on.
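The binary-number utility can be checked against the elimination procedure with a small sketch. The three alternatives and their attribute vectors below are hypothetical, chosen only for illustration; attribute 1 is treated as the most significant bit.

```python
def lexicographic_utility(attrs):
    """Read the 0/1 attribute list a_j1 a_j2 ... a_jm as a binary number.

    Attribute 1 (the most important) is the most significant bit.
    """
    u = 0
    for a in attrs:
        u = 2 * u + a
    return u


def lexicographic_choice(alternatives):
    """Return the name of the alternative with the largest binary-number utility."""
    return max(alternatives, key=lambda name: lexicographic_utility(alternatives[name]))


# Hypothetical example: three alternatives, three binary attributes.
alternatives = {"A": [1, 0, 1], "B": [1, 1, 0], "C": [0, 1, 1]}
choice = lexicographic_choice(alternatives)
```

Eliminating by attribute 1 leaves A and B, and attribute 2 then leaves only B, which is also the alternative with the largest binary number (110).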

Address Model

• We have n alternatives, each of which has m attributes that take on real values. Therefore, alternatives can be represented as n points, z1, …, zn, in ℝ^m, which is called the attribute space.

• Equivalent utility maximization model:

Uj(y) = c – ρ(zj, y)

• Each consumer has an ideal point (address) y ∈ ℝ^m, and chooses the product closest to it in attribute space, where distance is defined by a metric ρ on ℝ^m × ℝ^m.

• These distances define a preference relation: i ≺ j if and only if ρ(zi, y) > ρ(zj, y).

• Randomization of the deterministic lexicographic rule.

• Process:

1.) Delete attributes common to all alternatives.

2.) Select one of the remaining attributes with probability proportional to a specified (deterministic) utility value.

3.) Eliminate alternatives not having the selected attribute.

4.) If a single alternative remains, choose it. If several alternatives remain, repeat the above process. If all remaining alternatives have the same attributes, choose one randomly.

Tversky Model

• Alternatives can be represented as elements z1, …, zn of ℝ^m, and each individual has an ideal point (address) y ∈ ℝ^m.

• Let g(y), y ∈ ℝ^m, denote the density function for y.

• Uj(y) = c – τ || zj – y ||, where c is the utility of y and τ represents the disutility due to deviations from y.

• The market space of variant j is given by

Mj = { y ∈ ℝ^m : Uj(y) ≥ Ui(y), i = 1,…,n}

• The probability that a consumer buys variant j is given by

$$P_j = \int_{M_j} g(y)\,dy$$

Randomized Address Model

• Let the n alternatives be denoted j = 1,…,n. A consumer associates a utility with alternative j, denoted Uj.

• This utility is decomposed into two parts: a representative component uj that is deterministic and a random component εj with mean zero:

Uj = uj + εj.

• The probability that an individual selects alternative j is

qj = P(Uj = max{Ui : i = 1,…,n})

• This probability depends on the joint distribution of the random components εj. Common versions are the binary probit, binary logit, and multinomial logit.

Random Utility Models

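• Two alternatives, and the error terms εj, j = 1, 2, are iid N(0, σ²).

• Probability that Variant 1 is chosen:

$$
\begin{aligned}
q_1 &= P(U_1 = \max\{U_1, U_2\}) \\
    &= P(U_1 \ge U_2) \\
    &= P(u_1 + \varepsilon_1 \ge u_2 + \varepsilon_2) \\
    &= P(\varepsilon_2 - \varepsilon_1 \le u_1 - u_2) \\
    &= \Phi\!\left(\frac{u_1 - u_2}{\sqrt{2}\,\sigma}\right)
\end{aligned}
$$

where Φ is the standard normal cdf.

• No closed form solution.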

Binary Probit
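Although the probit probability has no closed form, it is easy to evaluate numerically. The sketch below assumes the expression q1 = Φ((u1 – u2)/(√2 σ)) and computes Φ with the error function from the standard library; the function name is illustrative.

```python
import math


def probit_choice_prob(u1, u2, sigma):
    """P(variant 1 chosen) when the two error terms are iid N(0, sigma^2).

    The difference eps2 - eps1 is N(0, 2*sigma^2), so
    q1 = Phi((u1 - u2) / (sqrt(2) * sigma)).
    """
    z = (u1 - u2) / (math.sqrt(2.0) * sigma)
    # Standard normal cdf written in terms of the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


q1 = probit_choice_prob(2.0, 1.0, 1.0)
q2 = probit_choice_prob(1.0, 2.0, 1.0)
```

With equal systematic utilities the function returns 1/2, and q1 + q2 = 1 for any pair of utilities.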

• Two alternatives, and the error term ε = ε1 – ε2 has a logistic cumulative distribution, i.e.

$$F(x) = \frac{1}{1 + e^{-x/\mu}}$$

where μ > 0 is a scale parameter and –∞ < x < ∞.

• E(ε) = 0 and Var(ε) = (μ²π²)/3

• The logistic distribution provides a good approximation to the normal distribution, but with “fatter tails.”

Binary Logit

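• Probability that Variant 1 is chosen:

$$
\begin{aligned}
q_1 &= P(U_1 = \max\{U_1, U_2\}) \\
    &= P(U_1 \ge U_2) \\
    &= P(u_1 + \varepsilon_1 \ge u_2 + \varepsilon_2) \\
    &= P(\varepsilon_2 - \varepsilon_1 \le u_1 - u_2) \\
    &= \frac{e^{u_1/\mu}}{e^{u_1/\mu} + e^{u_2/\mu}}
\end{aligned}
$$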

• εj are iid random variables with a Gumbel (or double exponential) distribution, with cdf

$$F(x) = P(\varepsilon_j \le x) = \exp\!\left(-e^{-(x/\mu + \gamma)}\right)$$

where γ is Euler’s constant (= 0.5772…) and μ is a scale parameter, so that E[εj] = 0 and Var[εj] = (μ²π²)/6

• The probability that an alternative j is chosen from a set S ⊆ {1,2,…,n} that contains j is given by

$$P_S(j) = \frac{e^{u_j/\mu}}{\sum_{i \in S} e^{u_i/\mu}}$$

Multinomial Logit (MNL)
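A minimal sketch of the MNL choice probabilities, assuming the formula P_S(j) = exp(uj/μ) / Σ_{i∈S} exp(ui/μ). The utilities in the example are hypothetical. Subtracting the largest utility before exponentiating is a standard numerical safeguard and does not change the result.

```python
import math


def mnl_probs(u, S, mu=1.0):
    """Return {j: P_S(j)} for the alternatives in S.

    u  : dict mapping alternative -> systematic utility u_j
    S  : iterable of alternatives in the choice set
    mu : Gumbel scale parameter (mu > 0)
    """
    S = list(S)
    u_max = max(u[i] for i in S)
    weights = {i: math.exp((u[i] - u_max) / mu) for i in S}
    total = sum(weights.values())
    return {i: weights[i] / total for i in S}


# Hypothetical utilities for three alternatives.
u = {1: 1.0, 2: 0.0, 3: 0.0}
probs = mnl_probs(u, [1, 2, 3])
binary = mnl_probs(u, [1, 2])
```

With only two alternatives the result coincides with the binary logit probability 1 / (1 + exp(–(u1 – u2)/μ)).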

• For all S ⊆ N and T ⊆ N such that S ⊆ T, and for all i, j ∈ S,

$$\frac{P_S(i)}{P_S(j)} = \frac{P_T(i)}{P_T(j)}$$

• The ratio of choice probabilities for i and j is independent of the choice set containing these alternatives. This property is not realistic if the choice set contains alternatives that can be grouped such that alternatives within a group are more similar than alternatives outside the group, because adding a new alternative reduces the probability of choosing similar alternatives more than dissimilar alternatives.

Independence from Irrelevant Alternatives Property

• Suppose the individual selects to travel by car or by bus with equal probability. Let the set S = {car, bus}. Then

PS(car) = PS(bus) = 1/2

• Introduce a new bus. Let the set T denote {car, blue bus, red bus}. It can be shown that the MNL predicts

PT(car) = PT(blue bus) = PT(red bus) = 1/3

• However, a more natural outcome is

PT(car) = 1/2

PT(blue bus) = PT(red bus) = 1/4

Blue bus/red bus paradox
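The MNL prediction of 1/3 can be traced in one step. The worked equation below assumes that the equal split in S means the car and the bus share a common systematic utility u, and that the blue and red buses are assigned that same utility; μ is the Gumbel scale parameter.

```latex
P_T(\text{car}) = P_T(\text{blue bus}) = P_T(\text{red bus})
  = \frac{e^{u/\mu}}{e^{u/\mu} + e^{u/\mu} + e^{u/\mu}}
  = \frac{1}{3},
\qquad
\frac{P_T(\text{car})}{P_T(\text{blue bus})}
  = \frac{P_S(\text{car})}{P_S(\text{bus})} = 1 .
```

The ratio of car to bus probabilities is forced to stay at 1, so the car loses share to the new bus even though the two buses are near-perfect substitutes for each other.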

• The retailer’s decision problem is to determine which subset S from a universe of N possible variants should be included in the assortment and how much to stock of each variant.

• Static choice models: Customer choices do not depend on the transient inventory status of the variants in the assortment, but only on S.

– Examples: catalog retailer, shoe store

Inventory Management Under Static Choice

(A1) The initial choice of a variant is independent of the inventory status of the variants in S.

(A2) If a customer selects a variant in S and the store does not have it in stock, the customer does not undertake a second choice, and the sale is lost.

Assumptions of Static Choice Models

N The set {1,2,…,n} of all variants available in the market

S The subset of variants stocked by the retailer

xj Initial inventory of variant j (decision variable)

x Inventory level vector, {xj : j ∈ N}

Y Total number of customers, or store traffic (r.v.)

λ Mean of Y

Yj Demand for the jth variant (a function of S, r.v.)

qj(S) Probability that variant j is chosen by an arriving customer

Notation

• Demand Model:

– Each customer has an initial preference for variant j ∈ N with probability fj.

– If j ∈ S, the customer chooses to purchase variant j.

– If j ∉ S, the customer will substitute with another variant i ∈ N with probability sji.

– A customer chooses not to substitute with probability

$$L_j = 1 - \sum_{i \in N} s_{ji}$$

• Probability that variant j is chosen from a set S:

$$q_j(S) = f_j + \sum_{i \notin S,\ i \neq j} f_i\, s_{ij}$$

Smith and Agrawal Model
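The choice probability qj(S) = fj + Σ_{i∉S} fi sij can be computed directly from the first-choice probabilities and a substitution matrix. The sketch below is illustrative only: variants are indexed from 0, the preference vector is hypothetical, and the matrix assumes every unstocked first choice is redistributed evenly over the other variants with total probability 1 – L.

```python
def even_substitution_matrix(n, L):
    """s[i][j] = (1 - L) / (n - 1) for i != j, and 0 on the diagonal (an assumed form)."""
    return [[0.0 if i == j else (1.0 - L) / (n - 1) for j in range(n)] for i in range(n)]


def choice_probs(f, s, S):
    """q_j(S) = f_j + sum over unstocked i of f_i * s[i][j], for each stocked j."""
    unstocked = [i for i in range(len(f)) if i not in S]
    return {j: f[j] + sum(f[i] * s[i][j] for i in unstocked) for j in S}


# Hypothetical data: four variants, the first two are stocked.
f = [0.4, 0.3, 0.2, 0.1]
s = even_substitution_matrix(4, L=0.4)
q = choice_probs(f, s, S={0, 1})
```

Here each unstocked variant passes 0.2 of its first-choice demand to each stocked variant, so q0 = 0.46 and q1 = 0.36; the remaining 0.18 of customers choose nothing that is stocked.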

• Total store traffic: Y is modeled as a negative binomial r.v. Let ψ(y) represent the pmf for aggregate demand:

$$\psi(y) = \binom{k + y - 1}{k - 1}\, t^{k} (1 - t)^{y}$$

where y = 0,1,2,… and k and t are parameters of the negative binomial distribution, and

$$\lambda = \frac{k(1 - t)}{t}$$

• Given the total demand Y = y and the subset of variants S, Yj has a binomial distribution with parameters y and qj(S). Unconditioning over aggregate demand, Yj also has a negative binomial distribution.

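• Let ψj(yj, S) represent the pmf for the preference for each variant j given S. Then

$$\psi_j(y_j, S) = \binom{k + y_j - 1}{k - 1}\, r_j^{k} (1 - r_j)^{y_j}
\quad\text{where}\quad
r_j = \frac{t}{t + q_j(S)(1 - t)}$$

• Denote the cdf of Yj by

$$\Psi_j(y_j, S) = \sum_{d=0}^{y_j} \psi_j(d, S)$$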

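• Substitution Probability Models (matrices of sij, shown for n = 4 variants):

1.) Random Substitution

$$
\begin{pmatrix}
0 & \frac{1-L}{3} & \frac{1-L}{3} & \frac{1-L}{3} \\
\frac{1-L}{3} & 0 & \frac{1-L}{3} & \frac{1-L}{3} \\
\frac{1-L}{3} & \frac{1-L}{3} & 0 & \frac{1-L}{3} \\
\frac{1-L}{3} & \frac{1-L}{3} & \frac{1-L}{3} & 0
\end{pmatrix}
$$

2.) Adjacent Substitution

$$
\begin{pmatrix}
0 & 1-L & 0 & 0 \\
\frac{1-L}{2} & 0 & \frac{1-L}{2} & 0 \\
0 & \frac{1-L}{2} & 0 & \frac{1-L}{2} \\
0 & 0 & 1-L & 0
\end{pmatrix}
$$

3.) One Variant Substitution

$$
\begin{pmatrix}
0 & 0 & 1-L & 0 \\
0 & 0 & 1-L & 0 \\
0 & 0 & 1-L & 0 \\
0 & 0 & 1-L & 0
\end{pmatrix}
$$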

• Profit Function:

– Single-period, lost sales (newsvendor) model

– Notation

kj Fixed cost to stock variant j

mj Unit profit margin obtained from selling variant j

coj Overage cost for variant j

cuj Underage cost for variant j

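– The profit obtained from stocking xj units of variant j, denoted πj(S, xj), is

$$\pi_j(S, x_j) = m_j E[Y_j]
  - c_{oj} \sum_{d=0}^{x_j} (x_j - d)\,\psi_j(d, S)
  - c_{uj} \sum_{d=x_j}^{\infty} (d - x_j)\,\psi_j(d, S)
  - k_j$$

where ψj(d, S) is the pmf of Yj given S.

– The total profit, denoted π(S, x), is given by

$$\pi(S, x) = \sum_{j \in S} \pi_j(S, x_j)$$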

• Assortment Optimization

– Smith and Agrawal propose a non-linear integer programming formulation to solve for the optimal subset S* and the optimal stocking quantities x*.

– Binary variable zj indicates whether item j is included in S (zj = 1 if j ∈ S).

– Possible constraints on x and S:

$$\sum_{j \in S} t_j x_j z_j \le T_1, \qquad \sum_{j \in S} z_j \le T_2$$

– From the static choice assumptions (A1) and (A2), given S, the optimal stocking decision for each variant, xj*, is determined by solving a simple, single-item newsvendor problem:

$$\Psi_j(x_j^* - 1, S) < \frac{c_{uj}}{c_{uj} + c_{oj}} \le \Psi_j(x_j^*, S)$$

where Ψj is the cdf of Yj given S.

– Substituting these values into the profit function yields a discrete optimization problem with a nonlinear objective function.
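The single-item newsvendor step can be sketched as a search for the smallest stock level whose cdf reaches the critical ratio c_u / (c_u + c_o). The sketch assumes an integer negative binomial parameter k (so that a binomial coefficient can be used) and the thinned parameter r = t / (t + q(1 – t)); the numbers in the example are hypothetical.

```python
from math import comb


def nb_pmf(y, k, r):
    """Negative binomial pmf: C(k + y - 1, k - 1) * r^k * (1 - r)^y, with integer k >= 1."""
    return comb(k + y - 1, k - 1) * r**k * (1.0 - r) ** y


def optimal_stock(k, t, q, c_u, c_o):
    """Smallest x with cdf(x) >= c_u / (c_u + c_o) for demand Y_j given choice probability q."""
    r = t / (t + q * (1.0 - t))      # parameter of the thinned negative binomial
    target = c_u / (c_u + c_o)       # critical ratio, strictly below 1 when c_o > 0
    x, cdf = 0, nb_pmf(0, k, r)
    while cdf < target:
        x += 1
        cdf += nb_pmf(x, k, r)
    return x


# Hypothetical parameters: k = 1, t = 0.5, underage cost 4, overage cost 1.
x_full = optimal_stock(k=1, t=0.5, q=1.0, c_u=4.0, c_o=1.0)
x_half = optimal_stock(k=1, t=0.5, q=0.5, c_u=4.0, c_o=1.0)
```

A variant that captures a smaller share of store traffic (lower q) ends up with a lower stocking level, which is the channel through which the assortment S affects inventory.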

• Numerical Results:

1.) The optimal profit for the No Substitution model is always lower than the substitution model at any given level of variety.

2.) Random Substitution generally requires the largest optimal assortment and Adjacent Substitution a somewhat smaller one; under One Variant Substitution, profit is often maximized with only one variant (the “universal substitute”).

3.) As L increases, the number of variants in the optimal assortment increases.

• MNL Choice Model:

– Each customer associates a random utility Uj with the variants j ∈ S.

– A no-purchase option, denoted j = 0 with associated utility U0, is introduced.

– Customers choose the variant with the highest utility among the set {Uj : j ∈ S ∪ {0}}.

– Each variant has an identical retail price p and has an identical unit cost c. No fixed costs.

van Ryzin and Mahajan Models

• Given S, the probability that variant j is selected is given by

$$q_j(S) = P\big(U_j = \max\{U_i : i \in S \cup \{0\}\}\big) = \frac{v_j}{\sum_{i \in S} v_i + v_0}$$

where

$$v_j = e^{u_j/\mu}, \quad j \in S \cup \{0\}$$

• The quantities vj are called preferences, because the values are increasing in the systematic component of utility, uj.

• Each customer assigns utilities to the variants in the subset S based on independent samples of the MNL model.

• The utility Uij that customer i assigns to variant j is given by

Uij = uj + εij

where the {εij, i ≥ 1} are iid random variables.

• λ = mean number of customers arriving during the season

• The number of customers selecting variant j, Yj, is normally distributed with mean λ qj(S) and standard deviation σ(λ qj(S))^β, where σ > 0 and 0 ≤ β < 1.

Independent Population Model

• c = cost for each unit not sold (no salvage value)

• p – c = the opportunity cost for each unit short (loss in margin)

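• The maximum expected profit given S and v is:

$$\pi_I(S, v) = \sum_{j \in S} \max_{x_j \ge 0} E\big[p \min\{Y_j, x_j\} - c\,x_j\big]$$

• The optimal stocking level of variant j is given by

$$x_j^* = \lambda q_j(S) + z\,\sigma\,(\lambda q_j(S))^{\beta},
\quad\text{where}\quad
z = \Phi^{-1}\!\left(1 - \frac{c}{p}\right)$$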
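The stocking rule for the independent population model is a normal newsvendor quantile. A minimal sketch, assuming the rule xj* = λqj(S) + zσ(λqj(S))^β with z = Φ⁻¹(1 – c/p), and using hypothetical parameter values:

```python
from statistics import NormalDist


def stocking_level(lam, q, p, c, sigma, beta):
    """Optimal stock for one variant under normally distributed demand.

    lam : mean store traffic, q : choice probability q_j(S),
    p : selling price, c : unit cost (0 < c < p),
    sigma, beta : parameters of the standard deviation sigma * (lam * q) ** beta.
    """
    z = NormalDist().inv_cdf(1.0 - c / p)
    mean = lam * q
    return mean + z * sigma * mean**beta


# Hypothetical example: 30 customers on average, 20% of them choose this variant.
x_star = stocking_level(lam=30.0, q=0.2, p=8.0, c=3.0, sigma=1.0, beta=0.5)
x_even = stocking_level(lam=30.0, q=0.2, p=6.0, c=3.0, sigma=1.0, beta=0.5)
```

When p = 2c the critical ratio is 1/2, z = 0, and the rule stocks exactly the mean demand; a higher margin pushes the stock above the mean.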

• A fixed number of customers visits the store. Each customer has identical valuations of the utilities for the variants; these utilities are determined by a single sample of the MNL model.

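• {εij = ε1j, i > 1}

• The common utility values are not observable to the retailer prior to making assortment decisions.

• Demand for variant j, denoted Yj, is a scaled Bernoulli r.v.

$$P(Y_j = y) =
\begin{cases}
q_j(S) & y = \lambda \\
1 - q_j(S) & y = 0 \\
0 & \text{otherwise}
\end{cases}$$

where λ is the fixed number of customers.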

Trend-Following Population Model

• Expected profit from variant j given xj is E[p min{Yj, xj} – cxj].

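• The optimal expected profit given S and v is:

$$\pi_T(S, v) = \lambda \sum_{j \in S} \big(p\,q_j(S) - c\big)^{+}$$

• The optimal stocking level of variant j is given by

$$x_j^* =
\begin{cases}
\lambda & \text{if } p\,q_j(S) > c \\
0 & \text{if } p\,q_j(S) \le c
\end{cases}$$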

• We can formulate the optimal assortment selection problem as

$$\max_{S \subseteq N} \pi(S, v)$$

• Let the variants be ordered as v1 ≥ v2 ≥ … ≥ vn.

• Theorem: Let Ai = {1, 2, … , i} for 1 ≤ i ≤ n. Then for each of the assortment problems defined above, there exists an S* ∈ {A1, … , An} that maximizes store profits.

Choose the best i variants, where 1 ≤ i ≤ n.

Assortment Optimization
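Because an optimal assortment can be found among the nested sets A1, …, An, the search needs only n profit evaluations instead of 2^n. The sketch below illustrates this for the trend-following profit λ Σ_{j∈S} (p qj(S) – c)⁺ with qj(S) = vj / (v0 + Σ_{i∈S} vi). Variants are indexed from 0 and all numbers are hypothetical.

```python
def trend_following_profit(S, v, v0, p, c, lam):
    """lam * sum over j in S of max(p * q_j(S) - c, 0), with MNL choice probabilities."""
    denom = v0 + sum(v[j] for j in S)
    return lam * sum(max(p * v[j] / denom - c, 0.0) for j in S)


def best_nested_assortment(v, v0, p, c, lam):
    """Evaluate A_1, ..., A_n (the i most preferred variants) and return the best one."""
    order = sorted(range(len(v)), key=lambda j: -v[j])  # most preferred first
    best_i = max(
        range(1, len(v) + 1),
        key=lambda i: trend_following_profit(order[:i], v, v0, p, c, lam),
    )
    return order[:best_i]


# Hypothetical preferences, already sorted from most to least popular.
v = [4.0, 3.0, 2.0, 1.0]
low_price = best_nested_assortment(v, v0=2.0, p=10.0, c=2.0, lam=1.0)
high_price = best_nested_assortment(v, v0=2.0, p=100.0, c=2.0, lam=1.0)
```

In this example a low price favors stocking only the most popular variant, while a high price makes a broader assortment (the top three variants) more profitable.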

• Theorem: For all n > i ≥ 1,

a) π(Ai+1, v) > π(Ai, v) for sufficiently high selling price p

b) π(Ai+1, v) < π(Ai, v) for sufficiently low no-purchase preference v0

c) π(Ai+1, v) > π(Ai, v) for sufficiently high store volume λ (independent population case only)

• Dynamic choice models: Customer choices do depend on the transient inventory status of the variants in the assortment. A consumer may have a preferred variant, but upon finding that variant out of stock, he may decide to substitute a different variant.

– Examples: grocery items, soft drinks

Inventory Management Under Dynamic Choice

• Dynamic choice models are more realistic, but they are less tractable so we must rely more heavily on computational studies to understand the problem.

• Single period inventory model of a merchandise category made up of multiple product variants, each having a different unit selling price and unit procurement cost.

• Customers have a first choice and a second choice and demand is generated in two stages:

1.) Primary demand is realized and satisfied as much as possible with available inventory.

2.) Unfilled demand is converted to secondary demand for products based on deterministic proportions.

Noonan Model
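The two-stage demand process can be written out for a single demand realization. This is a sketch under the stated rules only: primary demand is filled first, and unfilled demand for variant i moves to variant j in the fixed proportion s[i][j]. The inventory, demand, and substitution numbers are hypothetical, and variants are indexed from 0.

```python
def two_stage_sales(x, demand, s):
    """Units sold of each variant after primary and secondary (substitute) demand."""
    n = len(x)
    # Stage 1: fill primary demand from available inventory.
    primary = [min(demand[j], x[j]) for j in range(n)]
    unfilled = [demand[j] - primary[j] for j in range(n)]
    leftover = [x[j] - primary[j] for j in range(n)]
    # Stage 2: unfilled demand spills over in deterministic proportions.
    secondary = [
        sum(unfilled[i] * s[i][j] for i in range(n) if i != j) for j in range(n)
    ]
    return [primary[j] + min(secondary[j], leftover[j]) for j in range(n)]


s = [[0.0, 0.5], [0.5, 0.0]]
# Variant 0 stocks out; half of its 4 unfilled units are served by variant 1.
sales_a = two_stage_sales(x=[10, 10], demand=[14, 6], s=s)
# Variant 0 stocks out and its substitution demand also exhausts variant 1.
sales_b = two_stage_sales(x=[10, 10], demand=[20, 6], s=s)
```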

N The set {1,2,…,n} of all variants available in the market

xj Initial inventory of variant j (decision variable)

x Inventory level vector, {xj : j ∈ N}

cj Unit procurement cost for variant j

pj Unit selling price for variant j

sij Fixed proportion of unfilled demand for variant i that is transferred to variant j after stockouts

Notation

R0 Region of demand where neither variant stocks out

RB Region of demand where both variants stock out on original demand

R1 Variant 1 stocks out on original demand, but the substitution demand is satisfied using Variant 2

R12 Variant 1 stocks out on original demand and its substitution demand is large enough to stock out Variant 2

R2 Variant 2 stocks out on original demand, but the substitution demand is satisfied using Variant 1

R21 Variant 2 stocks out on original demand and its substitution demand is large enough to stock out Variant 1

Definition of Demand Regions, n = 2

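• Let PA denote the probability that the demand vector lies in region RA (PB refers to the region in which both variants stock out on original demand). The expression for expected profit can be written by integrating over the different regions. Differentiating leads to the following first-order necessary conditions:

$$c_1 = p_1\big[P_1 + P_{12} + P_{21} + P_B\big] - s_{12}\,p_2\,P_1$$

$$c_2 = p_2\big[P_2 + P_{21} + P_{12} + P_B\big] - s_{21}\,p_1\,P_2$$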

Analysis of Demand Regions

• The merchandise category consists of n substitutable variants:

pj = selling price of variant j

cj = procurement cost of variant j

• Single-period (newsvendor-like) inventory model in which the retailer’s only decision is the vector of initial inventory levels x for each of the variants.

Mahajan and van Ryzin Model

• Dynamic choice version of the assortment problem using a general random utility model.

• T = number of customers on a sample path

• xt = (xt1, …, xtn) is the vector of inventory levels observed by customer t (t = 1,…,T)

• x1 = x, the initial stocking decision

• S(y) = {j ∈ N : yj > 0} ∪ {0} for any real inventory vector y

– S(y) is the set of in-stock variants together with the no-purchase option

– Customer t can only make a choice j ∈ S(xt)

– S(xt+1) ⊆ S(xt), since inventory levels are nonincreasing over time

Sample Path Analysis

• Ut = (Ut0, Ut1, … , Utn) is the vector of utilities assigned by customer t

• Based on xt and Ut, customer t makes the choice d(xt, Ut) that maximizes his utility:

$$d(x_t, U_t) = \arg\max_{j \in S(x_t)} \{U_t^{j}\}$$

• Let ω = {Ut : t = 1,…,T} denote a sample path from some probability space (Ω, ℱ, P).

– The retailer does not know the particular realization ω but does know the probability measure P, so we think of P as characterizing the retailer’s knowledge of future demand.

– The retailer’s objective is to choose x to maximize total expected profit.

ηj(x, ω) = the number of sales of variant j made on the sample path ω given initial inventory levels x

η(x, ω) = (η1(x, ω), …, ηn(x, ω))

π(x, ω) = sample path profit

π(x, ω) = pᵀη(x, ω) – cᵀx

• Retailer’s objective is to solve

$$\max_{x \ge 0} E[\pi(x, \omega)]$$

Total Profit
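For a given sample path, sales and profit follow mechanically from the choice rule: each customer picks the highest-utility option among the in-stock variants and the no-purchase option, and inventory is decremented on a purchase. The sketch below assumes unit demand per customer; the utilities, prices, and costs are hypothetical. Index 0 of each utility vector is the no-purchase option, and variants are numbered 1,…,n.

```python
def sample_path_sales(x, utilities):
    """Sales of each variant along one sample path.

    x         : initial inventory levels [x_1, ..., x_n] (integers)
    utilities : one list [U_0, U_1, ..., U_n] per customer, in arrival order
    """
    inventory = list(x)
    sales = [0] * len(x)
    for U in utilities:
        # Options available to this customer: no purchase (0) plus in-stock variants.
        available = [0] + [j for j in range(1, len(x) + 1) if inventory[j - 1] > 0]
        choice = max(available, key=lambda j: U[j])
        if choice != 0:
            inventory[choice - 1] -= 1
            sales[choice - 1] += 1
    return sales


def sample_path_profit(x, utilities, p, c):
    """Revenue from sales minus procurement cost of the initial inventory."""
    sales = sample_path_sales(x, utilities)
    revenue = sum(pj * sj for pj, sj in zip(p, sales))
    cost = sum(cj * xj for cj, xj in zip(c, x))
    return revenue - cost


# Three identical customers who prefer variant 1, then variant 2, then nothing.
path = [[0.0, 3.0, 2.0]] * 3
sales = sample_path_sales([1, 1], path)
profit = sample_path_profit([1, 1], path, p=[8.0, 8.0], c=[3.0, 3.0])
```

The second customer finds variant 1 out of stock and substitutes to variant 2; the third finds both out of stock and leaves without buying.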

• Inventory is viewed as a fluid. Each customer t requires a certain quantity of fluid, Qt, which could vary from customer to customer and could be non-integral.

• The customer drains the inventory of the most preferred fluid. If this fluid runs out, the customer drains the inventory of the second most preferred fluid and so on.

• This process continues until either the customer’s requirement is met or the inventory of all fluids valued higher than the no-purchase utility is exhausted.

A Fluid Model Relaxation

Sample Path Gradient Algorithm

• ∇π(x, ω) is the sample path gradient.

• Requires an initial starting inventory level y and a sequence of step sizes {ak} with the following properties:

$$\sum_{k=0}^{\infty} a_k = \infty \quad\text{and}\quad \sum_{k=0}^{\infty} a_k^{2} < \infty$$

• Algorithm:

1. Initialize: k = 0 and y0 = y

2. At iteration k

i) Generate a new sample path ωk

ii) Calculate ∇π(yk, ωk) and ak

iii) Update the starting inventory level for the next iteration, using the equation yk+1 = yk + ak ∇π(yk, ωk)

3. Set k = k + 1 and go to Step 2
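The iteration can be illustrated on the simplest possible case. The sketch below is not the multi-variant fluid gradient; it assumes a single variant, for which the sample path profit is p·min(D, x) – c·x and its gradient is p·1{D > x} – c. The step sizes ak = a/(k + 1) satisfy both step-size conditions. The uniform demand, the constant a = 25, and the projection onto x ≥ 0 are assumptions made for this example.

```python
import random


def sample_gradient(x, demand, p, c):
    """Derivative of p * min(demand, x) - c * x with respect to x on one sample path."""
    return (p if demand > x else 0.0) - c


def sample_path_gradient_ascent(y, p, c, draw_demand, iterations, a=25.0):
    """Stochastic gradient ascent with step sizes a_k = a / (k + 1)."""
    for k in range(iterations):
        demand = draw_demand()                     # i) generate a new sample path
        grad = sample_gradient(y, demand, p, c)    # ii) gradient and step size
        a_k = a / (k + 1)
        y = max(0.0, y + a_k * grad)               # iii) update, keeping y >= 0
    return y


rng = random.Random(0)
x_hat = sample_path_gradient_ascent(
    y=10.0, p=8.0, c=3.0,
    draw_demand=lambda: rng.uniform(0.0, 100.0),
    iterations=20000,
)
```

For demand uniform on [0, 100] the newsvendor optimum solves P(D > x) = c/p = 0.375, i.e. x = 62.5, and the iterates settle near that value.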

• The MNL model was used to generate the sequence of utilities {Ut}. The systematic component of utility is broken down as

uj = y + aj – pj and u0 = y + a0

where y stands for consumer income, aj is a quality index and pj is the price for variant j.

• The examples have n = 10 variants, with linearly decreasing quality indices

aj = 12.25 – 0.5(j – 1) j = 1,…,10

and a0 = 4.0.

Numerical Experiments – Assumptions

• The procurement cost for a unit is set at cj = 3.0 for all j, and the price is set at pj = p ( j = 1,…,10), where p takes values in the range 3–9. This simplification facilitates comparison with other heuristic policies.

• The number of customers in the sequence, T, is a Poisson r.v. with mean 30, and each customer demands exactly one unit of product (Qt = 1 for all t).

Numerical Experiments – Assumptions (cont’d)

• The error terms εtj are iid, Gumbel distributed with parameter μ = 1.5, so the variance of εtj is 1.18.

Heuristic Policies

• Independent Newsvendor

• Pooled Newsvendor: The entire category is aggregated and viewed as a single variant. An aggregate newsvendor inventory level is then calculated for the category based on the probability the category will be chosen over the no-purchase option. This aggregate inventory is then allocated proportionally to the preference values vj.

• Naïve Gradient: At each iteration, we decrease the inventory level of all variants that are left in stock at the end of the sample path and increase the inventory level of all variants that are out of stock at the end of the sample path.

Price                     5.0      8.0
Sample Path Gradient      47.82    98.15
Independent Newsvendor    46.31    95.40
Pooled Newsvendor         47.35    96.92
Naïve Gradient            42.48    92.74

Comparison of Gross Profit

[Figure: inventory level (vertical axis, 0–9) by variant (horizontal axis, 1–10) for the Sample Path Gradient and Independent Newsvendor policies]

Comparison of Stocking Levels, p = 8.0

Explanation

• The demand for a given variant will be higher than the independent newsvendor predicts because each variant receives some additional substitute demand from other variants that are out of stock. This effect increases the level of demand, which provides an incentive to increase inventory.

• The unit underage cost is lower than the independent newsvendor predicts because an underage in one variant does not always result in a lost sale. This reduces the effective underage cost, and creates an incentive to decrease inventory.

• The independent newsvendor is biased; it stocks too little of the popular variants and too much of the less popular variants.


[Figure: inventory level (vertical axis, 0–9) by variant (horizontal axis, 1–10), Pooled Newsvendor]

[Figure: inventory level (vertical axis, 0–10) by variant (horizontal axis, 1–10), Naïve Gradient]

Explanation

• When inventory of a particular variant is left over at the end of the sample path, both the naïve gradient and the sample path gradient method choose a gradient direction which decreases the inventory level of that variant.

• When a variant is sold out at the end of the sample path, the naïve gradient always increases the inventory level of that variant.

• However, the sample path gradient method does not always increase the inventory level, because an incremental unit may only result in a substitution rather than an additional sale.

• Therefore, the naïve gradient method overestimates the inventory level.

• We must estimate the systematic components of utility uj required for computing the choice probabilities of the MNL. Assume utilities are scaled so that μ = 1.

• Linear in Attributes Model:

– Let each variant j = 1,…, n be associated with an m-vector of attribute values, yj = (y1j,…, ymj).

– Let β = (β1, … , βm) be a vector of coefficients which determine how attribute values are weighted to obtain uj.

– Then uj = βᵀyj, j = 1,…, n

Maximum Likelihood Estimation of the MNL

• We will assume that we have data describing T customer choice decisions. Let St denote the set of alternatives available to Customer t and define Customer t’s decision by the values

zjt = 1 if alternative j is chosen by Customer t, and zjt = 0 otherwise

• The data consists of the values of {zjt : t = 1,…,T, j = 1,…,n} together with the choice sets {St : t = 1,…,T}.

• Maximize the log-likelihood function L(β) over β:

$$L(\beta) = \sum_{t=1}^{T} \sum_{j \in S_t} z_{jt} \left( \beta^{\top} y_j - \ln \sum_{i \in S_t} e^{\beta^{\top} y_i} \right)$$

Estimating β
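The log-likelihood is straightforward to evaluate for a candidate β, which is the building block any numerical maximizer would call repeatedly. The sketch below assumes μ = 1 and the linear-in-attributes utility uj = βᵀyj; it records each customer's decision as the index of the chosen alternative (the j with zjt = 1). The data are hypothetical.

```python
import math


def log_likelihood(beta, attrs, choice_sets, choices):
    """MNL log-likelihood of observed choices.

    beta        : list of coefficients
    attrs       : dict mapping alternative j -> attribute vector y_j
    choice_sets : list of choice sets S_t, one per customer
    choices     : list of chosen alternatives, choices[t] in choice_sets[t]
    """
    total = 0.0
    for S_t, j_t in zip(choice_sets, choices):
        u = {i: sum(b * y for b, y in zip(beta, attrs[i])) for i in S_t}
        u_max = max(u.values())
        # ln of the sum of exp(u_i), computed stably by factoring out the maximum.
        log_denom = u_max + math.log(sum(math.exp(ui - u_max) for ui in u.values()))
        total += u[j_t] - log_denom
    return total


# Hypothetical data: one attribute, two alternatives, both customers pick alternative 1.
attrs = {1: [1.0], 2: [0.0]}
choice_sets = [[1, 2], [1, 2]]
choices = [1, 1]
ll_zero = log_likelihood([0.0], attrs, choice_sets, choices)
ll_one = log_likelihood([1.0], attrs, choice_sets, choices)
```

With β = 0 every alternative is equally likely, so the log-likelihood equals –2 ln 2; raising the weight on the attribute that the chosen alternative possesses increases it.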

Conclusions

• Consumer choice does affect inventory management, and choice behavior has a significant impact on both stocking decisions and profits.

• In particular, under dynamic substitution one needs to stock relatively more of popular variants and relatively less of unpopular variants.

• Both static and dynamic substitution models suggest narrower assortments are optimal if there is a higher level of substitution among variants in a category.
