Product Offerings and Product Line Length Dynamics
Xing Li
Department of Economics, Stanford University, 579 Serra Mall, Stanford, CA 94305-6072. [email protected]
This paper provides a model that uses preference heterogeneity to rationalize the cross-sectional and intertem-
poral variation in a firm’s horizontal product differentiation strategies. Product-line dynamics arise from
shocks to preference heterogeneity. For example, in the potato chip category I study, consumer concerns
over fat levels in foods created two desirable alternatives (low fat and zero fat) for each flavor. On the
supply side, firms learn about these changing tastes and adapt product lines accordingly. For tractability,
the heterogeneity in preference is captured by the nesting parameter in an aggregate nested logit demand
model. I find greater preference heterogeneity for chips in smaller packages and for markets with more demo-
graphic diversity. The dominant firm in the market bases its decisions primarily on its past experience in the
market, with the latest preference shocks representing only 30% of the influence in product-line decisions.
Gross margins are increased by 5% if firms have perfect information about preference heterogeneity. Costs
for product line maintenance constitute about 2% of total revenue. Sunk costs incurred when expanding the
product line are estimated to be four times the per-product fixed cost, thereby limiting the flexibility of
product-line adjustment. The probability of line length adjustment grows from 70% to 90% under a smooth
cost structure.
1. Introduction
One of the central decisions firms make is the level of their product differentiation. Product
differentiation can be exercised in two dimensions: vertically or horizontally. Vertical differ-
entiation means providing an upgraded or downgraded model and charging a different price,
for example, Apple iPhone5S and iPhone5C. Within the same “model”, firms can further
differentiate horizontally by providing different features of colors, flavors, or designs. Apple
Li: Product Offerings and Product Line Length Dynamics2
offers iPhone5S with three choices of colors; Dannon produces 6oz yogurt in different flavors.
Both companies are doing horizontal product differentiation within the same model.
There are at least two differences between these two types of product differentiation strategies.
First, vertical differentiation is mainly driven by leaps in R&D success (e.g., Goettler and
Gordon 2011), whereas horizontal differentiation is largely initiated by consumer tastes (e.g.,
Draganska and Jain 2005) that vary across different markets and over time. Furthermore,
vertical differentiation usually involves higher fixed costs of adapting production processes,
whereas horizontal differentiation typically utilizes the same process as existing products.
For both reasons, horizontal differentiation is more flexible and therefore creates more vari-
ation in a firm’s decisions. This variation in horizontal differentiation is what motivates my
investigation into firms’ extensions and contractions of their product lines.
Acknowledging that consumers’ preference heterogeneity on the demand side is
the primary driver of horizontal product-line decisions, I propose the following framework to
rationalize both cross-sectional and intertemporal variation in product-line decisions. The extent
of preference heterogeneity varies across markets for reasons such as the concentration of
different demographic groups.1 Firms will provide a richer set of (horizontally differentiated)
products in markets with a more heterogeneous preference to serve a larger proportion of
consumers and make more profits. Within each market, firms can also adjust their product
lines over time in response to changes in preference heterogeneity.2 Firms are more likely to
expand (contract) their product lines when preference heterogeneity increases (decreases).
The main mechanism behind this argument is as follows. Preference heterogeneity affects the tradeoff between cannibalization and new sales creation when expanding the product line. For multi-product firms, a newly launched product brings in new consumers but also cannibalizes market share from existing products. When consumers’ preferences are quite homogeneous, it is difficult to generate extra sales to new consumers by launching a new product, and the cannibalization effect dominates the new sales creation effect. As a result, firms may not maintain a long product line. On the other hand, when consumers’ preferences are quite heterogeneous, the new sales creation effect dominates and expanding the product line is more profitable.
1 In this paper, “preference heterogeneity” is an aggregate statistic for both variety seeking within individuals and preference heterogeneity among individuals.
2 For example, manufacturers of potato chips will consider the immigration of Asian and Hispanic populations. They will also be aware of consumers’ growing concern for their own health.
To formalize and quantify the above argument, I model the demand side using Nested
Logit. Different products (features) from the same brand are clustered in one nest (line) in
the choice structure. The nesting parameter has the same behavioral interpretation as the
heterogeneity of preference, which is an aggregate measure of both variety seeking within an
individual and preference heterogeneity across individuals.3 When products are more nested
within the line, they are closer substitutes; consumers agree on the preference ranking among
these products and the preference is more homogenous.4 On the other hand, when products
are less nested within the line, they are weaker substitutes; consumers have more varied views
on their favorite products, and preference is more heterogeneous.
On the supply side, the multi-product firm chases time-varying preference heterogeneity by adjusting product line length,5 which is described by an in-market learning model based on Hitsch (2006). Firms decide on product differentiation based on their beliefs about preference heterogeneity, and realized market outcomes help them update these beliefs. On top of the standard in-market learning model, I allow preference heterogeneity to evolve over time, which explains the frequent adjustments made by an experienced firm that has operated in a mature market for years.6 The nested logit model has an elegant representation that is linear in the nesting parameter, which makes the supply-side learning framework tractable.
I apply the model to the potato chip market, where there is a leading firm (I call it Company A hereafter) with a market share of 60%. I find more heterogeneous preferences for potato chips in smaller packages, which implies that consumers are more willing to try new flavors when buying small-packaged potato chips. I also find that a more diverse population in the local market tends to exhibit more preference heterogeneity, which is confirmed by estimation with a series of measures of population diversity in that market, including the dispersion of the income and age distributions and the diversity of ethnic groups. On the supply side, Company A applies in-market learning about preference heterogeneity to adjust its product line length. I find that Company A bases its decisions primarily on past experience in the market, with the latest preference shocks representing 30% of the influence. The marginal cost of offering one additional product is estimated to be $3,560 per million households per quarter; the total maintenance cost is estimated to be 2% of total revenue for an average line with a length of 22. I also estimate the sunk cost incurred when expanding the product line to be three times the usual maintenance cost, which may limit the flexibility of product-line adjustments.
3 This modeling idea is rooted in the early motivation for the nested logit model, that is, specifying distributions of unobserved heterogeneity to capture substitution patterns.
4 Intuitively, consider two products that are equally favorable and split the market. Suppose the price of one product rises. When these two products are closer substitutes, the market share of the second product will increase more, which means more people agree on which product is their favorite.
5 In reality, the firm decides on the number and the contents of all products in the line simultaneously. However, I only model the line length decision in this paper for the following three reasons. First, taste for potato chips depends on numerous flavors, so it is difficult to write down a characteristic-based demand model to predict the demand for new products. Second, the action space becomes extremely large if I model the content of products as well; e.g., for an average product line in my sample with a length of 20, the action space is $\{\text{Offer}, \text{NotOffer}\}^{20}$. Third, the dynamics of the product line that I focus on in this paper require more tractability.
6 Technically speaking, by allowing time-evolving preference heterogeneity, the precision of the belief does not explode to infinity, so the belief will still change in response to market outcomes, and so will product line strategies.
Based on the estimates, two counterfactual exercises evaluate the firm’s optimal line length
decisions under different product-line specific policy experiments. In the first exercise, I
simulate the optimal line length decisions without the existence of extra cost for line length
expansion. This removes the restrictions on Company A’s flexibility in adjusting line length,
and the probability of line length changes grows from 70% to 90%. In the second exercise, I
consider the situation where Company A knows the precise value of preference heterogeneity
at the time of product line length decisions. She can make a better decision based on the
true value instead of some guess, and the gross margin is increased by 5%. A byproduct
of the second counterfactual is to test the hypothesis of learning or knowing preference
heterogeneity when making line length decisions. I construct a test based on gross margin, and
the test result supports the assumption of learning rather than knowing about heterogeneity.
Both simulations shed light on the firm’s potential gains from product-line related improvements.
The first one relates to a more efficient cost for product line maintenance, say, a more flexible
contract on shelf-space and a better distribution system, and the second one relates to a
better knowledge about consumers from sources other than market realizations.
This paper is related to several strands of literature. First, there is a growing literature on firms’ product differentiation and product-line design, both theoretical and empirical. Theoretical works have identified various factors that determine product differentiation, including communication cost to consumers (Villas-Boas 2004), vertical structure of distribution (Liu and Cui 2010), consumers’ deliberation on their preferences (Guo and Zhang 2012), variety preference and purchase cost (Bronnenberg 2014) and other rational interpretations, as well as behavioral explanations such as cognitive overload (Iyengar and Lepper 2000), articulated preference (Chernev 2003a,b), and contextual effects (Orhun 2009). However, there are relatively few empirical papers in this field. Among those, most works take the product line as given and evaluate its implications for consumer demand (Hui 2004, Draganska and Jain 2006). The only work I know of that explains product line design is Draganska et al. (2009), who offer a supply-side model of product provision for horizontally differentiated products. However, they restrict attention to a small subset of products and mainly attribute the cross-sectional variation in product lines to differences in the supply-side competitive environment. This paper complements their work by offering a supply-side model for both cross-sectional and inter-temporal variation in the length of the whole product line, where the driving force is preference heterogeneity on the demand side.
Second, this paper contributes to the in-market learning literature by proposing a new
learning object: preference heterogeneity. Numerous papers study demand-side learning
about the quality of new products (Roberts and Urban 1988, Erdem and Keane 1996, Ching
et al. 2013, Lin et al. 2014). On the supply side, Urban and Katz (1983) and Urban and Hauser
(1993) address firms’ market experimentation in designing new products. Hitsch (2006) studies firms’ learning about the quality of new products when making exit decisions. Another series of papers (Crawford and Shum 2005, Narayanan and Manchanda 2009) considers physicians and patients learning about the effectiveness of drugs when making prescription decisions. This paper differs from those empirical learning papers in two respects. First, the learning object is preference heterogeneity rather than mean preference, as in most of the existing literature. Second, the learning object evolves over time, whereas in the standard learning framework the learning object is constant.7
Third, this paper is also related to research on variety seeking. Models of variety seeking find negative state dependence on past choices (Chintagunta 1998, 1999, Seetharaman et al. 2005, Dubé et al. 2009, 2010). This paper uses a different but related metric for variety seeking at the aggregate level. Finally, the supply side is modeled similarly to research on empirical entry and product positioning.8 Early research on empirical entry infers firms’ profitability from their entry decisions (Reiss and Spiller 1989, Bresnahan and Reiss 1990, 1991, Berry 1992). Later research treats marketing-mix variables other than price as endogenous (Berry and Waldfogel 2001, Mazzeo 2002, Berry et al. 2004, Seim 2006, Crawford et al. 2011, Ryan and Tucker 2012). This paper contributes to this strand of literature by proposing a tractable model of product line length dynamics for multi-product firms.
7 Lovett et al. (2009) also model time-evolving learning parameters.
The rest of the paper is organized as follows. Section 2 introduces the data and some
reduced-form evidence on line length dynamics. Section 3 provides an empirical model to
quantify the firm’s optimal line length decisions driven by preference heterogeneity. Section 4
describes the full specification and identification. Section 5 shows the results, and Section 6
concludes.
2. Product Offerings in US Potato Chip Market
In this section, I provide an overview of the potato chip industry and a description of the IRI
Academic Dataset (Bronnenberg et al. 2008) that I use.9 The last part of the section shows
some reduced-form evidence on product line length dynamics.
2.1. The Potato Chip Market
Potato chips can be found in most American households. An average US household will
spend $80 a year on salty snacks. Potato chips have a dollar share of 30% in the industry of
salty snacks, which means an average household will spend around $24 each year on potato
chips (First-Research 2011).
8 Dubé et al. (2005) provide an excellent summary of these papers.
9 All estimates and analyses in this paper based on Information Resources Inc. data are by the author and not by Information Resources Inc.
Chip manufacturers anticipate and respond to changes in consumer preferences. First of all, in the potato chip industry, the ability to be innovative and differentiate a product is the key to competition. As a result, manufacturers offer different choices of potato chips with different flavors, fat contents, and cut types. Furthermore, consumers’ tastes vary by region
and over time. For example, Joon (2013) states that “consumers in the Midwestern region
prefer thick cuts and consumers in the southwestern states prefer bold and spicy flavors.” At
the same time, many exogenous factors drive the evolution of tastes over time. Population
migration is one such factor (Bronnenberg et al. 2012). Manufacturers are creating new
spicy flavors catering to a growing Hispanic and Asian population (First-Research 2011).
Consumers’ awareness of the health cost of eating potato chips high in trans fat and salt is
another factor. To capitalize on this shift, leading manufacturers have introduced a number
of new products with reduced fat and low salt content (Joon 2013). A third factor is the
change in taste for (new) flavors. Firms can elicit this change by inviting consumers to submit
their newly designed flavors.10 With the existence of diehard fans of classically flavored
potato chips, the regional and temporal variations of tastes imply changes in preference
heterogeneity and have corresponding implications on product differentiation decisions.
A second feature of the potato chip industry is that it is highly concentrated, with a leading
player (Company A) having a market share of 60%. The second largest player has a market
share of only 5.2% (Joon 2013). Company A does not worry too much about potential
entrants. First, consumers have strong brand preference in picking potato chips. They are
willing to pay extra for branded chips. In addition, operating firms in this industry need to
have good relations with upstream suppliers and downstream retailers. They use long-term
contracts to hedge against the volatile prices for potatoes, sugars, oils, and fats from their
suppliers, and they are competing for the best shelf spaces in grocery stores.
10 For example, Frito Lay holds a contest called “Do us a flavor” each year, inviting consumers to submit their
newly designed flavors, and launches the winners. The winning flavor is awarded 1 million dollars.
2.2. Data
I use the IRI Academic Dataset from 2001 to 2007 to estimate the model.11 The IRI Academic
Dataset provides scanned sales data from a sample of grocery stores at the UPC-store-week
level across 50 IRI markets. I restrict the analysis in this paper to the Salty Snack - Potato
Chip industry and to packages of 8-13 servings. For potato chips offered
by firms other than the leading firm (Company A), I aggregate to the market-quarter level.
For potato chips by Company A, I aggregate to the feature-market-quarter level,
where one feature is defined as the triplet of flavor-cut-fat.12
Company A has wide variation in the length of its product line, defined as the count of features sold in one market-quarter. Table 1 and Figure 1 present the distribution of line length. The average line length is 22.09 with a standard deviation of 3.86. The shortest line is in Raleigh/Durham in 2001q1, with a length of 8, whereas the longest line is in Chicago in 2002q2, with 30 different features. Variation in line length derives from two sources, cross-sectional and inter-temporal, and line length varies widely along both. Cross-sectionally, Pittsfield has the shortest line, with an average length of 16.89, whereas Houston has the longest line, with an average length of 25.05. Line length also changes over time, as shown in Table 1 and the right panel of Figure 1. Line length is quite sticky: in about 30% of cases there is no change, and in 85% of cases the change is within 2 features. Still, there are cases where Company A is quite aggressive in adjusting line length.
11 Although the IRI Academic Dataset is available from 2001 to 2011, I only make use of seven years for the following reasons. First, the 2008 financial crisis heavily drove up prices of potato chips, which makes pricing decisions non-trivial and complicates the model. Second, Company A did a national launch of zero-trans fat in 2008. The reasons for the timing and scale of such a big event are beyond the scope of this project. Moreover, the concurrence of the two events further complicates the analysis.
12 Please refer to the online appendix for justification of the data aggregation process.
I supplement the IRI dataset by merging it with the IPUMS CPS data to obtain demographics. Among the 302 MSAs (Metropolitan Statistical Areas) in the CPS, I have identified 98 that can be merged with IRI markets. In terms of population, those matched MSAs constitute half of the total nationwide population. I proxy the total market size by the number of households in 2007, whereas other demographics that may correlate with preference heterogeneity are calculated at the quarter-by-city level.
2.3. Reduced-form Evidence on Dynamic Product Offering
Before going into the structural estimation, I will show some reduced-form evidence on firms’
changing differentiation decisions based on market outcomes over time. When preference is
homogenous, consumers tend to agree on the preference ranking of all features within a line,
and in-line market shares for features are concentrated. In the extreme case, consumers fully
agree on the preference ranking, and the in-line market share is 1 for the most preferred
feature and 0 for others. In these cases, the model predicts that firms will contract the
product line. On the contrary, when preference heterogeneity becomes high, consumers have
various opinions about the most favorable feature, and in-line market shares become less
concentrated. In this case, the model predicts that firms will expand the product line.
To illustrate the argument above, I run the following regression:
$$\text{LineLength}_{mt} = \alpha_0 + \alpha_1 \text{HHI}_{m,t-1} + \alpha_2 \text{LineLength}_{m,t-1} + c_m + d_t + \varepsilon_{mt}$$
where $m$ indexes market, $t$ denotes quarter, $\text{LineLength}_{mt}$ is the length of the product line, and $\text{HHI}_{mt}$ is the Herfindahl index for in-line market shares (i.e., $\text{HHI}_{mt} = \sum_f s_{f|l,mt}^2$, where $s_{f|l,mt}$ is the in-line market share of feature $f$ in market-time $mt$). $c_m$ and $d_t$ are market and time fixed effects to control for geographic unobservables and seasonal effects.
Baseline regression confirms the model prediction (Table 2, Column 1). The higher the market
concentration, the shorter the product line in response. A one standard-deviation change in
HHI (0.03) will lead to a change in line length of 0.02.
One challenge for the interpretation of the estimate is that the measure of HHI is mechanically decreasing in line length, so the estimated correlation may be artifactual.13 The worry is partly valid, as shown in the Online Appendix, so I use another measure of market concentration, the standard deviation of log in-line market shares, defined as
$$\text{StdLnShareInLine}_{mt} = \text{Std}\left(\ln s_{f|l,mt}\right)$$
which is not mechanically correlated with line length (Online Appendix). When in-line mar-
ket shares are more concentrated, the standard deviation is high. Regression results still
support our conjecture. A one standard-deviation increase in this concentration measure will
lead to a 0.36 increase in line length (Table 2, Column 2).
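The two concentration measures used in these regressions can be illustrated with a minimal sketch. The function names and the sales figures are hypothetical, and the use of the population (rather than sample) standard deviation is an assumption, since the text does not specify which one is used.

```python
import math

def in_line_shares(sales):
    """Convert feature-level sales within one line into in-line market shares."""
    total = sum(sales)
    return [x / total for x in sales]

def hhi(shares):
    """Herfindahl index: the sum of squared in-line shares."""
    return sum(s * s for s in shares)

def std_log_share(shares):
    """Population standard deviation of log in-line shares (assumed convention)."""
    logs = [math.log(s) for s in shares]
    mean = sum(logs) / len(logs)
    return math.sqrt(sum((x - mean) ** 2 for x in logs) / len(logs))

# Hypothetical sales for two lines of equal length (illustrative numbers only).
even = in_line_shares([25, 25, 25, 25])
skewed = in_line_shares([70, 10, 10, 10])

# A longer line with equal shares: HHI falls to 1/n, the log-share spread stays at 0.
long_even = in_line_shares([1] * 10)
```

With equal shares, HHI equals 1/n and therefore falls as the line gets longer, while the spread of log shares stays at zero. This is one way to see why the second measure is less exposed to the mechanical link with line length.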
One alternative interpretation of the above findings is that firms will automatically with-
draw losing features that are unpopular. To deal with this challenge, I change the dependent
variable to be the indicator of line expansion. Regression results also confirm the initial the-
ory proposed above (Table 2, Column 3,4). The higher the market concentration, the less
likely the line gets expanded. A one standard-deviation decrease in HHI will lead to a higher
chance of line expansion by 4.74% (Table 2, Column 3) and 4.60% (Table 2, Column 4).
Compared to an average chance of line expansion of 34%, this increase is economically large.
A final caveat is that all this reduced-form evidence is correlational, not causal. The
complete model allows firms to adjust their line length based on all past market realizations
rather than just the last one. To quantify the above mechanism, I estimate a structural
model with a richer specification.
13 I regress line length on one-period lagged HHI. The mechanics in calculating HHI will contaminate the inference
only when firms have some inertia to adjust line length.
3. A Model of Product Line Length Dynamics
In this section, I propose a model that is structural in both demand and supply to capture
the effect of preference heterogeneity on the tradeoff between cannibalization and new sales
creation when firms are making product line length decisions. For simplicity, I assume that in
each market m, there is a separate monopolist. Within each market, the monopolist provides
a line of $n_t$ products indexed by $j \in \{1, 2, \ldots, n_t\}$ to compete with one single outside good
$j = 0$ in each period $t$.
3.1. Demand Side
For each market $m$ and period $t$ (suppressed temporarily), the utility for consumer $i$ from consuming Company A chip $j \in \{1, 2, \ldots, n\}$ and the outside good $j = 0$ is
$$u_{ij} = (a + \varepsilon_i) + (\bar{c}_j + \theta \varepsilon_{ij}) - \alpha p_j = \delta_j + (\varepsilon_i + \theta \varepsilon_{ij})$$
$$u_{i0} = \varepsilon_{i0}$$
where $(a + \varepsilon_i)$ is the consumer’s brand preference for Company A, which includes the average level $a$ and consumer heterogeneity $\varepsilon_i$; $(\bar{c}_j + \theta \varepsilon_{ij})$ is consumer $i$’s utility for product $j$, which also includes the mean value $\bar{c}_j$ and the consumer’s heterogeneity $\theta \varepsilon_{ij}$; and $p_j$ is the price of product $j$. After some rearrangement, the utility for consumer $i$ consuming product $j$ equals the mean utility level $\delta_j = a + \bar{c}_j - \alpha p_j$ plus the consumer’s heterogeneity $(\varepsilon_i + \theta \varepsilon_{ij})$. Following Berry (1994) and Cardell (1997), both $\varepsilon_{ij}$ and $(\varepsilon_i + \theta \varepsilon_{ij})$ follow i.i.d. type I extreme value distributions.
From the representation $c_{ij} = \bar{c}_j + \theta \varepsilon_{ij}$, the value of $\theta$ measures consumers’ preference heterogeneity over product $j$.14 When $\theta$ is high, $c_{ij}$ varies a lot across individuals $i$, and preferences are heterogeneous. On the other hand, a lower $\theta$ implies more homogeneous preferences.
14 Here I am measuring the average level of preference heterogeneity across all products. Ideally we can assign $\theta_j$ for
each product $j$, but the data lack statistical power to identify all $\theta_j$s.
The nesting parameter is an aggregate statistic of both individual-level variety seeking and cross-individual preference heterogeneity. If we think about the repeated purchases of one individual as different purchase occasions, variety-seeking behavior can be rationalized as low correlation of individual-specific demand shocks across products, which is captured by a high $\theta$ in the current model. With market-level data, I cannot distinguish between variety seeking within individuals and preference heterogeneity among individuals. But these two channels should have similar implications for product assortment decisions, as presented later.
The Nested Logit model proposed here can be easily estimated by linear GMM. Within each market $m$,
$$\ln s_{1t} - \ln s_{0t} = \delta_{jt} + \theta_t \left(-\ln s_{j|l,t}\right) = a + c_j - \alpha p_{jt} + \theta_t \left(-\ln s_{j|l,t}\right) + \xi_{jt} \quad (1)$$
where $s_{1t}$ is the market share of all Company A products, $s_{0t}$ is the market share of all non-Company-A products, and $s_{j|l,t}$ is the in-line market share, which equals $s_{jt} / (1 - s_{0t})$. Following the standard model, I allow the taste for product $j$ to vary over time, with $\bar{c}_{jt} = c_j + \xi_{jt}$, where $c_j$ is the product fixed effect and $\xi_{jt}$ is the unobserved demand shock, which is distributed as $N(0, \tau_\xi^{-1})$.
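The linear relation in equation (1) can be checked numerically with a minimal sketch. It assumes a single nest (the product line) and an outside good with utility normalized to zero; the function names and the mean utilities are hypothetical. The mean utilities are treated as known and noise-free here, whereas in the paper they contain the unobserved shock and the in-line share is endogenous, which is why estimation relies on GMM rather than on the simple ratio below.

```python
import math

def nested_logit_shares(delta, theta):
    """Shares for one nest (the line) plus an outside good with utility 0.

    Returns (share of the whole line, outside share, in-line shares).
    """
    scaled = [d / theta for d in delta]
    m = max(scaled)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(x - m) for x in scaled))
    inclusive = theta * log_sum                # inclusive value of the line
    s_line = 1.0 / (1.0 + math.exp(-inclusive))
    s_in_line = [math.exp(x - log_sum) for x in scaled]
    return s_line, 1.0 - s_line, s_in_line

def recover_theta(delta, s_line, s_out, s_in_line):
    """Least-squares slope of (ln s_line - ln s_out - delta_j) on (-ln s_in_line_j)."""
    y = [math.log(s_line) - math.log(s_out) - d for d in delta]
    x = [-math.log(s) for s in s_in_line]
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

# Hypothetical mean utilities and nesting parameter (illustrative only).
delta = [-1.0, -1.5, -0.5]
theta = 0.6
s_line, s_out, s_in_line = nested_logit_shares(delta, theta)
theta_hat = recover_theta(delta, s_line, s_out, s_in_line)
```

Without demand shocks the relation holds exactly for every product, so the recovered slope coincides with the nesting parameter used to generate the shares.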
3.2. Static Profit when $\theta$ is Known
I assume that at the time of the product line length decision, Company A does not know the precise value of the mean utility $\delta_{jt}$, so she takes expectations over some distribution $F_\delta(\cdot)$. There are three reasons for this. First, product line length decisions are made prior to the realization of demand, so Company A does not know the demand shock $\xi_{jt}$. Second, retailers can observe the demand shock and adjust the retail price $p_{jt}$, so $p_{jt}$ is also unknown before market realization. Third, when Company A launches a new product, the value of $c_j$ is also unknown to her.15 By making these assumptions, I abstract away from the identity of each product in the line and focus mainly on the length of the product line. The total market share for Company A from offering a product line with length $n$ follows the nested logit representation with
$$s(n, \theta) = E_\delta\left(\frac{\exp(I)}{1 + \exp(I)}\right)$$
where
$$I = \theta \cdot \ln \sum_{j=1}^{n} \exp\left(\frac{\delta_j}{\theta}\right)$$
The total market share $s(n, \theta)$ is increasing in $n$, increasing in $\theta$, and super-modular in $n$ and $\theta$ for most distributions $F_\delta$.16 Super-modularity implies that, when expanding the line, the marginal gain in total share is larger when preferences are more heterogeneous.17
Formally, let $C(n, l)$ denote the cost of launching a line with length $n$ when the line length in the last period is $l$. A myopic firm will choose $n$ to maximize
$$w \cdot M \cdot s(n, \theta) - C(n, l)$$
where $w$ is the manufacturer margin and $M$ is the market size. Let $n^*(\theta, m)$ be the optimal line length choice; super-modularity means $n^*$ is increasing in $\theta_t$.
15 This also assumes away product launches in the vertical sense or a mass market strategy (Johnson and Myatt 2006). When a company decides to launch a new product, she can either play a mass market strategy so that the new product is attractive to all consumers (i.e., with a high value of $\delta$) or a niche market strategy in which the feature is attractive to a set of consumers (i.e., similar $\delta$). In the potato chip industry, it is quite difficult to launch a potato chip that is favorable to all consumers and play a mass market strategy.
16 Super-modular means $s(n+1, \theta) - s(n, \theta)$ is increasing in $\theta$. Proofs of these properties are provided in the Online Appendix.
17 This is consistent with the standard interpretation of price elasticity in the nested logit model. When products are more nested within the line, the price elasticity is higher within nests than between nests. Lowering the price of one feature will have a larger “cannibalization effect” that consumes the market share of other products within the line than “new sales creation effect” that increases the total share of all products in the line. The same logic applies to the strategy of line expansion. The “cannibalization effect” of expanding the line dominates the “business stealing effect” when features are more nested, and in this case firms are less likely to expand the line.
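The static problem of this subsection can be sketched by Monte Carlo. Everything specific below is an assumption made for illustration: mean utilities are drawn i.i.d. from a normal distribution (the text leaves the distribution general), the cost function with a per-product fixed cost and a sunk cost per added product is hypothetical, and margin and market size are normalized to one.

```python
import math
import random

def inclusive_value(delta, theta):
    """theta * log-sum-exp of delta / theta, computed stably."""
    scaled = [d / theta for d in delta]
    m = max(scaled)
    return theta * (m + math.log(sum(math.exp(x - m) for x in scaled)))

def expected_share(n, theta, draws):
    """Monte Carlo estimate of the line's total share, using the first n utilities of each draw."""
    total = 0.0
    for row in draws:
        i_val = inclusive_value(row[:n], theta)
        total += 1.0 / (1.0 + math.exp(-i_val))
    return total / len(draws)

def example_cost(n, last_len, fixed=0.01, sunk=0.03):
    """Hypothetical cost: fixed cost per product plus a sunk cost per newly added product."""
    return fixed * n + sunk * max(n - last_len, 0)

def myopic_line_length(theta, last_len, draws, margin, market_size, cost, max_len):
    """Line length that maximizes margin * market size * share minus cost."""
    def profit(n):
        return margin * market_size * expected_share(n, theta, draws) - cost(n, last_len)
    return max(range(1, max_len + 1), key=profit)

rng = random.Random(0)
MAX_LEN = 8
# Assumption: mean utilities are i.i.d. N(-2, 1); common draws are reused across n and theta.
DRAWS = [[rng.gauss(-2.0, 1.0) for _ in range(MAX_LEN)] for _ in range(500)]

n_low = myopic_line_length(0.3, 4, DRAWS, 1.0, 1.0, example_cost, MAX_LEN)
n_high = myopic_line_length(0.8, 4, DRAWS, 1.0, 1.0, example_cost, MAX_LEN)
```

Reusing the same draws across line lengths and nesting parameters keeps the comparison free of simulation noise, so the monotonicity of the share in both arguments shows up directly. Whether the chosen length rises with the nesting parameter in a given run depends on the assumed cost numbers.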
3.3. Dynamic Learning on Time-evolving $\theta$
As mentioned earlier, preference heterogeneity evolves over time due to many exogenous
factors including population migration, health concerns, as well as evolving tastes for new
flavors. I further assume that firms do not know the true value of preference heterogeneity
when making line length decisions. Instead, they have some beliefs on this value and update
their beliefs based on market realizations.18
3.3.1. Learning from Market Realizations Suppose at the beginning of period $t$, Company A has a prior belief on $\theta_t$, which is modeled as a truncated normal with mean $\mu_t$ and precision $\tau_t$, truncated to the unit interval $(0, 1)$ and denoted as $TN(\mu_t, \tau_t^{-1})$. After the market is realized, the market shares of all products are observed, and Company A can observe one signal from each product $j$, as derived from (1):
$$\varphi_{jt} = \ln s_{1t} - \ln s_{0t} - a - c_j + \alpha p_{jt} = \theta_t \left(-\ln s_{j|l,t}\right) + \xi_{jt}$$
Aggregating the signals from all products about the same $\theta_t$ gives an aggregate signal19
$$\psi_t = \frac{\sum_j \left(-\ln s_{j|l,t}\right) \cdot \varphi_{jt}}{\sum_j \ln^2 s_{j|l,t}} = \theta_t + \frac{\sum_j \left(-\ln s_{j|l,t}\right) \cdot \xi_{jt}}{\sum_j \ln^2 s_{j|l,t}}$$
with precision
$$h_t = \left(\sum_j \ln^2 s_{j|l,t}\right) \cdot \tau_\xi$$
A nice property of the truncated normal belief is that it is also a conjugate prior for a normal data-generating process, as shown in the next theorem.
18 There is no direct test of the informational assumption that firms do not know the exact value of preference heterogeneity because the stationary learning model (as described below in this paper) and the complete information model are not nested within each other. However, I will show some indirect test results based on simulation in a later section.
19 For convenience, the notation $\ln^2 s_{j|l,t}$ means $\left(\ln s_{j|l,t}\right)^2$.
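Theorem 1. Suppose the prior is truncated normal
\[\theta_t \sim TN\left(\mu_t, \sigma_t^2 = \lambda_t^{-1}\right)\]
and an unbounded signal is observed with value \(\psi_t\) and precision \(h_t\). Then the posterior belief is also truncated normal,
\[\theta_t \mid \psi_t, h_t \sim TN\left(\mu_t', (\sigma_t')^2 = (\lambda_t')^{-1}\right)\]
with
\[\mu_t' = \frac{\lambda_t}{\lambda_t + h_t} \cdot \mu_t + \frac{h_t}{\lambda_t + h_t} \cdot \psi_t \tag{2}\]
\[\lambda_t' = \lambda_t + h_t \tag{3}\]
Proof is shown in Appendix ??.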
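A minimal sketch of the update in (2) and (3), with made-up numbers. The function name and inputs are illustrative assumptions. Note that `mu` and `lam` are the location and precision parameters of the underlying normal, which is how the theorem states the result; they are not the moments of the truncated distribution.

```python
def update_belief(mu, lam, psi, h):
    """Posterior location and precision from equations (2) and (3)."""
    lam_post = lam + h                                    # equation (3)
    mu_post = (lam / lam_post) * mu + (h / lam_post) * psi  # equation (2)
    return mu_post, lam_post

# Illustrative values only: prior centred at 0.40, signal at 0.50.
mu_post, lam_post = update_belief(mu=0.40, lam=25.0, psi=0.50, h=10.0)
```

The posterior location is a precision-weighted average, so it always lies between the prior location and the signal.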
3.3.2. Evolution of \(\theta_t\) The next step is to model the time evolution of preference heterogeneity \(\theta_t\). The reason for allowing \(\theta_t\) to evolve over time is twofold. First, in the potato chip industry, we do observe that preference heterogeneity changes over time and that chip manufacturers respond by adjusting their product line strategies. Second, from a modeling perspective, if preference heterogeneity were constant over time, Company A, as an experienced firm operating in a mature market, would be sophisticated enough to know its true value, and no intertemporal variation in the product line would be observed. The large intertemporal variation in product line length motivates the assumption of time-evolving preference heterogeneity.
If \(\theta_t\) were not truncated, a natural candidate model would be a random walk,
\[\theta_{t+1} = \theta_t + \sigma_\theta \nu_t\]
where \(\nu_t \sim N(0,1)\) is the evolution error, or equivalently,
\[\theta_{t+1} \mid \theta_t \sim N\left(\theta_t, \sigma_\theta^2\right)\]
In the truncated case, I propose the following “quasi” random walk
\[\theta_{t+1} \mid \theta_t \sim f\left(\cdot \mid \theta_t; \sigma_\theta\right)\]
which is similar to the random walk process in the unbounded case, with acceptance-rejection at the unit interval. Convolving this with the truncated normal belief on \(\theta_t\), we can approximate the prior belief on \(\theta_{t+1}\) as \(TN\left(\mu_{t+1}, \sigma_{t+1}^2 = \lambda_{t+1}^{-1}\right)\) with
\[\mu_{t+1} = \mu_t' \tag{4}\]
\[\sigma_{t+1}^2 = (\sigma_t')^2 + \sigma_\theta^2 \tag{5}\]
Details are included in Appendix ??.
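One possible reading of the “quasi” random walk is sketched below: a normal increment is proposed and redrawn until the new value falls inside the unit interval. This reading, the function names, and all numbers are assumptions made for illustration; the appendix may define the process differently. The second function is the belief propagation in (4) and (5), written in terms of precision.

```python
import random

def quasi_random_walk_step(theta, sigma_theta, rng):
    """Propose theta + sigma_theta * nu and redraw until the value lies in (0, 1)."""
    while True:
        candidate = theta + sigma_theta * rng.gauss(0.0, 1.0)
        if 0.0 < candidate < 1.0:
            return candidate

def propagate_belief(mu_post, lam_post, sigma_theta):
    """Next-period prior from equations (4) and (5), returned as (location, precision)."""
    var_next = 1.0 / lam_post + sigma_theta ** 2   # equation (5)
    return mu_post, 1.0 / var_next                 # equation (4) leaves the location unchanged

rng = random.Random(0)
path = [0.41]                                      # illustrative starting value
for _ in range(28):                                # 28 periods, chosen for illustration
    path.append(quasi_random_walk_step(path[-1], 0.05, rng))

mu_next, lam_next = propagate_belief(0.43, 35.0, 0.05)
```

Evolution noise always lowers precision between periods, which is why the belief never becomes arbitrarily tight.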
3.4. Line Length Dynamics Chasing Time-evolving Preference Heterogeneity
Combining the two pieces above, dynamic learning and the evolution of \(\theta\), gives the full description of the firm's dynamic problem. The action-specific flow profit is
\[\pi_n(\mu_t, \lambda_t, l_t) = w \cdot M \cdot E\left(s(n, \theta_t) \mid \mu_t, \lambda_t\right) - C(n, l_t)\]
and the value function is
\[V_n(\mu_t, \lambda_t, l_t) = \pi_n(\mu_t, \lambda_t, l_t) + \beta \cdot E\left(V(\mu_{t+1}, \lambda_{t+1}, l_{t+1}) \mid \mu_t, \lambda_t, l_t, n\right)\]
\[V(\mu, \lambda, l) = E \max_n \left(V_n(\mu, \lambda, l) + \sigma_\epsilon \epsilon_n\right)\]
where the state variables are the belief mean, the belief precision, and the last-period line length, and the transition probability is defined by (2), (3), (4) and (5), with an additional one for \(l_{t+1} = n\).
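The expected maximum over actions has a closed form under one common distributional assumption. The sketch below assumes that the action-specific shocks are i.i.d. standard type-1 extreme value, which the text above does not state; under that assumption the expected maximum is a scaled log-sum-exp and the choice probabilities are multinomial logit. The function name and the three action values are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329  # mean of a standard type-1 extreme value shock

def emax_and_choice_probs(v, sigma_eps):
    """E max_n (v_n + sigma_eps * eps_n) and choice probabilities,
    ASSUMING eps_n are i.i.d. standard type-1 extreme value (Gumbel)."""
    scaled = [vn / sigma_eps for vn in v]
    top = max(scaled)                               # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scaled]
    total = sum(weights)
    emax = sigma_eps * (EULER_GAMMA + top + math.log(total))
    probs = [wgt / total for wgt in weights]
    return emax, probs

# Hypothetical action-specific values for three candidate line lengths.
emax, probs = emax_and_choice_probs([1.0, 1.2, 0.9], sigma_eps=0.5)
```

A larger shock scale flattens the choice probabilities, so observed line lengths track the action-specific values less closely.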
4. Empirical Specification and Identification
In this section, I will present the full empirical specification and identification of the model.
Similar to Hitsch (2006), I apply two-step estimation, where the demand side is estimated in
linear GMM, and its parameters are plugged in to the supply side. I estimate the dynamic
supply model by maximizing likelihood. This section ends with a discussion on the identifi-
cation of the model.
4.1. Demand Side
The demand side is modeled as a nested logit with two nests, where all Company A chips of different features are nested in one line, and all non-Company A chips are treated as homogeneous outside products. Based on (1), for each market \(m\),
\[\ln s_{1mt} - \ln s_{0mt} = a_m + c_j - \alpha p_{jmt} + \theta_{mt}\left(-\ln s_{j|l,mt}\right) + \xi_{jmt} \tag{6}\]
Both \(p_{jmt}\) and \(\ln s_{j|l,mt}\) are endogenous, because they are correlated with the unobserved demand shock \(\xi_{jmt}\). I employ the following sets of instruments for the two endogenous variables:
• The summation of characteristics (flavors, fat content and cut type) of other Company A chips sold in the same market-time, \(\sum_{j' \neq j} x_{j'mt}\)
• Average price of the same feature sold in other geographical markets at the same time, \(\frac{1}{\#\{m' \neq m\}} \sum_{m' \neq m} p_{jm't}\)
• Other costs of raw materials, including potatoes, sugar, soybean oil, edible butter, and edible tallow
• Number of competitor brands and number of competitor UPCs other than Company A chips within the same market-time
The first set of instruments are widely known as BLP instruments, which Berry et al.
(1995) started to use. The underlying assumption is that the characteristics are exogenous
to demand shocks. In the current model, the upstream wholesalers make product assortment
decisions whereas downstream retailers make pricing decisions. In reality, grocery stores and
manufacturers jointly decide what to display in advance. If some of the features do not sell well, grocery stores will lower prices to clear their inventory. In this case, it is natural to assume that the assortment decision is made prior to the realization of the local demand shock.
The second set of instruments are known as Hausman instruments which Nevo (2001)
started to use in demand estimation. The underlying assumption is that demand shocks are
independent over different markets, but there are factors that may affect the pricing for all
markets. These factors include, but are not restricted to, common cost shifters that affect
the pricing decisions across markets.
The last set of instruments captures the competitive environment, as used in Bresnahan et al. (1997). The argument is that the competitive environment affects firms' pricing decisions but is orthogonal to demand shocks. In this project, I can also exploit the large variation in the competitive environment across markets, measured by the number of competitor brands and UPCs.
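The first two instrument sets are simple leave-one-out aggregates. The sketch below builds them for toy data; the function names, product labels, city labels, and the ordering of the characteristic vector are hypothetical and only serve the illustration.

```python
def blp_instrument(chars_by_product):
    """For each product j, sum the characteristic vectors of the OTHER products
    sold in the same market-time."""
    n_chars = len(next(iter(chars_by_product.values())))
    totals = [sum(x[k] for x in chars_by_product.values()) for k in range(n_chars)]
    return {j: [totals[k] - x[k] for k in range(n_chars)]
            for j, x in chars_by_product.items()}

def hausman_instrument(price_by_market, market):
    """Average price of the same product in all OTHER markets in the same period."""
    others = [p for m, p in price_by_market.items() if m != market]
    return sum(others) / len(others)

# Toy data; the characteristic order (bbq_flavor, low_fat, ridged_cut) is hypothetical.
chars = {"A": [1, 0, 0], "B": [0, 1, 0], "C": [1, 1, 1]}
prices = {"city1": 3.0, "city2": 3.4, "city3": 3.2}
z_blp = blp_instrument(chars)
z_hausman = hausman_instrument(prices, "city1")
```

Both constructions exclude the observation's own product or market, which is what keeps the instrument from mechanically containing the local demand shock.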
4.2. Supply Side - Flow Profit
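In each market \(m\), the action-specific flow payoff of Company A is
\[\pi_{n,m}(\mu, \lambda, l) = w_m \cdot M_m \cdot E\left(s_m(n, \theta) \mid \mu, \lambda\right) - H_m \cdot c(n, l)\]
\[s_m(n, \theta) = E_{\delta \sim F_m}\left(\frac{\exp(I)}{1 + \exp(I)}\right)\]
\[I = \theta \cdot \ln \sum_{j=1}^{n} \exp\left(\frac{\delta_j}{\theta}\right)\]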
In other words, I allow a market-specific flow profit function and calibrate the parameters as follows:
• \(w_m\): manufacturer's margin, calibrated from the average price in that market, adjusted by the retailer's markup (15%), the distributor's markup (25%) and the manufacturer's gross margin (30%), i.e., \(w_m = \bar{p}_m \cdot 0.85 \cdot 0.75 \cdot 0.3\)
• \(M_m\): market size, calibrated from the total number of households \(H_m\), assuming that an average household spends \(X\) dollars per quarter on potato chips, where \(X\) is calculated from the $24 an average household spends per year on potato chips, adjusted by quarters and by the market share of large-package chips, i.e., \(M_m = H_m \cdot 6 \cdot ShareLarge_m / \bar{p}_m\)
• Cost of line length maintenance: assume a per-capita cost, i.e., \(C_m(n, l) = H_m \cdot c(n, l)\). In the estimation, I tried two specifications of the per-capita cost: linear and kink. In the linear specification, \(c(n, l) = c \cdot n\). In the kink specification, \(c(n, l) = \left(c_1 + c_2 \mathbf{1}(n > l)\right) \cdot n\)
• \(F_m\): distribution of mean utility \(\delta\), assumed normal, with mean and variance calibrated from the empirical distribution of \(\{\delta_{jmt}\}_{j,t}\)
The only parameters to estimate in the flow profit are the cost parameters \(\{c_1, c_2\}\).
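The calibration formulas and the two cost specifications translate directly into code. The sketch below is illustrative only: the function names and every input value are assumptions, not numbers from the paper's tables.

```python
def manufacturer_margin(avg_price):
    """w_m = p_bar * 0.85 * 0.75 * 0.3, as in the calibration list."""
    return avg_price * 0.85 * 0.75 * 0.3

def market_size(households, share_large, avg_price):
    """M_m = H_m * 6 * ShareLarge_m / p_bar (24 dollars a year is 6 per quarter)."""
    return households * 6.0 * share_large / avg_price

def cost_linear(n, c):
    """Linear specification: c(n, l) = c * n."""
    return c * n

def cost_kink(n, last_n, c1, c2):
    """Kink specification: c(n, l) = (c1 + c2 * 1(n > l)) * n."""
    return (c1 + (c2 if n > last_n else 0.0)) * n

# Hypothetical inputs chosen only to exercise the formulas.
w = manufacturer_margin(avg_price=3.0)
m_size = market_size(households=2.0, share_large=0.5, avg_price=3.0)
stay = cost_kink(n=22, last_n=22, c1=2.0, c2=6.0)
expand = cost_kink(n=23, last_n=22, c1=2.0, c2=6.0)
```

In the kink specification the surcharge applies to every product in the line whenever the line grows, so adding a single product raises the per-capita cost discontinuously.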
4.3. Supply Side - Dynamics
The firm's dynamic problem is described as
\[V_{n,m}(\mu_t, \lambda_t, l_t) = \pi_{n,m}(\mu_t, \lambda_t, l_t) + \beta \cdot E_m\left(V_m(\mu_{t+1}, \lambda_{t+1}, l_{t+1}) \mid \mu_t, \lambda_t, l_t, n\right)\]
\[V_m(\mu, \lambda, l) = E \max_n \left(V_{n,m}(\mu, \lambda, l) + \sigma_\epsilon \epsilon_n\right)\]
The unspecified parameters are the initial belief \((\mu_{1,m}, \lambda_{1,m})\), the evolution rate \(\sigma_{\theta,m}\), and the scale of the random fixed cost \(\sigma_\epsilon\). All parameters are identified, as shown below, but I still impose the following cross-market restrictions to simplify the calculation.
• \(\lambda_{1m}\): the initial prior precision is assumed to be proportional to the precision of the signal. This is justified by the stationarity assumption in the learning process. For markets with a more precise signal, learning is expected to be faster. However, this is only valid if the belief precision is the same. I equalize the learning speed across all markets by assuming that the prior precision is proportional to the signal precision, i.e., \(\lambda_{1m} = k_\lambda \cdot \bar{h}_m\), where \(\bar{h}_m = \frac{1}{\#} \sum_t h_{mt}\) is the average precision.
• \(\sigma_{\theta,m}\): evolution rate of preference heterogeneity. Combining stationarity with (3) and (5) gives
\[\frac{1}{\lambda_{1m}} = \frac{1}{\lambda_{1m} + \bar{h}_m} + \sigma_{\theta,m}^2\]
so that the evolution precision is \(\sigma_{\theta,m}^{-2} = k_\lambda \cdot (k_\lambda + 1) \cdot \bar{h}_m\).
• \(\mu_{1m}\): initial prior mean, integrated out over a calibrated normal distribution, with mean and variance estimated from \(\{\psi_{mt}\}_t\)20
So the dynamic parameters to identify are \(\{k_\lambda, \sigma_\epsilon\}\).
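The stationarity restriction can be checked numerically. In the sketch below, `k` stands for the ratio of prior precision to average signal precision and `h_bar` for the average signal precision; both values are illustrative assumptions. The check confirms that the implied evolution variance equals the reciprocal of k(k+1) times `h_bar`, and that the prior precision is then a fixed point of the update-then-evolve recursion.

```python
def evolution_variance(k, h_bar):
    """Evolution variance implied by 1/lam = 1/(lam + h_bar) + variance, with lam = k * h_bar."""
    lam = k * h_bar
    return 1.0 / lam - 1.0 / (lam + h_bar)

def next_prior_precision(lam, h_bar, sigma_theta_sq):
    """One pass through (3) and (5): update with the signal, then add evolution noise."""
    return 1.0 / (1.0 / (lam + h_bar) + sigma_theta_sq)

k, h_bar = 2.5, 10.0          # illustrative values only
var_theta = evolution_variance(k, h_bar)
lam_fixed = next_prior_precision(k * h_bar, h_bar, var_theta)
```

Because precision neither grows nor shrinks over time under this restriction, the weight placed on each new signal is constant across periods.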
4.4. Identification
This section briefly shows the identification of the supply-side parameters without imposing any cross-market restriction, i.e., market-specific parameters are separately identified. In the current version, we assume that the initial prior mean \(\mu_1\) is known (and integrated out in the estimation). However, the identification does not rely on this assumption. A stronger identification result without knowing the prior mean is described in Appendix ??.
In our data, we can observe actual line length decisions, signal values and precisions, as well as the prior mean
\[\{n_t, \psi_t, h_t, \mu_1\}\]
20 Note that initial prior mean is also identifiable as is shown in Appendix ??. However, I follow the convention of
learning literature to integrate out this value.
Based on this information, I will show the non-parametric identification of the preference evolution rate, the initial belief precision, the line length maintenance cost, and the scale of the fixed cost for launching21
\[\{\sigma_\theta, \lambda_1, c\}\]
4.4.1. Preference evolution rate \(\sigma_\theta\) and prior precision \(\lambda_1\) The evolution rate \(\sigma_\theta\) measures how fast \(\theta\) evolves over time. Intuitively, \(\theta_t\) can be estimated from demand, and this rate is identified by the demand-side estimates \(\hat{\theta}_t\). Equivalently, the signal value \(\psi_t\) is calculated from the demand estimation, and \(\sigma_\theta\) is identified from \(Var(\psi_{t+1} \mid \psi_t)\), because \(\psi_{t+1}\) deviates from \(\psi_t\) for three reasons: the signal error in period \(t\), the signal error in period \(t+1\), and the deviation of \(\theta_{t+1}\) from \(\theta_t\). The precisions of the first two errors are known, so the rate of evolution is identified.
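The three sources of deviation can be written out explicitly. The lines below are a sketch under simplifying assumptions that the text does not state: truncation is ignored (the unbounded random walk is used), the aggregate signal is written as \(\psi_t = \theta_t + e_t\) with \(Var(e_t) = 1/h_t\), and the signal errors are taken to be independent of each other and of the evolution error.

```latex
\[
\psi_{t+1} - \psi_t = (\theta_{t+1} - \theta_t) + e_{t+1} - e_t
\]
\[
Var(\psi_{t+1} - \psi_t) = \sigma_\theta^2 + \frac{1}{h_{t+1}} + \frac{1}{h_t}
\quad\Longrightarrow\quad
\sigma_\theta^2 = Var(\psi_{t+1} - \psi_t) - \frac{1}{h_{t+1}} - \frac{1}{h_t}
\]
```

Since \(h_t\) and \(h_{t+1}\) are computed from observed shares, the only unknown on the right-hand side is the variance of the signal differences, which the data deliver.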
Initial precision is identified by the stationarity assumption that the belief precision does not explode. From the following equation
\[\frac{1}{\lambda_1} = \frac{1}{\lambda_1 + \bar{h}} + \sigma_\theta^2\]
we can pin down \(\lambda_1\). The intuition is that when making line length decisions, Company A cannot rely too much on the market signal, because the signal is noisy, as measured by \(\bar{h}\). Nor can she rely too much on her prior belief, because \(\theta\) evolves over time, as measured by \(\sigma_\theta\). The optimal balance between these two sources pins down the belief precision at its stationary level.
21 A final supply-side parameter, \(\sigma_\epsilon\), is a nuisance parameter that is not non-parametrically identified. However, since we have imposed functional form assumptions on the value function, estimating this parameter improves the model fit considerably.
4.4.2. Cost of line length maintenance \(c\) The last part showed the identification of \(\lambda_1\) and \(\sigma_\theta\). With knowledge of \(\mu_1\), I can calculate the whole belief process \(\{\mu_t, \lambda_t\}\), so the state variables are known. The cost parameter is identified by the standard Conditional Choice Probability argument, based on \(E(n_t \mid \mu_t, \lambda_t, l_t)\), proposed by Magnac and Thesmar (2002). Intuitively, fixing the belief precision, when the cost is low, the optimal line length is more responsive to changes in the belief mean, as shown in Figure 2. The cost is identified by “regressing” the actual line length \(n_t\) on the belief mean \(\mu_t\), controlling for \(\lambda_t\).
5. Results
This section shows the model estimates and various simulation results based on estimates
obtained.
5.1. Demand Estimation
On the demand side, I estimate the nested logit model specified in (6). In this part I report the average estimates of preference heterogeneity by imposing \(\theta_{mt} = \theta\), but on the supply side I allow preference heterogeneity to vary by market and time.
Table 3 reports the estimation results from the demand side. Column (1) disregards the endogeneity problem and directly estimates the equation by OLS. Column (2) overcomes this problem by applying the sets of instruments described above. Comparing Column (1) and Column (2), I find that the instrumental variables work as expected. Both preference heterogeneity and price elasticity are under-estimated without controlling for endogeneity, and the characteristic vectors only become significant in the 2SLS specification.
Note that the first two columns in Table 3 use characteristic vectors (flavor fixed effects, cut types, fat contents) to describe a product. In Column (3), I replace them with a more precise control, namely product fixed effects. The estimate of price elasticity does not change much (-2.38 in Column 3 compared to -2.53 in Column 2), but the estimate of preference heterogeneity almost doubles. As mentioned, the characteristic vectors cannot capture consumers' preferences completely, so I take the product fixed effects estimates as the benchmark case, where the preference heterogeneity is estimated to be 0.41 (with a standard error of 0.02, Column 3, Table 3). In Column (4), I allow price elasticity to vary by demographics. I find that demand is less price elastic in markets with a richer population, measured by median income, or an older population, measured by median age, which coincides with most previous findings.
The main parameter of interest in this paper is preference heterogeneity, so in Table 4, I explore the sources of preference heterogeneity by interacting it with different observables. Column (1) copies Column (3) from Table 3 to serve as the benchmark case. In Column (2), I estimate the same model on the data for small-package potato chips. I find that preference is more heterogeneous (0.67 in Column 2 compared to 0.41 in Column 1) and demand is more price elastic (2.74 in Column 2 compared to 2.38 in Column 1, in absolute value). This extra heterogeneity in preference may come from the fact that consumers are more willing to try new flavors when buying small packages of potato chips. There are two sources of preference heterogeneity estimated in this paper: one is preference heterogeneity between consumers, and the other is preference heterogeneity within a consumer across purchase occasions. I cannot separately identify these two sources with only market-level data, but I believe that the second source is more significant in markets for small-package potato chips. The difference in the heterogeneity estimates supports the existence of heterogeneity within consumers across purchase occasions, which is related to variety-seeking behavior.
Another source of preference heterogeneity is population diversity. In Columns (3)-(7) of Table 4, I explore to what extent population diversity can explain preference heterogeneity. The results are robust to a series of diversity measures. In Column (3), I use the interquartile range of the income distribution to measure population diversity. I find that in markets with a more dispersed income distribution, preference heterogeneity is significantly higher. To quantify this estimate, I take the two markets with the minimum (0.04) and maximum (0.10) diversity measures; the implied difference in heterogeneity is 0.09,22 or roughly 20% of the baseline heterogeneity of 0.41. In Column (4), the diversity measure is the dispersion of the age distribution, and the implied difference in heterogeneity is 0.07, or 17% of the baseline value. Beyond these two dispersion measures, preference heterogeneity is also explained by the diversity of ethnic groups. In Column (5), I use the Asian population ratio in the market and find that in markets with a 10% higher Asian population ratio, preference is more heterogeneous by 0.047, relative to the baseline value of 0.41. In Column (6), I use the Hispanic population ratio, and the interaction term is not significant. This may be because the Hispanic population ratio spans a wide range, from 0 to 53%: if the true functional form is non-linear, a linear approximation may not yield a significant result. Instead, I discretize the measure using a dummy for above-median markets, and the estimate is reported in Column (7). In markets with an above-median Hispanic population ratio, preference is more heterogeneous by 0.12, relative to the baseline value of 0.41.
5.2. Supply Estimation
I plug in the coefficients and estimate the supply side by maximum likelihood. Solving the original problem by brute force is difficult, because calculating the line share \(s_m(n, \theta)\), the flow payoff \(\pi_{n,m}(\mu, \lambda, l)\) and the state transition \(f(\mu_{t+1}, \lambda_{t+1} \mid \mu_t, \lambda_t, n)\) all require simulation. However, I can employ numerical methods to simplify the calculations.
For \(s_m(n, \theta)\), I use a power polynomial approximation. Because it does not contain any parameters to estimate, the approximation needs to be calculated only once. The reason for using polynomials is the ease of preserving monotonicity and super-modularity in the approximated function, which is the key for identification.23 To calculate \(\pi_{n,m}(\mu, \lambda, l)\), I use quadrature to calculate the expectation with respect to \(\theta\), although \(\theta\) follows a truncated normal rather than a normal distribution. When the precision is quite high and the mean is far from the boundary, the truncated normal can be approximated by the untruncated normal, because the probability of \(\theta\) lying outside the boundary is low. In terms of the state transition probability, because the line length stays at a high level (for the large package size, the line length ranges from 8 to 30, with an average of 22) and the precision does not explode because of the time-varying \(\theta\), I simply assume that the state transition probability does not depend on the action \(n\), which relieves the computational burden. Finally, I use Chebyshev polynomials to approximate the value function and estimate the single-agent dynamic problem with unobservable and time-varying state variables.24
22 This is calculated by \((0.10 - 0.04) \times 1.48\).
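The claim that truncation barely matters for a tight belief away from the boundary can be checked with a short numerical integration. The sketch below uses a composite Simpson rule, which is a stand-in chosen for simplicity and not necessarily the quadrature rule used in the estimation; the function names and belief parameters are illustrative assumptions.

```python
import math

def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    step = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * step)
    return total * step / 3.0

def truncated_expectation(g, mu, lam):
    """E[g(theta)] when theta is N(mu, 1/lam) truncated to (0, 1)."""
    sigma = 1.0 / math.sqrt(lam)
    mass = simpson(lambda x: normal_pdf(x, mu, sigma), 0.0, 1.0)
    return simpson(lambda x: g(x) * normal_pdf(x, mu, sigma), 0.0, 1.0) / mass

# Tight belief centred away from 0 and 1: truncation barely matters.
tight = truncated_expectation(lambda x: x, mu=0.41, lam=400.0)   # sigma = 0.05
# Diffuse belief near the boundary: the truncated mean moves inward.
loose = truncated_expectation(lambda x: x, mu=0.05, lam=25.0)    # sigma = 0.20
```

Replacing the identity function with a line-share function of \(\theta\) would give the expected share that enters the flow payoff.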
Table 5 reports the estimation results. I estimate the model in two specifications. In the first specification, I assume the maintenance cost per capita (per 1M households) is linear in the line length, whereas in the second specification, the marginal cost is higher when manufacturers are expanding their lines. In the first specification, the marginal cost of expanding a line by one product is $3,560 per million households. For an average line length of 22, the total (variable) cost of maintaining the line in an average-sized city with 2.63 million households is approximately $0.2 million.25 As a comparison, industry revenue in an average-sized city with an average line length selling at the average price is $8.96 million,26 so the product-line-related cost constitutes about 2% of total revenue.
23 I use CVX, a regularized optimization package (Grant et al. 2008), to get the approximation. See Appendix ?? for details.
24 The recent development of MPEC (Dubé et al. 2013) is also applicable to this model.
25 $3,560 × 22 × 2.63 = $0.2M; all numbers are taken from Table 1.
26 $0.25 × 0.03 × 22 × 54.31M = $8.96M; all numbers are taken from Table 1.
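The arithmetic in footnotes 25 and 26 can be reproduced directly. The factors below are copied from those footnotes; what each factor in the revenue product represents is defined in Table 1, which is not reproduced here, so the variable names are only descriptive guesses where the text does not say.

```python
cost_per_product_per_million_hh = 3560.0   # dollars, linear specification
avg_line_length = 22
households_millions = 2.63

maintenance_cost = cost_per_product_per_million_hh * avg_line_length * households_millions
revenue = 0.25 * 0.03 * 22 * 54.31e6       # product from footnote 26, in dollars
cost_share = maintenance_cost / revenue

print(f"cost ${maintenance_cost:,.0f}, revenue ${revenue:,.0f}, share {cost_share:.1%}")
```

The ratio comes out slightly above two percent, in line with the "about 2%" stated in the text.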
In the second specification, the cost is nonlinear, and I find an extra cost of expanding the product line ($6.14K compared to $2.08K). This extra cost comes from the inflexibility of displaying, distributing, storing or advertising additional products. The extra cost limits the flexibility of line length adjustment in two ways. First, it restricts line expansion, because expanding the product line incurs this extra cost. Second, it also restricts line contraction, because when Company A considers withdrawing some products, she may worry about the future cost of bringing them back. The counterfactual analysis in Section 5.4 quantifies the inflexibility caused by the non-linear cost structure.
The precision ratio between belief and signal is estimated to be around 2.5 in both specifications. Note that this ratio determines the linear weights on the prior and the signal when updating the belief. From the estimation, Company A places 30% of the decision weight on the in-market signal and 70% on past experience, summarized by the prior belief. Even as an experienced player in a mature market, Company A still relies substantially on in-market learning, because of the evolving nature of preference heterogeneity. At the same time, the market signal is noisy, so Company A cannot rely on it completely. The counterfactual analysis in Section 5.4 shows the gross margin Company A could achieve if she knew the true value of heterogeneity in advance.
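The mapping from the precision ratio to the decision weights follows from the updating rule (2). As a worked check, writing \(k_\lambda = \lambda / h\) for the estimated ratio of prior precision to signal precision (the symbol is introduced here for illustration) and taking \(k_\lambda = 2.5\):

```latex
\[
\frac{h}{\lambda + h} = \frac{1}{k_\lambda + 1} = \frac{1}{3.5} \approx 0.29,
\qquad
\frac{\lambda}{\lambda + h} = \frac{k_\lambda}{k_\lambda + 1} = \frac{2.5}{3.5} \approx 0.71
\]
```

These weights round to the 30% and 70% reported in the text.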
5.3. Model Fit
To evaluate how well the model fits the data, I simulate the line length decisions in all 50 markets. Within each market, the prior mean \(\mu_1\) is drawn from a known distribution, the prior precision \(\lambda_1\) is known from estimation, and the initial line length \(n_1 = l_2\) is taken as given. After the initial condition is specified, beliefs are updated from the signals \((\psi_1, h_1)\) to get the belief in period 2, \((\mu_2, \lambda_2)\), the optimal line length \(n_2\) is simulated, and the process continues to the end of the data period.
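The belief part of this simulation loop can be sketched as a simple recursion over the signals, using (2) through (5). The sketch covers only the belief path; the line length choice in each period would require the solved value function and is omitted. The function name, the signal values, and the parameter values are all made up for illustration.

```python
def filter_beliefs(mu1, lam1, signals, sigma_theta):
    """Run the prior -> posterior -> next-prior recursion over (signal, precision) pairs."""
    mu, lam = mu1, lam1
    priors = [(mu, lam)]
    for psi, h in signals:
        lam_post = lam + h                                  # equation (3)
        mu_post = (lam * mu + h * psi) / lam_post           # equation (2)
        mu = mu_post                                        # equation (4)
        lam = 1.0 / (1.0 / lam_post + sigma_theta ** 2)     # equation (5)
        priors.append((mu, lam))
    return priors

signals = [(0.50, 10.0), (0.45, 10.0), (0.38, 10.0)]        # made-up signals and precisions
beliefs = filter_beliefs(mu1=0.40, lam1=25.0, signals=signals,
                         sigma_theta=(1.0 / 87.5) ** 0.5)   # chosen so precision stays at 25
```

With the evolution variance set to its stationary value, the prior precision stays constant and only the belief mean moves with the signals.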
I run two simulations to check how well the model fits the data. In the first simulation, the signals \((\psi_t, h_t)\) are taken from the data. In the second simulation, I simulate these signals. Figure 3 compares actual and simulated line length in two markets, and Figure 4 compares the whole distribution of line length and line length changes for actual and simulated data. Both simulations fit the data quite well in most markets. The first simulation fits the data almost perfectly, because it makes use of most of the information in the data. The second simulation also fits well. In the model, three factors determine the optimal line length choices: the evolution of preference heterogeneity, the signaling error caused by demand shocks, and the random fixed cost of product line adjustment. The first simulation averages out only the random fixed cost, and the simulation result confirms that this cost is not the driving force behind the actual line length patterns. The second simulation averages out both the random fixed cost and the signaling error. The only remaining force that determines the line length pattern is the evolution of preference heterogeneity, which is the main mechanism in this paper. In the remainder of this paper, I always implement the second simulation.
5.4. Counterfactuals
I run two sets of counterfactual simulations to evaluate Company A’s optimal line length responses to product-line-related policy changes. In the first counterfactual exercise, I evaluate the firm's optimal line length decisions under a smooth cost structure; in the second, I estimate the firm's improvement in gross margin under complete information about preference heterogeneity when making line length decisions. A byproduct of the second counterfactual exercise is an indirect test of the information assumption about the firm: does it know or learn?
5.4.1. Smooth cost structure The non-linearity of the cost structure restricts the firm's flexibility to adjust its product line. This simulation quantifies by how much. In this simulation, I take the cost structure to be linear, as in the first specification from the supply side, and simulate market signals as well as the firm's optimal responses. The results are illustrated in Figure 5. The distribution of line length does not change much, as shown in the left panel, whereas the distribution of line length changes becomes more dispersed in the right panel, which means that Company A is more likely to adjust line length aggressively under the smooth cost structure. To further quantify this change, the probability of line length adjustment grows from 70% in the raw data to 90% in the simulation.
The effect is quite symmetric across line length expansion and line length contraction, as shown in the right panel. Under a smooth cost structure, the probabilities of line length expansion and line length contraction both increase significantly. As mentioned before, the increase in line length expansion reflects the static concern that expanding the product line incurs more cost, whereas the increase in line length contraction reflects the dynamic concern that the firm is more cautious in withdrawing a flavor because it may worry about the future cost of bringing it back. The simulation result confirms the existence of both effects that restrict the flexibility of line length adjustment.
5.4.2. Perfect information on preference heterogeneity Figure 6 shows the simulation result for complete information on preference heterogeneity when making line length decisions. The line length decisions under complete information deviate a lot from the baseline case with learning about heterogeneity. This is simply because Company A adapts instantly to the time-evolving heterogeneity rather than chasing it, as under the learning model. The resulting gross margin is 5% higher under complete information. On the other hand, the distribution of line length adjustments does not change much.
Based on this simulation result, I can indirectly test the information hypothesis that Company A learns rather than knows the true value of preference heterogeneity. First note that the two hypotheses are not nested in the model of stationary learning,27 so there is no direct test based on parameters. Motivated by the fact that Company A enjoys a higher gross margin with complete information, I propose the following test based on gross margin.
In the data, we can calculate the gross margin across 50 cities over 28 quarters, which gives a vector \(gm\) of length 1,400. Let \(F_K\) denote the distribution of \(gm\) generated by the model in which the firm knows heterogeneity, and \(F_L\) the distribution of \(gm\) generated by the model in which the firm learns heterogeneity. Testing the assumption of learning is equivalent to testing
\[H_0: gm \sim F_K, \qquad H_1: gm \sim F_L\]
It is quite difficult to calculate a test statistic for a high-dimensional vector, but we can sacrifice some power and focus on summary statistics. Figure 7 reports the test result for the median level of gross margin. The two distributions are well separated, and the actual data appear to come from \(F_L\). We can reject the null, which supports the information assumption that Company A learns about preference heterogeneity when making product line decisions.
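The logic of the median-based test can be sketched with simulated panels. Everything in the sketch is a hypothetical stand-in: the two panel generators replace the structural models with simple normal draws, the 5% gap mirrors the counterfactual result only loosely, and the observed median is a made-up number. The sketch only shows how the observed median is located within each simulated distribution.

```python
import random
import statistics

def simulated_medians(draw_panel, n_sims, rng):
    """Median gross margin in each of n_sims simulated panels."""
    return [statistics.median(draw_panel(rng)) for _ in range(n_sims)]

def lower_tail_share(sim_medians, observed):
    """Share of simulated medians at or below the observed median."""
    return sum(m <= observed for m in sim_medians) / len(sim_medians)

# Hypothetical stand-ins for the two models: panels of 1,400 city-quarter
# gross margins, with the "knows" model 5% higher on average.
def panel_learn(rng):
    return [rng.gauss(1.00, 0.10) for _ in range(1400)]

def panel_know(rng):
    return [rng.gauss(1.05, 0.10) for _ in range(1400)]

rng = random.Random(1)
med_know = simulated_medians(panel_know, 200, rng)
med_learn = simulated_medians(panel_learn, 200, rng)
observed_median = 1.00                      # made-up value standing in for the data
p_under_know = lower_tail_share(med_know, observed_median)
p_under_learn = lower_tail_share(med_learn, observed_median)
```

A tail share near zero under one model and near one half under the other is what "well separated" distributions look like in this setup.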
6. Conclusion
This paper links product line length decisions with heterogeneity of preference and rational-
izes its cross-sectional and intertemporal variation. Preference heterogeneity in this paper
includes both preference heterogeneity across individuals and variety seeking within indi-
viduals, and it is measured by the nesting parameter in the standard nested logit model.
Cross-sectional variation in preference heterogeneity, which is partly driven by the diver-
sity of population demographics, explains differentials in line length among different cities.
27 In the standard learning framework, the two hypotheses are nested. Testing whether the agent knows the true value is equivalent to testing whether the initial belief precision is infinite (Hitsch 2006).
Within one city, a firm’s in-market learning of preference heterogeneity drives line length
adjustment.
I apply the model to the potato chip industry, where Company A is the lead player. The preference heterogeneity is estimated to be 0.41 for large-package chips and 0.67 for small-package chips, which means that preference for small packages is more heterogeneous. This is driven by more intensive variety seeking for small-package chips. I also find that preference is more heterogeneous in markets with a more diverse population, measured by the dispersion of the income and age distributions and the composition of ethnic groups.
On the supply side, Company A, as an experienced firm in a mature market, also applies
in-market learning about preference heterogeneity to adjust differentiation decisions. I find
Company A bases its decisions primarily on past experience in the market, with the most
recent market realization representing only one-third of the influence on product-line deci-
sions. The cost for maintaining an average line length constitutes about 2% of total revenue.
I estimate the sunk cost incurred when expanding product differentiation to be three times
the usual maintenance cost, which may limit the flexibility of product-line adjustment.
Counterfactual analysis based on the estimates evaluates the firm's optimal line length decisions under a smooth cost structure and under complete information, rather than learning, about preference heterogeneity. In the first case, Company A is found to be more aggressive in line length adjustment under a smooth cost structure; in the second case, Company A’s gross margin increases by 5% when she knows the true value of preference heterogeneity. The result of the second counterfactual also helps to test the information assumption that the firm learns rather than knows the preference heterogeneity at the time of line length decisions. The test statistic supports the information assumption of learning.
The whole model is easily applicable to other industries in which product differentiation is a key decision. One example is the two MP3 players produced by Apple: iPod Classic and iPod Nano. The iPod Classic provides a limited choice of colors (always black or white), but the iPod Nano offers a longer line of colors. The length of the Nano line also varies over time, from two in the first generation to nine in the fourth generation and back to six in the most recent one. The mechanism in this paper explains the difference between the two MP3 players, as most consumers of the iPod Classic are professional music lovers who care more about sound quality, control convenience, and storage and less about colors, whereas consumers buying the iPod Nano are younger on average, care more about colors, and have more diverse views on their favorite one. The changes in line length for the Nano over time can be attributed to Apple's gradual learning about preference heterogeneity.
The model simplifies the measure of preference heterogeneity. I use a nesting parameter in
the nested logit model for two primary reasons. First, nested logit is simple and interpretable.
A more sophisticated heterogeneity pattern can be carried by a mixed logit, but getting one
statistic to measure the diversity of preference is difficult. Second, the linear representation
of the nested logit model makes the supply-side learning tractable. Further research should
be directed toward finding a better way to model preference heterogeneity and link it to
product differentiation decisions.
Another shortcoming of the paper is that it focuses only on product line length and abstracts away from the identity of each product. This is a natural result of a less restrictive metric of preference heterogeneity that is not associated with any particular characteristic or flavor. The identity of the product could be captured by a more complicated model with preference heterogeneity specific to each product.
A third limitation is that I make the monopoly assumption. This assumption is justifiable
in the potato chip market, but in other markets with competition, the supply-side learning
model needs to be modified.
Acknowledgments
I am grateful to my advisors, Tim Bresnahan, Wes Hartmann, and Petra Moser, for their invaluable guidance,
discussion, and encouragement. I would also like to thank Chris Colon, Chen Cheng, Oystein Daljord, Michael
Dickstein, Liran Einav, Pedro Gardete, Daniel Grodzicki, Han Hong, Mike Kruger, Brad Larsen, James
Lattin, Anqi Li, Harikesh Nair, Sridhar Narayanan, Joe Orsini, Qiusha Peng, Peter Reiss, Gregory Rosston,
Navdeep Sahni, Stephan Seiler, Stephen Teng Sun, Paul Wong, Yiqing Xing, Constantine Yannelis, Pai-Ling
Yin, and seminar participants at the Stanford Department of Economics, Stanford Marketing WIP, and the 2014
Marketing Science Conference in Atlanta for their helpful comments. The usual disclaimer applies.
References
Berry, Steven, James Levinsohn, Ariel Pakes. 1995. Automobile prices in market equilibrium. Econometrica:
Journal of the Econometric Society 841–890.
Berry, Steven, James Levinsohn, Ariel Pakes. 2004. Differentiated products demand systems from a combi-
nation of micro and macro data: The new car market. Journal of Political Economy 112(1) 68–105.
Berry, Steven T. 1992. Estimation of a model of entry in the airline industry. Econometrica: Journal of the
Econometric Society 889–917.
Berry, Steven T. 1994. Estimating discrete-choice models of product differentiation. The RAND Journal of
Economics 242–262.
Berry, Steven T, Joel Waldfogel. 2001. Do mergers increase product variety? evidence from radio broadcast-
ing. The Quarterly Journal of Economics 116(3) 1009–1025.
Bresnahan, Timothy F, Peter C Reiss. 1990. Entry in monopoly markets. The Review of Economic Studies
57(4) 531–553.
Bresnahan, Timothy F, Peter C Reiss. 1991. Entry and competition in concentrated markets. Journal of
Political Economy 977–1009.
Bresnahan, Timothy F, Scott Stern, Manuel Trajtenberg. 1997. Market segmentation and the sources of
rents from innovation: Personal computers in the late 1980s. RAND Journal of Economics S17–S44.
Bronnenberg, Bart J. 2014. The provision of convenience and variety by the market. Available at SSRN.
Bronnenberg, Bart J, Jean-Pierre H Dubé, Matthew Gentzkow. 2012. The evolution of brand preferences:
Evidence from consumer migration. American Economic Review 102(6) 2472–2508.
Bronnenberg, Bart J, Michael W Kruger, Carl F Mela. 2008. Database paper: The IRI marketing data set.
Marketing Science 27(4) 745–748.
Cardell, N Scott. 1997. Variance components structures for the extreme-value and logistic distributions with
application to models of heterogeneity. Econometric Theory 13(02) 185–213.
Chernev, Alexander. 2003a. Product assortment and individual decision processes. Journal of Personality
and Social Psychology 85(1) 151.
Chernev, Alexander. 2003b. When more is less and less is more: The role of ideal point availability and
assortment in consumer choice. Journal of Consumer Research 30(2) 170–183.
Ching, Andrew T, Tülin Erdem, Michael P Keane. 2013. Invited paper-learning models: An assessment of
progress, challenges, and new developments. Marketing Science 32(6) 913–938.
Chintagunta, Pradeep K. 1998. Inertia and variety seeking in a model of brand-purchase timing. Marketing
Science 17(3) 253–270.
Chintagunta, Pradeep K. 1999. Variety seeking, purchase timing, and the "lightning bolt" brand choice
model. Management Science 45(4) 486–498.
Crawford, G, A Shcherbakov, Matthew Shum. 2011. The welfare effects of endogenous quality choice: evidence
from cable television markets. Tech. rep., mimeo. University of Warwick.
Crawford, Gregory S, Matthew Shum. 2005. Uncertainty and learning in pharmaceutical demand. Econo-
metrica 73(4) 1137–1173.
Draganska, Michaela, Dipak C Jain. 2005. Product-line length as a competitive tool. Journal of Economics
& Management Strategy 14(1) 1–28.
Draganska, Michaela, Dipak C Jain. 2006. Consumer preferences and product-line pricing strategies: An
empirical analysis. Marketing Science 25(2) 164–174.
Draganska, Michaela, Michael Mazzeo, Katja Seim. 2009. Beyond plain vanilla: Modeling joint product
assortment and pricing decisions. QME 7(2) 105–146.
Dubé, J, Jeremy T Fox, C Su. 2013. Improving the numerical performance of BLP static and dynamic discrete
choice random coefficients demand estimation. Forthcoming in Econometrica.
Dubé, Jean-Pierre, Günter J Hitsch, Peter E Rossi. 2009. Do switching costs make markets less competitive?
Journal of Marketing Research 46(4) 435–445.
Dubé, Jean-Pierre, Günter J Hitsch, Peter E Rossi. 2010. State dependence and alternative explanations for
consumer inertia. The RAND Journal of Economics 41(3) 417–445.
Dubé, Jean-Pierre, K Sudhir, Andrew Ching, Gregory S Crawford, Michaela Draganska, Jeremy T Fox,
Wesley Hartmann, Günter J Hitsch, V Brian Viard, Miguel Villas-Boas, et al. 2005. Recent advances in
structural econometric modeling: Dynamics, product positioning and entry. Marketing Letters 16(3-4)
209–224.
Erdem, Tülin, Michael P Keane. 1996. Decision-making under uncertainty: Capturing dynamic brand choice
processes in turbulent consumer goods markets. Marketing Science 15(1) 1–20.
First-Research. 2011. Industry profile - snack foods manufacturing. Tech. rep.
Goettler, Ronald L, Brett R Gordon. 2011. Does AMD spur Intel to innovate more? Journal of Political
Economy 119(6) 1141–1200.
Grant, Michael, Stephen Boyd, Yinyu Ye. 2008. Cvx: Matlab software for disciplined convex programming.
Guo, Liang, Juanjuan Zhang. 2012. Consumer deliberation and product line design. Marketing Science
31(6) 995–1007.
Hitsch, Günter J. 2006. An empirical model of optimal dynamic product launch and exit under demand
uncertainty. Marketing Science 25(1) 25–50.
Hui, Kai-Lung. 2004. Product variety under brand influence: An empirical investigation of personal computer
demand. Management Science 50(5) 686–700.
Iyengar, Sheena S, Mark R Lepper. 2000. When choice is demotivating: Can one desire too much of a good
thing? Journal of Personality and Social Psychology 79(6) 995.
Johnson, Justin P, David P Myatt. 2006. On the simple economics of advertising, marketing, and product
design. The American Economic Review 756–784.
Joon, Hester. 2013. Snack food production in the US. Tech. rep., IBISWorld.
Lin, Song, Juanjuan Zhang, John R Hauser. 2014. Learning from experience, simply. Marketing Science.
Liu, Yunchuan, Tony Haitao Cui. 2010. The length of product line in distribution channels. Marketing
Science 29(3) 474–482.
Lovett, Mitchell, William Boulding, Richard Staelin. 2009. Consumer learning models for perceived and actual
product instability. Working Paper.
Magnac, Thierry, David Thesmar. 2002. Identifying dynamic discrete decision processes. Econometrica 70(2)
801–816.
Mazzeo, Michael J. 2002. Product choice and oligopoly market structure. RAND Journal of Economics
221–242.
Narayanan, Sridhar, Puneet Manchanda. 2009. Heterogeneous learning and the targeting of marketing
communication for new products. Marketing Science 28(3) 424–441.
Nevo, Aviv. 2001. Measuring market power in the ready-to-eat cereal industry. Econometrica 69(2) 307–342.
Orhun, A Yesim. 2009. Optimal product line design when consumers exhibit choice set-dependent preferences.
Marketing Science 28(5) 868–886.
Reiss, Peter C, Pablo T Spiller. 1989. Competition and entry in small airline markets. Journal of Law and
Economics 32(2) S179–202.
Roberts, John H, Glen L Urban. 1988. Modeling multiattribute utility, risk, and belief dynamics for new
consumer durable brand choice. Management Science 34(2) 167–185.
Ryan, Stephen P, Catherine Tucker. 2012. Heterogeneity and the dynamics of technology adoption. Quan-
titative Marketing and Economics 10(1) 63–109.
Seetharaman, PB, Siddhartha Chib, Andrew Ainslie, Peter Boatwright, Tat Chan, Sachin Gupta, Nitin
Mehta, Vithala Rao, Andrei Strijnev. 2005. Models of multi-category choice behavior. Marketing
Letters 16(3-4) 239–254.
Seim, Katja. 2006. An empirical model of firm entry with endogenous product-type choices. The RAND
Journal of Economics 37(3) 619–640.
Urban, Glen L, John R Hauser. 1993. Design and marketing of new products, vol. 2. Prentice Hall Englewood
Cliffs, NJ.
Urban, Glen L, Gerald M Katz. 1983. Pre-test-market models: Validation and managerial implications.
Journal of Marketing Research (JMR) 20(3).
Villas-Boas, J Miguel. 2004. Communication strategies and product line design. Marketing Science 23(3)
304–316.
Table 1 Summary Statistics
                                        Obs.     Mean     S.D.      Min      Max
Sales and Prices
  Market Share                         30930     0.03     0.05     0.00     0.38
  Market Share In Line                 30930     0.05     0.06     0.00     0.53
  Price ($/oz)                         30930     0.25     0.07     0.12     0.43
  Fat Free                             30930     0.09     0.28     0.00     1.00
  Reduced Fat                          30930     0.15     0.35     0.00     1.00
  Ruffle Cut                           30930     0.28     0.45     0.00     1.00
  Wavy Cut                             30930     0.13     0.33     0.00     1.00
  Line Length (number of features)      1400    22.09     3.86     8.00    30.00
  Change in Line Length                 1350     0.19     2.11    -8.00     9.00
  Line Expansion                        1350     0.39     0.49     0.00     1.00
  HHI for In-line Market Share          1400     0.13     0.03     0.07     0.36
  Std. for Log In-line Market Share     1400     1.28     0.23     0.72     2.06
  Number of competitor firms            1400     7.41     2.77     3.00    20.00
  Number of competitor UPC              1400    51.77    24.71    12.00   166.00
  Market Size (Million Oz)                50    54.31    57.10     6.06   278.27
Demographics
  Median Income (1K $)                  1400    56.81     8.98    23.10    89.09
  Median Age                            1400    35.08     2.89    26.00    48.33
  Interquartile Income (1M $)           1400     0.06     0.01     0.04     0.10
  Interquartile Age (10 years)          1400     3.40     0.20     2.75     4.47
  Asian (%)                             1400     0.04     0.04     0.00     0.31
  Hispanic (%)                          1400     0.10     0.11     0.00     0.53
  Number of Households (Million)          50     2.63     3.10     0.26    17.10
Cost Shifters
  Potato Price ($/100lb)                  28    12.37     4.29     7.42    21.90
  Refined Sugar Price (cent/lb)           28    45.20     3.58    41.93    51.93
  Soy Bean Oil Price (cent/lb)             7    28.28    11.58    16.46    52.03
  Edible Butter Price ($/lb)               7     1.41     0.27     1.11     1.82
  Edible Tallow Price (cent/lb)            7    19.60     5.53    13.71    30.76
Notes: Sales and price data for 58 Company-A features (each a unique combination of 36 flavors, 3 fat contents:
regular, reduced fat, fat free, and 3 cut types: flat, ruffle, wavy) across 50 markets over 28 quarters in 7 years
(2001-2007) are aggregated from the IRI Academic Dataset. Features with positive sales for fewer than 12 weeks
are dropped from the sample, and their market shares are allocated proportionally to the other features within
the same serving size, city, and quarter. Demographic data for 50 cities and 28 quarters are merged from the
IPUMS CPS dataset. Cost shifters, available for 28 quarters or for 7 years depending on the series, are collected
from yearbooks published by the Bureau of Labor Statistics and the Department of Agriculture.
Table 2 Reduced Form Evidence for Line Length Dynamics
FE; dependent variable is           Line Length,t+1       1(Line Expansion,t+1)
                                    (1)        (2)        (3)        (4)
HHI for In-line Market Share   -6.78***              -1.58***
                                 (1.41)                (0.41)
sd Ln ShareInLine                         -1.57***              -0.20***
                                            (0.25)                (0.07)
Line Length                     0.86***    0.79***   -0.03***   -0.05***
                                 (0.01)     (0.02)     (0.00)     (0.01)
Market fe                           Yes        Yes        Yes        Yes
Quarter fe                          Yes        Yes        Yes        Yes
Observations                       1350       1350       1350       1350
Adjusted R2                        0.86       0.73       0.34       0.34
Standard errors in parentheses. * p < 0.10, ** p < 0.05, *** p < 0.01
Notes: This table presents reduced-form evidence that line length adjusts in response to time-evolving preference
heterogeneity. Preference heterogeneity is inversely related to the concentration of in-line market shares; that
is, concentrated in-line market shares indicate homogeneous preferences. All columns are panel data regressions
with market fixed effects. The dependent variable is next-quarter line length in columns (1) and (2) and a
next-quarter dummy for line length expansion in columns (3) and (4). Line length is the count of features
(flavor-cut-fat) within each market-quarter after dropping transient ones with fewer than 12 weeks of positive
sales. All data come from the IRI Academic Dataset.
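The two concentration regressors in Table 2 can be illustrated with a short sketch. It assumes that the HHI is the sum of squared in-line shares and that "sd Ln ShareInLine" is the sample standard deviation of log in-line shares; the function names and the toy sales figures are hypothetical and are not taken from the IRI data.

```python
import math

def in_line_shares(sales):
    """Convert feature-level sales into shares of the firm's own line."""
    total = sum(sales)
    return [s / total for s in sales]

def hhi(shares):
    """Herfindahl-Hirschman index: sum of squared shares (assumed definition)."""
    return sum(s * s for s in shares)

def sd_log_shares(shares):
    """Sample standard deviation of log shares (n - 1 divisor is an assumption)."""
    logs = [math.log(s) for s in shares]
    mean = sum(logs) / len(logs)
    return math.sqrt(sum((x - mean) ** 2 for x in logs) / (len(logs) - 1))

# Toy market-quarters: evenly spread sales versus sales concentrated in one feature.
even = in_line_shares([10, 10, 10, 10])
skewed = in_line_shares([37, 1, 1, 1])

print(hhi(even), sd_log_shares(even))
print(hhi(skewed), sd_log_shares(skewed))
```

Both measures rise when sales concentrate in a few features, which is the sense in which the table notes read high concentration as homogeneous preferences.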
Table 3 Demand Estimation
Dependent variable is Ln(Share1) - Ln(Share0)
                                    OLS                  2SLS
                                    (1)        (2)        (3)        (4)
Preference Heterogeneity        0.02***    0.23***    0.41***    0.49***
                                 (0.00)     (0.01)     (0.02)     (0.02)
Price                           -0.13**   -2.53***   -2.38***  -49.50***
                                 (0.05)     (0.19)     (0.22)     (2.92)
  × Ln(Median Income)                                             3.08***
                                                                  (0.30)
  × Ln(Median Age)                                                3.33***
                                                                  (0.60)
Ruffle cut                     -0.01***   -0.11***
                                 (0.00)     (0.01)
Wavy cut                           0.01    0.03***
                                 (0.01)     (0.01)
Fat free                           0.01   -0.06***
                                 (0.01)     (0.01)
Reduced fat                       -0.01   -0.15***
                                 (0.01)     (0.01)
Flavor fe                           Yes        Yes         No         No
Product fe                           No         No        Yes        Yes
Quarter fe                          Yes        Yes        Yes        Yes
Observations                      30930      30930      30930      30930
Adjusted R2                        0.90       0.86       0.81       0.76
Standard errors in parentheses. * p < 0.10, ** p < 0.05, *** p < 0.01
Notes: This table shows the demand estimates implied by the nested logit model. In all columns, the dependent
variable is the difference between the logarithm of the total Frito Lay share and the logarithm of the total
share of outside goods. Column (1) uses OLS; columns (2)-(4) use 2SLS with three sets of instrumental variables:
BLP instruments (sums of flavor, cut, and fat dummies for other features in the same serving-city-quarter),
Hausman instruments (the average price of the same feature in other cities within the serving-quarter, and the
prices of materials, including potatoes, sugar, soybean oil, edible butter, and edible tallow), and the
competition environment (the number of competitor firms and the number of competitor UPCs other than Company-A
chips within the serving-city-quarter).
Table 4 Demand Estimation with Varieties of Preference Heterogeneity
2SLS; dependent variable is Ln(Share1) - Ln(Share0)
                           Baseline     Small    IqrInc    IqrAge     Asian    Hispan    Hispan
                                      Package      (1M)  (10 yrs)                         > p50
                                (1)       (2)       (3)       (4)       (5)       (6)       (7)
Preference Heterogeneity    0.41***   0.67***   0.36***   0.30***   0.41***   0.42***   0.36***
                             (0.02)    (0.04)    (0.02)    (0.04)    (0.02)    (0.02)    (0.02)
  × Diversity Measure                           1.48***   0.04***   0.47***     -0.09   0.12***
                                                 (0.19)    (0.01)    (0.15)    (0.06)    (0.01)
Price                      -2.38***  -2.74***  -2.99***  -2.62***  -2.57***  -2.30***  -2.87***
                             (0.22)    (0.29)    (0.24)    (0.24)    (0.23)    (0.22)    (0.23)
Product fe                      Yes        No       Yes       Yes       Yes       Yes       Yes
Quarter fe                      Yes       Yes       Yes       Yes       Yes       Yes       Yes
Observations                  30930      9155     30930     30930     30930     30930     30930
Adjusted R2                    0.81      0.41      0.79      0.80      0.80      0.81      0.78
Summary statistics of population diversity measure
Mean                                               0.06      3.40      0.04      0.10      0.49
Min                                                0.04      2.75      0.00      0.00      0.00
Max                                                0.10      4.47      0.31      0.53      1.00
Standard errors in parentheses. * p < 0.10, ** p < 0.05, *** p < 0.01
Notes: This table shows demand estimates from the nested logit model when preference heterogeneity is allowed to
vary with observables. In all columns, the dependent variable is the difference between the logarithm of the
total Frito Lay share and the logarithm of the total share of outside goods. All columns are estimated by 2SLS
with three sets of instruments: BLP instruments, Hausman instruments, and the competition environment. Column (1)
is the baseline estimate for large-package potato chips and is identical to Column (3) in Table 3. Column (2)
reports estimates from the same specification for small-package chips (1-4 serving sizes). Columns (3)-(7) allow
preference heterogeneity to vary with different measures of population diversity: Column (3) uses the
interquartile range of income, Column (4) the interquartile range of age, Column (5) the Asian population ratio,
Column (6) the Hispanic population ratio, and Column (7) a discretized Hispanic population ratio, namely a dummy
for an above-median Hispanic population ratio.
Table 5 Supply Estimation
                                  Linear Cost          Nonlinear Cost
                                  b        se          b        se
Cost c1 (1K$ / 1M HH)          3.56    (0.86)       2.08    (1.38)
Cost c2 (1K$ / 1M HH)                               6.04    (0.65)
Precision ratio                2.55    (0.02)       2.42    (0.02)
Scale of fixed cost            0.14    (0.00)       0.02    (0.00)
Prior mean µ                   Integrated           Integrated
Log Likelihood               -83.37               -63.61
Notes: The cost function for the linear specification is $c(n) = c_1 \cdot n$, while the cost function for the
nonlinear specification is $c(n_t, n_{t-1}) = \left(c_1 + c_2 \cdot \mathbf{1}(n_t > n_{t-1})\right) \cdot n_t$.
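As a minimal sketch, the two cost specifications in the notes to Table 5 can be evaluated at the reported point estimates. The function names are hypothetical. Reading the term in parentheses as an indicator that equals one when the line is expanded is an assumption based on the notation in the notes. Units follow the table (1K$ per 1M households).

```python
def linear_cost(n, c1):
    """Linear specification: c(n) = c1 * n."""
    return c1 * n

def nonlinear_cost(n_t, n_prev, c1, c2):
    """Nonlinear specification: (c1 + c2 * 1[n_t > n_prev]) * n_t.

    The indicator interpretation of (n_t > n_prev) is an assumption.
    """
    expanding = 1 if n_t > n_prev else 0
    return (c1 + c2 * expanding) * n_t

# Point estimates from Table 5.
C1_LINEAR = 3.56
C1_NONLINEAR, C2_NONLINEAR = 2.08, 6.04

# Keeping a 22-feature line versus expanding it by one feature.
keep = nonlinear_cost(22, 22, C1_NONLINEAR, C2_NONLINEAR)
expand = nonlinear_cost(23, 22, C1_NONLINEAR, C2_NONLINEAR)
print(linear_cost(22, C1_LINEAR), keep, expand)
```

Under this reading the cost jumps as soon as the line grows by a single feature, which is the kind of step that would make line length adjustment lumpy.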
Figure 1 Distribution of line length and line length changes
[Two panels. Left panel: Line Length (x-axis: line length, 10 to 30). Right panel: Change in Line Length (x-axis: change in line length, -10 to 10).]
Note: Left figure plots the distribution of line length among 50 markets over 28 quarters, and right
figure plots the distribution of change in line length, which is first difference for line length over
two consecutive quarters within one market. Line length is defined as the count of products (unique
combination of flavor-fat-cut) within the city-quarters. Products with positive sales for less than 12
weeks within city-quarters are not counted.
Figure 2 Identification of line length maintenance cost
[Two panels, Low c and High c. Each plots Total Share (y-axis, 0.3 to 0.8) against Line length (n) (x-axis, 5 to 20), with three curves labeled µH, µM, and µL.]
Note: This figure illustrates how the line length maintenance cost is identified. In each plot, the thick
curves show total market share as a function of line length. The three curves share the same
variance but differ in the mean of preference heterogeneity θ. Total market share is increasing in
line length and in preference heterogeneity, and it is supermodular in the two. The straight lines
are cost functions, and their slope is the marginal cost of expanding the line. The tangency point
between a cost line and a market share curve gives the optimal line length, which is higher when
preference heterogeneity is higher. The two plots differ in marginal cost: when cost is lower, line
length decisions respond more strongly to changes in the mean of heterogeneity, which completes the
identification of cost.
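The tangency argument in the note to Figure 2 can be sketched numerically. This is an illustration under stated assumptions, not the paper's estimation code. It assumes symmetric products with a common mean utility `delta`, so that the nested logit total share is n^θ e^δ / (1 + n^θ e^δ), and it measures the per-product cost in share units. The names `total_share`, `optimal_line_length`, and `delta`, and the parameter values, are hypothetical.

```python
import math

def total_share(n, theta, delta=-1.0):
    """Total inside share with n symmetric products (assumed nested logit form)."""
    inclusive = (n ** theta) * math.exp(delta)
    return inclusive / (1.0 + inclusive)

def optimal_line_length(theta, cost, n_max=60, delta=-1.0):
    """Grid search for the n that maximizes total share minus a linear cost."""
    return max(
        range(1, n_max + 1),
        key=lambda n: total_share(n, theta, delta) - cost * n,
    )

# Low and high preference heterogeneity under a high and a low per-product cost.
for cost in (0.010, 0.005):
    n_low_theta = optimal_line_length(0.3, cost)
    n_high_theta = optimal_line_length(0.5, cost)
    print(cost, n_low_theta, n_high_theta, n_high_theta - n_low_theta)
```

In this toy parameterization the chosen line is longer when θ is higher, and the gap between the two choices widens when the cost is lower, which mirrors the comparison between the two panels.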
Figure 3 Model fit - two markets
[Two panels, BOSTON and DETROIT. Each plots line length (y-axis, 15 to 30) by quarter (x-axis, 2001q3 to 2007q3). Legend: Actual; Signal from simulation; Signal from data.]
Note: This figure shows how the model fits the data in two cities: Boston and Detroit. Solid lines are
actual line length decisions, and two dashed lines are line length decisions from simulation. In the
first simulation, “signal from data”, market signals are taken from the data; in the second simulation,
“signal from simulation”, market signals are also simulated from the model. Prior mean in the first
periods are drawn from the known distribution, prior precision in the first periods are estimated.
Figure 4 Model fit - distribution
[Two panels. Left panel: Line Length (x-axis: line length, 10 to 30). Right panel: Change in Line Length (x-axis: change in line length, -10 to 10). Legend: Actual; Signal from simulation; Signal from data.]
Note: This figure shows how the model fits the distribution of line length and line length changes.
Solid bars are distribution of actual line length decisions, two lines are kernel density of simulated
line length. In the first simulation, “signal from data”, market signals are taken from the data; in
the second simulation, “signal from simulation”, market signals are also simulated from the model.
Prior mean in the first periods are drawn from the known distribution, prior precision in the first
periods are estimated.
Figure 5 Counterfactual - smooth cost
[Two panels. Left panel: Line Length (x-axis: line length, 10 to 30). Right panel: Change in Line Length (x-axis: change in line length, -10 to 10). Legend: Actual; Simulated, step cost; Simulated, smooth cost.]
Note: This figure evaluates optimal line length decisions under a smooth cost structure. Solid
bars are distribution of actual line length decisions, and two lines are kernel density of simulated
line length: dashed line represents simulated line length in original model with nonlinear cost,
whereas solid line represents simulated line length under linear cost structure. In both simulations,
market signals are taken from simulation; prior mean in the first periods are drawn from the known
distribution, prior precision in the first periods are estimated.
Figure 6 Counterfactual - known heterogeneity θ
[Two panels. Left panel: Line Length (x-axis: line length, 10 to 30). Right panel: Change in Line Length (x-axis: change in line length, -10 to 10). Legend: Actual; Simulated, learning θ; Simulated, knowing θ.]
Note: This figure evaluates optimal line length decisions when firms know the precise value of
time-varying preference heterogeneity. Solid bars are distribution of actual line length decisions, and
two lines are kernel density of simulated line length: dashed line represents simulated line length
in original model with learning heterogeneity, whereas solid line represents simulated line length
assuming known heterogeneity. In both simulations, market signals are taken from simulation; prior
mean in the first periods are drawn from the known distribution, prior precision in the first periods
are estimated.
Figure 7 Testing learning assumption based on gross margin
[One panel. x-axis: Gross margin in median market (1M $), 2.1 to 2.3. y-axis: 0 to 25. Legend: Learning θ; Knowing θ.]
Note: This figure plots the distribution of simulated median level gross margin in two simulations:
learning preference heterogeneity and knowing heterogeneity. In both simulations, market signals
are taken from simulation; prior mean in the first periods are drawn from the known distribution,
prior precision in the first periods are estimated. Vertical line is the observed median gross margin
from the data.