“Big Data” vs “Small Data”: Consumer Profiling with Data Requirements

Tommaso Valletti, Jiahua Wu

Imperial College Business School, Imperial College London, London, UK SW7 2AZ

{[email protected], [email protected]}

We consider a model where a monopolist can profile consumers in order to price discriminate among them,

and consumers can take costly actions to protect their identities and make the profiling technology less

effective. A novel aspect of the model consists in the profiling technology: the signal that the monopolist

gets about a consumer’s willingness-to-pay can be made more accurate either by having more consumers

revealing their identities, or by spending larger amounts of money on third-party complementary data or

data analytics capabilities. We show that the optimal investment level from the monopolist is closely related

to the flexibility of consumers to conceal their identities as well as to data requirements. In particular, a

higher (lower) data requirement means that more (fewer) consumers are required to achieve the same

signal precision. For a given data requirement, we show that a smaller investment is required to achieve

the same level of accuracy when it gets more difficult for consumers to conceal their identities, leading to a

higher profit for the firm. Consumer surplus and social welfare are instead non-monotone in the ability of

consumers to conceal their identities. Surprisingly, the firm’s investment is not monotone in the level of data

requirement, where investment is the greatest when data requirement is moderate. We also show that the

monopolist has a tendency to invest excessively. This inefficiency is particularly acute either when the firm

needs many profiled consumers in the dataset to be precise (“big data”) and the action for consumers to

protect their identities is costly, or when the firm can profile consumers effectively with only a few of them

(“small data”) and the cost to consumers of concealing their identities is moderate.

Key words : profiling, privacy, price discrimination, signal accuracy, big data, social welfare

1. Introduction

The explosion of information technologies has provided unprecedented ways for firms to collect

data about their consumers. For instance, communications between web browsers and web sites

allow firms to gather information including IP address, the web browser type, the computer model,

as well as its operating system - all of which can be used in consumer profiling. Web sites can also

assign and read unique identifiers, called cookies, which are used to compile records of individuals’

browsing histories. Retail web sites like Amazon use cookies to keep track of what a consumer has

shopped for and bought, and tailor web sites with products that the firm suspects that the consumer

is the most interested in. Apple’s iBeacon is a Bluetooth Low Energy proximity technology that allows

mobile apps to detect a consumer’s position on a micro-local scale, and deliver hyper-contextual

content to users based on their location in real time.1 Dansk Supermarked, Denmark’s largest

supermarket chain, partnered with Infosys - a leading IT consulting company - in order to use

address data of its repeated consumers and tailor offers of products relevant to where they live.2

All these examples demonstrate that technological developments have enhanced a firm’s ability to

profile consumers and price discriminate among them.

On the other hand, consumers have become more wary with respect to how their information

is being collected and used by firms. Singer (2015) reports several instances where consumers are

aware of the many trade-offs associated with giving companies access to their data. Ultimately, the

data can be supplied only by consumers themselves. The very same technologies that allow a firm

to profile, can also be used by consumers to counteract the effectiveness of profiling. Consumers can

delete all cookies, block third-party cookies, or purge cookie files. Some privacy-wary consumers

even take further steps and pay third parties to protect their data. For example, Reputation.com

charges individuals $9.95 per month to remove personal data from on-line data markets.

Drawing upon this increasing tension between firms and consumers, we study data profiling in

the context of price discrimination. While a large literature has analyzed consumer profiling and

price targeting, which we review in Section 2, in this paper we consider the novel aspect of the

accuracy of profiling. There are two entangled premises of our model, one related to actions that

can be taken by the firm (data precision) and one related to actions that can be taken by consumers

(data protection).

The degree of precision of consumer’s information is an endogenous choice of the firm. Precision

can be improved in two complementary ways. One way is to gather a dataset about the firm’s own

consumers. The precision of the information that can be extracted directly from own consumers

depends on the consumers’ sample size. Norvig (2010) suggests that the relationship between the

quality of the information and the amount of data available is S-shaped, and the scale of data

needs to reach a minimal threshold, depending on applications, to be sufficiently informative.

Hence, there are situations when only “small data” are enough to generate statistically-relevant

information about consumers, as opposed to truly “big data” instances whereby databases should

1 https://developer.apple.com/ibeacon/

2 https://www.infosys.com/industries/retail/case-studies/Documents/ecommerce-charge.pdf

include abundant information from many consumers. A way to think about a small-data case is one

where there is a relatively simple and general statistical relationship in the population of consumers

based on a few observables (think of gender and age, for instance). A few hundred observations

may suffice for the firm to unravel the relationship and understand its consumers with a very good

degree of precision. Conversely, a big-data situation is one where there could be thousands

of profiles, and the data requirements are orders of magnitude larger.3

Another way to improve precision is to invest resources in either acquiring data from third

parties to complement the dataset about own consumers, or data analysts and technologies to

support them. A larger investment would correspond to an increase in the precision of consumers’

information that one can potentially extract from the dataset about own consumers. For instance,

consider loyalty programs adopted by firms in a wide range of industries, including grocery retailing.

Supermarkets not only collect information directly from their consumers, but also often supplement

their consumer data by spending money on data collected from other sources, such as electoral rolls

and credit reports. Tesco was the first UK supermarket to launch a loyalty scheme, back in

1995. Later on, Tesco matched its own data with data from other sources, and created Crucible, a

lucrative venture set up to allow other commercial organizations to pay for access to Tesco’s data.4

While these two channels are both relevant and fit the purpose of improving the precision of the

signal received about a consumer, they exhibit different features and therefore it is important to

keep a distinction between them. Own data are likely to be idiosyncratic to the firm, and therefore

more directly relevant, ceteris paribus. Instead, data acquired from third parties (or investment in

data analytics capabilities) are more akin to a general-purpose technology. This distinction plays

an important role when it comes to the modelling as shown later in the paper.

Turning to the second aspect of our model, consumers make endogenous choices whether to

allow the firm to use information about them. A consumer can take a costly action to conceal

his/her identity. This opens various interesting aspects that we contemplate. First, related to the

point about sample size and accuracy, the firm needs to make sure that consumers endogenously

prefer not to conceal their identities - otherwise the profiling techniques will be ineffective. This can

be done either by offering them a good price in case they allow data disclosure, or by penalizing

them in case they do not. Second, as information concealing is costly, there are welfare and policy

questions arising. Regulators could make the cost of concealing smaller, for instance by imposing a

3 Although related to a different setting, the Netflix Prize is an example of a “big data” problem. In 2009, Netflix awarded a $1m prize for the best filtering algorithm to predict user ratings for films. A data set of 100,480,507 ratings that 480,189 users gave to 17,770 movies was provided by Netflix. See https://en.wikipedia.org/wiki/Netflix_Prize.

4 http://www.theguardian.com/business/2005/sep/20/freedomofinformation.supermarkets

full disclosure policy on the use of cookies. Conversely, this cost could be made larger if the firms

are allowed to trade consumers’ data, so that consumers would need to request potentially many

websites to erase their data. Hence we can also address, in a meaningful way, the question that is at

the core of the current debate on consumer privacy: how easy should access to consumers’ information

be?

Taking into account the factors described above, in this paper we consider a model where a

monopolist seeks to price discriminate consumers through data profiling. Consumers can take costly

actions to protect their identities. Otherwise, a signal about a consumer’s willingness-to-pay is

received by the firm. The signal can be made more accurate either by having more consumers

revealing their identities, or by investing larger amounts of money on third-party data or data

analytics capabilities. We characterize the equilibrium outcome of this game. It turns out that

the firm needs to strike a delicate balance when it comes to the investment decision. A higher

investment allows the firm to derive more accurate signals, given the same number of consumers

in the database. At the same time, consumers are wary of the investment from the firm, and thus

a larger fraction of them lean toward being anonymous. The optimal investment level is closely

related to the ability of consumers to conceal their identities as well as to data requirements. In

particular, we define a higher (lower) data requirement as an instance when more (fewer) consumers

are required to achieve the same signal precision, given the same investment from the monopolist.

Under the same data requirement, we show that a smaller investment is required to achieve the

same level of accuracy as it gets more difficult for consumers to conceal their identities, leading to a

higher profit for the firm. On the other hand, consumer surplus and social welfare are non-monotone

in the ability of consumers to conceal their identities.

Surprisingly, the firm’s investment is not monotone in the level of data requirement: investment

is greatest when the data requirement is moderate. The rationale behind this phenomenon

is the following. When the data requirement is low, a small investment is sufficient for the firm to

profile consumers with good accuracy, and thus the optimal investment from the firm tends to be

small. As data requirement increases, the firm generally increases its investment level with the hope

of getting more accurate signals from consumers. At the same time, the fraction of consumers who

choose to reveal their identities decreases. Consequently, when the data requirement is sufficiently

high, even a high investment would not allow the firm to profile consumers accurately due to the

lack of the scale of data, leading the firm to scale back its optimal investment level. We also show

that the monopolist has a tendency to invest excessively. This inefficiency is particularly acute

when either the firm needs many profiled consumers in the dataset to be precise and the action

for consumers to protect their identities is costly, or the firm can profile consumers effectively with

only a few of them and the cost to consumers of concealing their identities is moderate.

2. Literature Review

Our paper relates to two broad streams in the literature. First, our paper is linked to the literature

on behavior-based price discrimination. In their seminal work, Fudenberg and Tirole (2000)

study the effects of behavior-based price competition in the framework of a two-period Hotelling

model. Firms are able to profile a consumer’s preference on the Hotelling line based upon his/her

purchase decision in the first period. Past behavior is then used in the second period to design a

discriminatory pricing scheme by the firm. The research on behavior-based pricing has been extended

to various settings and applications (Villas-Boas 1999, Villas-Boas 2004, Pazgal and Soberman

2008, Chen and Zhang 2009, Shin and Sudhir 2010, Zhang 2011).5 Comprehensive literature reviews

can be found in Fudenberg and Villas-Boas (2007) and Esteves (2009). Most of the literature on

behavior-based price discrimination assumes a multi-period setting where firms price

discriminate among consumers based on their purchase history. We abstract away from this by proposing a simpler

single-period model that still generates a meaningful way to have targeted and non-targeted prices.

Essentially, there are some consumers (that we call the “old” market, as long as they do not

protect their privacy) for whom targeting is possible, and then there is an anonymous market,

which can be composed of consumers “new” to the firm as well as of those repeat consumers

who conceal/erase their data, such that the firm is not able to identify them. This feature is shared

with Montes et al. (2015) who study a different question of data intermediaries. Our paper also

differs from this stream of the literature in that consumers’ privacy decisions are endogenous

in our model.

Second, there is a stream of literature that examines the implications of consumer privacy

explicitly. Similar to the literature on behavior-based price discrimination, the majority of this stream

of literature studies pricing and privacy regulation, as well as their implications and consequences

on welfare, while assuming that consumers’ privacy decisions are exogenously determined (Taylor

2004, Acquisti and Varian 2005, Taylor and Wagman 2014, Bergemann and Bonatti 2015, Shy

and Stenbacka 2016). Acquisti et al. (2015) present an up-to-date review of this literature.

A growing number of contributions have considered the implications of consumers’ endogenous

decisions regarding how much information to reveal to the firm. Conitzer et al. (2012) study

a monopolist’s pricing problem in the framework of a two-period model, where consumers have an

5 The literature on consumer addressability is also closely related (Chen et al. 2001, Chen and Iyer 2002).

option of maintaining anonymity at the end of first period. They show that consumers benefit from

an increase in the cost of anonymity, up to a certain point. Casadesus-Masanell and Hervas-Drane

(2015) consider a duopoly setting where consumers can choose the amount of information being

provided to the firms. This information can help the firms to improve the quality of their products.

Firms derive revenues from both consumer purchases and disclosure of consumer information in a

secondary market. Montes et al. (2015) study the effects of price discrimination with endogenous

consumers’ privacy choices in the context of a duopoly Hotelling model. There is a data broker who

collects consumers’ information and can sell data to the two competing sellers. They show that

the optimal selling strategy for the owner of consumer data is to deal with one firm exclusively.

All these papers assume a perfect profiling technology where the exact valuation or preference of

a consumer can be inferred by the firms.

There are two works that are most closely related to ours in that the profiling technology is assumed to

be imperfect. In Koh et al. (2015) consumers can choose to disclose their private information to a

monopolist in return for reduced search cost due to more accurate product recommendation. They

face a trade-off between better product-fit and potential price discrimination. Belleflamme (2015)

studies the optimal pricing of a monopolist who is able to profile consumers, while consumers are able

to counteract by maintaining anonymity. What these two models have in common is that the signal

received by the firm is assumed to follow a Bernoulli distribution: a consumer’s true valuation

is revealed with probability β, and no new information is obtained with probability (1 − β). Our work differs

from these papers in two important aspects. First, in our paper, the firm makes an endogenous and

costly investment in the precision of the signal, whereas the profiling technology is exogenously

given in Koh et al. (2015) and Belleflamme (2015). Coupled with the fact that consumers choose

endogenously whether or not to reveal information, we add an important dimension related to

privacy costs and privacy protection, allowing us to have a significant welfare discussion. Second, we

further expand on the profiling technology by considering two distinct scenarios, namely investment

in a general purpose technology, which is independent of consumer base size, and investment that

depends on data requirements, where there exists externality in the size of the dataset. We then

derive managerial implications on when/how/why a firm should invest in profiling technology.

3. The Model

3.1. The Setup

A monopolist sells a product to a continuum of consumers with a total mass of one. There are two

market segments, namely an “old market” and a “new market”. The difference is that nothing is

known about consumers in the new market, while information can be obtained about consumers

in the old market. Assume that the size of the old market is λ, and the size of the new market is

1−λ. In the following analysis, we use the subscripts o and n to denote the old market and the new

market, respectively. As discussed in the previous section, most of the literature on behavior-based

price discrimination assumes a multi-period setting where the firms price discriminate consumers

based on their purchase history. We abstract away from this by using the “old” vs “new” markets

as a simple but flexible device to model different segments of consumers. These two markets are

described next.

A consumer’s valuation in the old (new) market is determined by the realization of a random

variable, which follows a cumulative distribution function Fo(v) (Fn(v)). The support of Fi(v) is

normalized to be the unit interval, and is assumed to be smooth, with strictly positive density

fi(v), i= o,n. A consumer in the old market can choose to conceal his identity to prevent the firm

from profiling his valuation towards the product. A cost of c (≥ 0) is incurred if a consumer spends

the effort to conceal his identity.

The firm invests a total amount of K in collecting old consumers’ information and profiling

consumers. The investment in consumer profiling allows the firm to receive a signal for each consumer

in the old market who reveals his identity. Following the discussion in Section 1, the accuracy of the

signal is dependent on the firm’s investment K, as well as on the fraction of consumers choosing to

reveal their identities, γ. More accurate signals are received with a higher investment K. The firm

is also able to get a more accurate signal with a greater proportion of old consumers γ who choose

to reveal their valuations, which constitute the dataset that can be analyzed with statistical

techniques. As a result, there exist externalities in consumer profiling, because a consumer’s decision

of revealing his identity has implications for the rest of consumers in the dataset. No profiling can

instead be conducted in the new market.

Denote by s the random signal the firm receives for an old consumer if he chooses to reveal his

identity. Let G(s|v) be the conditional cumulative distribution of signals from a type-v consumer.

Notice that G(s|v) depends on K and γ, but we omit them to simplify the notation. Its

corresponding density function is denoted as g(s|v). We make the following assumption on the conditional

distribution of signals.

Assumption 1 (Monotone Likelihood Ratio). For all $v_1 \ge v_2$ and all $s_1 \ge s_2$,

$$\frac{g(s_1|v_1)}{g(s_1|v_2)} \ge \frac{g(s_2|v_1)}{g(s_2|v_2)}.$$

The monotonicity in likelihood ratio is a refinement of the concept of first-order stochastic

dominance. This assumption implies that G(s|v1)≤G(s|v2), ∀v1 ≥ v2, i.e., signals generated by a

consumer with a higher valuation dominate signals generated by a consumer with a lower

valuation in the sense of first-order stochastic dominance. As we will show below, this assumption

guarantees that there exists a cutoff v̄ such that any consumer with a valuation greater than v̄

would choose to conceal, and any consumer with a valuation less than v̄ would choose to reveal in

the old market.

The game unfolds in several stages. First, each consumer realizes his valuation, v, on the unit

interval. The firm does not observe v; however, the distribution Fi(v), i = o, n, is common

knowledge, known to both the firm and consumers. Second, the firm decides on an investment level K in

consumer profiling. In the third stage, each consumer in the old market decides whether or not to

conceal his identity. A cost of c is incurred when concealing his identity. Next, the firm sets a base

price to consumers in the new market, as well as to those in the old market who conceal their

identities. The firm also offers a tailored price to each consumer in the old market who chooses to reveal

his identity, based on the firm’s belief of the consumer’s valuation. Finally, each consumer makes

the purchase decision, and he will purchase the product if and only if his utility from purchasing

the product is non-negative.

3.2. Preliminary Results

We use Perfect Bayesian Equilibrium as the solution concept. For a consumer in the new market,

the only decision he has to make is whether or not to purchase the product in the last stage. For a

type-v consumer, he will purchase the product if and only if his utility from purchasing is greater

than or equal to zero, i.e., un(v) = v−p≥ 0, where p is the price charged to anonymous consumers.

In the old market, a type-v consumer’s expected utility from concealing his identity is uo(v) =

v−p− c. The expected utility for a type-v consumer if he chooses to reveal his identity is given by

uo(v) = v−E[p(s)|v], where p(s) is the price the firm charges upon receiving a signal s. Assume

that if a consumer is indifferent between revealing or concealing, he would choose to save the effort

and simply reveal his identity. Consequently, a consumer with valuation v would choose to conceal

his identity if and only if

v− p− c > v−E[p(s)|v] ⇔ E[p(s)|v]> p+ c.

We focus on the equilibrium where there is a cutoff v̄, such that all consumers in the old market

with v > v̄ choose to conceal, and all consumers with v < v̄ choose to reveal. The existence of

such cutoff is guaranteed by Assumption 1 as the signal from a consumer with a higher valuation

dominates that of a consumer with a lower valuation in the sense of first-order stochastic dominance

(see Lemma 1 below). The cutoff is given by v̄ = inf{v ∈ [0,1] : E[p(s)|v] > p + c}, which

reduces to the solution of E[p(s)|v̄] = p + c if E[p(s)|v] is continuous in v.

Suppose that E[p(s)|v] is continuous in v. Given a cutoff v̄, the firm’s belief of the valuation of

a consumer in the old market who chooses to conceal his identity is given by

$$q(v) = \frac{f_o(v)}{1 - F_o(\bar{v})}, \quad \forall v \in (\bar{v}, 1].$$

The firm’s belief of a consumer’s valuation upon receiving a signal s is given by

$$h(v|s) = \frac{g(s|v) \cdot f_o(v)/F_o(\bar{v})}{\int_0^1 g(s|v) \cdot f_o(v)/F_o(\bar{v}) \, dv} = \frac{g(s|v) \cdot f_o(v)}{\int_0^1 g(s|v) \cdot f_o(v) \, dv}, \quad \forall v \in [0, \bar{v}], \tag{1}$$

where its corresponding cumulative distribution is denoted as H(v|s). The following lemma is a

direct consequence of Assumption 1 and properties of monotone likelihood ratio.

Lemma 1. For all $v_1 \ge v_2$ and all $s_1 \ge s_2$,

$$\frac{h(v_1|s_1)}{h(v_1|s_2)} \ge \frac{h(v_2|s_1)}{h(v_2|s_2)}.$$

In the remainder, we assume that consumers’ valuations are uniformly distributed in both the

new market and the old market. The firm’s belief of the valuation of a consumer in the old market

who chooses to conceal his identity can be simplified as

$$q(v) = \frac{1}{1 - \bar{v}}, \quad \forall v \in (\bar{v}, 1].$$

Consumers in the new market and those consumers in the old market who conceal their identities,

such that the firm is not able to identify them, constitute an anonymous market. The firm’s

expected revenue from charging a price p to those anonymous consumers, from both the old market

and the new market, is given by

$$\pi_a(p) = \lambda(1 - \bar{v}) \cdot p + (1 - \lambda) \cdot \int_p^1 p \, dx,$$

and thus the optimal price that maximizes the firm’s profit is given by

$$p^*(\bar{v}) = \frac{1}{2} + \frac{\lambda(1 - \bar{v})}{2(1 - \lambda)}, \tag{2}$$

subject to the constraint that v̄ − p∗(v̄) − c ≥ 0. Otherwise the type-(v̄ + ε) consumer,

for any small ε > 0, is always better off revealing his identity and receives a non-negative expected

utility. Equation (2) has a simple interpretation. Because the average valuation in the anonymous

market is greater than that in the new market, p∗(v̄) is always above the standard monopoly price

(1/2) if v̄ < 1. If v̄ = 1, no one in the old market conceals, and the firm faces two identical markets.

The indifferent type in the new market is given by vn = p∗(v̄), which must be between 0 and 1.

Simplifying the preceding two constraints yields v̄ ≥ [2c(1−λ)+1]/(2−λ). As v̄ ≤ 1, a sufficient condition

for the existence of a separating equilibrium, where some consumers in the old market choose to

reveal their identities and others choose to conceal their identities, is given by 1 ≥ [2c(1−λ)+1]/(2−λ), which

is equivalent to c≤ 1/2. We assume that this inequality holds throughout the following analysis.
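
As a rough numerical check of Equation (2) and of the lower bound on the cutoff, the following Python sketch evaluates both. The function names and the parameter values λ = 0.4 and c = 0.2 are assumptions chosen for illustration only; they do not come from the paper.

```python
def anonymous_price(v_bar: float, lam: float) -> float:
    """Equation (2): price charged in the anonymous market for a given cutoff."""
    return 0.5 + lam * (1.0 - v_bar) / (2.0 * (1.0 - lam))


def min_cutoff(c: float, lam: float) -> float:
    """Smallest cutoff that satisfies v_bar - p*(v_bar) - c >= 0."""
    return (2.0 * c * (1.0 - lam) + 1.0) / (2.0 - lam)


def anonymous_revenue(p: float, v_bar: float, lam: float) -> float:
    """Revenue from old-market concealers plus new-market buyers with v >= p."""
    return lam * (1.0 - v_bar) * p + (1.0 - lam) * p * (1.0 - p)


# Illustrative values (assumptions, not taken from the paper).
lam, c = 0.4, 0.2
v_low = min_cutoff(c, lam)            # lower bound on the cutoff
p_star = anonymous_price(v_low, lam)  # anonymous price at that cutoff
slack = v_low - p_star - c            # the constraint binds at the bound

print(f"cutoff bound = {v_low:.3f}, price = {p_star:.3f}, slack = {slack:.1e}")
```

With these values the constraint holds with equality at the bound, and the anonymous price exceeds the standard monopoly price of 1/2, as the interpretation of Equation (2) suggests.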

3.3. Formalization of Signal Accuracy

Next we formalize the definition of accuracy of signals. In particular, we assume that the conditional

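density of signals from a type-v consumer follows the specification below. For any αK,γ > 0,

$$g(s|v) = \begin{cases} \dfrac{1}{2\alpha_{K,\gamma}} & \text{if } v - \alpha_{K,\gamma} \le s \le v + \alpha_{K,\gamma}, \\ 0 & \text{if } s < v - \alpha_{K,\gamma} \text{ or } s > v + \alpha_{K,\gamma}. \end{cases}$$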

That is, the conditional distribution of signals still follows a uniform distribution. The signal

s degenerates to the constant v if αK,γ = 0. It is easy to verify that our specification of g(s|v)

satisfies Assumption 1. The mean of the random signal equals the consumer’s valuation v, which

is independent of the firm’s investment as well as others’ decisions. However, the conditional

distribution rotates around the mean v as K and γ vary.6 Consequently, the accuracy of the signal

is determined solely by αK,γ . To reflect our assumption that the firm gets a more accurate signal

with either a higher investment K, or a greater proportion of consumers γ choosing to reveal their

valuations, we assume that αK,γ is weakly decreasing in K and γ.

Figure 1 Cumulative Distribution Functions of Prior, Signals, and Posterior: (a) Prior Belief of Consumer Valuations; (b) Conditional Distribution of Signals; (c) Posterior Belief upon Receiving s

Cumulative distribution functions of the firm’s prior belief, conditional distribution of signals,

and the posterior belief upon receiving a signal s are illustrated in Figure 1. Figure 1(a) shows

the firm’s prior belief about consumers’ valuations, which is uniformly distributed on the unit

line segment. When a consumer does not spend the effort to conceal his identity, a signal is

generated. The conditional distribution of the signal, as illustrated in Figure 1(b), is a rotation of the

6 We note that the idea of rotation of distributions is similar to the one studied in Johnson and Myatt (2006).

prior belief around the consumer’s true valuation, i.e., v. The dispersion of the signal’s distribution

depends on both the firm’s investment level K, and the fraction of consumers choosing to reveal

their identities γ. Once the firm receives the signal, its belief of the consumer’s valuation is updated

according to Bayes’ rule, and the posterior belief is illustrated by the red curve in Figure 1(c).
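
Under the uniform specification, the Bayesian update has a simple closed form: with a flat prior and a flat likelihood, the posterior is uniform on the set of valuations consistent with the signal. The Python sketch below illustrates this, assuming a uniform prior on [0, v_bar] for revealing consumers and a known half-width alpha; the function and parameter names are hypothetical.

```python
def posterior_support(s: float, alpha: float, v_bar: float = 1.0) -> tuple:
    """Interval of valuations in [0, v_bar] that could have generated signal s."""
    lo = max(0.0, s - alpha)
    hi = min(v_bar, s + alpha)
    if lo > hi:
        raise ValueError("signal is inconsistent with every valuation in [0, v_bar]")
    return lo, hi


def posterior_cdf(v: float, s: float, alpha: float, v_bar: float = 1.0) -> float:
    """Posterior CDF H(v|s): uniform on the support returned above."""
    lo, hi = posterior_support(s, alpha, v_bar)
    if hi == lo:
        # Degenerate case (for example alpha = 0): all mass sits on one point.
        return 1.0 if v >= lo else 0.0
    return min(1.0, max(0.0, (v - lo) / (hi - lo)))


# A signal of 0.5 with half-width 0.1 narrows the belief to roughly [0.4, 0.6].
print(posterior_support(0.5, 0.1))
# Near the boundary the support is truncated at zero.
print(posterior_support(0.05, 0.1))
```

A smaller alpha shrinks the support around the signal, which is the sense in which a higher K or a larger γ makes profiling more accurate.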

4. α as a Step Function

We first consider the case where αK,γ follows a step function as specified below. A more general

specification of signals is considered in Section 5.

$$\alpha_{K,\gamma} = \begin{cases} 1 & \text{if } K < \tau(\gamma), \\ 0 & \text{if } K \ge \tau(\gamma). \end{cases} \tag{3}$$

That is, for a given fraction of consumers who choose to reveal their identities γ, if the investment

from the firm is less than τ(γ), the firm is not able to gain any extra information by profiling those

consumers. On the other hand, the firm is able to perfectly profile those consumers who reveal

their identities and knows their valuations, if the firm’s investment reaches the threshold τ(γ).

We assume that τ(γ) is non-increasing and concave in γ. The non-increasing property of τ(·)

is consistent with the intuition that the firm’s investment required to profile consumers perfectly

decreases (weakly) in the fraction of consumers choosing to reveal their identities. The concavity

of τ(·) suggests that the required investment level decreases slowly when γ is small, whereas

the marginal effect of the size of the dataset on the required investment increases as the dataset

becomes larger. It is motivated by the “data threshold” commonly observed in practice (Norvig

2010). We also assume that τ(1)> 0, i.e., under the situation when all consumers choose to reveal

their identities, the firm still needs to commit a certain level of investment in order to profile the

consumers perfectly. We will often work with the inverse function. If the firm’s investment is K, the

minimal fraction of consumers choosing to reveal their identities that allows the firm to perfectly

profile consumers’ valuations is given by τ⁻¹(K).
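
To make the step specification concrete, here is a minimal Python sketch of Equation (3). The threshold function tau(gamma) = 1.5 − gamma² is a hypothetical choice made only because it is non-increasing, concave, and satisfies tau(1) > 0; the paper does not specify a functional form.

```python
import math


def tau(gamma: float) -> float:
    """Hypothetical investment threshold: non-increasing, concave, tau(1) = 0.5 > 0."""
    return 1.5 - gamma ** 2


def tau_inverse(K: float):
    """Minimal revealing fraction that makes profiling perfect at investment K."""
    if K >= tau(0.0):
        return 0.0   # the investment suffices even with almost no data
    if K < tau(1.0):
        return None  # not enough even if every consumer reveals
    return math.sqrt(1.5 - K)


def alpha(K: float, gamma: float) -> float:
    """Equation (3): 1 (uninformative signal) below the threshold, 0 (perfect) at or above it."""
    return 0.0 if K >= tau(gamma) else 1.0


print(alpha(1.0, 0.5))   # tau(0.5) = 1.25 > 1.0, so the signal is uninformative
print(alpha(1.0, 0.8))   # tau(0.8) = 0.86 <= 1.0, so profiling is perfect
print(tau_inverse(1.0))  # about 0.707 of old consumers must reveal
```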

4.1. Characterization of Equilibrium

With αK,γ being a step function as specified in Equation (3), consumers’ behavior in the old market

can be characterized by the lemma below.

Lemma 2. (Profiling with αK,γ being a step function)

(i) (Perfect Profiling) If K ≥ Ko, all consumers in the old market with v > vo choose to

conceal their identities, and all consumers with v ≤ vo choose to reveal;

(ii) (None Profiling) If K < Ko, all consumers in the old market with v ≥ vK choose to conceal

their identities, and all consumers with v < vK choose to reveal,

where $v_o = \frac{2c(1-\lambda)+1}{2-\lambda}$, $K_o = \tau(v_o)$, and $v_K = \tau^{-1}(K)$.

Consumers’ optimal responses under various firm’s investment levels are illustrated in Figure 2.

Recall that a consumer from the old market needs to spend effort c in order to conceal his identity.

Intuitively, any consumer with a valuation lower than c would find it unattractive to conceal his

identity, and choose to simply reveal his identity. However, as shown in Lemma 2, the fraction of

consumers who are guaranteed to reveal their identities is given by vo = [2c(1− λ) + 1]/(2− λ),

which is (1− λc)/(2− λ) higher than c. Consequently, as long as the firm is able to commit an

investment level of at least Ko = τ(vo), it would gather sufficient data to profile consumers perfectly.

On the other hand, when the firm’s investment K is less than Ko, consumers would get zero surplus

if the firm were able to profile them perfectly. Consumers therefore coordinate

such that the fraction of consumers who choose to reveal their identities is less than τ−1(K).
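The consumer responses in Lemma 2 can be sketched in a few lines of Python. This is only an illustration: the functions `tau` and `tau_inv` below are hypothetical (a concave, decreasing τ with τ(1) > 0 chosen for convenience, not taken from the paper), and the helper names are ours.

```python
def reveal_threshold(lam: float, c: float) -> float:
    """v_o from Lemma 2: old-market consumers with v <= v_o reveal when K >= K_o."""
    return (2 * c * (1 - lam) + 1) / (2 - lam)


def tau(gamma: float) -> float:
    """Hypothetical investment requirement: concave, decreasing, tau(1) = 0.05 > 0."""
    return 0.2 - 0.15 * gamma ** 2


def tau_inv(K: float) -> float:
    """Inverse of the hypothetical tau above, valid for K in [0.05, 0.2]."""
    return ((0.2 - K) / 0.15) ** 0.5


def old_market_choice(v, K, lam, c, tau, tau_inv):
    """Return 'reveal' or 'conceal' for a consumer with valuation v (Lemma 2)."""
    v_o = reveal_threshold(lam, c)
    K_o = tau(v_o)
    if K >= K_o:                       # case (i): perfect profiling
        return "reveal" if v <= v_o else "conceal"
    v_K = tau_inv(K)                   # case (ii): K < K_o
    return "reveal" if v < v_K else "conceal"


if __name__ == "__main__":
    lam, c = 0.5, 0.2
    print(reveal_threshold(lam, c))                           # v_o
    print(old_market_choice(0.85, 0.15, lam, c, tau, tau_inv))
    print(old_market_choice(0.85, 0.08, lam, c, tau, tau_inv))
```

Because τ is decreasing, an investment below Ko maps to a cutoff vK above vo, which is why a consumer with v = 0.85 switches from concealing to revealing when K falls below Ko in this example.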

Figure 2 Consumers’ Optimal Responses under Various Investment Levels

(a) K ≥Ko (Perfect Profiling) (b) K <Ko (None Profiling)

Taking derivatives of vo, as identified in Lemma 2, with respect to c and λ, we have ∂vo/∂c = 2(1−λ)/(2−λ) ≥ 0 and ∂vo/∂λ = (1−2c)/(2−λ)² ≥ 0, where the first inequality is due to λ ≤ 1, and the second inequality is due

to our assumption that c ≤ 1/2. Combining preceding results with the monotonicity of τ(γ), we

can establish the monotonicity of Ko, which is summarized in the corollary below.

Corollary 1. (Monotonicity of Ko)

(i) Ko is non-increasing in c;

(ii) Ko is non-increasing in λ.

Note, in particular, that the required level of investment to perfectly profile consumers Ko is

non-increasing in the cost of concealing c. This is intuitive because more consumers would find

concealing their identities unattractive with a higher c. The effect of the market composition is

more nuanced. In the old market, only those consumers with relatively high valuations would choose

to conceal their identities. Thus, if a larger proportion of consumers come from the old market,

the profit-maximizing firm would charge a higher price to anonymous consumers, which dissuades

consumers in the old market from concealing their identities, allowing the firm to get away with a

lower investment level.

Next we study the firm’s optimal investment level in the first stage. When the firm’s investment

level K is greater than or equal to Ko, any consumer in the old market with a valuation greater

than vo would choose to conceal his identity. Consequently, the optimal price the firm charges to

those anonymous consumers in both old and new markets is given by

$$p^*(v_o) = v_o - c = \frac{1-\lambda c}{2-\lambda}, \tag{4}$$

and thus the firm’s optimal profit with an investment of K is given by

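$$\begin{aligned}
\pi_K(\lambda, c) &= \lambda\left(\int_0^{v_o} v\,dv + \int_{v_o}^{1} p^*(v_o)\,dv\right) + (1-\lambda)\int_{p^*(v_o)}^{1} p^*(v_o)\,dv - K \\
&= \frac{1}{2(2-\lambda)} + \frac{\lambda c^2(1-\lambda)}{2-\lambda} - K.
\end{aligned} \tag{5}$$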

As indicated by the profit function, any investment beyond Ko does not yield any extra revenue,

because the firm can already profile consumers perfectly with an investment of Ko, and the fraction

of consumers choosing to reveal their identities remains the same. On the other hand, when K <Ko,

the optimal price the firm charges to those anonymous consumers, according to Equation (2), is

given by

$$p^*(v_K) = \frac{1}{2} + \frac{\lambda(1-v_K)}{2(1-\lambda)}, \tag{6}$$

and the firm’s optimal profit, for any K ∈ [τ(1),Ko), is

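$$\begin{aligned}
\pi_K(\lambda, c) &= \lambda\left(\int_{v_K/2}^{v_K} \frac{v_K}{2}\,dv + \int_{v_K}^{1} p^*(v_K)\,dv\right) + (1-\lambda)\int_{p^*(v_K)}^{1} p^*(v_K)\,dv - K \\
&= \frac{\lambda(v_K-1)^2}{4(1-\lambda)} + \frac{1}{4} - K.
\end{aligned} \tag{7}$$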

When K < τ(1), the firm is not able to gain any information from profiling even if all consumers

in the old market choose to reveal their identities. Thus, all consumers in the old market would be

better-off revealing their identities, and the firm obtains the optimal profit with a zero investment,

i.e., π∗K(λ, c) = π0(λ, c) = 1/4, ∀K ∈ [0, τ(1)). Under the assumption that τ(·) is a concave function,

the firm’s optimal investment level is characterized by the lemma below.

Lemma 3 (Optimal investment level with αK,γ being a step-function). If τ(·) is a

concave function, the firm’s optimal investment level is either 0 or Ko.

Lemma 3 indicates that when the required level of investment is concave and decreasing in the

fraction of consumers choosing to reveal their identities, the optimal solution is to either invest

the minimum amount Ko that allows the firm to profile consumers perfectly, or not to invest in

consumer profiling at all. With an investment of Ko, the firm’s optimal expected profit is given by

πKo(λ, c) = 1/[2(2−λ)] + λc²(1−λ)/(2−λ) − Ko. Since λ ∈ [0,1], we have 1/[2(2−λ)] ≥ 1/4 and λc²(1−λ)/(2−λ) ≥ 0, so the revenue from investing Ko is never below 1/4. As a result, the firm’s decision on investing

in consumer profiling depends ultimately on whether the increase in revenue outweighs the cost of

profiling consumers.
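The closed form in Equation (5) and the resulting invest-or-not comparison can be checked numerically. The sketch below rebuilds the revenue from the three integrals in Equation (5) and compares it with the closed form; the function names are ours, and treating a tie as "do not invest" is an assumption made only for the illustration.

```python
def anonymous_price(lam: float, c: float) -> float:
    """Equation (4): price charged to anonymous consumers when K >= K_o."""
    return (1 - lam * c) / (2 - lam)


def revenue_from_integrals(lam: float, c: float) -> float:
    """Revenue in Equation (5), before subtracting K, assembled term by term."""
    p = anonymous_price(lam, c)
    v_o = p + c
    revealed = v_o ** 2 / 2        # integral of v over [0, v_o]
    concealed = p * (1 - v_o)      # old-market consumers who conceal and buy at p
    new_market = p * (1 - p)       # new-market consumers with v >= p
    return lam * (revealed + concealed) + (1 - lam) * new_market


def revenue_closed_form(lam: float, c: float) -> float:
    """Closed form on the second line of Equation (5), before subtracting K."""
    return 1 / (2 * (2 - lam)) + lam * c ** 2 * (1 - lam) / (2 - lam)


def firm_invests(lam: float, c: float, K_o: float) -> bool:
    """Lemma 3 leaves two candidates, 0 and K_o; invest if K_o yields more than 1/4.

    Breaking ties in favour of no investment is an assumption of this sketch.
    """
    return revenue_closed_form(lam, c) - K_o > 0.25


if __name__ == "__main__":
    print(revenue_from_integrals(0.5, 0.2), revenue_closed_form(0.5, 0.2))
    print(firm_invests(0.5, 0.2, 0.05), firm_invests(0.5, 0.2, 0.10))
```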

4.2. Welfare Implications

Interestingly, whether or not the firm chooses to invest has different implications on consumer

surplus (CS) and social welfare (SW). With an investment of Ko, consumer surplus and social

welfare are given by

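$$\begin{aligned}
CS_{K_o}(\lambda, c) &= \lambda\int_{v_o}^{1} [v - p^*(v_o) - c]\,dv + (1-\lambda)\int_{p^*(v_o)}^{1} [v - p^*(v_o)]\,dv \\
&= \frac{(2\lambda-3)(1-\lambda c)^2}{2(2-\lambda)^2} + \frac{1+\lambda c^2}{2} - \lambda c, \\
SW_{K_o}(\lambda, c) &= \lambda\left(\int_0^{v_o} v\,dv + \int_{v_o}^{1} (v - c)\,dv\right) + (1-\lambda)\int_{p^*(v_o)}^{1} v\,dv - K_o \\
&= \frac{1}{2} - \lambda c + \lambda c^2 + \frac{\lambda c(1-\lambda c)}{2-\lambda} - \frac{(1-\lambda)(1-\lambda c)^2}{2(2-\lambda)^2} - K_o.
\end{aligned}$$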

Similarly, with a zero investment, consumer surplus and social welfare are given by

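$$\begin{aligned}
CS_0(\lambda, c) &= \lambda\int_{1/2}^{1} \left(v - \tfrac{1}{2}\right)dv + (1-\lambda)\int_{1/2}^{1} \left(v - \tfrac{1}{2}\right)dv = \frac{1}{8}, \\
SW_0(\lambda, c) &= \lambda\int_{1/2}^{1} v\,dv + (1-\lambda)\int_{1/2}^{1} v\,dv = \frac{3}{8}.
\end{aligned}$$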

Neither λ nor c plays a role in consumer surplus or social welfare when the firm invests 0. The

reason is that with no investment from the firm, the signal from a consumer is non-informative.

That is, the firm’s posterior belief of a consumer’s valuation is exactly the same as the prior belief.

Consequently, any consumer in the old market would be better off revealing his identity, and the

firm faces two identical markets in terms of the distribution of consumer valuations. However, if

the firm invests, prices will differ and thus both λ and c affect the equilibrium. The impacts of λ

and c on profit and consumer surplus are summarized in the lemma below.

Lemma 4. (Structural Properties)

(i) πKo(λ, c) is increasing in both λ and c;

(ii) CSKo(λ, c) is decreasing in λ, and is convex in c;

(iii) SWKo(λ, c) is convex in both λ and c.
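As a sanity check on the welfare expressions of this subsection, the sketch below evaluates consumer surplus and social welfare under an investment of Ko directly from their defining integrals and compares them with the closed forms. Function names are ours; social welfare is reported gross of Ko, so Ko must be subtracted separately.

```python
def anonymous_price(lam: float, c: float) -> float:
    """Equation (4)."""
    return (1 - lam * c) / (2 - lam)


def revenue(lam: float, c: float) -> float:
    """Firm revenue under perfect profiling (Equation (5) before subtracting K)."""
    return 1 / (2 * (2 - lam)) + lam * c ** 2 * (1 - lam) / (2 - lam)


def cs_from_integrals(lam: float, c: float) -> float:
    """Consumer surplus with investment K_o, from the two defining integrals."""
    p = anonymous_price(lam, c)
    v_o = p + c
    old_concealed = (1 - v_o) ** 2 / 2   # integral of (v - p - c) over [v_o, 1]
    new_market = (1 - p) ** 2 / 2        # integral of (v - p) over [p, 1]
    return lam * old_concealed + (1 - lam) * new_market


def cs_closed_form(lam: float, c: float) -> float:
    return ((2 * lam - 3) * (1 - lam * c) ** 2 / (2 * (2 - lam) ** 2)
            + (1 + lam * c ** 2) / 2 - lam * c)


def sw_gross_closed_form(lam: float, c: float) -> float:
    """Social welfare with investment K_o, before subtracting K_o."""
    return (0.5 - lam * c + lam * c ** 2
            + lam * c * (1 - lam * c) / (2 - lam)
            - (1 - lam) * (1 - lam * c) ** 2 / (2 * (2 - lam) ** 2))


if __name__ == "__main__":
    lam, c = 0.5, 0.2
    print(cs_from_integrals(lam, c), cs_closed_form(lam, c))
    print(sw_gross_closed_form(lam, c), revenue(lam, c) + cs_closed_form(lam, c))
```

On every grid point with λ ∈ [0,1] and c ∈ [0,1/2] that we tried, consumer surplus under investment stays at or below the no-investment value of 1/8, and gross welfare equals revenue plus consumer surplus.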

A direct consequence of Lemma 4(i) is that the firm is more likely to invest in consumer profiling

with either a higher c or a higher λ. The rationale behind this is that the firm’s profit with

zero investment is given by π0(λ, c) = 1/4, which is independent of both c and λ. On the other

hand, the firm’s profit from investing Ko is increasing in c and λ, and its maximum is realized at

πKo(λ= 1, c= 1/2) = 1/2− τ(1). Consequently, when τ(1)≤ 1/4, the firm would prefer investing in

consumer profiling over no investment if and only if λ and c are sufficiently high. We summarize

the result in the corollary below.

Corollary 2. (To Invest or Not to Invest) Under the equilibrium identified in Lemma

3,

(i) when τ(1)> 1/4, the firm always prefers no investment;

(ii) when τ(1)≤ 1/4, the firm is more likely to make an investment of Ko with a higher c and/or

a higher λ.

With a zero investment in consumer profiling, the firm is able to profile a consumer only to the

granularity of markets, i.e., whether a consumer comes from the old market or the new market.

Consequently, the firm could potentially utilize the information, and offer prices tailored to the two

markets. Arguably in our model, the benefit from this third-degree price discrimination does not

arise due to the assumption of identical valuation distribution across the two markets. The benefit

of investing Ko comes from knowing the exact valuation of every single consumer who chooses

to reveal his identity in the old market, thus allowing the firm to offer a tailored price to each

individual consumer. For a fixed investment, the fraction of consumers who choose to reveal their

identities will be greater with a greater c or a greater λ, and thus the option of perfect profiling

becomes more attractive.

The impacts of profiling on consumer surplus and social welfare are summarized in the corollary

below.

Corollary 3. (Impact of Profiling on Consumer Surplus and Social Welfare)

Comparing consumer surplus and social welfare under investment levels of 0 and Ko, we have

(i) for any λ and c, CSKo(λ, c)≤CS0(λ, c);

(ii) for any λ> 1/4 and c, SWKo(λ, c)≥ SW0(λ, c) when Ko is sufficiently small.

It is not surprising that investment in profiling enables the firm to capture more consumer

surplus than it would otherwise without the investment. However, this investment is not necessarily

socially optimal. If the firm chooses to invest in profiling consumers, the firm is able to sell to more

consumers in the old market, especially to those with relatively low valuations due to personalized

pricing. This is good for efficiency. At the same time, consumers with higher valuations would

choose to spend the effort to avoid price discrimination from the firm, leading to a loss in efficiency.

Consequently, if the size of the old market is small, or the investment required to profile consumers

perfectly, i.e., Ko, is high, investment leads to a suboptimal situation from the perspective of social

welfare.

Having described what the equilibrium looks like, and having identified possible inefficiencies,

we now ask a natural and central follow-up question. What determines the extent to which the

investment is socially optimal? Imagine a situation where prices to consumers are always set by the

firm, but the investment level could be set by a social planner that maximizes total welfare instead

of just the firm’s profit. How does the investment level compare to that chosen by the firm? It turns

out that whether the firm’s investment is socially optimal depends critically on the function τ(·),

which determines the amount of investment required to perfectly profile consumers who choose to

reveal their identities, i.e., Ko. If the investment function τ(·) evaluated at vo is greater than an

upper threshold $\overline{K}_{\lambda,c}$, then it would be prohibitive for the firm to invest, and this decision turns

out to be efficient. On the other end of the spectrum, if the amount of investment required is less

than a lower threshold $\underline{K}_{\lambda,c}$, the firm prefers to invest in profiling consumers’ valuations, and the

increase in the sales outweighs the cost of investment Ko and the amount of effort consumers spend

to conceal their identities, leading to a socially-optimal investment decision. However, for moderate

Ko = τ(vo), the firm makes an excessive investment from the perspective of social welfare.

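Proposition 1. (Optimal Investment vs. Excessive Investment)

(i) When $K_o > \overline{K}_{\lambda,c}$, the firm does not invest in consumer profiling, and this decision is socially optimal;

(ii) when $\underline{K}_{\lambda,c} < K_o \le \overline{K}_{\lambda,c}$, the firm invests Ko in consumer profiling, which leads to excessive investment from the perspective of social welfare;

(iii) when $K_o \le \underline{K}_{\lambda,c}$, the firm invests Ko in consumer profiling, and this decision is also socially optimal;

where

$$\overline{K}_{\lambda,c} = \frac{1}{2(2-\lambda)} + \frac{\lambda c^2(1-\lambda)}{2-\lambda} - \frac{1}{4}
\quad\text{and}\quad
\underline{K}_{\lambda,c} = \frac{1}{8} - \lambda c + \lambda c^2 + \frac{\lambda c(1-\lambda c)}{2-\lambda} - \frac{(1-\lambda)(1-\lambda c)^2}{2(2-\lambda)^2}.$$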
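The three regimes of Proposition 1 translate directly into a small classifier. The sketch below is illustrative only: the function names and the regime labels are ours, and Ko is passed in as a number rather than computed from a particular τ.

```python
def upper_threshold(lam: float, c: float) -> float:
    """Upper threshold of Proposition 1: revenue gain from investing K_o over 1/4."""
    return 1 / (2 * (2 - lam)) + lam * c ** 2 * (1 - lam) / (2 - lam) - 0.25


def lower_threshold(lam: float, c: float) -> float:
    """Lower threshold of Proposition 1: gross welfare gain from investing over 3/8."""
    return (0.125 - lam * c + lam * c ** 2
            + lam * c * (1 - lam * c) / (2 - lam)
            - (1 - lam) * (1 - lam * c) ** 2 / (2 * (2 - lam) ** 2))


def investment_regime(K_o: float, lam: float, c: float) -> str:
    """Classify K_o into the three cases of Proposition 1."""
    if K_o > upper_threshold(lam, c):
        return "no investment, socially optimal"
    if K_o > lower_threshold(lam, c):
        return "investment, excessive"
    return "investment, socially optimal"


if __name__ == "__main__":
    lam, c = 0.5, 0.2
    print(upper_threshold(lam, c), lower_threshold(lam, c))
    for K_o in (0.10, 0.05, 0.01):
        print(K_o, investment_regime(K_o, lam, c))
```

With λ = 0.5 and c = 0.2 the thresholds evaluate to roughly 0.09 and 0.015, so a required investment of 0.05 falls in the excessive-investment band.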

A direct consequence of Proposition 1, as shown in Corollary 4(i) below, is that the interval

where the firm’s investment is inefficient becomes larger for a relatively larger old market. That

is, when the size of the old market is large, the option of knowing consumers’ valuations perfectly

becomes more attractive for the firm because the firm is able to price discriminate a larger fraction

of the market. Consequently, the chance of excessive investment becomes higher, especially when

the investment required is high.

Corollary 4. (Impacts of λ and c on the Inefficiency Interval) The width of the interval, $\overline{K}_{\lambda,c} - \underline{K}_{\lambda,c}$, is

(i) increasing in λ;

(ii) concave in c, where its maximum and minimum are realized at c = (1 − λ)/(4 − 3λ) and c = 1/2, respectively.

Corollary 4(ii) shows that the width $\overline{K}_{\lambda,c} - \underline{K}_{\lambda,c}$ is concave in c. It is easy to verify that

c= (1−λ)/(4−3λ)∈ [0,1/4] due to λ∈ [0,1]. Consequently, starting from c= 1/2, reducing the cost

of concealing increases the width of the inefficiency interval leading to a higher chance of excessive

investment from the perspective of total welfare. The width of the inefficiency interval is maximized

at c= (1−λ)/(4−3λ), where any further reduction in the concealing cost from this point narrows

the interval. The interest in this discussion stems from the fact that c could be interpreted as

a policy tool: a stricter privacy law would make c lower, and vice-versa. From this perspective,

a policy maker that promotes total welfare (rather than consumer surplus alone) should make

data protection very costly, because this minimizes the probability that inefficiencies could arise.

However, this is true only if the policy maker can affect the entire range of values of c, which may

not be realistic. Often only piecemeal policy changes are implementable, and thus the policy maker

could only affect privacy costs incrementally. Corollary 4(ii) shows that the inefficiency interval is

non-monotonic in c: starting from a regime with very easy data protection (low c), making data

protection a bit more costly for consumers would actually worsen total welfare.
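The location of the maximum in Corollary 4(ii) can be checked with a simple grid search over c. The sketch below is a rough numerical illustration with names of our own choosing; it computes the width of the inefficiency interval from the two thresholds in Proposition 1 and compares the grid maximizer with (1−λ)/(4−3λ).

```python
def inefficiency_width(lam: float, c: float) -> float:
    """Upper minus lower threshold from Proposition 1."""
    upper = 1 / (2 * (2 - lam)) + lam * c ** 2 * (1 - lam) / (2 - lam) - 0.25
    lower = (0.125 - lam * c + lam * c ** 2
             + lam * c * (1 - lam * c) / (2 - lam)
             - (1 - lam) * (1 - lam * c) ** 2 / (2 * (2 - lam) ** 2))
    return upper - lower


def c_grid(n: int = 5001):
    """Evenly spaced values of c on [0, 1/2]."""
    return [0.5 * i / (n - 1) for i in range(n)]


def widest_c(lam: float) -> float:
    """Grid maximizer of the width over c in [0, 1/2]."""
    return max(c_grid(), key=lambda c: inefficiency_width(lam, c))


if __name__ == "__main__":
    for lam in (0.1, 0.5, 0.9):
        print(lam, widest_c(lam), (1 - lam) / (4 - 3 * lam))
```

The same grid also shows the width is smallest at c = 1/2, which is the basis for the policy discussion above: moving c upward from a very low level first widens the interval before narrowing it.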

4.3. Impact of Data Requirements

The final step in this section concerns the properties of the sampling technology that is used to

profile consumers. We study how the firm’s investment decision and profit will be affected by the

different scenarios with respect to data requirement. The definition below sets the stage for our

discussion. In particular, we say one scenario τ1 indicates higher data requirement than another

scenario τ2, if τ1(γ)≥ τ2(γ), ∀γ ∈ [0,1]. That is, with the same investment from the firm, a larger

fraction of consumers is required in order to profile their valuations perfectly under τ1 than that

under τ2. An alternative way to interpret the definition is that, for the same fraction of consumers

who reveal their identities, a higher investment is needed from the firm under τ1 to profile consumers

perfectly.

Definition 1. (Higher Data Requirement) For two functions τ1 and τ2, τ1 represents a

scenario with higher data requirement than τ2 if τ1(γ) ≥ τ2(γ), ∀γ ∈ [0,1] and the inequality is

strict for some γ.
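Definition 1 is a pointwise ordering of two functions, which can be checked approximately on a grid. The sketch below is only illustrative: `tau_high` and `tau_low` are hypothetical examples, not functions from the paper, and a grid check can miss violations between grid points.

```python
def higher_data_requirement(tau1, tau2, n: int = 1001, tol: float = 1e-12) -> bool:
    """Grid check of Definition 1: tau1 >= tau2 on [0, 1], strictly somewhere."""
    diffs = [tau1(i / (n - 1)) - tau2(i / (n - 1)) for i in range(n)]
    return all(d >= -tol for d in diffs) and any(d > tol for d in diffs)


def tau_high(gamma: float) -> float:
    """Hypothetical scenario that needs more investment for every gamma."""
    return 0.3 - 0.15 * gamma ** 2


def tau_low(gamma: float) -> float:
    """Hypothetical scenario that needs less investment for every gamma."""
    return 0.2 - 0.15 * gamma ** 2


if __name__ == "__main__":
    print(higher_data_requirement(tau_high, tau_low))
    print(higher_data_requirement(tau_low, tau_high))
```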

The implications of higher data requirement on the firm’s profit, consumer surplus and social

welfare are summarized in the proposition below. Because, given the same fraction of consumers

who choose to reveal their identities, the amount of investment required is lower with lower data

requirement, the firm is more likely to invest in consumer profiling, and the firm’s profit is always

higher under a scenario with lower data requirement. On the other hand, because the firm’s investment

always leads to lower consumer surplus as shown in Corollary 3, consumer surplus is thus

lower when the data requirement is lower.

Proposition 2. (Impact of Data Requirement) Consider two scenarios τ1 and τ2, where

τ1 indicates higher data requirement. Then,

(i) the firm is more likely to invest in profiling under τ2, and the firm’s optimal profit is also

higher under τ2;

(ii) consumer surplus is (weakly) lower under τ2;

(iii) with small λ and c, the firm’s investment decision is socially optimal under both scenarios;

with large λ and c, the firm’s investment decision is more likely to be efficient under τ2; with

moderate λ and c, the firm’s investment decision is more likely to be efficient under τ1.

The impact of data requirement on social welfare is the most involved and deserves further

comment. Recall that the firm’s profit when investing in consumer profiling is always increasing

in λ and c. Consequently, with small λ and c, the firm does not invest under either a high-data-

requirement scenario or a low-data-requirement scenario, and the firm’s decision is efficient under

both scenarios. With large λ and c, the firm invests under both scenarios, and its investment decision

is more likely to be socially optimal under a scenario with lower data requirement due to the

lower amount of investment required. With moderate λ and c, the firm will invest when the data

requirement is low, but does not invest otherwise. In this case, the decision of no investment under

high data requirement is guaranteed to be efficient, while the decision of investment may be excessive

if the condition shown in Proposition 1(ii) is satisfied. As a result, higher data requirements

may be beneficial to the entire society through dissuading the firm from unnecessary investment

in profiling.

Proposition 2 indicates that efficiency of the firm’s investment depends on not only the market

composition λ and consumers’ flexibility in concealing their identities c, but also data requirements.

Under the circumstances when the data requirement is low, policy makers can potentially increase

consumers’ cost of concealing their identities such that the firm’s interest is better aligned with

social welfare. On the other hand, under the scenario with big data, where a large fraction of

consumers are required for the firm to profile them to a good extent, increasing consumers’ flexibility

in concealing their identities increases the chance that the firm’s decision is also socially optimal.

5. α as a General (Logistic) Function

We managed to get several interesting insights analytically in the previous section, but arguably

in a rather special case, where the signal accuracy is modelled by a step function. We further

generalize our findings in this section by employing a flexible logistic specification for αK,γ , which

is given by

$$\alpha_{K,\gamma} = \frac{1}{1 + \exp\{a(\gamma - b + K^d)\}}, \tag{8}$$

where a≥ 0 and d≥ 0. It is easy to verify that the general αK,γ given by Equation (8) is decreasing

in γ and K, which is consistent with our assumption that more accurate signals are received with a

higher investment K or a greater proportion of consumers γ who choose to reveal their valuations.

When a→∞, αK,γ degenerates to a step function. Consequently, studying this general αK,γ allows

us to verify our findings from the special case, as well as explore the regimes that would be infeasible

under the special case. The logistic specification generalizes the relationship between the amount of

data and the quality of signals with an S-shaped curve, which is consistent with the idea pioneered

by Peter Norvig (Director of Research at Google) as described in Section 1. That is, there exist

“data thresholds” above which the quality of signals one can potentially extract from the data

improves dramatically.

We further illustrate the shape of αK,γ with different parameters in Figure 3. Figure 3(a) suggests

that the parameter a is associated with the curvature of αK,γ. With a greater a, the curve rotates clockwise around the

point γ = b − K^d. Parameters b and d have similar effects, as illustrated in Figure

3(b) and (c), where changing the parameters shifts the curve either leftwards or rightwards. Based

upon Definition 1, the curve corresponds to a scenario with higher data requirement if it shifts

rightwards. Consequently, for two b1 and b2, b1 indicates a scenario with higher data requirement

than that of b2 if b1 > b2 and all the other parameters are the same. The effect of d depends on

the value of K. When K < 1, a greater d corresponds to a scenario with higher data requirement,

whereas a smaller d indicates higher data requirement when K > 1.
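The comparative statics described above are easy to reproduce from Equation (8). The sketch below is a minimal illustration; the function name and the default parameter values (taken from the note to Figure 3) are our choices for the example.

```python
import math


def alpha(K: float, gamma: float, a: float = 15.0, b: float = 0.8, d: float = 0.7) -> float:
    """Signal noise from Equation (8); smaller values mean more accurate signals."""
    return 1.0 / (1.0 + math.exp(a * (gamma - b + K ** d)))


if __name__ == "__main__":
    K = 0.1
    # Noise falls as more consumers reveal their identities.
    print([round(alpha(K, g), 4) for g in (0.2, 0.5, 0.8)])
    # A higher b shifts the curve rightwards (higher data requirement).
    print(alpha(K, 0.5, b=0.8), alpha(K, 0.5, b=0.9))
    # With K < 1, a higher d lowers K**d and also raises the noise.
    print(alpha(K, 0.5, d=0.7), alpha(K, 0.5, d=0.9))
    # The curve passes through 1/2 at gamma = b - K**d.
    print(alpha(K, 0.8 - K ** 0.7))
```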

Figure 3 Illustration of αK,γ with Various a, b and d.

[Figure: three panels plotting αK,γ against γ ∈ [0, 1]: (a) Various a; (b) Various b; (c) Various d.]

Note. The parameters are specified as follows. K = 0.1, a = 15, b = 0.8 and d = 0.7 in the figures when the corresponding parameter is fixed.

5.1. Characterization of Equilibrium

Under the general αK,γ , we can derive the firm’s belief of the valuation of a consumer upon receiving

a signal s from Equation (1), which is given by

$$h(v \mid s) = \frac{1}{\min\{s+\alpha_{K,\gamma}, \bar{v}\} - \max\{s-\alpha_{K,\gamma}, 0\}},$$

for any v ∈ [max{s−αK,γ, 0}, min{s+αK,γ, v̄}]. The firm’s posterior belief is uniformly distributed and centred around the signal s. The range of posterior valuations is narrower for the extreme signal values, i.e., when s is close to 0 or v̄. This is consistent with the intuition that the firm is able to profile those consumers with extreme valuations more accurately.

Consequently, the firm’s expected revenue from charging price p to a consumer with signal s is

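$$\pi_s(p) = \begin{cases}
p, & \text{if } p < \max\{s-\alpha_{K,\gamma}, 0\}, \\[4pt]
\dfrac{p\,(\min\{s+\alpha_{K,\gamma}, \bar{v}\} - p)}{\min\{s+\alpha_{K,\gamma}, \bar{v}\} - \max\{s-\alpha_{K,\gamma}, 0\}}, & \text{if } p \in [\max\{s-\alpha_{K,\gamma}, 0\}, \min\{s+\alpha_{K,\gamma}, \bar{v}\}], \\[4pt]
0, & \text{if } p > \min\{s+\alpha_{K,\gamma}, \bar{v}\}.
\end{cases}$$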

It is easy to verify that the optimal price p(s) that maximizes the firm’s expected revenue is given

by

$$p(s) = \max\left\{\max\{s-\alpha_{K,\gamma}, 0\},\ \frac{\min\{s+\alpha_{K,\gamma}, \bar{v}\}}{2}\right\}. \tag{9}$$
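Equation (9) can be checked against a brute-force search over prices. In the sketch below, `alpha` is treated as a given number and `v_bar` denotes the cutoff valuation below which old-market consumers reveal; the function names are ours, and the code assumes the posterior support has positive length.

```python
def posterior_support(s: float, alpha: float, v_bar: float):
    """Lower and upper ends of the firm's uniform posterior given signal s."""
    return max(s - alpha, 0.0), min(s + alpha, v_bar)


def expected_revenue(p: float, s: float, alpha: float, v_bar: float) -> float:
    """Expected revenue from charging p to a consumer with signal s."""
    lo, hi = posterior_support(s, alpha, v_bar)
    if p < lo:
        return p                       # every type in the support buys
    if p > hi:
        return 0.0                     # no type in the support buys
    return p * (hi - p) / (hi - lo)    # assumes hi > lo


def optimal_price(s: float, alpha: float, v_bar: float) -> float:
    """Equation (9)."""
    lo, hi = posterior_support(s, alpha, v_bar)
    return max(lo, hi / 2)


def best_grid_revenue(s: float, alpha: float, v_bar: float, n: int = 2001) -> float:
    """Brute-force maximum of expected revenue over prices in [0, 1]."""
    return max(expected_revenue(i / (n - 1), s, alpha, v_bar) for i in range(n))


if __name__ == "__main__":
    for s, alpha, v_bar in ((0.5, 0.1, 0.8), (0.1, 0.3, 0.8), (0.75, 0.1, 0.8)):
        p = optimal_price(s, alpha, v_bar)
        print(s, p, expected_revenue(p, s, alpha, v_bar), best_grid_revenue(s, alpha, v_bar))
```

When the signal is informative enough that the lower end of the support exceeds half the upper end, the firm prices at the lower end and sells to every type consistent with the signal; otherwise it charges the usual monopoly price against a uniform posterior.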

The optimal expected revenue from a consumer with signal s is π∗s = πs(p(s)). Similarly, the

expected consumer surplus when the firm charges p(s) to a consumer with signal s is given by

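$$CS_s(p(s)) = \int_{p(s)}^{\min\{s+\alpha_{K,\gamma}, \bar{v}\}} (v - p(s))\, h(v \mid s)\,dv = \frac{(\min\{s+\alpha_{K,\gamma}, \bar{v}\} - p(s))^2}{2(\min\{s+\alpha_{K,\gamma}, \bar{v}\} - \max\{s-\alpha_{K,\gamma}, 0\})}.$$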

The price the firm charges those anonymous consumers remains the same as before, which is given

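by Equation (2). Given p(s) and p∗(v̄), we next derive the fraction of consumers in the old market who choose to reveal their identities at equilibrium. Consider the v̄-type consumer. As s + αK,γ ≥ v̄, p(s) for a v̄-type consumer simplifies to p(s) = max{s − αK,γ, v̄/2}, ∀s ∈ [v̄ − αK,γ, v̄ + αK,γ]. Consequently, the expected price for a v̄-type consumer if he chooses to reveal his identity is

$$E[p(s) \mid \bar{v}] = \int_{\bar{v}-\alpha_{K,\gamma}}^{\bar{v}+\alpha_{K,\gamma}} \frac{1}{2\alpha_{K,\gamma}}\, p(s)\,ds = \begin{cases}
\dfrac{\bar{v}}{2} + \dfrac{\bar{v}^2}{16\alpha_{K,\gamma}}, & \text{if } \bar{v} < 4\alpha_{K,\gamma}, \\[6pt]
\bar{v} - \alpha_{K,\gamma}, & \text{if } \bar{v} \ge 4\alpha_{K,\gamma}.
\end{cases}$$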

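Recall that v̄ is given by E[p(s)|v̄] = p∗(v̄) + c. That is, when v̄ ≤ 4αK,γ, v̄ is given by

$$\frac{\bar{v}}{2} + \frac{\bar{v}^2}{16\alpha_{K,\gamma}} = \frac{1}{2} + \frac{\lambda(1-\bar{v})}{2(1-\lambda)} + c,$$

and when v̄ ≥ 4αK,γ, v̄ is given by

$$\bar{v} - \alpha_{K,\gamma} = \frac{1}{2} + \frac{\lambda(1-\bar{v})}{2(1-\lambda)} + c.$$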
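The piecewise expression for the expected price and the indifference condition for the cutoff type can both be verified numerically. The sketch below makes two simplifying assumptions that are ours, not the paper's: the noise level `alpha` is held fixed as a number (in the model it depends on K and γ), and if even the highest type prefers to reveal, the cutoff is set to 1. It also requires λ < 1.

```python
def expected_price_closed(v_bar: float, alpha: float) -> float:
    """Closed-form expected price faced by the cutoff type if he reveals."""
    if v_bar < 4 * alpha:
        return v_bar / 2 + v_bar ** 2 / (16 * alpha)
    return v_bar - alpha


def expected_price_numeric(v_bar: float, alpha: float, n: int = 2000) -> float:
    """Midpoint-rule average of max(s - alpha, v_bar / 2) over the signal range."""
    h = 2 * alpha / n
    total = 0.0
    for i in range(n):
        s = v_bar - alpha + (i + 0.5) * h
        total += max(s - alpha, v_bar / 2)
    return total * h / (2 * alpha)


def anonymous_price(v_bar: float, lam: float) -> float:
    """Price for anonymous consumers as a function of the cutoff (requires lam < 1)."""
    return 0.5 + lam * (1 - v_bar) / (2 * (1 - lam))


def solve_cutoff(alpha: float, lam: float, c: float, tol: float = 1e-12) -> float:
    """Bisection on E[p(s) | v_bar] = p*(v_bar) + c with alpha held fixed."""
    def gap(v):
        return expected_price_closed(v, alpha) - anonymous_price(v, lam) - c

    lo, hi = 0.0, 1.0
    if gap(hi) <= 0:          # corner case (our assumption): everyone reveals
        return 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if gap(mid) > 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    print(expected_price_closed(0.3, 0.1), expected_price_numeric(0.3, 0.1))
    print(expected_price_closed(0.8, 0.1), expected_price_numeric(0.8, 0.1))
    print(solve_cutoff(alpha=0.05, lam=0.5, c=0.2))
```

Bisection works here because the expected price under revealing increases in the cutoff while the anonymous price decreases in it, so the gap between the two sides crosses zero at most once.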

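Given v̄, the firm’s profit from investing K can thus be written as

$$\begin{aligned}
\pi_K(\lambda, c) &= p^*(\bar{v}) \cdot [\lambda(1-\bar{v}) + (1-\lambda)(1-p^*(\bar{v}))] + \lambda\int_{-\alpha_{K,\gamma}}^{\bar{v}+\alpha_{K,\gamma}} \int_0^{\bar{v}} \frac{g(s \mid v)}{\bar{v}}\,dv \cdot \pi_s(p(s))\,ds - K \\
&= p^*(\bar{v}) \cdot [\lambda(1-\bar{v}) + (1-\lambda)(1-p^*(\bar{v}))] \\
&\quad + \lambda\int_{-\alpha_{K,\gamma}}^{\bar{v}+\alpha_{K,\gamma}} \frac{\min\{s+\alpha_{K,\gamma}, \bar{v}\} - \max\{s-\alpha_{K,\gamma}, 0\}}{2\alpha_{K,\gamma}\,\bar{v}} \cdot \pi_s(p(s))\,ds - K,
\end{aligned}$$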

where the first term and the second term on the right hand side of the equation indicate revenue

from those anonymous consumers in both old and new markets, and revenue from those consumers

who reveal their identities, respectively. The optimal investment level K is the one that maximizes

πK(λ, c). Similarly, the expected consumer surplus from investing K is given by

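$$\begin{aligned}
CS_K(\lambda, c) &= \lambda\int_{\bar{v}}^{1} (v - p^*(\bar{v}) - c)\,dv + (1-\lambda)\int_{p^*(\bar{v})}^{1} (v - p^*(\bar{v}))\,dv \\
&\quad + \lambda\int_{-\alpha_{K,\gamma}}^{\bar{v}+\alpha_{K,\gamma}} \int_0^{\bar{v}} \frac{g(s \mid v)}{\bar{v}}\,dv \cdot CS_s(p(s))\,ds \\
&= \lambda(1-\bar{v})\left(\frac{1+\bar{v}}{2} - p^*(\bar{v}) - c\right) + (1-\lambda)\frac{(1-p^*(\bar{v}))^2}{2} \\
&\quad + \lambda\int_{-\alpha_{K,\gamma}}^{\bar{v}+\alpha_{K,\gamma}} \frac{(\min\{s+\alpha_{K,\gamma}, \bar{v}\} - p(s))^2}{4\alpha_{K,\gamma}\,\bar{v}}\,ds,
\end{aligned}$$

and the total welfare SWK(λ, c) = πK(λ, c) + CSK(λ, c).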

The model becomes analytically intractable under general αK,γ . Thus, we study the impacts of

model parameters, as well as data requirements, on the equilibrium through extensive numerical

analysis below.

5.2. Numerical Analysis

Figure 4 shows how the optimal investment level K, the fraction of consumers choosing to reveal

their identities γ, and the price for anonymous consumers p∗(v̄) are affected by parameters c and

λ. Interestingly, Figure 4(a) indicates that the firm’s optimal investment level K is generally not

monotone in either λ or c. However, when λ is sufficiently large (i.e., λ = 0.7 or λ = 0.9 in the

figure), the optimal investment level K decreases in c. Similarly, when c is sufficiently large (c≥ 0.4

in the figure), the optimal investment level also decreases in λ. This is consistent with our findings

summarized in Corollary 1. Indeed, the monotonicity of K under the special case is conditional on

the decision that the firm makes an investment, and the firm is shown to be more likely to invest

with either a higher c or a higher λ by Corollary 2.

Figure 4 Impact of c and λ on Optimal Investment Level K, Fraction of Consumers Choosing to Reveal Their Identities γ, and Price for Anonymous Consumers p∗(v̄).

[Figure: three panels with c ∈ [0, 0.5] on the horizontal axis: (a) K; (b) γ; (c) p∗(v̄).]

Note. The parameters are specified as follows. a= 20, b= 0.8 and d= 0.7.

When either c or λ is small, the monotonicity of K no longer holds. Figure 4(a) suggests that

K first increases in c, and then decreases after a certain point for small λ. To understand this

result, it is crucial to connect it with the endogenous decisions of consumers who can conceal their

identities (see Figure 4(b)). When c is small, it is almost costless for consumers to conceal, and

most consumers would indeed choose to conceal should the firm decide to invest. Coupled with the

fact that the old market is not large, the optimal investment from the firm would be small. On

the other end of the spectrum, when c is large, consumers have less flexibility in concealing their

identities, and thus a small investment would be sufficient to extract all the benefits from profiling

consumers. Small investments thus arise when it is either very easy or very difficult for consumers

to protect their information. For intermediate values, the firm instead wants to invest more in the

profiling technology, and thus the firm’s investment is typically the highest for moderate c when λ

is small. The argument for the non-monotonicity of K in λ when c is small follows the same logic. When

the old market is relatively small, the benefit of consumer profiling tends to be small, leading to

a small investment level from the firm. However, when the old market is extremely large, a small

investment is sufficient to guarantee that a good fraction of consumers would reveal their identities,

allowing the firm to profile accurately. As such, the firm’s optimal investment is the highest with

a moderate-size old market (λ = 0.7). Figure 4(c) shows that the optimal price for anonymous

consumers decreases in both λ and c, which is a general property and a direct extension of the

properties of p∗(vo) under the special case as given by Equation (4).

Figure 5 Impact of c and λ on Firm’s Optimal Profit πK(λ, c), Consumer Surplus CSK(λ, c), and Social Welfare SWK(λ, c).

[Figure: three panels with c ∈ [0, 0.5] on the horizontal axis: (a) πK(λ, c); (b) CSK(λ, c); (c) SWK(λ, c).]

Note. The parameters are specified as follows. a= 20, b= 0.8 and d= 0.7.

Figure 5 illustrates the impact of c and λ on the firm’s optimal profit πK(λ, c), consumer surplus

CSK(λ, c), and social welfare SWK(λ, c). Consistent with Lemma 4, the firm’s optimal profit

πK(λ, c) always increases in c and λ. Again, the monotonicity result in Lemma 4 is conditional on

the firm’s decision to invest which is more likely to happen with either a higher c or a higher λ.

Figure 5(a) suggests that the monotonicity result holds generally, even for small c and small λ.

Figure 5(b) indicates that consumer surplus decreases in λ, which is consistent with our result in

Lemma 4(ii). With a larger old market, consumer profiling always becomes more attractive to the

firm as it can price discriminate a larger fraction of the market, leading to a higher profit and lower

consumer surplus. On the other hand, consumer surplus generally decreases in c. However, this

result does not always hold. When λ= 0.9, consumer surplus first decreases, and then increases in

c (this may not be obvious in Figure 5(b), simply due to scale). Indeed this convexity of consumer

surplus in c is already present in Lemma 4(ii). Under the special case when αK,γ is a step function,

social welfare is convex in both λ and c as shown in Lemma 4(iii). Its non-monotonicity in c

still holds under the general scenario shown in Figure 5(c). However, total welfare seems to

increase in λ most of the time with the chosen parameter values. Total welfare typically reaches

its maximum when privacy costs are extremely large (i.e., c= 0.5). This is because, in such a case,

consumers will avoid the expenditure of concealing their identity to protect their privacy, which is

a purely wasteful activity in our model. Without protection, everyone in the old market gets profiled

and output expands, which is positive for total efficiency. However, the downside is that all the

consumer surplus in the old market is then appropriated by the firm, and therefore the distribution

of total welfare among the parties is tilted heavily in favor of the firm.

Figure 6 Impact of b and d on Optimal Investment Level K, Firm’s Optimal Profit πK(λ, c), Consumer Surplus CSK(λ, c), and Social Welfare SWK(λ, c).

[Figure: four panels: (a) K; (b) πK(λ, c); (c) CSK(λ, c); (d) SWK(λ, c).]

Note. The parameters are specified as follows. c= 0.25, λ= 0.7 and a= 20.

We now turn to a discussion of the implications of “big data” versus “small data”. Figure 6

illustrates the impact of b and d on the optimal investment level K, firm’s optimal profit πK(λ, c),

consumer surplus CSK(λ, c), and social welfare SWK(λ, c). Recall that a higher b (or a higher d

when K < 1) indicates a scenario with higher data requirement, other things being equal. Figure

6(a) suggests that the optimal investment K is not monotone in the level of data requirement.

When the data requirement is low, a small investment is sufficient for the firm to profile consumers

with good accuracy, and thus the optimal investment from the firm tends to be small. As data

requirement increases, the firm generally increases its investment level with the hope of more

accurate signals from consumers. At the same time, the fraction of consumers who choose to reveal

their identities decreases. Consequently, when the data requirement is sufficiently high, even a high

investment would not allow the firm to profile consumers accurately due to the lack of the scale of

data, leading the firm to scale back its optimal investment level.

Though K is not monotone in either b or d, Figure 6(b) suggests that the firm’s profit πK(λ, c) is

monotonically decreasing in both b and d. That is, a higher data requirement is always detrimental

to the firm’s profitability, which is consistent with our result under the special case shown in Proposition

2(i). Similarly, consumer surplus CSK(λ, c) increases in both b and d as illustrated in Figure

6(c), which is a direct extension of Proposition 2(ii). Since profits decrease and consumer surplus

increases, it is not surprising that total welfare is not monotone in the level of data requirement,

as shown in Figure 6(d).

Figure 7 Socially Optimal Investment Ks under Various Model Parameters.

[Figure: three panels: (a) a = 20, b = 0.8, d = 0.7; (b) a = 20, c = 0.2, λ = 0.5; (c) a = 20, c = 0.4, λ = 0.9.]

Note. Values of parameters which are fixed in the analysis are listed below the corresponding figures.

Lastly, we seek to find out under what conditions the firm’s investment will be socially optimal.

We expect the firm’s investment to be generally suboptimal from a total welfare point of view,

because the firm maximizes its profits without taking into account consumer surplus. The more

interesting question we address here is the following: are efficiency concerns more acute in a “small

data” or in a “big data” environment? In order to facilitate the comparison, we proceed as follows.

First, we construct a benchmark where a central planner determines the investment with the goal

of maximizing the total welfare. We denote the central planner’s investment level as Ks. Once Ks

is decided, the model unfolds as discussed in our base model where the firm determines the price

for anonymous consumers, as well as individual prices for each consumer who reveals his identity,

and then consumers make their purchase decisions accordingly. Finally, we denote the difference

in the investment levels from the firm and from the social planner as ∆K ≡K −Ks. This is a

measure of relative (in)efficiency. A small ∆K implies that the private investment choice of the

firm is aligned with social welfare, whereas a large ∆K indicates a substantial departure from the social optimum.

Figures 7 and 8 illustrate the optimal investment from the social planner Ks, and the difference

between the social planner’s investment and the firm’s investment, respectively. In particular,

Figure 8(a) illustrates the relative efficiency of the firm’s investment with varying c and λ. Recall

that we show in Corollary 4 that the inefficiency interval increases in λ, and is concave in c when

αK,γ is a step function. Figure 8(a) further corroborates this result under the general αK,γ . For

a given c, the difference in the investments is generally smaller with smaller λ. The interval of c

where ∆K is greater than zero grows wider as λ increases. On the other hand, for a given λ, the

difference is minimized at c= 1/2, and its maximum is realized with moderate c when λ= 0.3 or

0.5. That is, ∆K is more likely to be zero under a wider range of λ when c is large compared with

the situation when c is moderate, which is consistent with the concavity property of the inefficiency

interval as shown in Corollary 4(ii).

Figure 8 Comparisons of Socially Optimal Investment Ks and Optimal Monopoly Investment K under Various Model Parameters.

[Figure omitted: three panels plotting the difference between the two investment levels, with fixed parameters (a) a = 20, b = 0.8, d = 0.7; (b) a = 20, c = 0.2, λ = 0.5; (c) a = 20, c = 0.4, λ = 0.9.]

Note. Values of parameters which are fixed in the analysis are listed below the corresponding figures.

The impact of data requirement on the efficiency of the firm’s investment is illustrated in Figures

8(b) and 8(c). In particular, we consider two scenarios: one with moderate λ and c (λ= 0.5

and c= 0.2), and one with high λ and c (λ= 0.9 and c= 0.4). Under moderate λ and c, the

difference in the investments corresponding to a higher data requirement, i.e., d = 0.7 or 0.9, is

generally smaller than the difference under lower data requirement, i.e., d= 0.3 or 0.5. However,

this relationship is reversed in the scenario with high λ and c. The numerical analysis provides

evidence for Proposition 2(iii) under general αK,γ .

These welfare results can be read in two ways. When it is relatively easy (respectively, very costly)

for consumers to protect their privacy, higher data requirements (respectively, lower data

requirements) raise fewer policy concerns. Alternatively, with lower data requirements (respectively,

higher data requirements), the firm’s investment decision is more likely to be aligned with social

welfare when data protection is very costly (respectively, relatively easy).

6. Conclusion

In this paper, we study data profiling in the context of price discrimination. Our main contribution

to the literature is the novel focus on two endogenous and related decisions: the firm invests in

the precision of the information it gets from consumers, while consumers can take costly actions

to protect their privacy. We show that the firm’s investment in profiling closely relates to the

flexibility of consumers to conceal their identities as well as to data requirements.

We derive managerial implications on when, how and why a firm should invest in profiling technologies.

A small investment is optimal when c is either very small or very large, but the rationales

behind the two cases differ substantially. When c is small, consumers in the old market have greater

flexibility in protecting themselves, rendering any profiling technology ineffective; whereas when c is

large, it is costly for consumers to conceal their identities, and thus a small investment is already

effective. The optimal investment level is typically the highest with intermediate values of c, when

λ is moderate. A similar argument applies to λ, i.e., the size of the old market, as well. It is not

beneficial for the firm to invest heavily either when the size of the old market is small, or when

the size of old market is large and consumers give up their information quite easily. The firm needs

to invest relatively more heavily to counteract consumers’ reluctance to reveal their information

when λ is moderate. On another note, the firm’s profit always increases in c as consumers have

less incentive to engage in costly privacy protection. The firm also makes a higher profit with a

larger old market.

The firm’s investment is not monotone in the level of data requirement either: the investment

is the greatest when data requirement is moderate. When the data requirement is low, a

small investment is sufficient for the firm to get a sufficiently informative dataset. As the data

requirement increases, the firm generally increases its investment level with the hope of getting

more accurate signals from consumers. However, this argument does not hold universally. When

the data threshold is extremely high, i.e., “big data”, the firm optimally scales back its investment

simply because investing in profiling becomes too onerous. Overall profits typically decrease with

data requirements.

We also discuss the welfare implications of privacy policies and regulations. Consumers benefit

from stricter data protections, because otherwise they are negatively affected by price discrimination.

However, the impact of data protection on social welfare is not always obvious, as one

needs to trade off several effects involving consumers’ costly privacy protection as well as quantity

allocations in both the old targeted market and the new anonymous market. Some of the numerical

examples (from which we cannot claim generality) show that both consumer surplus

and total welfare are maximized for minimal protection of consumers, i.e., the highest possible

concealing cost in our model. This is because, with a high concealing cost, no consumer has an

incentive to maintain anonymity (which is a pure cost in our model), and thus all of them will be

profiled by the firm, leading to an expansion in the output due to perfect personalized pricing in

the old market. However, arguably this policy is rather an extreme case with zero data protection.

We further analyze piecemeal interventions involving incremental changes to data protection, and

we find that the impact of data protection on social welfare is much more nuanced as it depends

crucially on the starting level of the intervention.

Social welfare is not monotonic in data requirements due to the tension between net profit that

is declining in data requirements and consumer surplus, which is increasing. This per se may not be

relevant from a policy perspective, because data requirements are related to progress

made in data analytics, which is typically not something that a policy maker could interfere with. Instead,

we tackle a more meaningful question by comparing whether private choices of the firm are more

aligned with total welfare in the context with a small data requirement or a large data requirement.

We show that the answer to this question is closely linked to privacy costs. When it is easy for

consumers to protect their data, private and social incentives are aligned when data analytics

involve a large data requirement. On the other hand, when it is very costly for consumers to conceal

their information, a small data requirement induces an investment on the firm’s side that is very

close to what would be chosen by a social planner. Of course, this conclusion hinges on the

model’s assumption of consumers having no preference for privacy per se. If they do, regimes with

privacy and data protection should naturally arise more often from a policy perspective; however,

our comparative statics are still expected to hold true.

References

Acquisti, A., C. Taylor, L. Wagman. 2015. The Economics of Privacy. Journal of Economic Literature

(forthcoming).

Acquisti, A., H. Varian. 2005. Conditioning Prices on Purchase History. Marketing Science 24(3) 367–381.

Belleflamme, P. 2015. Monopoly Price Discrimination and Privacy: The Hidden Cost of Hiding. mimeo.

Bergemann, D., A. Bonatti. 2015. Selling Cookies. American Economic Journal: Microeconomics 7(3)

259–294.

Casadesus-Masanell, R., A. Hervas-Drane. 2015. Competing with Privacy. Management Science 61(1)

229–246.

Chen, Y., G. Iyer. 2002. Consumer Addressability and Customized Pricing. Marketing Science 21(2) 197–208.

Chen, Y., C. Narasimhan, Z. J. Zhang. 2001. Individual Marketing with Imperfect Targetability. Marketing

Science 20(1) 23–41.

Chen, Y., Z. J. Zhang. 2009. Dynamic Targeted Pricing with Strategic Consumers. International Journal of

Industrial Organization 27(1) 43–50.

Conitzer, V., C. Taylor, L. Wagman. 2012. Hide and Seek: Costly Consumer Privacy in a Market with

Repeat Purchases. Marketing Science 31(2) 277–292.

Esteves, R. 2009. A Survey on the Economics of Behaviour-Based Price Discrimination. NIPE Working

Paper.

Fudenberg, D., J. Tirole. 2000. Customer Poaching and Brand Switching. RAND Journal of Economics

31(4) 634–657.

Fudenberg, D., J. Villas-Boas. 2007. Behavior-Based Price Discrimination and Customer Recognition. in T.

Hendershott (Ed.), Handbook of Economics and Information Systems.

Johnson, J. P., D. P. Myatt. 2006. On the Simple Economics of Advertising, Marketing, and Product Design.

American Economic Review 96(3) 756–784.

Koh, B., S. Raghunathan, B. Nault. 2015. Is Voluntary Profiling Welfare Enhancing? MIS Quarterly

(forthcoming).

Montes, R., W. Sand-Zantman, T. Valletti. 2015. The Value of Personal Information in Markets with

Endogenous Privacy. mimeo.

Norvig, P. 2010. Internet-Scale Data Analysis. mimeo.

Pazgal, A., D. Soberman. 2008. Behavior-Based Discrimination: Is It a Winning Play, and If So, When?

Marketing Science 27(6) 977–994.

Shin, J., K. Sudhir. 2010. A Customer Management Dilemma: When is It Profitable to Reward One’s Own

Customers? Marketing Science 29(4) 671–689.

Shy, O., R. Stenbacka. 2016. Customer Privacy and Competition. Journal of Economics & Management

Strategy (forthcoming).

Singer, N. 2015. Sharing Data, but Not Happily. The New York Times, June 4th.

Taylor, C., L. Wagman. 2014. Consumer Privacy in Oligopolistic Markets: Winners, Losers, and Welfare.

International Journal of Industrial Organization 34(1) 80–84.

Taylor, Curtis R. 2004. Consumer privacy and the market for customer information. RAND Journal of

Economics 35(4) 631–650.

Villas-Boas, J. 1999. Dynamic Competition with Customer Recognition. RAND Journal of Economics 30(4)

604–631.

Villas-Boas, J. 2004. Price Cycles in Markets with Customer Recognition. RAND Journal of Economics

35(3) 486–501.

Zhang, J. 2011. The Perils of Behavior-Based Personalization. Marketing Science 30(1) 170–186.

APPENDIX: Proofs

Proof of Lemma 2. We first show that there exists a cutoff $\hat{v}$ such that all consumers in the old market with $v > \hat{v}$ choose to conceal their identities, and all consumers with $v < \hat{v}$ choose to reveal. Consider the following two scenarios: (1) The fraction of consumers choosing to reveal their identities, $\gamma$, is less than $\tau^{-1}(K)$. In this case, the conditional distribution of signals is reduced to a uniform distribution with support $[v - \alpha_{K,\gamma},\, v + \alpha_{K,\gamma}]$. It is easy to verify that Assumption 1 holds, and thus there exists a cutoff $\hat{v}$ that separates consumers choosing to conceal their identities from those choosing to reveal their identities. (2) The fraction of consumers choosing to reveal their identities, $\gamma$, is greater than or equal to $\tau^{-1}(K)$. In this case, any consumer who chooses to reveal his identity receives a zero utility. If a type-$v$ consumer chooses to conceal his identity, in which case he would expect a utility of $v - p - c > 0$, then any consumer with a higher valuation is also better off concealing. On the other hand, if a type-$v$ consumer chooses to reveal his identity, in which case he would expect a non-positive utility from concealing his identity, i.e., $v - p - c \le 0$, then any consumer with a lower valuation would also choose to reveal. Thus, there exists a cutoff $\hat{v}$ such that all consumers in the old market with $v > \hat{v}$ choose to conceal their identity, and all consumers with $v < \hat{v}$ choose to reveal. The decision of the type-$\hat{v}$ consumer depends on whether a positive utility can be derived from concealing his identity.

Given a cutoff $\hat{v}$, the optimal price charged to those anonymous consumers in the old market is given by Equation (2), i.e., $p^*(\hat{v}) = \frac{1}{2} + \frac{\lambda(1-\hat{v})}{2(1-\lambda)}$. The expected utility of the type-$\hat{v}$ consumer from concealing his identity must be non-negative. Otherwise, due to the continuity of the function $v - p^*(v) - c$, the consumer just to the right of the type-$\hat{v}$ consumer receives a negative utility from concealing his identity, and thus he is better off revealing his identity. As a result, we have $\hat{v} - p^*(\hat{v}) - c \ge 0$, which is equivalent to $\hat{v} \ge v_o$.

When the firm’s investment level $K$ is greater than or equal to $K_o$, the minimal fraction of consumers revealing their identities that allows the firm to perfectly profile them is less than or equal to $v_o$, i.e., $v_K \le v_o$. Consequently, any consumer with a valuation less than $v_o$ would choose to reveal his identity and receive a zero utility, and any consumer with a valuation greater than $v_o$ chooses to conceal, and receives a non-negative expected utility from concealing his identity. The type-$v_o$ consumer is indifferent between the two options, and by our assumption, he chooses to reveal his identity.

When the firm’s investment level $K$ is less than $K_o$, we first show that the cutoff $\hat{v}$ cannot be greater than $v_K$. If $\hat{v} > v_K$, the type-$\hat{v}$ consumer would receive a positive utility from concealing his identity, as $\hat{v} - p^*(\hat{v}) - c > 0$, and a zero utility from revealing his identity. Due to the continuity of the function $v - p^*(v) - c$, the consumer just to the left of the type-$\hat{v}$ consumer is also better off concealing his identity, and thus $\hat{v} \le v_K$. Moreover, when $\hat{v} = v_K$, the type-$v_K$ consumer is also better off concealing his identity, as his utility from concealing is positive due to $v_K > v_o$, and his utility from revealing his identity is zero. To this end, we only need to show that any consumer with a valuation lower than $v_K$ would choose to reveal his identity. If $\hat{v} < v_K$, the firm’s belief about the valuation of a consumer who chooses to reveal his identity is given by the density $1/\hat{v}$, $\forall v \in [0, \hat{v}]$. The optimal price the firm charges to those consumers is given by $\arg\max_p \, p \int_p^{\hat{v}} \frac{1}{\hat{v}}\, dv = \frac{\hat{v}}{2}$. Consequently, the utility of the type-$\hat{v}$ consumer from revealing his identity is given by $\hat{v}/2$, and the utility from concealing his identity is given by $\hat{v} - p^*(\hat{v}) - c$. It is easy to verify that $\hat{v} - p^*(\hat{v}) - c < \hat{v}/2$, $\forall \hat{v} \in [0, v_K)$. Consequently, any consumer with a valuation lower than $v_K$ is better off revealing his identity, and we thus obtain the announced result. □
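As a quick numerical companion to the cutoff argument above, the following minimal Python sketch evaluates the anonymous price from Equation (2) and the indifference condition v − p∗(v) − c = 0. The closed form used in `cutoff_vo` is obtained here by solving that condition by hand and is not quoted from the text; the function names and the values λ = 1/2, c = 1/4 are illustrative assumptions.

```python
from fractions import Fraction as F

def anonymous_price(v_cut, lam):
    """Price for anonymous consumers given a cutoff, as in Equation (2)."""
    return F(1, 2) + lam * (1 - v_cut) / (2 * (1 - lam))

def cutoff_vo(lam, c):
    """Root of v - p*(v) - c = 0 (solved by hand here; an assumption)."""
    return (1 + 2 * c * (1 - lam)) / (2 - lam)

# Illustrative (hypothetical) values with 0 < lam < 1 and 0 < c <= 1/2.
lam, c = F(1, 2), F(1, 4)
vo = cutoff_vo(lam, c)

# The type-vo consumer is exactly indifferent between concealing and revealing.
indifference_gap = vo - anonymous_price(vo, lam) - c

# On a grid in [0, 1), concealing pays less than v/2, the payoff of the
# cutoff type under the pooled price v/2 used in the last step of the proof.
grid = [F(k, 100) for k in range(100)]
conceal_is_worse = all(v - anonymous_price(v, lam) - c < v / 2 for v in grid)

print(vo, indifference_gap, conceal_is_worse)
```

Exact rational arithmetic is used so that the indifference condition can be checked with equality rather than up to floating-point tolerance.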

Proof of Lemma 3. Consider the following two scenarios: (1) $K \ge K_o$. From Equation (5), we know that the firm gains no more information advantage once its investment reaches the threshold $K_o$. As a result, the firm’s optimal investment level is given by $K_o$, $\forall K \ge K_o$; (2) $K < K_o$. The firm’s profit function is given by
\[
\pi_K(\lambda, c) = \frac{\lambda (v_K - 1)^2}{4(1-\lambda)} + \frac{1}{4} - \tau(v_K), \quad \forall v_K \in (v_o, 1).
\]
By the assumption that $\tau(\cdot)$ is a concave function, $\pi_K(\lambda, c)$ is a convex function in $v_K$, and thus the maximum is realized at either end of the support. When the firm’s investment is sufficiently close to $K_o$, revenue from those anonymous consumers is the same as that by investing $K_o$. However, the firm earns a revenue of $v_o^2/2$ from those consumers revealing their identities with an investment of $K_o$ due to perfect profiling, but only half of that revenue with an investment sufficiently close to $K_o$. Thus, the firm is better off investing $K_o$ than any investment $K_o - \varepsilon$, for any small $\varepsilon > 0$. On the other hand, when $v_K = 1$, the firm’s optimal profit is given by $\pi_0(\lambda, c) = 1/4$, which is greater than the firm’s optimal profit with an investment sufficiently close to $\tau(1)$, which is $1/4 - \tau(1)$. Thus we obtain the announced result. □
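The revenue jump at K_o described in this proof can be illustrated with a short Python sketch. It assumes valuations uniform on [0, 1], an old market of mass λ and a new market of mass 1 − λ; under that reading the anonymous revenue is (1 − λv)²/(4(1 − λ)) and the revenue from revealing consumers is weighted by λ. These expressions, the function names, and the parameter values are assumptions made for illustration, with the investment cost τ left out on both sides.

```python
from fractions import Fraction as F

def anonymous_revenue(v_cut, lam):
    """Revenue from anonymous buyers at the optimal anonymous price."""
    return (1 - lam * v_cut) ** 2 / (4 * (1 - lam))

def gross_profit_below_Ko(v_cut, lam):
    """pi_K + tau(v_K) for K < K_o, as written in the proof."""
    return lam * (v_cut - 1) ** 2 / (4 * (1 - lam)) + F(1, 4)

def gross_profit_at_Ko(v_cut, lam):
    """Anonymous revenue plus perfect-profiling revenue lam * v^2 / 2."""
    return anonymous_revenue(v_cut, lam) + lam * v_cut ** 2 / 2

lam, c = F(1, 2), F(1, 4)                  # hypothetical values
vo = (1 + 2 * c * (1 - lam)) / (2 - lam)   # cutoff solving v - p*(v) - c = 0

# Revenue gained by moving from "just below K_o" to K_o at the same cutoff.
jump = gross_profit_at_Ko(vo, lam) - gross_profit_below_Ko(vo, lam)

# At v_K = 1 nobody reveals, and the gross profit is the no-profiling 1/4.
no_profiling = gross_profit_below_Ko(F(1), lam)

print(jump, lam * vo ** 2 / 4, no_profiling)
```

Under these assumptions the jump equals λ v_o²/4, i.e. half of the perfect-profiling revenue from revealing consumers, which is the comparison the proof relies on.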

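Proof of Lemma 4. (i) Taking the derivative of $\pi_{K_o}(\lambda, c)$ with respect to $c$, we have
\[
\frac{\partial \pi_{K_o}(\lambda, c)}{\partial c} = \frac{2\lambda c (1-\lambda)}{2-\lambda} - \frac{\partial K_o}{\partial c} \ge 0,
\]
where the inequality is due to $\lambda \in [0,1]$ and result (i) in Corollary 1. Similarly, taking the derivative of the profit function with respect to $\lambda$, we have
\[
\frac{\partial \pi_{K_o}(\lambda, c)}{\partial \lambda} = \frac{1 + \lambda^2 c^2 - 4\lambda c^2}{2(2-\lambda)^2} + \frac{c^2}{2} - \frac{\partial K_o}{\partial \lambda} \ge 0,
\]
where the inequality is due to the assumption $c \le 1/2$, $\lambda \in [0,1]$, and result (ii) in Corollary 1.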

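(ii) Taking the derivative of $CS_{K_o}(\lambda, c)$ with respect to $\lambda$, we have
\begin{align*}
\frac{\partial CS_{K_o}(\lambda, c)}{\partial \lambda}
&= \frac{c^2\left(-\tfrac{3}{2}\lambda^3 + 9\lambda^2 - 12\lambda + 4\right) + c\,(\lambda^3 - 6\lambda^2 + 7\lambda - 2) + (\lambda - 1)}{(2-\lambda)^3} \\
&\le \frac{c^2\left(-\tfrac{3}{2}\lambda^3 + 9\lambda^2 - 12\lambda + 4\right) + c\,(\lambda^3 - 6\lambda^2 + 7\lambda - 2) + 2c\,(\lambda - 1)}{(2-\lambda)^3} \\
&= \frac{c^2\left(-\tfrac{3}{2}\lambda^3 + 9\lambda^2 - 12\lambda + 4\right) + c\,(\lambda^3 - 6\lambda^2 + 9\lambda - 4)}{(2-\lambda)^3} \\
&\le \frac{c^2\left(-\tfrac{3}{2}\lambda^3 + 9\lambda^2 - 12\lambda + 4\right) + 2c^2(\lambda^3 - 6\lambda^2 + 9\lambda - 4)}{(2-\lambda)^3} \\
&= \frac{c^2\left(\tfrac{1}{2}\lambda^3 - 3\lambda^2 + 6\lambda - 4\right)}{(2-\lambda)^3} < 0.
\end{align*}
The first inequality is due to $2c \in [0,1]$ and $\lambda - 1 \le 0$. The second inequality is due to $2c \in [0,1]$ and $g(\lambda) \equiv \lambda^3 - 6\lambda^2 + 9\lambda - 4 \le 0$, $\forall \lambda \in [0,1]$ (as $dg(\lambda)/d\lambda = 3(\lambda - 2)^2 - 3 \ge 0$, $\forall \lambda \in [0,1]$, and $g(1) = 0$). Similarly, it is easy to verify that $h(\lambda) \equiv \lambda^3/2 - 3\lambda^2 + 6\lambda - 4 < 0$, $\forall \lambda \in [0,1]$, since $dh(\lambda)/d\lambda = 3(\lambda - 2)^2/2 \ge 0$, $\forall \lambda \in [0,1]$, and $h(1) = -1/2 < 0$. Consequently, we conclude that $\partial CS_{K_o}(\lambda, c)/\partial \lambda < 0$.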

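The first- and second-order derivatives of $CS_{K_o}(\lambda, c)$ with respect to $c$ are given by
\[
\frac{\partial CS_{K_o}(\lambda, c)}{\partial c} = \lambda c - \lambda + \frac{\lambda(2\lambda - 3)(\lambda c - 1)}{(\lambda - 2)^2},
\qquad
\frac{\partial^2 CS_{K_o}(\lambda, c)}{\partial c^2} = \frac{\lambda(3\lambda - 4)(\lambda - 1)}{(\lambda - 2)^2} \ge 0,
\]
where the inequality is due to $\lambda \in [0,1]$. Consequently, $CS_{K_o}(\lambda, c)$ is convex in $c$.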

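(iii) The first- and second-order derivatives of $SW_{K_o}(\lambda, c)$ with respect to $c$ are given by
\begin{align*}
\frac{\partial SW_{K_o}(\lambda, c)}{\partial c} &= \frac{\lambda\left[(8 - 5\lambda)(1-\lambda)c - (\lambda - 1)^2\right]}{(\lambda - 2)^2} - \frac{\partial K_o}{\partial v_o} \cdot \frac{\partial v_o}{\partial c}, \\
\frac{\partial^2 SW_{K_o}(\lambda, c)}{\partial c^2} &= \frac{\lambda(1-\lambda)(8 - 5\lambda)}{(\lambda - 2)^2} - \frac{\partial^2 K_o}{\partial v_o^2} \cdot \left(\frac{\partial v_o}{\partial c}\right)^2 - \frac{\partial K_o}{\partial v_o} \cdot \frac{\partial^2 v_o}{\partial c^2}.
\end{align*}
Given that $\lambda \in [0,1]$, $\tau(\cdot)$ is concave, and $\partial^2 v_o/\partial c^2 = 0$, $\partial^2 SW_{K_o}(\lambda, c)/\partial c^2$ is guaranteed to be greater than or equal to 0. Consequently, $SW_{K_o}(\lambda, c)$ is convex in $c$.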

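Similarly, the second-order derivative of $SW_{K_o}(\lambda, c)$ with respect to $\lambda$ is given by
\begin{align*}
\frac{\partial^2 SW_{K_o}(\lambda, c)}{\partial \lambda^2}
&= \frac{(16c^2 - 10c + 1)\lambda + (-20c^2 + 8c + 1)}{(\lambda - 2)^4} - \frac{\partial^2 K_o}{\partial v_o^2} \cdot \left(\frac{\partial v_o}{\partial \lambda}\right)^2 - \frac{\partial K_o}{\partial v_o} \cdot \frac{2(1 - 2c)}{(2-\lambda)^3} \\
&\ge \frac{(16c^2 - 10c + 1)\lambda + (-20c^2 + 8c + 1)\lambda}{(\lambda - 2)^4} - \frac{\partial^2 K_o}{\partial v_o^2} \cdot \left(\frac{\partial v_o}{\partial \lambda}\right)^2 - \frac{\partial K_o}{\partial v_o} \cdot \frac{2(1 - 2c)}{(2-\lambda)^3} \\
&= \frac{2\lambda(1 - 2c)(1 + c)}{(\lambda - 2)^4} - \frac{\partial^2 K_o}{\partial v_o^2} \cdot \left(\frac{\partial v_o}{\partial \lambda}\right)^2 - \frac{\partial K_o}{\partial v_o} \cdot \frac{2(1 - 2c)}{(2-\lambda)^3} \ge 0.
\end{align*}
The first inequality is due to $\lambda \in [0,1]$, and $-20c^2 + 8c + 1 = (1 - 2c)(1 + 10c) \ge 0$ because of $c \in [0,1/2]$. The second inequality is due to $c \in [0,1/2]$, $\lambda \in [0,1]$, and $\tau(\cdot)$ being non-increasing and concave. We thus obtain the announced result. □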
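The closed-form derivatives in parts (i) and (ii) can be cross-checked numerically. The sketch below is a minimal check, assuming valuations uniform on [0, 1], an old market of mass λ, and concealing consumers who pay the cost c; the expressions in `gross_profit` and `consumer_surplus` are reconstructed under those assumptions rather than quoted, and the investment cost K_o is omitted, so only the terms without ∂K_o/∂c are compared.

```python
from fractions import Fraction as F

def cutoff_vo(lam, c):
    """Cutoff solving v - p*(v) - c = 0 (derived by hand; an assumption)."""
    return (1 + 2 * c * (1 - lam)) / (2 - lam)

def gross_profit(lam, c):
    """pi_Ko + K_o: anonymous revenue plus perfect-profiling revenue."""
    v = cutoff_vo(lam, c)
    return (1 - lam * v) ** 2 / (4 * (1 - lam)) + lam * v ** 2 / 2

def consumer_surplus(lam, c):
    """Surplus of concealing old-market buyers plus new-market buyers."""
    v = cutoff_vo(lam, c)
    p = (1 - lam * v) / (2 * (1 - lam))    # anonymous price, Equation (2)
    return lam * (1 - v) ** 2 / 2 + (1 - lam) * (1 - p) ** 2 / 2

def d_dc(f, lam, c, h=F(1, 1000)):
    """Central difference in c; exact here because f is quadratic in c."""
    return (f(lam, c + h) - f(lam, c - h)) / (2 * h)

lam, c = F(1, 2), F(1, 4)                  # hypothetical values
profit_slope = d_dc(gross_profit, lam, c)
cs_slope = d_dc(consumer_surplus, lam, c)

# Closed forms stated in the proof (without the dK_o/dc term).
profit_slope_stated = 2 * lam * c * (1 - lam) / (2 - lam)
cs_slope_stated = (lam * c - lam
                   + lam * (2 * lam - 3) * (lam * c - 1) / (lam - 2) ** 2)

# Part (ii): consumer surplus falls as the old market grows.
lams = [F(k, 10) for k in range(10)]
cs_decreasing = all(consumer_surplus(a, c) > consumer_surplus(b, c)
                    for a, b in zip(lams, lams[1:]))

print(profit_slope == profit_slope_stated,
      cs_slope == cs_slope_stated, cs_decreasing)
```

Because the cutoff is linear in c, both functions are quadratic in c and the central difference reproduces the derivative exactly, so the comparison can use equality of fractions.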

Proof of Corollary 3. (i) From Corollary 2, we know that ∂CSKo(λ, c)/∂λ< 0, and CSKo(λ, c)

is maximized at λ= 0, where it attains a value of 1/8. Recall that CS0(λ, c) = 1/8. We thus have

CSKo(λ, c)≤CS0(λ, c) for any λ and c.

(ii) We seek to establish the inequality by showing that $SW_{K_o}(\lambda, c) \ge SW_0(\lambda, c)$ for any $\lambda \ge 1/4$ when $K_o = 0$, i.e., the function $\tau(\cdot)$ is extremely small irrespective of $\lambda$ and $c$. Recall that we show in Lemma 4(iii) that $SW_{K_o}(\lambda, c)$ is a convex function in $c$. This result remains valid when $K_o = 0$, and its minimum is realized at $\partial SW_{K_o}(\lambda, c)/\partial c = 0$, i.e., $c^*(\lambda) = \frac{1-\lambda}{8-5\lambda}$. Plugging $c^*(\lambda)$ into $SW_{K_o}(\lambda, c)$, we have $SW_{K_o}(\lambda, c^*(\lambda)) = \frac{\lambda^2 - 4\lambda + 6}{2(8 - 5\lambda)}$. The second-order derivative of $SW_{K_o}(\lambda, c^*(\lambda))$ is given by
\[
\frac{d^2 SW_{K_o}(\lambda, c^*(\lambda))}{d\lambda^2} = \frac{54}{(8 - 5\lambda)^3} \ge 0,
\]
where the inequality is due to $\lambda \in [0,1]$. Consequently, $SW_{K_o}(\lambda, c^*(\lambda))$ is convex in $\lambda$. The two roots such that $SW_{K_o}(\lambda, c^*(\lambda)) = SW_0(\lambda, c) = 3/8$ are given by $\lambda_1 = 0$ and $\lambda_2 = 1/4$. Consequently, $SW_{K_o}(\lambda, c) \ge 3/8$ for any $\lambda \ge 1/4$ when $K_o = 0$, and we obtain the announced result. □
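The welfare bound in part (ii) can be checked on a grid with exact arithmetic. This is a minimal sketch assuming K_o = 0, valuations uniform on [0, 1], and an old market of mass λ; `social_welfare_gross` is reconstructed under those assumptions (gross profit plus consumer surplus), and all names are illustrative.

```python
from fractions import Fraction as F

def social_welfare_gross(lam, c):
    """Profit plus consumer surplus at K_o, with the investment cost set to 0."""
    v = (1 + 2 * c * (1 - lam)) / (2 - lam)    # cutoff v_o
    p = (1 - lam * v) / (2 * (1 - lam))        # anonymous price
    profit = (1 - lam) * p ** 2 + lam * v ** 2 / 2
    surplus = lam * (1 - v) ** 2 / 2 + (1 - lam) * (1 - p) ** 2 / 2
    return profit + surplus

def c_star(lam):
    """Minimizer in c stated in the proof."""
    return (1 - lam) / (8 - 5 * lam)

def welfare_at_c_star(lam):
    """Minimum welfare over c stated in the proof."""
    return (lam ** 2 - 4 * lam + 6) / (2 * (8 - 5 * lam))

lams = [F(k, 20) for k in range(20)]           # grid on [0, 1)
costs = [F(k, 20) for k in range(11)]          # grid on [0, 1/2]

formula_matches = all(
    social_welfare_gross(a, c_star(a)) == welfare_at_c_star(a) for a in lams)
c_star_minimizes = all(
    social_welfare_gross(a, c_star(a)) <= social_welfare_gross(a, c)
    for a in lams for c in costs)
roots = [a for a in lams if welfare_at_c_star(a) == F(3, 8)]
above_benchmark = all(
    social_welfare_gross(a, c) >= F(3, 8)
    for a in lams if a >= F(1, 4) for c in costs)

print(formula_matches, c_star_minimizes, roots, above_benchmark)
```

The grid includes λ = 0 and λ = 1/4, so the two roots named in the proof show up directly in `roots`.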

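Proof of Proposition 1. From Corollary 3(i), we know that $CS_{K_o}(\lambda, c) \le CS_0(\lambda, c)$ for any $\lambda$ and $c$, which is equivalent to $\pi_{K_o}(\lambda, c) - \pi_0(\lambda, c) \ge SW_{K_o}(\lambda, c) - SW_0(\lambda, c)$. It is easy to verify that $\pi_{K_o}(\lambda, c) - \pi_0(\lambda, c) = 0$ if $K_o = \overline{K}_{\lambda,c}$, and $SW_{K_o}(\lambda, c) - SW_0(\lambda, c) = 0$ if $K_o = \underline{K}_{\lambda,c}$. $\overline{K}_{\lambda,c} \ge \underline{K}_{\lambda,c}$ is guaranteed by the facts that $\pi_{\underline{K}_{\lambda,c}}(\lambda, c) - \pi_0(\lambda, c) \ge SW_{\underline{K}_{\lambda,c}}(\lambda, c) - SW_0(\lambda, c) = 0 = \pi_{\overline{K}_{\lambda,c}}(\lambda, c) - \pi_0(\lambda, c)$, and $\pi_{K_o}(\lambda, c)$ is decreasing in $K_o$.

Consider the following three scenarios: (i) $K_o > \overline{K}_{\lambda,c}$, and thus $0 > \pi_{K_o}(\lambda, c) - \pi_0(\lambda, c) \ge SW_{K_o}(\lambda, c) - SW_0(\lambda, c)$. In this case, the firm gains a higher profit from not investing in consumer profiling, and social welfare is also higher when the firm invests zero. (ii) $\underline{K}_{\lambda,c} < K_o \le \overline{K}_{\lambda,c}$, and thus $\pi_{K_o}(\lambda, c) - \pi_0(\lambda, c) \ge 0 > SW_{K_o}(\lambda, c) - SW_0(\lambda, c)$. The firm is better off investing $K_o$ in consumer profiling; however, social welfare is maximized when the firm invests zero. (iii) $K_o \le \underline{K}_{\lambda,c}$, and thus $\pi_{K_o}(\lambda, c) - \pi_0(\lambda, c) \ge SW_{K_o}(\lambda, c) - SW_0(\lambda, c) \ge 0$. Both the firm’s profit and social welfare are maximized when the firm invests $K_o$ in consumer profiling. We thus obtain the announced results. □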

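Proof of Corollary 4. (i) The first-order derivative of $\overline{K}_{\lambda,c} - \underline{K}_{\lambda,c}$ with respect to $\lambda$ is given by
\begin{align*}
\frac{\partial(\overline{K}_{\lambda,c} - \underline{K}_{\lambda,c})}{\partial \lambda}
&= \frac{1}{2(\lambda - 2)^2}\left[(-5\lambda^2 + 10\lambda - 4)c^2 + 2(\lambda - 1)^2 c\right] + \frac{(\lambda c - 1)^2(\lambda - 1)}{(\lambda - 2)^3} \\
&\ge \frac{1}{2(\lambda - 2)^2}\left[(-5\lambda^2 + 10\lambda - 4)c^2 + 2(\lambda - 1)^2 \cdot 2c^2\right] + \frac{(\lambda c - 1)^2(\lambda - 1)}{(\lambda - 2)^3} \\
&= \frac{\lambda c^2}{2(2 - \lambda)} + \frac{(\lambda c - 1)^2(\lambda - 1)}{(\lambda - 2)^3} \ge 0.
\end{align*}
The first inequality is due to $2(\lambda - 1)^2 \ge 0$ and $c \in [0,1/2]$. The second inequality is due to $\lambda \in [0,1]$. Consequently, we conclude that $\overline{K}_{\lambda,c} - \underline{K}_{\lambda,c}$ is increasing in $\lambda$.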

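(ii) The first- and second-order derivatives of $\overline{K}_{\lambda,c} - \underline{K}_{\lambda,c}$ with respect to $c$ are given by
\[
\frac{\partial(\overline{K}_{\lambda,c} - \underline{K}_{\lambda,c})}{\partial c} = \frac{\lambda(\lambda - 1)\left[(4 - 3\lambda)c + (\lambda - 1)\right]}{(\lambda - 2)^2},
\qquad
\frac{\partial^2(\overline{K}_{\lambda,c} - \underline{K}_{\lambda,c})}{\partial c^2} = \frac{\lambda(\lambda - 1)(4 - 3\lambda)}{(\lambda - 2)^2}.
\]
Because $\lambda \in [0,1]$, $\partial^2(\overline{K}_{\lambda,c} - \underline{K}_{\lambda,c})/\partial c^2 \le 0$. Its maximum is realized at $c^*$, which is given by $\partial(\overline{K}_{\lambda,c} - \underline{K}_{\lambda,c})/\partial c = 0$, i.e., $c^* = (1 - \lambda)/(4 - 3\lambda)$. It is easy to verify that $c^*$ is decreasing in $\lambda$, and $c^* \in [0,1/4]$ as $\lambda \in [0,1]$. The minimum of $\overline{K}_{\lambda,c} - \underline{K}_{\lambda,c}$ is realized at either $c = 0$ or $c = 1/2$. When $c = 0$, $\overline{K}_{\lambda,0} - \underline{K}_{\lambda,0} = \frac{3 - 2\lambda}{2(2 - \lambda)^2} - \frac{3}{8}$, and when $c = 1/2$, $\overline{K}_{\lambda,1/2} - \underline{K}_{\lambda,1/2} = \frac{\lambda}{8}$. Taking the difference of the two, we have
\[
(\overline{K}_{\lambda,0} - \underline{K}_{\lambda,0}) - (\overline{K}_{\lambda,1/2} - \underline{K}_{\lambda,1/2}) = \frac{\lambda^2(1 - \lambda)}{8(2 - \lambda)^2} \ge 0,
\]
where the inequality is due to $\lambda \in [0,1]$. We thus obtain the announced results. □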
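The endpoint values and the location of the maximum in this proof can be verified numerically. The sketch below rests on one derived identity that is not stated in the text: since the two thresholds are where π − π_0 and SW − SW_0 vanish (proof of Proposition 1), the width of the inefficiency interval equals CS_0 − CS_{K_o} = 1/8 − CS_{K_o}. The consumer-surplus expression assumes valuations uniform on [0, 1] and an old market of mass λ; all names are illustrative.

```python
from fractions import Fraction as F

def consumer_surplus(lam, c):
    """Surplus of concealing old-market buyers plus new-market buyers."""
    v = (1 + 2 * c * (1 - lam)) / (2 - lam)    # cutoff v_o
    p = (1 - lam * v) / (2 * (1 - lam))        # anonymous price
    return lam * (1 - v) ** 2 / 2 + (1 - lam) * (1 - p) ** 2 / 2

def interval_width(lam, c):
    """Width of the inefficiency interval, read here as 1/8 - CS_Ko."""
    return F(1, 8) - consumer_surplus(lam, c)

lams = [F(k, 20) for k in range(20)]           # grid on [0, 1)
costs = [F(k, 20) for k in range(11)]          # grid on [0, 1/2]

# Endpoint values stated in the proof.
at_zero = all(interval_width(a, F(0))
              == (3 - 2 * a) / (2 * (2 - a) ** 2) - F(3, 8) for a in lams)
at_half = all(interval_width(a, F(1, 2)) == a / 8 for a in lams)
endpoint_gap = all(
    interval_width(a, F(0)) - interval_width(a, F(1, 2))
    == a ** 2 * (1 - a) / (8 * (2 - a) ** 2) for a in lams)

# The width peaks at c* = (1 - lam) / (4 - 3 lam) and grows with lam.
peak_at_c_star = all(
    interval_width(a, (1 - a) / (4 - 3 * a)) >= interval_width(a, c)
    for a in lams for c in costs)
increasing_in_lam = all(
    interval_width(a, c) <= interval_width(b, c)
    for a, b in zip(lams, lams[1:]) for c in costs)

print(at_zero, at_half, endpoint_gap, peak_at_c_star, increasing_in_lam)
```

If the identity for the interval width did not hold, the endpoint checks against the stated values at c = 0 and c = 1/2 would fail, so they double as a test of that reading.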

Proof of Proposition 2. We use superscript $i = 1, 2$ to denote the two scenarios. Because $\tau^1(\gamma) \ge \tau^2(\gamma)$, $\forall \gamma \in [0,1]$, $\pi^1_{K_o}(\lambda, c) \le \pi^2_{K_o}(\lambda, c)$ for any $\lambda$ and $c$. Recall that $\pi^1_0(\lambda, c) = \pi^2_0(\lambda, c) = 1/4$. Given different combinations of $\lambda$ and $c$, we have the following three cases.

(1) $\pi^1_{K_o}(\lambda, c) \le \pi^2_{K_o}(\lambda, c) \le 1/4$. Investing in consumer profiling is unprofitable for the firm under both $\tau^1$ and $\tau^2$. Consequently, the optimal decision for the firm is not to invest, which leads to $\pi^1(\lambda, c) = \pi^2(\lambda, c) = 1/4$, $CS^1(\lambda, c) = CS^2(\lambda, c) = 1/8$, and $SW^1(\lambda, c) = SW^2(\lambda, c) = 3/8$. Recall that the firm’s decision not to invest is always socially optimal.

(2) $\pi^1_{K_o}(\lambda, c) \le 1/4 < \pi^2_{K_o}(\lambda, c)$. In this case, the firm invests in consumer profiling under $\tau^2$, but does not invest under $\tau^1$. As a result, $\pi^2(\lambda, c) = \pi^2_{K_o}(\lambda, c) > 1/4 = \pi^1_0(\lambda, c) = \pi^1(\lambda, c)$, and $CS^2(\lambda, c) = CS^2_{K_o}(\lambda, c) \le CS^2_0(\lambda, c) = CS^1_0(\lambda, c) = CS^1(\lambda, c)$, where the inequality is given by Corollary 3. As shown in Proposition 1, the firm’s decision not to invest under $\tau^1$ is always socially optimal; however, the investment decision under $\tau^2$ may not be efficient if $\tau^2(\lambda, c) > \underline{K}_{\lambda,c}$.

(3) $1/4 < \pi^1_{K_o}(\lambda, c) \le \pi^2_{K_o}(\lambda, c)$. The firm invests in consumer profiling under both $\tau^1$ and $\tau^2$. Because $\tau^1(\gamma) \ge \tau^2(\gamma)$, $\forall \gamma \in [0,1]$, we have $\pi^1(\lambda, c) = \pi^1_{K_o}(\lambda, c) \le \pi^2_{K_o}(\lambda, c) = \pi^2(\lambda, c)$. Consumer surplus is identical under the two scenarios because it is independent of the scenario should the firm decide to invest. Lastly, let us consider the efficiency of the firm’s investment. If the investment is socially optimal under $\tau^1$, which indicates that $\tau^1(\lambda, c) \le \underline{K}_{\lambda,c}$, then the investment decision is also socially optimal under $\tau^2$ because $\tau^2(\lambda, c) \le \tau^1(\lambda, c) \le \underline{K}_{\lambda,c}$. However, the reverse is not true.

Summarizing the results from the three cases, we thus obtain the announced results. □