The Pennsylvania State University
The Graduate School
College of the Liberal Arts
ESSAYS IN STRATEGIC EXPERIMENTATION AND BARGAINING
A Dissertation in
Economics
by
Kaustav Das
© 2013 Kaustav Das
Submitted in Partial Fulfillment
of the Requirements
for the Degree of
Doctor of Philosophy
August 2013
The dissertation of Kaustav Das was reviewed and approved* by the following:
Kalyan Chatterjee
Distinguished Professor of Economics and Management Science
Dissertation Adviser, Chair of Committee

Edward Green
Professor of Economics

Vijay Krishna
Distinguished Professor of Economics and Director of Graduate Studies

Susan H. Xu
Professor of Management Science and Supply Chain Management
*Signatures are on file in the Graduate School.
Abstract
This dissertation consists of four chapters.
Chapter 1 analyses a situation where competing agents involved in making the same
discovery have alternative research avenues to pursue. Agents are uncertain about the
quality of the available research methods and learn about a particular method in light
of their search experiences. One can relate this to R&D activities in the pharmaceutical
industry, the electronics industry, etc. This scenario is modeled as a two-armed bandit
problem. We consider two alternative settings: one has two risky arms that are perfectly
negatively correlated, and the other has one safe arm and one risky arm. I show that
with a winner-takes-all structure and heterogeneity among agents with respect to their
innate abilities, there is always an excessive amount of experimentation along one of the
lines of research. This phenomenon is called duplication: there is too much
specialisation along a line of research when efficiency would require more diversification.
Chapter 2 explores the scenario where competing agents trying to make the same
discovery have alternative methods of research to choose from and agents may be privately
informed about the quality of a method. The model is an extension of the second setting
with symmetric firms, where each firm may experience private arrival of information along
the good risky avenue. I show that there is a symmetric non-cooperative equilibrium in
which there is an excessive amount of experimentation along the risky avenue if the prior
is high enough and too little otherwise.
Chapter 3¹ analyses a model of price formation in a market with a finite number of
non-identical agents engaging in decentralised bilateral interactions. We focus mainly on
equal numbers of buyers and sellers, though we discuss other cases. All characteristics
of agents are assumed to be common knowledge. Buyers simultaneously make targeted
offers, which sellers can accept or reject. Acceptance leads to a pair exiting and rejection
leads to the next period. Offers can be public, private or “ex ante public” (as in directed
search models, which are, however, mostly one-period in the preceding literature). As the
discount factor goes to 1, the price in all transactions converges to the same value.
Chapter 4² studies a model of decentralised bilateral interactions in a small
market where one of the sellers has private information about her value. There are two
identical buyers and another seller, whose valuation is commonly known to be in between
the two possible valuations of the informed seller. We consider two infinite-horizon games,
with public and private simultaneous one-sided offers respectively and simultaneous
responses. We show that there is a stationary perfect Bayes’ equilibrium for both models
such that prices in all transactions converge to the same value as the discount factor goes
to 1.
Keywords: R&D competition, Two-armed Bandit, Duplication, Bilateral Bargaining,
Outside options, Incomplete information, Coase Conjecture, Uniform Price
¹ Co-authored with Kalyan Chatterjee.
² Co-authored with Kalyan Chatterjee.
Contents
Dedication
Acknowledgments
1 Competition, Duplication and Learning in R&D
  1.1 Environment 1
    1.1.1 Beliefs
    1.1.2 Social Planner’s problem: The Efficiency Benchmark
    1.1.3 The non-cooperative game
  1.2 Environment 2
    1.2.1 Symmetric firms
    1.2.2 Asymmetric firms
  1.3 Conclusion
2 Competition and Learning in R&D: The Role of Private Information
  2.1 Introduction
  2.2 Environment
    2.2.1 The planner’s problem: The full information optimal
  2.3 The non-cooperative game
    2.3.1 Equilibrium
  2.4 Conclusion
3 Decentralised Bilateral Trading, Competition for Bargaining Partners and the law of one price
  3.1 Introduction
    3.1.1 Motivation for the problem studied
    3.1.2 Main features of our model
    3.1.3 Related literature
  3.2 The basic framework
    3.2.1 The model
    3.2.2 Equilibrium in the basic model
    3.2.3 Adding a seller
    3.2.4 Heterogeneous buyers
    3.2.5 Adding a buyer
    3.2.6 Generalisation 1: n buyers and n sellers
    3.2.7 Generalisation 2: n buyers and n-1 sellers
  3.3 Conclusion
4 Decentralised Bilateral Trading in a Market with Incomplete Information
  4.1 Introduction
  4.2 The Model
    4.2.1 Players and payoffs
    4.2.2 The extensive form
  4.3 Equilibrium
    4.3.1 The Benchmark Case: Complete information
    4.3.2 Equilibrium of the one-sided incomplete information game with two players
    4.3.3 Equilibrium of the four-player game with incomplete information
  4.4 Asymptotic characterization
  4.5 A non-stationary equilibrium
  4.6 Conclusion
Bibliography
Appendix
  A.1 Solution for planner’s v(p)
  A.2 Switching-derivative lemma
  A.3 Auxiliary results
    A.3.1 For the proof of proposition (3)
    A.3.2 For the proof of proposition (8)
    A.3.3 For the proof of lemma (8)
  A.4 Strategy depending on both belief and the location of the opponent
  A.5 Proof of Lemma 11
  A.6 Proof of Lemma 13
  A.7 Proof of Lemma 23
  A.8 Proof of Proposition 15
  A.9 Proof of Proposition 16
  A.10 Proof of Proposition 17
  A.11 Details of the equilibria defined in proposition (18)
    A.11.1 Ph < 1 and 1 − Ph > qH
    A.11.2 Ph < 1 and 1 − Ph < qH
    A.11.3 Ph ≥ 1
  A.12 Off-path behavior of the 2 player game with incomplete information
  A.13 Off-path behavior of the 4 player game with incomplete information (public offers)
  A.14 Off-path behavior with private offers
Dedication
To
My parents
Gopa Das and Pabitra Kumar Das,
for the way they have brought me up;
and
Gurudev Rabindranath Tagore,
The great poet and musician,
whose songs are a constant source of inspiration to me.
Acknowledgments
This dissertation would not have been possible without the wisdom, kindness and
friendship of many people.
First and foremost, I am indebted to my adviser Prof. Kalyan Chatterjee. In the past
five years that I have been with him, I have learnt immensely from him as an instructor in
the classroom, as a mentor in the PhD program, and as an erudite scholar in economics.
But for his continuous guidance and encouragement, I would have never ended up doing
research. Chapters 3 and 4 of this dissertation are joint work with him, and in the course
of working on these projects, I learnt the basic approach to thinking about a research
question and then arriving at a solution. I will never forget how, in spite of being in an
extremely constrained situation, he kept providing me with support during the past
year when I was searching for a job. Prof. Chatterjee has set an example for me of how
a supervisor should be. I am extremely fortunate to have had him as my adviser.
I am extremely grateful to Prof. Edward Green and Prof. Vijay Krishna for serving as
field members on my dissertation committee. Their thoughtful and detailed comments have
helped this dissertation look much better. Prof. Susan Xu was generous enough to serve
as an outside committee member and her comments were also helpful. I must thank Prof.
James Jordan for his constant help over these years. Help from Venky Venkateswaran and
Alex Monge is gratefully acknowledged. A special note of thanks to Prof. Bhaskar Dutta for
his comments on chapters 3 and 4, and to Prof. Sven Rady for his comments on chapters
1 and 2.
I am grateful to my friends at the Department of Economics at Penn State with whom I
have had numerous insightful discussions, both academic and non-academic. A special note
of thanks goes to Ethem Akyol and Pathikrit Basu. Ethem and I have studied together on
many occasions and association with him has enhanced my technical rigor to a large extent.
Pathikrit was generous enough to go through drafts of my papers and give constructive
feedback. Thanks to Bruno and Nail for being wonderful office mates. I would also like
to thank all my friends in State College who made my stay an enjoyable and enriching
experience.
I have been fortunate enough to be taught by many dedicated teachers in Presidency
College and the Indian Statistical Institute. I would especially like to mention Prof. Amitava
Bose and Prof. Ambar Ghosh in this context. Prof. Ghosh motivated me in my study of
Economics in my early days in Presidency College. It is because of him that I
decided to pursue higher studies in the subject. Prof. Amitava Bose is one of the finest
people I have ever come across. He has provided me with constant encouragement and
has taught me how to live life in an enjoyable manner, amidst all challenges during the
graduate program.
This acknowledgement would remain incomplete without mentioning my parents and
my elder brother, Gaurav. It is only because of my father that I pursued Economics
at the undergraduate level. My family has always been with me over these challenging
years and provided me with constant motivation. Finally, many thanks to Atisha for her
encouragement through the highs and lows during the past year.
Chapter 1
Competition, Duplication and
Learning in R&D
Innovation constitutes an important part of the progress of a society. From the
growth rate of an economy to the various aspects that affect the day-to-day life of
individuals, R&D activities play an important role. Innovation is a costly and uncertain
process. The uncertainty arises from the fact that the exact path along which R&D
activities will bear success is unknown. Potential innovators therefore go through
trial-and-error experimentation along the available research avenues. Since experimentation
along an avenue involves a cost (explicit or implicit), it is desirable that at any
point of time resources are optimally (from society’s point of view) spread among the
available methods. Experimentation along a wrong avenue will delay the invention. If
society discounts the future, then this delay imposes a cost.
This problem is prevalent in those industries where success in R&D activities comes
through a series of trials and errors across different methods of experimentation. Hence,
apart from the choice of scale of the R&D activities, choosing among alternative research
projects is also an important issue. Most of the existing literature on patent races
has mainly been concerned with the overall level of firms’ investment in
R&D activities (for example, Reinganum (1982), Loury (1979), Lee and Wilde (1980), and
Dasgupta and Stiglitz (1980)). However, there have been very few attempts to analyse the
issue of efficient allocation of R&D activities between competing lines of enquiry. In the
present chapter, to isolate this aspect, we fix the total amount of R&D resources and focus
solely on the issue of allocating resources between competing avenues of research.
It is commonly observed that similar innovations are simultaneously tried out by
competing firms, which might differ with respect to their abilities. This is of importance in
many real-life instances. Consider the research activities to invent a drug for Alzheimer’s
disease. This disease is estimated to cost America alone some 170 billion dollars a year. At
the turn of the century, research to invent a drug for this disease seemed promising. The
physical manifestations of the disease are plaques of a type of protein known as β-amyloid,
and nerve-cell-engulfing tangles of a protein known as tau. However, the exact causation is
not known. Hence, given a level of resources to invent a drug for the disease, it is essential
to choose a proper allocation of resources across the methods of experimentation.
When several firms are independently engaged in R&D activities, the way the firms are
compensated in the market is approximately of the winner-takes-all form. This form
of compensation is mimicked by the institution of patents. The winner-takes-all hypothesis
can also be perceived as an idealisation of the fact that, even in the absence of patents, there
are many real-world observations where the rent accruing to the first inventor is
disproportionately higher than the rents accruing to later inventors. This is because
the first inventor in many situations makes great inroads into the market and
earns a huge share of the rent from the invention. For example, at present three big
companies are engaged in research to invent a drug for Alzheimer’s disease: Pfizer, Eli
Lilly and Baxter. Given the high perceived valuation of a possible drug, it is evident that
whoever invents the drug first will make a disproportionately higher amount of money than
the later inventors. To cite another example, Xerox Corporation was the first firm to invent
a photocopier using the xerography technology. Although other competing companies
came up with photocopiers, Xerox reaped a rent that was disproportionately
higher than that earned by the companies that subsequently invented photocopiers.
The issue of choosing among competing research avenues was also observed in the
development of an asthma medicine that tried to block the action of leukotrienes.
Research was conducted along two broad approaches, namely an inhibition strategy and an
antagonist strategy. After experimenting with both, the inhibition strategy was abandoned
and success ultimately came along the antagonist strategy. Apart from these, one can also
look for instances in the electronics industry. For example, in the race in the 1970s to
invent marketable video players, RCA and Sony adopted different approaches. Sony’s
approach bore success (and consequently earned a huge profit), while RCA lost a huge amount
of money ([31], [58]).
In situations similar to the above, when the patent mechanism is such that the first one to
invent appropriates all the rent, a particular firm’s decision about which research avenue
to pursue is affected not only by its own belief but also by the choices of other firms. It is
worth exploring the efficient allocation of firms across research avenues and the distortions
that can take place in a non-cooperative interaction. A possible distortion would be all
firms engaged in R&D experimenting on the same approach, whilst the socially optimal
allocation would involve diversifying effort across different approaches. This phenomenon is
called duplication. It involves a firm imitating its competitor in a situation where the social
optimum would require the firm to adopt a different strategy. There are many real-life
stylized facts which might be a manifestation of this phenomenon. For example, consider
the Alzheimer’s drug research case. It was widely believed that the level of β-amyloid
protein is the main culprit. Consequently, for the past two decades almost exclusive attention
was given to developing drugs to remove amyloid plaques. However, not much success has
been attained in this direction. The drugs which are presently on the market only delay
the onset of the disease ([20]). As a consequence, the theory that β-amyloid protein
is the culprit is waning and the conjecture that tau proteins are to be blamed is gaining
ground. However, major R&D activities still involve the removal of amyloid plaques.
This chapter analyses highly stylized models to address this issue. The analysis is done
using a strategic bandit setting. Two environments are considered. The first environment
has two firms (1 and 2) trying to make the same invention, for example to find the correct
explanation for why Alzheimer’s disease occurs. There are two available research avenues,
of which one and only one can lead to success (in the context of our Alzheimer’s
disease example, either tau or amyloid is the correct explanation). The setup is similar
to that of a buried treasure problem with two sites $S_1$ and $S_2$. One and only one of the
sites contains the treasure. Firms know that with probability $p$ the treasure is at $S_1$. Hence
we are analysing a two-armed bandit model where both arms are risky and perfectly
negatively correlated. Also, agents here are operating on the same bandit. Conditional on
the treasure being present at a particular site (the arm being good), the success of a firm
searching there is governed by a Poisson process. The intensity of this Poisson process
is common knowledge. For a particular site (arm), this intensity differs across firms. It is
assumed that while firm 1 is better than firm 2 at $S_1$, firm 2 has an edge over firm 1 at
$S_2$. The firm that discovers the treasure first appropriates all the rent, which is normalised
to 1. This is equal to the social value of the invention. We consider a continuous-time
framework where a firm, based on the prior, chooses an initial site to carry out research
and a time point (or equivalently a posterior) at which it decides to switch, conditional
on no discovery until that time. The choice of sites by the firms is publicly observable.
Conditional on there being no discovery until a given time point, firms update $p$ using
Bayes’ rule on the basis of their search experiences until then.
We first obtain the efficiency benchmark by solving the planner’s problem. The planner
at any instant can choose a site for each of the firms to carry out research. The objective of
the planner is to maximise the expected discounted social surplus, given the firms’
abilities and the likelihood of a site being the correct one. Efficiency involves allocating
both firms to the same site (specialization) for extreme ranges of beliefs and allocating
them to different sites (diversification) for an intermediate range of beliefs. In the absence
of any heterogeneity between the firms, the range of beliefs over which the planner allocates
firms to different sites shrinks to the point $\frac{1}{2}$. Next, we fully characterise the
non-cooperative equilibria. Attention is restricted to Markovian strategies only.¹ We show that
when firms’ abilities differ, the efficient allocation can never be achieved. The non-cooperative
interaction always involves duplication. If the extent of the heterogeneity between the firms
(relative to other parameters) is large enough, then we have a unique non-cooperative
equilibrium in threshold-type Markovian strategies. This equilibrium outcome involves
diversification over a range of beliefs. If the extent is not large enough, then we have a
multiplicity of equilibria involving duplication, with diversification at one point only. We
show that the only situation in which the efficient allocation can be achieved in a
non-cooperative interaction is when the firms are homogeneous.
In the second environment there are two sites. One of them is referred to as the
safe site. The safe site has the treasure for sure. Any firm searching there obtains success
according to a Poisson process with intensity $\pi_0 > 0$. The risky site can be either good
or bad. A bad risky site has no treasure. A good risky site has the treasure, and if firm
$i$ searches there, it obtains success according to a Poisson process with intensity $\pi_i$. We
have $\pi_1 \geq \pi_2 > \pi_0 > 0$. First, we solve the planner’s problem to obtain the efficiency
benchmark. When $\pi_1 = \pi_2$, efficiency requires allocating both firms to the risky
site if the belief exceeds a threshold and allocating both to the safe site otherwise. This can
also be obtained as a non-cooperative outcome in threshold-type Markovian strategies. With
heterogeneous firms, efficiency requires diversification: there is a range of beliefs
over which the superior firm is allocated to the risky site and the other one to the safe site.
We show that there is a unique equilibrium in threshold-type Markovian strategies which
echoes the phenomenon of duplication.

¹ In the present model, the state of Markovian strategies should include both the belief and
the location of the opponent firm. However, in the body of the chapter we concentrate only on
the effect of beliefs. In the appendix we show that this does not really matter.
By establishing the phenomenon of duplication in two different kinds of environments,
we can see that duplication is not an artefact of a particular model. It is basically a
manifestation of the heterogeneity between the firms and the competition between the agents.
Alternatively, the above analysis characterises the non-cooperative equilibria in
threshold-type Markovian strategies for two different two-armed bandit models, with players
differing with respect to their innate abilities and with payoff externalities.
Related Literature: This chapter contributes to a relatively less explored area of
the broad literature on R&D races. It shows that in the presence of heterogeneity and
competition among agents, there is always a distortion in the choice of research avenue in a
non-cooperative interaction. Bhattacharya and Mookerjee ([7]) and Dasgupta and Maskin ([18])
are two of the early papers which explore this issue in a static framework. Chatterjee and
Evans ([12]) analyse similar issues in a dynamic setting. The model of this chapter is
closely related to ([12]), except that we consider site-specific knowledge and a
continuous-time framework. However, here we can show that we always have duplication
in the non-cooperative interaction. Some other papers that look into similar issues are
Fershtman and Rubinstein ([22]) and Akcigit and Liu ([1]). ([22]) studies a two-stage model in
which agents simultaneously rank a finite set of boxes. Exactly one of the boxes contains
the prize. Players commit to opening the boxes according to their ranked order. Inefficiency
arises because the box which is most likely to contain the prize is not opened first.
Their model is basically static in nature.
This chapter also contributes to the strategic bandit literature. Some of the seminal
papers which have studied the bandit problem in the context of economics are Bolton
and Harris ([9]), Keller, Rady and Cripps ([37]), Keller and Rady ([38]), Klein and Rady
([40]), Klein ([39]) and Thomas ([61]). In all of these papers except ([61]) and ([40]), players
have replicas of bandits. Free-riding is a common feature of all the above models. This
leads to an inefficient level (too little) of experimentation. The present work differs from
([37]) and ([38]) in two ways. First, we have payoff externalities. Due to this, the phenomenon
of free-riding does not arise (in the first two environments). Secondly, agents differ with
respect to their innate abilities. This gives us inefficiency in equilibrium, the nature of
which is different from that in ([37]) and ([38]).
([61]) analyses a set-up where each player has access to an exclusive risky arm, and
both of them have access to a common safe arm. At any time the safe arm can be accessed
by one player only. Hence there is congestion along an arm. The present chapter differs
from this in that here each of the arms can be accessed by all the players. Further,
we do not have congestion along any of the arms.
The model analysed in ([40]) has each player having a bandit with a safe arm and a
risky arm. The risky arm of one player is perfectly negatively correlated with the risky arm
of the other player. Environment 1 in the present chapter differs from this in the following
way. We have two arms, both of which are risky and perfectly negatively correlated (there
is no safe arm). Each arm can be accessed by all the players. ([39]) addresses a model where
players have replicas of bandits with three arms. One of the arms is safe and the other
two are risky. The risky arms are perfectly negatively correlated. Thus, unlike in the
present work, there is no payoff externality between the players.
Players in the present chapter differ with respect to their innate abilities, a feature which
is absent in ([61]), ([40]) and ([39]). This appears to be the first successful attempt
in the bandit literature to work out explicitly models that incorporate differences in the
learning abilities of the players along an arm.² Of course, we only analyse settings where
players operate on the same bandit.
The rest of the chapter is organised as follows. Sections 2 and 3 analyse the models
in environments 1 and 2 respectively. Section 4 describes the analysis of a situation where
there is private arrival of information. Finally, section 5 concludes the chapter.

² Klein and Rady ([40]) discuss this issue in their work. Also, Akcigit and Liu ([1]) have this
feature in their model. However, their work is solely concerned with private arrival of
information.
1.1 Environment 1
Two firms (1 and 2) are simultaneously searching for a prize which is worth 1 unit. The
first inventor appropriates all the rent from it. There are two potential avenues along which
the research can be conducted. We refer to these avenues as sites. Hence there are two
sites ($S_1$ and $S_2$) and the treasure is located at one and only one of them. However, the
correct site is unknown to both firms. It is only publicly known that with probability
$p$ ($p \in (0, 1)$), $S_1$ is the site which contains the treasure.
Firms’ capability of conducting research at the outset is site specific. While firm 1 is
better at searching at $S_1$, firm 2 does relatively better at $S_2$ (conditional on the treasure
being present at the respective sites). Time is continuous and firms discount the future at
a continuous-time discount rate $r > 0$.
Conditional on the treasure being present at a particular site, the success of a firm
searching there is governed by a Poisson process. The intensity of this Poisson process
directly reflects the level of basic research knowledge a firm possesses for conducting research
at that particular site. Given that the treasure is located at $S_1$, the Poisson intensity of the
success of firm 1 is $\pi'$ and that of firm 2 is $\pi$. Similarly, given that the treasure is located
at $S_2$, the Poisson intensity of the success of firm 2 is $\pi'$ and that of firm 1 is $\pi$, where
$$\pi' > \pi > 0.$$
The abilities of the firms across sites are common knowledge.
1.1.1 Beliefs
Each firm can observe the site where its opponent is searching. If there is an invention,
it is immediately revealed. In the present model, this implies that the outcomes of research
by firms are publicly observable. Thus, given the players’ common prior $p_0$, at each time
point $t \geq 0$ players share a common posterior $p_t$, which is derived using Bayes’ rule on
the basis of the outcomes observed until then. Over the time interval $[t, t + \Delta]$ ($\Delta > 0$), if
both firms 1 and 2 carry out research at $S_1$ without having any success, then the common
posterior at $t + \Delta$ is given by
$$p_{t+\Delta} = \frac{p_t e^{-(\pi+\pi')\Delta}}{p_t e^{-(\pi+\pi')\Delta} + 1 - p_t}.$$
The posterior is decreasing in $\Delta$. The longer the firms conduct research at $S_1$ without
finding the treasure, the less optimistic they become about the treasure being present at
$S_1$ (and, simultaneously, the more optimistic they become about $S_2$ having the treasure). If
the firms conduct research at $S_1$ for the time interval $dt \to 0$ (such that terms of order
$o(dt)$ can be ignored), then the law of motion followed by the belief is given by
$$dp_t = -(\pi + \pi') p_t (1 - p_t)\, dt.$$
Similarly, if the firms carry out research at $S_2$, the law of motion of the belief is given
by $dp_t = (\pi + \pi') p_t (1 - p_t)\, dt$. Given the parametric assumptions of the model, it is easy
to see that there is no change in beliefs when each site is exploited by one firm only and
there is no arrival. This can be explained as follows. Suppose firm 1 explores $S_1$ and firm
2 explores $S_2$ over the time interval $[t, t + \Delta]$ and there is no arrival. In that case, because
of firm 1’s exploration, $p$ gets updated downwards and because of firm 2’s exploration, $p$
gets updated upwards. Thus, as the duration of the interval $dt \to 0$, the total change in $p$
is given by
$$dp_t = -\pi' p_t (1 - p_t)\, dt + \pi' p_t (1 - p_t)\, dt = 0.$$
This explains why the beliefs are frozen if each firm exploits one site.
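The belief dynamics described in this subsection can be checked numerically. The following is a minimal sketch, not part of the original analysis: the function names and the arguments `pi_low` and `pi_high` (standing for $\pi$ and $\pi'$) are illustrative assumptions. It computes the posterior after a joint unsuccessful search at $S_1$ and the drift of the belief for any allocation of the two firms.

```python
import math


def posterior_after_joint_search(p, pi_low, pi_high, delta):
    """Posterior that S1 holds the treasure after both firms search S1
    for a duration delta without success (Bayes' rule)."""
    no_success = math.exp(-(pi_low + pi_high) * delta)
    return p * no_success / (p * no_success + 1.0 - p)


def belief_drift(p, pi_low, pi_high, firm1_site, firm2_site):
    """dp/dt conditional on no arrival.

    Assumed convention: firm 1 has intensity pi_high at S1 and pi_low at S2;
    firm 2 has intensity pi_low at S1 and pi_high at S2. Sites are coded 1 or 2.
    """
    # Total arrival rate at each site, conditional on that site being correct.
    rate_s1 = (pi_high if firm1_site == 1 else 0.0) + (pi_low if firm2_site == 1 else 0.0)
    rate_s2 = (pi_low if firm1_site == 2 else 0.0) + (pi_high if firm2_site == 2 else 0.0)
    return -(rate_s1 - rate_s2) * p * (1.0 - p)


if __name__ == "__main__":
    # Joint search at S1 lowers the belief; diversified search freezes it.
    print(posterior_after_joint_search(0.6, 0.5, 1.0, 1.0))
    print(belief_drift(0.6, 0.5, 1.0, 1, 1))
    print(belief_drift(0.6, 0.5, 1.0, 1, 2))
```

When both firms are at $S_1$ the drift equals $-(\pi + \pi')p(1-p)$, when both are at $S_2$ it has the opposite sign, and when firm 1 is at $S_1$ and firm 2 at $S_2$ the two effects cancel, as in the text.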
1.1.2 Social Planner’s problem: The Efficiency Benchmark
We solve for the utilitarian social planner’s optimal behavior in our present set-up. The
planner allocates each firm to a site based on the firm’s ability of conducting research along
that site, and the likelihood of that site containing the treasure. Let pt be the common
subjective probability at time t which the firms assign to S1 being the correct site. The
planner’s payoff from the invention is 1 unit.
The planner wants to maximise the expected discounted social value by choosing an
appropriate action profile at each instant. $k_t = (k_{1t}, k_{2t})$ denotes the action profile chosen
by the planner at the instant t. $k_{it}$ (i = 1, 2) can take values in {0, 1} only. $k_{it} = 1$ implies
that the planner allocates both the firms to $S_i$. If $k_{1t} = k_{2t} = 0$, the
planner allocates firm 1 to S1 and firm 2 to S2.³ Hence we must have
$$k_{1t} + k_{2t} \le 1$$
kt(t ≥ 0) is such that it is measurable with respect to the information available at the time
point t.
Assumption 1 If the planner is indifferent between allocating firm 1 (2) to S1 and S2, then
it allocates 1 (2) to S1 (S2). Since in the current set-up beliefs can move in both directions,
this ensures a well-defined solution to the corresponding law of motion for posterior beliefs.
This is closely related to the admissibility assumption in ([40]) and ([39]).
The expected discounted payoff to the planner can then be expressed as:
$$E\left[\int_0^\infty e^{-rt}\Big[(1-k_{1t}-k_{2t})\pi' + k_{1t}\,p_t(\pi+\pi') + k_{2t}(1-p_t)(\pi+\pi')\Big]e^{X(t)}\,dt\right],$$
where
$$X(t) = -\int_0^t \Big\{(1-k_{1\tau}-k_{2\tau})\pi' + k_{1\tau}\,p_\tau(\pi+\pi') + k_{2\tau}(1-p_\tau)(\pi+\pi')\Big\}\,d\tau$$
and the expectation is over the stochastic processes $k_t$ and $p_t$. This shows that we can take
the belief to be our state variable. Thus we have a dynamic programming problem with
the current belief p (from now on we will do away with the time subscript) as the state
variable. Since the evolution of beliefs depends on k only, the planner’s problem reduces
to choosing the action profile $k = (k_1, k_2)$, given the current belief p.
³ It is easy to observe that the social planner will never allocate 1 to S2 and 2 to S1.
Let v(p) be the value function of the planner. By the principle of optimality it should
satisfy
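$$\begin{aligned}
v(p) = \max_{k_1,k_2\in\{0,1\};\,k_1+k_2\le 1}\Big\{&(1-k_1-k_2)\pi'\,dt + (\pi+\pi')\big(k_1 p + k_2(1-p)\big)\,dt\\
&+(1-r\,dt)\big[1 - k_1 p(\pi+\pi')\,dt - k_2(1-p)(\pi+\pi')\,dt - (1-k_1-k_2)\pi'\,dt\big]\,v(p+dp)\Big\}
\end{aligned}$$
where $(1 - r\,dt)$ is an approximation of the discount factor $e^{-r\,dt}$. Substituting $v(p+dp) =
v(p) + v'(p)\,dp$ and $dp = -(k_1-k_2)\,p(1-p)(\pi+\pi')\,dt$ (the belief falls when both firms are at
S1 and rises when both are at S2, as in the law of motion above), we get
$$\begin{aligned}
v(p) = \max_{k_1,k_2\in\{0,1\};\,k_1+k_2\le 1}\Big\{&(1-k_1-k_2)\pi'\,dt + (\pi+\pi')\big(k_1 p + k_2(1-p)\big)\,dt\\
&+(1-r\,dt)\big[1 - k_1 p(\pi+\pi')\,dt - k_2(1-p)(\pi+\pi')\,dt - (1-k_1-k_2)\pi'\,dt\big]\\
&\quad\times\big[v(p) - v'(p)(k_1-k_2)\,p(1-p)(\pi+\pi')\,dt\big]\Big\}
\end{aligned}$$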
After simplifying and rearranging the above, we obtain the following Bellman equation
$$\begin{aligned}
rv = \max_{k_1,k_2\in\{0,1\};\,k_1+k_2\le 1}\Big\{&(1-k_1-k_2)\big[\pi'(1-v)\big] + k_1\big[(\pi+\pi')\,p\,(1-v-(1-p)v')\big]\\
&+ k_2\big[(\pi+\pi')(1-p)(1-v+pv')\big]\Big\}
\end{aligned}\tag{1.1}$$
Proposition 1 There exists a solution to the planner’s problem in which both the firms
are allocated to S1 (S2) if the belief p is strictly greater (lower) than a threshold $p^*_1$ ($p^*_2$). If
the belief is in the range $[p^*_2, p^*_1]$, then firm 1 is allocated to S1 and firm 2 is allocated to S2.
$p^*_1$ and $p^*_2$ satisfy
$$0 < p^*_2 = \frac{\pi}{\pi+\pi'} < \frac{1}{2} < p^*_1 = \frac{\pi'}{\pi+\pi'} < 1.$$
Note that $p^*_1 = 1 - p^*_2$.
Proof.
We prove this through following two lemmas:
Lemma 1 If the planner’s solution is assumed to be of the threshold type, i.e. if there exist
threshold probabilities $p^*_2$ and $p^*_1$ such that $0 < p^*_2 < p^*_1 < 1$ and both firms are allocated
to site S1 (S2) for $p \in (p^*_1, 1]$ ($[0, p^*_2)$) and firm 1 (2) to S1 (S2) for $p \in [p^*_2, p^*_1]$, then
$p^*_1 = \frac{\pi'}{\pi+\pi'} = 1 - p^*_2$.
Proof of Lemma. Suppose the planner’s solution is of the threshold type as described
by the above lemma. If $p \in (p^*_1, 1]$, from (1.1) we can infer that v(p) satisfies:
$$v' + \frac{r + (\pi+\pi')p}{p(1-p)(\pi+\pi')}\,v = \frac{1}{1-p}$$
This is a first order linear O.D.E. Solving it (see appendix (A.1) for a detailed
analysis) we obtain:
$$v = \frac{\pi+\pi'}{r+\pi+\pi'}\,p + C_1(1-p)\,[\Lambda(p)]^{\frac{r}{\pi+\pi'}} \tag{1.2}$$
where $C_1$ is an integration constant and $\Lambda(p) = \frac{1-p}{p}$.
Similarly if $p \in [0, p^*_2)$, v(p) satisfies the following O.D.E:
$$v' - \frac{r + (1-p)(\pi+\pi')}{p(1-p)(\pi+\pi')}\,v = -\frac{1}{p}$$
Solving the above first order O.D.E as before, we have
$$v = \frac{\pi+\pi'}{r+\pi+\pi'}\,(1-p) + C_2\,p\,[\Gamma(p)]^{\frac{r}{\pi+\pi'}} \tag{1.3}$$
where $C_2$ is an integration constant and $\Gamma(p) = \frac{p}{1-p}$. Finally if $p \in [p^*_2, p^*_1]$, then v satisfies
$$rv = (1-v)\pi' \;\Rightarrow\; v = \frac{\pi'}{r+\pi'}$$
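Hence the value function is given by:
$$v(p) = \begin{cases}
\dfrac{\pi+\pi'}{r+\pi+\pi'}\,p + C_1(1-p)\,[\Lambda(p)]^{\frac{r}{\pi+\pi'}} & \text{if } p \in (p^*_1, 1],\\[2ex]
\dfrac{\pi+\pi'}{r+\pi+\pi'}\,(1-p) + C_2\,p\,[\Gamma(p)]^{\frac{r}{\pi+\pi'}} & \text{if } p \in [0, p^*_2),\\[2ex]
\dfrac{\pi'}{r+\pi'} & \text{if } p \in [p^*_2, p^*_1].
\end{cases}\tag{1.4}$$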
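If $p^*_1$ and $p^*_2$ are optimally chosen, then the smooth pasting and value matching conditions
should be satisfied at $p^*_1$ and $p^*_2$. Invoking them we derive $C_1$, $C_2$, $p^*_2$ and $p^*_1$. At $p^*_1$,
the value matching condition implies:
$$\frac{\pi+\pi'}{r+\pi+\pi'}\,p^*_1 + C_1(1-p^*_1)[\Lambda(p^*_1)]^{\frac{r}{\pi+\pi'}} = v(p^*_1) = \frac{\pi'}{r+\pi'}
\;\Rightarrow\;
C_1 = \frac{\dfrac{\pi'}{r+\pi'} - \dfrac{\pi+\pi'}{r+\pi+\pi'}\,p^*_1}{(1-p^*_1)[\Lambda(p^*_1)]^{\frac{r}{\pi+\pi'}}} \tag{1.5}$$
and the smooth pasting condition implies,
$$v'(p^{*+}_1) = 0 \;\Rightarrow\; \frac{\pi+\pi'}{r+\pi+\pi'} = C_1\left[[\Lambda(p^*_1)]^{\frac{r}{\pi+\pi'}} + \frac{r}{\pi+\pi'}\,[\Lambda(p^*_1)]^{\frac{r}{\pi+\pi'}}\,\frac{1}{p^*_1}\right]$$
Substituting the value of $C_1$ from (1.5) we obtain $p^*_1 = \frac{\pi'}{\pi+\pi'}$.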
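Since $\pi' > \pi$, $p^*_1 > \frac{1}{2}$. Similarly, we obtain $C_2$ and $p^*_2$ as
$$C_2 = \frac{\dfrac{\pi'}{r+\pi'} - \dfrac{\pi+\pi'}{r+\pi+\pi'}\,(1-p^*_2)}{p^*_2\,[\Gamma(p^*_2)]^{\frac{r}{\pi+\pi'}}}\,; \qquad p^*_2 = \frac{\pi}{\pi+\pi'} \tag{1.6}$$
It is easy to see that $p^*_2 < \frac{1}{2}$ as $\pi' > \pi$.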
Lemma 2 The v obtained in (1.4) with $p^*_1 = \frac{\pi'}{\pi+\pi'} = 1 - p^*_2$, and the corresponding policy
k, satisfy (1.1).
Proof of Lemma. We need to show that $k_1 = 1$ for $p \in (p^*_1, 1]$, $k_2 = 1$ for $p \in [0, p^*_2)$ and
$k_1 = k_2 = 0$ for $p \in [p^*_2, p^*_1]$ are optimal choices for the planner. Let
$$B(p,v) = \pi'(1-v);\quad B_1(p,v) = (\pi+\pi')\,p\,(1-v-(1-p)v');\quad B_2(p,v) = (\pi+\pi')(1-p)(1-v+pv').$$
Then (1.1) is equivalent to
$$rv = \max_{k_1,k_2\in\{0,1\};\,k_1+k_2\le 1}\big\{(1-k_1-k_2)B(p,v) + k_1B_1(p,v) + k_2B_2(p,v)\big\}$$
To show that v (and the corresponding k) satisfies the Bellman equation, we need to verify
that the following hold:
$$\begin{aligned}
B(p,v) &= \max\{B(p,v), B_1(p,v), B_2(p,v)\} &&\text{for } p \in [p^*_2, p^*_1],\\
B_1(p,v) &= \max\{B(p,v), B_1(p,v), B_2(p,v)\} &&\text{for } p \in (p^*_1, 1],\\
B_2(p,v) &= \max\{B(p,v), B_1(p,v), B_2(p,v)\} &&\text{for } p \in [0, p^*_2).
\end{aligned}$$
First, consider the interval $[p^*_2, p^*_1]$. According to (1.4), $v = \frac{\pi'}{r+\pi'}$ in this region. Thus
$v' = 0$. This implies that
$$B(p,v) = \pi'(1-v);\quad B_1(p,v) = (\pi+\pi')\,p\,(1-v);\quad B_2(p,v) = (\pi+\pi')(1-p)(1-v).$$
The conditions $B(p,v) \ge B_1(p,v)$ and $B(p,v) \ge B_2(p,v)$ hold simultaneously when
$p \le \frac{\pi'}{\pi+\pi'}$ and $p \ge \frac{\pi}{\pi+\pi'}$. Hence for $p \in [p^*_2, p^*_1]$,
$$B(p,v) = \max\{B(p,v), B_1(p,v), B_2(p,v)\}$$
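Next, consider the region $(p^*_1, 1]$. Here $v'$ is given by
$$v' = \frac{\pi+\pi'}{r+\pi+\pi'} - C_1[\Lambda(p)]^{\frac{r}{\pi+\pi'}}\left[1 + \frac{r}{\pi+\pi'}\cdot\frac{1}{p}\right]$$
$$\Rightarrow\; (1-p)v' = \frac{\pi+\pi'}{r+\pi+\pi'} - \left[p\,\frac{\pi+\pi'}{r+\pi+\pi'} + C_1(1-p)[\Lambda(p)]^{\frac{r}{\pi+\pi'}}\right] - C_1\,\frac{1-p}{p}\,[\Lambda(p)]^{\frac{r}{\pi+\pi'}}\,\frac{r}{\pi+\pi'}$$
$$\Rightarrow\; 1 - v - (1-p)v' = \frac{r}{\pi+\pi'+r} + \frac{1-p}{p}\,C_1[\Lambda(p)]^{\frac{r}{\pi+\pi'}}\,\frac{r}{\pi+\pi'}$$
Substituting the above in the expression of $B_1(p,v)$, we get $B_1(p,v) = rv$.
Further, from the expression of $v'(p)$ we obtain
$$pv' = \frac{\pi+\pi'}{r+\pi+\pi'}\,p - p\,C_1[\Lambda(p)]^{\frac{r}{\pi+\pi'}} - C_1\,\frac{r}{\pi+\pi'}\,[\Lambda(p)]^{\frac{r}{\pi+\pi'}} = v - C_1[\Lambda(p)]^{\frac{r}{\pi+\pi'}}\cdot\frac{r+\pi+\pi'}{\pi+\pi'}$$
$$\Rightarrow\; 1 - v + pv' = 1 - C_1[\Lambda(p)]^{\frac{r}{\pi+\pi'}}\cdot\frac{r+\pi+\pi'}{\pi+\pi'}$$
Substituting this in the expression of $B_2(p,v)$, we get $B_2(p,v) = (\pi+\pi') - (r+\pi+\pi')v$. Thus
to have $B_1(p,v) \ge B(p,v)$ and $B_1(p,v) \ge B_2(p,v)$, we require $v \ge \frac{\pi'}{r+\pi'}$ and
$v \ge \frac{\pi+\pi'}{2r+\pi+\pi'}$ respectively. Since
$$\frac{\pi'}{r+\pi'} - \frac{\pi+\pi'}{2r+\pi+\pi'} = \frac{r(\pi'-\pi)}{(r+\pi')(2r+\pi+\pi')} > 0$$
and $v > \frac{\pi'}{r+\pi'}$ for $p \in (p^*_1, 1]$, we have
$$B_1(p,v) = \max\{B(p,v), B_1(p,v), B_2(p,v)\}$$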
Similarly we can show that for the region p ∈ [0, p∗2), B2(p, v) = max{B(p, v), B1(p, v), B2(p, v)}.
This shows that the value function and the corresponding policy k satisfies (1.1).
The proof of proposition (1) now follows directly from lemma (1) and (2).
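The closed-form solution can be sanity-checked numerically. The sketch below is an illustration rather than part of the proof: the function name `planner_solution` and the parameter values are assumptions, and the formulas are the threshold expressions of Proposition 1 together with the value function (1.4) and the constants from (1.5) and (1.6).

```python
def planner_solution(pi_weak, pi_strong, r):
    """Return (p2_star, p1_star, v) for the planner's threshold policy.

    pi_weak / pi_strong play the roles of pi and pi' in the text (pi' > pi).
    """
    lam = pi_weak + pi_strong            # combined intensity pi + pi'
    a = r / lam                          # exponent r / (pi + pi')
    A = lam / (r + lam)                  # slope of the particular solution
    v_div = pi_strong / (r + pi_strong)  # value under diversification
    p1_star = pi_strong / lam
    p2_star = pi_weak / lam

    def Lam(p):
        return (1.0 - p) / p

    def Gam(p):
        return p / (1.0 - p)

    # Integration constants from value matching at the two thresholds.
    C1 = (v_div - A * p1_star) / ((1.0 - p1_star) * Lam(p1_star) ** a)
    C2 = (v_div - A * (1.0 - p2_star)) / (p2_star * Gam(p2_star) ** a)

    def v(p):
        if p > p1_star:                  # both firms at S1
            return A * p + C1 * (1.0 - p) * Lam(p) ** a
        if p < p2_star:                  # both firms at S2
            return A * (1.0 - p) + C2 * p * Gam(p) ** a
        return v_div                     # firm 1 at S1, firm 2 at S2

    return p2_star, p1_star, v

# Hypothetical parameter values (assumptions for illustration only).
p2_star, p1_star, v = planner_solution(pi_weak=0.3, pi_strong=0.8, r=0.1)
h = 1e-6
slope_right_of_p1 = (v(p1_star + h) - v(p1_star)) / h  # smooth pasting: close to 0
```

The one-sided difference quotient at the upper threshold is close to zero, as the smooth pasting condition requires, and the value function is symmetric around 1/2.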
We conclude this subsection by making an observation. It follows that the length
of the interval $[p^*_2, p^*_1]$ is given by $\frac{\pi'-\pi}{\pi+\pi'}$. Hence the range of beliefs over which there is
diversification of research is increasing in the difference of abilities of firms at a particular
site. If $\pi = \pi'$, then the range shrinks to the point $\frac{1}{2}$. This implies that with homogeneous
firms, diversification in research takes place at the belief $\frac{1}{2}$ only.
1.1.3 The non-cooperative game
The extensive form
Player i = 1, 2 chooses actions $k_{i,t} \in \{(1,0), (0,1)\}$ such that $k_{i,t}$ is measurable with
respect to the information available at time t. $k_{i,t} = (1,0)$ ($(0,1)$) indicates that firm i is
going to S1 (S2). It is evident that as soon as there is a discovery by a particular firm the
game ends. We consider a winner-takes-all structure, so that the entire rent from discovery
accrues to the firm that makes the discovery first.
Throughout our analysis of the non-cooperative game, we will restrict our attention
to Markovian strategies with the common belief p as the state variable. We define a (Markovian)
strategy of player i (i = 1, 2) to be the mapping $k_i : [0,1] \to \{(1,0), (0,1)\}$ (i.e.
from states $p_t$ to $k_{it}$). We allow only those $k_i(\cdot)$ functions which satisfy the property
that $k_i^{-1}[(1,0)]$ and $k_i^{-1}[(0,1)]$ are disjoint unions of a finite number of non-degenerate
sub-intervals in [0, 1]. Also $k_i(0) = (0,1)$ and $k_i(1) = (1,0)$. This ensures that player i
chooses the dominant action under subjective certainty. Given this we can also visualise
the strategies of the players as follows. A firm, given the current belief, chooses a site and
a posterior at which it is going to switch to the other site. Player 1’s Markov strategy is
called a threshold type strategy if $k_1^{-1}(1,0) = [p_1, 1]$ or $(p_1, 1]$. Similarly player 2’s Markov
strategy is a threshold type strategy if $k_2^{-1}(0,1) = [0, p_2]$ or $[0, p_2)$.
It should be noted that strictly speaking, the domain of a Markovian strategy of a
particular firm should not only depend on the belief p, but also on the location of its com-
petitor. Appendix (A.4) illustrates that the results obtained by restricting the strategies
of players as function of beliefs remain valid when strategy depends on both the belief and
the location of the competitor.
Assumption 2 We assume that k1 is right continuous and k2 is left continuous. This
guarantees the existence of a well-defined solution to the law of motion for posterior beliefs.
Equilibrium
We aim to characterise the markov-perfect equilibria of the non-cooperative game. A
markov-perfect equilibrium is a pair of strategies (k1, k2), such that the strategy of player
i maximises his expected discounted payoff, conditional on the strategy of player j and
vice-versa.
First we focus on equilibria in which diversification in research takes place (i.e. there
exists a range of beliefs over which firms choose different sites) and the strategies of players
are of the threshold type. In the present set-up threshold type Markov strategies are said
to be symmetric if
$$k_1 = \begin{cases}(1,0) & \text{if } p \in [\underline{p}, 1],\\ (0,1) & \text{if } p \in [0, \underline{p}),\end{cases}\tag{1.7}$$
and
$$k_2 = \begin{cases}(0,1) & \text{if } p \in [0, \bar{p}],\\ (1,0) & \text{if } p \in (\bar{p}, 1],\end{cases}\tag{1.8}$$
such that $\underline{p} < \bar{p}$ and $\bar{p} = 1 - \underline{p}$.
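Let $v_1(p)$ and $v_2(p)$ be the value functions (equilibrium payoffs) of firm 1 and 2 respectively,
from an equilibrium strategy profile $(k_1, k_2)$. Write $k_i = (k_i^1, k_i^2)$, so that $k_i^j = 1$
when firm i is at site $S_j$. Then given $k_2$, $v_1$ and $k_1$ should satisfy,
$$\begin{aligned}
v_1(p) = \max_{k_1\in\{(1,0),(0,1)\}}\Big\{&\big(k_1^1k_2^2\,p\pi'\,dt + k_1^1k_2^1\,p\pi'\,dt + k_1^2k_2^2\,(1-p)\pi\,dt + k_1^2k_2^1\,(1-p)\pi\,dt\big)\\
&+(1-r\,dt)\big(1 - k_1^1k_2^2\,\pi'\,dt - k_1^1k_2^1\,p(\pi+\pi')\,dt - k_1^2k_2^2\,(1-p)(\pi+\pi')\,dt - k_1^2k_2^1\,\pi\,dt\big)\\
&\quad\times\big(v_1(p) - (k_1^1k_2^1 - k_1^2k_2^2)\,p(1-p)(\pi+\pi')\,v_1'(\cdot)\,dt\big)\Big\}
\end{aligned}\tag{1.9}$$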
Similarly, given k1, v2 and k2 should satisfy
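$$\begin{aligned}
v_2(p) = \max_{k_2\in\{(1,0),(0,1)\}}\Big\{&\big(k_2^2k_1^1\,(1-p)\pi'\,dt + k_1^1k_2^1\,p\pi\,dt + k_1^2k_2^2\,(1-p)\pi'\,dt + k_1^2k_2^1\,p\pi\,dt\big)\\
&+(1-r\,dt)\big(1 - k_1^1k_2^2\,\pi'\,dt - k_1^1k_2^1\,p(\pi+\pi')\,dt - k_1^2k_2^2\,(1-p)(\pi+\pi')\,dt - k_1^2k_2^1\,\pi\,dt\big)\\
&\quad\times\big(v_2(p) - (k_1^1k_2^1 - k_1^2k_2^2)\,p(1-p)(\pi+\pi')\,v_2'(\cdot)\,dt\big)\Big\}
\end{aligned}\tag{1.10}$$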
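Expanding and rearranging (1.9) and (1.10) (after ignoring the terms of the order o(dt))
we get the following Bellman equations, which the equilibrium payoffs should satisfy
$$\begin{aligned}
rv_1 = \max_{k_1\in\{(1,0),(0,1)\}}\Big\{&k_1^1k_2^2\big[\pi'(p-v_1)\big] + k_1^1k_2^1\Big[(\pi+\pi')\,p\,\Big(\frac{\pi'}{\pi+\pi'} - v_1 - (1-p)v_1'\Big)\Big]\\
&+ k_1^2k_2^2\Big[(\pi+\pi')(1-p)\Big(\frac{\pi}{\pi+\pi'} - v_1 + pv_1'\Big)\Big] + k_1^2k_2^1\big[\pi((1-p)-v_1)\big]\Big\}
\end{aligned}\tag{1.11}$$
$$\begin{aligned}
rv_2 = \max_{k_2\in\{(1,0),(0,1)\}}\Big\{&k_2^2k_1^1\big[\pi'((1-p)-v_2)\big] + k_2^2k_1^2\Big[(\pi+\pi')(1-p)\Big(\frac{\pi'}{\pi+\pi'} - v_2 + pv_2'\Big)\Big]\\
&+ k_2^1k_1^1\Big[(\pi+\pi')\,p\,\Big(\frac{\pi}{\pi+\pi'} - v_2 - (1-p)v_2'\Big)\Big] + k_2^1k_1^2\big[\pi(p-v_2)\big]\Big\}
\end{aligned}\tag{1.12}$$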
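If $(k_1, k_2)$ is a threshold type Markovian strategy profile and is symmetric in the way
described above, then players’ payoffs induced by this strategy profile are given by
$$v_1(p) = \begin{cases}
\dfrac{\pi'}{r+\pi+\pi'}\,p + C^1_{11}(1-p)\,[\Lambda(p)]^{\frac{r}{\pi+\pi'}} & \text{if } p \in (\bar{p}, 1],\\[2ex]
\dfrac{\pi'}{r+\pi'}\,p & \text{if } p \in [\underline{p}, \bar{p}],\\[2ex]
\dfrac{\pi}{r+\pi+\pi'}\,(1-p) + C^1_{22}\,p\,[\Gamma(p)]^{\frac{r}{\pi+\pi'}} & \text{if } p \in [0, \underline{p}),
\end{cases}\tag{1.13}$$
and
$$v_2(p) = \begin{cases}
\dfrac{\pi}{r+\pi+\pi'}\,p + C^2_{11}(1-p)\,[\Lambda(p)]^{\frac{r}{\pi+\pi'}} & \text{if } p \in (\bar{p}, 1],\\[2ex]
\dfrac{\pi'}{r+\pi'}\,(1-p) & \text{if } p \in [\underline{p}, \bar{p}],\\[2ex]
\dfrac{\pi'}{r+\pi+\pi'}\,(1-p) + C^2_{22}\,p\,[\Gamma(p)]^{\frac{r}{\pi+\pi'}} & \text{if } p \in [0, \underline{p}),
\end{cases}\tag{1.14}$$
where $C^i_{11}$, $C^i_{22}$ (i = 1, 2) are integration constants.
If the strategy profile $(k_1, k_2)$ constitutes a (Markovian) equilibrium, then given $k_2$,
(1.13) along with $k_1$ should satisfy (1.11) and given $k_1$, (1.14) along with $k_2$ should
satisfy (1.12).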
The following proposition states that we cannot obtain the efficient outcome as an
equilibrium outcome of the non-cooperative game.
Proposition 2 There does not exist an efficient equilibrium.
Proof. The strategy profile $(k^*_1, k^*_2)$ which implements the efficient outcome is the one
which satisfies (1.7) and (1.8) with $\underline{p} = p^*_2$ and $\bar{p} = p^*_1$. If $(k^*_1, k^*_2)$ constitutes an equilibrium
then $k^*_1$ should constitute a best response to $k^*_2$ for all $p \in [0, 1]$. The payoffs induced by
this strategy profile will be given by (1.13) and (1.14) with $\underline{p} = p^*_2$ and $\bar{p} = p^*_1$.
If $k^*_1$ is a best response to $k^*_2$, then given $k^*_2$, $v_1$ along with $k^*_1$ should satisfy (1.11).
Consider the region $[p^*_2, p^*_1]$ first. The payoffs induced by the strategy profile $(k^*_1, k^*_2)$ imply
that for $p \in [p^*_2, p^*_1]$, $v_1 = \frac{\pi'}{r+\pi'}\,p$. From (1.11), we know that for $k^*_1$ to be a best
response to $k^*_2$, we require
$$\pi'(p - v_1) \ge (\pi+\pi')(1-p)\Big(\frac{\pi}{\pi+\pi'} - v_1 + pv_1'\Big)
\;\Rightarrow\; p \ge \frac{\pi(r+\pi')}{r\pi' + \pi(r+\pi')}$$
However,
$$\frac{\pi(r+\pi')}{r\pi' + \pi(r+\pi')} - p^*_2 = \frac{\pi(r+\pi')}{r\pi' + \pi(r+\pi')} - \frac{\pi}{\pi+\pi'}
= \frac{\pi\,\pi'^2}{[r\pi' + \pi(r+\pi')](\pi+\pi')} > 0$$
Hence for $p \in \big[p^*_2, \frac{\pi(r+\pi')}{r\pi'+\pi(r+\pi')}\big)$,
$\pi'(p - v_1) < (\pi+\pi')(1-p)\big(\frac{\pi}{\pi+\pi'} - v_1 + pv_1'\big)$. Thus $k^*_1$
does not constitute a best response to $k^*_2$. This shows that there does not exist an efficient
equilibrium.
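The profitable deviation used in this proof can be illustrated numerically. The following Python sketch is an illustration under assumed parameter values; the function name `firm1_terms` is hypothetical. It evaluates the two relevant terms of (1.11) for firm 1 when firm 2 is at S2 and firm 1's continuation payoff is the diversification payoff $v_1 = \frac{\pi'}{r+\pi'}p$.

```python
def firm1_terms(p, pi_weak, pi_strong, r):
    """Firm 1's terms in (1.11) when firm 2 is at S2 and v1 = pi' p / (r + pi').

    Returns (stay_s1, move_s2): the term from staying at S1 (diversification)
    and the term from joining firm 2 at S2.
    """
    lam = pi_weak + pi_strong
    v1 = pi_strong * p / (r + pi_strong)
    dv1 = pi_strong / (r + pi_strong)   # slope of v1 on the diversification range
    stay_s1 = pi_strong * (p - v1)
    move_s2 = lam * (1.0 - p) * (pi_weak / lam - v1 + p * dv1)
    return stay_s1, move_s2

# Hypothetical parameter values (assumptions for illustration only).
pi_weak, pi_strong, r = 0.1, 0.5, 0.2
p2_star = pi_weak / (pi_weak + pi_strong)                    # planner's lower threshold
cutoff = pi_weak * (r + pi_strong) / (r * pi_strong + pi_weak * (r + pi_strong))

stay_at_p2, move_at_p2 = firm1_terms(p2_star, pi_weak, pi_strong, r)
stay_at_cut, move_at_cut = firm1_terms(cutoff, pi_weak, pi_strong, r)
```

At the planner's threshold the deviation term exceeds the diversification term, while the two coincide at the cutoff derived in the proof.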
Inefficient equilibrium with diversification (symmetric):
The efficient strategy profile involves diversification and is symmetric (in the manner
described above). Since we have shown that there does not exist an efficient equilibrium,
it is of natural interest to look for outcomes (with diversification in research) that can be
obtained in a symmetric Markovian equilibrium of the non-cooperative game. It turns out
that for certain parametric conditions we can obtain a unique equilibrium outcome (in
threshold type Markovian strategies). The following proposition describes this.
Proposition 3 If $r(\pi'-\pi) - \pi\pi' > 0$, then the unique Markovian equilibrium in threshold
type strategies is symmetric and is constituted by the strategy profile $(k^N_1, k^N_2)$ such that it
satisfies (1.7) and (1.8) with
$$\underline{p} = p^{*N}_2 = \frac{\pi(r+\pi')}{r\pi' + \pi(r+\pi')} \quad\text{and}\quad \bar{p} = p^{*N}_1 = \frac{r\pi'}{r\pi' + \pi(r+\pi')} = 1 - p^{*N}_2$$
Proof. We prove this proposition with the help of following lemmas:
Lemma 3 If firm 2 goes to S2 for $p \in [0, \bar{p}]$ and to S1 for $p \in (\bar{p}, 1]$ with $\frac{1}{2} \le \bar{p} < p^*_1$,
then there exists a $p^{*N}_2$ satisfying $0 < p^{*N}_2 < \frac{1}{2} \le \bar{p}$, such that for firm 1, going to S2 for
$p \in [0, p^{*N}_2)$ and to S1 for $p \in [p^{*N}_2, 1]$ constitutes a best response to firm 2’s strategy.
Proof of Lemma. By hypothesis, we have $k_2 = (0,1)$ for $p \in [0, \bar{p}]$ and $k_2 = (1,0)$ for
$p \in (\bar{p}, 1]$, such that $\bar{p} \ge \frac{1}{2}$. We know that given this, for p = 0 it is optimal for 1 to choose
$k_1 = (0,1)$ (that is, to conduct research at S2). We now need to find the point where 1 will
find it optimal to switch to S1, given $k_2$. Hence we will be solving the optimal stopping
problem of player 1 in the region $[0, \bar{p}]$.
Let $p^{*N}_2$ be the switching point for 1. First, we assume that $p^{*N}_2 < \frac{1}{2}$. Then this will
induce a payoff function for 1 which satisfies (1.13) with $\underline{p} = p^{*N}_2$. Since $v_1$ thus obtained
is a continuous function, at the switching point we shall have,
$$\pi'\{p^{*N}_2 - v_1\} = (\pi+\pi')(1-p^{*N}_2)\Big\{\frac{\pi}{\pi+\pi'} - v_1 + p^{*N}_2\,v_1'\Big\}$$
Since, given $k_2$, p can change in one direction only, $v_1' = \frac{\pi'}{r+\pi'}$. This implies
$$\pi'\,\frac{r}{r+\pi'}\,p^{*N}_2 = (1-p^{*N}_2)\,\pi
\;\Rightarrow\; p^{*N}_2 = \frac{\pi(r+\pi')}{r\pi' + \pi(r+\pi')}$$
Since $r(\pi'-\pi) - \pi\pi' > 0$, $p^{*N}_2 < \frac{1}{2}$. This is consistent with the assumption that $p^{*N}_2 < \frac{1}{2}$.
This shows that $k_1 = (1,0)$ is an optimal response to $k_2$ for $p \in [p^{*N}_2, \bar{p}]$.
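Next, consider the region $(\bar{p}, 1]$. For $k_1$ to constitute a best response to $k_2$ we must
have,
$$(\pi+\pi')\,p\,\Big(\frac{\pi'}{\pi+\pi'} - v_1 - v_1'(1-p)\Big) \ge \pi((1-p) - v_1)$$
In this region, $v_1'$ is given by
$$v_1' = \frac{\pi'}{r+\pi+\pi'} - C^1_{11}\,[\Lambda(p)]^{\frac{r}{\pi+\pi'}}\Big[1 + \frac{r}{\pi+\pi'}\cdot\frac{1}{p}\Big] \tag{1.15}$$
This implies,
$$(\pi+\pi')\,p\,\Big(\frac{\pi'}{\pi+\pi'} - v_1 - v_1'(1-p)\Big) = rv_1$$
Thus we require
$$rv_1 \ge \pi((1-p) - v_1) \;\Rightarrow\; v_1 \ge \frac{\pi}{r+\pi}\,(1-p)$$
From the value matching condition we know that $v_1(\bar{p}) = \frac{\pi'}{r+\pi'}\,\bar{p}$, which exceeds
$\frac{\pi}{r+\pi}(1-\bar{p})$ because $\pi' > \pi$ and $\bar{p} \ge \frac{1}{2}$. Since $\frac{1}{2} \le \bar{p} < p^*_1$, from the
switching derivative lemma (refer to appendix A.2) we know that $v_1' > 0$ for all $p \in [\bar{p}, 1]$.
Hence we must have $v_1 > \frac{\pi}{r+\pi}(1-p)$ for all $p \in (\bar{p}, 1]$. This implies that $k_1 = (1,0)$ is an
optimal response to $k_2$ for $p \in (\bar{p}, 1]$.
Thus we have shown that $k_1$ constitutes a best response to $k_2$ for all $p \in [0, 1]$. This
concludes the proof of the lemma.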
Lemma 4 If firm 1 goes to S1 for $p \in [\underline{p}, 1]$ and to S2 for $p \in [0, \underline{p})$ with $p^*_2 < \underline{p} \le \frac{1}{2}$, then
there exists a $p^{*N}_1$ satisfying $\frac{1}{2} < p^{*N}_1 < 1$, such that for firm 2, going to S2 for $p \in [0, p^{*N}_1]$
and to S1 for $p \in (p^{*N}_1, 1]$ constitutes a best response to firm 1’s strategy.
Proof of Lemma. We have $k_1 = (1,0)$ for $p \in [\underline{p}, 1]$ and $k_1 = (0,1)$ for $p \in [0, \underline{p})$ such
that $\underline{p} \le \frac{1}{2}$. Given this, at p = 1, firm 2 finds it optimal to choose $k_2 = (1,0)$ (that is, to conduct
research at S1). As before, we intend to find the point where 2 will switch to S2 (the
optimal stopping problem of player 2 in the region $[\underline{p}, 1]$).
Let $p^{*N}_1$ be the switching point of 2. Assuming $p^{*N}_1 > \frac{1}{2}$, this will induce a payoff
function for 2 which satisfies (1.14) with $\bar{p} = p^{*N}_1$. As in lemma (3), at $p = p^{*N}_1$ we shall
have
$$\pi'((1-p^{*N}_1) - v_2) = (\pi+\pi')\,p^{*N}_1\Big(\frac{\pi}{\pi+\pi'} - v_2 - v_2'(1-p^{*N}_1)\Big)$$
Since, given $k_1$, p can change in one direction only, $v_2' = -\frac{\pi'}{r+\pi'}$. This implies
$$\pi'(1-p^{*N}_1)\,\frac{r}{r+\pi'} = p^{*N}_1\,\pi
\;\Rightarrow\; p^{*N}_1 = \frac{r\pi'}{r\pi' + \pi(r+\pi')}$$
Since $r(\pi'-\pi) - \pi\pi' > 0$, $p^{*N}_1 > \frac{1}{2}$. This is consistent with our assumption that $p^{*N}_1 > \frac{1}{2}$.
This shows that $k_2 = (0,1)$ is an optimal response to $k_1$ for $p \in [\underline{p}, p^{*N}_1]$. Similar to the proof
of lemma (3) we can also show that $k_2 = (0,1)$ is a best response to $k_1$ for $p \in [0, \underline{p})$. Hence
we have demonstrated that $k_2$ is a best response to $k_1$ for all $p \in [0, 1]$. This concludes the
proof of the lemma.
Let $(k^N_1, k^N_2)$ be the strategy profile such that it satisfies (1.7) and (1.8) with $\underline{p} =
p^{*N}_2$ and $\bar{p} = p^{*N}_1$. The payoff functions induced by this strategy profile satisfy (1.13) and
(1.14) with $\underline{p} = p^{*N}_2$ and $\bar{p} = p^{*N}_1$.
Appendix (A.3.1) describes the value of the integration constants obtained by imposing
the value matching condition at p∗N1 and p∗N2 . Also it shows that v1 and v2 satisfy the
smooth pasting condition at p∗N2 and p∗N1 respectively, conditional on the other player’s
strategy.
The proof of the proposition now follows directly from lemma (3) and (4) and the fact
that p∗N2 and p∗N1 constitute the unique switching points for 1 and 2 respectively.
Proposition (3) characterises the non-cooperative equilibrium when $r(\pi'-\pi) - \pi'\pi > 0$.
It is to be observed that $p^{*N}_2 > p^*_2$ and $p^{*N}_1 < p^*_1$. The inefficiency of the non-cooperative
equilibrium stems from the fact that in the intervals $[p^*_2, p^{*N}_2)$ and $(p^{*N}_1, p^*_1]$, firms conduct
research at the same site when efficiency requires them to conduct research at different
sites. Hence there exist ranges of beliefs, such that if the state lies in one such range, there
is too much specialisation along a line of research, when efficiency requires diversification.
Given π and r with r > π, the condition $r(\pi'-\pi) - \pi'\pi > 0$ puts a lower bound on the
value of π′ (namely $\pi' > \frac{r\pi}{r-\pi}$). This condition can be intuitively explained as follows. Suppose there exists a
range of beliefs such that firm 1 conducts research at S1 and firm 2 at S2. Over the range
of diversification, the payoffs to firm 1 and 2 are $\frac{\pi'}{r+\pi'}\,p$ and $\frac{\pi'}{r+\pi'}\,(1-p)$ respectively. If
firm 1 unilaterally deviates and goes to S2, then the belief will be updated upwards and
if firm 2 unilaterally deviates and goes to S1, the belief will be updated downward. This
implies that if firm 1 unilaterally deviates and goes to S2 over the time interval dt, then
conditional on no arrival, the expected discounted future payoff will be higher for firm 1.
Similarly, if firm 2 deviates and goes to S1 over the time interval dt, then conditional on
no discovery its expected discounted future payoff increases.
Thus staying at different sites (as described above) is incentive compatible from the firms’
point of view if at each p, given the other firm’s action, the instantaneous payoff to a firm from
diversification is no less than that from specialisation. This is a necessary condition. Consider a p in
such a range. Given that firm 2 is at S2, firm 1 knows that the expected discounted payoff
from diversification is $\frac{\pi'}{r+\pi'}\,p$. The instantaneous payoff is $\frac{\pi'}{r+\pi'}\,p\,r\,dt$. The instantaneous
payoff from specialisation is $\pi(1-p)\,dt$. Thus it is optimal for firm 1 to go for diversification
if
$$\frac{\pi'}{r+\pi'}\,p\,r\,dt \ge \pi(1-p)\,dt \tag{1.16}$$
Similarly, given firm 1 is at S1, firm 2 finds it optimal to go for diversification if
$$\frac{\pi'}{r+\pi'}\,(1-p)\,r\,dt \ge \pi p\,dt \tag{1.17}$$
Thus to have a range of beliefs over which diversification will take place in a non-cooperative
equilibrium, (1.16) and (1.17) should hold together, with strict inequality for
at least one p if the range does not consist of one point only. Adding the two inequalities, this implies
$$\frac{\pi'}{r+\pi'}\,r > \pi \;\Rightarrow\; r(\pi'-\pi) - \pi'\pi > 0$$
This explains the condition required for the existence of a symmetric equilibrium. Thus
to have an equilibrium with diversification in research, it is necessary that the extent of
site specific superiority is high enough. In this chapter this is reflected by the magnitude
of the term $(\pi'-\pi)$. For a given value of r, the condition is more likely to be true when the value of $(\pi'-\pi)$
is higher. However, one can see that for low values of r (i.e. when
agents become more patient) the condition is less likely to hold.
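To make the comparison between the efficient and the equilibrium switching points concrete, here is a small Python sketch. It is an illustration only: the function name `switching_points` and the parameter values are assumptions, while the formulas are those of Propositions 1 and 3.

```python
def switching_points(pi_weak, pi_strong, r):
    """Planner and equilibrium thresholds for intensities pi (weak) and pi' (strong)."""
    lam = pi_weak + pi_strong
    denom = r * pi_strong + pi_weak * (r + pi_strong)
    return {
        "planner_low": pi_weak / lam,                    # p*_2
        "planner_high": pi_strong / lam,                 # p*_1
        "eq_low": pi_weak * (r + pi_strong) / denom,     # p*N_2
        "eq_high": r * pi_strong / denom,                # p*N_1
        # Existence condition r(pi' - pi) - pi pi' > 0 from Proposition 3.
        "condition": r * (pi_strong - pi_weak) - pi_weak * pi_strong,
    }

# Hypothetical parameter values (assumptions for illustration only).
pts = switching_points(pi_weak=0.1, pi_strong=0.5, r=0.2)
# Beliefs where firms duplicate effort although the planner would diversify.
duplication_ranges = [(pts["planner_low"], pts["eq_low"]),
                      (pts["eq_high"], pts["planner_high"])]
```

For these values the condition is positive and the equilibrium diversification range lies strictly inside the planner's range, so both duplication ranges are non-empty.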
Inefficient equilibria with no diversification:
It is clear from the previous proposition that we cannot have a symmetric equilibrium
with diversification in research when the condition $r(\pi'-\pi) - \pi'\pi > 0$ fails to hold. In
these situations we can expect to obtain equilibria where the equilibrium strategy profile
$(k^{N'}_1, k^{N'}_2)$ satisfies (1.7) and (1.8) with $\underline{p} = \bar{p} = p^*$. In other words we look for equilibria
where the switching points for the firms are the same. This implies that in these equilibria,
diversification of research takes place only at the point $p^*$.
To begin with, we focus on the case when firms are equally capable along sites, i.e.
$$\pi^1_1 = \pi^1_2 = \pi^2_1 = \pi^2_2 = \pi$$
Clearly, the condition $r(\pi'-\pi) - \pi'\pi > 0$ fails to hold. The following proposition describes
the equilibrium.
Proposition 4 If $\pi' = \pi$, the unique equilibrium in threshold type strategies is constituted
by the strategy profile $(k^{Ne}_1, k^{Ne}_2)$ such that it satisfies (1.7) and (1.8) with $\underline{p} = \bar{p} = \frac{1}{2}$.
Proof. Suppose the strategy profile $(k'_1, k'_2)$ constitutes an equilibrium such that $(k'_1, k'_2)$
satisfies (1.7) and (1.8) with $\underline{p} = \bar{p} = p^*$. This will induce the payoff functions $v_1(\cdot)$ and
$v_2(\cdot)$ which satisfy (1.13) and (1.14) respectively with $\underline{p} = \bar{p} = p^*$.
Firm 1 finds it optimal to switch from S2 to S1 at $p^*$. Then, from (1.11) with $\pi' = \pi$, at $p = p^*$ we should
have
$$\pi\{p - v_1\} \ge 2\pi(1-p)\Big\{\frac{1}{2} - v_1 + pv_1'\Big\}$$
If 1 chooses S2 at $p^*$ then conditional on no discovery, p will be updated upwards. Hence
$v_1'$ will be given by
$$v_1' = \frac{\pi}{r+2\pi} - C^1_{11}\,[\Lambda(p)]^{\frac{r}{2\pi}}\Big[1 + \frac{r}{2\pi}\cdot\frac{1}{p}\Big]$$
Choosing the integration constant by imposing the value matching condition on $v_1$ at $p^*$,
we have
$$2\pi(1-p)\Big\{\frac{1}{2} - v_1 + pv_1'\Big\} = \pi - (r+2\pi)v_1$$
Thus we require
$$\pi\{p - v_1\} \ge \pi - (r+2\pi)v_1 \;\Rightarrow\; \pi\Big(\frac{r}{r+\pi}\Big)p^* \ge \pi - (r+2\pi)\,\frac{\pi}{r+\pi}\,p^*$$
using $v_1(p^*) = \frac{\pi}{r+\pi}\,p^*$. This implies that $p^* \ge \frac{1}{2}$.
Next, firm 2 finds it optimal to switch from S1 to S2 at $p^*$. From (1.12), at $p = p^*$ we shall then
have
$$\pi\{(1-p) - v_2\} \ge 2\pi p\Big\{\frac{1}{2} - v_2 - (1-p)v_2'\Big\}$$
If firm 2 goes to S1 at $p^*$, then conditional on no discovery, p will be updated downwards.
Hence $v_2'$ will be given by
$$v_2' = -\frac{\pi}{r+2\pi} + C^2_{22}\,[\Gamma(p)]^{\frac{r}{2\pi}}\Big[1 + \frac{r}{2\pi}\cdot\frac{1}{1-p}\Big]$$
After substituting the value of the integration constant, we can posit that for optimality
we require
$$\pi\{(1-p^*) - v_2\} \ge \pi - (r+2\pi)v_2$$
Putting $v_2(p^*) = \frac{\pi}{r+\pi}(1-p^*)$, we then have $p^* \le \frac{1}{2}$.
This implies that if there is an equilibrium with the same switching point for both the
firms, then the switching point should be $p^* = \frac{1}{2}$.
Let $(k^{Ne}_1, k^{Ne}_2)$ be the strategy profile which satisfies (1.7) and (1.8) with $\underline{p} = \bar{p} = \frac{1}{2}$.
The payoffs induced by this profile will be given by (1.13) and (1.14) with $\underline{p} = \bar{p} = \frac{1}{2}$.
The integration constants are chosen by imposing the value matching condition on $v_1$ and $v_2$
at $p = \frac{1}{2}$.
We now need to establish that $(k^{Ne}_1, k^{Ne}_2)$ constitutes an equilibrium. All we need to
show is that $k^{Ne}_1$ ($k^{Ne}_2$) constitutes a best response to $k^{Ne}_2$ ($k^{Ne}_1$) for $p \in (\frac{1}{2}, 1]$ ($p \in [0, \frac{1}{2})$).
Consider the region $(\frac{1}{2}, 1]$. If it is optimal for firm 1 to choose S1, then it must be true
that,
$$2\pi p\Big(\frac{\pi}{2\pi} - v_1 - (1-p)v_1'\Big) \ge \pi(1-p-v_1)
\;\Rightarrow\; v_1 \ge \frac{\pi}{r+\pi}\,(1-p)$$
It can be shown that $v_1$ is strictly increasing for $p \in (\frac{1}{2}, 1]$. Since $v_1(\frac{1}{2}) = \frac{\pi}{r+\pi}\cdot\frac{1}{2}$,
$v_1 > \frac{\pi}{r+\pi}(1-p)$ for all $p \in (\frac{1}{2}, 1]$. Hence $k^{Ne}_1$ is a best response to $k^{Ne}_2$ for $p \in (\frac{1}{2}, 1]$. In a similar
manner it can be shown that $k^{Ne}_2$ constitutes a best response to $k^{Ne}_1$ for $p \in [0, \frac{1}{2})$.
This concludes the proof.
The above analysis shows that in the absence of any difference in abilities of the firms,
we have a unique equilibrium in threshold type markovian strategies with diversification
at one point only. By recalling the analysis of the social planner’s problem we can posit
that when firms are equally capable at each of the sites, the outcome of this equilibrium
coincides with the efficient outcome.
We now turn our focus to the situation when firms’ abilities do differ and we cannot
have equilibrium that involves diversification in research activities over a range of beliefs.
The following proposition describes this.
Proposition 5 If $\pi' > \pi > 0$ and the condition $r(\pi'-\pi) - \pi'\pi > 0$ fails to hold, then we
have a multiplicity of equilibria in threshold type strategies as described below:
$k^s_1 = (0,1)$ for $p \in [0, p^*)$ and $k^s_1 = (1,0)$ for $p \in [p^*, 1]$
$k^s_2 = (0,1)$ for $p \in [0, p^*]$ and $k^s_2 = (1,0)$ for $p \in (p^*, 1]$
where
$$p^* \in \big[\max\{\underline{p}^s, p^{*N}_1\},\; 1 - \max\{\underline{p}^s, p^{*N}_1\}\big]$$
and
$$\underline{p}^s = \frac{\pi(r+\pi')}{\pi(r+\pi') + \pi'(r+\pi)}$$
Proof. Since the condition $r(\pi'-\pi) - \pi'\pi > 0$ fails to hold, we cannot have an equilibrium
with diversification in research (i.e. a range of beliefs over which firms choose different sites
to conduct their research). Thus we seek to find equilibria where the switching points for
the firms are the same. Let $p^*$ be the common switching point. The payoffs induced will
be given by (1.13) and (1.14) with $\underline{p} = \bar{p} = p^*$.
At $p^*$, firm 1 finds it optimal to switch to site S1 from S2. This implies that we must
have
$$\pi'(p^* - v_1) \ge (\pi+\pi')(1-p^*)\Big\{\frac{\pi}{\pi+\pi'} - v_1 + p^*v_1'\Big\}$$
At $p = p^*$, given that 2 is at S2, if 1 goes to S2 then conditional on there being no discovery,
p will increase. Hence $v_1'$ will be given as
$$v_1' = \frac{\pi'}{\pi+\pi'+r} - C^1_{11}\,[\Lambda(p)]^{\frac{r}{\pi+\pi'}}\Big[1 + \frac{r}{\pi+\pi'}\cdot\frac{1}{p}\Big]$$
Since integration constants are chosen by imposing the value matching condition on $v_1$ at
$p^*$,
$$(\pi+\pi')(1-p^*)\Big\{\frac{\pi}{\pi+\pi'} - v_1 + p^*v_1'\Big\} = \pi(1-p^*) + \pi'p^* - (r+\pi+\pi')v_1$$
Hence we require
$$\pi'(p^* - v_1) \ge \pi(1-p^*) + \pi'p^* - (r+\pi+\pi')v_1$$
Substituting $v_1(p^*) = \frac{\pi'}{r+\pi'}\,p^*$, we obtain
$$p^* \ge \frac{\pi(r+\pi')}{\pi(r+\pi') + \pi'(r+\pi)} = \underline{p}^s$$
Similarly we can show that if firm 2 finds it optimal to switch at $p^*$, then we must have
$$p^* \le \frac{\pi'(r+\pi)}{\pi(r+\pi') + \pi'(r+\pi)} = \bar{p}^s$$
Thus to have an equilibrium with the same switching points, it is necessary that the switching
point $p^*$ lies in the interval $[\underline{p}^s, \bar{p}^s]$. Since $r(\pi'-\pi) - \pi'\pi \le 0$, $p^{*N}_1 \le \frac{1}{2}$. In an equilibrium
where the switching points are the same it is necessary that the switching point satisfies $p^* \ge p^{*N}_1$.
Otherwise, from our previous analysis we know that if $p^* < p^{*N}_1$ and 1 switches at $p^*$ then 2
finds it optimal to switch at $p^{*N}_1$ and not $p^*$. Hence $p^* \in [\max\{\underline{p}^s, p^{*N}_1\}, 1 - \max\{\underline{p}^s, p^{*N}_1\}]$.
Finally, we need to establish that $k^s_1$ ($k^s_2$) constitutes a best response to $k^s_2$ ($k^s_1$) for
$p \in (p^*, 1]$ ($[0, p^*)$). Consider the region $p \in (p^*, 1]$. From the above analysis we know that
given the conjectured behavior, we have
$$v_1(p^*) \ge \frac{\pi}{r+\pi}\,(1-p^*)$$
From the switching derivative lemma we can infer that $v_1$ will be strictly increasing for
$p \in [p^*, 1]$. Hence $v_1 \ge \frac{\pi}{r+\pi}(1-p)$ for $p \in [p^*, 1]$. Along the lines of our previous analysis
we can show that this is what is required for $k^s_1$ to be a best response to $k^s_2$ for $p \in (p^*, 1]$.
Similarly we can show that $k^s_2$ constitutes a best response to $k^s_1$ for $p \in [0, p^*)$.
It is to be noted that in the present case, the smooth pasting condition will not necessarily
be satisfied by $v_i$ at $p^*$. This is because here we are in some sense getting a corner solution
for the optimal stopping problems of firm 1 and firm 2. That is, given that $k^s_2 = (0,1)$ for
$p \in [0, p^*]$, firm 1 would have ideally liked to switch to S1 from S2 at $p = p^{*N}_2$. However, it
will not be able to do this since $p^* \le p^{*N}_2$. A similar argument applies to firm 2.
This concludes the proof.
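The set of common switching points in Proposition 5 can be computed directly. The Python sketch below is an illustration under assumed parameter values; the function name `common_switching_interval` is hypothetical, and the bounds are the expressions for the lower bound labelled ps in the proposition and for $p^{*N}_1$ given in the text.

```python
def common_switching_interval(pi_weak, pi_strong, r):
    """Interval of common switching points p* when r(pi' - pi) - pi pi' <= 0.

    pi_weak / pi_strong play the roles of pi and pi' (pi' > pi > 0).
    """
    if r * (pi_strong - pi_weak) - pi_weak * pi_strong > 0:
        raise ValueError("condition holds: equilibrium with diversification applies instead")
    # Lower bound from firm 1's switching incentive (ps in the proposition).
    ps_low = pi_weak * (r + pi_strong) / (
        pi_weak * (r + pi_strong) + pi_strong * (r + pi_weak))
    # Firm 2's preferred switching point from Lemma 4.
    p_n1 = r * pi_strong / (r * pi_strong + pi_weak * (r + pi_strong))
    low = max(ps_low, p_n1)
    return low, 1.0 - low

# Hypothetical parameter values for which the condition fails (assumptions only).
pi_weak, pi_strong, r = 0.3, 0.5, 0.2
low, high = common_switching_interval(pi_weak, pi_strong, r)
planner_low = pi_weak / (pi_weak + pi_strong)     # p*_2
planner_high = pi_strong / (pi_weak + pi_strong)  # p*_1
```

For these values the interval of admissible common switching points lies strictly inside the planner's diversification range, so every such equilibrium involves duplication on both sides of the switching point.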
The previous proposition states that when $r(\pi'-\pi) - \pi'\pi < 0$ and firms are heterogeneous,
we have a multiplicity of equilibria where firms have a common switching point.
Since $p^*$, the common switching point, always lies in the interval $(p^*_2, p^*_1)$, each of these equilibria
involves duplication. The analysis of the above model shows that whenever the firms
differ in their innate abilities (i.e. their Poisson intensities of learning along an arm differ),
the non-cooperative equilibrium is inefficient, such that for a certain range of beliefs,
there is too much specialisation along a method of research when efficiency would require
more diversification. Hence the phenomenon of duplication in the present set-up can be
perceived as a manifestation of competition and heterogeneity among the firms.
Next, we analyse a different model in Environment 2 and show that duplication is only
possible when firms differ in their abilities. The setting is similar to the ones analysed
in ([37]) and ([38]). However, here players operate on the same bandit. Thus apart from
showing that the phenomenon of duplication generalizes to other models as well, we also
show the nature of inefficiency in a bandit model with one safe arm and one risky arm in the
presence of payoff externalities and differences in innate abilities across the players.
1.2 Environment 2
Two firms (1 and 2) are trying to find a treasure. The first one to find it appropriates all
the rent from it which we normalize to 1. There are two sites to look for the treasure. Up
to now, it is similar to the setting in environment 1. However, the characteristics of the
sites differ as follows.
It is known with certainty that one of the sites has the treasure. This site is referred
to as the safe site(S). As before, we consider a continuous time framework. The success of
any firm who is searching at S, follows a Poisson process with intensity π0 > 0. The other
site (R) is a risky one and can either be good or bad. A good risky site has the treasure
and the success of firm i (i = 1, 2) who is searching at R, follows a Poisson process with
intensity πi, such that πi > π0 for all i = 1, 2. A bad risky arm has no treasure. At the
onset, the firms know that the risky site is good with probability p. Each firm can observe
the location where its opponent is carrying out research.
1.2.1 Symmetric firms
First we consider the situation when the firms are symmetric in their abilities. That is for
both the firms(1 and 2), success at the risky site follows a Poisson process with intensity
π1, such that π1 > π0 > 0.
Planner’s problem: The Efficiency Benchmark
Consider the problem of a benevolent social planner who wants to maximise the expected
discounted social value from the invention. The payoff to the planner from invention is
1. At each instant, based on the current belief p, he allocates each firm to a site at which to
carry out research. The planner's action at instant t is kt ∈ {0, 1, 2}, the number of firms
allocated to the risky site at that instant. The process kt (t ≥ 0) is measurable with respect
to the information available at time t.
It is assumed that if the planner is indifferent between allocating a firm to the risky
and the safe site, then he allocates it to the safe site. Thus the action profile of the planner
is left continuous.
From now on we will do away with the time subscript. Let v(p) be the value function
of the planner. Since actions are left continuous and beliefs can move only in the left
direction, left continuity of v(p) can always be assumed.
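Then v(p) should satisfy
$$v(p) = \max_{k\in\{0,1,2\}} \Big\{ (2-k)\pi_0\,dt + k p \pi_1\,dt + (1 - r\,dt)\big(1 - (2-k)\pi_0\,dt - k p \pi_1\,dt\big)\big(v(p) - v'(p)\,k\pi_1 p(1-p)\,dt\big) \Big\},$$
since $v(p+dp) = v(p) + v'(p)\,dp$ and, in the absence of a discovery, $dp = -k\pi_1 p(1-p)\,dt$.
After simplifying, we have
$$rv = \max_{k\in\{0,1,2\}} \Big\{ (2-k)\pi_0[1-v] + k\,\pi_1 p\,[1 - v - v'(1-p)] \Big\} \qquad (1.18)$$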
Proposition 6 The planner allocates both firms to the risky site as long as p > p∗, where
$p^* = \pi_0/\pi_1$. For p ≤ p∗, both firms are allocated to the safe site.
Proof. Since (1.18) is linear in k, we know that at the optimum, k will either be 2 or 0.
When both firms are optimally allocated to the risky site, the value function satisfies
$$v = \frac{2\pi_1}{r+2\pi_1}\,p + C(1-p)[\Lambda(p)]^{\frac{r}{2\pi_1}},$$
where $\Lambda(p) = \frac{1-p}{p}$ and C is the integration constant. This is derived by solving the O.D.E
obtained by putting k = 2 in (1.18).
When both firms are optimally allocated to S, then $v = \frac{2\pi_0}{r+2\pi_0}$. Since v(p) satisfies the
value matching and smooth pasting conditions at p = p∗, we get
$$C = \frac{\frac{2\pi_0}{r+2\pi_0} - \frac{2\pi_1}{r+2\pi_1}\,p^*}{(1-p^*)[\Lambda(p^*)]^{\frac{r}{2\pi_1}}} \quad\text{and}\quad p^* = \frac{\pi_0}{\pi_1}.$$
This concludes the proof.
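The closed form in Proposition 6 can be checked numerically. The following is a minimal sketch, assuming illustrative parameter values (π0 = 1, π1 = 3, r = 1) that are not taken from the text; the function name `planner_symmetric` is hypothetical. It builds v(p) from the stated solution and lets one confirm value matching and smooth pasting at p∗ = π0/π1.

```python
def planner_symmetric(pi0, pi1, r):
    """Planner's threshold and value function for symmetric firms.

    Illustrative sketch: the parameter values passed in are assumptions,
    not numbers from the dissertation.
    """
    p_star = pi0 / pi1                    # switching belief of Proposition 6
    mu = r / (2 * pi1)                    # exponent on Lambda(p)
    a = 2 * pi1 / (r + 2 * pi1)           # slope of the particular solution
    v_safe = 2 * pi0 / (r + 2 * pi0)      # value when both firms are at S

    def homog(p):
        # (1 - p) * Lambda(p)^mu with Lambda(p) = (1 - p) / p
        return (1 - p) * ((1 - p) / p) ** mu

    # Integration constant pinned down by value matching at p_star.
    C = (v_safe - a * p_star) / homog(p_star)

    def v(p):
        if p <= p_star:
            return v_safe
        return a * p + C * homog(p)

    return p_star, v


p_star, v = planner_symmetric(pi0=1.0, pi1=3.0, r=1.0)
h = 1e-5
right_slope = (v(p_star + h) - v(p_star)) / h   # should be close to zero
print(p_star, v(p_star), right_slope, v(1.0))
```

Under these assumed values the right-hand slope at p∗ is numerically negligible, which is what smooth pasting requires, and v(1) equals 2π1/(r + 2π1).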
The non-cooperative game
Player i chooses actions {kit ∈ {0, 1}}, such that kit is measurable with respect to the
information available at time t. We restrict our attention to Markovian strategies, such
that the strategy of player i is defined by a mapping $k_i : [0,1] \to \{0,1\}$. We allow only
those functions $k_i$ for which $k_i^{-1}(1)$ and $k_i^{-1}(0)$ are disjoint unions of
a finite number of non-degenerate sub-intervals of [0, 1], with $k_i(0) = 0$ and $k_i(1) = 1$.
This ensures that the game is well-defined in the continuous time framework.
Firms simultaneously update their belief that the risky site is good as long as
at least one firm is carrying out research at the risky site and there has been no discovery (at
either site). Both k1 and k2 are left continuous, which guarantees the existence of a
well-defined law of motion of the posterior.
Let vi be the value function (equilibrium payoff) of firm i (i = 1, 2) in the non-cooperative
game. If (k1, k2) is an equilibrium strategy profile, then given kj (j = 1, 2), ki (i = 1, 2; i ≠ j)
and vi should satisfy
$$v_i = \max_{k_i\in\{0,1\}} \Big\{ (1-k_i)\pi_0\,dt + k_i\pi_1 p\,dt + (1 - r\,dt)\big(1 - \pi_0(2 - k_i - k_j)\,dt - p\pi_1(k_i+k_j)\,dt\big)\big(v_i - v_i'\,\pi_1 p(1-p)(k_i+k_j)\,dt\big) \Big\}$$
Simplifying the above, we obtain
$$rv_i = \max_{k_i\in\{0,1\}} \Big\{ (1-k_i)\pi_0(1-v_i) + k_i\,\pi_1 p\,[1 - v_i - v_i'(1-p)] - (1-k_j)\pi_0 v_i - k_j\pi_1 p\,\big(v_i + (1-p)v_i'\big) \Big\} \qquad (1.19)$$
Proposition 7 There exists an efficient equilibrium.
Proof. Consider the following strategy profile: Each firm uses R for p > p∗ and S for
p ≤ p∗ (Hence p∗ is the switching point). This is a symmetric strategy profile and the
outcome implied by this profile is the efficient outcome. We need to show that this profile
constitutes an equilibrium.
Suppose firm 2 follows the above strategy. We will determine the best response of firm
1. It is clear that for p = 1, firm 1 will choose R. Thus the optimal switching point of firm
1 is to be determined.
If firm 1 shifts to S from R at any p > p∗, then his payoff in the range (p∗, p] will satisfy
$$v_1 = \frac{\pi_0}{\pi_0+r}\Big(1 - \frac{\pi_1}{\pi_1+\pi_0+r}\,p\Big) + C(1-p)[\Lambda(p)]^{\frac{\pi_0+r}{\pi_1}}$$
This is derived from solving the O.D.E. obtained by putting k1 = 0 and k2 = 1 in (1.19).
Since firm 2 switches to S from R at p∗, the value matching condition at p∗ implies
$$C = \frac{\frac{\pi_0}{r+2\pi_0} - \frac{\pi_0}{\pi_0+r}\Big(1 - \frac{\pi_1}{\pi_1+\pi_0+r}\,p^*\Big)}{(1-p^*)[\Lambda(p^*)]^{\frac{r+\pi_0}{\pi_1}}}$$
We can check that C < 0. Hence $v_1$ is concave for p ∈ (p∗, p]. Further, $v_1'(p^*)$ is zero.
Thus if 1 switches to S from R at p, $v_1'(p) < 0$. This implies that we have
$\pi_0(1 - v_1(p)) < \pi_1 p\,[1 - v_1(p) - v_1'(p)(1-p)]$, which contradicts optimality. This is true for any p > p∗.
This implies that firm 1 should shift to S from R at some p ≤ p∗.
Suppose firm 1 shifts at a point p′ < p∗. Then v1 for the range [p′, p∗] will satisfy
$$v_1 = \frac{\pi_1}{r+\pi_0+\pi_1}\,p + C(1-p)[\Lambda(p)]^{\frac{r+\pi_0}{\pi_1}}$$
Then $v_1'(p') < 0$ for any p′ < p∗. Since v1(.) will satisfy the value matching condition
at p′, we know that $v_1(p') = \frac{\pi_0}{r+2\pi_0}$. Thus for p = p′ + ε, ε > 0 and ε → 0, we must have
$v_1(p) < \frac{\pi_0}{r+2\pi_0}$. However, by switching to S at p∗ he can guarantee himself a payoff of $\frac{\pi_0}{r+2\pi_0}$
at all p < p∗. This contradicts optimality. Hence the unique optimal switching point for
firm 1 is p∗. Similarly we can show this for firm 2.
This concludes the proof.
The setting of this model with symmetric firms is similar to that in ([37]), except
that here we have a payoff externality among the players (i.e., they operate on the
same bandit). Hence we see that competition brings in efficiency.
1.2.2 Asymmetric firms
Suppose the firms differ in their abilities in conducting research at the good risky site.
That is we have π1 > π2 > π0 > 0.
Planner’s problem
Let (k1, k2) be the planner’s action profile, with ki ∈ {0, 1} for i = 1, 2. ki = 1 (0) implies that
the planner has allocated the ith firm to the risky (safe) site. Let v(p) be the value function
of the planner. Then it should satisfy
$$v(p) = \max_{k_1,k_2\in\{0,1\}} \Big\{ (2-k_1-k_2)\pi_0\,dt + k_1 p\pi_1\,dt + k_2 p\pi_2\,dt + (1 - r\,dt)\big(1 - (2-k_1-k_2)\pi_0\,dt - k_1 p\pi_1\,dt - k_2 p\pi_2\,dt\big)\big(v(p) - v'(p)\,p(1-p)(k_1\pi_1 + k_2\pi_2)\,dt\big) \Big\}$$
$$\Rightarrow\; rv = \max_{k_1,k_2\in\{0,1\}} \Big\{ (2-k_1-k_2)\pi_0[1-v] + k_1\,p\pi_1[1 - v - v'(1-p)] + k_2\,p\pi_2[1 - v - v'(1-p)] \Big\} \qquad (1.20)$$
This is because $v(p+dp) = v(p) + v'(p)\,dp$ and $dp = -(k_1\pi_1 + k_2\pi_2)\,p(1-p)\,dt$.
Lemma 5 If there exists an interior solution (i.e there exists p∗i ∈ (0, 1) such that for
higher p firm i is allocated to R and for p less than or equal to p∗i , firm i is allocated to S)
then optimality requires diversification over a range of beliefs. That is, there exists a range
of beliefs over which the planner will allocate one firm to the risky site and the other to the
safe site.
Proof of Lemma. Suppose not. This implies that the planner’s optimality requires him
to switch both the firms from the risky to the safe site at the same p, say p′. At the
optimum the smooth pasting condition must hold which implies that v′(p′) = 0. From
(1.20), we know that optimality requires,
$$p'\pi_2[1 - v(p')] = p'\pi_1[1 - v(p')] = \pi_0[1 - v(p')]$$
However, since π1 > π2, $p'\pi_2[1 - v(p')] < p'\pi_1[1 - v(p')]$. This is a contradiction.
This proves the lemma.
Lemma 6 Firm 2 is to be shifted at a higher p than firm 1.
Proof of Lemma. Suppose not. From lemma (5) we know that this implies firm 1 is
shifted to the safe site at a higher p than firm 2. Let this switching point be p∗1. From (1.20),
we know that at p∗1 we must have, π0[1−v(p∗1)] = p∗1π1[1−v(p∗1)−v′(p∗1)(1−p∗1)]. Since π2 <
π1, we have π0[1−v(p∗1)] = p∗1π1[1−v(p∗1)−v′(p∗1)(1−p∗1)] > p∗1π2[1−v(p∗1)−v′(p∗1)(1−p∗1)].
This is a contradiction to the claim that it is optimal to keep firm 2 at the risky site at
p = p∗1. This proves the lemma.
Proposition 8 There exists a solution to the planner’s problem, where both the firms are
allocated to the risky site for p > p∗2, firm 2 is allocated to the safe site and 1 to the risky
site for p ∈ (p∗1, p∗2], and both firms are allocated to the safe site for p ≤ p∗1, where $p_1^* = \pi_0/\pi_1$.
Proof. First, assume that there exists some $p_2^*$ with $\frac{\pi_0}{\pi_1} < p_2^* < 1$, such that it is optimal to shift
firm 2 to the safe site at $p_2^*$. Over the range of beliefs where 2 is allocated to the safe site
and 1 is allocated to the risky site, v(p) should satisfy
$$v = \frac{\pi_0}{r+\pi_0} + \frac{r\pi_1\,p}{(r+\pi_0)(r+\pi_0+\pi_1)} + C_2(1-p)[\Lambda(p)]^{\frac{r+\pi_0}{\pi_1}} \equiv v_{SR}$$
This is derived through solving the O.D.E obtained by putting k2 = 0 and k1 = 1 in
(1.20). Suppose $p_1^*$ is the belief where 1 is shifted to the safe site. Since at $p_1^*$ both
firms are at S, optimality requires $v'(p_1^*) = 0$ (smooth pasting condition).
According to lemma (6), firm 2 is shifted from R to S at a higher p. Then from the
value matching condition, we know that we should have $v_{SR}(p_1^*) = v(p_1^*)$. This gives us
$$C_2 = \frac{\frac{r\pi_0}{(r+\pi_0)(r+2\pi_0)} - \frac{r\pi_1\,p_1^*}{(r+\pi_0)(r+\pi_0+\pi_1)}}{(1-p_1^*)[\Lambda(p_1^*)]^{\frac{r+\pi_0}{\pi_1}}}.$$
Observe that $C_2 > 0$. Also, the smooth pasting condition
at $p_1^*$ implies $v_{SR}'(p_1^*) = 0$. This gives us
$$\frac{r\pi_1}{(r+\pi_0)(r+\pi_0+\pi_1)} - C_2[\Lambda(p_1^*)]^{\frac{r+\pi_0}{\pi_1}}\Big[1 + \frac{r+\pi_0}{\pi_1 p_1^*}\Big] = 0 \;\Rightarrow\; p_1^* = \frac{\pi_0}{\pi_1}$$
We now need to prove the existence of a p∗2 ∈ (p∗1, 1), such that at p∗2, the planner finds
it optimal to shift firm 2 from R to S. When both firms are allocated to R, v(p) satisfies
$$v = \frac{\pi_1+\pi_2}{r+\pi_1+\pi_2}\,p + C_1(1-p)[\Lambda(p)]^{\frac{r}{\pi_1+\pi_2}} \equiv v_R$$
Hence we need to prove the existence of a $p_2^* \in (p_1^*, 1)$ such that $v_R(p_2^*) = v_{SR}(p_2^*)$ and
$v_R'(p_2^*) = v_{SR}'(p_2^*)$; these are the value matching and smooth pasting conditions
respectively.
Consider any $p \in [p_1^*, 1]$. By $v_R^{s\prime}$ we denote the slope of $v_R$ at p if p is the point where firm
2 is shifted from R to S. Note that $v_{SR}'(\cdot)$ is evaluated on the basis of the fact that firm 1 is
shifted from R to S at $p = p_1^*$.
$$v_R^{s\prime} = \frac{\pi_1+\pi_2}{r+\pi_1+\pi_2} - C_1^p[\Lambda(p)]^{\frac{r}{\pi_1+\pi_2}}\Big[1 + \frac{r}{(\pi_1+\pi_2)p}\Big]$$
where $C_1^p = \dfrac{v_{SR}(p) - \frac{\pi_1+\pi_2}{r+\pi_1+\pi_2}\,p}{(1-p)[\Lambda(p)]^{\frac{r}{\pi_1+\pi_2}}}$. This is obtained from the value matching condition at p.
Consider the expression $C_1^p[\Lambda(p)]^{\frac{r}{\pi_1+\pi_2}}$. The derivative of this expression with respect
to p is strictly negative (refer to Appendix (A.3.2)).
Further, the term $\big[1 + \frac{r}{(\pi_1+\pi_2)p}\big]$ is also decreasing in p. Hence $v_R^{s\prime}$ is strictly increasing
in p.
At p = 1,
$$v_R^{s\prime} = \frac{\pi_1+\pi_2}{r+\pi_1+\pi_2} > \frac{r\pi_1}{(r+\pi_0)(r+\pi_0+\pi_1)} = v_{SR}'$$
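At $p = p_1^* = \frac{\pi_0}{\pi_1}$,
$$v_R^{s\prime}(p_1^*) = \frac{\pi_1+\pi_2}{r+\pi_1+\pi_2} - \Bigg[\frac{v_{SR}(p_1^*) - \frac{\pi_1+\pi_2}{r+\pi_1+\pi_2}\,p_1^*}{1-p_1^*}\Bigg]\Big[1 + \frac{r}{(\pi_1+\pi_2)p_1^*}\Big]$$
$$= \frac{\pi_1+\pi_2}{r+\pi_1+\pi_2} - \Bigg\{\Bigg[\frac{\frac{2\pi_0}{r+2\pi_0} - \frac{\pi_1+\pi_2}{r+\pi_1+\pi_2}\,p_1^*}{1-p_1^*}\Bigg]\Big[1 + \frac{r}{(\pi_1+\pi_2)p_1^*}\Big]\Bigg\}$$
since $v_{SR}(p_1^*) = \frac{2\pi_0}{r+2\pi_0}$.
It can be shown that
$$\frac{\pi_1+\pi_2}{r+\pi_1+\pi_2} - \Bigg\{\Bigg[\frac{\frac{2\pi_0}{r+2\pi_0} - \frac{\pi_1+\pi_2}{r+\pi_1+\pi_2}\,p}{1-p}\Bigg]\Big[1 + \frac{r}{(\pi_1+\pi_2)p}\Big]\Bigg\} = 0 \quad\text{for}\quad p = \frac{2\pi_0}{\pi_1+\pi_2}.$$
Since $v_R^{s\prime}(p)$ is strictly increasing in p and $p_1^* = \frac{\pi_0}{\pi_1} < \frac{2\pi_0}{\pi_1+\pi_2}$, we have $v_R^{s\prime}(p_1^*) < 0$. Earlier, we
have established that the smooth pasting condition at $p_1^*$ implies $v_{SR}'(p_1^*) = 0$. Hence
$v_R^{s\prime}(p_1^*) < v_{SR}'(p_1^*)$ and $v_R^{s\prime}(1) > v_{SR}'(1)$. Since both $v_{SR}'(\cdot)$ and $v_R^{s\prime}(\cdot)$ are strictly increasing and
concave in p, there exists a unique $p_2^* \in (p_1^*, 1)$ such that $v_R^{s\prime}(p_2^*) = v_{SR}'(p_2^*)$.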
This concludes the proof of the existence of p∗2. Also it is established that v(p) is strictly
convex for p > p∗1.
Corollary 1 $p_2^* > \frac{\pi_0}{\pi_2}$, the threshold p where the planner would have shifted firm 2 from R
to S had he been dealing with this firm only.
Proof. Suppose not. Then $p_2^* \le \frac{\pi_0}{\pi_2}$. At $p_2^*$, $v'(p_2^*) = v_{SR}'(p_2^*) > 0$. Since v is strictly
convex for $p > \frac{\pi_0}{\pi_1}$, $v'\big(\frac{\pi_0}{\pi_2}\big) > 0$. Therefore at $p = \frac{\pi_0}{\pi_2}$, $\pi_0[1-v] > \pi_2 p[1 - v - v'(1-p)]$. From
(1.20), we can see that this contradicts the claim that $p_2^* \le \frac{\pi_0}{\pi_2}$. This proves the corollary.
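The construction of the planner's two thresholds in Proposition 8 and Corollary 1 can be reproduced numerically. The sketch below is only illustrative: the parameter values (π0 = 1, π1 = 3, π2 = 2, r = 1), the function name `planner_asymmetric_thresholds`, and the use of bisection are assumptions, not part of the text. It pins C2 down by value matching at p1∗ = π0/π1 and then searches for the belief at which the slope of vR (with C1 fixed by value matching at that belief) equals the slope of vSR.

```python
def planner_asymmetric_thresholds(pi0, pi1, pi2, r, tol=1e-12):
    """Return (p1_star, p2_star, gap) for the asymmetric planner problem.

    Illustrative sketch with hypothetical parameter values; gap(p) is the
    difference between the slope of v_R when switching at p and v_SR'(p).
    """
    p1 = pi0 / pi1
    mu_sr = (r + pi0) / pi1          # exponent in v_SR
    mu_r = r / (pi1 + pi2)           # exponent in v_R
    a = pi0 / (r + pi0)
    b = r * pi1 / ((r + pi0) * (r + pi0 + pi1))
    A = (pi1 + pi2) / (r + pi1 + pi2)

    def lam(p):
        return (1 - p) / p

    # C2 from value matching with v = 2*pi0/(r + 2*pi0) at p1.
    C2 = (2 * pi0 / (r + 2 * pi0) - a - b * p1) / ((1 - p1) * lam(p1) ** mu_sr)

    def v_sr(p):
        return a + b * p + C2 * (1 - p) * lam(p) ** mu_sr

    def dv_sr(p):
        return b - C2 * lam(p) ** mu_sr * (1 + mu_sr / p)

    def dv_r_if_switch_at(p):
        # C1 * Lambda(p)^mu_r equals (v_SR(p) - A*p) / (1 - p) by value matching.
        return A - (v_sr(p) - A * p) / (1 - p) * (1 + mu_r / p)

    def gap(p):
        return dv_r_if_switch_at(p) - dv_sr(p)

    # Bisection; relies on gap < 0 at p1 and gap > 0 close to 1.
    lo, hi = p1, 1 - 1e-9
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0:
            lo = mid
        else:
            hi = mid
    return p1, 0.5 * (lo + hi), gap


p1, p2, gap = planner_asymmetric_thresholds(pi0=1.0, pi1=3.0, pi2=2.0, r=1.0)
print(p1, p2, 1.0 / 2.0)   # p2 can be compared with pi0/pi2
```

For these assumed values the computed p2∗ lies above π0/π2, in line with Corollary 1.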
The non-cooperative game
This is similar to the non-cooperative game with symmetric firms. Thus k1(.) and k2(.)
are the Markovian strategies of the players.
Let v1(p) and v2(p) be the payoff functions of firm 1 and 2 respectively in a Markovian
equilibrium. vi should then satisfy,
$$rv_i = \max_{k_i\in\{0,1\}} \Big\{ (1-k_i)[\pi_0(1-v_i)] + k_i[\pi_i p(1 - v_i - v_i'(1-p))] - \big[(1-k_j)\pi_0 v_i + k_j\pi_j p\,(v_i + v_i'(1-p))\big] \Big\} \qquad (1.21)$$
This implies that, given kj, at any p optimality on firm i’s part requires choosing $k_i(p) = 0\,(1)$
if $\pi_0(1-v_i) \ge (<)\ \pi_i p\,(1 - v_i - v_i'(1-p))$.
We determine the non-cooperative equilibrium in following steps.
Lemma 7 Suppose 2 follows the strategy of going to R for $p > p_2^{*N}$ and to S for $p \le p_2^{*N}$,
such that $\frac{\pi_0}{\pi_1} < p_2^{*N} < 1$. Then firm 1’s best response is to go to R for $p > p_1^{*N}$ and to S for
$p \le p_1^{*N}$, where $p_1^{*N} = \frac{\pi_0}{\pi_1}$.
Proof of Lemma. First, consider the range $p \le p_2^{*N}$. If k1 = 1 (k2 = 0 by hypothesis),
then by putting i = 1 in (1.21) we know that v1 should solve
$$v_1' + \frac{r+\pi_0+\pi_1 p}{p(1-p)\pi_1}\,v_1 = \frac{1}{1-p}$$
This is a first order O.D.E. Solving this we have
$$v_1 = \frac{\pi_1}{r+\pi_0+\pi_1}\,p + C(1-p)[\Lambda(p)]^{\frac{r+\pi_0}{\pi_1}} \equiv v_1^{RS}(p) \qquad (1.22)$$
where C is an integration constant. If he chooses k1 = 0 then v1(p) should satisfy
$$v_1 = \frac{\pi_0}{r+2\pi_0} \qquad (1.23)$$
Initially, we assume that firm 1 indeed behaves in the way claimed, for $p \le p_2^{*N}$.
Later, we will show that the value function thus obtained for the specified range satisfies
the Bellman equation for this range.
$p_1^{*N}$ is the threshold above which firm 1 goes to R. Then, value matching and smooth
pasting at $p_1^{*N}$ imply $v_1^{RS}(p_1^{*N}) = \frac{\pi_0}{r+2\pi_0}$ and $v_1^{RS\prime}(p_1^{*N}) = 0$. From these, we obtain
$p_1^{*N} = \frac{\pi_0}{\pi_1}$. Now we check whether v1 thus obtained satisfies the Bellman equation or not.
At $p = p_1^{*N}$, $v_1'(p_1^{*N}) = 0$. Hence $\pi_1 p(1 - v_1 - v_1'(1-p)) = \pi_0(1-v_1)$. For $p \in [p_1^{*N}, p_2^{*N}]$,
$v_1'$ satisfies
$$v_1' \equiv v_1^{RS\prime} = \frac{\pi_1}{r+\pi_0+\pi_1} - C[\Lambda(p)]^{\frac{r+\pi_0}{\pi_1}}\Big[1 + \frac{r+\pi_0}{\pi_1 p}\Big]$$
$$\Rightarrow\; (1-p)v_1' = \frac{\pi_1}{r+\pi_0+\pi_1} - C(1-p)[\Lambda(p)]^{\frac{r+\pi_0}{\pi_1}}\,\frac{r+\pi_0}{\pi_1 p} - v_1$$
$$\Rightarrow\; \pi_1 p\,(1 - v_1 - v_1'(1-p)) = (r+\pi_0)v_1$$
v1 is strictly convex and increasing in the range $[p_1^{*N}, p_2^{*N}]$. At $p = p_1^{*N}$,
$\pi_1 p(1 - v_1 - v_1'(1-p)) = (r+\pi_0)v_1 = \pi_0(1-v_1)$. Therefore for $p \in (p_1^{*N}, p_2^{*N}]$,
$\pi_1 p(1 - v_1 - v_1'(1-p)) = (r+\pi_0)v_1 > \pi_0(1-v_1)$. From (1.21) we can conclude that it is optimal for firm 1 to choose
k1 = 1.
Next, consider the range $p > p_2^{*N}$. As before we conjecture that it is optimal for 1
to choose k1 = 1 and derive the value function. Then we show that the obtained value
function indeed satisfies the Bellman equation. If 1 chooses k1 = 1 then v1 should satisfy
$$v_1 = \frac{\pi_1}{r+\pi_1+\pi_2}\,p + C(1-p)[\Lambda(p)]^{\frac{r}{\pi_1+\pi_2}}$$
Value matching at $p_2^{*N}$ implies
$$C = \frac{v_1(p_2^{*N}) - \frac{\pi_1}{r+\pi_1+\pi_2}\,p_2^{*N}}{(1-p_2^{*N})[\Lambda(p_2^{*N})]^{\frac{r}{\pi_1+\pi_2}}}.$$
Clearly C is positive, as $v_1(p_2^{*N}) > \frac{\pi_1}{r+\pi_1+\pi_0}\,p_2^{*N} > \frac{\pi_1}{r+\pi_1+\pi_2}\,p_2^{*N}$. Thus v1 is strictly increasing and convex in $(p_2^{*N}, 1]$.
We will show that it satisfies the Bellman equation. For $p > p_2^{*N}$,
$$v_1' = \frac{\pi_1}{r+\pi_1+\pi_2} - C[\Lambda(p)]^{\frac{r}{\pi_1+\pi_2}}\Big(1 + \frac{r}{(\pi_1+\pi_2)p}\Big)$$
$$\Rightarrow\; \pi_1 p\,(1 - v_1 - v_1'(1-p)) = \frac{\pi_1}{\pi_1+\pi_2}\,[\pi_2 p + rv_1]$$
At $p_2^{*N}$, $\pi_1 p(1 - v_1 - v_1'(1-p)) > \pi_0(1-v_1)$. Since v1 is strictly increasing in p, for
$p \in (p_2^{*N}, 1]$ we have $\pi_1 p(1 - v_1 - v_1'(1-p)) > \pi_0(1-v_1)$. Hence it is optimal for firm
1 to choose k1 = 1 for $p \in (p_2^{*N}, 1]$.
This concludes the proof.
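The first part of the argument in Lemma 7 can be illustrated numerically. The sketch below assumes hypothetical parameter values (π0 = 1, π1 = 3, r = 1) and a hypothetical helper name `firm1_value_rs`. It builds firm 1's value on the range where firm 1 is at R and firm 2 is at S, with the constant fixed by value matching at π0/π1, and lets one check both smooth pasting and the identity π1 p(1 − v1 − v1′(1 − p)) = (r + π0)v1 used in the proof.

```python
def firm1_value_rs(pi0, pi1, r):
    """Firm 1's value and slope when firm 1 is at R and firm 2 is at S.

    Illustrative sketch: parameter values are assumptions, not from the text.
    """
    p1 = pi0 / pi1
    mu = (r + pi0) / pi1
    a = pi1 / (r + pi0 + pi1)

    def lam(p):
        return (1 - p) / p

    # Constant from value matching with pi0/(r + 2*pi0) at p1.
    C = (pi0 / (r + 2 * pi0) - a * p1) / ((1 - p1) * lam(p1) ** mu)

    def v1(p):
        return a * p + C * (1 - p) * lam(p) ** mu

    def dv1(p):
        return a - C * lam(p) ** mu * (1 + mu / p)

    return p1, v1, dv1


pi0, pi1, r = 1.0, 3.0, 1.0
p1, v1, dv1 = firm1_value_rs(pi0, pi1, r)
for p in (0.35, 0.40, 0.45):
    risky_side = pi1 * p * (1 - v1(p) - dv1(p) * (1 - p))
    print(p, risky_side, (r + pi0) * v1(p), pi0 * (1 - v1(p)))
```

For each belief above the threshold, the first two printed numbers coincide and exceed the third, which is the condition under which (1.21) selects k1 = 1.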
Lemma 8 Suppose firm 1 plays the following strategy: go to R for $p > p_1^{*N} = \frac{\pi_0}{\pi_1}$ and go
to S for $p \le p_1^{*N}$. Then there exists a $p_2^{*N} \in \big(p_1^{*N}, \frac{\pi_0}{\pi_2}\big)$ such that firm 2’s best response is
to go to R for $p > p_2^{*N}$ and go to S for $p \le p_2^{*N}$.
Proof of Lemma. Consider $p \le p_1^{*N}$. First, as before, we conjecture that it is optimal
for firm 2 to be at S. Then $v_2 = \frac{\pi_0}{r+2\pi_0}$ for $p \le p_1^{*N}$. From (1.21) one can conclude that
$\pi_0(1-v_2) > \pi_2 p\,[1 - v_2 - v_2'(1-p)]$ for $p \le p_1^{*N}$.
Now consider the optimal stopping problem of firm 2 in the range $[p_1^{*N}, 1]$, given firm
1’s strategy.
First we show that firm 2 will switch from R to S at a $p > p_1^{*N}$.
Suppose firm 2 switches from R to S at $p_1^{*N}$. Then v2 for $p \ge p_1^{*N}$ satisfies
$$v_2 = \frac{\pi_2}{r+\pi_1+\pi_2}\,p + C(1-p)[\Lambda(p)]^{\frac{r}{\pi_1+\pi_2}} \equiv v_2^{R}$$
This is derived through solving the O.D.E obtained by substituting k1 = k2 = 1 and i = 2
in (1.21).
We can show that $v_2^{R\prime}(p_1^{*N}) < 0$ (see Appendix (A.3.3)).
If firm 2 switches at some p2, such that $p_2 > p_1^{*N}$, then v2 in the range $[p_1^{*N}, p_2]$ will
satisfy
$$v_2' + \frac{r+\pi_0+\pi_1 p}{\pi_1 p(1-p)}\,v_2 = \frac{\pi_0}{\pi_1 p(1-p)}$$
Solving this O.D.E we have
$$v_2 = \frac{\pi_0}{r+\pi_0}\Big[1 - \frac{\pi_1}{r+\pi_0+\pi_1}\,p\Big] + C(1-p)[\Lambda(p)]^{\frac{r+\pi_0}{\pi_1}} \equiv v_2^{SR}$$
Value matching at $p_1^{*N}$ implies C < 0. Hence v2 is concave in the range $[p_1^{*N}, p_2]$ and
it can be shown that $v_2^{SR\prime}(p_1^{*N}) = 0$. Hence this proves that it is optimal for firm 2 to
switch at a point $p > p_1^{*N}$. Also, at the optimal switching point $p_2^{*N}$ (if it exists), the smooth
pasting condition will be satisfied and we shall have $v_2^{R\prime}(p_2^{*N}) = v_2^{SR\prime}(p_2^{*N}) < 0$. Suppose
by $v_2^{Rs\prime}$ we denote the derivative of $v_2^{R}(p)$ if the switching takes place at the belief p. It
has been established that $v_2^{SR\prime}(p_1^{*N}) = 0 > v_2^{Rs\prime}(p_1^{*N})$. Further, it is easy to see that
$v_2^{SR\prime}(1) = -\frac{\pi_0\pi_1}{(r+\pi_0)(r+\pi_0+\pi_1)} < \frac{\pi_2}{r+\pi_1+\pi_2} = v_2^{Rs\prime}(1)$. It can be shown that $v_2^{Rs\prime}(p)$ is strictly
increasing. Since $v_2^{SR\prime}(\cdot)$ is strictly decreasing, there exists a unique $p_2^{*N} \in (p_1^{*N}, 1)$ such
that $v_2^{SR\prime}(p_2^{*N}) = v_2^{Rs\prime}(p_2^{*N})$.
From (1.21), we know that at the optimum we shall have
$\pi_2 p_2^{*N}\big(1 - v_2(p_2^{*N}) - v_2'(p_2^{*N})(1-p_2^{*N})\big) = \pi_0\big(1 - v_2(p_2^{*N})\big)$.
Since $1 - v_2(p_2^{*N}) < 1 - v_2(p_2^{*N}) - v_2'(p_2^{*N})(1-p_2^{*N})$, we have
$p_2^{*N} < \frac{\pi_0}{\pi_2}$.
Proposition 9 Firm 1 going to R (S) for $p > (\le)\ p_1^{*N}$ and firm 2 going to R (S) for
$p > (\le)\ p_2^{*N}$ constitutes a Markovian equilibrium.
Proof. This follows directly from lemma (7) and (8).
The above proposition describes the unique equilibrium in threshold type Markovian
strategies. Since $p_2^{*N} < \frac{\pi_0}{\pi_2} < p_2^*$, there exists a range of beliefs $(p_2^{*N}, p_2^*)$ where efficiency
requires firm 2 to shift to the safe site, but it does not. This shows that the non-cooperative
equilibrium outcome involves the phenomenon of duplication.
Proposition (9) strengthens the notion that in an R&D race model, where firms have
to choose between competing research projects and the first inventor appropriates all the
rent, we have distortion in the form of duplication.
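Firm 2's equilibrium threshold from Lemma 8 can be computed in the same way as the planner's. The following sketch again uses assumed parameter values (π0 = 1, π1 = 3, π2 = 2, r = 1), a hypothetical function name, and bisection as the root finder; none of these come from the text. It solves for the belief at which the slope of firm 2's value at R (with its constant fixed by value matching at that belief) equals the slope of its value at S.

```python
def firm2_equilibrium_threshold(pi0, pi1, pi2, r, tol=1e-12):
    """Return (p1N, p2N, gap) for firm 2's best response in Lemma 8.

    Illustrative sketch with hypothetical parameter values.
    """
    p1 = pi0 / pi1                   # firm 1's threshold
    mu_sr = (r + pi0) / pi1          # exponent when firm 2 is at S, firm 1 at R
    mu_r = r / (pi1 + pi2)           # exponent when both firms are at R
    a2 = pi2 / (r + pi1 + pi2)
    slope_part = -pi0 * pi1 / ((r + pi0) * (r + pi0 + pi1))

    def lam(p):
        return (1 - p) / p

    def particular(p):
        return pi0 / (r + pi0) + slope_part * p

    # Constant from value matching with pi0/(r + 2*pi0) at p1.
    C = (pi0 / (r + 2 * pi0) - particular(p1)) / ((1 - p1) * lam(p1) ** mu_sr)

    def v_sr(p):
        return particular(p) + C * (1 - p) * lam(p) ** mu_sr

    def dv_sr(p):
        return slope_part - C * lam(p) ** mu_sr * (1 + mu_sr / p)

    def dv_r_if_switch_at(p):
        return a2 - (v_sr(p) - a2 * p) / (1 - p) * (1 + mu_r / p)

    def gap(p):
        return dv_r_if_switch_at(p) - dv_sr(p)

    # Bisection; relies on gap < 0 at p1 and gap > 0 close to 1.
    lo, hi = p1, 1 - 1e-9
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0:
            lo = mid
        else:
            hi = mid
    return p1, 0.5 * (lo + hi), gap


p1N, p2N, gap = firm2_equilibrium_threshold(pi0=1.0, pi1=3.0, pi2=2.0, r=1.0)
print(p1N, p2N, 1.0 / 2.0)   # p2N can be compared with pi0/pi2
```

For these assumed values the computed threshold lies strictly between π0/π1 and π0/π2, so firm 2 stays at the risky site at beliefs below π0/π2, consistent with the ordering stated after Proposition 9.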
1.3 Conclusion
We have demonstrated that when the firms’ abilities differ across research methods, then
efficiency requires diversification of research efforts over a range of beliefs. This has been
established in two different environments. In the presence of heterogeneity among firms, we
do not achieve efficiency in a non-cooperative interaction. Only when the firms are equally
capable across research methods, is the non-cooperative outcome efficient. When the firms
differ in abilities, we always have duplication in a non-cooperative interaction. Depending
on the parameter values we can either have a unique equilibrium with diversification over
a range of beliefs or a multiplicity of equilibria with diversification at a point only.
Chapter 2
Competition and Learning in R&D: The Role of Private Information
2.1 Introduction
Innovation is an important aspect in the technological progress of an economy. The previous
chapter explored the scenario where competing agents trying to make the same discovery
had alternate methods of research to choose from. We saw that in presence of heterogeneity
between agents, non-cooperative interactions always lead to distortion. This distortion takes
the form of too much duplication, i.e., both firms adopt the same research method
when efficiency dictates that one of them adopt a different method.
Throughout our analysis in the previous chapter, we found that inefficiency followed
from competition and heterogeneity among agents. The analysis was done by restricting
ourselves to settings where every outcome is perfectly observable by all agents and there
can only be one kind of news arriving (i.e., the final discovery). However, in reality, we do
observe that the process of innovation might involve several interim arrivals of news before
the final discovery. While for all practical purposes final discovery can be supposed to be
publicly observable due to patenting, there is no reason why intermediate arrivals of news
should be supposed to be observed by everyone. In fact, it is commonly observed that
firms conducting research to compete for a discovery, often maintain secrecy about their
interim outcomes, even though the path of research adopted by other firms is commonly
known. Each firm may obtain some interim success which they may not choose to reveal.
Revealing interim success gives an instantaneous payoff, but it makes the firm vulnerable in
the sense that the competitor may take advantage of this interim result and make the final
discovery sooner and thereby get disproportionately higher rent due to the winner-takes-all
structure.
As a motivating example, consider the world of academic research. Often two re-
searchers try to solve the same problem independently. Whoever solves the problem first,
gets a disproportionately higher payoff (a very good publication) than the subsequent re-
searcher solving the problem. In this situation it is very likely that one of them may get an
interim result earlier. This individual now has two options: Either to reveal this interim
discovery or to conceal it. Revealing might give an instantaneous payoff(say a publication
in a relatively low ranked journal). However, this also increases the probability of the
competing researcher solving the final problem earlier. This shows that a researcher will
not always have incentive to reveal his interim success. In particular, in the absence of any
interim payoff, a researcher will never reveal any interim result. This chapter analyses the
issue of private arrival of information in a setting where there is no payoff from revealing
interim results.
The setting is a modified version of the model in the second environment of the previous
chapter. We have two firms who are trying to find the same treasure. There are two
alternate sites to search for the treasure. One of them will almost surely lead to success
in finite time. We refer to this as the safe site. We consider a continuous time framework.
Success to each firm who searches there follows a Poisson process with intensity π0 > 0.
The other site can either be good or bad. A bad risky site has no treasure and a good
risky site has the treasure for sure. A firm who searches there can experience two kinds
of arrivals. There can be an arrival of information according to a Poisson process with
intensity π1 > 0. This just informs the firm that the site is good. This information is
only revealed to the firm who experiences this arrival. There can also be arrival of final
discovery according to a Poisson process with intensity π2 > π0. A priori, players start
with the same prior p, which is the probability with which the risky site is good.
We first obtain the efficiency benchmark, or the full information optimum, of this model, i.e.,
the solution when both firms are controlled by a social planner who can observe all arrivals
experienced by a firm. Hence, both firms and the planner share a common belief about the state
of the risky site. The planner at each instant allocates each firm to a site. As soon as there is
a final discovery, the search ends. If any firm experiences an informational arrival, then all
uncertainty is resolved and both firms are thereafter allocated to the risky site (which
is now known to be good). The solution is of the threshold type: there exists a threshold
belief p∗ such that, conditional on no arrival having been observed, both firms are allocated
to the risky site if p > p∗ and to the safe site otherwise.
Next, we turn to the non-cooperative game. We restrict ourselves to symmetric equi-
libria. This implies that on the equilibrium path, given same information, actions will
be identical across firms. Hence, if the players start with a common prior, then on the
equilibrium path they will have a common posterior, even though the beliefs are private.
We derive an equilibrium as follows. There exists a common threshold p∗N , such that if
the private belief is greater than p∗N , then the firm searches at the risky site. Else they go
to the safe site. If a firm experiences an informational arrival, then it keeps searching
at the risky site as long as the game continues. If a firm initially goes to the risky site and
gets no arrival until the belief hits p∗N , then it shifts to the safe site. However, if it observes
that its competitor has not shifted, then it reverts back to the risky site. This is because
the action of the competitor gives the firm a signal that an informational arrival has been
experienced at the risky site and thus that the site is good.
Having described the full information optimal and a non-cooperative equilibrium, we
try to analyse the nature of inefficiency. We observe that p∗N > p∗. However, this will not
help us to determine the nature of inefficiency. This is because in the full information case,
the beliefs are public and in the non-cooperative case the beliefs are private. Moreover, the
belief updating processes are different. In the non-cooperative game, the movement of beliefs
is sluggish. Hence, to determine the nature of inefficiency, we take the following route.
For each initial prior, at which the planner would have allocated both firms to the
risky site, we try to calculate the duration for which the firms are kept at the risky site,
conditional on no observation. Then we compare this with the duration for which the firms
would be in the risky site in the non-cooperative game, given the same prior.
First of all, it is trivially true that if the prior is in the range (p∗, p∗N ), in the non
cooperative game, the duration for which the firms go to the risky site (which is actually
0) is less than that a planner would have wanted. Then we determine a threshold belief
p0∗ ∈ (p∗N , 1) such that if the initial prior is higher (lower) than this threshold, then the
duration for which the firms are in the risky site in the non-cooperative game is higher
(lower) than that a planner would have wanted. Hence, too much optimism results in
excessive experimentation along the risky line.
Related Literature: This chapter contributes to two broad areas: IO literature and
the Strategic Bandit literature. To avoid redundancy with the previous chapter, I only
discuss the papers which have dealt with the issue of private arrival of information in the
context of strategic experimentation.
The paper which is closest to this work is the one by Akcigit and Liu[1]. They analyse
a two-armed bandit model with one risky and one safe arm. The risky arm could poten-
tially lead to a dead end. Inefficiency arises from the fact that there is wasteful dead-end
replication and an early abandonment of the risky project. The present work incorporates
the issue of private arrival of information in a different manner. The private information is
in the form of good news about the risky site, unlike their work where private information
is in the form of bad news. However, the present work shows that there can still be early
abandonment of the risky project if, to start with, players are not too optimistic
about the quality of the risky line. Further, in the present work we have learning even
when there is no information asymmetry.
Heidhues, Rady and Strack[34] analyse a strategic experimentation model where we
have private payoffs. They take a two armed bandit model with a risky arm and a safe
arm. Players observe each other’s behavior but not the realised payoffs. They communicate
with each other via cheap talk. The present chapter differs from their work in the following
ways. Firstly, we have private arrivals of information only. Secondly, the players are rivals.
The rest of the chapter is organised as follows. Section 2.2 formally describes the
environment and the full information optimal solution. Section 2.3 discusses the non-cooperative
game and the nature of inefficiency. Section 2.4 concludes the chapter.
2.2 Environment
Two firms are trying to find the same treasure. The first one to find it appropriates all the
rent from it (which is the social value from the invention and is normalised to 1). There
are two sites. One of the sites referred to as the safe one, has the treasure for sure. A firm
who is searching there discovers it according to a Poisson process with intensity π0 > 0.
The other site is risky and can either be bad or good. A bad risky site has no treasure in it
and a firm who is searching there does not experience any arrival. A good risky site has the
treasure for sure. A firm who searches there can experience two kinds of arrivals. There
can be an arrival of information according to a Poisson process with intensity π1 > 0. This
just informs the firm that the site is good. This information is only revealed to the firm
who experiences this arrival. There can also be arrival of final discovery according to a
Poisson process with intensity π2 > π0.
A priori, players start with the same prior p, which is the probability with which the
risky site is good.
2.2.1 The planner’s problem: The full information optimal
Consider the optimal allocation in the case of full information, i.e when all kinds of arrivals
at the risky site are publicly observable. Suppose both the firms are controlled by a
benevolent social planner, who can observe all the arrivals experienced by each of the
firms. The planner wants to maximise the expected discounted social value.
Let k be the number of firms allocated to the risky site at an instant t. Since every
arrival is observable to the planner, if there is no arrival during the interval dt, then
dpt = −pt(1− pt)k(π1 + π2) dt
As soon as the planner observes any arrival at the risky site, the uncertainty is resolved.
If it is a final discovery then the search ends; otherwise the planner knows for sure that the
risky site is good and allocates both firms to that site from then on. As before, we assume
that if the planner is indifferent between allocating a firm to the risky and the safe
site, then he allocates it to the safe site.
Hence if v(p) is the value function of the planner, then it should satisfy the following
Bellman equation:
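$$v(p) = \max_{k\in\{0,1,2\}} \Big\{ (2-k)\pi_0\,dt + kp\Big[\pi_2\,dt + \pi_1\frac{2\pi_2}{r+2\pi_2}\,dt\Big] + (1 - r\,dt)\big(1 - (2-k)\pi_0\,dt - kp(\pi_1+\pi_2)\,dt\big)\big(v(p) - v'(p)\,kp(1-p)(\pi_1+\pi_2)\,dt\big) \Big\}$$
$$\Rightarrow\; rv = \max_{k\in\{0,1,2\}} \Big\{ (2-k)[\pi_0(1-v)] + kp\Big[\pi_2 + \pi_1\frac{2\pi_2}{r+2\pi_2} - (\pi_1+\pi_2)v - (\pi_1+\pi_2)(1-p)v'\Big] \Big\} \qquad (2.1)$$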
Since the Bellman equation is linear in k, we can posit that at the optimum either k = 0 or
k = 2. If k = 0, then $v = \frac{2\pi_0}{r+2\pi_0}$. If k = 2 then v satisfies the following first order O.D.E:
$$v' + \frac{r + 2(\pi_1+\pi_2)p}{p(1-p)\,2(\pi_1+\pi_2)}\,v = \frac{2\pi_2\{r + 2(\pi_1+\pi_2)\}}{(r+2\pi_2)\,2(\pi_1+\pi_2)}\cdot\frac{1}{1-p}$$
This is derived from (2.1) by putting k = 2. The solution to this O.D.E is
$$v = \frac{2\pi_2}{r+2\pi_2}\,p + C(1-p)[\Lambda(p)]^{\frac{r}{2(\pi_1+\pi_2)}} \qquad (2.2)$$
where C is the integration constant and $\Lambda(p) = \frac{1-p}{p}$.
Let p∗ be the belief at which the planner shifts both the firms to the safe site from the
risky site. Since for p = 1 (p = 0), the planner allocates both firms to the risky (safe) site,
p∗ ∈ (0, 1). Hence v(p) should satisfy the value matching and smooth pasting condition.
From the value matching condition at p∗, we have
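$$C = \frac{\frac{2\pi_0}{r+2\pi_0} - \frac{2\pi_2}{r+2\pi_2}\,p^*}{(1-p^*)[\Lambda(p^*)]^{\frac{r}{2(\pi_1+\pi_2)}}}$$
The smooth pasting condition at p∗ implies $v'(p^{*+}) = 0$. From (2.2) we have
$$v' = \frac{2\pi_2}{r+2\pi_2} - C[\Lambda(p)]^{\frac{r}{2(\pi_1+\pi_2)}}\Big[1 + \frac{r}{2(\pi_1+\pi_2)}\cdot\frac{1}{p}\Big]$$
Substituting the value of C and imposing the smooth pasting condition at p∗, we obtain
$$\frac{2\pi_2}{r+2\pi_2} = \frac{\frac{2\pi_0}{r+2\pi_0} - \frac{2\pi_2}{r+2\pi_2}\,p^*}{1-p^*}\Big[1 + \frac{r}{2(\pi_1+\pi_2)}\cdot\frac{1}{p^*}\Big]$$
$$\Rightarrow\; p^* = \frac{\pi_0}{\pi_2 + \frac{2\pi_1(\pi_2-\pi_0)}{r+2\pi_2}} \qquad (2.3)$$
By comparing this p∗ to the one obtained in the model without informational arrival
(which is $\frac{\pi_0}{\pi_2}$ from the previous chapter), we can infer that experimentation along the risky
line is carried out for a larger range of beliefs in the presence of informational arrival.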
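The threshold in (2.3) and its comparison with π0/π2 can be evaluated directly. The sketch below uses assumed parameter values (π0 = 1, π1 = 1, π2 = 2, r = 1) and hypothetical function names; it also recomputes the slope of (2.2) at p∗, with C taken from value matching, so that smooth pasting can be checked.

```python
def planner_threshold_with_news(pi0, pi1, pi2, r):
    """Planner's switching belief from (2.3)."""
    return pi0 / (pi2 + 2 * pi1 * (pi2 - pi0) / (r + 2 * pi2))


def slope_at_threshold(pi0, pi1, pi2, r):
    """Slope of the k = 2 value function (2.2) at p*, with C from value matching.

    Illustrative sketch: parameter values supplied by the caller are assumptions.
    """
    p = planner_threshold_with_news(pi0, pi1, pi2, r)
    a = 2 * pi2 / (r + 2 * pi2)
    mu = r / (2 * (pi1 + pi2))
    v_safe = 2 * pi0 / (r + 2 * pi0)
    # C * Lambda(p*)^mu equals (v_safe - a * p*) / (1 - p*) by value matching.
    return a - (v_safe - a * p) / (1 - p) * (1 + mu / p)


pi0, pi1, pi2, r = 1.0, 1.0, 2.0, 1.0
p_star = planner_threshold_with_news(pi0, pi1, pi2, r)
print(p_star, pi0 / pi2, slope_at_threshold(pi0, pi1, pi2, r))
```

With π1 = 0 the function returns π0/π2, the threshold without informational arrivals; for any π1 > 0 and π2 > π0 the denominator is larger, so the threshold is lower.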
2.3 The non-cooperative game
Let us now consider the non-cooperative game. We assume that a firm can observe the
action of its opponent. Only the final discovery by any firm is publicly observable. Hence
if a firm is searching at the risky site and experiences an informational arrival, the arrival is
private to that firm. Any informational arrival to a firm also resolves the uncertainty for that firm.
Suppose firms start out with the same prior pt at the instant t, and both conduct
research at the risky site over a time interval ∆ > 0. Conditional on not observing anything
until the instant t and during the interval [t, t+ ∆], the common posterior of the firms at
(t+ ∆) is given as:
$$p_{t+\Delta} = \frac{p_t\,e^{-(\pi_1+2\pi_2)\Delta}}{p_t\,e^{-(\pi_1+2\pi_2)\Delta} + (1-p_t)}$$
This is because during the time interval [t, t + ∆], conditional on the risky site being
good, the probability that a firm does not experience any informational arrival or have a final
discovery is $e^{-(\pi_1+\pi_2)\Delta}$ and the probability that the opponent does not have any final
discovery is $e^{-\pi_2\Delta}$. Hence the probability that the site is good and a firm does not observe
anything is $p\,e^{-(\pi_1+2\pi_2)\Delta}$.
If ∆ is small enough then the firms’ common posterior when both conduct research at
the risky site (starting with a common prior) satisfies the following law of motion:
dpt = −(π1 + 2π2)pt(1− pt) dt
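The link between the discrete update and the law of motion can be checked numerically. The following is a minimal sketch rather than part of the original analysis: the function names and the parameter values are hypothetical, chosen only for illustration. It compares the finite-difference slope of the posterior over a short interval with the drift $-(\pi_1+2\pi_2)p(1-p)$.

```python
import math


def posterior_no_news(p, pi1, pi2, delta):
    """Posterior that the risky site is good after both firms search it for
    `delta` units of time and the firm observes nothing."""
    survive = math.exp(-(pi1 + 2.0 * pi2) * delta)
    return p * survive / (p * survive + (1.0 - p))


def drift(p, pi1, pi2):
    """Right-hand side of the law of motion dp/dt when both firms are at the risky site."""
    return -(pi1 + 2.0 * pi2) * p * (1.0 - p)


# Hypothetical parameter values, for illustration only.
p0, pi1, pi2 = 0.6, 0.5, 1.0
h = 1e-6
finite_diff = (posterior_no_news(p0, pi1, pi2, h) - p0) / h
print(finite_diff, drift(p0, pi1, pi2))
```

For a small interval the two printed numbers should nearly coincide, which is the sense in which the law of motion is the small-∆ limit of the Bayesian update.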
2.3.1 Equilibrium
We look for a symmetric equilibrium in the following kind of strategies:
A firm, given the current belief (which is private) chooses a site to carry out research.
If it chooses the risky site at the onset, then it also chooses a posterior, at which it is going
to switch to the safe site.
In a symmetric equilibrium, there exists a threshold $p^*_N \in (0, 1)$ such that if the prior
(which is common to both the firms) $p_0 > p^*_N$, then both the firms choose the risky site
to carry out research and choose $p^*_N$ as the posterior at which to switch to the safe site.
Since firms start out from the same prior, as long as there is no arrival, firms will have
the same posterior.
If at $p = p^*_N$ the firm observes the other firm still conducting research at the
risky site, then it reverts to the risky site and follows the other firm from then on. Also, if
the other firm reverts to the risky site at any $p < p^*_N$, then it will follow suit. Shifting
between sites is costless and takes dt amount of time, where dt > 0 and dt → 0; dt is
small enough that terms of order $dt^2$ can be ignored.
In the following proposition we show that for sufficiently patient firms, such a symmetric
equilibrium exists and is unique.
Proposition 10 There exists a unique symmetric equilibrium as described above for
sufficiently patient firms (i.e., r is low enough) with
$$p^*_N = \frac{\pi_0}{\pi_2 + \frac{\pi_1}{r+2\pi_2}\,(\pi_2-\pi_0)\,\frac{r}{r+\pi_0}}$$
Proof. We prove this proposition in the following steps:
Lemma 9 If there exists a symmetric equilibrium as conjectured above then we must have
$$p^*_N \le \frac{\pi_0}{\pi_2 + \frac{\pi_1}{r+2\pi_2}\,(\pi_2-\pi_0)\,\frac{r}{r+\pi_0}} \equiv \bar{p}^*_N$$
Proof of Lemma. Suppose there exists a symmetric equilibrium as conjectured above
and $p^*_N$ is the common belief where firms switch to the safe site from the risky site, if the
prior $p_0 > p^*_N$. Let the action of firm i at time t be denoted by $k_{it} \in \{0, 1\}$, where
$k_{it} = 0$ ($1$) implies that the firm is choosing the safe (risky) site. The strategy of each
player in a symmetric equilibrium is as follows:
1. If $p_0 > p^*_N$, then $k_{i0} = 1$ and $k_{it} = 1$ as long as $p_{it} > p^*_N$. If $p_{it} \le p^*_N$, then $k_{it} = 0$.
2. If $p_{it} \le p^*_N$ (t > 0) and $k_{jt} = 1$ ($j \ne i$) then $k_{it} = k_{jt}$ from then on.
Let v1(p1) and v2(p2) be the equilibrium payoffs to firm 1 and 2 respectively, in a
symmetric equilibrium. Since firms are identical in all respects and they start with a
common prior, in a symmetric equilibrium firms will have a common posterior, conditional
on not observing anything. Thus on the equilibrium path, we shall have pi = pj .
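Then given the strategy of firm 2, firm 1’s payoff v1(.) should satisfy the following
Bellman equation:
$$v_1 = \max_{k_1\in\{0,1\}} \Big\{ (1-k_1)\pi_0\,dt + k_1 p\Big[\pi_2\,dt + \pi_1\,dt\,\frac{\pi_2}{r+2\pi_2}\Big] + (1-r\,dt)\big(1-(2-k_1-k_2)\pi_0\,dt - k_1(\pi_1+\pi_2)p\,dt - k_2\pi_2 p\,dt\big)\big(v_1 - v_1'\,p(1-p)[k_1(\pi_1+\pi_2)+k_2\pi_2]\,dt\big) \Big\}$$
$$\Rightarrow r v_1 = \max_{k_1\in\{0,1\}} \Big\{ (1-k_1)\big[\pi_0(1-v_1)\big] + k_1 p\Big[\frac{\pi_2(r+\pi_1+2\pi_2)}{r+2\pi_2} - (\pi_1+\pi_2)v_1 - v_1'(1-p)(\pi_1+\pi_2)\Big] - (1-k_2)\pi_0 v_1 - k_2\big[\pi_2 p\,v_1 + \pi_2 p(1-p)v_1'\big] \Big\} \qquad (2.4)$$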
We define $B_s(p)$ and $B_r(p)$ as follows:
$$B_s(p) = \pi_0(1-v_1) \qquad (2.5)$$
$$B_r(p) = p\Big[\frac{\pi_2(r+\pi_1+2\pi_2)}{r+2\pi_2} - (\pi_1+\pi_2)v_1 - v_1'(1-p)(\pi_1+\pi_2)\Big] \qquad (2.6)$$
From (2.4) it is clear that if at a particular p it is optimal for firm 1 to go to the risky
(safe) site then we shall have $B_r(p) \ge (\le)\, B_s(p)$.
According to the conjectured equilibrium, given firm 2’s strategy, firm 1 finds it optimal
to shift to the safe site at $p = p^*_N$. Hence at $p = p^*_N$ we must have
$$B_s(p) \ge B_r(p) \Rightarrow \pi_0(1-v_1) \ge p\Big[\frac{\pi_2(r+\pi_1+2\pi_2)}{r+2\pi_2} - (\pi_1+\pi_2)v_1 - v_1'(1-p)(\pi_1+\pi_2)\Big]$$
In equilibrium, both firms shift to S at $p = p^*_N$. This implies that the left derivative
of $v_1$ at $p^*_N$ is zero. Given firm 2’s strategy, if firm 1 goes to the risky site at $p = p^*_N$,
then conditional on there being no arrival, belief can change only in the leftward direction.
This implies
$$\pi_0\big(1-v_1(p^*_N)\big) \ge p^*_N\Big[\frac{\pi_2(r+\pi_1+2\pi_2)}{r+2\pi_2} - (\pi_1+\pi_2)v_1(p^*_N)\Big]$$
The value matching condition at $p^*_N$ implies $v_1(p^*_N) = \frac{\pi_0}{r+2\pi_0}$. Hence we shall have
$$\frac{\pi_0(r+\pi_0)}{r+2\pi_0} \ge p^*_N\,\frac{\pi_2(r+2\pi_2)(r+\pi_0) + r\pi_1(\pi_2-\pi_0)}{(r+2\pi_2)(r+2\pi_0)}$$
$$\Rightarrow p^*_N \le \frac{\pi_0}{\pi_2 + \frac{\pi_1}{r+2\pi_2}\,(\pi_2-\pi_0)\,\frac{r}{r+\pi_0}}$$
This concludes the proof.
As per the conjectured equilibrium, both firms go to the risky site for $p > p^*_N$. Starting
from a common prior, if both firms go to the risky site for $p > p^*_N$, then $v_1$ in this region
satisfies the following O.D.E.:
$$v_1' + \frac{r + (\pi_1+2\pi_2)p}{p(1-p)(\pi_1+2\pi_2)}\,v_1 = \frac{\pi_2(r+\pi_1+2\pi_2)}{(r+2\pi_2)(1-p)(\pi_1+2\pi_2)}$$
This O.D.E. is obtained from the Bellman equation in (2.4) by putting $k_1 = k_2 = 1$.
Solving this we have
$$v_1 = \frac{\pi_2}{r+2\pi_2}\,p + C(1-p)[\Lambda(p)]^{\frac{r}{\pi_1+2\pi_2}} \qquad (2.7)$$
where C and Λ(.) are as defined before. Then $v_1'$ is given by
$$v_1' = \frac{\pi_2}{r+2\pi_2} - C[\Lambda(p)]^{\frac{r}{\pi_1+2\pi_2}}\left[1 + \frac{r}{\pi_1+2\pi_2}\,\frac{1}{p}\right] \qquad (2.8)$$
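As a sanity check on (2.7) and (2.8), one can substitute them into the O.D.E. and confirm numerically that the residual vanishes for an arbitrary constant C. The sketch below assumes $\Lambda(p) = (1-p)/p$, which is the form consistent with the derivative in (2.8); the function names and parameter values are hypothetical.

```python
def Lam(p):
    """Assumed form of Lambda(p), chosen to be consistent with (2.8)."""
    return (1.0 - p) / p


def v1(p, C, r, pi1, pi2):
    """Candidate solution (2.7)."""
    a = r / (pi1 + 2.0 * pi2)
    return pi2 / (r + 2.0 * pi2) * p + C * (1.0 - p) * Lam(p) ** a


def v1_prime(p, C, r, pi1, pi2):
    """Derivative (2.8) of the candidate solution."""
    a = r / (pi1 + 2.0 * pi2)
    return pi2 / (r + 2.0 * pi2) - C * Lam(p) ** a * (1.0 + a / p)


def ode_residual(p, C, r, pi1, pi2):
    """Left-hand side minus right-hand side of the O.D.E. for k1 = k2 = 1."""
    lam = pi1 + 2.0 * pi2
    lhs = v1_prime(p, C, r, pi1, pi2) + (r + lam * p) / (p * (1.0 - p) * lam) * v1(p, C, r, pi1, pi2)
    rhs = pi2 * (r + pi1 + 2.0 * pi2) / ((r + 2.0 * pi2) * (1.0 - p) * lam)
    return lhs - rhs


# Hypothetical parameter values; C is arbitrary because the check holds for any constant.
for p in (0.3, 0.5, 0.8):
    print(p, ode_residual(p, C=0.2, r=0.1, pi1=0.5, pi2=1.0))
```

The residuals should be zero up to floating-point error, whatever value of C is used; the constant is pinned down only by the value matching condition.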
Lemma 10 In a symmetric equilibrium it is necessary to have $p^*_N = \bar{p}^*_N$.
Proof of Lemma. Suppose $p^*_N < \bar{p}^*_N$. In equilibrium we should have $B_r(p) \ge B_s(p)$ for
all $p > p^*_N$.
First of all we will show that it is never possible to have $v_1'(p^*_N+) < 0$.
From (2.6) we have
$$B_r(p) = p\Big[\frac{\pi_2(r+\pi_1+2\pi_2)}{r+2\pi_2} - (\pi_1+\pi_2)v_1 - v_1'(1-p)(\pi_1+\pi_2)\Big]$$
$$\Rightarrow B_r(p) = p\Big[\frac{\pi_2(r+\pi_1+2\pi_2)}{r+2\pi_2} - (\pi_1+2\pi_2)v_1(p) - (\pi_1+2\pi_2)v_1'(p)(1-p)\Big] + \pi_2 p\,v_1 + \pi_2 p(1-p)v_1'$$
Since for $p > p^*_N$, $v_1$ is given by (2.7), we have
$$B_r(p) = r v_1 + \pi_2 p\,v_1 + \pi_2 p(1-p)v_1' \qquad (2.9)$$
Consider $p = p^*_N + \epsilon$, such that ε > 0 and ε → 0. Then $v_1 \approx \frac{\pi_0}{r+2\pi_0}$ (since $v_1$ is continuous).
This implies
$$B_s(p) \approx \frac{r\pi_0}{r+2\pi_0} + \frac{\pi_0^2}{r+2\pi_0}$$
If $v_1'(p^*_N+) < 0$, then $B_r(p) < r v_1 + \pi_2 p\,v_1 \approx \frac{r\pi_0}{r+2\pi_0} + \pi_2 p\,\frac{\pi_0}{r+2\pi_0}$. Since $\bar{p}^*_N < \frac{\pi_0}{\pi_2}$,
we have $\frac{r\pi_0}{r+2\pi_0} + \pi_2 p\,\frac{\pi_0}{r+2\pi_0} < B_s(p)$. This implies that $B_r(p) < B_s(p)$ and contradicts optimality.
Hence we cannot have $v_1'(p^*_N+) < 0$ in equilibrium. This implies that $v_1'(p^*_N+) \ge 0$.
Since both firms shift to the safe site from the risky site at $p = p^*_N$, using the value
matching condition at $p = p^*_N$ we have
$$C = \frac{\frac{\pi_0}{r+2\pi_0} - \frac{\pi_2}{r+2\pi_2}\,p^*_N}{(1-p^*_N)\,[\Lambda(p^*_N)]^{\frac{r}{\pi_1+2\pi_2}}}$$
From (2.8), we have $v_1'(p^*_N+) \ge (\le)\, 0$ according as $p^*_N \ge (\le)\, \frac{\pi_0}{\pi_2 + \frac{\pi_1}{r+2\pi_2}(\pi_2-\pi_0)}$.
Since $v_1'(p^*_N+) \ge 0$, we must have $p^*_N \in \Big[\frac{\pi_0}{\pi_2 + \frac{\pi_1}{r+2\pi_2}(\pi_2-\pi_0)},\ \bar{p}^*_N\Big)$.
As $p^*_N < \bar{p}^*_N$, we shall have $B_s(p^*_N) > B_r(p^*_N)$.
This implies
$$\pi_0\big(1-v_1(p^*_N)\big) > p^*_N\Big[\frac{\pi_2(r+\pi_1+2\pi_2)}{r+2\pi_2} - (\pi_1+\pi_2)v_1(p^*_N)\Big]$$
as $v_1'(p^*_N) = 0$. As $v_1(.)$ is continuous, this strict inequality will still be satisfied for
$p = p^*_N + \epsilon$ (ε > 0). For $p = p^*_N + \epsilon$, $v_1' \ge 0$. Then from (2.6), we can infer that
$$B_s(p) = \pi_0\big(1-v_1(p^*_N)\big) > p^*_N\Big[\frac{\pi_2(r+\pi_1+2\pi_2)}{r+2\pi_2} - (\pi_1+\pi_2)v_1(p^*_N)\Big] \ge p^*_N\Big[\frac{\pi_2(r+\pi_1+2\pi_2)}{r+2\pi_2} - (\pi_1+\pi_2)v_1(p^*_N) - v_1'(1-p)(\pi_1+\pi_2)\Big] = B_r(p)$$
This is not possible in equilibrium. Hence it is necessary to have $B_s(p^*_N) = B_r(p^*_N)$. This
implies $p^*_N = \bar{p}^*_N$.
This concludes the proof.
From the above two lemmas we know that a necessary condition for a symmetric
equilibrium is that the common switching belief $p^*_N$ equals $\bar{p}^*_N$. Hence,
conditional on existence, the symmetric equilibrium is unique.
Now we need to prove existence. We need to find the conditions which guarantee
that for all $p > p^*_N$, $B_r(p) \ge B_s(p)$.
According to the proposed profile of strategies, both firms go to the risky site for
$p > p^*_N$. Then $B_r(p)$ is given by (2.9). Using (2.5) we know that $B_r(p) \ge B_s(p)$ requires
$$v_1 \ge \frac{\pi_0}{r+\pi_2 p+\pi_0} - \frac{\pi_2\,p(1-p)\,v_1'(p)}{r+\pi_2 p+\pi_0} \qquad (2.10)$$
From our above analysis we know that $v_1$ is always strictly increasing and convex in p for
$p \in (p^*_N, 1)$. This implies $v_1 \ge \frac{\pi_0}{r+2\pi_0}$ for $p \in (p^*_N, 1)$. Hence from (2.10) we can posit that
a sufficient condition to ensure $B_r(p) \ge B_s(p)$ is to have
$$\frac{\pi_0}{r+\pi_2 p+\pi_0} - \frac{\pi_2\,p(1-p)\,v_1'(p)}{r+\pi_2 p+\pi_0} \le \frac{\pi_0}{r+2\pi_0}$$
Since $v_1' > 0$ for $p > p^*_N$, the above inequality is satisfied (strictly) for $p \ge \frac{\pi_0}{\pi_2}$.
At $p = \frac{\pi_0}{\pi_2}$,
$$\frac{\pi_0}{r+\pi_2 p+\pi_0} - \frac{\pi_2\,p(1-p)\,v_1'}{r+\pi_2 p+\pi_0} < \frac{\pi_0}{r+2\pi_0}$$
Hence there exists a $p' < \frac{\pi_0}{\pi_2}$ such that for $p \in (p', \frac{\pi_0}{\pi_2}]$,
$$\frac{\pi_0}{r+\pi_2 p+\pi_0} - \frac{\pi_2\,p(1-p)\,v_1'}{r+\pi_2 p+\pi_0} \le \frac{\pi_0}{r+2\pi_0}$$
From the expression of $p^*_N$ we know that $p^*_N \to \frac{\pi_0}{\pi_2}$ as r → 0. Hence we can find an
$r^* > 0$ such that for all $r \in (0, r^*)$, $p^*_N > p'$. This implies that for $r \in (0, r^*)$, we have
$B_r(p) \ge B_s(p)$ for all $p \in (p^*_N, 1)$.
This concludes the proof of the proposition.
Since $\frac{r}{r+\pi_0} < 1$ we can infer that $p^*_N > p^*$. Hence the non-cooperative equilibrium
may involve distortion. However, a priori it cannot be determined whether there will be
too much or too little experimentation along the risky line in the non-cooperative
equilibrium described above. This is because in non-cooperative interaction, private arrival of
information is not publicly observable. Thus if the common prior is greater than $p^*_N$, then
conditional on no arrival, the private belief of the players diverges from the public belief
(which is the one discussed in the full information optimal problem). Thus to determine
the nature of inefficiency, we need to know the duration of experimentation along the risky
line, conditional on no arrival. Observe that in the case of no private information, there is a
one-to-one correspondence between the duration of experimentation and the posterior.
The following proposition establishes the nature of inefficiency.
Proposition 11 The non-cooperative equilibrium involves inefficiency. There exists a
$p^*_0 \in (p^*_N, 1)$ such that if the prior $p_0 > p^*_0$, then conditional on no arrival we have
excessive experimentation, and for $p_0 < p^*_0$ we have too little experimentation. By
excessive experimentation we mean that, starting from a prior, the duration for which firms
conduct research at the risky site is longer than a planner would have liked.
Proof. Let $t^n_{p_0}$ be the duration of experimentation along the risky line by the firms in the
non-cooperative equilibrium described above when they start from the prior $p_0$. From the
non-cooperative equilibrium described above we know that if the firms start out from the
prior $p_0$ then they carry on experimentation along the risky line until the posterior
reaches $p^*_N$. From the dynamics of the posterior we know that
$$dp_t = -(\pi_1+2\pi_2)\,p_t(1-p_t)\,dt \Rightarrow dt = -\frac{1}{\pi_1+2\pi_2}\,\frac{1}{p_t(1-p_t)}\,dp_t$$
$$t^n_{p_0} = -\frac{1}{\pi_1+2\pi_2}\int_{p_0}^{p^*_N}\Big[\frac{1}{p_t} + \frac{1}{1-p_t}\Big]dp_t$$
$$\Rightarrow t^n_{p_0} = \frac{1}{\pi_1+2\pi_2}\Big[\log[\Lambda(p^*_N)] - \log[\Lambda(p_0)]\Big]$$
Let $t^p_{p_0}$ be the duration of experimentation along the risky line a planner would have wanted
if the firms start out from the prior $p_0$. Then from the equation of motion of $p_t$ in the
planner’s problem we have
$$dp_t = -2(\pi_1+\pi_2)\,p_t(1-p_t)\,dt \Rightarrow dt = -\frac{1}{2(\pi_1+\pi_2)}\,\frac{1}{p_t(1-p_t)}\,dp_t$$
$$\Rightarrow t^p_{p_0} = \frac{1}{2\pi_1+2\pi_2}\Big[\log[\Lambda(p^*)] - \log[\Lambda(p_0)]\Big]$$
We have excessive experimentation when $t^n_{p_0} > t^p_{p_0}$. This is the case when
$$\frac{1}{\pi_1+2\pi_2}\Big[\log[\Lambda(p^*_N)] - \log[\Lambda(p_0)]\Big] > \frac{1}{2\pi_1+2\pi_2}\Big[\log[\Lambda(p^*)] - \log[\Lambda(p_0)]\Big]$$
$$\Rightarrow \pi_1\log[\Lambda(p_0)] < 2(\pi_1+\pi_2)\log[\Lambda(p^*_N)] - (\pi_1+2\pi_2)\log[\Lambda(p^*)]$$
Since Λ(p) is decreasing in p, the above inequality states that there exists a $p^*_0 \in (0, 1)$ such
that if $p_0 > p^*_0$ then the above inequality is satisfied. Also, since $p^* < p^*_N$ we have
$$\pi_1\log[\Lambda(p^*_0)] = \pi_1\log[\Lambda(p^*_N)] - (\pi_1+2\pi_2)\Big[\log[\Lambda(p^*)] - \log[\Lambda(p^*_N)]\Big] < \pi_1\log[\Lambda(p^*_N)]$$
$$\Rightarrow p^*_0 > p^*_N$$
This concludes the proof.
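The comparison in the proof can be illustrated numerically. The sketch below is not part of the original argument: it assumes $\Lambda(p) = (1-p)/p$ (decreasing in p, as the proof requires), uses the closed forms for $p^*$ in (2.3) and $p^*_N$ in Proposition 10, and plugs in hypothetical parameter values with $\pi_2 > \pi_0$. All function names are illustrative.

```python
import math


def Lam(p):
    """Assumed form of Lambda(p); it is decreasing in p."""
    return (1.0 - p) / p


def planner_threshold(r, pi0, pi1, pi2):
    """Planner's switching belief p* from (2.3)."""
    return pi0 / (pi2 + 2.0 * pi1 * (pi2 - pi0) / (r + 2.0 * pi2))


def noncoop_threshold(r, pi0, pi1, pi2):
    """Non-cooperative switching belief p*_N from Proposition 10."""
    return pi0 / (pi2 + pi1 * (pi2 - pi0) * r / ((r + 2.0 * pi2) * (r + pi0)))


def duration_noncoop(p0, r, pi0, pi1, pi2):
    """Time spent at the risky site in equilibrium, conditional on no arrival."""
    p_n = noncoop_threshold(r, pi0, pi1, pi2)
    return (math.log(Lam(p_n)) - math.log(Lam(p0))) / (pi1 + 2.0 * pi2)


def duration_planner(p0, r, pi0, pi1, pi2):
    """Time the planner keeps both firms at the risky site, conditional on no arrival."""
    p_s = planner_threshold(r, pi0, pi1, pi2)
    return (math.log(Lam(p_s)) - math.log(Lam(p0))) / (2.0 * (pi1 + pi2))


def cutoff_prior(r, pi0, pi1, pi2):
    """Prior p*_0 at which the two durations coincide."""
    p_n = noncoop_threshold(r, pi0, pi1, pi2)
    p_s = planner_threshold(r, pi0, pi1, pi2)
    log_lam = (2.0 * (pi1 + pi2) * math.log(Lam(p_n))
               - (pi1 + 2.0 * pi2) * math.log(Lam(p_s))) / pi1
    return 1.0 / (1.0 + math.exp(log_lam))  # inverts Lam(p) = (1 - p) / p


# Hypothetical parameter values, for illustration only.
params = dict(r=0.1, pi0=0.5, pi1=0.5, pi2=1.0)
print(planner_threshold(**params), noncoop_threshold(**params), cutoff_prior(**params))
for prior in (0.6, 0.9):
    print(prior, duration_noncoop(prior, **params), duration_planner(prior, **params))
```

For priors above the computed cutoff the equilibrium duration should exceed the planner's, and for priors between $p^*_N$ and the cutoff it should fall short, in line with Proposition 11.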
In the non-cooperative equilibrium, distortion arises from two sources. One is what
we call the implicit free-riding effect. This comes from the fact that if a firm experiences
a private arrival of information, then the benefit from that is also reaped by the other
competing firm. This is possible here because of instantaneous costless switching. In fact,
if information arrival to firms were public, then the non-cooperative equilibrium
would always involve free-riding. This follows directly from [37]. Thus this implicit
free-riding effect tends to reduce the duration of experimentation along the risky line.
The other kind of distortion arises from the fact that information arrival is private and
the probability that the opponent firm has experienced an arrival of information is directly
proportional to the belief that the risky site is good. Conditional on no observation, this
makes the movement of the belief sluggish. This results in an increase in the duration of
experimentation along the risky line. The effect of distortion from the second (first) source
dominates if the prior to start with is higher (lower).
This intuitively explains the result obtained in the above proposition.
2.4 Conclusion
This chapter has analysed a tractable model to explore the situation when there can be
private arrival of information. We show that there can be a non-cooperative equilibrium
where depending on the prior we can have both too much and too little experimentation
along the risky line. This result has been obtained under the assumption that firms can
switch between sites without incurring any cost (revocable switching). It will be interesting
to see how the results change if a firm, after switching to the safe site, is unable to revert
to the risky site. This idea of irrevocable switching and payoff from interim results
will be addressed in my future research.
Chapter 3
Decentralised Bilateral Trading, Competition for Bargaining Partners and the law of one price
3.1 Introduction
In this chapter, we study price formation in a market with small numbers of buyers and
sellers, where transactions are bilateral, between a single buyer and a single seller. For a
broad range of variants of a dynamic bargaining game with many sellers and buyers, in
which only one side of the market makes offers, we find that, as the discount factor goes to
1, the stationary equilibrium prices in different transactions converge to a single value. A
dynamic version of “directed search” is one of the extensive forms discussed here, though
more attention is given to offers targeted to specific individuals on the other side.
3.1.1 Motivation for the problem studied
Most modern markets consist of a small number of participants on each side. These par-
ticipants buy from and sell to each other, write contracts with each other and sometimes
merge with each other. The transactions in these markets are often bilateral in nature,
consisting of an agreement between a buyer and a seller or a firm and a worker. These
bilateral trades occur without any centralised pricing mechanism, in a series of bargains in
which the “outside options” for a current bargaining pair are, in fact, endogenously given
for each by the presence of alternative partners on the other side of the market. However,
these potential alternative partners, by their presence, implicitly compete with each other
and one question that arises naturally is whether the “competitive” pressure of the outside
options leads to an approximately uniform price for non-differentiated goods. It is this
basic question, about endogenous outside options and a uniform price, that this chapter
seeks to study, in the context of a particular set of extensive forms. We focus on complete
information. 1
Examples
Whilst the models we study are going to be highly stylised representations of these exam-
ples, they at least have some features in common with them. A standard example used
in these settings is the housing market, for a given location and a given type of home (to
reduce the extent of differentiation). Sellers list their houses, buyers visit, inspect and then
convey their offers to the sellers, one offer from each buyer. Sellers can accept or reject the
offers they have; possibly they then make counter-offers or often wait for the buyer to come
back again with new higher offers. Whether counter-offers are made or not distinguishes
different extensive forms or bargaining protocols. The offers are privately made to sellers,
who typically do not know what other sellers receive.
Footnote 1: An incomplete information analysis is carried out in the next chapter.
Another example is of a firm being acquired. Here the potential acquirer makes a public
targeted offer for a particular firm, which the shareholders of the potential acquisition have
to accept or reject (based on a recommendation by the management). A rejection could
lead to the acquirer raising its offer. There could be competition on both sides, perhaps
another potential buyer called in by management of the target as a “white knight” and
other possible targets with the same attractive characteristics as the one in play. In this
particular context, it makes sense to think of offers as being one-sided, from the potential
buyers, and publicly announced.
Private targeted offers occur in negotiations for joint ventures. For example, the book [2]
describes the joint venture talks between industrial gas companies and chemical companies
in the 1980s, in which the players were Air Products, Air Liquide and British Oxygen on
one side and DuPont, Dow Chemical and Monsanto on the other. After some bargaining,
two joint ventures and an acquisition resulted.
A fourth context, this time from the economics literature, occurs in the “directed
search” models common in labour economics ([29], [47]). The game consists here of firms
announcing wage offers simultaneously and workers deciding which offer to accept. If firms
are constrained in the number of slots, sometimes not all workers who seek the job can be
hired. The game is often modelled as one-stage; there is no dynamics of competing offers
over time for the same potential workers. Our model, however, has this additional feature
(of competition over time).
3.1.2 Main features of our model.
Our model begins from a setting of two buyers with common valuation v, two sellers
with valuations M,H and complete information about these values. We assume that
v > H > M > 0. We then extend the basic model by adding buyers, sellers, or both.
There is a one-time entry of players, at the beginning of the game, and a
buyer-seller pair who trade leave the market.
Players discount with a common discount factor δ ∈ (0, 1). We consider equilibria for
high values of δ and consider the limit of equilibria as δ → 1. We also consider extensive
forms with public and private targeted offers and “ex ante” public offers (as in directed
search), using the terminology of Gale. All extensive forms we consider have two main
features: offers are one-sided and offers are simultaneous. Simultaneous offers seem to
us to be the right way to capture the essence of competition. Targeting an offer to one
individual on the other side of the market enables us to endogenise matching between
buyers and sellers as a strategic decision. Once the offers have been made, one per proposer,
recipients simultaneously accept or reject. A rejection ensures that the game continues to
the following period, where payoffs are discounted by δ.
Our main results, starting with the basic model, can be simply described. There
is a unique stationary equilibrium outcome under complete information, involving non-
degenerate mixed strategies for all players. As δ → 1, the mixed strategies collapse to a
single price and the price in all matches goes to H. In equilibrium, there could be one-period
delay with positive probability, but the cost of delay, of course, goes to 0 as δ → 1. The
price H might be thought of as a competitive equilibrium price in the complete information
setting.
The complete information asymptotic results extend to the general case of n buyers
and n sellers (where n < ∞).2
Footnote 2: We have not checked for uniqueness of the stationary equilibrium outcome. Although the
equilibrium in the general case has to consider cases not present in the basic model, the uniqueness
result should extend; proving it formally, however, would involve details of a large number of special cases.
In the next section, we discuss the relevant literature and compare our results to some
of the existing work.
3.1.3 Related literature.
We now qualitatively describe the existing literature and compare our model with it. The
first attempts to obtain microfoundations for markets using bilateral bargaining were the
papers by Rubinstein and Wolinsky [53], [8], Gale [25] and [26]. These papers were all
concerned with large anonymous markets, in which players who did not agree in a given
period are randomly and exogenously rematched in succeeding periods with someone they
had never met before. Rubinstein and Wolinsky [53] and Gale [26] consider bargaining
frictions given by discounting and characterise the limiting price as the discount factor
goes to 1. The limiting price depends on exogenously given probabilities of being matched
in the following period.
Rubinstein and Wolinsky [54] (see also [46], Chapter 9.2, 9.3 for an exposition of their
models) consider B buyers and S sellers, with B > S and both finite. They have models
in which a proposal is made and, if it is rejected, participants are rematched using an
exogenous matching technology. They take a frictionless trading environment. Rubinstein
and Wolinsky, in this paper, show that there could be multiple equilibria in prices even
though all buyers and all sellers are homogeneous. In their model, because there are more
buyers than sellers, the competitive price is the buyers’ valuation. However, non-stationary
equilibria with prices different from the competitive equilibrium also exist. Some additional
assumptions ensure the competitive solution to be unique.
Gale and Sabourian [28] and Sabourian [55] use notions of strategic complexity to select
the competitive equilibrium in games of the kind studied by Rubinstein and Wolinsky, by
refining away non-stationary equilibria using the complexity concept.
Hendon and Tranaes [35], also following [54] study a market with two heterogeneous
buyers and one seller, and random matching after termination, and show there is no sta-
tionary subgame perfect equilibrium.
Chatterjee and Dutta [10] attempt a project similar to this one, also with public and
private targeted offers and ex ante offers, but both sides of the market are allowed to make
offers. It turns out that this difference with the current chapter is crucial. The paper
[10] does not, in general, obtain an asymptotically single price as δ → 1; under public
targeted offers, there is a pure strategy equilibrium and all pure strategy equilibria involve
two different prices. In general, the mixed strategy equilibria in the other models remain
non-degenerate even as δ → 1, unlike this chapter, even though the expected player payoffs
converge (except for public targeted offers).
To summarise, this current chapter differs from the existing literature by considering
one or more of the following: (i) Small numbers and strategic matching. (ii) Extensive
forms with different assumptions about whether offers are public or targeted and private.
(iii) Simultaneous offers. Despite this variety and the number of differences with the papers
mentioned above, the results we get are surprisingly consistent with an asymptotic single
price. It is clear that the fact that we consider one-sided rather than alternating offers has
much to do with this, and this might be considered one of the takeaways from this chapter,
namely that the intuition for the single price result holds broadly provided alternating
offers don’t push prices apart when buyer-seller valuations are heterogeneous.
In the next section, we discuss the basic model with two buyers and two sellers under
complete information. In Section 3, we consider extensions of the basic model, analyzing
the effects of adding a buyer or a seller. We also show, in this section, how to extend
the description of the equilibrium constructed in Section 2 to a setting where there are n
buyers and n sellers, for general finite n.
3.2 The basic framework
3.2.1 The model
Players and payoffs
In the basic model we address, there are two buyers and two sellers. As mentioned in
Section 3.1.2, the two buyers $B_1$ and $B_2$ have a common valuation of v for the good
(the maximum a buyer is willing to pay for a unit of the indivisible good).
Each of the sellers owns one unit of the indivisible good. Sellers differ in their
valuations (we can also interpret these as their costs of producing to order). One of the
sellers, $S_M$, has a value of M for one or more units of the good. The other seller, $S_H$,
similarly has a value of H, where
v > H > M > 0
This inequality implies that either buyer has a positive benefit from trade with either
seller. Alternative assumptions can be easily accommodated but are not discussed in the
chapter. In the basic complete information framework all these valuations are commonly
known. Finally, all players are risk neutral. Players (buyers or sellers) have a common
discount factor δ, where δ ∈ (0, 1). Suppose a buyer agrees on a price $p_j$ with seller $S_j$ in
period t. Then the buyer has an expected discounted payoff of $\delta^{t-1}(v - p_j)$ and $S_j$ has the
payoff of $\delta^{t-1}(p_j - j)$, where j = M, H.
We shall discuss the informational assumptions along with the extensive forms in the
next subsection.
The extensive form
We consider an infinite horizon multi-player bargaining game with one-sided offers. The
extensive form of the game is described as follows.
At each time point t = 1, 2, ... offers are made simultaneously by the buyers. The offers
are targeted. This means an offer by a buyer consists of a seller’s name (that is SH or
SM ) and a price at which the buyer is willing to buy the object from the seller he has
chosen. Each buyer can make only one offer per period. Two settings could be considered;
one in which each seller observes all offers made (public targeted offers) and one (private
offers) in which each seller observes only the offers she gets. (Similarly for buyers after
the offers have been made.) We shall focus on the first and argue that here it makes no
difference in the analysis of stationary equilibrium. A seller can accept at most one offer
she receives. Acceptances or rejections are simultaneous. Once an offer is accepted, the
trade is concluded and the trading pair leave the game. Leaving the game is publicly
observable. The remaining players proceed to the next period in which buyers again make
price offers to the sellers. As is standard in these games, time elapses between rejections
and new offers.
The analogue of directed search, public offers that are not targeted to specific individ-
uals, is discussed in the extensions section.
We will not formally write out strategies, since this is a standard multi-stage game with
observable actions [24]. The main difference between the two extensive forms discussed in
the previous subsection is that, in public targeted offers, a seller’s response (and subsequent
actions by all players) can condition on the history of offers made to the other seller, in
addition to those she receives herself. In private offers, the only public history in each
period is the set of players remaining in the game. Each player has private histories as
well. Our equilibrium notions here will be standard, subgame perfect equilibrium for the
public targeted offers case and public perfect equilibrium for the second (to avoid having
to consider and specify a player’s beliefs about past and present offers that are not in his
or her private history).3
3.2.2 Equilibrium in the basic model
Stationary equilibria
We consider stationary equilibria, that is, equilibria in which buyers when making offers
condition only on the set of players remaining in the game and the sellers, when responding,
condition on the set of players remaining and the offers made by the buyers. Clearly in the
private targeted offers model, the response of a seller can condition only on her own offer.
(We emphasise that this is not a restriction on strategies, only on the equilibria considered.)
These are therefore public perfect equilibria in the private offers game and particular sub-
game perfect equilibria in the targeted offers extensive form. We shall demonstrate that
the equilibrium outcome we find in this way is the unique stationary equilibrium outcome.
Footnote 3: See [6] for an example of the effect of such beliefs in a multilateral bargaining context.
We shall proceed in this subsection by showing that a candidate strategy profile does, in fact,
constitute an equilibrium. In the next subsection, we shall show that the stationary
equilibrium payoff vector is unique up to the choice of the buyer who makes an offer to both
sellers.
The conjectured equilibrium is as follows:
1. Consider a game in which only two players, buyer $B_i$ and seller $S_j$, remain in the
market and $w_j$ denotes the valuation/cost of $S_j$. Then it is clear that (i) $B_i$ offers $w_j$ and
(ii) $S_j$ accepts any offer at least as high as $w_j$ and rejects otherwise.
2. Now consider the four-player game.4 We consider the following strategies:
(a) One of the buyers, $B_1$ say, makes offers to each seller with positive probability and
the other buyer $B_2$ makes an offer only to $S_M$. Let q be the probability with which $B_1$
offers to $S_H$. $B_1$ offers H to $S_H$. $B_1$ randomises his offer to $S_M$, using a distribution $F_1(\cdot)$
with support $[p_l, H]$, where $p_l$ is to be defined later. The distribution $F_1(\cdot)$ consists of
an absolutely continuous part from $p_l$ to H and a mass point at $p_l$. $B_2$ randomises by
offering M to $S_M$ (with probability q′) and, with the complementary probability, randomising
his offers in the range $[p_l, H]$ using an absolutely continuous distribution function $F_2$.
The distributions $F_i(\cdot)$ are explicitly calculated later.
(b) The sellers’ strategies in the four-player game are as follows. $S_H$ accepts the highest
offer greater than or equal to H and rejects if all offers are less than H. $S_M$ accepts the
highest offer with a payoff from accepting at least as large as the expected continuation
payoff from rejecting it (to be calculated later).
3. The expected payoff of a buyer $B_i$ in equilibrium is v − H. The expected payoff of
$S_H$ is 0 and that of $S_M$ is positive and is considered below.
Footnote 4: Note that, since we start with the same number of players on both sides of the market and since
players can leave only in pairs, any possible subgames will also have the same number of buyers and sellers.
Lemma 11 Suppose there exists a $p_l$ such that
$$p_l - M = \delta\,(E(y) - M),$$
where y (a random variable) represents the maximum price offer to $S_M$ under the proposed
strategies. Then the strategies in 1, 2 above constitute an equilibrium with
(i)
$$F_1(s) = \frac{(v-H)(1-\delta(1-q)) - q(v-s)}{(1-q)\,[(v-s) - \delta(v-H)]} \qquad (3.1)$$
(ii)
$$F_2(s) = \frac{(v-H)(1-\delta(1-q')) - q'(v-s)}{(1-q')\,[(v-s) - \delta(v-H)]} \qquad (3.2)$$
(iii)
$$q = \frac{(v-H)(1-\delta)}{(v-M) - \delta(v-H)} \qquad (3.3)$$
(iv)
$$q' = \frac{(v-H)(1-\delta)}{(v-p_l) - \delta(v-H)} \qquad (3.4)$$
Proof.
Since the proof is long, we relegate it to Appendix A.5.
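The properties claimed for the mixing distributions can be checked directly from (3.1) to (3.4). The following is a minimal sketch with hypothetical valuations and a hypothetical value of $p_l$ taken as given (not the equilibrium value); the function names are illustrative only.

```python
def q_prob(v, H, M, delta):
    """Probability (3.3) with which B1 offers to S_H."""
    return (v - H) * (1.0 - delta) / ((v - M) - delta * (v - H))


def q_prime_prob(v, H, p_l, delta):
    """Probability (3.4) with which B2 offers M to S_M."""
    return (v - H) * (1.0 - delta) / ((v - p_l) - delta * (v - H))


def offer_cdf(s, mix, v, H, delta):
    """Common functional form of (3.1) and (3.2); `mix` is q for F1 and q' for F2."""
    numerator = (v - H) * (1.0 - delta * (1.0 - mix)) - mix * (v - s)
    denominator = (1.0 - mix) * ((v - s) - delta * (v - H))
    return numerator / denominator


# Hypothetical values with v > H > p_l > M > 0; p_l is simply assumed here.
v, H, M, delta, p_l = 1.0, 0.6, 0.2, 0.9, 0.5
q = q_prob(v, H, M, delta)
q_prime = q_prime_prob(v, H, p_l, delta)

print("q, q':", q, q_prime)
print("F1(H), F2(H):", offer_cdf(H, q, v, H, delta), offer_cdf(H, q_prime, v, H, delta))
print("F1(p_l), F2(p_l):", offer_cdf(p_l, q, v, H, delta), offer_cdf(p_l, q_prime, v, H, delta))
```

Under these assumptions both distributions reach one at H, $F_2$ starts from zero at $p_l$ (no mass point), and $F_1$ is strictly positive at $p_l$, which is the mass point described in the conjectured strategies.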
Lemma 12 There exists a unique $p_l \in (M, H)$ such that
$$p_l - M = \delta\,(E(y) - M)$$
where E(y) is the same as defined before.
Proof. For any x ∈ (M, H) let $F^x_1(.)$, $F^x_2(.)$, $q^x$, $q'^x$ and $E^x(y)$ be the expressions obtained
from $F_1(.)$, $F_2(.)$, q, q′ and E(y) respectively by replacing $p_l$ by x. Thus all we need to
show is that there exists a unique $x^* \in (M, H)$ such that
$$x^* - M = \delta\,(E^{x^*}(y) - M)$$
We have
$$E^x(y) = q^x\big[q'^x M + (1-q'^x)E^x_2(p)\big] + (1-q^x)\big[q'^x E^x_1(p) + (1-q'^x)E(\text{highest offer})\big]$$
where $E^x_i(p)$ is derived from $F^x_i(.)$ (i = 1, 2) and is the expected price offer by the
buyer $B_i$ when his offers are in the range [x, H].
The following lemma shows that as x increases by 1 unit, the increase in $E^x(y)$ is less
than 1 unit.
Lemma 13
$$\frac{\partial E^x(y)}{\partial x} < 1$$
Proof. See Appendix A.6 for the proof of the lemma.
Now we define the function G(.) as
$$G(x) = x - [\delta E^x(y) + (1-\delta)M]$$
Differentiating G(.) w.r.t. x we get
$$G'(x) = 1 - \delta\,\frac{\partial E^x(y)}{\partial x}$$
From Lemma 13 we have
$$G'(x) > 0$$
From the equilibrium strategies we know that $M < E^x(y) < H$ for any x ∈ (M, H). Since
δ ∈ (0, 1) we have
$$\lim_{x\to M} G(x) < 0 \quad\text{and}\quad \lim_{x\to H} G(x) > 0$$
Since G(.) is a continuous and monotonically increasing function, using the Intermediate
Value Theorem we can say that there exists a unique $x^* \in (M, H)$ such that
$$G(x^*) = 0 \Rightarrow x^* = \delta E^{x^*}(y) + (1-\delta)M$$
This $x^*$ is our required $p_l$.
Thus we have
$$G(p_l) = 0 \Rightarrow p_l = (1-\delta)M + \delta E(y)$$
Proposition 12 There exists a unique $p_l \in (M, H)$ such that the strategies described above
constitute a subgame perfect equilibrium and
$$p_l = (1-\delta)M + \delta E(y)$$
Proof. The proof directly follows from Lemma 11 and Lemma 12.
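The fixed-point argument can be illustrated by locating the root of G numerically. The sketch below rests on assumptions of ours that the text does not spell out: the two buyers randomise independently, so that "E(highest offer)" is the mean of the maximum of two draws with distribution $F^x_1 F^x_2$, and expectations on [x, H] are computed as $H - \int_x^H F(s)\,ds$ with a simple trapezoid rule. Valuations, discount factor and function names are hypothetical.

```python
def offer_cdf(s, mix, v, H, delta):
    """Functional form shared by (3.1) and (3.2); `mix` is q or q'."""
    num = (v - H) * (1.0 - delta * (1.0 - mix)) - mix * (v - s)
    return num / ((1.0 - mix) * ((v - s) - delta * (v - H)))


def trapezoid(f, a, b, n=1000):
    """Composite trapezoid rule for the integral of f on [a, b]."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


def expected_max_offer(x, v, H, M, delta):
    """E^x(y): expected maximum offer to S_M when p_l is replaced by x."""
    q = (v - H) * (1.0 - delta) / ((v - M) - delta * (v - H))
    qp = (v - H) * (1.0 - delta) / ((v - x) - delta * (v - H))
    f1 = lambda s: offer_cdf(s, q, v, H, delta)
    f2 = lambda s: offer_cdf(s, qp, v, H, delta)
    e1 = H - trapezoid(f1, x, H)                       # mean of B1's offer to S_M
    e2 = H - trapezoid(f2, x, H)                       # mean of B2's offer above x
    e_max = H - trapezoid(lambda s: f1(s) * f2(s), x, H)  # mean of the higher offer
    return q * (qp * M + (1.0 - qp) * e2) + (1.0 - q) * (qp * e1 + (1.0 - qp) * e_max)


def G(x, v, H, M, delta):
    """G(x) = x - [delta * E^x(y) + (1 - delta) * M]."""
    return x - (delta * expected_max_offer(x, v, H, M, delta) + (1.0 - delta) * M)


def solve_p_l(v, H, M, delta, tol=1e-12):
    """Bisection on (M, H); relies on G < 0 near M and G > 0 near H."""
    lo = M + 1e-6 * (H - M)
    hi = H - 1e-6 * (H - M)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if G(mid, v, H, M, delta) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical valuations and discount factor.
v, H, M, delta = 1.0, 0.6, 0.2, 0.9
p_l = solve_p_l(v, H, M, delta)
print(p_l, G(p_l, v, H, M, delta))
```

Bisection only needs the sign change of G at the two ends of (M, H); uniqueness of the root is what Lemma 13 adds, so the sketch illustrates the proposition rather than proving it.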
Uniqueness of the stationary equilibrium outcome
In this section we will show that the outcome derived above is the unique stationary
equilibrium outcome in this game, so that the expected payoff to each of the buyers is
v−H.⁵ By outcome we mean the vector of payoffs obtained by the buyers and sellers. We
will adopt the methodology of Shaked and Sutton [57].
⁵ In fact there is another stationary equilibrium where B2 offers to both the sellers with positive probability
and B1 to SM only. The qualitative nature will be the same and the buyer with valuation vi obtains a payoff of vi −H.
This does not necessarily mean that the price is H. However, we shall show this is true asymptotically, as δ → 1.
Let M∗ and m∗ be the maximum and the minimum payoffs⁶ obtained by a buyer in
any stationary equilibrium of the complete information game. Also let ΛH and ΛM be the
maximal stationary equilibrium payoffs for sellers SH and SM respectively.
Lemma 14 In any stationary equilibrium, when all four players are present, both buyers
cannot make offers to both sellers with positive probability.
Proof. In a stationary equilibrium when both the buyers are offering to both the sellers,
each buyer should randomise its offer while offering to any of the sellers. Given the buyers’
behavior, each seller accepts an offer (or the maximum of the received offers) if and only
if the payoff from acceptance is at least as large as the discounted continuation payoff
from rejection. This implies that in a stationary equilibrium we need not worry about the
deviations by the sellers.
Let $\bar{s}^M_i$ be the upper bound of the support of offers to SM from the buyer Bi, i = 1, 2.
Let $\bar{s}^H_i$ be the upper bound of the support of offers to SH from the buyer Bi, i = 1, 2.
If $\bar{s}^H_1 \neq \bar{s}^H_2$ then the buyer having a higher upper bound (say B1) can profitably deviate
by offering $(\bar{s}^H_1 - \varepsilon)$ to SH, where ε > 0 and $\bar{s}^H_1 - \varepsilon > \bar{s}^H_2$.
Thus,
\[ \bar{s}^H_1 = \bar{s}^H_2 = \bar{s}^H. \]
By similar reasoning we can say that
\[ \bar{s}^M_1 = \bar{s}^M_2 = \bar{s}^M. \]
Next we argue that we must have $\bar{s}^H = \bar{s}^M$. Suppose not. W.L.O.G. let
$\bar{s}^H > \bar{s}^M$. In this case one of the buyers can profitably deviate by offering p to SM such
that $\bar{s}^H > p > \bar{s}^M$. Thus we have
\[ \bar{s}^H = \bar{s}^M = \bar{s}. \]
⁶ We assume (without needing to) that the supremal and infimal payoffs are actually achieved.
Let q2 be the probability with which B2 offers to SH. Let $F^M_2(\cdot)$ and $F^H_2(\cdot)$ be the conditional
distributions of offers by B2 given that he makes offers to SM and SH respectively.
Take $s \in [\underline{s}^M_1, \bar{s}] \cap [\underline{s}^H_1, \bar{s}]$. B1’s indifference relation tells us that
\[ (v-s)\,[q_2 + (1-q_2)F^M_2(s)] + (1-q_2)(1-F^M_2(s))\,\delta(v-H) = (v-s)\,[(1-q_2) + q_2F^H_2(s)] + q_2(1-F^H_2(s))\,\delta(v-M). \]
Since δ(v −M) ≠ δ(v −H), $(1-q_2)(1-F^M_2(s)) \neq q_2(1-F^H_2(s))$. W.L.O.G. we take,
(1− q2)(1− FM2 (s)) > q2(1− FH2 (s))
⇒ (1− q2)(1− FM2 (s)) > q2(1− FH2 (s))
The above inequality suggests that B2 puts a mass point at the upper bound of one of the
supports. If not then both (1 − q2)(1 − FM2 (s)) and q2(1 − FH2 (s)) are 0 and the above
inequality is not satisfied. This implies that B1 can profitably deviate.
Lemma 15 In any stationary equilibrium, when all four players are present, both buyers
cannot offer to SH with positive probability.
Proof. Clearly both offering to SH only is not possible in equilibrium. Similarly one of the
buyers offering to SH only and the other one making offers to both the sellers with positive
probability is not possible. In that case the buyer who is offering to both can profitably
deviate by offering M to SM . Thus if both are offering to SH it must be the case that
both are making offers to both the sellers with positive probability. From lemma 14 we
know that this is not possible in a stationary equilibrium. This concludes the proof.
Lemma 16 ΛH = 0
Proof. Suppose not. That is let it be the case that in a particular stationary equilibrium
SH obtains a strictly positive payoff (ΛH > 0). From Lemma 14 and Lemma 15 we know
that a single buyer is making this offer to SH . Since ΛH > 0, this buyer is offering xH
(where xH ≥ H + ΛH) with positive probability and his payoff is less than or equal to
v − xH .
Suppose this buyer deviates and makes an offer of x′H such that,
x′H = H + εΛH
where 0 < δ < ε < 1.
This offer will always be accepted by SH , irrespective of what the other seller’s strategy
is. This is because if she rejects this offer then next period she can at most obtain a payoff
of ΛH which is worth δΛH now. However by accepting this offer she gets εΛH > δΛH .
Since,
xH − x′H ≥ H + ΛH −H − εΛH
= ΛH(1− ε) > 0,
this deviation is profitable for the buyer. Thus we must have ΛH = 0 . This also tells
us that in a stationary equilibrium SH never gets an offer greater than H with positive
probability.
Lemma 17 In a stationary equilibrium, SM cannot get an offer greater than H with pos-
itive probability.
Proof. Suppose SM gets an offer H + ∆, ∆ > 0, with positive probability. From Lemma
16 we know that SH never gets an offer greater than H in equilibrium. Thus the buyer
making the above offer to SM can profitably deviate by offering H + λ∆ (0 < λ < 1) to
SH. Thus in equilibrium SM cannot get an offer greater than H with positive probability.
Lemma 18
m∗ ≥ v −H for i = 1, 2
Proof. From Lemma 16 and Lemma 17 we can posit that none of the sellers gets any offer
greater than H with positive probability. Thus in a stationary equilibrium buyers’ offers
are always in the interval [M,H]. Hence m∗ is bounded below by v −H. Thus,
m∗ ≥ v −H
Lemma 19
M∗ ≤ v −H for i = 1, 2
Proof. Suppose there exists a stationary equilibrium such that Bi obtains a payoff of M∗
such that M∗ > v −H.
(i) Consider the situation when the buyers play pure strategies. It must be true that
the offer made by Bi is accepted. Let p∗ be the equilibrium price offer by Bi. Since,
M∗ = v − p∗ > v −H
we have,
p∗ < H
This implies that this offer is accepted by seller SM .
Thus either Bj (j ≠ i) is offering to SH or it is offering a price lower than p∗ to SM.
In both cases Bj can profitably deviate by offering a price p to SM such that p∗ < p < H.
Hence it is not possible for Bi to obtain a payoff of M∗ > v − H in a stationary
equilibrium when both the buyers play pure strategies.
(ii) Suppose at least one of the buyers plays a non-degenerate mixed strategy. It is
easy to note that Bi cannot obtain a payoff of M∗ > v−H if he offers to SH with positive
probability. Thus we only need to consider the situations when Bi is offering to SM only.
Suppose both B1 and B2 are offering to SM only. There does not exist a stationary
equilibrium where one of the buyers plays a pure strategy. Thus both B1 and B2 play
mixed strategies. It is trivial to check that in equilibrium the supports of their offers have
to be the same. Let $[\underline{s}, \bar{s}]$ be the common support of their offers, where $\underline{s} \geq M$. Since Bi
obtains a payoff higher than v−H we must have $\bar{s} < H$. Let Fj(.) be the distribution⁷ of
offers by Bj where j = 1, 2 and j ≠ i. Thus for any $s \in [\underline{s}, \bar{s}]$ we have
\[ (v-s)\,F_j(s) + (1-F_j(s))\,\delta(v-H) = M^* \]
\[ \Rightarrow\; F_j(s) = \frac{M^* - \delta(v-H)}{(v-s)-\delta(v-H)}. \]
Since $F_j(\underline{s})$ is always positive, Bj puts a mass point at $\underline{s}$. From lemma 18 we know that
m∗ ≥ v − H. Thus by applying similar reasoning we can show that Bi also puts a mass
point at $\underline{s}$.
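We will show that Bi can profitably deviate. Suppose Bi shifts the mass from $\underline{s}$ to $\underline{s}+\varepsilon$
where ε > 0 and ε is small enough. The change in payoff of Bi is given by
\[ \Delta_\varepsilon = F_j(\underline{s}+\varepsilon)\,(v-(\underline{s}+\varepsilon)) - \frac{F_j(\underline{s})}{2}\,(v-\underline{s}). \tag{3.5} \]
We will show that for small values of ε the above change in payoff is positive. For ε > 0,
from (3.5) we have
\[ \Delta_\varepsilon = [F_j(\underline{s}) + \varepsilon F'_j(x)]\,(v-(\underline{s}+\varepsilon)) - \frac{F_j(\underline{s})}{2}\,(v-\underline{s}), \]
where $x \in (\underline{s}, \underline{s}+\varepsilon)$.
This implies
\[ \Delta_\varepsilon = F_j(\underline{s})(v-\underline{s}) + \varepsilon F'_j(x)(v-\underline{s}) - \varepsilon F_j(\underline{s}) - \varepsilon^2 F'_j(x) - \frac{F_j(\underline{s})}{2}\,(v-\underline{s}) \]
\[ = F_j(\underline{s})\Big(\frac{v-\underline{s}}{2} - \varepsilon\Big) + \varepsilon F'_j(x)(v-\underline{s}) - \varepsilon^2 F'_j(x). \]
For ε small enough we have $\varepsilon^2 F'_j(x) \approx 0$.
Thus
\[ \Delta_\varepsilon = F_j(\underline{s})\Big(\frac{v-\underline{s}}{2} - \varepsilon\Big) + \varepsilon F'_j(x)(v-\underline{s}) > 0. \]
This shows that Bi has a profitable deviation.
⁷ We assume that Fj(.) is differentiable.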
Next, consider the case when Bi offers to SM and Bj offers to SH . If Bi is playing a
pure strategy then his offer must be less than H. If Bi is playing a mixed strategy then the
upper bound of the support must be less than H. In both cases Bj can profitably deviate.
Lastly, consider the case when Bi is offering to SM and Bj is offering to both the
sellers. If Bi obtains a payoff of M∗ > v −H then the upper bound of the support of his
offers must be less than H. Since the other buyer is offering to SH , his payoff is bounded
above by v −H. This implies that Bj can profitably deviate.
Hence from the above arguments we can infer that,
M∗ ≤ v −H (3.6)
Proposition 13 The outcome implied by the asymmetric equilibrium of Proposition 12 is
the unique stationary equilibrium outcome of the basic game.
Proof. From Lemma 18 and Lemma 19 we have,
M∗ ≤ v −H ≤ m∗ (3.7)
By construction we have,
m∗ ≤M∗
This implies that,
M∗ = v −H = m∗
This concludes the proof.
We conclude the discussion on uniqueness by noting that in proving the stationary
equilibrium outcome to be unique we have never used the fact that each seller, while
responding, observes the other seller’s offer. Thus the same analysis holds in the
private offers model. Hence the outcome implied by the stationary equilibrium⁸ of the targeted
offers model is the unique public perfect equilibrium outcome of the basic complete
information game with private targeted offers.
Asymptotic characterisation
We now determine the limiting equilibrium outcome when the discount factor δ → 1.
From (3.3) we know that the probability with which the buyer B1 offers to SH is given
by,
\[ q = \frac{(v-H)(1-\delta)}{(v-M)-\delta(v-H)} \tag{3.8} \]
From (3.8) it is clear that as δ → 1, q → 0.
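The limit can be illustrated numerically. The short Python sketch below simply evaluates the expression in (3.8) along a sequence of discount factors; the parameter values v, H and M are assumptions chosen for illustration and satisfy v > H > M.

```python
def prob_offer_to_SH(v, H, M, delta):
    """Probability q with which B1 offers to S_H, as given in (3.8)."""
    return (v - H) * (1 - delta) / ((v - M) - delta * (v - H))


# Assumed, purely illustrative parameters with v > H > M.
v, H, M = 3.0, 2.0, 1.0
deltas = (0.5, 0.9, 0.99, 0.999)
qs = [prob_offer_to_SH(v, H, M, d) for d in deltas]

for d, q in zip(deltas, qs):
    print(f"delta = {d}: q = {q:.6f}")
```

The numerator vanishes at rate (1 − δ) while the denominator tends to H − M > 0, which is why the computed values shrink towards zero.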
From section 3.2.2 recall the equation,
G(x) = x− [δEx(y) + (1− δ)M ]
Since the fixed point x∗ is a function of δ, we denote it by x∗(δ).
Lemma 20 There exists a δ∗ ∈ (0, 1) such that for any δ ∈ (δ∗, 1), the fixed point x∗(δ)
is bounded above by δH.
Proof. We know that for any δ ∈ (0, 1), $\lim_{x \to H} G(x) > 0$.
Since the function G(x) is continuous and monotonically increasing in x, there exists
a δ∗ ∈ (0, 1) such that G(δH) > 0 for all δ ∈ (δ∗, 1). Thus for any δ ∈ (δ∗, 1), the fixed
point x∗(δ) is bounded above by δH.
⁸ This is the same as the one described for the public targeted offers model.
Lemma 21 As δ → 1, q′ → 0.
Proof. We have
\[ q' = \frac{(v-H)(1-\delta)}{(v-p_l)-\delta(v-H)} = \frac{1}{\dfrac{v}{v-H} + \dfrac{\delta H - p_l}{(1-\delta)(v-H)}}, \]
where pl = x∗(δ).
From Lemma 20 we have δH − pl > 0. Thus we have
q′ → 0 as δ → 1
Proposition 14 As δ → 1, pl → H.
Proof. The offers from B2 to SM in the range [pl, H] follow the distribution function
\[ F_2(s) = \frac{(v-H)\,[1-\delta(1-q')] - q'(v-s)}{(1-q')\,[v-s-\delta(v-H)]} \]
\[ \Rightarrow\; 1 - F_2(s) = \frac{H-s}{(1-q')\,[v-s-\delta(v-H)]}. \]
Note that
\[ 1 - F_2(H) = 0. \]
From Lemma 21 we know that as δ → 1, q′ → 0. Thus as δ → 1, for s arbitrarily close
to H we have
\[ 1 - F_2(s) \approx \frac{H-s}{H-s} = 1. \]
Thus the support of the distribution F2 collapses. This implies that as δ → 1, pl → H.
This shows that as agents become patient enough, the unique stationary equilibrium
outcome of the basic complete information game implies that in presence of all players
both the buyers almost surely offer H to seller SM . Hence although trading takes place
through decentralised bilateral interactions, asymptotically we get a uniform price for a
non-differentiated good.
Stationary equilibrium for ex ante public offers/directed search
We intend to find a stationary equilibrium of this (modified) extensive form. The qualitative
nature of the equilibrium, analogous to the one we have studied before, is as follows. One
of the buyers B1 randomises between posting a price of H and posting something less than
H. He randomises his prices if offering less than H. The other buyer B2’s posted price is
randomised along a support whose upper bound is H.
In order to describe the candidate equilibrium, we note that the two player game (one
buyer-one seller) is identical to that in the targeted offers model. We consider only the
four-player game. Consider the following strategies:
(a) One of the buyers, B1 say, puts a mass of q at H and a continuous distribution of
offers, (1−q)F1(.) from pl to H, where pl will be defined later. The conditional distribution
F1(.) consists of an absolutely continuous part from pl to H and a mass point at pl. B2,
on the other hand randomises his posts by putting a mass point at p′l and an absolutely
continuous part F2(.) from pl to H, with p′l < pl. The price p′l is defined as
\[ p'_l = \frac{M+H}{2}. \tag{3.9} \]
The distributions Fi(.) will be explicitly calculated.
(b) The sellers’ strategies in the four-player game are as follows:
Suppose p1 and p2 are the posted prices such that M ≤ p1 ≤ p2. If p2 ≥ H then
SM accepts p1 (p2) if $p_1 \geq \frac{M+p_2}{2}$ ($p_1 < \frac{M+p_2}{2}$). If p2 < H then SM accepts p2 only if the
payoff from accepting it is at least as large as the continuation payoff from rejecting it. SH
accepts p2 provided p2 ≥ H.
2. The expected payoff of a buyer i in equilibrium is v−H. The expected payoff of SH
is zero and that of SM is positive.
Lemma 22 Suppose there exists pl ∈ (p′l, H) such that
\[ p_l - M = \delta\,(E(y) - M), \]
where y (a random variable) represents the highest price offer ≤ H under the proposed
strategies. Then the proposed strategies constitute an equilibrium with
(i)
\[ F_1(s) = \frac{(v-H)\,[1-\delta(1-q)] - q(v-s)}{(1-q)\,[(v-s)-\delta(v-H)]} \]
(ii)
\[ F_2(s) = \frac{(v-H)\,[1-\delta(1-q')] - q'(v-s)}{(1-q')\,[(v-s)-\delta(v-H)]} \]
(iii)
\[ q = \frac{(v-H)(1-\delta)}{(v-p'_l)-\delta(v-H)} \]
(iv)
\[ q' = \frac{(v-H)(1-\delta)}{(v-p_l)-\delta(v-H)} \]
Proof. The proof is identical to the proof of lemma 11, if we replace M by p′l.
In the next lemma we will show that for sufficiently high values of δ there exists a
unique pl in the open interval (p′l, H).
Lemma 23 There exists a δ∗ ∈ (0, 1) such that for all δ > δ∗, there exists a unique
pl ∈ (p′l, H) that satisfies,
pl = δE(y) + (1− δ)M
Proof. Refer to appendix A.7
Asymptotic characterisation for ex ante public offers/directed search
In the public offers model, as δ → 1, pl → H. Thus as agents become patient enough
we get a uniform price for the non-differentiated goods. Since the proof of this is almost
identical to the proof of Proposition 14 we omit it.
Note that the different versions of the extensive form give similar equilibria and the
same asymptotic result, provided offers are one-sided.
3.2.3 Adding a seller
We now consider the effect of adding a seller to the basic complete information model.
(i) Suppose the three sellers have different valuations, i.e H, M and L with,
v > H > M > L
In this case the seller with valuation H will be irrelevant. This is because we have
already described an equilibrium with 2 buyers and 2 sellers (sellers having different valu-
ations) in which each buyer is guaranteed a payoff of v−M . Since SH will not accept any
price lower than H, buyers will simply ignore SH . Hence in this case the unique stationary
equilibrium outcome will be the same as in the 2 buyers, 2 sellers case.
(ii) If two of the sellers have valuation M and one of them has valuation H, where
M < H, then it is easy to see that each buyer offering M to each of the sellers SM
constitutes an equilibrium. In this case each of the buyers gets a payoff of v−M . Intuitively,
it seems that this gives the unique equilibrium payoff.
(iii) Lastly consider the case when two of the sellers have valuation H and one has
valuation M . In this situation the stationary equilibrium of the 2 buyer, 2 sellers case
will be applicable. We can assume that one of the H sellers is randomly chosen at the
beginning of the game.
3.2.4 Heterogeneous buyers
Suppose, in the basic model, buyers too are heterogeneous. That is, buyer Bi has a
valuation of vi where,
v1 > v2 > H > M
Analysis of the basic model holds good.⁹
We conclude this subsection by providing an example to show that even if there is
potential of trade for both the sellers, such trades need not take place in the equilibrium
of our model. Suppose there are two buyers with valuation v1 and v2 and two sellers with
valuations H and M such that
M < v2 < H < v1
In equilibrium, both the buyers offer v2 to the seller with valuation M and the trade
takes place between the M -seller and the v1-buyer. (If, in equilibrium, the v2 buyer were
concluding the trade with positive probability, the v1 buyer would offer ε > 0 more and
have a profitable deviation.) Note that, in this case, any price between v2 and H would be
a competitive equilibrium in which the demand and supply would equate.
3.2.5 Adding a buyer
This analysis has been done in the generalised section for homogeneous buyers.
⁹ The generalisations in ensuing sections are in terms of homogeneous buyers. Heterogeneity in buyer
valuations can be accommodated in the section on n buyers and n sellers. We have not been able to
incorporate heterogeneity in the case of more buyers than sellers.
3.2.6 Generalisation 1: n buyers and n sellers
Players and payoffs
There are n buyers (n > 2 and n finite) and n sellers. Each buyer’s maximum willingness
to pay for a unit of an indivisible good is v. Each of the sellers owns one unit. Sellers differ
in their valuations. We denote seller Sj ’s valuation (j = 1, ..., n) by uj where,
v > un > un−1 > ... > u2 > u1
The above inequality implies that any buyer has a positive benefit from trade with any
of the sellers. All players are risk neutral. Hence the expected payoffs obtained by the
players in any outcome of the game are identical to that in the basic model.
The extensive form
This is identical to the one in the basic complete information game. We first consider the
infinite horizon, public and targeted offers game where the buyers simultaneously make
offers and each seller either accepts or rejects an offer directed towards her. Matched pairs
leave the game and the remaining players continue the bargaining game with the same
protocol.
Equilibrium
We seek, as usual, to find a stationary equilibrium. Thus buyers’ offers at a particular
time point depend only on the set of players remaining and the sellers’ responses depend
on the set of players remaining and the offers made by the buyers. Since we start out with
equal numbers of buyers and sellers, any possible subgame will have that. Depending on
the parametric values we can have three types of equilibria. However, as δ becomes greater
than a threshold value, there is only one type of characterisation.
First, for our notational convenience, we re-label u1 = L and un = H. From the basic
complete information game, for each i = 1, ..., n− 1, we calculate pi such that,
\[ p_i = (1-\delta)u_i + \delta E(y_i), \tag{3.10} \]
where E(yi) is defined as the equilibrium expected maximum price offer which Si gets in
the four-player game with Si and Sn as the sellers and two buyers with valuation v.¹⁰
For each i = 1, ..., n− 1 we define qi as
\[ q_i = \frac{H-p_i}{(v-p_i)-\delta(v-H)} \tag{3.11} \]
and qH as
\[ q_H = \frac{(v-H)(1-\delta)}{(v-L)-\delta(v-H)}. \tag{3.12} \]
Let $P = \sum_{i=1,\dots,n-1} q_i$. The following three propositions fully characterise the equilibrium
behavior in the present game.¹¹ In all of them, sellers’ strategies are as follows: (i) Sn
accepts any offer greater than or equal to H. (ii) Seller Si (i = 1, .., n − 1) accepts the
highest offer with a payoff from accepting at least as large as the expected continuation
payoff from rejecting it.
Proposition 15 If for δ ∈ (0, 1), P < 1 and 1 − P > qH , then the equilibrium is as
follows:
(i) Buyer B1 makes offers to S1 only. B1 puts a mass of q′1 at L and has a continuous
distribution of offers F1(.) with [p1, H] as the support. Bn makes offers to S1 with probability
$\bar{q}_1$. He randomises his offers to S1 with a probability distribution $F^1_n(\cdot)$ with [p1, H] as the
support. $F^1_n(\cdot)$ puts a mass point at p1 and has an absolutely continuous part from p1 to
H. The distributions F1(.), $F^1_n(\cdot)$, $\bar{q}_1$ and q′1 are given by:
\[ F_1(s) = \frac{(v-H)\,[1-\delta(1-q'_1)] - q'_1(v-s)}{(1-q'_1)\,[(v-s)-\delta(v-H)]} \tag{3.13} \]
\[ F^1_n(s) = \frac{(v-H)\,[1-\delta \bar{q}_1] - (1-\bar{q}_1)(v-s)}{\bar{q}_1\,[(v-s)-\delta(v-H)]} \tag{3.14} \]
\[ q'_1 = \frac{(v-H)(1-\delta)}{(v-p_1)-\delta(v-H)} \tag{3.15} \]
\[ \bar{q}_1 = q_1 + (1 - P - q_H) \tag{3.16} \]
¹⁰ Note that pi is given by the equilibrium of the appropriate four-player game, which has already been
described earlier. It can essentially be treated as an exogenously given function of the parameters of the
problem for the purposes of the n-player analysis.
¹¹ Note that all quantities used in these propositions are defined with respect to the exogenously given
parameters of the game.
(ii) For i = 2, ..., n − 1, Bi makes offers to Si only. Bi’s offers to Si are randomised
with a distribution Fi(s). Fi(.) puts a mass point at pi and has an absolutely continuous
part from pi to H. Bn makes offers to Si (i = 2, .., n − 1) with probability $\bar{q}_i = q_i$. Bn’s
offers to Si are randomised by an absolutely continuous probability distribution $F^i_n$ with
[pi, H] as the support. For i = 2, .., n− 1, Fi(.) and $F^i_n(\cdot)$ are given by
\[ F_i(s) = \frac{(v-H)(1-\delta)}{(v-s)-\delta(v-H)} \tag{3.17} \]
\[ F^i_n(s) = \frac{(v-H)\,[1-\delta q_i] - (1-q_i)(v-s)}{q_i\,[(v-s)-\delta(v-H)]} \tag{3.18} \]
(iii) Bn offers to Sn with probability qH . He offers H to Sn.
(iv) In equilibrium, all buyers obtain an expected payoff of v −H.
Proof. Refer to appendix (A.8).
Proposition 16 If for a δ ∈ (0, 1), P < 1 and 1 − P < qH, then the equilibrium is as
follows:
(i) For i = 1, 2, ..., n−1, buyer Bi makes offers to Si only. Bi’s offers to Si are randomised
with a distribution Fi(s). Fi(.) puts a mass point at pi and has an absolutely continuous
part from pi to H. Bn makes offers to Si (i = 1, .., n − 1) with probability $\bar{q}_i = q_i$. Bn’s
offers to Si are randomised with an absolutely continuous probability distribution $F^i_n$ with [pi, H]
as the support. For i = 1, .., n− 1, Fi(.) and $F^i_n(\cdot)$ are given by
\[ F_i(s) = \frac{(v-H)(1-\delta)}{(v-s)-\delta(v-H)} \tag{3.19} \]
\[ F^i_n(s) = \frac{(v-H)\,[1-\delta q_i] - (1-q_i)(v-s)}{q_i\,[(v-s)-\delta(v-H)]} \tag{3.20} \]
(ii) Bn offers to Sn with probability qn = 1− P. He offers H to Sn.
(iii) In equilibrium, all buyers obtain an expected payoff of v −H.
Proof. Refer to appendix (A.9)
Proposition 17 If P ≥ 1, then the equilibrium is as follows:
For i = 1, .., n − 1, buyer Bi makes offers to seller Si only. Bi’s offers to Si are
randomised using a distribution function Fi(.), with [pi, p] as the support. The distribution
Fi(.) puts a mass point at pi and has an absolutely continuous part from pi to p. Buyer
Bn offers to all sellers except Sn. Bn’s offers to Si (i = 1, ..n− 1) are randomised with a
continuous probability distribution F in. The support of offers is [pi, p]. The probability with
which Bn offers to Si (i = 1, .., n − 1) is qi. If P = 1 then p = H. If P > 1 then p < H
and as δ → 1, p → H. In equilibrium, all buyers obtain an expected payoff of v − p. The
following relations formally define the equilibrium:
\[ F_i(s) = \frac{(v-p)-\delta(v-H)}{(v-s)-\delta(v-H)} \tag{3.21} \]
\[ F^i_n(s) = \frac{(v-H)\,[1-\delta q_i] - (1-q_i)(v-s)}{q_i\,[(v-s)-\delta(v-H)]} \tag{3.22} \]
\[ q_i = \frac{p-p_i}{(v-p_i)-\delta(v-H)} \tag{3.23} \]
Further if for δ = δ∗, P > 1 then for all δ > δ∗, P > 1 and p→ H as δ → 1.
Proof. Refer to appendix (A.10)
Proposition (17) tells us that as agents become patient enough, prices in all transactions
tend towards H.¹² The following observation can be made about the asymptotic result. For
δ high enough, the prices tend towards the valuation of the highest seller, independently
of the distributions of the valuations of the other sellers. Hence even if the distribution of
the valuations of the sellers Si (i = 1, ..n − 1) is heavily skewed towards L, the uniform
asymptotic price will still be H.
Whilst the formal proofs of the above propositions are relegated to the appendix, we
provide a verbal description of the nature of the stationary equilibrium as follows.
It can be observed that in all of the above stationary equilibria, each buyer other than
Bn is assigned to a seller to make offers to: buyer Bi is assigned to seller Si. The remaining buyer (Bn)
offers to all (or all but one) of the sellers. This creates some competition among the buyers,
since each seller (except Sn) gets two offers with positive probability. The probability qH is
the probability with which Bn should offer to Sn in equilibrium if B1 puts a mass point at
u1(= L). The quantity qi is the probability with which Bn should offer to Si in equilibrium,
if Bi puts a mass point at pi and Bn offers to all the sellers. Further, in any stationary
equilibrium, a buyer who is assigned to a seller Sj has to put a mass point either at uj or
at pj . Hence, for a given δ, if Bn has to make offers to all the sellers then it is necessary
to have P < 1. Further if 1 − P > qH , then it is possible to have the buyer B1 put a
mass point at L; the equilibrium is then described by proposition (15). Otherwise the
equilibrium is described by proposition (16). On the other hand if P ≥ 1 it is not possible
to have Bn offering to all the sellers in equilibrium. In that case he offers to all but the
highest valued seller. The equilibrium is then described by proposition (17). In the 2× 2
case, the conditions P < 1 and 1−P > qH are satisfied for all values of δ ∈ (0, 1). This is
because in the 2× 2 case $P = \frac{H-p_l}{(v-p_l)-\delta(v-H)}$, which is less than 1 for all values of δ ∈ (0, 1).
Further $1-P = \frac{(v-H)(1-\delta)}{(v-p_l)-\delta(v-H)} > q_H = \frac{(v-H)(1-\delta)}{(v-M)-\delta(v-H)}$, as pl > M. Hence the qualitative
nature of the equilibrium described in proposition (15) is identical to the one described in
the basic model. However for n > 2, the conditions satisfied by the 2 × 2 configuration
need not hold for all values of δ.
¹² We have seen earlier (in the 2 × 2 game analysis) that pi goes to H as δ → 1. In this proposition, we
show that p→ H as δ → 1. Thus the supports of the randomised strategies also collapse as δ → 1.
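For the 2 × 2 case, the expression for 1 − P quoted in the discussion can be checked by putting both terms over the common denominator. The step below is only a restatement of formulas already given, with pl the lower bound from Proposition 12:

```latex
\begin{align*}
1 - P
  &= 1 - \frac{H - p_l}{(v - p_l) - \delta (v - H)}
   = \frac{(v - p_l) - \delta (v - H) - (H - p_l)}{(v - p_l) - \delta (v - H)} \\
  &= \frac{(v - H)(1 - \delta)}{(v - p_l) - \delta (v - H)} .
\end{align*}
```

Since pl > M, the denominator is smaller than (v − M) − δ(v − H), which yields 1 − P > qH. The right-hand side also coincides with the expression for q′ in (3.4).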
In proposition (17), the highest valued seller does not get any offer when all the players
are present. Hence the continuation game faced by a seller from rejection is always the same
irrespective of whether she gets one offer or two offers. A seller knows that by rejecting
all the offer(s) she will face a four-player game with Sn as the other seller and two buyers
with valuation v. Thus the seller Si,(i = 1, .., n− 1) knows the continuation game for sure
and this does not require her to observe the offers received by other sellers or the seller to
whom buyer Bn is making his offer. Since for high values of δ, P ≥ 1, we have the following
corollary:
Corollary 2 With private offers, Proposition (17) describes the equilibrium of the game
for high values of δ.
Heterogeneous buyers: Suppose the buyers are heterogeneous such that,
vN > vN−1 > ... > v2 > v > H > uN−1 > ... > L
For each i = 1, .., n− 1, define
\[ p^h_i = (1-\delta)u_i + \delta E(y^h_i) \]
and
\[ q^h_i = \frac{H-p^h_i}{(v_i-p^h_i)-\delta(v^h_i-H)}, \]
where $E(y^h_i)$ is defined as the equilibrium expected maximum price offer that Si gets in
the four-player game with Si and Sn as the sellers and two buyers with valuation vi and
vn. As before, let $P_h = \sum_{i=1,\dots,n-1} q^h_i$. Define $q^h_H = \frac{(v_1-H)(1-\delta)}{(v_1-L)-\delta(v_1-H)} \equiv q_H$ as v1 = v.
Proposition 18 With heterogeneous buyer valuations, analogues to propositions 15, 16
and 17 hold good for Ph < 1 and 1 − Ph > qH , Ph < 1 and 1 − Ph < qH and Ph ≥ 1
respectively. For Ph < 1 and 1 − Ph > qH the lowest-valued buyer with valuation v offers
to S1. The specifics, however, are slightly different (see appendix (A.11)). Also with private
offers, proposition (17) describes the equilibrium for high values of δ.
Remark 1 We omit the formal proof of the results for heterogeneous buyers since this
is very similar to those of the previous propositions. Here, we explain why in the case of
Ph < 1 and 1−Ph > qH the lowest-valued buyer with valuation v offers to S1, rather than
one of the others.¹³ In equilibrium, the buyer who is making offers to S1 puts a mass point
at the reservation value of that seller (i.e. at L). Since the buyer is indifferent between
offering L to S1 and making randomised offers in the range [p1, H], the probability (qH)
with which the buyer Bn makes offers to Sn must just make B1 indifferent among the offers
in the support of his randomised strategy.¹⁴ This gives qH as below.
\[ (v-L)\,q_H + (1-q_H)\,\delta(v-H) = v-H \]
\[ \Rightarrow\; q_H = \frac{(v-H)(1-\delta)}{(v-L)-\delta(v-H)} \]
Buyer Bj (j ≠ 1; j ≠ n) makes randomised offers to the seller Sj with [pj, H] as the
support. First, it is easy to see that Bj cannot profitably deviate by making offers to Sk
(j ≠ k ≠ n) in the range [pk, H]. To ensure that the proposed strategies constitute an
equilibrium we need to show that this buyer with valuation vj (≠ v) has no incentive to
offer ui (or in the range (ui, pi)) to Si, i = 1, .., n− 1. First consider i = 2, .., n− 1. Since
offers are public,¹⁵ a seller with valuation ui will only accept an offer of ui (or something
in the range (ui, pi)) if the buyer Bn makes an offer to Sn. Hence, the payoff to the buyer
with valuation vj of making an offer of ui to Si is
\[ (v_j-u_i)\,q_H + (1-q_H)\,\delta(v_j-H). \]
¹³ This is a sufficient condition for the strategies described to be an equilibrium.
¹⁴ W.L.O.G. we assume that v1 = v.
¹⁵ Note that the equilibrium for private offers is described by a different proposition.
Define $q_{Hj}$ such that $(v_j-u_i)\,q_{Hj} + (1-q_{Hj})\,\delta(v_j-H) = v_j-H$. This implies
\[ q_{Hj} = \frac{(v_j-H)(1-\delta)}{(v_j-u_i)-\delta(v_j-H)}. \]
Since vj > v for all j ≠ 1 and ui > L for all i ≠ 1, we have
\[ q_{Hj} = \frac{(v_j-H)(1-\delta)}{(v_j-u_i)-\delta(v_j-H)} > \frac{(v-H)(1-\delta)}{(v-L)-\delta(v-H)} = q_H. \]
Since (vj − ui) > δ(vj −H), (vj − ui)qH + (1− qH)δ(vj −H) < (vj −H). The equilibrium
payoff to the buyer with valuation vj is (vj − H). This implies that the buyer has no
incentive to offer ui to seller Si. This also proves that for i = 1, the buyer Bj has no
incentive to offer L to S1. To see this note that $\frac{(v_j-H)(1-\delta)}{(v_j-L)-\delta(v_j-H)} > \frac{(v-H)(1-\delta)}{(v-L)-\delta(v-H)}$. Since B1
is also offering L to S1 with some positive probability the payoff to Bj by offering L to S1
is strictly less than (vj − L)qH + (1− qH)δ(vj −H) < vj −H. Hence Bj has no incentive
to offer anything in the range [ui, pi) to Si (i = 1, .., n− 1).
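The two comparisons in this remark can be checked numerically. The Python sketch below is illustrative only: the function names and all parameter values are assumptions, not part of the model's notation. It computes qH from the indifference condition of the lowest-valued buyer and then evaluates the payoff a higher-valued buyer would obtain from a low offer when Bn targets Sn with that same probability. For the offer of L to S1 the computed figure is an upper bound, since the remark notes that B1 also offers L with positive probability.

```python
def q_H(v, H, L, delta):
    """Probability with which B_n offers to S_n, from B_1's indifference condition."""
    return (v - H) * (1 - delta) / ((v - L) - delta * (v - H))


def payoff_from_low_offer(v_buyer, u_seller, H, delta, q):
    """Payoff from offering u_seller: the offer is accepted only if B_n targets S_n
    (probability q); otherwise the buyer gets the discounted continuation payoff."""
    return (v_buyer - u_seller) * q + (1 - q) * delta * (v_buyer - H)


# Assumed, purely illustrative parameters: v < v_j and L < u_i < H.
v, H, L, delta = 3.0, 2.0, 1.0, 0.9
v_j, u_i = 4.0, 1.5

q = q_H(v, H, L, delta)
lowest_buyer_payoff = payoff_from_low_offer(v, L, H, delta, q)      # indifference: v - H
higher_buyer_payoff = payoff_from_low_offer(v_j, u_i, H, delta, q)  # deviation to u_i
higher_buyer_bound_at_L = payoff_from_low_offer(v_j, L, H, delta, q)  # upper bound at L
```

In this example the lowest-valued buyer is exactly indifferent, while both deviation payoffs of the higher-valued buyer fall short of his equilibrium payoff vj − H.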
3.2.7 Generalisation 2: n buyers and n-1 sellers
Players and payoffs
We have n buyers and n − 1 sellers. The rest of the environment is as before with all
buyers having a common, known valuation v. Each of the sellers owns one unit of that
good. Sellers differ in their valuations. Seller Sj ( j = 1, .., n − 1) has a valuation of uj
such that,
v > un−1 > ... > u1
Thus any buyer has a positive benefit from trade with any seller. However, the number of
buyers is more than the number of sellers. Hence only n − 1 buyers can be served. The
payoffs of players are identical to that in the basic model.
The extensive form
The extensive form is same as in the basic complete information game. Thus at each time
point we have simultaneous public targeted offers from the buyers only. Sellers respond by
either accepting or rejecting the offer(s). Matched pairs leave the game and the players
remaining move on to the next period. They continue the bargaining game according to
the same protocol.
Equilibrium
We will derive a stationary equilibrium of this extensive form. Thus buyers’ offers at any
time point depend only on the set of players remaining and the sellers’ responses depend
only on the set of players remaining, and the offers. Before we describe the equilibrium
of this game formally we will verbally discuss its nature. In equilibrium, if all the players
are present, buyer Bi (i = 1, ..., n − 1) makes offers to Si only. His offers are randomised
using a distribution function Fi(.), with [pi, p] (pi and p will be defined later)
as the support. Fi(.) puts a mass point at pi and has an absolutely continuous part from
pi to p. Buyer Bn makes offers to all the sellers with positive probability. Bn’s offers to
Si (i = 1, .., n− 1) are randomised using a probability distribution $F^i_n(\cdot)$. The support of
offers is [pi, p].
For each i = 1, .., n− 1 we define pi as
\[ p_i = (1-\delta)u_i + \delta v. \tag{3.24} \]
Let qi be the probability with which Bn offers to seller Si. The following proposition
now formally defines the equilibrium of the game.
Proposition 19 (i) The above conjectured strategies constitute a stationary equilibrium
of the present game with,
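\[ F_i(s) = \frac{v-p}{v-s} \tag{3.25} \]
\[ F^i_n(s) = \frac{(v-p)-(1-q_i)(v-s)}{q_i\,(v-s)} \tag{3.26} \]
\[ q_i = \frac{p-p_i}{v-p_i} \tag{3.27} \]
\[ p = v - (n-2)\,\frac{\prod_{i=1,\dots,n-1}(v-p_i)}{\sum_{j=1,\dots,n-1}\big[\prod_{k=1,\dots,n-1;\,k\neq j}(v-p_k)\big]} \tag{3.28} \]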
(ii) In equilibrium, each buyer obtains an expected payoff of (v − p).
Proof. First consider the buyer Bi, (i = 1, .., n−1). For s ∈ [pi, p] his indifference relation
is,
\[ (v - s)\left[(1 - q_i) + q_i F^i_n(s)\right] = v - p \]
Solving the above relation for F^i_n(.) we get (3.26). Putting s = pi in Bi’s indifference
relation we obtain (3.27). It is easy to note that F^i_n(pi) = 0 and F^i_n(p) = 1.
Next, consider the buyer Bn. The support of his offers to Si (i = 1, .., n − 1) is [pi, p].
For s ∈ [pi, p], Bn’s indifference relation is given by
(v − s)[Fi(s)] = v − p
which gives us (3.25). Note that Fi(pi) > 0 and Fi(p) = 1. This confirms our conjecture
that Bi puts a mass point at pi.
To have consistency in the expressions obtained we must have,
\[ \sum_{i=1}^{n-1} q_i = 1 \;\Rightarrow\; \sum_{i=1}^{n-1} \frac{p - p_i}{v - p_i} = 1 \;\Rightarrow\; \sum_{i=1}^{n-1} \frac{v - p}{v - p_i} = n - 2 \]
Rearranging the terms in the above relation we get (3.28).
Now we should check that the strategies constitute an equilibrium. First, observe that
on the equilibrium path if a seller Si rejects her offer(s) then next period she will face a
game with two buyers and one seller. This will give her a discounted payoff of δ(v − ui).
Hence her minimum acceptable price should be pi. From the analysis of the basic model one
can infer that on the equilibrium path, there is no profitable deviation for the players. The
way we have specified sellers’ strategies these always constitute best responses in any off-
path contingency. It is easy to check that buyers’ strategies also constitute best responses
in any off-path contingency. This concludes the proof.
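The closed forms in Proposition 19 can be checked numerically. The following is a minimal Python sketch, not part of the formal argument. The parameter values (v, δ and the sellers' valuations ui) are hypothetical and chosen only for illustration, and the function names are ours. The variable `p_bar` stands for the upper bound p of the supports. The sketch computes pi from (3.24), p from (3.28) and qi from (3.27), so that one can confirm that the qi sum to one and that the distributions take the stated boundary values.

```python
from math import prod

def min_prices(v, delta, u):
    """p_i = (1 - delta) * u_i + delta * v, equation (3.24)."""
    return [(1 - delta) * ui + delta * v for ui in u]

def upper_price(v, p):
    """Upper bound p of the offer supports, equation (3.28)."""
    n = len(p) + 1                      # n buyers, n - 1 sellers
    gaps = [v - pi for pi in p]         # v - p_i for each seller
    denom = sum(prod(g for k, g in enumerate(gaps) if k != j)
                for j in range(len(gaps)))
    return v - (n - 2) * prod(gaps) / denom

def offer_probs(v, p, p_bar):
    """q_i = (p - p_i) / (v - p_i), equation (3.27)."""
    return [(p_bar - pi) / (v - pi) for pi in p]

def F_i(s, v, p_bar):
    """Offer distribution of buyer B_i, equation (3.25)."""
    return (v - p_bar) / (v - s)

def F_n(s, v, p_bar, q_i):
    """Offer distribution of B_n to seller S_i, equation (3.26)."""
    return ((v - p_bar) - (1 - q_i) * (v - s)) / (q_i * (v - s))

# Hypothetical parameters: 4 buyers and 3 sellers.
v, delta, u = 1.0, 0.9, [0.2, 0.4, 0.6]
p = min_prices(v, delta, u)
p_bar = upper_price(v, p)
q = offer_probs(v, p, p_bar)
```

With these illustrative numbers every qi lies strictly between 0 and 1, which requires the upper bound p to exceed each pi; for other parameter choices this should be checked rather than assumed.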
Remark 2 Note that irrespective of whether a seller gets one offer or two offers, the
continuation game faced by her from rejection is the same. Hence the result of Proposition 19
holds for the case of private offers as well.
The Asymptotic Characterisation
We would like to analyze the equilibrium outcome discussed above as agents become patient
enough, i.e., as δ → 1.
From (3.24) it is easy to observe that pi → v as δ → 1. Thus as δ → 1, (v − pi) → 0 for
i = 1, ..., n − 1. This implies that the second term in (3.28) goes to zero as δ tends to one.
Hence,
p→ v as δ → 1
This implies that the distributions of the price offers by each buyer collapse to a single
value in the limit.
Thus as δ → 1, we tend to get a uniform price of v for the non-differentiated good.
This is equivalent to the Walrasian outcome of the present setup.
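The limit argument can be illustrated numerically. The sketch below is a hedged illustration in Python with hypothetical values of v and ui; it uses the rearrangement of the consistency condition, v − p = (n − 2) / Σ 1/(v − pi), which is the step just before (3.28), and evaluates the upper bound p for discount factors approaching one.

```python
def upper_price(v, delta, u):
    """Upper bound p from (3.28), with p_i = (1 - delta) * u_i + delta * v."""
    gaps = [v - ((1 - delta) * ui + delta * v) for ui in u]   # v - p_i
    n = len(u) + 1                                            # number of buyers
    # (3.28) rewritten as v - p = (n - 2) / sum_i 1 / (v - p_i)
    return v - (n - 2) / sum(1.0 / g for g in gaps)

v, u = 1.0, [0.2, 0.4, 0.6]              # hypothetical values
deltas = [0.9, 0.99, 0.999, 0.9999]
bounds = [upper_price(v, d, u) for d in deltas]
```

Since each gap v − pi equals (1 − δ)(v − ui), the distance v − p shrinks linearly in (1 − δ), so the list `bounds` increases towards v.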
3.3 Conclusion
This chapter has considered several different variants of a dynamic strategic matching and
bargaining game, with the common feature that only one side of the market makes offers.
Unlike other papers in the field, the offers are made simultaneously to capture competition.
We find that stationary equilibria give a single price asymptotically in all the transactions.
Previous work has shown that this conclusion is not true when buyers and sellers take
it in turns to make offers (a game of which the Rubinstein bargaining game is a special
case). Alternating offers with heterogeneity in valuations tends to drive valuations apart.
Another author [17] has mentioned the difficulty of solving dynamic bargaining and
matching games with many players if there is heterogeneity of valuations on both sides,
though she was specifically concerned with alternating offers. This turns out mostly not to
be an issue for us, except in the one general case where sellers are on the short side, where
we have not been able to extend the basic analysis.
One interesting form of heterogeneity arises in settings in which the value of buyer
i for seller j’s good is vij, as in the housing market. In this setting it seems appropriate
to assume that sellers’ valuations do not depend upon the identity of the potential buyers.
This is kept for future research, though it seems feasible that techniques similar to the ones
used in this chapter would enable us to characterise equilibrium prices in such markets as
well.
Chapter 4
Decentralised Bilateral Trading in
a Market with Incomplete
Information
4.1 Introduction
This chapter attempts to study a small market in which one of the players has private
information about her valuation. As such, it is a first step in combining the literature
on incomplete information with that on market outcomes obtained through decentralised
bilateral bargaining.
We shall discuss the relevant literature in detail later on in the introduction. Here we
summarise the motivation for studying this problem.
One of the most important features in the study of bargaining is the role of outside
options in determining the bargaining solution. There have been several different ap-
proaches to this issue, starting with treating alternatives to the current bargaining game
as exogenously given and always available. Accounts of negotiation directed towards prac-
titioners and policy-oriented academics, like Raiffa’s masterly “The Art and Science of
Negotiation” ([50]), have emphasised the key role of the “Best Alternative to the Negotiated
Agreement” and mentioned the role of searching for such alternatives in preparing
for negotiations. Search for outside options has also been considered, as well as search for
bargaining partners, in a general coalition formation context.
Real-world examples of such search for outside options abound. For example, firms
that receive (public) takeover bids seek to generate other (also public) offers in order to
improve their bargaining position. Takeovers are also an instance of public one-sided offers.
The housing market is another example; there is a given (at any time) supply of sellers and
buyers who are interested in a particular kind of house make (private) offers to the sellers
of the houses they are interested in, one at a time. (This is, for instance, the example used
in [36].)
Private targeted offers are prevalent in industry as well, for joint ventures and mergers.
For example, the book [2] is concerned with the joint venture negotiations in the 1980s,
in which Air Products, Air Liquide and British Oxygen were buyers and DuPont, Dow
Chemical and Monsanto were sellers (of a particular kind of membrane technology). The
final outcome of these negotiations was two joint ventures and one acquisition.
Proceeding more or less in parallel, there has been considerable work on bargaining with
incomplete information. The major success of this work has been the complete analysis of
the bargaining game in which the seller has private information about the minimum offer
she is willing to accept and the buyer, with only the common knowledge of the probability
distribution from which the seller’s reservation price is drawn, makes repeated offers which
the seller can accept or reject; each rejection takes the game to another period and time
is discounted at a common rate by both parties. With the roles of the seller and buyer re-
versed, this has also been part of the development of the foundations of dynamic monopoly
and the Coase conjecture. Other, more complicated models of bargaining have also been
formulated (including by one of us), with two-sided offers and two-sided incomplete infor-
mation, but these have not yielded the clean results of the game with one-sided offers and
one-sided incomplete information.
Whilst this need not necessarily be a reason for studying this particular game, it does
suggest that if we desire to embed bargaining in a more complex market setting with
private information, it is rational for us, the modellers, to minimise the extent of complexity
associated with the bargaining to focus on the changes introduced by adding endogenous
outside options, as we intend to do here.
Our model therefore takes the basic problem of a seller with private information and
an uninformed buyer and adds another buyer-seller pair; here the new seller’s valuation is
different from the informed seller’s and commonly known and the buyers’ valuations are
identical. Each seller has one good and each buyer wants at most one good. This is the
simplest extension of the basic model that gives rise to outside options for each player,
though unlike the literature on exogenous outside options, only one buyer can deviate from
the incomplete information bargaining to take his outside option with the other seller (if
this other seller accepts the offer).
In our model, buyers make offers simultaneously, each buyer choosing only one seller (footnote 1).
Sellers also respond simultaneously, accepting at most one offer. A buyer whose offer is
accepted by a seller leaves the market with the seller and the remaining players play the
one-sided offers game with or without asymmetric information. We consider both the cases
where buyers’ offers are public, so the continuation strategies can condition on both offers
in a given period, and private, when only the proposer and the recipient of an offer know
what it is and the only public information is the set of players remaining in the game.
Our analysis explores whether a Perfect Bayes Equilibrium similar to that found in the
two-player asymmetric information game continues to hold with alternative partners on
both sides of the market and with different conditions on observability of offers.
Footnote 1: Simultaneous-offers extensive forms probably capture the essence of competition best.
The equilibrium we describe is a randomized behavioral strategy one (as in the two-player
game). As agents become patient enough, in equilibrium competition always takes
place for the seller whose valuation is commonly known. The equilibrium behavior of beliefs
is similar to the two-player asymmetric information game and the same across public and
private offers. However, the off-path behaviour sustaining this equilibrium is different and
has to take into account many more possible deviations. The path of beliefs also differs
once an out-of-equilibrium choice occurs. The case of private offers is quite interesting.
For example a buyer who offers to the informed seller might see his offer rejected but his
expectation that the other offer has been accepted is belied when he observes all players
remain in the market. He is then unsure of whether the other buyer has deviated and made
an offer to the informed seller, which the informed seller has rejected, or an offer to the
seller with commonly known valuation. The beliefs have to be constructed with some care
to make sure the play gets back to the equilibrium path. However, the beliefs used here
are not inherently implausible.
The interesting asymptotic characterisation obtained by taking the limit of the equilib-
rium prices, as the discount factor goes to 1, is that, despite the asymmetric information
and two heterogeneous sellers, the different distributions of prices collapse to a single price
that is consistent with an extended Coase conjecture (footnote 2).
The intuition and the economics behind these results can be explained in the following
way. In the benchmark case, where one seller’s valuation is known to be H and the
other’s is known to be M, there will be excess demand in the Walrasian setting at any price
p ∈ (M,H). This is in essence what drives the prices to H. We model an explicit trading
protocol with simultaneous offers made by both buyers. As δ → 1, the offers converge to
H and the trade takes place immediately. For lower values of δ, the buyers can exploit the
fact that the seller will need to wait until she gets a new offer, and hence they would be
able to capture some rents.
Footnote 2: The “Coase conjecture” relevant here is the bargaining version of the dynamic monopoly problem, namely that if an uninformed seller (who is the only player making offers) has a valuation strictly below the informed buyer’s lowest possible valuation, the unique sequential equilibrium as the seller is allowed to make offers frequently has a price that converges, as the frequency of offers becomes infinite, to the lowest buyer valuation. Here we show that even if one adds endogenous outside options for both players, a similar conclusion holds for an equilibrium that is common to both public and private offers; hence an extended Coase conjecture holds.
Next, let us move on to the private information case. From existing results we know the
solution for the sub-game where only the privately informed seller is left. This states that
as δ → 1, the price would converge to H and trade will take place immediately. Thus in the
limit the reservation price of the informed seller is H, regardless of her type. This explains
why we have an equilibrium in which there is immediate trade at a price of H. There could
be other equilibria in which the buyers essentially collude (though in a tacit manner).
In the two-player game, the Perfect Bayesian Equilibrium is unique in the “gap” case.
In our competitive setting, this is not true, at least for public offers. We include an example.
Related literature: The modern interest in this approach dates back to the seminal
work of Rubinstein and Wolinsky ([53], [54]), Binmore and Herrero ([8]) and Gale
([25], [26]). These papers, under complete information, mostly deal with random matching
in large anonymous markets, though Rubinstein and Wolinsky (1990) is an exception.
Chatterjee and Dutta ([10]) consider strategic matching in an infinite horizon model with
two buyers and two sellers and Rubinstein bargaining, with complete information. The
previous chapter analysed markets under complete information where the bargaining is
with one-sided offers.
There are several papers on searching for outside options, for example, Chikte and
Deshmukh ([16]), Muthoo ([44]), Lee ([41]), Chatterjee and Lee ([15]). Chatterjee and
Dutta ([11]) study a similar setting but with sequential offers by buyers. In the present work
we consider simultaneous offers, which is closer to the usual model of Bertrand competition.
We should emphasise that we consider an infinite horizon model, unlike one-stage Bertrand
competition.
A rare paper analysing outside options in asymmetric information bargaining is that
by Gantner ([30]), who considers such outside options in the Chatterjee-Samuelson ([14])
model. Our model differs from hers in the choice of the basic bargaining model and in the
explicit analysis of a small market with both public and private targeted offers.
Some of the main papers in one-sided asymmetric information bargaining are the well-known
ones of Sobel and Takahashi ([56]), Fudenberg, Levine and Tirole ([23]), Ausubel
and Deneckere ([4]). The dynamic monopoly papers mentioned before are the ones by
Gul and Sonnenschein ([32]) and Gul, Sonnenschein and Wilson ([33]). See also the review
paper of Ausubel, Cramton and Deneckere ([5]).
There are papers in very different contexts that have some of the features of this model.
For example, Swinkels [60] considers a discriminatory auction with multiple goods, private
values (and one seller) and shows convergence to a competitive equilibrium price for fixed
supply as the number of bidders and objects becomes large. We keep the numbers small, at
two on each side of the market. Horner and Vieille [36] consider a model with one informed
seller, two buyers with correlated values who are the only proposers and both public and
private offers. They show that, in their model unlike ours, public and private offers give
very different equilibria; in fact, public offers could lead to no trade.
Outline of rest of the chapter. The rest of the chapter is organised as follows.
Section 2 discusses the model in detail. The qualitative nature of the equilibrium and its
detailed derivation are given in Section 3. The asymptotic characteristics of the equilibrium
are obtained in Section 4. Section 5 discusses the possibility of other equilibria. Finally,
Section 6 concludes the chapter.
4.2 The Model
4.2.1 Players and payoffs
The setup we consider has two uninformed homogeneous buyers and two heterogeneous
sellers. Buyers (B1 and B2) have a common valuation of v for the good (the maximum
willingness to pay for a unit of the indivisible good). There are two sellers. Each of the
sellers owns one unit of the indivisible good. Sellers differ in their valuations. The first
seller (SM) has a reservation value of M which is commonly known. The other seller (SI)
has a reservation value that is private information to her. SI’s valuation is either L or H,
where,
v > H > M > L
We assume that L = 0, for purposes of reducing notation. It is commonly known by
all players that the probability that SI has a reservation value of L is π ∈ (0, 1). It is
worthwhile to mention that M ∈ [L,H] constitutes the only interesting case. If M < L (or
M > H) then one has no ambiguity about which seller has the lowest reservation value.
Although our model analyses the case of M ∈ (L,H), the same asymptotic result will be
true for M ∈ [L,H] (even though the analytical characteristics of the equilibrium for δ < 1
are different).
Players have a common discount factor δ ∈ (0, 1). If a buyer agrees on a price pj with
seller Sj at a time point t, then the buyer has an expected discounted payoff of δ^{t−1}(v − pj).
The seller’s discounted payoff is δ^{t−1}(pj − uj), where uj is the valuation of seller Sj.
4.2.2 The extensive form
This is an infinite horizon, multi-player bargaining game with one sided offers and dis-
counting. The extensive form is as follows:
At each time point t = 1, 2, .., offers are made simultaneously by the buyers. The offers
are targeted. This means an offer by a buyer consists of a seller’s name (that is SI or
SM ) and a price at which the buyer is willing to buy the object from the seller he has
chosen. Each buyer can make only one offer per period. Two informational structures will
be considered: one in which each seller observes all offers made (public targeted offers)
and one (private targeted offers) in which each seller observes only the offers she gets.
(Similarly for the buyers after the offers have been made: in the private offers case each
buyer knows his own offer and can observe who leaves the market.) A seller can accept
at most one of the offers she receives. Acceptances or rejections are simultaneous. Once
an offer is accepted, the trade is concluded and the trading pair leave the game. Leaving
the game is publicly observable (irrespective of public or private offers). The remaining
players proceed to the next period in which buyers again make price offers to the sellers.
As is standard in these games, time elapses between rejections and new offers.
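The mechanics of a single period of this extensive form can be sketched in code. The following Python fragment is only an illustration of the protocol, not of equilibrium behaviour: the acceptance rule (a threshold that may depend on the seller and on the number of offers she receives) and all numerical values are assumptions made for the example, and the names `Offer`, `play_period`, `remaining` and `threshold` are ours.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Offer:
    buyer: str      # name of the proposing buyer
    seller: str     # name of the targeted seller (SI or SM)
    price: float    # price at which the buyer is willing to buy

def play_period(offers, min_acceptable):
    """Resolve one period: each seller accepts at most one offer.

    `min_acceptable(seller, n_offers)` returns the lowest price the seller
    accepts; this threshold rule is an illustrative assumption.
    Returns the accepted offers, i.e. the matched pairs that leave.
    """
    trades = []
    for seller in sorted({o.seller for o in offers}):
        received = [o for o in offers if o.seller == seller]
        best = max(received, key=lambda o: o.price)
        if best.price >= min_acceptable(seller, len(received)):
            trades.append(best)
    return trades

def remaining(players, trades):
    """Players still in the market after the matched pairs have left."""
    gone = {o.buyer for o in trades} | {o.seller for o in trades}
    return [p for p in players if p not in gone]

def threshold(seller, n_offers):
    """Hypothetical acceptance thresholds, for illustration only."""
    if seller == "SM":
        return 0.5 if n_offers == 1 else 0.6
    return 0.9

offers = [Offer("B1", "SM", 0.70), Offer("B2", "SM", 0.65)]
trades = play_period(offers, threshold)
left = remaining(["B1", "B2", "SM", "SI"], trades)
```

In this example both buyers target SM, she accepts the higher offer, and B2 and SI proceed to the next period as a two-player game.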
4.3 Equilibrium
We will look for a Perfect Bayes Equilibrium ([24]) of the above-described extensive form. This
requires sequential rationality at every stage of the game given beliefs and the beliefs being
compatible with Bayes’ rule whenever possible, on and off the equilibrium path. The
PBE obtained is stationary in the sense that the strategies depend on the history only
to the extent to which it is reflected in the updated value of π (the probability that SI’s
valuation is L). Thus at each time point buyers’ offers depend only on the number of
players remaining and the value of π. The sellers’ responses depend on the number of
players remaining, the value of π and the offers made by the buyers.
4.3.1 The Benchmark Case: Complete information
Before we proceed to the analysis of the incomplete information framework we state the
results of the above extensive form with complete information, the formal analysis of which
has been done in the previous chapter.
Suppose the valuation of SI is commonly known to be H. In that case there exists a
stationary equilibrium (an equilibrium in which buyers’ offers depend only on the set of
players present and the sellers’ responses depend on the set of players present and the offers
made by the buyers) in which one of the buyers (say B1) makes offers to both the sellers
with positive probability and the other buyer (B2) makes offers to SM only. Suppose E(p)
represents the expected maximum price offer to SM in equilibrium. Assuming that there
exists a unique pl ∈ (M,H) such that (see footnote 3)
\[ p_l - M = \delta\,(E(p) - M) \]
the equilibrium is as follows:
1. B1 offers H to SI with probability q. With the complementary probability he makes
offers to SM . While offering to SM , B1 randomises his offers using an absolutely continuous
distribution function F1(.) with [pl, H] as the support. F1 is such that F1(H) = 1 and
F1(pl) > 0. This implies that B1 puts a mass point at pl.
2. B2 offers M to SM with probability q′. With the complementary probability his
offers to SM are randomised using an absolutely continuous distribution function F2(.)
with [pl, H] as the support. F2(.) is such that F2(pl) = 0 and F2(H) = 1.
Let us recollect a few things from the previous chapter. There exists a unique pl and the
outcome implied by the above equilibrium play constitutes the unique stationary equilib-
rium outcome.
Also as δ → 1,
q → 0 , q′ → 0 and pl → H
This means that as market frictions go away, we tend to get a uniform price in differ-
ent buyer-seller matches. In this chapter, we show a similar asymptotic result even with
incomplete information, with somewhat different analysis.
4.3.2 Equilibrium of the one-sided incomplete information game with
two players
The equilibrium of the whole game contains the analyses of the different two-player games
as essential ingredients. If a buyer-seller pair leaves the market after an agreement and
the other pair remains, we have a continuation game that is of this kind. We therefore
first review the features of the two-player game with one-sided private information and
one-sided offers.
Footnote 3: Given the nature of the equilibrium it is evident that M (pl) is the minimum acceptable price for SM when she gets one (two) offer(s).
The setting is as follows: There is a buyer with valuation v, which is common knowledge.
The seller’s valuation can either be H or L where v > H > L = 0. At each period, the
remaining buyer makes the offer and the remaining (informed) seller responds to it by
accepting or rejecting. If the offer is rejected then the value of π is updated using Bayes’
rule and the game moves on to the next period when the buyer again makes an offer. This
process continues until an agreement is reached. The equilibrium of this game (as described
in, for example, [21]) is as follows.
For a given δ we can construct an increasing sequence of probabilities, d(δ) = {0, d_1, ..., d_t, ...},
so that for any π ∈ (0, 1) there exists a t ≥ 0 such that π ∈ [d_t, d_{t+1}). Suppose at a particular
time point the play of the game so far and Bayes’ Rule imply that the updated belief
is π. Thus there exists a t ≥ 0 such that π ∈ [d_t, d_{t+1}). The buyer then offers p_t = δ^t H.
The H type seller rejects this offer with probability 1. The L type seller rejects this offer
with a probability that implies, through Bayes’ Rule, that the updated value of the belief
is π_u = d_{t−1}. The cutoff points d_t are such that the buyer is indifferent between offering
δ^t H and continuing the game for a maximum of t periods from now or offering δ^{t−1} H and
continuing the game for a maximum of t − 1 periods from now. Thus here t means that the
game will last for at most t periods from now. The maximum number of periods for which
the game can last is given by N(δ). It is already shown in [21] that this N(δ) is uniformly
bounded by a finite number N∗ as δ → 1.
Since we are describing a PBE for the game it is important that we specify the off-path
behavior of the players. First, the off-path behavior should be such that it sustains the
equilibrium play in the sense of making deviations by the other player unprofitable and
second, if the other player has deviated, the behavior should be equilibrium play in the
continuation game, given beliefs. We relegate the discussion of these beliefs to appendix
(A.12).
Given a π, the expected payoff to the buyer vB(π) is calculated as follows:
For π ∈ [0, d1), the two-player game with one-sided asymmetric information involves
the same offer and response as the complete information game between a buyer of valuation
v and a seller of valuation H. Thus we have
vB(π) = v −H for π ∈ [0, d1)
For π ∈ [dt, dt+1), (t ≥ 1 ), we have,
\[ v_B(\pi) = (v - \delta^t H)\,a(\pi) + (1 - a(\pi))\,\delta\, v_B(d_{t-1}) \qquad (4.1) \]
where a(π) is the equilibrium acceptance probability of the offer δtH.
These values will be crucial for the analysis of the four-player game.
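The recursion in (4.1) is straightforward to evaluate once a cutoff sequence is given. The sketch below is a hedged Python illustration, not the construction in [21]: the cutoff list is a placeholder. Its first entry d1 = (v − H)/v is what the buyer's indifference condition at t = 1 appears to give under our reading of the description above, and the entry d2 is purely hypothetical. The acceptance probability is computed as a(π) = (π − d_{t−1})/(1 − d_{t−1}), which is what Bayes' rule implies when the L type rejects so that the updated belief equals d_{t−1}.

```python
from bisect import bisect_right

def accept_prob(pi, d_prev):
    """a(pi): probability of acceptance when the posterior after rejection is d_prev."""
    return (pi - d_prev) / (1 - d_prev)

def buyer_value(pi, v, H, delta, cutoffs):
    """Buyer's expected payoff v_B(pi) in the two-player game, via (4.1).

    `cutoffs` = [0, d_1, d_2, ...] is an increasing list of belief cutoffs.
    """
    t = bisect_right(cutoffs, pi) - 1          # pi lies in [d_t, d_{t+1})
    if t == 0:
        return v - H                           # pi in [0, d_1): offer H, accepted
    d_prev = cutoffs[t - 1]                    # d_{t-1}
    a = accept_prob(pi, d_prev)
    continuation = buyer_value(d_prev, v, H, delta, cutoffs)
    return (v - delta**t * H) * a + (1 - a) * delta * continuation

# Hypothetical parameters; cutoffs[2] is a placeholder, not an equilibrium value.
v, H, delta = 1.0, 0.5, 0.9
cutoffs = [0.0, (v - H) / v, 0.8]
values = {pi: buyer_value(pi, v, H, delta, cutoffs) for pi in (0.2, 0.5, 0.6)}
```

With d1 = (v − H)/v the value at π = d1 coincides with v − H, so v_B is continuous at the first cutoff in this example.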
4.3.3 Equilibrium of the four-player game with incomplete information.
We now consider the four-player game. The complete-information benchmark case suggests
that there will be competition among the buyers for the more attractive seller, in the sense
that that seller will receive two offers with positive probability in equilibrium, whilst the
other seller will obtain at most one. However, the difference arises here because of the
private information of one of the sellers. Even if one pair of players has left the market,
a seller with private information has some power arising from the private information. In
fact, for δ high enough, this residual power of the informed seller leads, in equilibrium, to
competition taking place for the other seller (whose value is common knowledge), even if π
is relatively high. The main result of this chapter is described in the following proposition.
Proposition 20 There exists a δ∗ ∈ (0, 1) such that if δ > δ∗ then for all π ∈ [0, 1) there
exists a stationary equilibrium as follows (for both public and private offers):
(i) One of the buyers (say B1) will make offers to both SI and SM with positive proba-
bility. The other buyer B2 will make offers to SM only.
(ii) B2 while making offers to SM will put a mass point at p′l(π) and will have an
absolutely continuous distribution of offers from pl(π) to p(π) where p′l(π) (pl(π)) is the
minimum acceptable price to SM when she gets one (two) offer(s). For a given π, p(π) is
the upper bound of the price offer SM can get in the described equilibrium (p′l(π) < pl(π) <
p(π)). B1 while making offers to SM will have an absolutely continuous (conditional)
distribution of offers from pl(π) to p(π), putting a mass point at pl(π).
(iii) B1 while making offers to SI on the equilibrium path behaves exactly in the same
manner as in the two player game with one-sided asymmetric information.
(iv) SI’s behavior is identical to that in the two-player game. SM accepts the largest
offer with a payoff at least as large as the expected continuation payoff from rejecting all
offers.
(v) Each buyer in equilibrium obtains a payoff of vB(π).
Remark 3 The mass points and the distribution of buyers’ offers will depend upon π
though we show that these distributions will collapse in the limit. Off the path, the analysis
is different from the two-player game because the buyers have more options to consider
when choosing actions. For the description of off-path behavior refer to Appendix (A.13)
and Appendix (A.14) for public and private offers respectively.
Remark 4 A “road map” of the proof: We construct the equilibrium by starting from the
benchmark complete information case and showing that the complete information strategies
essentially carry over to the game where π is in a range near 0. This includes, through
the competition lemma, showing the nature of the competition among the sellers. Once
π is outside this range, the mass points and support of the randomised strategies in the
candidate equilibrium will depend upon π and these are characterised for all values of π.
The equilibrium is then extended beyond the initial range (apart from the initial range, these
are functions of δ) for sufficiently high values of δ by recursion. Finally, checking that
the candidate equilibrium is immune to unilateral deviation at any stage involves specifying
out-of-equilibrium beliefs. This is done in the two appendices.
Proof. We prove this proposition in steps. (Not all of these steps are given here, in order
to reduce unwieldy notation; see also the appendices.) First we derive the equilibrium for
a given value of π by assuming that there exists a threshold δ∗, such that if δ exceeds
this threshold then for each value of π, a stationary equilibrium as described above exists.
Later on we will prove this existence result.
To formally construct the equilibrium for different values of π, we need the following
lemma which we label as the competition lemma, following the terminology of [11], though
they proved it for a different model.
Consider the following sequences for t ≥ 1:
\[ p_t = v - \left[(v - \delta^t H)\alpha + (1 - \alpha)\delta(v - p_{t-1})\right] \qquad (4.2) \]
\[ p'_t = M + \delta(1 - \alpha)(p_{t-1} - M) \qquad (4.3) \]
where α ∈ (0, 1) and p_0 = H.
Lemma 24 There exists a δ′ ∈ (0, 1) such that for δ > δ′ and for all t ∈ {1, ..., N(δ)}, we
have
\[ p_t > p'_t \]
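Proof.
\[ \begin{aligned}
p_t - p'_t &= v - \left[(v - \delta^t H)\alpha + (1 - \alpha)\delta(v - p_{t-1})\right] - M - \delta(1 - \alpha)(p_{t-1} - M) \\
&= (v - M)(1 - \delta + \delta\alpha) - \alpha(v - \delta^t H) \\
&= (1 - \delta)(v - M) + \alpha(\delta v - \delta M - v + \delta^t H) \\
&= (1 - \delta)(v - M) + \alpha(\delta^t H - \delta M - (1 - \delta)v)
\end{aligned} \]
If we show that the second term is always positive then we are done. Note that the
coefficient of α is increasing in δ and is positive at δ = 1. Take t = N∗, where N∗ is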
the upper bound on the number of periods up to which the two player game with one sided
asymmetric information (as described earlier) can continue. For t = N∗, ∃ δ′ < 1 such
that the term is positive whenever δ > δ′. Since this is true for t = N∗, it will be true for
all lower values of t.
As N(δ) ≤ N∗ for any δ < 1, for all t ∈ {1, ....N(δ)},
pt > p′t
whenever δ > δ′.
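The algebra in the proof of Lemma 24 can be verified numerically. The Python sketch below iterates the sequences (4.2) and (4.3) from p_0 = H and compares the difference p_t − p′_t with the closed form obtained in the proof. The parameter values (v, H, M, δ, α and the horizon T) are hypothetical, chosen with δ close to one, and the function names are ours.

```python
def sequences(v, H, M, delta, alpha, T):
    """Iterate (4.2) and (4.3) from p_0 = H.

    Returns (p, p_prime) with p[t] = p_t for t = 0..T and
    p_prime[t] = p'_t for t = 1..T (p_prime[0] is None).
    """
    p = [H]
    p_prime = [None]
    for t in range(1, T + 1):
        p_t = v - ((v - delta**t * H) * alpha
                   + (1 - alpha) * delta * (v - p[t - 1]))      # (4.2)
        p_prime_t = M + delta * (1 - alpha) * (p[t - 1] - M)    # (4.3)
        p.append(p_t)
        p_prime.append(p_prime_t)
    return p, p_prime

def gap_closed_form(v, H, M, delta, alpha, t):
    """p_t - p'_t as simplified in the last line of the proof."""
    return ((1 - delta) * (v - M)
            + alpha * (delta**t * H - delta * M - (1 - delta) * v))

# Hypothetical parameters with v > H > M and delta close to one.
v, H, M, delta, alpha, T = 1.0, 0.8, 0.4, 0.95, 0.5, 5
p, p_prime = sequences(v, H, M, delta, alpha, T)
gaps = [p[t] - p_prime[t] for t in range(1, T + 1)]
```

Note that the difference does not depend on p_{t−1}, which is why the closed form involves only t and the primitives.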
For both public and private targeted offers, the equilibrium path is the same. However
the off-path behavior differs (to be specified later).
Fix a δ > δ∗. Suppose we are given a π ∈ (0, 1) (footnote 4). There exists a t ≥ 0 (it is easy to
see that this t ≤ N∗) such that π ∈ [d_t, d_{t+1}). The sequence d_τ(δ) = {0, d_1, d_2, ..., d_t, ...} is
derived from and is identical with the same sequence in the two-player game. Next, we
evaluate vB(π) (from the two-player game). Define p(π) as
\[ p(\pi) = v - v_B(\pi) \]
Footnote 4: π = 0 is the complete information case with an H seller.
Define p′l(π) as,
\[ p'_l(\pi) = M + \delta(1 - a(\pi))\left[E_{d_{t-1}}(p) - M\right] \qquad (4.4) \]
where E_{d_{t−1}}(p) represents the expected price offer to SM in equilibrium when the probability
that SI is of the low type is d_{t−1}. From (4.4) we can posit that, in equilibrium, p′l(π)
is the minimum acceptable price for SM if she gets only one offer.
Lemma 25 For a given π > d1, the acceptance probability a(π) of an equilibrium offer is
increasing in δ and has a limit a(π) which is less than 1.
Proof. The acceptance probability a(π) of an equilibrium offer is equal to πβ(π), where
β(π) is the probability with which the L-type SI accepts an equilibrium offer. From the
updating rule we know that β(π) is such that the following relation is satisfied:
\[ \frac{\pi(1 - \beta(\pi))}{\pi(1 - \beta(\pi)) + (1 - \pi)} = d_{t-1} \]
From the above expression, we get
\[ \beta(\pi) = \frac{\pi - d_{t-1}}{\pi(1 - d_{t-1})} \]
Therefore, β(π) is increasing in π and decreasing in dt−1. From [21] the dt are decreasing
in δ and have a limit. Hence β(π) (and also a(π) ) is increasing in δ. Since the dt have a
limit as δ goes to 1, so does β(π). Therefore, a(π) also has a limit a(π) which is less than
1 for π ∈ (0, 1).
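The updating rule used in the proof of Lemma 25 can be checked directly. The following Python sketch uses a hypothetical belief π and a hypothetical cutoff d_{t−1}; it computes β(π) from the closed form above and confirms, via Bayes' rule, that the belief after a rejection equals d_{t−1}. The function names are ours.

```python
def beta(pi, d_prev):
    """Probability that the L-type accepts, chosen so the posterior is d_prev."""
    return (pi - d_prev) / (pi * (1 - d_prev))

def posterior_after_rejection(pi, b):
    """Bayes' rule: belief that the seller is the L-type after a rejection,
    when the L-type accepts with probability b and the H-type always rejects."""
    return pi * (1 - b) / (pi * (1 - b) + (1 - pi))

pi, d_prev = 0.6, 0.25          # hypothetical belief and cutoff d_{t-1}
b = beta(pi, d_prev)
a = pi * b                      # acceptance probability a(pi) = pi * beta(pi)
updated = posterior_after_rejection(pi, b)
```

The product a(π) = πβ(π) simplifies to (π − d_{t−1})/(1 − d_{t−1}), which makes it transparent that a lower cutoff raises the acceptance probability.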
For π = d_{t−1}, the maximum price offer to SM (according to the conjectured equilibrium)
is p(d_{t−1}). This implies that E_{d_{t−1}}(p) < p(d_{t−1}) (this will be clear from the description
below). Since a(π) ∈ (0, 1), from Lemma 24 we can infer that p(π) > p′l(π). Suppose
there exists a pl(π) ∈ (p′l(π), p(π)) such that
\[ p_l(\pi) = (1 - \delta)M + \delta E_\pi(p) \]
We can see that pl represents the minimum acceptable price offer for SM in the event that
she gets two offers. (Note that if SM rejects both offers, the game goes to the next period
with π remaining the same.)
From the conjectured equilibrium behavior, we derive the following (footnote 5):
1. B1 makes offers to SI with probability q(π), where
\[ q(\pi) = \frac{v_B(\pi)(1 - \delta)}{(v - p'_l(\pi)) - \delta v_B(\pi)} \qquad (4.5) \]
B1 offers δ^t H to SI. With probability (1 − q(π)) he makes offers to SM. The conditional
distribution of offers to SM, given B1 makes an offer to this seller when the relevant
probability is π, is
\[ F^\pi_1(s) = \frac{v_B(\pi)\left[1 - \delta(1 - q(\pi))\right] - q(\pi)(v - s)}{(1 - q(\pi))\left[v - s - \delta v_B(\pi)\right]} \qquad (4.6) \]
We can check that F^π_1(pl(π)) > 0 and F^π_1(p(π)) = 1. This confirms that B1 puts a mass
point at pl(π).
2. B2 offers p′l(π) to SM with probability q′(π), where
\[ q'(\pi) = \frac{v_B(\pi)(1 - \delta)}{(v - p_l(\pi)) - \delta v_B(\pi)} \qquad (4.7) \]
With probability (1 − q′(π)) he makes offers to SM by randomizing his offers in the support
[pl(π), p(π)]. The conditional distribution of offers is given by
\[ F^\pi_2(s) = \frac{v_B(\pi)\left[1 - \delta(1 - q'(\pi))\right] - q'(\pi)(v - s)}{(1 - q'(\pi))\left[v - s - \delta v_B(\pi)\right]} \qquad (4.8) \]
Footnote 5: We obtain these by using the indifference relations of the players when they are using randomized behavioral strategies.
This completes the derivation. Appendix (A.13) and Appendix (A.14) (for public and private offers respectively) describe the off-path play and show that it sustains the equilibrium play in each of the cases.
Next, we show that there exists a δ∗ with δ′ < δ∗ < 1 such that for δ > δ∗ an equilibrium as described above exists for all values of π ∈ [0, 1). To do this we need the following lemmas:
Lemma 26 If π ∈ [0, d1), then the equilibrium of the game is identical to that of the
benchmark case.
Proof. From the equilibrium of the two-player game with one-sided asymmetric information, we know that for π ∈ [0, d1), the buyer always offers H to the seller and the seller accepts this with probability one. Hence this game is identical to the game between a
buyer of valuation v and a seller of valuation H, with the buyer making the offers. Thus,
in the four-player game, we will have an equilibrium identical to the one described in the
benchmark case. We conclude the proof by assigning the following values:
p′l(π) = M and p(π) = H for π ∈ [0, d1)
Lemma 27 If there exists a δ̄ ∈ (δ′, 1) such that for δ ≥ δ̄ and for all t ∈ {1, ..., N∗} an equilibrium exists for π ∈ [0, dt(δ)), then there exists a δ∗t ≥ δ̄ such that, for all δ ∈ (δ∗t , 1), an equilibrium also exists for π ∈ [dt(δ), dt+1(δ)). (See footnote 6.)
6 We use the following notation, from the appendix. For any x ∈ (M,H), let $F^x_1(.)$, $F^x_2(.)$, $q^x$, $q'^x$ and $E^x(p)$ be the expressions obtained from F1(.), F2(.), q, q′ and E(p) respectively by replacing pl by x.
Proof. We only need to show that there exists a δ∗t ≥ δ̄ such that for all δ > δ∗t and for all π ∈ [dt(δ), dt+1(δ)), there exists a pl(π) ∈ (p′l(π), p(π)) with
\[ p_l(\pi) = (1-\delta)M + \delta E_{\pi}(p) \]
From now on we will write dt instead of dt(δ). For each δ ∈ (δ′, 1) we can construct d(δ) and the equilibrium strategies as above (assuming existence). Construct the function G(x) as
\[ G(x) = x - [\delta E^{x}_{\pi}(p) + (1-\delta)M] \]
We can infer from Appendix (??) that the function G(.) is monotonically increasing in x.
Since $E^{x}_{\pi}(p) < p(\pi)$,
\[ \lim_{x \to p(\pi)} G(x) > 0 \]
Next, we have
\[ G(p'_l(\pi)) = p'_l(\pi) - [\delta E^{p'_l(\pi)}_{\pi}(p) + (1-\delta)M] \]
By definition $E^{p'_l(\pi)}_{\pi}(p) > p'_l(\pi)$. So for δ = 1, G(p′l(π)) < 0. Since G(.) is a continuous function, there exists a δ∗t ≥ δ̄ such that for all δ > δ∗t , G(p′l(π)) < 0. By invoking the Intermediate Value Theorem we can say that there is a unique x∗ ∈ (p′l(π), p(π)) such that G(x∗) = 0. This x∗ is our required pl(π).
This concludes the proof.
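The Intermediate Value Theorem step in the proof can be illustrated numerically. The sketch below finds the root of G(x) = x − [δE(x) + (1 − δ)M] by bisection. The function `toy_expected_price` is a purely hypothetical stand-in for the expectation with pl replaced by x, chosen only so that it lies strictly between x and the upper price and has slope below one; the true expression comes from the equilibrium distributions and is not reproduced here. All numerical values are assumptions.

```python
def solve_min_price(expected_price, lower, upper, delta, M, tol=1e-10):
    """Bisection for the unique root of G(x) = x - [delta*E(x) + (1-delta)*M].

    Assumes G is continuous and increasing with G(lower) < 0 < G(upper).
    """
    def G(x):
        return x - (delta * expected_price(x) + (1.0 - delta) * M)

    lo, hi = lower, upper
    if not (G(lo) < 0.0 < G(hi)):
        raise ValueError("G does not change sign on the given interval")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if G(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def toy_expected_price(x, p_bar=10.0):
    """Hypothetical stand-in: x < E(x) < p_bar for x < p_bar, with slope 1/2."""
    return 0.5 * (x + p_bar)


# Assumed illustrative values: lower bound 6 plays the role of p'_l, upper bound 10 that of p(pi).
x_star = solve_min_price(toy_expected_price, lower=6.0, upper=10.0, delta=0.95, M=5.0)
print(x_star)
```

For this toy specification G is linear, so the root can also be computed by hand as 5/0.525, which gives a direct check on the bisection.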
From lemma (26) we know that for any δ ∈ (0, 1) an equilibrium exists for π ∈ [0, d1) (see footnote 7). Using lemma (27) we can obtain δ∗t for all t ∈ {1, 2, ..., N∗}. Define δ∗ as:
\[ \delta^{*} = \max_{t \in \{1,\ldots,N^{*}\}} \delta^{*}_t \]
7 Note that d1 is independent of δ.
We can do this because N∗ is finite. Lemma (26) and (27) now guarantee that whenever
δ > δ∗ an equilibrium as described above exists for all π ∈ [0, 1) .
This concludes the proof of the proposition.
4.4 Asymptotic characterization
It has been argued earlier that as δ → 1, p′l(π) reaches a limit which is less than p(π).
From (4.5) we then have,
q(π)→ 0 as δ → 1
Then from (4.6) we have,
\[ 1 - F^{\pi}_1(s) = \frac{p(\pi)-s}{(1-q(\pi))[v-s-\delta v_B(\pi)]} \]
We have shown that q(π) → 0 as δ → 1. Hence as δ → 1, for s arbitrarily close to p(π), we have
\[ 1 - F^{\pi}_1(s) \approx \frac{p(\pi)-s}{p(\pi)-s} = 1 \]
Hence the distribution collapses and pl(π) → p(π). From the expression of pl(π) we
know that pl(π) → Eπ(p) as δ goes to 1. Thus we can conclude that Eπ(p) approaches
p(π). From the two-player game with one-sided asymmetric information we know that as
δ goes to 1, p(π) → H, (since vB(π) goes to v − H) for any value of π. This leads us to
conclude that as δ goes to 1, Eπ(p) → H for all values of π. This in turn provides the
justification of having $E_{d_{t-1}}(p) \approx E^{x}_{\pi}(p)$ for high values of δ (used in the proof of lemma (27)).
From the proof of lemma (27) we know that G(p(π)) > 0. Hence there will be a threshold of δ such that for all δ higher than that threshold we have G(δp(π)) > 0. Thus pl(π) is bounded above by δp(π). (4.7) implies that
\[ q'(\pi) = \frac{1}{\dfrac{v}{v_B(\pi)} + \dfrac{\delta p(\pi) - p_l(\pi)}{(1-\delta)v_B(\pi)}} \]
Since pl(π) is bounded above by δp(π), q′(π)→ 0 as δ goes to 1.
Thus we conclude that as δ goes to 1, prices in all transactions go to H. We state this
(informally) as a result.
Main result: With either public or private offers there exists a stationary Perfect-
Bayes equilibrium, such that, as δ → 1, the prices in both transactions go to H. The
bargaining ends “almost” immediately and both sellers, the one with private information
and L type and the one whose valuation is common knowledge, obtain strictly positive
expected profits.
Comment : It should be mentioned that we would expect the same result to be true, if,
instead of a two-point distribution, the informed type’s reservation value s is continuously
distributed in [L,H] according to some cdf G(s). Then with probability G(M) the reserva-
tion value of SI is less than M and with probability (1−G(M)) it is higher than M . Since
this will still be the “gap” case of the two-player bargaining case, the result of [33] will hold so that the two-player offer goes to H as δ → 1. This should make an analogue to the competition lemma true. The belief updating for private and public offers off the equilibrium
path will now not be in terms of the probability of a soft type but in the (truncated) updated
distribution of the informed seller’s reservation value.
4.5 A non-stationary equilibrium
We show that with public offers we can have a non-stationary equilibrium, so that the
equilibrium constructed in the previous sections is not unique. This is based on using the
stationary equilibrium as a punishment (the essence is similar to the pooling equilibrium
with positive profits in [45]). The strategies sustaining this are described below. The
strategies will constitute an equilibrium for sufficiently high δ, as is also the case for the
stationary equilibrium.
Suppose for a given π, both the buyers offer M to SM . SM accepts this offer by selecting each buyer with probability 1/2. If any buyer deviates, for example by offering to SI or making a higher offer to SM , then all players revert to the stationary equilibrium
strategies described above. If SM gets the equilibrium offer of M from the buyers and
rejects both of them then the buyers make the same offers in the next period and the seller
SM makes the same responses as in the current period.
Given the buyers adhere to their equilibrium strategies, the continuation payoff to SM
from rejecting all offers she gets is zero. So she has no incentive to deviate. Next, if one
of the buyers offers slightly higher than M to SM then it is optimal for her to reject both
the offers. This is because on rejection next period players will revert to the stationary
equilibrium play described above. Hence her continuation payoff is δ(Eπ(p) −M), which
is higher than the payoff from accepting.
Finally each buyer obtains an equilibrium payoff of $\frac{1}{2}(v-M) + \frac{1}{2}\delta v_B(\pi)$. If a buyer deviates then, according to the strategies specified, SM should reject the higher offer if the payoff from accepting it is strictly less than the continuation payoff from rejecting (which is the one period discounted value of the payoff from stationary equilibrium). Hence if a buyer wants SM to accept an offer higher than M then his offer p′ should satisfy,
\[ p' = \delta E_{\pi}(p) + (1-\delta)M \]
The payoff of the deviating buyer will then be δ(v − Eπ(p)) + (1 − δ)(v −M). As δ → 1,
\[ \delta(v - E_{\pi}(p)) + (1-\delta)(v-M) \approx \delta(v - p(\pi)) + (1-\delta)(v-M) = \delta v_B(\pi) + (1-\delta)(v-M). \]
For δ = 1 this expression is strictly less than $\frac{1}{2}(v-M) + \frac{1}{2}\delta v_B(\pi)$, as (v −M) > δvB(π).
Hence for sufficiently high values of δ this will also be true. Also if a buyer deviates and makes an offer in the range (M, p′) then it will be rejected by SM . The continuation payoff of the buyer will then be $\delta v_B(\pi) < \frac{1}{2}(v-M) + \frac{1}{2}\delta v_B(\pi)$. Hence we show that neither buyer has any incentive to deviate.
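The comparison of payoffs above can be made concrete with a small Python sketch. It treats vB(π) as a fixed number, which is a simplifying assumption (in the model vB(π) itself depends on δ and π), and uses the limiting approximation Eπ(p) ≈ p(π) from the text. The parameter values and function names are hypothetical.

```python
def equilibrium_payoff(v, M, v_B, delta):
    """Each buyer's payoff when both offer M to S_M: (1/2)(v - M) + (1/2) delta v_B."""
    return 0.5 * (v - M) + 0.5 * delta * v_B


def deviation_payoff(v, M, v_B, delta):
    """Approximate payoff from the accepted deviation p' = delta*E(p) + (1-delta)*M,
    using E(p) ~ p(pi), so that v - E(p) ~ v_B."""
    return delta * v_B + (1.0 - delta) * (v - M)


def delta_threshold(v, M, v_B):
    """Value of delta above which the equilibrium payoff exceeds the deviation payoff,
    obtained by rearranging the two expressions above (requires v - M > v_B > 0)."""
    return (v - M) / (2.0 * (v - M) - v_B)


# Assumed illustrative values: v = 10, M = 4, v_B = 3 (i.e. p(pi) = 7).
v, M, v_B = 10.0, 4.0, 3.0
threshold = delta_threshold(v, M, v_B)
gap_high = equilibrium_payoff(v, M, v_B, 0.9) - deviation_payoff(v, M, v_B, 0.9)
gap_low = equilibrium_payoff(v, M, v_B, 0.6) - deviation_payoff(v, M, v_B, 0.6)
print(threshold, gap_high, gap_low)
```

Under these assumptions the deviation is unprofitable at δ = 0.9 but profitable at δ = 0.6, which matches the statement that the strategies form an equilibrium only for sufficiently high δ.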
We conclude this section by noting that this is not an equilibrium for private offers.
This is because we have different continuation play for buyer’s and seller’s deviations. For
public offers these deviations are part of the public history. However for private offers they
are not.
4.6 Conclusion
In the model we described above we have shown that with either public or private offers
there exists a stationary PBE, such that, as δ → 1, the prices in both transactions go to
H. The bargaining ends within the first two periods and both sellers, the one with private
information and L type and the one whose valuation is common knowledge, obtain strictly
positive expected profits. This equilibrium is reminiscent of the “Coase Conjecture” on the
rents from private information dominating the rents from having the sole right to make
offers, as the offers can be made more quickly. However, the setting is different, in that
there is an endogenous outside option for which buyers compete, and the model contains
a potential interaction between this competition and the private information bargaining.
This interaction comes through, at least in the equilibrium we study, mainly in the analysis
of out-of-equilibrium behavior. It is interesting that the equilibrium path behavior is almost,
though not quite, separable along these two dimensions.
It is also interesting that the equilibrium path in our model is essentially the same
with the two different observability structures of public offers and private offers. We were
somewhat hesitant to use the name PBE for the private offers case, since this is not a
multistage game with observable actions and private information, in the sense of Fudenberg
and Tirole, but the spirit of the analysis is very similar to theirs, so we have retained their
name.
One question that might arise is how robust is our conclusion to different bargaining
extensive forms. Clearly, simultaneous offers is best to represent competition and one-sided
offers to represent the power to make offers. If we go to alternating offers, previous results
in the complete information setting indicate that we cannot expect the same results. This
is also true in the two-player setting, so the market element in the current model is not
the driver for this difference.
We have shown that there could be non-stationary equilibria in this model. However,
we have not been able to demonstrate an analogue to the uniqueness result for two-person
bargaining with one-sided offers and one-sided private information, even for stationary
equilibria.
In our future research we intend to address the issue of having two privately informed
sellers and to extend this model to more agents on both sides of the market.
Bibliography
[1] Akcigit, U., Liu, Q., 2013. “The Role of Information in Competitive Experimentation.” Mimeo, Columbia University and University of Pennsylvania.
[2] Almqvist, E., 2002. History of Industrial Gases. Springer, Berlin.
[3] d’Aspremont, C., Bhattacharya, S., Gerard-Varet, L., 2000. “Bargaining and Sharing Innovative Knowledge.” The Review of Economic Studies 67, 255–271.
[4] Ausubel, L.M., Deneckere, R.J., 1989. “A Direct Mechanism Characterization of Sequential Bargaining with One-Sided Incomplete Information.” Journal of Economic Theory 48, 18–46.
[5] Ausubel, L.M., Cramton, P., Deneckere, R.J., 2002. “Bargaining with Incomplete Information.” Ch. 50 of R. Aumann and S. Hart (eds.), Handbook of Game Theory, Vol. 3, Elsevier.
[6] Baliga, S., Serrano, R., 1995. “Multilateral Bargaining with Imperfect Information.” Journal of Economic Theory 67, 578–589.
[7] Bhattacharya, S., Mookherjee, D., 1986. “Portfolio Choice in Research and Development.” Rand Journal of Economics 17, 594–605.
[8] Binmore, K.G., Herrero, M.J., 1988. “Matching and Bargaining in Dynamic Markets.” Review of Economic Studies 55, 17–31.
[9] Bolton, P., Harris, C., 1999. “Strategic Experimentation.” Econometrica 67, 349–374.
[10] Chatterjee, K., Dutta, B., 1998. “Rubinstein Auctions: On Competition for Bargaining Partners.” Games and Economic Behavior 23, 119–145.
[11] Chatterjee, K., Dutta, B., 2006. “Markets with Bilateral Bargaining and Incomplete Information.” Mimeo, Penn State and University of Warwick.
[12] Chatterjee, K., Evans, R., 2004. “Rivals’ Search for Buried Treasure: Competition and Duplication in R&D.” Rand Journal of Economics 35, 160–183.
[13] Chatterjee, K., Samuelson, L., 1988. “Bargaining Under Two-Sided Incomplete Information: The Unrestricted Offers Case.” Operations Research 36, 605–638.
[14] Chatterjee, K., Samuelson, L., 1987. “Infinite Horizon Bargaining Models with Alternating Offers and Two-Sided Incomplete Information.” Review of Economic Studies 54, 175–192.
[15] Chatterjee, K., Lee, C.C., 1998. “Bargaining and Search with Incomplete Information about Outside Options.” Games and Economic Behavior 22, 203–237.
[16] Chikte, S.D., Deshmukh, S.D., 1987. “The Role of External Search in Bilateral Bargaining.” Operations Research 35, 198–205.
[17] Corominas-Bosch, M., 2004. “Bargaining in a Network of Buyers and Sellers.” Journal of Economic Theory 115, 35–77.
[18] Dasgupta, P., Maskin, E., 1987. “The Simple Economics of Research Portfolios.” The Economic Journal, 581–595.
[19] Dasgupta, P., Stiglitz, J., 1980. “Uncertainty, Industrial Structure and the Speed of R&D.” Bell Journal of Economics 11, 1–28.
[20] “No End to Dementia.” The Economist, June 2010.
[21] Deneckere, R., Liang, M.Y., 2006. “Bargaining with Interdependent Values.” Econometrica 74, 1309–1364.
[22] Fershtman, C., Rubinstein, A., 1997. “A Simple Model of Equilibrium in Search Procedures.” Journal of Economic Theory 72, 432–441.
[23] Fudenberg, D., Levine, D., Tirole, J., 1985. “Infinite-Horizon Models of Bargaining with One-Sided Incomplete Information.” In A. Roth (ed.), Game-Theoretic Models of Bargaining, Cambridge University Press.
[24] Fudenberg, D., Tirole, J., 1990. Game Theory.
[25] Gale, D., 1986. “Bargaining and Competition Part I: Characterization.” Econometrica 54, 785–806.
[26] Gale, D., 1987. “Limit Theorems for Markets with Sequential Bargaining.” Journal of Economic Theory 43, 20–54.
[27] Gale, D., 2000. Strategic Foundations of General Equilibrium: Dynamic Matching and Bargaining Games. Cambridge University Press.
[28] Gale, D., Sabourian, H., 2005. “Complexity and Competition.” Econometrica 73, 739–769.
[29] Galenianos, M., Kircher, P., 2009. “Directed Search with Multiple Job Applications.” Journal of Economic Theory 144, 445–471.
[30] Gantner, A., 2008. “Bargaining, Search and Outside Options.” Games and Economic Behavior 62, 417–435.
[31] Graham, M.B.W., 1986. The Business of Research. New York: Cambridge University Press.
[32] Gul, F., Sonnenschein, H. “On Delay in Bargaining with One-Sided Uncertainty.” Econometrica 56, 601–611.
[33] Gul, F., Sonnenschein, H., Wilson, R., 1986. “Foundations of Dynamic Monopoly and the Coase Conjecture.” Journal of Economic Theory 39, 155–190.
[34] Heidhues, P., Rady, S., Strack, P., 2012. “Strategic Experimentation with Private Payoffs.” Mimeo, University of Bonn.
[35] Hendon, E., Tranaes, T., 1991. “Sequential Bargaining in a Market with One Seller and Two Different Buyers.” Games and Economic Behavior 4, 453–466.
[36] Hörner, J., Vieille, N., 2009. “Public vs. Private Offers in the Market for Lemons.” Econometrica 77, 29–69.
[37] Keller, G., Rady, S., Cripps, M., 2005. “Strategic Experimentation with Exponential Bandits.” Econometrica 73, 39–68.
[38] Keller, G., Rady, S., 2010. “Strategic Experimentation with Poisson Bandits.” Theoretical Economics 5, 275–311.
[39] Klein, N., 2011. “Strategic Learning in Teams.” Mimeo, University of Bonn.
[40] Klein, N., Rady, S., 2011. “Negatively Correlated Bandits.” The Review of Economic Studies 78, 693–792.
[41] Lee, C.C., 1995. “Bargaining and Search with Recall: A Two-Period Model with Complete Information.” Operations Research 42, 1100–1109.
[42] Lee, T., Wilde, L., 1980. “Market Structure and Innovation: A Reformulation.” Quarterly Journal of Economics 94, 429–436.
[43] Loury, G.C., 1979. “Market Structure and Innovation.” Quarterly Journal of Economics 93, 395–410.
[44] Muthoo, A., 1995. “On the Strategic Role of Outside Options in Bilateral Bargaining.” Operations Research 43, 292–297.
[45] Nöldeke, G., Van Damme, E., 1990. “Signalling in a Dynamic Labor Market.” The Review of Economic Studies 57, 1–23.
[46] Osborne, M., Rubinstein, A., 1990. Bargaining and Markets. San Diego: Academic Press.
[47] Peters, M., 2010. “Noncontractible Heterogeneity in Directed Search.” Econometrica 78, 1173–1200.
[48] Peters, M., Severinov, S., 2006. “Internet Auctions with Many Traders.” Journal of Economic Theory 130, 220–245.
[49] Presman, E.L., 1990. “Poisson Version of the Two-Armed Bandit Problem with Discounting.” Theory of Probability and its Applications.
[50] Raiffa, H., 1985. The Art and Science of Negotiation. Harvard University Press.
[51] Reinganum, J., 1982. “A Dynamic Game of R&D Patent Protection and Competitive Behavior.” Econometrica 50, 671–688.
[52] Rubinstein, A., 1982. “Perfect Equilibrium in a Bargaining Model.” Econometrica 50, 97–109.
[53] Rubinstein, A., Wolinsky, A., 1985. “Equilibrium in a Market with Sequential Bargaining.” Econometrica 53, 1133–1150.
[54] Rubinstein, A., Wolinsky, A., 1990. “Decentralised Trading, Strategic Behavior and the Walrasian Outcome.” Review of Economic Studies 57.
[55] Sabourian, H., 2004. “Bargaining and Markets: Complexity and the Competitive Outcome.” Journal of Economic Theory 116, 189–228.
[56] Sobel, J., Takahashi, I., 1983. “A Multi-Stage Model of Bargaining.” Review of Economic Studies 50, 411–426.
[57] Shaked, A., Sutton, J., 1984. “Involuntary Unemployment as a Perfect Equilibrium in a Bargaining Model.” Econometrica 52, 1351–1364.
[58] Scherer, F.M. International High-Technology Competition. Cambridge, Mass.: Harvard University Press.
[59] Stokey, N.L., 2009. The Economics of Inaction. Princeton University Press.
[60] Swinkels, J.M., 1999. “Asymptotic Efficiency for Discriminatory Private Value Auctions.” The Review of Economic Studies 66, 509–528.
[61] Thomas, C., 2011. “Experimentation with Congestion.” Mimeo, University College of London and University of Texas Austin.
Appendix
A.1 Solution for planner’s v(p)
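The O.D.E is given by:
\[ v'(p) + \frac{r + (\pi+\pi')p}{(\pi+\pi')p(1-p)}\, v(p) = \frac{1}{1-p} \]
The integrating factor of the above differential equation µ(p) is given by
\[ \mu(p) = e^{\int \frac{r + (\pi+\pi')p}{(\pi+\pi')p(1-p)}\,dp} = \frac{p^{\frac{r}{\pi+\pi'}}}{(1-p)^{\frac{r}{\pi+\pi'}+1}} \]
Multiplying both sides of the O.D.E with µ(p) and integrating both sides we get
\[ v(p) = \frac{\displaystyle\int \frac{p^{\frac{r}{\pi+\pi'}}}{(1-p)^{\frac{r}{\pi+\pi'}+2}}\,dp + C_1}{\dfrac{p^{\frac{r}{\pi+\pi'}}}{(1-p)^{\frac{r}{\pi+\pi'}+1}}} \]
which gives (1.2)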
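The integrating-factor computation can be checked numerically. Reading the O.D.E. above as v′(p) + [r + (π+π′)p] / [(π+π′)p(1−p)] · v(p) = 1/(1−p), carrying out the integral gives the candidate closed form v(p) = p/(k+1) + C1(1−p)[(1−p)/p]^k with k = r/(π+π′). The sketch below verifies by finite differences that this candidate satisfies the O.D.E.; whether it coincides term by term with (1.2) in the main text is not checked here. The parameter values, the shorthand `lam` for π+π′ and the function names are assumptions for illustration.

```python
def closed_form_v(p, r, lam, c1):
    """Candidate solution v(p) = p/(k+1) + c1*(1-p)*((1-p)/p)**k, with k = r/lam.

    lam stands for pi + pi' (assumed shorthand)."""
    k = r / lam
    return p / (k + 1.0) + c1 * (1.0 - p) * ((1.0 - p) / p) ** k


def ode_residual(p, r, lam, c1, h=1e-6):
    """Left-hand side minus right-hand side of the O.D.E., using a central difference for v'."""
    dv = (closed_form_v(p + h, r, lam, c1) - closed_form_v(p - h, r, lam, c1)) / (2.0 * h)
    coeff = (r + lam * p) / (lam * p * (1.0 - p))
    return dv + coeff * closed_form_v(p, r, lam, c1) - 1.0 / (1.0 - p)


# Assumed illustrative parameters.
r, lam, c1 = 0.1, 0.8, 0.3
max_residual = max(abs(ode_residual(p, r, lam, c1)) for p in (0.2, 0.4, 0.6, 0.8))
print(max_residual)
```

The residual is zero up to finite-difference error at each test belief, for any value of the constant C1.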
A.2 Switching-derivative lemma
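Lemma 28 When both firms are conducting their research at S1 then v′1(.) is given by
\[ v_1'(\cdot) = \frac{\pi'}{r+\pi+\pi'} - C^{1}_{11}[\Lambda(p)]^{\frac{r}{\pi+\pi'}}\left(1 + \frac{r}{\pi+\pi'}\,\frac{1}{p}\right) \]
Similarly when both firms are conducting their research at S2 then v′2(.) is given by
\[ v_2'(\cdot) = -\frac{\pi'}{r+\pi+\pi'} + C^{2}_{22}[\Gamma(p)]^{\frac{r}{\pi+\pi'}}\left(1 + \frac{r}{\pi+\pi'}\,\frac{1}{1-p}\right) \]
Let $p^{2}_{s}$ ($p^{1}_{s}$) be the switching point of B (A). Then if $p^{2}_{s} < p^{*}_{1}$ ($p^{1}_{s} > p^{*}_{2}$), $v_1'(p^{2}_{s}) > 0$ ($v_2'(p^{1}_{s}) < 0$).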
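Proof. Since $C^{1}_{11}$ is chosen by imposing value matching at $p^{2}_{s}$, we have
\[ v_1'(p^{2}_{s}) = \frac{\pi'}{r+\pi+\pi'} - \left\{\left[\frac{\pi'}{r+\pi'} - \frac{\pi'}{r+\pi+\pi'}\right]\left(\frac{p^{2}_{s}}{1-p^{2}_{s}} + \frac{r}{\pi+\pi'}\,\frac{1}{1-p^{2}_{s}}\right)\right\} \]
It is easy to see that $v_1'(p^{2}_{s})$ is decreasing in $p^{2}_{s}$. Also we can show that $v_1'(p^{*}_{1}) = 0$ where $p^{*}_{1} = \frac{\pi'}{\pi+\pi'}$. Hence if $p^{2}_{s} < p^{*}_{1}$ then $v_1'(p^{2}_{s}) > 0$. Similarly we can argue for $v_2'(p^{1}_{s})$.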
A.3 Auxiliary results
A.3.1 For the proof of proposition (3
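Imposing the value matching condition to v1(.) at $p^{*N}_2$ and $p^{*N}_1$ we obtain:
\[ \frac{\pi'}{r+\pi+\pi'}\,p^{*N}_1 + C^{1}_{11}(1-p^{*N}_1)[\Lambda(p^{*N}_1)]^{\frac{r}{\pi+\pi'}} = \frac{\pi'}{r+\pi'}\,p^{*N}_1 \]
\[ \Rightarrow C^{1}_{11} = \frac{p^{*N}_1\left[\frac{\pi'}{r+\pi'} - \frac{\pi'}{r+\pi+\pi'}\right]}{(1-p^{*N}_1)[\Lambda(p^{*N}_1)]^{\frac{r}{\pi+\pi'}}} \qquad (A.1) \]
and
\[ \frac{\pi}{r+\pi+\pi'}(1-p^{*N}_2) + C^{1}_{22}\,p^{*N}_2[\Gamma(p^{*N}_2)]^{\frac{r}{\pi+\pi'}} = \frac{\pi'}{r+\pi'}\,p^{*N}_2 \]
\[ \Rightarrow C^{1}_{22} = \frac{\frac{\pi'}{r+\pi'}\,p^{*N}_2 - \frac{\pi}{r+\pi+\pi'}(1-p^{*N}_2)}{p^{*N}_2[\Gamma(p^{*N}_2)]^{\frac{r}{\pi+\pi'}}} \qquad (A.2) \]
Similarly by imposing value matching to v2(p) at $p^{*N}_1$ and $p^{*N}_2$ we obtain
\[ C^{2}_{11} = \frac{\frac{\pi'}{r+\pi'}(1-p^{*N}_1) - \frac{\pi}{r+\pi+\pi'}\,p^{*N}_1}{(1-p^{*N}_1)[\Lambda(p^{*N}_1)]^{\frac{r}{\pi+\pi'}}} \qquad (A.3) \]
\[ C^{2}_{22} = \frac{(1-p^{*N}_2)\left[\frac{\pi'}{r+\pi'} - \frac{\pi'}{r+\pi+\pi'}\right]}{p^{*N}_2[\Gamma(p^{*N}_2)]^{\frac{r}{\pi+\pi'}}} \qquad (A.4) \]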
For both 1 and 2, the switching point is in the interior of the range of beliefs over which
the other player’s action is constant. This implies that v1(p) and v2(p) should satisfy
certain smooth pasting conditions.
Lemma 29 v1(.) and v2(.) are smooth at p∗N2 and p∗N1 respectively.
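Proof of Lemma. v1(.) is smooth at $p^{*N}_2$ if
\[ \frac{-\pi}{r+\pi+\pi'} + C^{1}_{22}[\Gamma(p^{*N}_2)]^{\frac{r}{\pi+\pi'}}\left[1 + \frac{r}{\pi+\pi'}\,\frac{1}{1-p^{*N}_2}\right] = \frac{\pi'}{r+\pi'} \]
Substituting the value of $C^{1}_{22}$ from (A.2) it can be shown that the above equality holds.
Similarly v2(.) is smooth at $p^{*N}_1$ if
\[ \frac{\pi}{r+\pi+\pi'} - C^{2}_{11}[\Lambda(p^{*N}_1)]^{\frac{r}{\pi+\pi'}}\left[1 + \frac{r}{\pi+\pi'}\,\frac{1}{p^{*N}_1}\right] = -\frac{\pi'}{r+\pi'} \]
Substituting the value of $C^{2}_{11}$ from (A.3) it can be shown that the above equality holds.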
This concludes the proof.
A.3.2 For the proof of proposition (8)
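The derivative of $C^{p}_{1}[\Lambda(p)]^{\frac{r}{\pi_1+\pi_2}}$ with respect to p is given by
\[ \frac{(1-p)v_{SR}'(p) + v_{SR}(p) - \frac{\pi_1+\pi_2}{r+\pi_1+\pi_2}}{(1-p)^2}. \]
Since
\[ v_{SR}'(p) = \frac{r\pi_1}{(r+\pi_0)(r+\pi_0+\pi_1)} - C_2[\Lambda(p)]^{\frac{r+\pi_0}{\pi_1}}\left[1 + \frac{r+\pi_0}{\pi_1 p}\right] \]
\[ \Rightarrow (1-p)v_{SR}'(p) = \frac{\pi_0}{r+\pi_0} + \frac{r\pi_1}{(r+\pi_0+\pi_1)(r+\pi_0)} - v_{SR}(p) - C_2[\Lambda(p)]^{\frac{r+\pi_0}{\pi_1}}\,\frac{(r+\pi_0)}{\pi_1}\,\frac{1}{p} \]
Hence
\[ (1-p)v_{SR}'(p) + v_{SR}(p) - \frac{\pi_1+\pi_2}{r+\pi_1+\pi_2} = \frac{\pi_1+\pi_0}{r+\pi_0+\pi_1} - \frac{\pi_1+\pi_2}{r+\pi_2+\pi_1} - C_2[\Lambda(p)]^{\frac{r+\pi_0}{\pi_1}}\,\frac{(r+\pi_0)}{\pi_1}\,\frac{1}{p} < 0 \]
This follows from the fact that $\left[\frac{\pi_1+\pi_0}{r+\pi_0+\pi_1} - \frac{\pi_1+\pi_2}{r+\pi_2+\pi_1}\right] < 0$ and C2 > 0.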
A.3.3 For the proof of lemma (8)
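Since firm 2 switches at $p^{*N}_1$,
\[ v^{R\prime}_2(p^{*N}_1) = \frac{\frac{\pi_2}{r+\pi_1+\pi_2}(1-p^{*N}_1) - \frac{\pi_0}{r+2\pi_0} + \frac{\pi_2}{r+\pi_1+\pi_2}\,p^{*N}_1 - \frac{\pi_0 r}{(r+2\pi_0)(\pi_1+\pi_2)\,p^{*N}_1} + \frac{\pi_2 r}{(\pi_1+\pi_2)(r+\pi_1+\pi_2)}}{(1-p^{*N}_1)} \]
This is obtained by imposing the value matching condition at $p^{*N}_1$. Substituting $p^{*N}_1 = \frac{\pi_0}{\pi_1}$, the value of the numerator is:
\[ \frac{\pi_2}{\pi_1+\pi_2} - \frac{\pi_0}{r+2\pi_0} - \frac{r\pi_1}{(r+2\pi_0)(\pi_1+\pi_2)} < \frac{\pi_2}{\pi_1+\pi_2} - \frac{\pi_0}{r+2\pi_0} - \frac{r\pi_2}{(r+2\pi_0)(\pi_1+\pi_2)} \]
\[ = \frac{2\pi_0\pi_2}{(\pi_1+\pi_2)(r+2\pi_0)} - \frac{\pi_0}{r+2\pi_0} = \frac{\pi_0(\pi_2-\pi_1)}{(\pi_1+\pi_2)(r+2\pi_0)} < 0 \]
as π1 > π2.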
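The algebra in this step can be checked numerically. The sketch below evaluates the numerator term by term at the switching belief π0/π1, compares it with the simplified form, and confirms the chain of inequalities. The grouping of terms in `numerator_full` follows one reading of the expression above and should be treated as an assumption; the parameter values are illustrative only and satisfy π1 > π2 and π0 < π1.

```python
def numerator_full(r, pi0, pi1, pi2):
    """Numerator of the derivative of firm 2's value at p = pi0/pi1, term by term."""
    p = pi0 / pi1
    return (pi2 / (r + pi1 + pi2) * (1.0 - p)
            - pi0 / (r + 2.0 * pi0)
            + pi2 / (r + pi1 + pi2) * p
            - pi0 * r / ((r + 2.0 * pi0) * (pi1 + pi2) * p)
            + pi2 * r / ((pi1 + pi2) * (r + pi1 + pi2)))


def numerator_simplified(r, pi0, pi1, pi2):
    """Simplified numerator after substituting p = pi0/pi1."""
    return (pi2 / (pi1 + pi2) - pi0 / (r + 2.0 * pi0)
            - r * pi1 / ((r + 2.0 * pi0) * (pi1 + pi2)))


def upper_bound(r, pi0, pi1, pi2):
    """Bound obtained by replacing pi1 with pi2 in the last term."""
    return pi0 * (pi2 - pi1) / ((pi1 + pi2) * (r + 2.0 * pi0))


# Assumed illustrative parameters.
r, pi0, pi1, pi2 = 0.1, 0.2, 0.6, 0.4
full = numerator_full(r, pi0, pi1, pi2)
simple = numerator_simplified(r, pi0, pi1, pi2)
bound = upper_bound(r, pi0, pi1, pi2)
print(full, simple, bound)
```

For these parameters the two forms of the numerator agree, and the numerator lies below the negative bound, consistent with the sign claimed in the text.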
A.4 Strategy depending on both belief and the location of
the opponent
Case 1 :
r(π′ − π)− ππ′ > 0
Using lemma (3) one can show that given firm 2 is at S2, firm 1 will go to S1 for
p ∈ [p∗N2 , 1] and to S2 for p ∈ [0, p∗N2 ].
Further, we can show that given firm 2 is at S1, firm 1 will go to S1 for p ∈ (p′2, 1] and to S2 for p ∈ [0, p′2], where $p'_2 = \frac{r\pi}{r\pi' + \pi(r+\pi')}$. We have $p'_2 < p^{*N}_2$.
Similarly, we can show the following:
Given that firm 1 is at S1, firm 2 will go to S1 for p ∈ (p∗N1 , 1] and to S2 for p ∈ [0, p∗N1 ].
Given that firm 1 is at S2, firm 2 will go to S1 for p ∈ (p′1, 1] and to S2 for p ∈ [0, p′1), where $p'_1 = \frac{\pi'(r+\pi)}{r\pi' + \pi(r+\pi')}$. We have $p'_1 > p^{*N}_1$.
Define the following strategy for firm 1(s1) :
Choose S2 for p ∈ [0, p′2).
For p ∈ [p′2, p∗N2 ), choose S2 if 2 is at S2. If 2 is at S1, then choose S1.
Choose S1 for p ∈ [p∗N2 , 1].
Define the following strategy for firm 2(s2):
Choose S1 for p ∈ (p′1, 1].
For p ∈ (p∗N1 , p′1], choose S1 if 1 is at S1. If 1 is at S2, then choose S2.
Choose S2 for p ∈ [0, p∗N1 ].
Then (s1, s2) constitutes an equilibrium and the outcome is the same as the one obtained with stationary Markovian strategies.
Case 2:
r(π′ − π)− ππ′ < 0
First, observe that p′1 > p∗1 and p′2 < p∗2. Thus for any p∗ ∈ [max{ps, p∗N1 }, 1 − max{ps, p∗N1 }], p∗ ∈ (p′2, p′1).
Define the following strategy for firm 1(s1) :
Choose S1 for p ∈ [p∗, 1].
For p ∈ [p′2, p∗), choose S2 if 2 is at S2. If 2 is at S1 then choose S1.
Choose S2 for p ∈ [0, p′2).
Define the following strategy for firm 2(s2) :
Choose S2 for p ∈ [0, p∗].
For p ∈ (p∗, p′1], choose S1 if 1 is at S1. If 1 is at S2, then choose S2.
Choose S1 for p ∈ (p′1, 1].
(s1, s2) constitutes an equilibrium and the outcome is the same as the one obtained with stationary Markovian strategies.
A.5 Proof of Lemma 11
The proof proceeds as follows: We first show that F1(·), F2(.) as given are probability distributions and have the desired properties. Next we show that q, q′ are in (0,1). Assuming
pl is between M and H, we then show that the strategies are an equilibrium. In the lemma
(12), we show that there is a unique pl implied by all these conditions and it is between M
and H.
Since both buyers offer to SM , it is clear that in equilibrium the offers to SM from both
the buyers have to be randomised.
To begin with, we figure out the continuation payoff for SM from rejecting her offer(s).
Consider the case when rejecting an offer leads her to face a 2-player game next period.
This gives her a continuation payoff of zero.
When rejection leads SM to face a 4-player game next period, the continuation payoff
needs to be determined from the equilibrium strategies of the buyers. (Recall y is the
maximum price SM gets in equilibrium in the next period (a random variable this period)).
Thus if pl is the minimum acceptable price for SM in this situation, we must have,
pl −M = δ(E(y)−M)
⇒ pl = (1− δ)M + δ(E(y))
Given the buyers’ strategies, E(y) is given by,
E(y) = q[q′M + (1− q′)E2(p)] + (1− q)[q′E1(p) + (1− q′)E(highest offer)]
where E1(p) is the conditional expectation of B1’s offers given that he is offering to SM
and E2(p) is the conditional expectation of B2’s offers given that he is not offering M to
SM .
Since, as per our proposed strategies, competition takes place for SM only, it is easy to
note that E(y) > M . The fact δ ∈ (0, 1) implies that we must have pl > M .
Consider the region [pl, H] first, where both B2 and B1( if he does make one to SM )
make an offer. In equilibrium both buyers must be indifferent for all price offers in this
region.
According to the proposed strategies the support of B1’s offer to SM is [pl, H]. Also
we know that B1 in equilibrium can obtain a payoff of v −H by offering H to SH . Hence
for any s ∈ (pl, H] we should have the following indifference relation:
\[ (v-s)[q' + (1-q')F_2(s)] + (1-q')(1-F_2(s))[\delta(v-H)] = v-H \]
which gives us,
\[ F_2(s) = \frac{(v-H)(1-\delta(1-q')) - q'(v-s)}{(1-q')[(v-s)-\delta(v-H)]} \qquad (A.5) \]
As stated earlier, pl is the minimum acceptable price for SM , when on rejection he faces
a 4-player game next period. This implies that on the equilibrium path pl is the minimum
acceptable price for SM when he gets two offers. Thus B1’s offer of pl to SM is accepted
only when B2 offers M to SM . Hence for s = pl, B1’s indifference relation is,
\[ (v-p_l)[q'] + (1-q')[\delta(v-H)] = v-H \]
which implies,
\[ q' = \frac{[v-H](1-\delta)}{(v-p_l)-\delta(v-H)} \]
as per (3.4).
Since H > pl, from (3.4) we have,
\[ q' = \frac{[v-H](1-\delta)}{(v-p_l)-\delta(v-H)} < \frac{[v-H](1-\delta)}{[v-H](1-\delta)} = 1 \]
This implies that q′ ∈ (0, 1).
For F2(.) to be a distribution function as conjectured we must have F2(H) = 1 and F2(pl) = 0. From (3.2) we have,
\[ 1 - F_2(s) = \frac{H-s}{(1-q')[(v-s)-\delta(v-H)]} \]
From (3.4) we can infer that,
\[ 1 - F_2(p_l) = \frac{H-p_l}{(1-q')[(v-p_l)-\delta(v-H)]} = \frac{(1-q')}{(1-q')} = 1 \]
and 1−F2(H) = 0. Thus we have F2(H) = 1 and F2(pl) = 0. Hence F2 has the conjectured properties.
Now consider the behavior of B2 in the selected region. Since B2 can obtain a payoff
of v −H by offering H to SM , for any s ∈ [pl, H] we should have,
(v − s)[q + (1− q)F1(s)] + (1− q)(1− F1(s))[δ(v −H)] = v −H
which gives us (3.1).
Next, consider other regions. According to the conjectured equilibrium strategies, B2
offers M to SM with probability q′ (i.e., he puts a mass point at M). Also B1 offers H to SH with probability q. At equilibrium B2 should be indifferent for all price offers he makes.
Therefore we should have,
(v −M)q + (1− q)δ(v −H) = v −H
which gives us (3.3). Since H > M , from ( 3.3) we have,
\[ q = \frac{[v-H](1-\delta)}{(v-M)-\delta(v-H)} < \frac{[v-H](1-\delta)}{[v-H](1-\delta)} = 1 \]
This implies that q ∈ (0, 1).
For F1 to satisfy the conjectured properties we should have F1(pl) > 0 (since B1 puts a mass point at pl while offering to SM ) and F1(H) = 1. From (3.1) and (3.3) we have,
\[ 1 - F_1(p_l) = \frac{H-p_l}{(1-q)[(v-p_l)-\delta(v-H)]} = \frac{(1-q')}{(1-q)} \]
Since pl > M , the denominator of q′ is smaller than that of q, so q′ > q. Thus
\[ \frac{(1-q')}{(1-q)} < 1 \qquad (A.6) \]
From (A.6) we can infer that,
\[ 1 - F_1(p_l) < 1 \Rightarrow F_1(p_l) > 0 \]
Also it is easy to note that 1 − F1(H) = 0. Hence F1(H) = 1. Thus F1(.) satisfies the conjectured properties.
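The properties established above for q, q′, F1 and F2 can be verified numerically. The sketch below uses the closed forms from the indifference relations and takes pl as a given number between M and H, which is a simplifying assumption (in the proof pl is pinned down separately by a fixed-point condition). All parameter values and function names are hypothetical.

```python
def mixing_probs(v, H, M, p_l, delta):
    """q from the indifference relation at M; q' from the indifference relation at p_l."""
    q = (v - H) * (1.0 - delta) / ((v - M) - delta * (v - H))
    q_prime = (v - H) * (1.0 - delta) / ((v - p_l) - delta * (v - H))
    return q, q_prime


def offer_cdf(s, mass, v, H, delta):
    """Conditional distribution of offers to S_M on [p_l, H].

    Pass mass = q to obtain F1(s) and mass = q' to obtain F2(s)."""
    num = (v - H) * (1.0 - delta * (1.0 - mass)) - mass * (v - s)
    den = (1.0 - mass) * ((v - s) - delta * (v - H))
    return num / den


# Assumed illustrative values with M < p_l < H < v.
v, H, M, p_l, delta = 10.0, 8.0, 4.0, 6.0, 0.9
q, q_prime = mixing_probs(v, H, M, p_l, delta)

F2_at_pl = offer_cdf(p_l, q_prime, v, H, delta)
F2_at_H = offer_cdf(H, q_prime, v, H, delta)
F1_at_pl = offer_cdf(p_l, q, v, H, delta)
F1_at_H = offer_cdf(H, q, v, H, delta)
print(q, q_prime, F1_at_pl, F2_at_pl)
```

In this example q′ exceeds q, F2 starts at zero, F1 has a mass point at pl equal to 1 − (1 − q′)/(1 − q), and both distributions reach one at H.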
Lastly, to conclude the proof it needs to be verified that above specified strategies
constitute a subgame perfect equilibrium. We use the one deviation property to do this.
Consider the sellers first. Since we are considering public offers, a seller’s history con-
stitutes the set of players, the offer she receives and the other seller’s received offer. On
the equilibrium path there are only two possible histories. One has all the players present
with both sellers getting equilibrium offers. The other one is when only two players are
present and an equilibrium offer is made. It is easy to observe that in the two-player game
no seller has a profitable one-shot deviation. This is because offers are one sided. Thus we
need to verify equilibrium for the 4-player game only. In the 4-player game irrespective of
SM ’s offer, it is always optimal for SH to accept any offer greater than or equal to H. If
she rejects then next period either she will face a 4-player game or a 2-player game.
In either case, given that other players adhere to their equilibrium strategies the maximum
payoff which SH can obtain is 0. Also SH has no incentive to accept any offer less than H,
(which gives her a negative payoff), as she can always guarantee a zero payoff by rejecting
the offer.
Next let us look at the possible one-shot deviations for SM on the equilibrium path.
Suppose in the event when she gets two offers, she rejects an offer greater than or equal to pl. Her continuation payoff would then be pl −M . This is less than or equal to the
payoff obtained by accepting the offer. Thus on the candidate equilibrium path there is
no profitable one-shot deviation by SM . Finally, the way we have specified SM ’s strategy,
there exists no profitable one shot deviations for SM for any off-path history.
Now consider the buyers. After any history there can be only two possible situations.
Either all the players are present or only one pair remains. Given other players’ strategies
and the one-deviation property it is easy to note that buyers cannot profitably deviate.
This concludes the proof of the lemma.
A.6 Proof of Lemma 13
We prove this in the following steps:
(i) From the expression obtained for q′ we can say that $q'^{x}$ is increasing in x.
(ii) Next we show that as we raise x by 1 unit, there is an increase in $E^{x}_2(p)$ by less than 1 unit.
Increasing x by 1 unit means raising the lower bound of support of $F^{x}_2(.)$ by 1 unit. Thus we need to show that
\[ E^{x+1}_2(p) < E^{x}_2(p) + 1 \]
Consider the distribution $\tilde{F}^{x}_2(.)$ with [x+ 1, H + 1] as the support such that,
\[ \tilde{F}^{x}_2(s) = F^{x}_2(s-1) \]
Let $\tilde{E}^{x}_2(p)$ be the expectation obtained under $\tilde{F}^{x}_2(s)$. Thus,
\[ \tilde{E}^{x}_2(p) = \int_{x+1}^{H+1} s \, d\tilde{F}^{x}_2(s) \]
\[ \Rightarrow \tilde{E}^{x}_2(p) = \left[\int_{x+1}^{H+1} (s-1)\, d\tilde{F}^{x}_2(s)\right] + 1 = \left[\int_{x+1}^{H+1} (s-1)\, dF^{x}_2(s-1)\right] + 1 = \left[\int_{x}^{H} s \, dF^{x}_2(s)\right] + 1 = E^{x}_2(p) + 1 \]
$F^{x+1}_2(p)$ is obtained from $\tilde{F}^{x}_2(s)$ by transferring the mass from the interval (H,H + 1] to [x+ 1, H], i.e. transferring mass from higher values to lower values. Thus it is clear that,
\[ E^{x+1}_2(p) < \tilde{E}^{x}_2(p) = E^{x}_2(p) + 1 \]
By similar reasoning we can say that,
\[ E^{x+1}_1(p) < E^{x}_1(p) + 1 \]
These imply that the increase in E(highest offer) following a unit increase in x is less than 1.
Hence from the above arguments it follows that,
\[ \frac{\partial E^{x}(y)}{\partial x} < 1 \]
A.7 Proof of lemma 23
As before, define the function $G(\cdot)$ as
\[ G(x) = x - [\delta E_x(y) + (1-\delta)M] \]
where $E_x(y)$ is obtained from $E(y)$ as before (i.e. by replacing $p_l$ by $x$). Using lemma 13 we can argue that $G(x)$ is monotonically increasing in $x$ for $x \in (p'_l, H)$, since $G'(x) = 1 - \delta \, \partial E_x(y)/\partial x > 0$. Next, from the above prescribed strategies it is easy to see that for any $x \in (p'_l, H)$, we have $E_x(y) > p'_l$. Thus we can infer that there exists a $\delta^* \in (0, 1)$ such that
\[ \lim_{x \to p'_l} G(x) = x - [\delta^* E_x(y) + (1-\delta^*)M] = 0 \]
Thus for any $\delta > \delta^*$, we have $\lim_{x \to p'_l} G(x) < 0$. Also, since $E_x(y) < H$ for all $x \in (p'_l, H)$, we have $\lim_{x \to H} G(x) > 0$. Hence by applying the Intermediate Value Theorem we can infer that there exists a unique $x^* \in (p'_l, H)$ such that $G(x^*) = 0$. This $x^*$ is our required $p_l$. Thus for all $\delta > \delta^*$ there is a unique $p_l \in (p'_l, H)$ such that
\[ G(p_l) = 0 \Rightarrow p_l = \delta E(y) + (1-\delta)M \]
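The fixed-point argument above can be sketched numerically. The example below is only an illustration under stated assumptions: the true $E_x(y)$ comes from the equilibrium strategies, so the affine function `expected_highest_offer` is a hypothetical stand-in chosen to satisfy the two properties the proof uses (slope below 1 and $E_x(y) < H$), and all parameter values are made up.

```python
def find_root_by_bisection(g, lo, hi, tol=1e-10):
    """Locate the root of an increasing function g on (lo, hi) by bisection."""
    if g(lo) >= 0 or g(hi) <= 0:
        raise ValueError("g must be negative at lo and positive at hi")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical parameters (not taken from the dissertation).
H, M, delta, p_l_prime = 10.0, 4.0, 0.9, 5.0


def expected_highest_offer(x):
    """Stand-in for E_x(y): slope 0.5 < 1 and value below H whenever x < H."""
    return 0.5 * x + 0.5 * H


def G(x):
    """G(x) = x - [delta * E_x(y) + (1 - delta) * M], as defined in the proof."""
    return x - (delta * expected_highest_offer(x) + (1 - delta) * M)


p_l = find_root_by_bisection(G, p_l_prime, H)
print(p_l)
```

Because $G$ is increasing and changes sign on the interval, bisection converges to the unique root, which mirrors the role of the Intermediate Value Theorem in the proof.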
A.8 Proof of Proposition 15
Consider Buyer B1 first. He puts a mass point at L and his equilibrium payoff is v −H.
Since we are considering public offers, S1 will accept an offer of L only when Bn is offering
to Sn. This is because only in that contingency would the continuation payoff to S1 from
rejection be zero. Thus we must have,
(v − L)qH + (1− qH)δ(v −H) = v −H (A.7)
Solving for qH we get (3.12). Consider the region [p1, H], where both B1 and Bn make
offers. In equilibrium each buyer should be indifferent among all the points in the support.
Thus for s ∈ [p1, H], B1’s indifference relation is given by:
(v − s)[(1− q1) + q1F1n(s)] + q1(1− F 1
n(s))δ(v −H) = v −H
136
Solving for F 1n(.) from the above relation we get (3.14). Similarly for s ∈ [p1, H], Bn’s
indifference relation from offering to S1 is,
(v − s)[q′1 + (1− q′1)F1(s)] + (1− q′1)(1− F1(s))δ(v −H) = v −H
Solving for F1(.) we get (3.13). Putting s = p1 in Bn’s indifference relation we get,
(v − s)q′1 + (1− q′1)δ(v −H) = v −H
which gives us (3.15). Note that from (3.14) and (3.11) we have,
1− F 1n(p1) =
H − p1
q1[(v − p1)− δ(v −H)]=q1
q1
From (3.16) we know that q1 > q1. Hence we have 1 − F 1n(p1) < 1 which implies that
F 1n(p1) > 0. This confirms our conjecture that Bn, while offering to S1 puts a mass point
at p1. It is easy to check that F 1n(H) = 1. Similarly from (3.13) and (3.15) we have,
1− F1(p1) =H − p1
(1− q′1)[(v − p1)− δ(v −H)]=
(1− q′1)(1− q′1)
= 1
which implies F1(p1) = 0. Again it is easy to observe that F1(H) = 1.
Next, consider buyer Bi, $i = 2, ..., n-1$. Consider the region $[p_i, H]$, where both Bi and Bn make offers. In equilibrium both buyers should be indifferent among all offers in the region. For $s \in [p_i, H]$, Bn's indifference relation is given by
\[ (v-s)F_i(s) + [1-F_i(s)]\delta(v-H) = v-H \]
Solving for $F_i(\cdot)$ from above, we get (3.17). We can easily infer that $F_i(p_i) > 0$ and $F_i(H) = 1$. This confirms the conjecture that Bi puts a mass point at $p_i$. Similarly, Bi's indifference relation is given by:
\[ (v-s)[(1-\bar{q}_i) + \bar{q}_i F^i_n(s)] + \bar{q}_i(1-F^i_n(s))\delta(v-H) = v-H \]
which gives us (3.18). Putting $s = p_i$ in Bi's indifference relation we get $\bar{q}_i = \frac{H-p_i}{(v-p_i)-\delta(v-H)} = q_i$. Hence we have
\[ 1 - F^i_n(p_i) = \frac{H-p_i}{\bar{q}_i[(v-p_i)-\delta(v-H)]} = \frac{q_i}{\bar{q}_i} = 1 \]
Thus $F^i_n(p_i) = 0$ and $F^i_n(H) = 1$. Also note that
\[ \sum_{i=1}^{n-1} \bar{q}_i + q_H = q_1 + (1-P-q_H) + \sum_{j=2}^{n-1} q_j + q_H = 1 \]
Since uj > L for j > 1, from (A.7) we know that,
(v − uj)qH + (1− qH)δ(v −H) < v −H for j = 2, ...n− 1
Hence Bi (i = 2, ..., N) does not have any incentive to offer uj to seller Sj . Further, Bi
cannot obtain a payoff higher than v − H by deviating unilaterally and making offers to
any other sellers. Lastly, the way we have specified sellers’ strategies it is easy to check
that none of the sellers has a unilateral profitable deviation on the equilibrium path. This
concludes the proof.
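The indifference conditions used in this proof are linear in the unknown probability, so they can be checked numerically. The sketch below solves (A.7) for $q_H$ and Bn's indifference relation for $F_i(s)$ by direct algebra; the function names and the parameter values are hypothetical, and the closed forms are derived here from the relations stated in this appendix rather than copied from equations (3.12) or (3.17).

```python
def q_H_from_A7(v, H, L, delta):
    """Solve (v - L) q + (1 - q) delta (v - H) = v - H for q."""
    return (v - H) * (1 - delta) / ((v - L) - delta * (v - H))


def lhs_A7(q, v, H, L, delta):
    """Left-hand side of (A.7): B1's payoff from the mass point at L."""
    return (v - L) * q + (1 - q) * delta * (v - H)


def F_i(s, v, H, delta):
    """Solve (v - s) F + (1 - F) delta (v - H) = v - H for F."""
    return (v - H) * (1 - delta) / ((v - s) - delta * (v - H))


def bn_payoff(s, v, H, delta):
    """Bn's expected payoff from offering s when Bi randomises with F_i."""
    F = F_i(s, v, H, delta)
    return (v - s) * F + (1 - F) * delta * (v - H)


# Hypothetical parameter values for illustration only.
v, H, L, delta = 10.0, 8.0, 5.0, 0.9
q_H = q_H_from_A7(v, H, L, delta)
print(q_H, lhs_A7(q_H, v, H, L, delta), F_i(H, v, H, delta))
```

With these numbers the payoff from every offer in the support equals $v - H$, which is the indifference property the proof relies on.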
A.9 Proof of Proposition 16
This proof is identical in many respects to the proof of Proposition 15. Consider the region $[p_i, H]$ $(i = 1, .., n-1)$. In this region both Bi and Bn make offers with positive probability. By considering the indifference relations of Bi and Bn in this region, we can get (3.19) and (3.20) in the same manner as we obtained (3.17) and (3.18) in the proof of the previous proposition. Similarly, we can infer that $F_i(p_i) > 0$, $F_i(H) = 1$ and $F^i_n(p_i) = 0$, $F^i_n(H) = 1$.
Since $q_n = 1 - P < q_H$, from (A.7) we know that
\[ (v-L)q_H + (1-q_H)\delta(v-H) = v-H \]
and
\[ (v-u_j)q_H + (1-q_H)\delta(v-H) < v-H \]
for all $j = 2, .., n-1$. Since $q_n < q_H$, for all $j = 1, .., n-1$ we have
\[ (v-u_j)q_n + (1-q_n)\delta(v-H) < v-H \]
Hence Bi $(i = 1, ..., n-1)$ has no incentive to offer $u_i$ to seller Si. Finally note that
\[ \sum_{i=1}^{n} q_i = \sum_{i=1}^{n-1} q_i + (1-P) = 1 \]
This concludes the proof.
A.10 Proof of Proposition 17
Consider the region $[p_i, p]$ $(i = 1, ..., n-1)$, where both the buyers Bi and Bn make offers. Hence the indifference relation of Bn is given by
\[ (v-s)F_i(s) + (1-F_i(s))\delta(v-H) = v-p \]
This gives us (3.21). One can easily see from (3.21) that $F_i(p_i) > 0$ and $F_i(p) = 1$. This confirms our conjecture that Bi $(i = 1, .., n-1)$ puts a mass point at $p_i$. Buyer Bi's indifference relation is given by
\[ (v-s)[(1-q_i) + q_i F^i_n(s)] + q_i(1-F^i_n(s))\delta(v-H) = v-p \]
Solving for $F^i_n(\cdot)$ we get (3.22). By substituting $s = p_i$ in Bi's indifference relation we get (3.23). From (3.22) and (3.23) it is easy to see that $F^i_n(p_i) = 0$ and $F^i_n(H) = 1$. For consistency in the expressions obtained we must have
\[ \sum_{i=1}^{n-1} q_i = 1 \Rightarrow \sum_{i=1}^{n-1} \frac{p-p_i}{(v-p_i)-\delta(v-H)} = 1 \tag{A.8} \]
From the hypothesis of the proposition we know that $P \geq 1$. If $P = 1$, from (A.8) we have $p = H$. If $P > 1$, from (A.8) we can infer that $p < H$.
From the analysis of the basic complete information game we know that for each $i = 1, ..., n-1$, $\frac{H-p_i}{(v-p_i)-\delta(v-H)} \to 1$ and $p_i \to H$ as $\delta \to 1$. Thus if $P > 1$ for a particular $\delta^* \in (0, 1)$,¹ it will be so for all $\delta > \delta^*$. Thus, the equilibrium behavior will remain the same for all higher values of $\delta$. Hence we can characterise the equilibrium for values of $\delta$ close to one. Using (A.8) we have
\[ \sum_{i=1}^{n-1} \Big(1 - \frac{p-p_i}{(v-p_i)-\delta(v-H)}\Big) = n-2 \Rightarrow \sum_{i=1}^{n-1} \frac{(v-p)-\delta(v-H)}{(v-p_i)-\delta(v-H)} = n-2 \]
\[ \Rightarrow p = v - (n-2)\left[\frac{\prod_{i=1}^{n-1}[(v-p_i)-\delta(v-H)]}{\sum_{j=1}^{n-1}\Big[\prod_{k=1,\,k \neq j}^{n-1}\{(v-p_k)-\delta(v-H)\}\Big]}\right] - \delta(v-H) \tag{A.9} \]
From the basic model we know that for each $i = 1, .., n-1$, $p_i \to H$ as $\delta \to 1$. Hence $[(v-p_i)-\delta(v-H)] \to 0$ as $\delta \to 1$. From (A.9) we have
\[ p = v - \left[\frac{n-2}{\sum_{j=1}^{n-1}\Big[\frac{\prod_{k=1,\,k \neq j}^{n-1}\{(v-p_k)-\delta(v-H)\}}{\prod_{i=1}^{n-1}[(v-p_i)-\delta(v-H)]}\Big]}\right] - \delta(v-H) \]
As $\delta \to 1$,
\[ \left[\frac{n-2}{\sum_{j=1}^{n-1}\Big[\frac{\prod_{k=1,\,k \neq j}^{n-1}\{(v-p_k)-\delta(v-H)\}}{\prod_{i=1}^{n-1}[(v-p_i)-\delta(v-H)]}\Big]}\right] \to 0 \]
Hence as $\delta \to 1$, $p \to H$. This concludes the proof.
¹ In fact, as $\delta$ increases we will eventually have $P > 1$.
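The step from (A.8) to (A.9) can be checked numerically. The following sketch evaluates the closed form (A.9) for the upper bound $p$ and confirms that the implied probabilities $q_i$ sum to one, as (A.8) requires. The lower bounds `p_lower` and the other parameter values are hypothetical numbers chosen for illustration; in the model the $p_i$ are determined in equilibrium.

```python
from math import prod


def upper_bound_price(v, H, delta, p_lower):
    """Evaluate (A.9): p as a function of the lower bounds p_1, ..., p_{n-1}."""
    terms = [(v - p_i) - delta * (v - H) for p_i in p_lower]
    n = len(p_lower) + 1  # n buyers; Bn makes offers to sellers 1..n-1
    numerator = prod(terms)
    denominator = sum(
        prod(terms[k] for k in range(len(terms)) if k != j)
        for j in range(len(terms))
    )
    return v - (n - 2) * numerator / denominator - delta * (v - H)


def offer_probabilities(v, H, delta, p_lower, p):
    """q_i = (p - p_i) / [(v - p_i) - delta (v - H)], the summand in (A.8)."""
    return [(p - p_i) / ((v - p_i) - delta * (v - H)) for p_i in p_lower]


# Hypothetical values with n = 4 (three sellers receiving offers from Bn).
v, H, delta = 10.0, 8.0, 0.9
p_lower = [7.0, 7.2, 7.4]

p = upper_bound_price(v, H, delta, p_lower)
q = offer_probabilities(v, H, delta, p_lower, p)
print(p, q, sum(q))
```

For these inputs the sum of $(H - p_i)/[(v - p_i) - \delta(v - H)]$ exceeds 1, so the computed $p$ lies strictly below $H$, in line with the case $P > 1$ discussed in the proof.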
A.11 Details of the equilibria defined in proposition (18)
We give here a more detailed description of the equilibrium for heterogeneous buyers for
the n× n model.
A.11.1 $P^h < 1$ and $1 - P^h > q_H$
Buyer Bi $(i = 1, .., n-1)$ offers to seller Si only. B1, while making offers to S1, puts a mass of $q^{\prime h}_1$ at $L$. With probability $(1 - q^{\prime h}_1)$ he randomises his offers to S1 using a continuous (conditional) probability distribution function $F^h_1$ with $[p^h_1, H]$ as the support. Bn offers to S1 with probability $\bar{q}^h_1$. His offers are randomised using a probability distribution function $F^1_{nh}$ with $[p^h_1, H]$ as the support. $F^1_{nh}$ puts a mass point at $p^h_1$. The distributions $F^h_1$, $F^1_{nh}$ and the probabilities $\bar{q}^h_1$ and $q^{\prime h}_1$ are given by:
\[ F^h_1 = \frac{(v_n-H)[1-\delta(1-q^{\prime h}_1)] - q^{\prime h}_1(v_n-s)}{(1-q^{\prime h}_1)[(v_n-s)-\delta(v_n-H)]} \]
\[ F^1_{nh} = \frac{(v-H)[1-\delta \bar{q}^h_1] - (1-\bar{q}^h_1)(v-s)}{\bar{q}^h_1[(v-s)-\delta(v-H)]} \]
\[ q^{\prime h}_1 = \frac{(v_n-H)(1-\delta)}{(v_n-p^h_1)-\delta(v_n-H)} \]
\[ \bar{q}^h_1 = q^h_1 + (1 - P^h - q_H) \]
For $i = 2, .., n-1$, Bi's offers to Si are randomised with a distribution $F^h_i(s)$. $F^h_i(\cdot)$ puts a mass point at $p^h_i$ and has an absolutely continuous part from $p^h_i$ to $H$. Bn makes offers to Si $(i = 2, .., n-1)$ with probability $\bar{q}^h_i = q^h_i$. Bn's offers to Si are randomised using an absolutely continuous probability distribution $F^i_{nh}$ with $[p^h_i, H]$ as the support. For $i = 2, ..., n-1$, $F^h_i(\cdot)$ and $F^i_{nh}(\cdot)$ are given by
\[ F^h_i = \frac{(v_n-H)(1-\delta)}{(v_n-s)-\delta(v_n-H)} \]
\[ F^i_{nh} = \frac{(v_i-H)[1-\delta q^h_i] - (1-q^h_i)(v_i-s)}{q^h_i[(v_i-s)-\delta(v_i-H)]} \]
Bn offers to Sn with probability $q_H$. He offers $H$ to Sn.
A.11.2 $P^h < 1$ and $1 - P^h < q_H$
Buyer Bi $(i = 1, .., n-1)$ offers to seller Si only. Bi's offers to Si are randomised with a distribution $F^h_i(s)$. $F^h_i(\cdot)$ puts a mass point at $p^h_i$ and has an absolutely continuous part from $p^h_i$ to $H$. Bn makes offers to Si $(i = 1, .., n-1)$ with probability $\bar{q}^h_i = q^h_i$. Bn's offers to Si are randomised with an absolutely continuous probability distribution $F^i_{nh}$ with $[p^h_i, H]$ as the support. For $i = 1, .., n-1$, $F^h_i(\cdot)$ and $F^i_{nh}(\cdot)$ are given by
\[ F^h_i = \frac{(v_n-H)(1-\delta)}{(v_n-s)-\delta(v_n-H)} \]
\[ F^i_{nh} = \frac{(v_i-H)[1-\delta q^h_i] - (1-q^h_i)(v_i-s)}{q^h_i[(v_i-s)-\delta(v_i-H)]} \]
Bn offers to Sn with probability $q^h_n = 1 - P^h$. He offers $H$ to Sn.
A.11.3 $P^h \geq 1$
Buyer Bi makes offers to seller Si only. Bi's offers to Si are randomised using a distribution function $F^h_i(\cdot)$ with $[p^h_i, p^h]$ as the support. The distribution $F^h_i(\cdot)$ puts a mass point at $p^h_i$ and has an absolutely continuous part from $p^h_i$ to $p^h$. Buyer Bn makes offers to all sellers except Sn. Bn's offers to Si $(i = 1, .., n-1)$ are randomised with a continuous probability distribution $F^i_{nh}$. The support of offers is $[p^h_i, p^h]$. The probability with which Bn makes offers to Si is $q^h_i$. If $P^h = 1$ then $p^h = H$. If $P^h > 1$ then $p^h < H$ and as $\delta \to 1$, $p^h \to H$. $F^h_i(\cdot)$, $F^i_{nh}$ and $q^h_i$ are given by the following expressions:
\[ F^h_i(s) = \frac{(v_n-p^h)-\delta(v_n-H)}{(v_n-s)-\delta(v_n-H)} \]
\[ F^i_{nh} = \frac{(v_i-H)[1-\delta q^h_i] - (1-q^h_i)(v_i-H)}{q_i[(v_i-s)-\delta(v_i-H)]} \]
\[ q^h_i = \frac{p^h-p^h_i}{(v_i-p^h_i)-\delta(v_i-H)} \]
A.12 Off-path behavior of the 2 player game with incomplete information
We recapitulate here the off-path beliefs that sustain the equilibrium we have discussed for the two-player game. Suppose, for a given $\delta$ and $\pi$, the equilibrium offer is $\delta^t H$ (i.e. $\pi \in [d_t, d_{t+1})$). We need to consider the following off-path contingencies.
(a) The buyer offers $p_o$ to the seller such that $p_o < \delta^t H$: If $p_o < \delta^{t+1} H$ then both the L-type and H-type seller reject this offer with probability 1. If $p_o \in [\delta^{t+1} H, \delta^t H)$ then the L-type seller rejects this with a probability which, through Bayes' rule, implies that the updated belief is $d_t$. Let this probability be $\beta''(p)$. Hence the acceptance probability of this offer is $a''(p) = \pi \beta''(p)$. The H-type seller always rejects this offer. Since $p_o \in [\delta^{t+1} H, \delta^t H)$, there exists a $k \in (0, 1]$ such that $p_o = k\delta^{t+1} H + (1-k)\delta^t H$. Next period (if the seller rejects now) the buyer offers $\delta^t H$ with probability $k$ and $\delta^{t-1} H$ with probability $(1-k)$. This is optimal from the point of view of the buyer because at $\pi = d_t$, the buyer is indifferent between offering $\delta^t H$ and $\delta^{t-1} H$. Also, the expected continuation payoff to the L-type seller from rejection is equal to $\delta(k\delta^t H + (1-k)\delta^{t-1} H) = p_o$. Thus the L-type seller is indifferent between accepting and rejecting the offer of $p_o$.
The way the cutoffs $d_t$ are derived ensures that the buyer has no incentive to deviate and offer something less than $\delta^t H$.
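The mixing weight $k$ in contingency (a) can be computed directly from the off-path offer. The sketch below is a minimal illustration with hypothetical values of $\delta$, $t$, $H$ and $p_o$ (the function names are also assumptions); it recovers $k$ from $p_o = k\delta^{t+1}H + (1-k)\delta^t H$ and checks that the discounted expected offer next period equals $p_o$.

```python
def mixing_weight(p_o, delta, t, H):
    """Return k in (0, 1] with p_o = k * delta**(t+1) * H + (1 - k) * delta**t * H."""
    high = delta ** t * H
    low = delta ** (t + 1) * H
    if not (low <= p_o < high):
        raise ValueError("p_o must lie in [delta^(t+1) H, delta^t H)")
    return (high - p_o) / (high - low)


def continuation_payoff(k, delta, t, H):
    """L-type seller's discounted payoff from rejecting: the buyer offers
    delta^t H with probability k and delta^(t-1) H with probability 1 - k."""
    return delta * (k * delta ** t * H + (1 - k) * delta ** (t - 1) * H)


# Hypothetical values for illustration only.
delta, t, H = 0.9, 3, 10.0
p_o = 7.0  # lies in [delta^4 H, delta^3 H) = [6.561, 7.29)

k = mixing_weight(p_o, delta, t, H)
print(k, continuation_payoff(k, delta, t, H))
```

The continuation payoff coincides with $p_o$ for any admissible off-path offer, which is what makes the L-type seller indifferent between accepting and rejecting.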
(b) Next, consider the case when the buyer offers $p_o$ to the seller such that $p_o > \delta^t H$. If $p_o \in (\delta^t H, \delta^{t-1} H]$, the L-type seller rejects this offer with a probability that takes the updated belief to $d_{t-1}$. Since $p_o \in (\delta^t H, \delta^{t-1} H]$, there exists a $k \in (0, 1]$ such that $p_o = k\delta^{t-1} H + (1-k)\delta^t H$. If the seller rejects then next period the buyer offers $\delta^{t-2} H$ with probability $k$ and $\delta^{t-1} H$ with probability $1-k$. This is optimal from the buyer's point of view since at $\pi = d_{t-1}$, the buyer is indifferent between offering $\delta^{t-1} H$ and $\delta^{t-2} H$. Since the expected payoff to the L-type seller from rejection is $\delta(k\delta^{t-2} H + (1-k)\delta^{t-1} H) = p_o$, he is indifferent between accepting and rejecting an offer of $p_o$. As $p_o$ is strictly greater than $\delta^t H$ and the acceptance probability is the same as that of the equilibrium offer, the buyer has no incentive to deviate and offer $p_o$ to the seller where $p_o \in (\delta^t H, \delta^{t-1} H]$.
If $p_o \in (\delta^\tau H, \delta^{\tau-1} H]$ (for $\tau \leq t-1$) then the L-type seller rejects this with a probability which, through Bayes' rule, implies that the updated belief is $d_{\tau-1}$. If the seller rejects then next period the buyer randomises between offering $\delta^{\tau-1} H$ and $\delta^{\tau-2} H$ such that the expected continuation payoff to the L-type seller from rejection is $p_o$. It can be checked that the buyer has no incentive to deviate and offer $p_o$ where $p_o \in (\delta^\tau H, \delta^{\tau-1} H]$ ($\tau \leq t-1$).
A.13 Off-path behavior of the 4 player game with incomplete information (public offers)
Suppose B2 adheres to his equilibrium strategy. Then the off-path behavior of B1 and that of the L-type SI, when B1 makes an offer greater than $\delta^t H$ to SI, are the same as in the 2-player game with incomplete information. If B1's offer to SI is less than $\delta H$, then the off-path behavior of the L-type SI is as follows. If B2's offer to SM is in the range $[p_l(\pi), p(\pi)]$, then the L-type SI behaves in the same way as in the 2-player game. If B2 offers $p'_l(\pi)$ to SM, then the L-type SI accepts the offer with the equilibrium probability, so that rejection takes the posterior to $d_{t-1}$. Next period, B1 randomises between $d_{t-1}$ and $d_{t-2}$ so that the L-type SI is indifferent between accepting and rejecting the offer now. High values of $\delta$ ensure that B1 has no incentive to deviate.
Next, suppose B2 makes an unacceptable offer to SM (which is observable to SI) and B1 makes an equilibrium offer to SI. The L-type SI rejects this offer with a probability that takes the updated belief to $d_{t-1}$. If SI rejects this equilibrium offer and next period both the buyers make offers to SM, then two periods from now, the remaining buyer offers $\delta^{t-2} H$ (the buyer is indifferent between offering $\delta^{t-1} H$ and $\delta^{t-2} H$ at $\pi = d_{t-1}$) to SI. Thus the expected continuation payoff to SI from rejection is $\delta(q(d_{t-1})\delta^{t-1} H + \delta(1-q(d_{t-1}))\delta^{t-2} H) = \delta^t H$. This implies that the L-type SI is indifferent between accepting and rejecting an offer of $\delta^t H$ if he observes SM getting an unacceptable offer.
Now consider the case when B2 deviates and makes an offer to SI. It is assumed that if SI gets two offers then she disregards the lower offer.
Suppose B1 makes an equilibrium offer to SI and B2 deviates and offers something less than $\delta^t H$ to SI. SI's probability of accepting the equilibrium offer (which is the higher offer in this case) remains the same. If SI rejects the higher offer (which in this case is the offer of $\delta^t H$ from B1) and next period both the buyers make offers to SM, then two periods from now, the remaining buyer offers $\delta^{t-2} H$ to SI.
If B2 deviates and offers $p_o \in (\delta^t H, \delta^{t-1} H]$ to SI, then SI rejects this with a probability that takes the updated belief to $d_{t-2}$. If SI rejects this offer then next period, if B1 offers to SI, he offers $\delta^{t-2} H$. If both B1 and B2 make offers to SM then two periods from now the remaining buyer randomises between offering $\delta^{t-2} H$ and $\delta^{t-3} H$ to SI (conditional on SI being present). Randomisations are done in a manner that ensures that the expected continuation payoff to SI from rejection is $p_o$. It is easy to check that this can always be done. Lastly, if B2 deviates and offers to SI and B1 offers to SM (according to his equilibrium strategy), then the off-path specifications are the same as in the 2-player game with incomplete information.
We will now show that B2 has no incentive to deviate. Suppose he makes an unacceptable offer to SM. His expected discounted payoff from deviation is given by
\[ D = q(\pi)[\delta\{a(\pi)(v-M) + (1-a(\pi))v_B(d_{t-1})\}] + (1-q(\pi))\delta v_B(\pi) \tag{A.10} \]
From (4.4) we know that
\[ p'_l(\pi) < M + \delta(1-a(\pi))[p(d_{t-1}) - M] \]
as $E_{d_{t-1}} < p(d_{t-1})$. Hence we have
\[ p'_l(\pi) < M + \delta(1-a(\pi))[(v-M) - (v-p(d_{t-1}))] \]
Rearranging the terms above we get
\[ (v-p'_l(\pi)) > \delta\{a(\pi)(v-M) + (1-a(\pi))v_B(d_{t-1})\} + (1-\delta)(v-M) \tag{A.11} \]
By comparing (A.10) and (A.11) we have
\[ q(\pi)(v-p'_l(\pi)) + (1-q(\pi))\delta v_B(\pi) > D \]
The L.H.S. of the above relation is B2's equilibrium payoff, as he puts a mass point at $p'_l(\pi)$. Hence he has no incentive to make an unacceptable offer to SM.
Next, suppose B2 deviates and makes an offer of $p_o$ to SI such that $p_o \in (\delta^t H, \delta^{t-1} H]$. B2's payoff from deviation is:
\[ \Gamma_H = q(\pi)[(v-p_o)a'(\pi) + (1-a'(\pi))\delta v_B(d_{t-2})] + (1-q(\pi))[(v-p_o)a(\pi) + (1-a(\pi))\delta v_B(d_{t-1})] \]
where $a'(\pi)$ is the probability with which B2's offer is accepted by SI in the event that both B1 and B2 make offers to SI and B2's offer is in the range $(\delta^t H, \delta^{t-1} H]$. From our above specification it is clear that $a'(\pi) > a(\pi)$, where $a(\pi)$ is the acceptance probability of an equilibrium offer to SI. This is also very intuitive. In the contingency when B1 makes an equilibrium offer to SM and B2's out-of-equilibrium offer to SI is in the range $(\delta^t H, \delta^{t-1} H]$, the acceptance probability is equal to $a(\pi)$, the equilibrium acceptance probability. In this case if the L-type SI rejects an offer then next period he will get an offer with probability 1. However, if both B1 and B2 make offers to SI and B2's offer is in the range $(\delta^t H, \delta^{t-1} H]$, then the L-type SI accepts this offer with a higher probability. This is because, on rejection, there is a positive probability that SI might not get an offer in the next period. This explains why $a'(\pi) > a(\pi)$.
Since $p_o > p'_l(\pi)$² and $p(d_{t-2}) > p'_l(\pi)$³, we have
\[ v - p'_l(\pi) > (v-p_o)a'(\pi) + (1-a'(\pi))\delta v_B(d_{t-2}) \tag{A.12} \]
Also, since $p_o > \delta^t H$, we have
\[ (v-p_o)a(\pi) + (1-a(\pi))\delta v_B(d_{t-1}) < v_B(\pi) \]
The expression $[(v-p_o)a(\pi) + (1-a(\pi))\delta v_B(d_{t-1}) - \delta v_B(\pi)]$ is strictly negative for $\delta = 1$. From continuity, we can say that for sufficiently high values of $\delta$, $(v-p_o)a(\pi) + (1-a(\pi))\delta v_B(d_{t-1}) < \delta v_B(\pi)$. This implies that
\[ (v-p'_l(\pi))q(\pi) + (1-q(\pi))\delta v_B(\pi) > \Gamma_H \]
The L.H.S. of the above inequality is the equilibrium payoff of B2. Similarly, if B2 deviates and makes an offer to SI such that his offer $p_o$ is in the range $[\delta^{t+1} H, \delta^t H)$, the payoff from deviation is
\[ \Gamma_L = q(\pi)[\delta\{a(\pi)(v-M) + (1-a(\pi))v_B(d_{t-1})\}] + (1-q(\pi))[(v-p_o)a''(\pi) + (1-a''(\pi))\delta v_B(d_t)] \]
From the 2-player game we know that $[(v-p_o)a''(\pi) + (1-a''(\pi))\delta v_B(d_t)] < v_B(\pi)$. Also, from the previous analysis we can posit that $(v-p'_l(\pi)) > \delta\{a(\pi)(v-M) + (1-a(\pi))v_B(d_{t-1})\}$. Thus for sufficiently high values of $\delta$, $(v-p'_l(\pi))q(\pi) + (1-q(\pi))\delta v_B(\pi) > \Gamma_L$.
Hence B2 has no incentive to deviate and make an offer to SI.
² For sufficiently high values of $\delta$ this will always be the case.
³ Since $p(d_{t-2}) > p(\pi) > p'_l(\pi)$.
A.14 Off-path behavior with private offers
The off-path behavior described in the preceding appendix is not applicable to the case of
private offers. This is because it requires the offers made by both the buyers to be publicly
observable. The off-path behavior of the players in the case of private offers is described
as follows.
Specifically we need to describe the behavior of the players in the following three contingencies.
(i) B2 makes an unacceptable offer to SM.
(ii) B2 makes an offer of $p_o$ to SI such that $p_o < \delta^t H$.
(iii) B2 makes an offer of $p_o$ to SI such that $p_o > \delta^t H$.
We denote the above three contingencies by $E_1$, $E_2$ and $E_3$ respectively. We now construct a particular belief system that sustains the equilibrium described in the text.
Suppose B1 attaches probabilities $\lambda$, $\lambda^2$ and $\lambda^3$ ($0 < \lambda < 1$) to $E_1$, $E_2$ and $E_3$ respectively. Thus he thinks that B2 is going to stick to his equilibrium behavior with probability $[1 - (\lambda + \lambda^2 + \lambda^3)]$.
If $E_1$ or $E_2$ occurs and B1 makes an equilibrium offer to SI, then SI's probability of accepting the equilibrium offer remains the same and two periods from now (conditional on the game continuing until then), if B2 is the remaining buyer he offers $\delta^{t-2} H$ to SI. If $E_3$ occurs and all players are observed to be present, then next period B2 offers $p(d_{t-1})$ to SM. In any off-path contingency, if B1 is the last buyer remaining (two periods from now) then he offers $\delta^{t-2} H$ to SI.
The L-type SI accepts an offer higher than $\delta^t H$ with probability 1 if she gets two offers. If she gets only one offer then the probability of her acceptance of out-of-equilibrium offers is the same as in the two-player game with incomplete information.
We will now argue that the off-path behavior constitutes a sequentially optimal response by the players to the limiting beliefs as $\lambda \to 0$.
Suppose B1 makes an equilibrium offer to SI and it gets rejected. Although offers are private, each player can observe the number of players remaining. Thus, next period, if B1 finds that all four players are present he infers that this is due to an out-of-equilibrium play by B2. Using Bayes' rule he attaches the following probabilities to $E_1$, $E_2$ and $E_3$ respectively:
\[ \frac{1}{1+\lambda+\lambda^2} \text{ to } E_1, \qquad \frac{\lambda}{1+\lambda+\lambda^2} \text{ to } E_2, \qquad \frac{\lambda^2}{1+\lambda+\lambda^2} \text{ to } E_3 \]
As $\lambda \to 0$, the probability attached to $E_1$ goes to 1. Thus B1 believes that his equilibrium offer of $\delta^t H$ to SI was rejected and the updated belief is $d_{t-1}$. In the case of $E_1$ or $E_2$ the beliefs of B1 and B2 coincide. However, in the case of $E_3$ they differ. Suppose $E_3$ occurs and B1's equilibrium offer to SI gets rejected. Then next period all four players will be present and, given the L-type SI's behavior, the belief of B2 will be $\pi = 0$ and that of B1 will be $\pi = d_{t-1}$. In that contingency it is an optimal response of B2 to offer $p(d_{t-1})$ to SM since he knows that B1 is playing his equilibrium strategy with the belief $d_{t-1}$.
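The Bayesian update over the three deviation events can be reproduced in a few lines. The sketch below conditions the prior weights $\lambda$, $\lambda^2$, $\lambda^3$ on a deviation having occurred; the function name and the sample values of $\lambda$ are assumptions for illustration.

```python
def posterior_over_deviations(lam):
    """Posterior on (E1, E2, E3) given that some deviation occurred, when the
    prior weights on the three events are lam, lam**2 and lam**3."""
    if not 0 < lam < 1:
        raise ValueError("lam must lie strictly between 0 and 1")
    weights = [lam, lam ** 2, lam ** 3]
    total = sum(weights)
    return [w / total for w in weights]


for lam in (0.5, 0.1, 0.001):
    print(lam, posterior_over_deviations(lam))
```

Dividing each weight by $\lambda + \lambda^2 + \lambda^3$ gives exactly the three ratios with denominator $1 + \lambda + \lambda^2$, and the weight on $E_1$ approaches 1 as $\lambda$ shrinks.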
Next we will argue that the L-type SI finds it optimal to accept an offer higher than $\delta^t H$ with probability 1 if she gets two offers. This is because in the event that she gets two offers she knows that rejection will lead the buyer B1 to play according to the belief $d_{t-1}$ and, two periods from now, the remaining buyer will offer $\delta^{t-2} H$ to SI. Thus her continuation payoff from rejection is
\[ \delta\{\delta^{t-1} H q(d_{t-1}) + \delta(1-q(d_{t-1}))\delta^{t-2} H\} = \delta\{\delta^{t-1} H\} = \delta^t H \]
Hence she finds it optimal to accept an offer higher than $\delta^t H$ with probability 1.
We need to check that B2 has no incentive to deviate and make an offer of $p_o$ to SI such that $p_o > \delta^t H$.
Suppose B2 deviates and makes an offer of $p_o$ to SI such that $p_o > \delta^t H$. With probability $q(\pi)$, SI will get two offers and B2's offer will be accepted with probability $\pi$. With probability $(1-q(\pi))$, SI will get only one offer. B2 then gets a payoff of
\[ (v-p_o)q(\pi)\pi + (1-q(\pi))[(v-p_o)a(\pi) + (1-a(\pi))\delta v_B(d_{t-1})] \]
As shown in the previous appendix, for high values of $\delta$ we have $(v-p_o)a(\pi) + (1-a(\pi))\delta v_B(d_{t-1}) < \delta v_B(\pi)$. Also, for high values of $\delta$, $p_o > p'_l(\pi)$. Thus⁴,
\begin{align*}
v_B(\pi) &= (v-p'_l(\pi))q(\pi) + (1-q(\pi))\delta v_B(\pi) \\
&> (v-p_o)q(\pi)\pi + (1-q(\pi))[(v-p_o)a(\pi) + (1-a(\pi))\delta v_B(d_{t-1})]
\end{align*}
Hence B2 has no incentive to deviate and make an offer of $p_o$ to SI.
Lastly, to show that B2 has no incentive to deviate and make an unacceptable offer to SM or offer $p_o$ to SI such that $p_o < \delta^t H$, we refer to the analysis in the previous appendix.
⁴ This is because B2 puts a mass point at $p'_l(\pi)$.
Vita
Kaustav Das
Education
• Ph.D. in Economics The Pennsylvania State University, 2013
• MS Quantitative Economics Indian Statistical Institute, Kolkata, 2008
• B.Sc. in Economics Presidency College, University of Calcutta, 2006
Research Interests
Strategic experimentation, Inefficiency in R&D, Bargaining theory, Game theory.
Completed Manuscripts
• Competition, Duplication and Learning in R&D.
• Competition and Learning in R&D: The Role of Private Information.
• Decentralised Bilateral Trading, Competition for Bargaining Partners and the law of
one price.
• Decentralised Bilateral Trading in a Market with Incomplete Information (Revise and
Resubmit at the American Economic Journal: Microeconomics).
Awards
• Rosenberg Centennial Scholarship, Department of Economics, Penn State University,
Spring 2012
• Mrs. M.R Iyer Gold Medal, awarded by the Indian Statistical Institute for outstand-
ing performance in the M.S. in Quantitative Economics program, 2008.