arXiv:2111.15389v1 [econ.GN] 30 Nov 2021


Product recalls, market size and innovation in the pharmaceutical industry

Andrea Morescalchi∗1, Federico Nutarelli†2, and Massimo Riccaboni‡3

1AXES Group

IMT School for Advanced Studies

Abstract

The idea that research investments respond to market rewards is well established in the literature on markets for innovation (Schmookler, 1966; Acemoglu & Linn, 2004; Bryan & Williams, 2021). Empirical evidence tells us that a change in market size, such as the one measured by demographic shifts, is associated with an increase in the number of new drugs available (Acemoglu & Linn, 2004; Dubois et al., 2015). However, the debate about potential reverse causality is still open (Cerda et al., 2007). In this paper we analyze market size’s effect on innovation as measured by active clinical trials. The idea is to exploit product recalls, an innovative instrument tested to be sharp, strong, and unexpected. The work analyses the relationship between US market size and innovation at the ATC-3 level through an original dataset and the two-step IV methodology proposed by Wooldridge et al. (2019). The results reveal a robust and significantly positive response of the number of active trials to market size.

1 Introduction

The relationship between market rewards and innovation has long been studied in innovation economics ([53], [54], [34]). This line of work opened up the possibility of using public demand to stimulate innovation, as in the case of orphan drugs. Schmookler’s "demand-pull" hypothesis, implying that innovation is a function of market demand, has been challenged over the years. Already in the ’90s, [33] noticed that the direction of causality between market size and innovation is far from obvious. In particular, the authors suggested the presence of a simultaneous relationship between demand and innovation but did not manage to control for it. More recently, [57] and [9] developed more rigorous ways to detect this type of endogeneity.

∗[email protected] †[email protected] ‡[email protected]


Acemoglu and Linn (2004) developed a strategy to overcome the endogeneity bias at the market level. Specifically, they exploited changes in the market size for different drug categories driven by U.S. demographic trends ([4]). After the contribution of [4], the focus moved from ascertaining the presence of reverse causality between market size and innovation to finding the best instrument for market size. Indeed, the instrument adopted in [4] was later criticized by [15] as being itself endogenous. As detailed in [15], pharmaceutical innovation raises the age of patients, and a rising average age in turn implies that more patients need innovative products. To the best of our knowledge, this gap in the literature is still unfilled.

Besides, authors, following the studies of [4], mainly concentrate their efforts on the pharmaceutical industry. The pharmaceutical industry, indeed, constitutes a definitive case study: in this sector, consumers’ needs are diverse and almost constant over time, which allows it to be separated into independent sub-markets based on such needs ([10]). Furthermore, investments in innovation are vital for the industry’s existence. Innovation is also easily measurable. In the pharmaceutical industry, market size is defined based on the Anatomical Therapeutic Chemical (ATC) Classification System, i.e., a drug classification system that classifies the active ingredients of drugs according to the organ or system on which they act and their therapeutic, pharmacological, and chemical properties.

The present paper aims at capturing the relationship between market size and innovation at the ATC-3 level by instrumenting market size with recalls (see below) of drugs operated by the Food and Drug Administration (FDA). Our work tries to contribute to the literature in three ways.

First, we adopt an innovative measure of innovation, i.e., the overall number of trials at the ATC-3 level instead of the cumulative R&D expenditures or New Molecular Entities (NME).
This necessary modification overcomes the limitations of the other two most adopted measures. R&D expenditures are, indeed, linked both to firms’ long-term profit decisions (Cohen, 2010) and, more critically, to their size. [57], among others, suggested that smaller entrants might be more inclined to invest in R&D than their bigger veteran competitors. Hence, innovation measured as the cumulative R&D expenditures of the firms composing the market might be related to market size as measured by the cumulative sales of those firms. More delicate is the topic concerning NME, also adopted in [4]. NME are innovative products containing active moieties that have not been approved by the FDA previously, either as a single ingredient drug or as part of a combination product. They can be either innovative new products that never have been used in clinical practice or the same as, or related to, previously approved products. Though complete, the definition of NME does not fully capture, in our opinion, the will to innovate of firms inside the market. The reason hinges on the stage of drug approval at which NME, as opposed to the measure of innovation employed in the present work, are approved by the FDA.

Pharmaceutical drug approval is a long process. Firms should first pass a pre-clinical phase, a stage of research that starts before clinical trials (testing in humans) can begin, and during which important feasibility, iterative testing, and drug safety data are collected. The clinical drug development stage, then, consists of three phases. In Phase 1, clinical trials are conducted using healthy individuals to determine the drug’s basic properties and safety profile in humans. Typically, the drug remains in this stage for one to two years ([18]). In Phase 2, efficacy trials begin as the drug is administered to volunteers of the target population. Finally, Phase 3 compares a new drug to a standard-of-care drug. NME are FDA-approved entities having overcome pre-clinical trials, while the number of trials also considers the pre-clinical phase. In other words, the latter measure also considers potentially unsuccessful trials, i.e., trials not passed to the clinical phase, which equally characterize an innovative drive of the firm. Furthermore, the number of trials, differently from NME, includes in the definition of innovation the drugs that the FDA did not approve for clinical trials.

A second contribution is given by the adoption of more refined ATC classes. The available data at the ATC-3 level of classification capture well the structure of sub-markets, usually constructed artificially or disregarded by the literature. Moreover, the ATC-3 level is the level employed by antitrust agencies. Refer to Section 3 for details.

A further improvement is methodological. The paper adopts an IV approach to deal with the endogeneity problem of market size. The enhancement compared to past research consists of the instrumentation of market size with recalls to overcome the endogeneity issue already detailed. The idea is to exploit sharp and unexpected recalls. The task of characterizing recalls as being sharp and unexpected requires a general definition of recalls. We believe that recalls are exogenous (as detailed in Section 3).

FDA refers to a recall as the most effective way to protect the public from a defective or potentially harmful product. A recall is a voluntary action taken by a company to remove a defective drug product from the market. Drug recalls are conducted either on a company’s initiative or by FDA request. In a recall, the FDA’s role is to oversee a company’s strategy, assess the recall’s adequacy, and classify the recall. According to their severity, the FDA classifies the recalls in Class I (most severe), Class II, and Class III (least severe).
Medicines may be recalled for several reasons ranging from health hazards to potential contamination, adverse reaction, mislabeling, and poor manufacturing. Recalls should not be confused with withdrawals. Unlike the FDA definition, the literature often refers to withdrawals as post-marketing recalls imposed by the FDA on firms due to their high severity and risk to human health. Therefore, recalls can be either expected and voluntarily made by firms, if minor, or sharp and unexpected, if most severe and forced by the FDA.

By the very definition of drug recalls, we expect a drop in sales following a drug recall in a market. To clarify this mechanism, one can refer to Merck’s well-known recall of VIOXX in 2004. VIOXX was withdrawn from the market due to an increased risk of serious cardiovascular events. The recall caught both the market and the firm unprepared. After the announcement of the recall of VIOXX in September 2004, Merck’s shares and sales dropped. This drop was publicized by mass media ([59], [12] among others) and is well recognized by academics (see, e.g., [61] among others).

The present work tries to assess such sharp and unexpected recalls ("major recalls" from now on). The definition of major recalls adopted throughout the paper has been obtained by filtering the causes of Class I recalls. We filtered recalls according to the relevance of the cause, its severity in terms of potential danger to human life, and the FDA’s actions. Specifically, the definition of major recalls comprises withdrawals and Class I recalls containing critical keywords among their causes, such as: "contamination," "death/s," "overdose," "symptoms," "particulate matters," and "adverse reaction."

We did not employ Class II recalls since we believe that they constitute a weaker instrument than major recalls. Nonetheless, we included in the Appendix the analysis using all types of recalls as a robustness check. The main results are confirmed.


To the best of our knowledge, it is the first time that recalls are employed as an instrument for market size.
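The filtering rule for major recalls described in this section can be illustrated with a minimal Python sketch. The keywords are those quoted in the text; the function name, the argument names, and the case-insensitive substring matching are assumptions made for illustration and are not taken from the authors' procedure.

```python
# Keywords quoted in the text for flagging Class I recalls as "major".
# "death" also matches "deaths"; "particulate matter" also matches "particulate matters".
MAJOR_KEYWORDS = (
    "contamination",
    "death",
    "overdose",
    "symptoms",
    "particulate matter",
    "adverse reaction",
)


def is_major_recall(recall_class, reason, is_withdrawal=False):
    """Return True if a recall counts as 'major' under the assumed rule.

    Assumed rule (hypothetical reading of the text): withdrawals always
    qualify; otherwise the recall must be Class I and its stated reason
    must contain at least one critical keyword.
    """
    if is_withdrawal:
        return True
    if recall_class != "I":
        return False
    text = reason.lower()
    return any(keyword in text for keyword in MAJOR_KEYWORDS)
```

For instance, under these assumptions a Class I recall whose reason reads "Bacterial contamination of vials" would be flagged, while a Class II recall with the same reason would not.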

2 Literature Review

The literature has acknowledged the importance of market size in explaining the rate of innovation for many years. Back in 1942, Schumpeter indicated that larger firms are more innovative than smaller ones. In the early ’60s, the focus shifted more broadly to the possible effects of demand on market size (see, e.g., [53]). It was not yet clear whether the reverse causality of demand and innovation played a relevant role. [53], for instance, argued that causality ran primarily from sales to innovation. His study, however, has been criticized in several respects. The definition of demand was indeed still too broad, and the study was not conclusive about the sign of the relationship between demand and innovation (see, e.g., [45]). At the time, the research neither focused specifically on the pharmaceutical sector nor looked at the aggregate market level (see, e.g., [47]).

More recently, [33] denounced a clear reverse causality of demand and innovation, thus invalidating the prior studies. [24] empirically verified such conclusions soon after, finding that innovations increase demand by creating their own demand.

Besides, it was clear that heterogeneous shifts of demand played a prominent role in determining technological development (see, e.g., [41]). Between 1980 and 1990 and most recently in 2002, several studies showed, for instance, how innovation reacted elastically to energy prices.

Nowadays, a large part of the research on the relationship between market size and innovation regards the pharmaceutical industry, where innovation represents a driving force. The literature mainly takes into account two levels of aggregation: firm-level and market-level. Past research efforts have been devoted to identifying the impact of firm size on R&D investments and output. Nevertheless, this question is still open to debate (see [44], [35] among others). Specifically, controversial results emerge due to the difficulty in fully excluding unobservable endogeneity sources varying with time.
Such unobservables might derive from strategic decisions taken within the firms, which, in turn, might be related to their size. For example, small pharmaceutical firms are likely to take riskier decisions than big established ones ([26]). Moving to market aggregation mitigates the mentioned concerns. Unobservables related to market size can principally be considered as intrinsic characteristics of markets and, consequently, fixed in time. Thus, fixed effect techniques allow researchers to control for unobservable heterogeneity, purging the idiosyncratic endogeneity of market size. Therefore, the market seemed a more suitable level, and most authors shifted to this level of aggregation.

The literature on the pharmaceutical sector is varied. Part of its variability is due to the measures of innovation adopted. Some authors adopted accounting data focusing on R&D. The latter, though robust under perfect capital markets, becomes inconclusive with imperfect markets. Because current revenues (market size) are a reasonable proxy for future market size, and since present R&D may be responding both to present and to future sales opportunities, results incorporate two effects that are difficult to separate. Aware of such an issue, the authors included lagged proxies of the market size (see, e.g., [25], who estimated that a 1% increase in price leads to a 0.58% increase in R&D spending).

Other measures of innovation include clinical trials (see [36] among others) and changes in Medicare part D affecting both present and future market size ([11]). Scholars found a positive response of innovation to shocks in market size. Again, the problem remained the possible co-occurrence in innovation’s response to both current and expected cash flows generated by market size shocks.

Besides, innovation has been quantified by the number of relevant journal articles about a condition ([38]). Further measurements comprise the number of new drugs launched, including generic drugs ([4], [19]) in the form of new molecular entities (NME), new chemical entities (NCE), or approvals of new medicines by the FDA.

Similarly, many measures of market size have been embraced.

[4] gave a first significant contribution on the relation between market size and innovation in the pharmaceutical industry. Their idea relies on adopting demographic shifts to instrument market size and control for the endogeneity arising from reverse causality. In particular, [4] exploit variations in the expenditure share of different U.S. age cohorts for different therapeutic classes from 1970-2000. They find that a 1% increase in expenditure shares leads to a 4% increase in the number of new drugs, a far higher elasticity than the average elasticity found in the remaining literature ([19]). [15] provided further insights on the results found in [4]. Employing U.S. demographic data, [15] showed that there are essential feedback effects not considered in [4]. New drugs might affect the market size through their impact on the mortality rate. Indeed, innovative medicines are likely to cure more diseases, raising the population’s average age and, hence, the number of older people needing such cures. Demand shifts accordingly, bringing out again the issue of reverse causality.

Recent literature on the topic improves above all on the methodological part (see e.g.
[38], [17], [19], [51] and others). Authors found, on average, that a 1% increase in the market size measure increases innovation by 0.4% to 0.7%.

Past papers acted mainly at disease level or, at most, at ATC-1 or ATC-2 levels (see e.g. [19]). To the best of our knowledge, no works focus on the more interesting ATC-3 level at which antitrust authorities work. In our view, there are several advantages of using drug classes rather than disease classes. Firstly, since firms request NCTs, NMEs, and NDAs directly, devoting too much attention to the demand side might neglect the supply-side dynamics, which induce firms to undergo an NDA. In particular, aggregate sales of drugs align with the supply side, while sales based on disease classes (i.e., aggregated sales of products purchased by patients) are more on the demand side.

In other words, while firms might follow demand-side stimuli to undergo an NDA (or an NCT), they might also look at the competitors, i.e., products of other companies in the same ATC class. The latter applies to commercial trials, where the sponsor is a pharmaceutical firm and not an academic or research institution.

Secondly, by taking disease classes, one includes in the definition of innovation different chemical and therapeutic typologies of drugs ranging from topical to systemic drugs, from vaccines to ointments. This lack of distinction might lead to endogeneity through several channels, such as people’s expectations. Patients might be wary of some drugs, affecting the probability of having a larger market size for the product typology in question. Other endogeneity sources regard the possibility of a correlation between regressors and the error term (which includes "drug-type"). For instance, regulations may be product type-specific (e.g., the WHO regulation of vaccines does not apply to other drug types). Other possibly problematic controls are knowledge stocks, which could again depend on the product type. Moreover, knowledge stocks might increase by developing innovative medicines in classes where only a particular type of medicine has been developed until that moment. An example is provided in dermatology, where academics produce papers on adopting topical medicines for systemic usage due to some systemic medicines’ undesired side effects.

Finally, the length of a clinical trial varies depending on the type of medicine under study, which may cause lagged effects of market size if disease class is employed.

To the best of our knowledge, among the several innovation measures, no work exploited INDs and early-stage clinical trials (i.e., pre-clinical and Phase I) together with Phase II and Phase III trials.

The two most recent estimates of the relationship between market size and innovation have been provided in [51] and [19]. The latter used NCE to measure innovation and defined market size as a measure of expected revenue. The dataset comprised information about sales for 14 different countries. Specifically, [19] measured market size as the total revenue over the entire life cycle of a branded drug. [19] performed a control function approach and recovered an estimate of the relation between market size and innovation for each therapeutic class at level 1. The average elasticity of innovation to the market size in [19] was about 23%, which is relatively low compared with the average estimates. A possible explanation can be found in [11], which states that several of the countries chosen for the analysis regulate prescription drug prices, and regulations may change rapidly over time.
Thus, given the lower expected profit per consumer and greater uncertainty about future profits and prices, firms’ R&D decisions are likely to be less responsive to a unit change in expected revenues for all these countries combined than to the same unit change in the U.S. market.

Finally, [51] adopted several measures of innovation from NCE to clinical trials in Phase II and Phase III. [51] found no evidence of reverse causality when adopting NCE. One of his efforts was to account for changes in the industry’s R&D process, from "random screening" to "guided drug development," pointing out the importance of advances in molecular biology and related fields ([51]). The author modeled technological opportunities and inserted them as a regressor in the analysis, finding a positive relationship with Phases II and III trials. His results are in line with [15] and [4].

Tab. 9 in Appendix provides a schematic literature review on previous estimates of the relation between innovation and market size.
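To make the elasticity figures quoted in this review easier to compare, the following Python sketch converts an elasticity into the implied percentage change in innovation. It assumes a constant-elasticity relation between innovation and market size, which is a simplification introduced here for illustration; the function name is hypothetical.

```python
def pct_change_in_innovation(elasticity, pct_change_market):
    """Percent change in innovation implied by a constant elasticity.

    Assumes innovation N and market size M satisfy N = A * M**elasticity,
    so scaling M by (1 + g) scales N by (1 + g)**elasticity.
    """
    growth_factor = 1.0 + pct_change_market / 100.0
    return (growth_factor ** elasticity - 1.0) * 100.0


# Elasticities in the range quoted for the recent literature, and the
# much larger value attributed to [4], for a 1% rise in market size.
for elasticity in (0.4, 0.7, 4.0):
    response = pct_change_in_innovation(elasticity, 1.0)
    print(f"elasticity {elasticity}: {response:.2f}% more innovation")
```

For small changes in market size the response is approximately the elasticity times the percentage change, which is how the estimates above are usually read.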

3 Data

The sales data employed come from the Evaluate dataset. The controls have been extracted from Evaluate, from the PHarmaceutical Industry Database (PHID), and from the FDA. Specifically, some of the regressors derive from an elaboration of the variables present in the PHID database.

Sales data for the US pharmaceutical market range from 2004 to 2015. Sales data were initially available at the product and molecule level and have subsequently been aggregated at the ATC-3 level. In the ATC classification system, drugs are classified at five levels (ATC-1, ATC-2, ATC-3, ATC-4, ATC-5): the higher the level, the more detailed the classification. Acemoglu and Linn (2004) employed ATC-1 and ATC-2 categorizations to define market size. In particular, Acemoglu and Linn (2004) constructed market size as the sum of the average expenditure share of drugs in an ATC-1 (ATC-2) category across all ATC-1 (ATC-2) categories.

Data at our disposal allow us to capture the diverse strata of products inside broader classes (ATC-1 and ATC-2) in terms of both demand and supply dynamics. Medicines classified inside an ATC-1 or an ATC-2 level can satisfy patients with completely diverse needs since they are designed to cure various diseases. At the same time, a firm investing in the same ATC-2 sector might invest in several ATC-3 sectors. In the case of ATC-1 or ATC-2 adoption, this missing information may lead to the construction of uninformative innovation and market size variables. Such controls might not consider the firms’ specialization in one sub-sector rather than in another belonging to the same ATC-2 or ATC-1 class.

In previous work, we have also evaluated other levels of analysis (firm, product, and ATC-firm aggregations) ([22]) but opted for the ATC-3 level because it is the level employed by antitrust agencies. To provide some examples, we mention Provost et al. (2019), Markham, A. (2020), Vaishnav, A. (2011), Hawk et al. (2000), Cheng J. (2008), and other cases mostly pertaining to M&A (e.g., Case M.8889 - TEVA / PGT OTC ASSETS of 2018).

We avoided adopting the ATC-4 level since, at such granularity, products belonging to a specific ATC-4 class might not differ substantially from others belonging to another ATC-4 class. This might lead to between-group dependencies (e.g., innovations in one ATC-4 class may also affect a close ATC-4 class), which could cause inference to be invalid.
Further, at the ATC-4 level, compensations may also intervene between groups, thus invalidating the strength of the instrumental variable recalls.

The available data also contain the launch date and ATC code of products. We focused on worldwide sales of US companies.

Data on New Clinical Trials (NCT) for 2004-2015 at product level come from the ClinicalTrials.gov website, while data on commercial Investigational New Drugs (IND) at product level derive from a Pharmaceutical Industry Database maintained at IMT Lucca.

Clinical trials are research studies performed on people that aim to evaluate a medical, surgical, or behavioral intervention. An IND is the means by which a pharmaceutical company obtains permission to start human clinical trials and to ship an experimental drug across state lines before a marketing application for the drug has been approved.

Clinical trials comprise trials from Phase I to Phase IV. Fig. 1 displays the yearly number of trials and commercial IND as obtained from the mentioned sources. It also shows the expected positive trend of sales of the pharmaceutical industry over time.
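The aggregation of product-level sales to the ATC-3 level described above can be sketched in Python as follows. The sketch assumes WHO-style ATC codes, in which the third level corresponds to the first four characters of the code (for example, A10B within A10BA02); the record layout, function names, and sales figures are hypothetical and do not come from the Evaluate data.

```python
from collections import defaultdict


def atc3(code):
    """Return the ATC-3 prefix of a WHO-style ATC code.

    Assumes the third level is given by the first four characters,
    e.g. 'A10B' for the full code 'A10BA02'.
    """
    code = code.strip().upper()
    if len(code) < 4:
        raise ValueError(f"ATC code too short for level 3: {code!r}")
    return code[:4]


def aggregate_sales(records):
    """Sum product-level sales into (ATC-3 class, year) cells.

    `records` is an iterable of (atc_code, year, sales) tuples.
    """
    totals = defaultdict(float)
    for atc_code, year, sales in records:
        totals[(atc3(atc_code), year)] += sales
    return dict(totals)


# Hypothetical product-level records: (ATC code, year, sales).
example = [
    ("A10BA02", 2004, 100.0),
    ("A10BB01", 2004, 50.0),
    ("C10AA05", 2004, 200.0),
    ("A10BA02", 2005, 120.0),
]
market_size = aggregate_sales(example)
```

In this toy example the two A10B products sold in 2004 fall into the same ATC-3 market, so their sales are summed into a single market-size observation.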

[Figure 1, panel (a): Number of trials and INDs per year (y-axis: Trials and IND; x-axis: Year). Panel (b): Trend of sales by year (y-axis: Log sales; x-axis: Year, 2004-2015).]

Figure 1: Overview of sales’ and trials’ trends

A considerable drop in Trials and IND occurred after 2013, as is evident from Fig. 1 (a). The reason for this drop is that, in general, clinical trials innovate drugs, approaches, and interventions. However, approaches and interventions are excluded from the count of trials to focus strictly on innovation coming from industrial sources.

Recalls data have been manually collected from different sources, among which the FDA website, openFDA, various articles, and web sources (e.g., [46]; WHOCC website, PubMed, [55] and others).

7.19% of the firms’ sample (i.e., 697 firms in total) have issued a Class II recall. Among the firms that issued a recall, 51 firms underwent a Class I recall, 27 of which issued a single Class I recall, and just three firms issued more than 9 Class I recalls.

The provided estimates must be read in light of the database’s possible limitations regarding the presence/absence of firms and products inside it.

Besides, the recalls of pure compounders were only partially included¹ and, when included, were attributed to the unique manufacturer/distributor in the database. Finally, recalls coming from repackaging firms were not attributed uniquely to the repackager (e.g., Aidapak) but to the labeler specified in the NDC.

Due to these case-specific factors, it is not easy to establish a unique and unambiguous pattern of recalls over the years. The situation is further complicated wherever different sources employ different methodologies to count the recalls. An example of this uncertainty in sources is found when comparing [1] and [2]. Specifically, [1] asserts that the number of recalled products had remained reasonably constant except for 2010 and 2013, when the number went down by approximately 35%. The statement contrasts with what was reported in [2]. According to [2], "a spike in the number of drugs recalled occurred in 2013. There were nearly 60 recalls in that year alone. However, 2017, with 71 recalls, saw nearly the highest number of recalls since 2009. Only 2011 and 2009 surpassed it at 74, and 75 recalls, respectively".

Tab. 1 provides a list of the primary sources and the average number of recalls across them.

¹ Due to the unavailability of data. We verified that the representativeness of recalls is preserved ([22]).


The following is an attempt to overcome the mentioned issues involving the dissimilarities of data origins.

Year   CNN   Regulatory Focus   [27]   FDA Enforcement Reports   [2]   AVERAGE
2004   68   68
2005   140   140
2006   384   109   243
2007   391   56   189
2008   426   128   176   244
2009   1742   85   1660   890
2010   135   389   262
2011   236   1279   75   530
2012   381   499   1518   799
2013   1031   1283   848   60   805
2014   640   1344   893   959
2015   1584   1584

Table 1: Sources with reported number of recalls
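As an illustration of how a benchmark such as the AVERAGE column could be computed, the Python sketch below takes the arithmetic mean over the sources that report a count for a given year and skips the others. This is an assumption about the construction, not a statement of the authors' procedure; it is consistent with the 2011 and 2014 rows of Table 1, while for 2013 it returns 805.5 against the reported 805. The function name is hypothetical.

```python
def benchmark_average(counts):
    """Mean recall count across the sources reporting for one year.

    `counts` holds one entry per source; None marks a source with no
    figure for that year and is skipped. Returns None if no source
    reports a count.
    """
    reported = [c for c in counts if c is not None]
    if not reported:
        return None
    return sum(reported) / len(reported)


# Counts taken from the 2011, 2013 and 2014 rows of Table 1.
avg_2011 = benchmark_average([236, 1279, 75])
avg_2013 = benchmark_average([1031, 1283, 848, 60])
avg_2014 = benchmark_average([640, 1344, 893])
```

Averaging only over the sources available in a year keeps years with a single source (2004, 2005, 2015) in the benchmark, at the cost of making those years depend entirely on one counting methodology.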

To overcome the dissimilarities of data origins, we chose the average as the benchmark to compare with the collected recalls. Fig. 2 illustrates in more detail the comparison between the benchmark recalls represented by the average recalls among all sources and the collected recalls. Aidapak’s recalls of 2011 are considered as "outliers" and, for this reason, are not included among the collected recalls at this stage:

[Figure 2: y-axis: Number of recalls; x-axis: Year (2004-2015); series: Benchmark recalls, Our recalls.]

Figure 2: The number of our recalls against the number of recalls used as benchmark (average of sources). We include the minimum and the maximum number of recalls retrieved by the different sources. Mint colored points represent the minimum amount of recalls retrieved among all the sources at our disposal. Red points represent the maximum number of recalls among all the sources. A single mint point has been put whenever a single source was present for a year (2004, 2005, 2015).

Fig. 2 underlines a disproportion in terms of the number of recalls starting from 2009, with respect to the benchmark. Such deficiency pertains to the counting methodology together with the structure of the database (see above).

Though the global trend is approximately reproduced, 2011, 2013, and 2015 represent problematic years. The dissimilarity of 2011 with respect to the benchmark can be easily explained. Indeed, with the exclusion of Aidapak’s recalls from the count of the collected recalls, the latter dropped. Furthermore, 2013 and 2015 have far fewer recalls than expected because more than 60% of the recalls in 2013 and nearly 75% of the recalls in 2015 were issued by compounding firms.

The recalls trend of the benchmark seems to be well reproduced, however, when recalls of pure compounders are excluded from both the benchmark number and the sample of collected recalls. Fig. 3 shows, indeed, an accordance in trends. Aidapak’s recalls are here included. Indeed, Aidapak is a repackager and not a compounder. Besides, we want to show that 2011 does not constitute a problematic year once Aidapak’s recalls are considered.

[Figure 3: x-axis: Year (2004-2015); series: Bench. recalls no comp., Our recalls no comp.; point labels: Nr. bench. recalls no compound, Nr. our recalls no comp.]

Figure 3: The number of our recalls against the number of recalls used as benchmark without compounding recalls. We included the minimum and the maximum number of recalls retrieved by the different sources. Mint colored points represent the minimum amount of recalls retrieved among all the sources at our disposal. Red points represent the maximum number of recalls among all the sources. The situation is almost unchanged with respect to Fig. 2 until 2011. From 2011 on, the recalls collected in our dataset follow the benchmark more precisely if compounders’ recalls are excluded.

To conclude, as a further check of the exogeneity of recalls, we constructed a box plot displaying, by year, the average number of trials (and their dispersion) in ATC markets that underwent a recall and in those that did not. This exercise shows that major recalls do not necessarily occur in more innovative markets. Indeed, Fig. 4 shows that the yearly number of trials in ATC markets undergoing major recalls almost coincides with the average number of trials in all other markets.


[Figure 4: box plots of the number of trials (y-axis, 0 to 250) by year (x-axis, 2004 to 2013) for not recalled (NR) and recalled (R) ATC-3 markets. Title: Trials of recalled and not recalled ATC-3 by year.]

Figure 4: The box plots show that, on average, recalls do not necessarily happen in more innovative markets. The average number of trials amounts to 123 for ATC-3 markets having undergone a recall and to 118 for ATC-3 markets not having undergone a recall. The graph reports a yearly analysis of the average number of trials in recalled and not recalled ATC-3 groups. The average number of trials is similar in ATC-3 markets having undergone at least one recall (R) and in ATC-3 markets not having undergone any recall (NR).


4 Methodology

The main theoretical framework is the same as that adopted in [4]. In particular, [4] model innovation, the dependent variable of the present model, as being proportional to market size. Refer to [4] for further details.

The measure of innovation is the number of clinical trials in all phases for ATC-3 category $i$. The measure of market size is the sum of products' sales in the $i$th market. When other potential determinants, time effects, and category effects are added to the analysis, the well-known Poisson model is obtained:

$$E[N_{it} \mid \mu_i, \zeta_t, X_{it}, M_{it}] = \exp(\beta_1 \log M_{it} + \beta_2 X_{it} + \mu_i + \zeta_t), \quad \forall i = 1, \dots, N, \; t = 1, \dots, T \tag{1}$$

where $E$ is the expectations operator, $M_{it}$ represents endogenous market size, $X_{it}$ captures age (e.g., the average age of products in category $i$ weighted by products' size), diversification, and innovation patterns (e.g., the scientific production), $\mu_i$ are ATC fixed effects, and $\zeta_t$ are time fixed effects. Estimating (1) directly would lead to biased estimates for two reasons: first, the non-linearity in (1) makes it impossible to estimate the fixed effects consistently; second, market size is endogenous.

To deal with both problems, we adopt the novel control function (CF) IV approach described in [39]. Compared with the past literature, this method makes it possible to test for, and estimate under, two potential sources of endogeneity simultaneously: correlation of covariates with time-constant unobserved heterogeneity, and correlation of covariates with time-varying idiosyncratic errors. Furthermore, it extends easily to non-linear settings with fixed effects.

Specifically, denoting by $\kappa_{it}$ the idiosyncratic shock and by $c_i$ the individual heterogeneity, the unobserved effects non-linear model allowing for both idiosyncratic endogeneity and heterogeneity endogeneity might look as follows:

$$E[N_{it} \mid M_{it}, z_{it}, c_i, \kappa_{it}] = c_i \exp(x_{it}\beta_1 + \kappa_{it}) \tag{2}$$

where $x_{it} = (M_{it}, z_{it})$. The vector $z_{it}$ would typically include a full set of time effects, and $M_{it}$ is the endogenous variable. All exogenous variables, including the vector $z_{it}$, can be correlated with the heterogeneity (i.e., no random effects assumption is imposed). There is also a set of excluded exogenous variables $R_{it2}$ serving as instruments for the potentially endogenous variable. In the present work, $R_{it2}$ is represented by recalls. [39] noticed that, without idiosyncratic endogeneity, an appealing estimator would be the fixed-effects Poisson estimator, which, viewed as a QMLE, requires only strict exogeneity with respect to the idiosyncratic shocks to ensure consistency. This assumption serves as the null hypothesis for testing idiosyncratic endogeneity against the alternative of full dependence between the error term of the specification of $M_{it}$ and $\kappa_{it}$. The alternative is built by exploiting the reduced form equation for the endogenous variable

$$M_{it} = z_{it}\Pi + c_{i2} + u_{it2}, \quad \forall t = 1, \dots, T \tag{3}$$

where, because $z_{it}$ is strictly exogenous, the test concerns the correlation between $\kappa_{it}$ and functions of $u_{it2}$. [39] developed a simple procedure that tests for idiosyncratic endogeneity and produces consistent estimates even in the joint presence of non-linearity, fixed effects, and both types of endogeneity. The algorithm follows the steps below:


1. Estimate the reduced form for the endogenous variable by fixed effects and obtain the fixed effects residuals $\hat{u}_{it2} = M_{it} - z_{it}\hat{\Pi}$.

2. Use fixed effects Poisson on the mean function

$$E[N_{it} \mid M_{it}, z_{it}, c_i, \hat{u}_{it2}] = c_i \exp(x_{it}\beta_1 + \hat{u}_{it2}\rho)$$

and use a robust Wald test of $H_0: \rho = 0$.

Step 2 allows consistent estimation in the presence of fixed effects and non-linearity. Indeed, fixed effects Poisson eliminates the ATC-level fixed effects by performing a consistent conditional ML estimation. Refer to [14] for further details.

A crucial characteristic of Poisson-FE models is that they require the dependent variable to be nonzero for at least one time period. The lower the proportion of zeroes in the dependent variable, the better the model works. This condition has been fulfilled by dropping the ATC categories not meeting it, which constitute approximately 10% of the total ATC-3 in the sample.

We estimated several specifications common in the literature to check for either delayed effects of trials or the presence of a bias if market size were considered exogenous (or fixed effects omitted).
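The two steps above can be sketched in plain Python as follows. This is a minimal illustration on made-up numbers, not the estimation code used for the results: it assumes a single instrument, omits time effects and controls, and does not compute the robust Wald statistic. All names (`within_residuals`, `fe_poisson`, the toy panel) are hypothetical.

```python
import math

def demean(values, groups):
    """Subtract each group's mean (the fixed-effects 'within' transformation)."""
    sums, counts = {}, {}
    for g, v in zip(groups, values):
        sums[g] = sums.get(g, 0.0) + v
        counts[g] = counts.get(g, 0) + 1
    return [v - sums[g] / counts[g] for g, v in zip(groups, values)]

def within_residuals(m, z, groups):
    """Step 1: FE regression of m on one instrument z; returns (slope, residuals)."""
    md, zd = demean(m, groups), demean(z, groups)
    slope = sum(a * b for a, b in zip(zd, md)) / sum(a * a for a in zd)
    return slope, [a - slope * b for a, b in zip(md, zd)]

def fe_poisson(counts, X, groups, max_iter=100, tol=1e-10):
    """Step 2: conditional (fixed-effects) Poisson MLE with two regressors, by Newton."""
    panels = {}
    for g, n, x in zip(groups, counts, X):
        panels.setdefault(g, []).append((n, x))
    b = [0.0, 0.0]
    for _ in range(max_iter):
        grad = [0.0, 0.0]
        info = [[0.0, 0.0], [0.0, 0.0]]  # minus the Hessian of the log-likelihood
        for rows in panels.values():
            xb = [b[0] * x[0] + b[1] * x[1] for _, x in rows]
            top = max(xb)  # subtracted for numerical stability
            w = [math.exp(v - top) for v in xb]
            total = sum(w)
            shares = [wi / total for wi in w]  # within-unit predicted shares
            n_i = sum(n for n, _ in rows)
            mean = [sum(s * x[k] for s, (_, x) in zip(shares, rows)) for k in range(2)]
            for k in range(2):
                grad[k] += sum(n * x[k] for n, x in rows) - n_i * mean[k]
                for l in range(2):
                    second = sum(s * x[k] * x[l] for s, (_, x) in zip(shares, rows))
                    info[k][l] += n_i * (second - mean[k] * mean[l])
        det = info[0][0] * info[1][1] - info[0][1] * info[1][0]
        step = [(info[1][1] * grad[0] - info[0][1] * grad[1]) / det,
                (info[0][0] * grad[1] - info[1][0] * grad[0]) / det]
        b = [b[0] + step[0], b[1] + step[1]]
        if max(abs(s) for s in step) < tol:
            break
    return b

# Toy balanced panel (made-up numbers): 3 markets observed over 4 years.
unit = [0] * 4 + [1] * 4 + [2] * 4
recalls = [0, 1, 0, 2, 0, 0, 3, 0, 1, 0, 0, 2]
log_sales = [10.0, 9.7, 10.1, 9.5, 12.0, 12.1, 11.2, 12.0, 8.0, 8.3, 8.4, 7.9]
trials = [5, 4, 6, 3, 20, 22, 12, 19, 2, 3, 3, 1]

pi_hat, u_hat = within_residuals(log_sales, recalls, unit)  # first stage
beta_hat, rho_hat = fe_poisson(trials, list(zip(log_sales, u_hat)), unit)  # second stage
```

Here `rho_hat` is the coefficient on the first-stage residuals; testing idiosyncratic endogeneity would additionally require a cluster-robust standard error for it, which this sketch leaves out.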

Throughout, the problem of endogeneity has been presented as intrinsic to market size. Hence, the endogenous $M_{it}$ needs to be instrumented. Market size is instrumented through normalized recalls. The normalization is by the number of products present in market $i$ at time $t$. Calling $m$ the major recalls, normalized recalls are denoted as follows:

$$\tilde{m} = \frac{m}{\#\text{prod.}} \cdot 100.$$
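As a small worked example of this normalization, the sketch below computes recalls per 100 products for two hypothetical markets with the same number of major recalls. The function name and the numbers are illustrative assumptions, not taken from the data.

```python
def normalized_recalls(major_recalls, n_products):
    """Major recalls per 100 products present in the market (m-tilde in the text)."""
    if n_products <= 0:
        raise ValueError("a market needs at least one product")
    return 100 * major_recalls / n_products

# Two hypothetical markets with the same number of major recalls:
small_market = normalized_recalls(2, 40)    # 5.0 recalls per 100 products
large_market = normalized_recalls(2, 400)   # 0.5 recalls per 100 products
```

The same raw count of recalls translates into a tenfold difference once the number of products exposed to a recall is taken into account.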

As mentioned, normalization is necessary to avoid another source of endogeneity. Indeed, ATC markets with more products are mechanically more likely to undergo a recall. Omitting this control would partly invalidate the estimates. The belief is that markets undergoing major recalls experience a sudden negative shock in sales. The relevance of the instrument is tested in Section 5.

The instrument is not directly related to the dependent variable. The main argument that might directly connect normalized recalls to trials is that the gap left by recalls is filled with innovations. Hence, sectors more prone to recalls should also be the most innovative ones. The literature offers contrasting evidence on the topic. Though the argument would imply a positive impact of recalls on innovation, recent events seem to contradict it. Indeed, despite an increasing number of recalls from 2004 to 2015 (see, e.g., Fig. 2), the innovation crisis of the pharmaceutical industry is a widely known and recognized phenomenon in the literature (see, e.g., [48], [50] among others). It might be argued that the opposing forces leading to the drop in innovation have outweighed the positive effect of recalls, thus favoring the declining trend in pharmaceutical innovation. The positive effect of recalls on innovation could, therefore, still be present but hidden. Few empirical analyses have explored the relationship between innovation and recalls, or withdrawals in general. Fortunately, the most severe recalls receive considerable media coverage, which has allowed researchers to collect data on the market reaction to such events (see, e.g., [49]). Authors working in this stream of literature conclude that the impact


of recalls and withdrawals on market innovation is highly variable: some recalls have considerable effects while others have none at all. There seems to be no systematic way to identify, among major recalls, those whose announcement affected innovation. Market reactions depend on criteria that cannot be controlled, such as the period during which the recall took place and possible delays in the FDA's communication of the recall. Generally, however, the market does not systematically overreact to such shocks, which argues against any direct dependence between recalls and innovation.

To summarize, direct connections between innovation and recalls are mainly due to fixed and time effects. FDA delays cannot be easily controlled. However, the FDA developed precise guidance and protocols for recall communication and announcement for the period considered in the present work. Hence, delays constitute a minor issue because the FDA regulates them.

For the sake of completeness, thanks to the signed FOIA agreement, openFDA, and the FDA Enforcement Report, it has been possible to verify the occurrence of delays. These sources allowed us to access the time gaps between recall initiation, recall classification, and recall termination. The communication of the recall is part of the initiation process. Above all, in the case of severe recalls, it must be prompt. The average time between initiation and termination for Class I and Class II recalls has been around 23 months. A delay in communication might happen in the initiation phase. The initiation phase took approximately four months on average for any Class I and Class II recall. For our sample of major recalls, the initiation phase took approximately 2 to 3 months on average, in line with prompt communication criteria. This evidence supports the limited impact of delays on the analysis.

Dropping unobserved heterogeneity and including time dummies in the primary specification control for possible direct connections between recalls and innovation. Thus, these operations ensure that recalls have only an indirect effect through sales.

Further arguments in favor of the indirect effect of recalls on innovation follow.

In particular, the recalls taken into account are severe recalls of marketed products. The time gap between trial phases and the marketing of a drug is usually between 8 and 14 years. Such a significant time gap is relevant for understanding competitors' possible reactions to a drug recall in the sector where a firm operates. We believe that a recall suffered by a competitor in the sector in which both firms operate does not increase or decrease the risk of innovation in the short run. Indeed, marketed products undergo major recalls long after they are commercialized.

Besides, the gap in sales left on the market by recalling the drug requires an extended period to be fully recovered. Hence, there is no need to invest in clinical trials to take advantage of such a shortage in the short run. As a further check of this conjecture, we build a time-to-event analysis in Fig. 6 of the Appendix. Fig. 6 takes into account all types of recall and clearly shows that a recalled product has a truncated life compared to drugs having a normal life cycle.

In particular, Fig. 6 shows that the survival rate of drugs that did not undergo a recall is persistently higher than the survival rate of drugs that did. Hence, having undergone a recall decreases the survival probability of a drug. Under normal conditions, drugs have a probability greater than 0 of surviving more than ten years. However, if a drug underwent a recall, this probability drops to almost 0. Notice that the probability that a recalled drug survives two years is still substantial. The median survival


time is five years.

Thus, the drop in sales after a recall is likely to remain unfilled for years. Indeed, had firms found innovative replacements for a recalled drug $d$ that allowed them to recover the shortage left by its recall, there would be no reason to keep selling drug $d$ for years. Recalled products, therefore, leave a long-term gap in sales within the ATC-3 market to which they belong.

A further argument against the gaps left by recalls being covered by innovative products is that such shortages might be filled by drugs already present in the market, whose trials started before, soon after, or at the same time as the trials leading to the recalled drug. This eventuality is reasonable since, as mentioned, suspended or terminated studies are excluded from the sample, meaning that the remaining clinical trials sponsored by competing firms are likely to reach the market with products belonging to the same therapeutic class. Competition among wholesalers within an ATC might emerge in the early stages, once it is evident that a firm will develop an innovative cure. The development of alternative drugs is encouraged from the early trial phases, when there are still chances to arrive first on the market. Medicines substituting recalled drugs in the same ATC might be developed soon after the recalled medicine in a "first to arrive" competition rather than under a "fill the gaps of recalls" logic. This may also be because the demand for patented medicines of the type of the recalled drug was likely more substantial when the trial for the recalled drug started. If demand is still present at the time of the recall, either already existing generics or new ones (trials of generics are indeed less time-consuming, since they only need to ensure bio-comparability) might intervene and fill the gap.

It is worth noticing, in any case, that the potential positive relationship between recalls and innovation exploiting the market gaps passes indirectly through market size. Indeed, the emergence of new trials within a market after a recall depends on the demand that the product in question generated in the market. If a recalled product had no underlying demand, it is reasonable to expect no company to begin a costly trial only to fill the gap left by the recalled product. Therefore, the response of innovation seems to depend not directly on the recall but on the underlying magnitude of the recalled product's demand, i.e., on market size.

Finally, another possible critique undermining the instrument's validity is that the recall of product $i$ might have provoked the suspension or withdrawal of trials concerning similar products. This domino effect hinges on the causes of the recall. Indeed, if the recall concerns only the specific product being withdrawn from the market, implications for other companies' products are unlikely. By contrast, it is possible, for instance, that after the recall of the COX-2 inhibitor Vioxx due to cardiovascular side effects, all firms with ongoing trials on the same target suspended or withdrew the trials relating to COX-2 inhibitors. To the best of our knowledge, no effort has been made to explore this possibility in the drugs market. The only work approaching the critique is [8]. The author, however, focuses on the medical devices industry, which has different legislation for recalls than the drugs market. Indeed, a device's recall is a common practice ordinarily carried out by firms to repair or update a device, which is usually promptly placed back on the market. We handled this circumstance in three ways. First, we considered only active trials, thus excluding suspended and withdrawn trials, including those suspended due to other drugs' recalls. Second, we also removed the trials of companies undergoing a recall. Finally, as far as it has been possible to link the reason of


the severe recalls,² we dropped trials adopting a similar active principle. This last case occurred only a few times, since eliminating suspended and withdrawn trials already constitutes a robust control.

² Above all, in case of adverse events caused by an active principle adopted in the drug that is the object of a trial.


5 Results

The results section is divided into two main subsections. First, the impact of recalls on the endogenous market size, as measured by total sales of ATC $i$, is analyzed. The aim is to provide convincing arguments in favor of the relevance of the adopted instrument.

Subsequently, the results on the impact of the instrumented market size on innovation are presented.

5.1 The impact of recalls on sales

5.1.1 Summary statistics

This section reports summary statistics for the sample. Tab. 2 contains average values and standard deviations (below) of the relevant variables for the full sample and for two separate sub-samples of observations associated or not with recalls. The table reports this information at the ATC-3 level and refers to major recalls. Tab. 2 covers all the relevant controls employed for constructing Tab. 7.

Table 2: Summary statistics at ATC-3 level for the full sample, the subset of ATC-3 having undergone a recall in the period considered, and the subset not having undergone a recall. The database at ATC-3 level is balanced.

Variable / statistic                 Full sample   Subs. recalls   Subs. no recalls

Sales (log)
  Overall mean                            19.405          20.794             19.054
  Overall Std. Dev.                        2.233           1.475              2.256
  Between Std. Dev.                        2.152           1.441              2.163
  Within Std. Dev.                          .614            .378               .661
  Description: Log of sales at ATC-3 level.

Outflow rate (K_{t+1}/P_{-1})
  Overall mean                              .086            .056               .093
  Overall Std. Dev.                         .257            .064               .285
  Between Std. Dev.                         .109            .036               .120
  Within Std. Dev.                          .233            .053               .259
  Description: It is defined as the number of lost products in an ATC-3 (K_{t+1} in regressions) over the total number of products in t-1 (P_{-1} in regressions).

Avg. age of firms within ATC
  Overall mean                            35.907          33.275             36.573
  Overall Std. Dev.                        7.631           5.420              7.960
  Between Std. Dev.                        6.767           4.452              7.093
  Within Std. Dev.                         3.556           3.160              3.650
  Description: It is the average age of the firms competing within an ATC-3. The foundation year of the firms was present in the data.

Herfindahl–Hirschman Index (hhi)
  Overall mean                              .431            .268               .434
  Overall Std. Dev.                         .260            .159               .262
  Between Std. Dev.                         .236            .166               .240
  Within Std. Dev.                          .110            .020               .115
  Description: The hhi measures the competition within a market. It can range from 0 to 1.0, moving from a huge number of very small firms to a single monopolistic producer.

Share generics by ATC
  Overall mean                              .746            .725               .752
  Overall Std. Dev.                         .255            .214               .264
  Between Std. Dev.                         .238            .210               .245
  Within Std. Dev.                          .092            .052               .099
  Description: It represents the percentage of generic products among all products sold in an ATC-3 market.

Avg. age prod. by ATC
  Overall mean                            13.159          12.043             13.441
  Overall Std. Dev.                        5.333           3.763              5.627
  Between Std. Dev.                        4.909           3.568              5.164
  Within Std. Dev.                         2.109           1.304              2.268
  Description: It represents the average age of products within an ATC-3. The age of a product is based on the foundation year of the firm that produced it.

Scientific knowledge within ATC
  Overall mean                             6.327           6.787              6.211
  Overall Std. Dev.                        1.718           1.623              1.724
  Between Std. Dev.                        1.705           1.627              1.709
  Within Std. Dev.                          .242            .177               .256
  Description: The number of papers and scientific publications for an ATC-3 present in PubMed and other sources.

Number of firms within ATC
  Overall mean                            21.054          32.802             18.082
  Overall Std. Dev.                       20.954          23.442             19.175
  Between Std. Dev.                       20.604          23.069             18.876
  Within Std. Dev.                         4.050           5.367              3.645
  Description: Number of firms trading within an ATC-3.


Tab. 2 displays the overall, between, and within standard deviations for the main controls, including sales. The statistics are provided for the full sample, the subset of ATC-3 having undergone at least one recall, and the sub-sample of ATC-3 without recalls. The sales panel of Tab. 2 shows that recalls are typically found in larger-than-average markets. For this reason, recalls have been normalized by the number of products in the ATC market to avoid possible problems of reverse causality with market size. The normalized recalls are denoted as $\widetilde{recalls}$ in the following paragraphs.

Moreover, as expected, more competitive markets are more prone to recalls, as shown by the Herfindahl–Hirschman Index (hhi). There is evidence of differences in competition between ATC-3 groups. [22] provides further insights into which types of firms and products generally undergo a recall. Specifically, [22] documents a general tendency of recalls to occur in big established firms and to concern products older than average. In contrast with the firm and product levels, at the market level recalls occur in more dynamic ATCs, where the recalled drugs were pioneering in the past. The fixed effects technique accounts for time-invariant characteristics of ATCs.

Recalls occur in firms with a high share of generics (see [22]). This finding might result from a less stringent approval policy for generic drugs compared to branded ones. Growing concern for the safety of generics is, in fact, a well-known problem in the literature (see, e.g., [23]).

Besides, in ATC markets, the outflow rate presents a within variance higher than the between variance. This suggests that ATC-3 groups differ little in their outflow rate. As opposed to the firm level, this inversion is expected. Indeed, while strategic policies of product placement might occur in firms, this is not the case at the ATC level of aggregation, where market laws apply. Thus, on average, even two utterly different ATC markets would display similar outflow rates, following only a demand-supply logic.

Two other variables seem to be related to recalls at the ATC-3 level, i.e., scientific knowledge within an ATC and the number of firms trading within an ATC. Specifically, recalls happen in ATC markets where, on average, more firms trade and scientific knowledge is more advanced than in other markets.

To summarize, recalls concern relatively old drugs produced by big established firms. Major recalls occur in relatively dynamic markets in which, on average, many younger firms operate, trading relatively young products. A possible reason why markets with these characteristics undergo recalls more easily is that they are precisely the markets that the legislator monitors with special attention.
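For readers less familiar with the overall, between, and within decomposition reported in Tab. 2, the following sketch shows one common way to compute it for a balanced panel. It assumes the convention used by Stata's `xtsum` (within deviations re-centred on the grand mean), which may differ from the exact routine used for Tab. 2; the function name and the toy numbers are hypothetical.

```python
import statistics

def xt_summary(panel):
    """Overall, between and within standard deviations of a balanced panel.

    `panel` maps each unit (e.g. an ATC-3 code) to its list of yearly values.
    """
    all_values = [v for series in panel.values() for v in series]
    grand_mean = statistics.fmean(all_values)
    unit_means = {u: statistics.fmean(s) for u, s in panel.items()}
    # Within deviations: remove the unit mean, then add back the grand mean.
    within = [v - unit_means[u] + grand_mean for u, s in panel.items() for v in s]
    return {
        "mean": grand_mean,
        "overall_sd": statistics.stdev(all_values),
        "between_sd": statistics.stdev(list(unit_means.values())),
        "within_sd": statistics.stdev(within),
    }

# Two hypothetical markets observed for three years each.
summary = xt_summary({"A": [1, 2, 3], "B": [4, 5, 6]})
```

In this toy panel the two markets differ in level but move identically over time, so the between standard deviation exceeds the within one; the outflow rate in Tab. 2 shows the opposite pattern.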

5.1.2 Analysis of the determinants of drug recalls

This section reports the first-stage results. A fixed-effects estimation method is employed. Tab. 7 shows the estimates of the first stage at the ATC-3 level. As detailed below, a further level, ATC-Firm, has been added to test for compensations within ATCs inside firms. For consistency with the best model, the sample was truncated in 2013 for the first stage as well. Estimates on the non-truncated sample are very similar (see [22]). The F-statistic amounts to 14.32. The standard errors reported in the tables of the present section and the following ones are all robust and clustered at the ATC-3 level of aggregation.

We found a significant and negative impact of recalls on the logarithm of sales at the market level. In [22] it is shown that sales of firms undergoing a recall are unaffected. At


Table 3: First stage results at different levels. ATC-3 aggregation represents the main specification.

                              (1) ATC-Firm Aggregation   (2) ATC-3 Aggregation
Dependent variable            Log sales                  Log sales

$\widetilde{recalls}$         −0.0053                    −0.0283∗∗∗
                              (0.0033)                   (0.0056)
$\widetilde{recalls}_{t-1}$   −0.0267∗∗                  −0.0267∗∗∗
                              (0.0083)                   (0.0070)

Controls:
K_{t+1}/P_{-1}                0.1932∗∗
                              (0.0628)
average age firm              0.1576
                              (0.0921)
average age firm^2            −0.0020
                              (0.0013)
hhi                           1.2405∗∗∗
                              (0.2590)
share generics in ATC         −0.1895
                              (0.3373)
papers                        −0.0260
                              (0.0507)
# firms                       0.0077
                              (0.0071)

Year Dummies                  Yes                        Yes
Obs.                          48915                      1664
Groups                        8634                       208

Standard errors in parentheses. ∗ p < 0.05, ∗∗ p < 0.01, ∗∗∗ p < 0.001

Huber-White robust standard errors clustered at the ATC-3 level are in parentheses. First-stage results are shown in this table. (1) fits an F.E. model at the ATC-Firm level, i.e., ATC lines of production within firms. This level is introduced to check the possibility of compensations between sales of products belonging to the same ATC (excluded due to the significance of the coefficient of recalls, which indicates a drop after a recall). (2) fits an F.E. model at the ATC-3 level. $\widetilde{recalls}$ represents normalized recalls. At the ATC level, recalls are normalized by the number of products within an ATC.

the same time, the production lines of medicines belonging to the same ATC-3 encounter a drop in sales due to recalls (ATC-Firm Aggregation in Tab. 7). This evidence excludes the possibility of compensations between sales of products belonging to the same ATC inside a firm. Therefore, the negative effect of recalls at the market level is reinforced, since the gaps are not filled by the same firms with other medicines of the same ATC-3.

The second column of Tab. 7 represents the first stage of the principal analysis. As illustrated, the effect of recalls at the ATC-3 level is strong and significant for both current and lagged recalls. After performing a sufficient number of bootstrap repetitions, we found that the t-statistic is invariant to whether we use recalls or lagged recalls to obtain it.³ This finding corresponds to a Sargan-Hansen test for over-identification in our context, implying the absence of over-identifying restrictions ([39]).

We believe that the key reason for the strength of the result lies in the level of aggregation. While firms with high-quality management and an inclination to risk can promptly make up for severe recalls, the latter catch ATC-3 markets unaware. Competitors could not anticipate severe recalls against firms producing in the same ATC as theirs, which can be detected only at the market level.

The absence of compensations at the market level has been further tested. In particular, we analyzed the effect of recalls on aggregated sales once the sales of firms having undergone a recall are removed from the sample. The drop in sales seems to disappear once firms having undergone a recall are excluded (see Fig. 7 in the Appendix).

The fall in sales observed at the ATC-3 level becomes evident not only from the estimates in

³ The t-statistic is obtained after 30,000 repetitions and amounts to 2.438.


Tab. 7 but also from the study of abnormal values in Section 5.1.3.

Finally, it might be argued that since recalled products are the most innovative ones, no direct substitute is present in the same market. However, knowing the generic name of products (both recalled and not recalled) and the active principle of medicines, it has been possible to detect an average of 10 products within the market exploiting the same active principle as the recalled products. This also supports the hypothesis that the gaps left by recalls are probably filled with products already present on the market and that recalled products are not necessarily the most innovative ones having no substitutes.

5.1.3 Analysis of Abnormal Values

This section reports estimates of the influence of drug recalls on sales. The effect of recalls is defined by taking a reference value of the given economic indicator as it would be observed under "normal" dynamics of economic conditions; this is called the "potential" value. We hence define the Abnormal Value (AV) of the indicator $y$ associated with unit $i$ at time $t$ as the difference between the observed and the potential value [60]:

$$AV_{it} = y_{it} - E(y_{it}), \tag{4}$$

The potential value $E(y_{it})$ is estimated by running a fixed-effects regression on the following model:

$$y_{it} = \alpha + \beta y_{st} + \gamma X_{it} + \mu_i + \lambda_t + u_{it}, \tag{5}$$

where $y_{st}$ is the aggregated value of $y$ in year $t$ at the sector level. The usual control variables ($X$) and year dummies are included as regressors. After obtaining estimates of $AV_{it}$ for all $i$ and $t$, referred to as $\widehat{AV}_{it}$, the time dimension is re-scaled. Specifically, for all units experiencing a recall in the time frame considered, the time dimension is centered on the year when the recall is issued. Only these units are kept in the sample. The market-level Abnormal Value $\overline{AV}_t$ associated with recalls is then computed as the simple average of $\widehat{AV}_{it}$ for any $t \in \{-(T-1), \dots, (T-1)\}$, as follows:

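$$\overline{AV}_t = \frac{1}{N_t} \sum_{i=1}^{N_t} \widehat{AV}_{it}, \tag{6}$$

where $N_t$ is the number of units with available data in $t$ among those experiencing one recall. Confidence intervals for $\overline{AV}_t$ are constructed from the variance of $\widehat{AV}_{it}$ as follows:

$$Var(\overline{AV}_t) = \frac{\sum_{i=1}^{N_t} Var(\widehat{AV}_{it})}{N_t^2}, \tag{7}$$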

where $Var(\widehat{AV}_{it})$ is the variance of the forecast error derived from the estimation of Equation 5.

The focus of the analysis is on the growth rate of sales volumes. The exercise is replicated for three classifications of recalls (standard recall definition, major recalls, type of recall) and three levels of analysis: product, firm, and sector level. The main text reports only the analysis at the ATC-3 level, as it is the level at which the first and second stages are


conducted. Abnormal values at the firm and product level can be found in the Appendix.

Note that in the model for the sector level, $y_{st}$ is replaced with $y_{mt}$ in Equation 5, that is, the value at the whole market level.
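The averaging in Equations 6 and 7 can be illustrated with a short sketch. It assumes that the abnormal values and their forecast-error variances have already been estimated from Equation 5, that units are independent, and that the interval uses a normal approximation (z = 1.96 for the 95% level); the function name and the numbers are hypothetical.

```python
import math

def average_abnormal_value(av_hat, forecast_var, z=1.96):
    """Event-time average of abnormal values with a normal-approximation interval.

    av_hat       -- estimated abnormal values of the units observed at event time t
    forecast_var -- forecast-error variance of each of those estimates
    """
    n_t = len(av_hat)
    mean = sum(av_hat) / n_t                   # Equation 6: simple average
    variance = sum(forecast_var) / n_t ** 2    # Equation 7: variance of the average
    half_width = z * math.sqrt(variance)
    return mean, (mean - half_width, mean + half_width)

# Hypothetical abnormal sales growth of four markets in the recall year (t = 0).
av_t, (lower, upper) = average_abnormal_value(
    [-0.10, -0.02, -0.06, -0.06],
    [0.0016, 0.0016, 0.0016, 0.0016],
)
```

With these made-up inputs the average abnormal value is about -0.06 and the whole interval lies below zero, which is the pattern one would read as a drop in sales in the recall year.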

Fig. 5 reports estimates of the effects of recalls on the AV of sales growth.

[Figure 5: four panels plotting abnormal values of sales (y-axis) against years relative to the recall (x-axis, -8 to 8; recall year = 0), each showing the AV_t series with lower and upper confidence bounds. Panels: ATC-3 general recalls sales; ATC-3 major recalls sales; ATC-3 Class I recalls sales; ATC-3 Class II recalls sales.]

Figure 5: Abnormal values at the ATC-3 level of aggregation. Years are normalized: year 0 represents the year of the recall. The four panels show the path of sales before and after the recall year under four different definitions of recalls: major recalls, Class I recalls, general recalls, and Class II recalls. As is evident from the diagrams, sales drop in the recall year for every type of recall. Major recalls present a more pronounced drop. Moreover, the lowest error bound is reached using major recalls.

Fig. 5 shows abnormal values at the ATC-3 level. Confidence intervals are constructed at the 95% level. As is evident from Fig. 5, after the initial drop in the year of the recall, sales recover within one or two years after year 0 (see major recalls). This observation characterizes the effect of the instrument employed in our work as a short-run effect, which distinguishes it from the long-run effect that demographic shocks produce in the work of [4].

The analysis of abnormal values confirms what was found in the previous paragraphs, in particular a considerable impact of recalls on sales in the year of the recall. The error bound is lower for the ATC-3 level with major recalls, thus reinforcing the expectation of a drop in sales at the time of the recall.

5.2 Relation between innovation and market size

In this section we report the results concerning the relationship between market size, $M_{it}$, and innovation, $N_{it}$. Since the data at our disposal are already converted into dollars of


2015 using the Consumer Price Index (CPI), market size is measured directly as the sum of sales over ATC market $i$ at time $t$. Innovation is measured by the number of activated trials in ATC $i$ at time $t$. The time window ranges from 2004 to 2013. The last two years of the sample (2014, 2015) have been dropped, since very few trials were conducted in that period. Including 2014 and 2015 might have biased the procedure, which exploits Poisson estimates. Indeed, this method does not tolerate a value of 0 for the dependent variable in most observations.

The panel is strongly balanced, as required by the procedure. Each year has data for 208 therapeutic classes.

The best model is estimated by Eq. (1).

We introduced several regressors. These comprise supply-side determinants, technological opportunities, and age determinants. We draw some controls directly from the literature, including knowledge stock (see, e.g., [15], [4] among others), as measured by the number of papers referring to ATC category $i$. The PubMed database has been consulted. Specifically, we collected the number of scientific works for a given ATC-3 in a given year through MeSH terms. According to the NIH, MeSH terms are official words or phrases selected to represent particular biomedical concepts. When labeling an article, indexers select terms only from the official MeSH list, never other spellings or variations. To decide whether a paper referred to a specific ATC class, a MeSH term was first associated with ATC category $i$, primarily exploiting the official synthetic description of the ATC. If this did not produce any result or did not match evidence from the literature, a double-check was made using level 3 indications as MeSH terms.⁴ The NCBI MeSH database allowed us to customize the searches. Since the number of papers showed an upward trend, the variable has been detrended by first-differencing its logarithm.

Another critical control drawn from the literature is the share of generics. As noted in [19], ease of entry and substantial financial incentives to use generics will reduce the expected profitability of innovation. Hence, it is vital to measure the degree of penetration of generics within markets, which might discourage firms from undertaking innovation.

Besides, as emphasized both in [4] and in [19], a further source of declining margins of innovation is the increasing number of young entrants within an ATC market. Pharmaceutical competition, in general, might undermine innovation productivity. It is, thus, imperative to measure and control for competition.

Apart from [4], the empirical literature does not model competition explicitly (see [19]). In the present work, we constructed two measures to control for pharmaceutical competition. The first is the Herfindahl index ("hhi" hereafter), which measures firms' size in relation to the market. It is usually employed as an indicator of the amount of competition among firms within an ATC-3 market. The major benefit of the Herfindahl index compared to other measures, such as the concentration ratio, is that it gives larger firms more weight. The index can range from 0 to 1.0, moving from many tiny firms to a single monopolistic producer. The second measure

4 For instance, category C6B is described as "PULMONARY ARTERIAL HYPERTENSION (PAH) PRODUCTS." Due to the name's length and the possibly different abbreviations employed in the MeSH terms list, MeSH terms have been searched by looking at different specifications of the description, such as "PAH PRODUCTS" and "PULMONARY ARTERIAL HYPERTENSION PRODUCTS." If the latter did not produce any result or the results were not in line with findings in the literature, then the MeSH indication at level 3, "PAH," was also selected.


controlling competition is the average age of firms within a market. It captures aspects of competition other than those captured by hhi. While hhi measures the "degree of monopoly" within an ATC, it does not describe the firms populating the market. The average age of firms, instead, mainly captures the presence of small biotechnology firms in the market. Such firms are known, on the one hand, to compete for innovation and, on the other, to have fewer financial resources than established companies (see, e.g., [26] among others). Since margins decline with the number of young entrants, we expect a negative sign of firms' average age.

Tab.4 presents the main results of the analysis. It is technically the second stage of the procedure described in the methodological section. Precisely, calling z_it2 the excluded instruments (recalls_it, recalls_it−1) (see footnote 5), the first stage estimation computes the residuals, u_it2, of a linear fixed-effect model whose dependent variable is market size. The second stage incorporates the residuals and estimates a fixed-effect Poisson model. Please refer to steps 1 and 2 in the methodological section.

Differently from the literature, in the present work it is not necessary to construct M_it based on demographic shifts, since the innovative instrument, recalls, already purges market size of endogeneity. In the following, M_it is simply the logarithm of collapsed sales at ATC-3 level, i.e., the product of the number of purchased drugs, expressed in standard units to ensure comparability, and their price.

Notice that a critical assumption of the model is that the excluded exogenous variables, R_it appearing within z_it, do not explicitly appear in the equation of Trials. For the more refined aggregation level at our disposal, ATC-3, it is plausible to assume that the average elasticity is the same across categories.
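The two-stage logic described above can be illustrated with a minimal sketch on simulated data. It is not the estimation code of the paper: the data-generating values (0.6 for market size, −0.8 for the endogeneity term), the panel dimensions, and the function names are all assumptions made for illustration, and the fixed-effect Poisson is fitted here as a Poisson regression with unit dummies, which yields the same slope estimates as the conditional fixed-effect Poisson estimator.

```python
import numpy as np

def within(a, ids):
    """Subtract unit means (fixed-effects 'within' transformation)."""
    a = np.asarray(a, dtype=float)
    out = a.copy()
    for g in np.unique(ids):
        rows = ids == g
        out[rows] -= a[rows].mean(axis=0)
    return out

def first_stage_residuals(m, Z, ids):
    """Linear fixed-effect regression of market size on the instruments."""
    mw, Zw = within(m, ids), within(Z, ids)
    pi, *_ = np.linalg.lstsq(Zw, mw, rcond=None)
    return mw - Zw @ pi

def poisson_unit_dummies(y, X, ids, max_iter=100, tol=1e-10):
    """Poisson regression with one dummy per unit, fitted by Newton's method.
    Assumes every unit has at least one positive count."""
    units, idx = np.unique(ids, return_inverse=True)
    D = np.eye(len(units))[idx]
    Xc = X - X.mean(axis=0)          # centring is absorbed by the dummies
    W = np.column_stack([Xc, D])
    k = Xc.shape[1]
    theta = np.zeros(W.shape[1])
    theta[k:] = np.log(D.T @ y / D.sum(axis=0))   # start at log unit means
    for _ in range(max_iter):
        mu = np.exp(W @ theta)
        step = np.linalg.solve(W.T @ (W * mu[:, None]), W.T @ (y - mu))
        theta += step
        if np.max(np.abs(step)) < tol:
            break
    return theta[:k]

# Simulated panel (hypothetical values, not the paper's data).
rng = np.random.default_rng(0)
n_units, n_years = 60, 8
ids = np.repeat(np.arange(n_units), n_years)
a = rng.normal(size=n_units)[ids]             # unit heterogeneity
Z = rng.normal(size=(ids.size, 2))            # stand-ins for recalls_it, recalls_it-1
u = rng.normal(scale=0.5, size=ids.size)      # idiosyncratic first-stage error
log_sales = 5.0 + 0.5 * a - 0.8 * Z[:, 0] - 0.4 * Z[:, 1] + u
trials = rng.poisson(np.exp(2.0 + 0.6 * (log_sales - 5.0) - 0.8 * u))

# Stage 1: residuals of the linear fixed-effect model for market size.
u_hat = first_stage_residuals(log_sales, Z, ids)

# Stage 2: fixed-effect Poisson including the first-stage residuals.
beta_cf, rho_hat = poisson_unit_dummies(
    trials, np.column_stack([log_sales, u_hat]), ids)

# For comparison: same model treating market size as exogenous.
beta_naive = poisson_unit_dummies(trials, log_sales[:, None], ids)[0]

print(f"naive: {beta_naive:.3f}  control function: {beta_cf:.3f}  rho: {rho_hat:.3f}")
```

In this simulated setting the coefficient on the residuals plays the role of ρ: a value far from zero signals endogeneity, and ignoring it pulls the market size coefficient away from the value used to generate the data. Inference on ρ in practice requires robust standard errors, which this sketch does not compute.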

5 We included two instruments since, following [28], instrumenting with more valid instruments leads to more accurate estimates.


Table 4: Impact of market size on innovation. Col.(1) employs a simple Poisson model not considering fixed effects. Col.(2) is the main specification (fixed effect Poisson). Col.(3) and Col.(4) add the lag of the dependent. Col.(5) eliminates all the controls.

                          (1)          (2)          (3)          (4)          (5)
                          Trials       Trials       Trials       log Trials   Trials
trials_t−1                                          −0.00741     0.0732*
                                                    (0.0005)     (0.0335)
Log sales                 0.1378***    0.6362**     0.802**      0.1176***    0.8229**
                          (0.0060)     (0.2149)     (0.266)      (0.0153)     (0.3174)
residuals                              −0.8018***   −0.862**                  −0.9711**
                                       (0.2157)     (0.269)                   (0.3177)
Kt+1P−1                   −0.5378***   −0.0926      −0.484***    −0.0504
                          (0.0847)     (0.0909)     (0.147)      (0.0914)
average age firm          0.2890***    −0.1332***   −0.106*      0.0634**
                          (0.0139)     (0.0377)     (0.0398)     (0.0214)
average age firm^2        −0.0038***   0.0021***    0.00178***   −0.0008**
                          (0.0002)     (0.0005)     (0.0005)     (0.0003)
hhi                       0.2245***    −0.3199      −0.145       0.1106
                          (0.0446)     (0.2903)     (0.360)      (0.1153)
share generics in ATC     −0.5571***   −0.3168**    −0.898***    −0.2036
                          (0.0404)     (0.1124)     (0.143)      (0.1068)
average age product       −0.0658***   −0.0592**    −0.0928**    −0.0564***
                          (0.0061)     (0.0190)     (0.0323)     (0.0143)
average age product^2     0.0010***    0.0011       0.0100       0.0010*
                          (0.0002)     (0.0010)     (1.64)       (0.0004)
papers                    0.5608***    0.1558*      0.101        −0.0443
                          (0.0728)     (0.0750)     (0.0013)     (0.1477)
papers^2                  −0.6672***   −0.0067      −0.0929      −0.0083
                          (0.1056)     (0.0838)     (0.083)      (0.0883)
# firms                   0.0090***    0.0008       −0.0032      0.0048**
                          (0.0007)     (0.0035)     (0.0792)     (0.0016)
Year Dummies              Yes          Yes          Yes          Yes          Yes
Obs.                      1664         1664         1664         1664         1872
Groups                    208          208          208          208          208

Standard errors in parentheses. * p < 0.05, ** p < 0.01, *** p < 0.001

Huber-White robust standard errors, clustered at ATC-3 level, are in parentheses. (1) fits a simple Poisson with exogenous sales. (2) represents the main specification. The dependent variable is the count of active trials in ATC i at time t. The time interval is 10 years. The technique adopted for the estimation is the one of Wooldridge [2019]; please refer to Section 4. (3) count model with lagged dependent among regressors following [4]. (4) linear model with lagged dependent among regressors and exogenous size. The dependent is linearized. Both the presence of non-linearities and of endogeneity are ignored. (5) best model without controls.

Column (1) presents a simple Poisson model with exogenous market size, exploring whether market size's positive effect is robust in the absence of fixed effects and endogeneity controls. Column (2) is our main specification, i.e., a fixed-effect Poisson controlling for market size's endogeneity.

The coefficients of interest in Tab.4 are Log sales and residuals. The former represents the market size, and the latter measures the endogeneity of market size. Specifically, a significant coefficient of residuals means a correlation between the error term (see the specification in Tab.4, i.e., second stage regression) and functions of the error of the model of the market size (first stage). In other words, residuals control for co-movements of sales and unobservables related to the number of trials. Market size is hence "purged" of the allegedly endogenous part. Endogeneity is tested with a Wald test on the coefficient ρ of residuals. If ρ is significantly different from zero, endogeneity is present. This occurs in our model, as expected (Column (2)). In particular, fully robust standard errors detect a strong idiosyncratic endogeneity. The use of fixed-effects methodologies allows the unobserved heterogeneity to be correlated with all explanatory variables and the excluded exogenous recalls. The evidence is that, even after allowing the market size to be correlated with the ATC heterogeneity, market size is not exogenous to idiosyncratic shocks.


The coefficient of market size is positive and significant, in line with past works. According to our estimate, a 10% increase in market size leads to an increase of almost 6.3% in active trials. The magnitude also conforms with the literature. Indeed, previous research generally finds elasticities of approximately 0.5, consistent with our estimates.

Recent literature speculated on the possibility that, though clinical trials might respond elastically to market size, the proportion of them resulting in effective innovation might decline (see e.g. [19] among others). Hence authors might have overestimated the effect of market size on clinical trials, since the latter should be computed only on the trials that effectively brought innovation. In the paper, we used active trials as the dependent variable, which partially solves the issue. We believe that active trials constitute the subset of promising trials in terms of innovative contribution. The higher estimated effect compared with the literature that adopts NMEs or NCEs as a dependent is well explained by the substantial costs of developing new pharmaceutical entities. Drug development is, in fact, quite expensive, with costs ranging from $800 million to $2.5 billion (see, e.g., [3]). Undertaking clinical trials is, instead, considerably cheaper, amounting to an average of $20 million to $40 million (see [43] as well as Johns Hopkins Bloomberg School of Public Health, 2018). Thus it is reasonable to suppose that, ceteris paribus, a 10% increase in market size stimulates more trials than NMEs or NCEs on average. Exceptions are still present (see [4], [20], who estimated a higher elasticity than the one of the present work).

The coefficients of the average age of firms and its square are in line with past observations (see, e.g., [31] and [7] for specific studies on the topic). The effect shows how the oldest firms tend to introduce less innovation than entrants in their early years. However, firms above intermediate ages appear almost as active in process innovations as entering firms and even more in product innovations ([31]).

Moreover, innovation decreases with the share of generics within a market. Thus, the effect theorized in [19] of decreasing margins of innovation proportionally to the entrance of generics proves to be correct (see also [37]).

In line with [4] and [51], technological advancements as measured by detrended papers are positively related to innovation. It is indeed reasonable to suppose that more trials emerge in markets where scientific research is prolific.
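As a worked illustration of how the coefficients of the main specification can be read, the short sketch below converts the Log sales coefficient of Tab.4, Column (2), into the percentage response of trials to a 10% increase in market size, and computes the age at which the quadratic in firms' average age reaches its minimum. The variable names are ours, and the turning point is expressed in whatever unit firm age is measured in the data.

```python
# Coefficients taken from Tab.4, Column (2).
beta_log_sales = 0.6362
b_age, b_age_sq = -0.1332, 0.0021

# In a Poisson model with a log regressor, E[trials] is proportional to
# sales**beta, so a 10% increase in sales scales expected trials by 1.10**beta.
pct_change_trials = (1.10 ** beta_log_sales - 1.0) * 100.0

# The quadratic b_age*age + b_age_sq*age**2 is minimised where its
# derivative b_age + 2*b_age_sq*age equals zero.
turning_point_age = -b_age / (2.0 * b_age_sq)

print(f"10% larger market -> {pct_change_trials:.2f}% more trials")
print(f"turning point of the firm-age profile: {turning_point_age:.1f}")
```

The exact response is slightly below the linear approximation 0.6362 × 10%, which is why the text speaks of an increase of almost 6.3%.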

The discrepancies in the magnitude of the coefficients between the main specification (Column (2)) and Column (1) of Tab.4 can be explained in several ways. In Column (1) of Tab.4, the correlation of units over time is not controlled. It is thus assumed that units are independent over both the cross-sectional and the time dimension, which is quite a strong restriction in a longitudinal setting. The assumption means that the same individual (market) observed at two different times, t0 and t1, is considered independent of itself. In other words, individual (market) i at time t0 is treated as a different individual (market) from individual (market) i at time t1. The main implication of such a presumption is that unobserved time-independent heterogeneities of individuals do not affect other individuals. However, since the same individual observed at two different times is considered "two distinct individuals," the model of Column (1) ultimately assumes that unobserved shocks of an individual (market) i at time t do not influence individual (market) i at time t+k. In other words, we are mixing between and within individual effects. Between effects are the effects obtained once the time component is averaged out from the variables. Between-effect


settings exploit differences between units, which in our case are independent by definition (we take ATC-3 markets, see previous Sections), not taking into account time variations. Therefore, the market size variance (time-demeaned) will be higher in a between-effect setting since it considers the average market size difference between independent ATC-3 markets. Furthermore, given the opposite time trends of trials and market size (see Fig. 1), in a between-effect setting the between effects of market size on innovation will be deflated. Indeed, from a specific time on, the innovation trend decreases while the market size trend increases. However, since time variations are not controlled in a between-effect setting, the inverse proportionality of market size and innovation emerges. Mixing between and within individual effects will, hence, result in an overall lower coefficient in Column (1) compared to Column (2).

Ultimately, the downwardly biased coefficient of Column (1) suggests that the unobserved heterogeneity is negatively correlated with trials.

To provide an example, consider the possibility that an ATC experienced a sizeable positive shock (more trials) in 2010. For some reason, the shock is neither modeled nor measured. All else being equal, the apparent fixed effect for that ATC in the period 2004-2013 will appear to be higher. However, from the literature, we know that the more products are available for treating a particular clinical condition, the lower the margins on each product (see [13] among others). The unobserved positive shock for the i-th ATC would therefore lower the margins of all competitor products in the same market, pushing down the sales for the same market. This negative correlation between the market size regressor and the error term deflates the estimate for market size. Vice versa, in Column (2), time dependency is controlled, and deflation is eliminated. The coefficient of market size is, therefore, higher than in Column (1). Column (1) also does not control for the reverse causality between market size and innovation. Not considering the reverse causality of market size contributes to biasing the market size coefficient upward (see, e.g., [4]). There are, therefore, two contrasting effects: the upward bias due to reverse causality and the downward bias due to unobserved heterogeneity. The two effects do not seem to offset each other, and the negative heterogeneity bias prevails over the reverse causality bias.

Robustness checks Col.(3)-(5) of Tab.4 investigate the robustness of the effect of market size on innovation. Three additional models are added to the preferred specification. Precisely, Column (3) reproduces the exercise of [4] to control for technological flows that possibly vary over time (see below) by adding lagged trials among the regressors. Since the estimating equation in Column (3) is nonlinear, we perform this instrumentation strategy by adding the residuals of the first stage. Column (4) is the same as Column (3), except that the dependent is log-linearized and residuals are ignored. Column (4) thus ignores both the presence of non-linearities and endogeneity.

Adding lags of the dependent is a valuable exercise. Indeed, following [4], the primary threat to the identification strategy is represented by changes in the flow rate of innovation for every dollar spent on research on a drug (permanent differences in innovation are already removed through the ATC fixed effects). Differences in the flow rate of innovation suggest that technological progress is scientifically more difficult in some lines than in others.


The parameter denoting innovation flow is part of the theoretical specification of innovation drawn from [4]. Following [4], if the flow rate of innovation varies over time, it is also likely to be serially correlated. Adding a lag of log innovation to the preferred specification is a simple way to check the importance of these concerns. The lagged trials are instrumented with their lags through a one-step system GMM procedure. The p-value of the Hansen test of overidentification of the model in Column (4) is 0.175, falling between the tolerance levels of 0.1 and 0.25 indicated in [52]. The Arellano-Bond test is reported in Tab.5.

                                                      z-score    p-value
Arellano-Bond test for AR(1) in first differences     −10.47     0.000
Arellano-Bond test for AR(2) in first differences     0.88       0.377
Arellano-Bond test for AR(3) in first differences     −1.46      0.145
Arellano-Bond test for AR(4) in first differences     0.33       0.740

Table 5: Arellano-Bond test for autocorrelation of first-differenced residuals of GMM

When the idiosyncratic errors are independently and identically distributed (i.i.d.), the first-differenced errors are first-order serially correlated. So, as expected, the output above presents strong evidence against the null hypothesis of zero autocorrelation in the first-differenced errors at order 1. Yet, as suggested in Roodman (2009), "in the context of an Arellano-Bond GMM regression, which is run on first differences, AR(1) is to be expected, and therefore the Arellano-Bond AR(1) test result is usually ignored in that context". Moreover, the output above presents no significant evidence of serial correlation in the first-differenced errors at orders 2, 3 and 4.

In Column (4), market size is again considered exogenous, though fixed effects are controlled. The model in Column (4) is linear. In order to ensure comparability among models, trials have been transformed to a logarithmic scale. Column (4) is, in other words, an essential control since, though controlling for fixed effects, it ignores the presence of potential non-linearity (misspecification) and endogeneity, while addressing the hypothesis of serial correlation.

Finally, Column (5) presents the model without any further control, as estimated by the preferred specification's control function approach. The idea behind Column (5) is to check whether omitting the control regressors compromises the main specification estimates.
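The reason why AR(1) in first differences is expected under i.i.d. errors can be checked numerically. The sketch below is a simulation on a single long series, not a reproduction of the Arellano-Bond test on our panel: it draws i.i.d. errors, takes first differences, and computes their sample autocorrelations. Since consecutive differences share one error term with opposite sign, the first-order autocorrelation is −1/2 in theory, while the second-order one is zero.

```python
import numpy as np

rng = np.random.default_rng(0)
errors = rng.normal(size=200_000)   # i.i.d. idiosyncratic errors (simulated)
d_errors = np.diff(errors)          # first-differenced errors

def autocorr(x, lag):
    """Sample autocorrelation of x at the given lag."""
    return np.corrcoef(x[lag:], x[:-lag])[0, 1]

r1 = autocorr(d_errors, 1)          # theory: -0.5
r2 = autocorr(d_errors, 2)          # theory:  0.0

print(f"AR(1) of differences: {r1:.3f}, AR(2) of differences: {r2:.3f}")
```

This is why the AR(2) and higher-order tests, rather than AR(1), are the informative ones in Tab.5.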

The outcomes of Col.(3)-(5) of Tab.4 confirm the estimates of the main specification as far as the positive effect of market size on innovation is concerned: all three columns display a positive effect. Specifically, Column (3) confirms the results of [4], finding no evidence of serial autocorrelation. In particular, the coefficient of lagged trials is negative and non-significant, as in [4]. Possible explanations are already in [4] and are, therefore, not discussed in the present work. In Column (4) of Tab.4, the positive coefficient of lagged trials is significant at the 5% tolerance


level. This evidence is almost in line with [4] when no instrumentation is performed (see footnote 6). Under this scenario, the lagged dependent's coefficient turned out to be positive and not significant also in [4]. Market size is again strongly and positively related to innovation, with a coefficient having the lowest magnitude of the specifications analyzed so far. Indeed, some of the variability might be captured by the lagged dependent. Moreover, possible misspecification bias due to the lack of correction for nonlinearity might intervene.

Notice that the effect of market size in Column (4) of Tab.4 is similar to that in Col. (1) of the same table, which does not control for endogeneity. Furthermore, the effect of market size is larger in the models correcting for endogeneity. Therefore, in general, the lack of control for temporal dependence may matter very little for estimation, which is also consistent with the fact that the autocorrelation coefficient is very weak. Otherwise, the coefficient of size in Columns (1) and (4) of Tab.4, whose only dissimilarity lies in the control for temporal dependencies (Col.(4)), would have differed considerably. Hence, it is reasonable to suppose that the lower magnitude of the coefficient of size in both Column (1) and Column (4) of Tab.4 is primarily a consequence of considering market size as exogenous. It is possible to provide further checks by controlling for possible overidentification of the instrumented lagged dependent variable. To do so, Tab.6 column (1) reports the two-step robust system GMM estimates of Column (4) of Tab.4, which, instead, was estimated with a one-step system GMM.

Table 6: Coefficients of market size and lag dependent when a two-step GMM is employed. Col.(2) includes suspended and withdrawn trials in the dependent.

                  (1)           (2)
                  log Trials    log Trials
trials_t−1        0.0592        0.380*
                  (0.0426)      (0.189)
Log sales         0.1208***     −0.533**
                  (0.159)       (0.179)
Year Dummies      Yes           Yes
Obs.              1664          1664
Groups            208           208

Standard errors in parentheses. * p < 0.05, ** p < 0.01, *** p < 0.001

Huber-White robust standard errors, clustered at ATC-3 level, are in parentheses. (1) is the two-step GMM version of Column (4) Tab.4, which performed a one-step system GMM. Only the critical coefficients are included. (2) is equal to (1), with suspended and withdrawn trials also included in the dependent. Only the critical coefficients are included. Both equations are linearized to enable a simple comparison with Column (4) Tab.4. The same results apply if the count of trials is employed as dependent (see [22]).

The coefficient of the market size in Tab.6 is higher compared to the one in Column (4) of Tab.4. Furthermore, the lagged dependent variable is not significant, in line with [4]. The same applies if the count of trials is employed as dependent (see [22]). The sign of the estimates of market size does not change with respect to the preferred model.

A final robustness check has been made by including all the trials, i.e., active ones as well as suspended and withdrawn ones. The number of classes employed is the same, though the number of trials increased by 0.57% of the total. This is performed in Tab.6 column (2). As displayed,

6 Different from Column (1), where residuals of the first stage are included.


our estimation does not confirm the hypothesis in [19], showing instead a lower coefficient both in terms of magnitude and significance level. The hypothesis is that including non-active trials, which are less responsive to market size, biases estimates toward randomness. For instance, firms having all trials suspended are unaffected by government price regulations that reduce the prices of treatments. Simultaneously, increases in market size could be less effective on such companies, which already have sunk costs due to inactive trials. The presence of endogeneity is confirmed.

Further robustness checks have been performed by changing the market size's proxy to align with [4], moving to another database to collect sales data (Evaluate sales are employed), and employing all the recalls at our disposal to instrument market size. In particular, Tab.7 shows the outcomes of the analysis adopting Class II and Class I recalls as instruments for market size. Tab.8 measures market size through the number of patients within an ATC-3, in accordance with [4].

Since the number of patients is highly correlated with sales and is employed as a natural alternative to sales, we adopted recalls as an instrument for the number of patients. The F-test amounts to 12 for the analysis with Evaluate and to 4 for the analysis with the number of patients.
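To make the role of the first-stage F-test concrete, the following sketch computes a classical F statistic for the joint significance of the excluded instruments in a linear fixed-effect first stage. It is an illustration on simulated data under simplifying assumptions: no other controls are included, the statistic is the homoskedastic one rather than a cluster-robust version, and the function name and coefficient values are hypothetical.

```python
import numpy as np

def within(a, ids):
    """Subtract unit means (fixed-effects 'within' transformation)."""
    a = np.asarray(a, dtype=float)
    out = a.copy()
    for g in np.unique(ids):
        rows = ids == g
        out[rows] -= a[rows].mean(axis=0)
    return out

def first_stage_f(m, Z, ids):
    """Classical F statistic for H0: all instrument coefficients are zero."""
    mw, Zw = within(m, ids), within(Z, ids)
    pi, *_ = np.linalg.lstsq(Zw, mw, rcond=None)
    rss_unrestricted = np.sum((mw - Zw @ pi) ** 2)
    rss_restricted = np.sum(mw ** 2)            # fixed effects only
    q = Zw.shape[1]                             # number of instruments
    dof = len(mw) - len(np.unique(ids)) - q     # residual degrees of freedom
    return ((rss_restricted - rss_unrestricted) / q) / (rss_unrestricted / dof)

# Simulated panel (hypothetical values, not the paper's data).
rng = np.random.default_rng(1)
n_units, n_years = 100, 8
ids = np.repeat(np.arange(n_units), n_years)
a = rng.normal(size=n_units)[ids]
Z = rng.normal(size=(ids.size, 2))              # stand-ins for current and lagged recalls
u = rng.normal(size=ids.size)

m_strong = a - 0.8 * Z[:, 0] - 0.4 * Z[:, 1] + u   # instruments move market size
m_weak = a - 0.03 * Z[:, 0] + u                    # instruments barely matter

f_strong = first_stage_f(m_strong, Z, ids)
f_weak = first_stage_f(m_weak, Z, ids)
print(f"F (strong instruments): {f_strong:.1f}, F (weak instruments): {f_weak:.1f}")
```

A common rule of thumb treats a first-stage F below 10 as a sign of weak instruments, which is the benchmark against which the values of 12 and 4 reported above can be read.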

Table 7: Col.(1) and Col.(2) represent first and second stage results using all the recalls at our disposal. Data are aggregated at the ATC-3 level. Impact of market size on innovation using Evaluate database (Col.(3)) number of patients (Col.(4)) as proxy of market size

                        (First-stage all recalls)  (Second-stage all recalls) (First-stage Evaluate)     (Second-stage Evaluate)
                        Log sales                  Trials                     Log sales                  Trials
˜recalls                0.00199                                               −0.260***
                        (0.64)                                                (0.0059)
˜recalls_t−1            −0.0223***                                            −0.0180
                        (−3.34)                                               (0.0105)
Log sales                                          0.580*                                                0.710**
                                                   (2.02)                                                (0.275)
Residuals                                          −0.739*                                               −0.724***
                                                   (−2.13)                                               (0.275)
Kt+1P−1                 0.191*                                                −0.173                     −0.147
                        (3.07)                                                (0.154)                    (0.276)
average age firm        0.153                      −0.129*                    0.0470                     0.224***
                        (1.67)                     (−2.28)                    (0.0678)                   (0.0045)
average age firm^2      −0.00195                   0.00207**                  −0.0004                    −0.0814***
                        (−1.53)                    (2.87)                     (0.0008)                   (0.0316)
hhi                     1.238***                   −0.250***                  1.721***                   −0.60
                        (4.79)                     (−0.56)                    (0.419)                    (0.495)
share generics in ATC   −0.178                     −0.326**                   −0.372                     0.00640
                        (−0.53)                    (−2.69)                    (0.328)                    (0.157)
average age prod.       0.0539                     −0.0559*                   −0.0330                    −0.0732**
                        (0.92)                     (−2.33)                    (0.0504)                   (0.0225)
average age prod.^2     −0.0037                    0.0008                     0.00138                    0.0009
                        (−1.72)                    (0.63)                     (0.00206)                  (0.0009)
papers                  −0.0267                    0.153*                     −0.139                     0.265**
                        (−0.52)                    (0.0507)                   (0.0803)                   (0.0917)
# firms                 0.00729                    0.00103                    −0.0106                    0.0224***
                        (1.04)                     (0.25)                     (0.0095)                   (0.0045)
Year Dummies            Yes                        Yes                        Yes                        Yes
Obs.                    1664                       1664                       1136                       1056
Groups                  208                        208                        142                        132

Standard errors in parentheses. * p < 0.05, ** p < 0.01, *** p < 0.001

Huber-White robust standard errors, clustered at ATC-3 level, are in parentheses. The table shows the results when all recalls at our disposal are employed. First-stage results are shown in Col. (1), while second-stage results are in Col. (2). The level of aggregation is ATC-3 and the estimation method is the same as the one adopted in the main analysis.


Tab.7 reports, in the first two columns, the first and second stage obtained employing Class I and Class II recalls as instruments for market size. The remaining columns are devoted to the results obtained using the Evaluate database to collect market sales. The results of the main analysis are confirmed in both exercises.

Employing all recalls decreases both the magnitude and the significance of the coefficients of sales. Moreover, only the lag of recalls is a good instrument at market level. These two effects are expected, since including minor recalls may attenuate the drop in sales following a recall. Indeed, Class II also includes temporary recalls (e.g. recalls due to a labeling error), which may not be unexpected to the firm (most of them are voluntary) and, for this reason, are taken into account by the management of the company. Losses in terms of sales are, therefore, well compensated. Furthermore, minor recalls are not publicized and cannot damage the image of the company or the market in which they happen.

Hence, adding minor recalls offsets the strong and negative impact of current recalls and, as a consequence, affects the estimates of market size in the second stage. Since, however, Class II recalls often concern minor but persistent issues (see footnote 7), a cumulative effect intervenes and lagged recalls remain a good instrument.

The outcomes of the main analysis remain robust when data on sales are collected from a different database.

Tab.8 reports the second stage results of the analysis with the number of patients as a measure of market size. First stage results are in the Appendix.

7 Minor recalls often pertain to the manufacturing of the product accessories. Their causes range from label mix-ups and the presence of particulate matter in certain lots to packaging issues. Though minor recalls do not threaten the health of patients directly, they are difficult for firms to correct in the short term.


Table 8: Impact of market size on innovation using number of patients as proxy of market size

                          (1)
                          Trials
Log patients              3.274***
                          (0.648)
residuals                 −3.291***
                          (0.647)
Kt+1P−1                   1.476***
                          (0.377)
average age firm          0.589***
                          (0.154)
average age firm^2        −0.00478**
                          (0.0016)
hhi                       0.145
                          (0.260)
share generics in ATC     −0.648**
                          (0.244)
average age product       −0.278***
                          (0.0514)
average age product^2     0.0011***
                          (0.0022)
papers                    0.546***
                          (0.147)
papers^2                  0.193
                          (0.114)
# firms                   0.0895***
                          (0.0144)
Year Dummies              Yes
Obs.                      1056
Groups                    132

Standard errors in parentheses. * p < 0.05, ** p < 0.01, *** p < 0.001

Huber-White robust standard errors, clustered at ATC-3 level, are in parentheses. (1) employs the MEPS database and matches the ATC-3 classes present in our database. Market size is measured through the number of patients within ATC-3.

Adopting the number of patients as a proxy for market size confirms the first and second stage results of the principal specification. The outcomes, however, turn out to be weaker in terms of significance than our main specification. Recalls do not seem to be a strong instrument for the number of patients. On the one hand, a larger number of patients within an ATC-3 class might make an adverse event more likely than in a scarcely populated ATC-3 class, increasing the probability of a major recall. On the other hand, however, there is no reason to believe that an adverse event would happen in a more populated class, which might refer to commonly employed (and therefore well tested) medicines. Moreover, a recall in a class causes a decrease in the number of patients adopting pharmaceuticals in the ATC-3 class in question, thus compensating the possible positive effect implied by a higher probability of adverse events. As far as the second stage is concerned, Tab.8 reinforces the results found in [4], where a coefficient of market size between 3 and 4 was found. The significance of residuals confirms the presence of endogeneity.

6 Conclusions

Recent research has stressed the importance of market size in determining the innovation rate in the pharmaceutical industry. At the same time, after the critique in [15], instrumenting with demographic shifts remains a weak, though valid, strategy. Moreover, recent contributions have stressed the importance of modeling competition and technological opportunities adequately (see [51], [19]). For example, many scholars pointed to the importance of advances in molecular biology and related fields for the industry's technological opportunities and innovative capabilities ([51]). Finally, the literature lacks analyses at an aggregation level that easily allows drawing policy implications. For this reason, the present work employs ATC-3, the aggregation level used by Antitrust authorities.

The empirical estimates are conducted on a unique database integrated with additional sources. The variety of our sources enabled us to collect and adequately classify data on trials and drug recalls in ATC-3 categories. The methodology employed is innovative ([39]) and, differently from past techniques, permits controlling for both idiosyncratic endogeneity and heterogeneity endogeneity. The technique consists of two stages. A simple Wald test on the residuals' coefficient in the second stage allows verifying the presence of idiosyncratic endogeneity.

An innovative instrument, recalls, has been employed for the first time in the literature. Recalls have been collected by consulting various sources, including FDA Enforcement reports, openFDA, and a database deriving from FOIA agreements with the FDA. Recalls are representative. Major recalls have been selected to meet the criteria of sharpness, indirect effect on the dependent variable (innovation), and exogeneity. The first stage displayed a substantial and significant negative impact of recalls on market size, thus validating the instrument. To the best of our knowledge, the effort constitutes an empirical novelty in a literature that mainly focuses on the optimal management of recalls and provides theoretical arguments for the negative impact of recalls at the firm level. Too few papers have focused on the impact of drug recalls at the market level.

Data on clinical trials have been drawn from the Clinicaltrials.gov website, from the pre-clinical phase to Phase IV. They have been integrated with data on INDs from a privately owned database maintained at IMT School for Advanced Studies. To overcome issues deriving from the potentially stronger response of market size to trials as a whole rather than to essential trials (i.e., those most likely to lead to an innovation), only activated trials have been selected. This exercise also provides a valid answer to the argument that the recall of a product might imply the suspension of drug trials within its same family. Indeed, suspended and withdrawn trials have been excluded from the analysis. Nonetheless, as a robustness check, estimates are also computed including the latter in the analysis. The effort confirms the presence of idiosyncratic endogeneity and the positive sign of the estimates. However, the magnitude and the significance level decrease.

Our preferred estimates align with the literature, displaying an increase in innovation of 6.3% after an increase in market size of 10%. The most recent study, [19], displays a lower coefficient of 0.23%. The authors specify that a comparison with other works exploiting different measures of innovation remains a difficult task. They further explain that their use of global data rather than U.S. data for the estimations might have led to less responsiveness.

Our results are robust to several specifications. The coefficients of the independent variables are in line with expectations, as is the scarce effect of lagged trials, already tested in [4]. Further checks confirm a positive and significant effect of market size on innovation even when fixed effects are not controlled and the market size is considered exogenous. This latter verification partially validates (as far as the sign and the significance are concerned) the recent findings of [51], who did not find evidence of reverse causality. However, the coefficient's magnitude decreases considerably compared to the preferred specification, showing a significant bias.


Estimates remain robust even when no control is inserted in the analysis.

The work provides interesting policy implications concerning stimuli to innovation and sheds some light on the impact of recalls at the market level. Governments, in particular, should be cautious whenever applying either tax or price policies in the pharmaceutical sector. As already mentioned, innovation is an economic phenomenon: companies innovate mainly to obtain a financial return. Aware of the positive relationship between market size and innovation, authorities and policymakers should not penalize economic players too much. To guarantee citizens’ future welfare, they should promote research and invest in new technologies while smartly managing generics’ competition.

Recalls, moreover, have an impact not just at the firm level but also at the market level. Specifically, they provoke adverse shocks on markets, thus affecting economic stability and welfare. Authorities should therefore apply more stringent rules to avoid severe recalls. At the same time, they should consider that an intensification of Class II and Class III recalls due to the presence of more players might be a natural consequence.

Future research might employ more up-to-date data in order to also include recalls of compounders and repackaging firms.

Acknowledgements: We are grateful to Crisis Lab. for the PHID database. We thank Prof. Jeff Wooldridge for the kind support in the application and interpretation of his innovative methodology. We would also like to thank the FDA for providing data on recalls and clarifying doubts about recalls’ timing and procedures. We are grateful to Evaluate data for having allowed very up-to-date and detailed checks on clinical trials data. Furthermore, we are thankful to Springer AdInsight for having enabled a detailed classification at ATC-3 level of rare clinical trials. Finally, we thank Young Economists of Tuscan Institutions (YETI), the AXES unit of the IMT Lucca for Advanced Studies, and the ASSA meeting participants for the valuable comments on the work.


Appendices

A Literature review table

The following table summarizes the literature’s findings on the relationship between market size and innovation in the pharmaceutical industry. In particular, the focus is on relevant works coming after [4]. The reason for this choice is that [4] represents a milestone in investigating the relation between market size and innovation in pharmaceuticals. It overcomes issues emerging in previous studies (above all, endogeneity) and is taken as a reference point by authors wishing to dig further into this literature stream.

Furthermore, the literature review reports the relationship between market size and innovation in the pharmaceutical industry only. Indeed, different industries have different definitions of recalls.


Table 9: The table reports relevant papers after [4]; NME stands for New Molecular Entities; NDA stands for New Drug Approval.

| Paper | Data and sample | Unit of observation | Measurement of innovation | Estimation method | Reported estimate of size | Proxy for market size |
|---|---|---|---|---|---|---|
| [4] | US; March CPS, 1965–2000; FDA; OECD report | number of units | NME^b | QML | > 0^4 | demographic measures |
| [15] | US; FOIA request; RND process^1; U.S. statistical abstract^2; 1968-1997 | 15 drug categories^1 | NME | FE, GLS, IV, Tobit | > 0^1 | demographic measures |
| [51] | U.S.; RND; FDA; OECD; ClinicalTrials.gov^3; 1974-2008 | disease | NDA; NME; Phase II and Phase III trials | QMLE (Poisson, 1995) | 0.3444 (NME); 0.3521 (NDA) | demographic measures |
| [19] | 14 countries^5; 1997-2007; IMS, WHO | chemical entity; dummies for ATC-1 and ATC-2 | NCE (elasticity)^b | OLS, 2SLS, CF approach (Wooldr., 2002) | 0.23 (average across ATC classes) | deaths and GDP^5 |
| [11] | US; 1998-2010; Pharmaprojects^6; MEPS; OECD; NIH | 49 therapeutic classes | R&D^6 | Negative Bin.; Poisson | 0.26; 0.41; 0.51^7 | demographic shifts^5,c |

1 RND process of the pharmaceutical sector (gov. funds); > 0 means that the exogenous increase in market size is initially associated with approximately 0.08 more drugs introduced in the market. These new drugs reduce the mortality rates of individuals aged 65 and older by 0.8 percent. This decrease in mortality rate leads to increases in market size (more demand), producing an additional increase of drugs equal to 0.096.

2 Population data for market size.

3 Both Cerda and Rake consulted the 19th edition of the Drug Information Handbook published by Lexi-Comp and the American Pharmaceutical Association (Lacy et al., 2010). This handbook is comparable to a pharmaceutical dictionary, providing a list of drugs’ active ingredients, the medical conditions the drug is used for, and further information such as adverse effects. The work takes into account only those medical conditions which can be found on the FDA-approved label. Hence, unlabeled and investigational uses are not present. For the period 1974 to 2008, FDA approved 599 unique NMEs and 1,665 unique NDAs. These approvals refer to the 208 diseases or medical indications analyzed in this study. However, an NME or NDA may be used as therapy for several medical indications. In this case, an NME or NDA is counted as innovation for all the medical indications for which it is approved.

4 The estimates suggest that a 1 percent increase in the potential market size for a drug category leads to a 6 percent increase in the total number of new drugs entering the U.S. market.

5 Data come from IMS (Intercontinental Marketing Services) and include all product sales in 14 countries^a (Australia, Brazil, Canada, China, France, Germany, Italy, Japan, Mexico, Korea, Spain, Turkey, United Kingdom, USA). Dubois et al. have data on the ATC-4 (they report 607 different classes), the main active ingredient of the drug (they report 6216 different active ingredients), the name of the firm producing the drug, whether it has been licensed, the patent start date, and the format of the drug (the work reports 471 different formats). Products in the same ATC-4 by definition have the same indication and mechanism of action. The authors do not consider OTC drugs. Quantities are given in standard units, one standard unit corresponding to the smallest typical dose of a product form, as defined by IMS Health.

6 Pharmaprojects trend data "snapshot" (focus on R&D): focus only on one instance of innovation, as explained in [26]. The authors specify the adoption of clinical trials (from pre-clinical phase to Phase III) not taken from ClinicalTrials.gov (see below).

7 For a drug class with average Medicare market share (41%, in 2004–2005), Duggan and Scott Morton’s result translates to an 11% increase in revenues following Medicare Part D. Our Phase I estimates correspond, for a drug class with average Medicare market share, to a 26% increase for 2004–2005, a 33% increase post-implementation in 2006–2007, and a lagged 51% increase in 2008–2010. These estimates imply an elasticity of Phase I clinical trials of 2.4 to 4.7 compared to the market size, bracketing Acemoglu and Linn’s estimated elasticity of 3.5 for approved new molecular entities (NMEs). However, when considering all clinical trials combined, including Phase III trials for supplemental indications, the estimated elasticity of clinical trials with respect to market size is somewhat lower than Acemoglu and Linn’s estimated elasticity of 6 for all new drug approvals, but certainly still more prominent than the Dubois et al. (2011) estimate of about 0.25.

Summary results: "The results indicate that the increase in outpatient prescription drug coverage provided through Medicare Part D has had a significant impact on pharmaceutical R&D"

Critiques:

a [11] states that several of the countries chosen regulate prescription drug prices, and regulations may change rapidly over time. Thus, given the lower expected profit per consumer and greater uncertainty about future profits and prices, firms’ R&D decisions are likely to be less responsive to a unit change in expected revenues for all these countries combined versus the same unit change in the U.S. market (Sood et al., 2009).

b [11]: they measured firms’ innovative activities via clinical trials, whereas Dubois et al. (2011) and Acemoglu and Linn (2004) evaluate the responsiveness of approved and marketed drugs to changes in market size.

c [19]: the authors credit [11] with having exploited an innovative measure of market share (policy change in Medicare Part D).

List of controls:

[4] Potential supply-side determinants of innovation (changes in scientific incentives); proxies for pre-existing time trends across sectors; lagged dependent variable; life-years lost; public funding; pre-existing trends; major category trends; health insurance market size (see pages 1077-1080 for further details on variables).

[15] Gov. expenditure (Medicare and social security); gov. research efforts (grants on research); year dummies; some demographic information such as prevalence rates of disease i on males (fraction of males/white/married attending hospital due to i), blacks, whites, and married individuals, as well as the average age of individuals affected by disease i.

[51] The empirical analysis draws upon the literature concerning the “demand-pull” versus “technology-push” debate and takes into account demand- and supply-side factors as the explanatory variables for pharmaceutical innovation. Regressors used comprise knowledge stock (consisting of the scientific publications (Pub_it) related to medical indication i and published in year t; BioPharmInsight database); regulatory stringency (average time between the submission of a new drug approval to the FDA and its final approval); pre-sample mean of new pharmaceuticals; mortality rate per medical indication in 1983 to account for differences in the pre-sample prevalence of medical indication; pre-sample technological opportunities, constructed as the average annual growth rate of the knowledge stock from 1979 to 1983.

[11] Prescription drugs; funding grants for each disease class.


B Time to event analysis

[Figure: Kaplan-Meier survival estimates. x-axis: analysis time (0 to 10); y-axis ticks: 0.00 to 1.00; series: recalls = 0, recalls = 1.]

Figure 6: Kaplan-Meier time to event analysis. The x-axis represents the number of years until death. Recalled products (recalls = 1) are already scaled down at 2 years of survival time with respect to non-recalled products (recalls = 0). The median survival time for recalled medicines is about 4 years, while that for non-recalled medicines is about 7 years. The survival function of recalled products remains below the survival function of non-recalled drugs. This means that recalls affect sales for a long period of time. In other words, the sales lost by recalled products leave a lasting gap in their market. Missed sales are hence not a temporary event, which shows how long the gap provoked by a product’s recall needs to be covered.
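The product-limit estimator behind a plot such as Figure 6 can be sketched in a few lines of Python. This is only an illustration: the function names, the treatment of market exit as the "death" event, and the toy durations are assumptions made here and are not the paper's data or code.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return [(time, survival)] for the Kaplan-Meier product-limit estimator.

    durations: years until the event or until censoring, one entry per product.
    observed:  True if the event ("death" of the product) was seen,
               False if the product was still alive when observation ended.
    """
    deaths = Counter(t for t, d in zip(durations, observed) if d)
    exits = Counter(durations)  # events and censorings both leave the risk set
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        if deaths[t]:
            # Multiply by the share of at-risk products surviving time t.
            survival *= 1.0 - deaths[t] / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


def median_survival(curve):
    """First time at which the estimated survival falls to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # survival never reaches 0.5 in the observed window


# Hypothetical toy data (not taken from the paper's database).
recalled = kaplan_meier(
    [1, 2, 2, 4, 4, 5, 7], [True, True, True, True, False, True, False]
)
not_recalled = kaplan_meier(
    [3, 5, 6, 7, 7, 9, 10], [True, False, True, True, True, False, False]
)
print(median_survival(recalled), median_survival(not_recalled))
```

Comparing the two curves point by point, rather than only their medians, corresponds to the caption's remark that one survival function stays below the other over the whole window.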

C Abnormal Values (firm and product levels)

[Figure: abnormal values by year relative to the recall. x-axis: Year (-8 to 8), Recall Year = 0; y-axis: abnormal value stdunits; series: AV_t, Lower C.I., Upper C.I.]

Figure 7: Effect of recalls on market sales once firms that underwent a major recall are excluded. The absence of any effect (i.e., increases in sales due to recalls of competitors) at recall time for products other than those of the recalled firm indicates the absence of compensation both at time 0 and soon after the recall.


The following figures represent abnormal values at the firm and product level for different types of recall (according to their severity).

[Figure panels: Product: general recalls sales; Product: Class I recalls sales; Product: Class II recalls sales. x-axis: Year (-8 to 8), Recall Year = 0; y-axis: abnormal value sales; series: AV_t product, Lower C.I., Upper C.I.]

Figure 8: Abnormal values for product aggregation. Years are normalized. Year 0 represents the year of recall. The three scenarios include the path of sales before and after the recall year, using three different definitions of recalls: Class I recalls, general recalls and Class II recalls. As shown in the pictures, sales at product level drop at recall year.


[Figure panels: Firm: general recalls sales; Firm: Maj. recalls sales; Firm: Class I recalls sales; Firm: Class II recalls sales. x-axis: Year (-8 to 8), Recall Year = 0; y-axis: abnormal value sales; series: AV_t firm, Lower C.I., Upper C.I.]

Figure 9: Abnormal values at firm aggregation. Years are normalized. Year 0 represents the year of recall. The four scenarios include the path of sales before and after the recall year, using four different definitions of recalls: major recalls, Class I recalls, general recalls and Class II recalls. Apart from major recalls, which catch firms unaware, the other types of recalls do not affect firms’ sales. This might be due to compensation of sales within firms.

The effect of recalls is evident at all aggregation levels except the firm level (a topic for future development). Possible hypotheses are detailed in the main text.
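The year normalization used in this appendix (year 0 is the recall year) can be sketched as follows. The definition of the abnormal value below (sales relative to the product's own pre-recall mean, minus one), the normal-approximation confidence band, the function name, and the toy panel are all assumptions chosen for illustration; the paper's own definition is given in the main text and may differ.

```python
from collections import defaultdict
from statistics import mean, stdev


def abnormal_values_by_event_year(panel, recall_year, window=8):
    """Average abnormal sales by normalized year, with an approximate 95% band.

    panel:       {product: {calendar_year: sales}}
    recall_year: {product: calendar year of the recall}
    Returns {event_year: (lower, average, upper)}.
    """
    by_event_year = defaultdict(list)
    for product, series in panel.items():
        t0 = recall_year[product]
        pre = [s for year, s in series.items() if year < t0]
        if not pre:
            continue  # no pre-recall baseline for this product
        baseline = mean(pre)
        for year, sales in series.items():
            k = year - t0  # normalized year: 0 is the recall year
            if -window <= k <= window:
                # Assumed definition: relative deviation from the baseline.
                by_event_year[k].append(sales / baseline - 1.0)
    result = {}
    for k in sorted(by_event_year):
        values = by_event_year[k]
        av = mean(values)
        if len(values) > 1:
            half = 1.96 * stdev(values) / len(values) ** 0.5
        else:
            half = float("nan")  # no band with a single observation
        result[k] = (av - half, av, av + half)
    return result


# Hypothetical toy panel: two products recalled in different calendar years.
panel = {
    "drug_a": {2001: 100, 2002: 100, 2003: 70, 2004: 60},
    "drug_b": {2004: 50, 2005: 50, 2006: 40, 2007: 30},
}
recall_year = {"drug_a": 2003, "drug_b": 2006}
for k, (lower, av, upper) in abnormal_values_by_event_year(panel, recall_year).items():
    print(k, round(lower, 3), round(av, 3), round(upper, 3))
```

Aligning products on event time rather than calendar time is what allows recalls from different years to be averaged in a single plot.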

D First stage rob. checks

This section displays the significant coefficients of the first stage when the number of patients is employed as the measure of market size.

Table 10: First stage of the robustness check using the number of patients as measure of market size

| | (1) # patients |
|---|---|
| ~recalls | 0.0519 (0.0333) |
| ~recalls_{t−1} | −0.262∗ (0.149) |
| Year Dummies | Yes |
| Obs. | 1056 |
| Groups | 132 |

Standard errors in parentheses. ∗ p < 0.05, ∗∗ p < 0.01, ∗∗∗ p < 0.001

Huber-White robust standard errors, clustered at ATC-3 level, are in parentheses. (1) First stage results when the MEPS database is employed and market size is measured with the number of patients. Second stage results are in Tab. 8.
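The logic of a two-stage control-function estimate, with a first stage like the one in Table 10 and a test on the first-stage residual in the second stage, can be sketched as below. This is a deliberately simplified sketch and not the estimator of [39]: it swaps the nonlinear second stage for a linear one, uses one regressor and one instrument, absorbs fixed effects by demeaning, and reports a non-robust t-statistic instead of a cluster-robust Wald test. All names and the simulated data are hypothetical; only the panel dimensions (132 groups, 1056 observations) are borrowed from Table 10.

```python
import random


def demean_by_group(values, groups):
    """Within transformation: subtract each group's mean (absorbs fixed effects)."""
    sums, counts = {}, {}
    for v, g in zip(values, groups):
        sums[g] = sums.get(g, 0.0) + v
        counts[g] = counts.get(g, 0) + 1
    return [v - sums[g] / counts[g] for v, g in zip(values, groups)]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def control_function(y, x, z, groups):
    """Return (beta, rho, t_rho) from a linear two-stage control function.

    beta:  coefficient on the endogenous regressor x ("market size").
    rho:   coefficient on the first-stage residual.
    t_rho: non-robust t-statistic for rho = 0 (endogeneity check).
    """
    y, x, z = (demean_by_group(s, groups) for s in (y, x, z))
    # Stage 1: regress x on the instrument z and keep the residual v.
    pi = dot(z, x) / dot(z, z)
    v = [xi - pi * zi for xi, zi in zip(x, z)]
    # Stage 2: regress y on x and v by solving the 2x2 normal equations.
    sxx, sxv, svv = dot(x, x), dot(x, v), dot(v, v)
    det = sxx * svv - sxv * sxv
    beta = (svv * dot(x, y) - sxv * dot(v, y)) / det
    rho = (sxx * dot(v, y) - sxv * dot(x, y)) / det
    # Homoskedastic variance of rho; ignores clustering by design.
    resid = [yi - beta * xi - rho * vi for yi, xi, vi in zip(y, x, v)]
    dof = len(y) - len(set(groups)) - 2
    var_rho = dot(resid, resid) / dof * sxx / det
    return beta, rho, rho / var_rho ** 0.5


# Simulated panel: 132 groups observed for 8 years (hypothetical values).
random.seed(0)
groups, y, x, z = [], [], [], []
for g in range(132):
    alpha = random.gauss(0, 1)            # group fixed effect
    for _ in range(8):
        zi = random.gauss(0, 1)           # instrument (recall shock)
        u = random.gauss(0, 1)            # unobserved shock causing endogeneity
        xi = alpha - 0.8 * zi + u + random.gauss(0, 0.5)
        yi = alpha + 0.6 * xi + u + random.gauss(0, 0.5)
        groups.append(g)
        y.append(yi)
        x.append(xi)
        z.append(zi)

beta, rho, t_rho = control_function(y, x, z, groups)
print(f"beta={beta:.3f} rho={rho:.3f} t_rho={t_rho:.2f}")
```

In this linear, just-identified case the control-function coefficient on x coincides with the usual instrumental-variable ratio, which gives a convenient sanity check; a large t-statistic on the residual signals that treating x as exogenous would bias the estimate.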


References

[1] https://ungerconsulting.net/fda-enforcement-statistics-fy-2009-2015/. Accessed: 23.03.2019.

[2] https://lagunatreatment.com/fda-drug-recalls/. Accessed: 23.03.2019.

[3] https://www.efmc.info/medchemwatch-2012-1/sme-2.php#:~:text=Drug%20Discovery%20Scientist-,Drug%20Discovery,now%20estimated%20at%20%241.8%20billion. Accessed: 3.12.2020.

[4] Daron Acemoglu and Joshua Linn. Market size in innovation: theory and evidence from the pharmaceutical industry. The Quarterly journal of economics, 119(3):1049–1090, 2004.

[5] Zeina Alsharkas. Firm size, competition, financing and innovation. International Journal of Management and Economics, 44(1):51–73, 2014.

[6] Ram Bala, Pradeep Bhardwaj, and Pradeep K Chintagunta. Pharmaceutical product recalls: Category effects and competitor response. Marketing Science, 36(6):931–943, 2017.

[7] Natarajan Balasubramanian and Jeongsik Lee. Firm age and innovation. Industrial and Corporate Change, 17(5):1019–1047, 2008.

[8] George Ball, Jeffrey Thomas Macher, and Ariel Dora Stern. Recalls, innovation, and competitor response: Evidence from medical device firms. 2018.

[9] George P Ball, Rachna Shah, and Kaitlin D Wowak. Product competition, managerial discretion, and manufacturing recalls in the us pharmaceutical industry. Journal of Operations Management, 58:59–72, 2018.

[10] John M Bertoni, John Philip Arlette, Hubert H Fernandez, Cheryl Fitzer-Attas, Karen Frei, Mohamed N Hassan, Stuart H Isaacson, Mark F Lew, Eric Molho, William G Ondo, et al. Increased melanoma risk in parkinson disease: a prospective clinicopathological study. Archives of Neurology, 67(3):347–352, 2010.

[11] Margaret E Blume-Kohout and Neeraj Sood. Market size and innovation: Effects of medicare part d on pharmaceutical research and development. Journal of public economics, 97:327–336, 2013.

[12] Bowe C. Merck quarterly profits hit by vioxx recall, 2005. [Online; accessed 15-April-2021].

[13] Timothy F Bresnahan and Peter C Reiss. Entry and competition in concentrated markets. Journal of political economy, 99(5):977–1009, 1991.

[14] A Colin Cameron and Pravin K Trivedi. Regression analysis of count data, volume 53. Cambridge university press, 2013.

[15] Rodrigo A Cerda. Endogenous innovations in the pharmaceutical industry. Journal of Evolutionary Economics, 17(4):473–515, 2007.

[16] Jessie Cheng. An antitrust analysis of product hopping in the pharmaceutical industry. Colum. L. Rev., 108:1471, 2008.

[17] Abdulkadir Civan and Michael T Maloney. The effect of price on pharmaceutical r&d. The BE Journal of Economic Analysis & Policy, 9(1), 2009.

[18] Joseph A DiMasi, Ronald W Hansen, and Henry G Grabowski. The price of innovation: new estimates of drug development costs. Journal of health economics, 22(2):151–185, 2003.

[19] Pierre Dubois, Olivier De Mouzon, Fiona Scott-Morton, and Paul Seabright. Market size and pharmaceutical innovation. The RAND Journal of Economics, 46(4):844–871, 2015.

[20] Mark Duggan and Fiona Scott Morton. The effect of medicare part d on pharmaceutical prices and utilization. American Economic Review, 100(1):590–607, 2010.

[21] M. Provost et al. Pharmaceutical antitrust law in european union. Dechert LLP, 2019.


[22] Nutarelli F. At the edge of economics and machine-learning (unpublished doctoral dissertation). IMT Lucca for Advanced Studies, 2021.

[23] Luca Gallelli, Caterina Palleria, Antonio De Vuono, Laura Mumoli, Piero Vasapollo, Brunella Piro, and Emilio Russo. Safety and efficacy of generic drugs with respect to brand formulation. Journal of pharmacology & pharmacotherapeutics, 4(Suppl1):S110, 2013.

[24] Paul A Geroski and Chris F Walters. Innovative activity over the business cycle. The Economic Journal, 105(431):916–928, 1995.

[25] Carmelo Giaccotto, Rexford E Santerre, and John A Vernon. Drug prices and research and development investment behavior in the pharmaceutical industry. The Journal of Law and Economics, 48(1):195–214, 2005.

[26] Bronwyn H Hall and Nathan Rosenberg. Handbook of the Economics of Innovation, volume 1. Elsevier, 2010.

[27] Kelsey Hall, Tyler Stewart, Jongwha Chang, and Maisha Kelly Freeman. Characteristics of fda drug recalls: A 30-month analysis. American Journal of Health-System Pharmacy, 73(4):235–240, 2016.

[28] Christian Hansen, Jerry Hausman, and Whitney Newey. Estimation with many instrumental variables. Journal of Business & Economic Statistics, 26(4):398–422, 2008.

[29] Iraj Hashi and Nebojša Stojčić. The impact of innovation activities on firm performance using a multi-stage model: Evidence from the community innovation survey 4. Research Policy, 42(2):353–366, 2013.

[30] Venit J.S. Hawk, B.E. and Huser H.L. Recent developments in eu merger control. Antitrust, (15):24, 2000.

[31] Elena Huergo and Jordi Jaumandreu. How does probability of innovation change with firm age? Small Business Economics, 22(3-4):193–207, 2004.

[32] Boyan Jovanovic. Product recalls and firm reputation. Technical report, National Bureau of Economic Research, 2020.

[33] Alfred Kleinknecht and Bart Verspagen. Demand and innovation: Schmookler re-examined. Research policy, 19(4):387–394, 1990.

[34] Steven Klepper and Franco Malerba. Demand, innovation and industrial dynamics: an introduction. Industrial and Corporate Change, 19(5):1515–1520, 2010.

[35] Srinivas Kolluru and Pundarik Mukhopadhaya. Empirical studies on innovation performance in the manufacturing and service sectors since 1995: A systematic review. Economic Papers: A journal of applied economics and policy, 36(2):223–248, 2017.

[36] Margaret K Kyle and Anita M McGahan. Investments in pharmaceuticals before and after trips. Review of Economics and Statistics, 94(4):1157–1172, 2012.

[37] Jean O Lanjouw. Patents, price controls, and access to new drugs: how policy affects global market entry. Technical report, National Bureau of Economic Research, 2005.

[38] Frank R Lichtenberg. Pharmaceutical innovation as a process of creative destruction. Knowledge Accumulation and Industry Evolution: The Case of Pharma-Biotech, page 61, 2006.

[39] Wei Lin and Jeffrey M Wooldridge. Testing and correcting for endogeneity in nonlinear unobserved effects models. In Panel Data Econometrics, pages 21–43. Elsevier, 2019.

[40] Chin-jung Luan, Chengli Tien, and Yi-chuang Chi. Downsizing to the wrong size? a study of the impact of downsizing on firm performance during an economic downturn. The International Journal of Human Resource Management, 24(7):1519–1535, 2013.

[41] Franco Malerba. Innovation and the evolution of industries. In Innovation, Industrial Dynamics and Structural Transformation, pages 7–27. Springer, 2007.


[42] Anthony Markham. Lurbinectedin: first approval. Drugs, pages 1–9, 2020.

[43] Linda Martin, Melissa Hutchens, Conrad Hawkins, and Alaina Radnov. How much do clinical trials cost?, 2017.

[44] Kamel Mellahi and Adrian Wilkinson. A study of the association between level of slack reduction following downsizing and innovation output. Journal of Management Studies, 47(3):483–508, 2010.

[45] David Mowery and Nathan Rosenberg. The influence of market demand upon innovation: a critical review of some recent empirical studies. Research policy, 8(2):102–153, 1979.

[46] Igho J Onakpoya, Carl J Heneghan, and Jeffrey K Aronson. Worldwide withdrawal of medicinal products because of adverse drug reactions: a systematic review and analysis. Critical reviews in toxicology, 46(6):477–489, 2016.

[47] Ariel Pakes and Mark Schankerman. The rate of obsolescence of patents, research gestation lags, and the private rate of return to research resources. In R&D, patents, and productivity, pages 73–88. University of Chicago Press, 1984.

[48] Fabio Pammolli, Laura Magazzini, and Massimo Riccaboni. The productivity crisis in pharmaceutical r&d. Nature reviews Drug discovery, 10(6):428–438, 2011.

[49] Jorge V Pérez-Rodríguez and Beatriz GL Valcarcel. Do product innovation and news about the r&d process produce large price changes and overreaction? the case of pharmaceutical stock prices. Applied Economics, 44(17):2217–2229, 2012.

[50] W Price and II Nicholson. Making do in making drugs: Innovation policy and pharmaceutical manufacturing. BCL Rev., 55:491, 2014.

[51] Bastian Rake. Determinants of pharmaceutical innovation: the role of technological opportunities revisited. Journal of Evolutionary Economics, 27(4):691–727, 2017.

[52] David Roodman. How to do xtabond2: An introduction to difference and system gmm in stata. The stata journal, 9(1):86–136, 2009.

[53] Frederic M Scherer. Demand-pull and technological invention: Schmookler revisted. The Journal of Industrial Economics, pages 225–237, 1982.

[54] Jacob Schmookler. Invention and economic growth. Harvard University Press, 2013.

[55] Vishal B Siramshetty, Janette Nickel, Christian Omieczynski, Bjoern-Oliver Gohlke, Malgorzata N Drwal, and Robert Preissner. Withdrawn—a resource for withdrawn and discontinued drugs. Nucleic acids research, 44(D1):D1080–D1086, 2016.

[56] Gregory N Stock, Noel P Greis, and William A Fischer. Firm size and dynamic technological innovation. Technovation, 22(9):537–549, 2002.

[57] Paul Stoneman. Soft innovation: economics, product aesthetics, and the creative industries. Oxford University Press, 2010.

[58] George Symeonidis. Innovation, firm size and market structure: Schumpeterian hypotheses and some new themes. 1996.

[59] Terence N. Merck pulls vioxx painkiller from market, and stock plunges, 2004. [Online; accessed 15-April-2021].

[60] Sriram Thirumalai and Kingshuk Sinha. Product recalls in the medical device industry: An empirical exploration of the sources and financial consequences. Management Science, 57:376–392, 2011.

[61] Carl H Tong, Lee-Ing Tong, and James E Tong. The vioxx recall case and comments. Competitiveness Review: An International Business Journal, 2009.

[62] Anish Vaishnav. Product market definition in pharmaceutical antitrust cases: Evaluating cross-price elasticity of demand. Colum. Bus. L. Rev., page 586, 2011.


[63] Željko Vujović. A case study of the application of weka software to solve the problem of liver inflamation. 2021.

[64] Wikipedia contributors. Anatomical therapeutic chemical classification system — Wikipedia, the free encyclopedia. https://en.wikipedia.org/w/index.php?title=Anatomical_Therapeutic_Chemical_Classification_System&oldid=1053053987, 2021. [Online; accessed 15-November-2021].

[65] Wikipedia contributors. Overfitting — Wikipedia, the free encyclopedia. https://en.wikipedia.org/w/index.php?title=Overfitting&oldid=1045065285, 2021. [Online; accessed 17-November-2021].
