
arXiv:1410.0163v1 [stat.ME] 1 Oct 2014

Statistical Science
2014, Vol. 29, No. 3, 323–358
DOI: 10.1214/14-STS480
© Institute of Mathematical Statistics, 2014

Instrumental Variables: An Econometrician’s Perspective1

Guido W. Imbens

Abstract. I review recent work in the statistics literature on instrumental variables methods from an econometrics perspective. I discuss some of the older, economic, applications including supply and demand models and relate them to the recent applications in settings of randomized experiments with noncompliance. I discuss the assumptions underlying instrumental variables methods and in what settings these may be plausible. By providing context to the current applications, a better understanding of the applicability of these methods may arise.

Key words and phrases: Simultaneous equations models, randomized experiments, potential outcomes, noncompliance, selection models.

1. INTRODUCTION

Instrumental Variables (IV) refers to a set of methods developed in econometrics starting in the 1920s to draw causal inferences in settings where the treatment of interest cannot be credibly viewed as randomly assigned, even after conditioning on additional covariates, that is, settings where the assumption of no unmeasured confounders does not hold.2

Guido W. Imbens is the Applied Econometrics Professor and Professor of Economics, Graduate School of Business, Stanford University, Stanford, California 94305, USA and NBER. E-mail: [email protected]; URL: http://www.gsb.stanford.edu/users/imbens.

1 Discussed in 10.1214/14-STS494, 10.1214/14-STS485, 10.1214/14-STS488, 10.1214/14-STS491; rejoinder at 10.1214/14-STS496.

This is an electronic reprint of the original article published by the Institute of Mathematical Statistics in Statistical Science, 2014, Vol. 29, No. 3, 323–358. This reprint differs from the original in pagination and typographic detail.

2 There is another literature in econometrics using instrumental variables methods also to deal with classical measurement error (where explanatory variables are measured with error that is independent of the true values). My remarks in the current paper do not directly reflect on the use of instrumental variables to deal with measurement error. See Sargan (1958) for a classical paper, and Hillier (1990) and Arellano (2002) for more recent discussions.

In the last two decades, these methods have attracted considerable attention in the statistics literature. Although this recent statistics literature builds on the earlier econometric literature, there are nevertheless important differences. First, the recent statistics literature primarily focuses on the binary treatment case. Second, the recent literature explicitly allows for treatment effect heterogeneity. Third, the recent instrumental variables literature (starting with Imbens and Angrist (1994); Angrist, Imbens and Rubin (1996); Heckman (1990); Manski (1990); and Robins (1986)) explicitly uses the potential outcome framework used by Neyman for randomized experiments and generalized to observational studies by Rubin (1974, 1978, 1990). Fourth, in the applications this literature has concentrated on, including randomized experiments with noncompliance, the intention-to-treat or reduced-form estimates are often of greater interest than they are in the traditional econometric simultaneous equations applications.

Partly the recent statistics literature has been motivated by the earlier econometric literature on instrumental variables, starting with Wright (1928) (see the discussion on the origins of instrumental variables in Stock and Trebbi (2003)). However, there are also other antecedents, outside of the traditional econometric instrumental variables literature, notably the work by Zelen on encouragement designs (Zelen, 1979, 1990). Early papers in the recent statistics literature include Angrist, Imbens and Rubin (1996), Robins (1989) and McClellan and Newhouse (1994). Recent reviews include Rosenbaum (2010), Vansteelandt et al. (2011) and Hernán and Robins (2006). Although these reviews include many references to the earlier economics literature, it might still be useful to discuss the econometric literature in more detail to provide some background and perspective on the applicability of instrumental variables methods in other fields. In this discussion, I will do so.

Instrumental variables methods have been a central part of the econometrics canon since the first half of the twentieth century, and continue to be an integral part of most graduate and undergraduate textbooks (e.g., Angrist and Pischke, 2009; Bowden and Turkington (1984); Greene (2011); Hayashi (2000); Manski (1995); Stock and Watson (2010); Wooldridge, 2010, 2008). Like the statisticians Fisher and Neyman (Fisher (1925); Splawa-Neyman, 1990), early econometricians such as Wright (1928), Working (1927), Tinbergen (1930) and Haavelmo (1943) were interested in drawing causal inferences, in their case about the effect of economic policies on economic behavior. However, in sharp contrast to the statistical literature on causal inference, the starting point for these econometricians was not the randomized experiment. From the outset, there was a recognition that in the settings they studied, the causes, or treatments, were not assigned to passive units (economic agents in their setting, such as individuals, households, firms or countries). Instead the economic agents actively influence, or even explicitly choose, the level of the treatment they receive. Choice, rather than chance, was the starting point for thinking about the assignment mechanism in the econometrics literature. In this perspective, units receiving the active treatment are different from those receiving the control treatment not just because of the receipt of the treatment: they (choose to) receive the active treatment because they are different to begin with. This makes the treatment potentially endogenous, and creates what is sometimes in the econometrics literature referred to as the selection problem (Heckman (1979)).

The early econometrics literature on instrumental variables did not have much impact on thinking in the statistics community. Although some of the technical work on large sample properties of various estimators did get published in statistics journals (e.g., the still influential Anderson and Rubin, 1949 paper), applications by noneconomists were rare. It is not clear exactly what the reasons for this are. One possibility is the fact that the early literature on instrumental variables was closely tied to substantive economic questions (e.g., interventions in markets), using theoretical economic concepts that may have appeared irrelevant or difficult to translate to other fields (e.g., supply and demand). This may have suggested to noneconomists that the instrumental variables methods in general had limited applicability outside of economics. The use of economic concepts was not entirely avoidable, as the critical assumptions underlying instrumental variables methods are substantive and require subtle subject matter knowledge. A second reason may be that although the early work by Tinbergen and Haavelmo used a notation that is very similar to what Rubin (1974) later called the potential outcome notation, quickly the literature settled on a notation only involving realized or observed outcomes; see for a historical perspective Hendry and Morgan (1992) and Imbens (1997). This realized-outcome notation that remains common in the econometric textbooks obscures the connections between the Fisher and Neyman work on randomized experiments and the instrumental variables literature. It is only in the 1990s that econometricians returned to the potential outcome notation for causal questions (e.g., Heckman (1990); Manski (1990); Imbens and Angrist (1994)), facilitating and initiating a dialogue with statisticians on instrumental variable methods.

The main theme of the current paper is that the early work in econometrics is helpful in understanding the modern instrumental variables literature, and furthermore, is potentially useful in improving applications of these methods and identifying potential instruments. These methods may in fact be useful in many settings statisticians study. Exposure to treatment is rarely solely a matter of chance or solely a matter of choice. Both aspects are important and help to understand when causal inferences are credible and when they are not. In order to make these points, I will discuss some of the early work and put it in a modern framework and notation. In doing so, I will address some of the concerns that have been raised about the applicability of instrumental variables methods in statistics. I will also discuss some areas where the recent statistics literature has extended and improved our understanding of instrumental variables methods. Finally, I will review some of the econometric terminology and relate it to the statistical literature to remove some of the semantic barriers that continue to separate the literatures. I should emphasize that many of the topics discussed in this review continue to be active research areas, about which there is considerable controversy both inside and outside of econometrics.

The remainder of the paper is organized as follows. In Section 2, I will discuss the distinction between the statistics literature on causality, with its primary focus on chance, arising from its origins in the experimental literature, and the econometrics or economics literature, with its emphasis on choice. The next two sections discuss in detail two classes of examples. In Section 3, I discuss the canonical example of instrumental variables in economics, the estimation of supply and demand functions. In Section 4, I discuss a modern class of examples, randomized experiments with noncompliance. In Section 5, I discuss the substantive content of the critical assumptions, and in Section 6, I link the current literature to the older textbook discussions. In Section 7, I discuss some of the recent extensions of traditional instrumental variables methods. Section 8 concludes.

2. CHOICE VERSUS CHANCE IN TREATMENT ASSIGNMENT

Although the objectives of causal analyses in statistics and econometrics are very similar, traditionally statisticians and economists have approached these questions very differently. A key difference in the approaches taken in the statistical and econometric literatures is the focus on different assignment mechanisms, those with an emphasis on chance versus those with an emphasis on choice. Although in practice in many observational studies assignment mechanisms have elements of both chance and choice, the traditional starting points in the two literatures are very different, and it is only recently that these literatures have discovered how much they have in common.3

3 In both literatures, it is typically assumed that there is no interference between units. In the statistics literature, this is often referred to as the Stable Unit Treatment Value Assumption (SUTVA, Rubin (1978)). In economics, there are many cases where this is not a reasonable assumption because there are general equilibrium effects. In an interesting recent experiment, Crépon et al. (2012) varied the scale of experimental interventions (job training programs in their case) in different labor markets and found that the scale substantially affected the average effects of the interventions. There is also a growing literature on settings directly modeling interactions. In this discussion, I will largely ignore the complications arising from interference between units. See, for example, Manski (2000a).

2.1 The Statistics Literature: The Focus on Chance

The starting point in the statistics literature, going back to Fisher (1925) and Splawa-Neyman (1990), is the randomized experiment, with both Fisher and Neyman motivated by agricultural applications where the units of analysis are plots of land. To be specific, suppose we are interested in the average causal effect of a binary treatment or intervention, say fertilizer A or fertilizer B, on plot yields. In the modern notation and language originating with Rubin (1974), the unit (plot) level causal effect is a comparison between the two potential outcomes, Yi(A) and Yi(B) [e.g., the difference τi = Yi(B) − Yi(A)], where Yi(A) is the potential outcome given fertilizer A and Yi(B) is the potential outcome given fertilizer B, both for plot i. In a completely randomized experiment with N plots, we select M (with M ∈ {1, . . . , N − 1}) plots at random to receive fertilizer B, with the remaining N − M plots assigned to fertilizer A. Thus, the treatment assignment, denoted by Xi ∈ {A, B} for plot i, is by design independent of the potential outcomes.4 In this specific setting, the work by Fisher and Neyman shows how one can draw exact causal inferences. Fisher focused on calculating exact p-values for sharp null hypotheses, typically the null hypothesis of no effect whatsoever, Yi(A) = Yi(B) for all plots. Neyman focused on developing unbiased estimators for the average treatment effect Σi (Yi(B) − Yi(A))/N and the variance of those estimators.
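To make the Fisher–Neyman setup concrete, here is a minimal simulation sketch in Python; the plot-level yields, the effect size, and all numbers are hypothetical, not taken from any data discussed here:

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 100, 50  # N plots, M of which receive fertilizer B

# Hypothetical potential outcomes Y(A), Y(B) for each plot.
Y_A = rng.normal(10.0, 2.0, size=N)
Y_B = Y_A + 1.5 + rng.normal(0.0, 1.0, size=N)  # average effect 1.5

# Completely randomized assignment: M plots at random get B.
treated = np.zeros(N, dtype=bool)
treated[rng.choice(N, size=M, replace=False)] = True
Y_obs = np.where(treated, Y_B, Y_A)

# Difference in means estimates the average effect sum_i (Yi(B) - Yi(A))/N;
# the variance estimator below is Neyman's conservative one.
tau_hat = Y_obs[treated].mean() - Y_obs[~treated].mean()
var_hat = Y_obs[treated].var(ddof=1) / M + Y_obs[~treated].var(ddof=1) / (N - M)
print(tau_hat, np.sqrt(var_hat))
```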

The subsequent literature in statistics, much of it associated with the work by Rubin and coauthors (Cochran (1968); Cochran and Rubin (1973); Rubin, 1974, 1990, 2006; Rosenbaum and Rubin, 1983; Rubin and Thomas (1992); Rosenbaum, 2002, 2010; Holland (1986)), has focused on extending and generalizing the Fisher and Neyman results that were derived explicitly for randomized experiments to the more general setting of observational studies. A large part of this literature focuses on the case where the researcher has additional background information available about the units in the study.


4 To facilitate comparisons with the econometrics literature, I will follow the notation that is common in econometrics, denoting the endogenous regressors, here the treatment of interest, by Xi, and later the instruments by Zi. Additional (exogenous) regressors will be denoted by Vi. In the statistics literature, the treatments of interest are often denoted by Wi, the instruments by Zi, with Xi denoting additional regressors or attributes.


The additional information is in the form of pretreatment variables or covariates not affected by the treatment. Let Vi denote these covariates. A key assumption in this literature is that conditional on these pretreatment variables the assignment to treatment is independent of the potential outcomes. Formally,

Xi ⊥ (Yi(A), Yi(B))|Vi (unconfoundedness).

Following Rubin (1990), I refer to this assumption as unconfoundedness given Vi, also known as no unmeasured confounders. This assumption, in combination with the auxiliary assumption that for all values of the covariates the probability of being assigned to each level of the treatment is strictly positive, is referred to as strong ignorability (Rosenbaum and Rubin, 1983). If we assume only that Xi ⊥ Yi(A)|Vi and Xi ⊥ Yi(B)|Vi rather than jointly, the assumption is referred to as weak unconfoundedness (Imbens (2000)), and the combination as weak ignorability. Substantively, it is not clear that there are cases in the setting with binary treatments where the weak version is plausible but not the strong version, although the difference between the two assumptions has some content in the multivalued treatment case (Imbens (2000)). In the econometric literature, closely related assumptions are referred to as selection-on-observables (Barnow, Cain and Goldberger (1980)) or exogeneity.

Under weak ignorability (and thus also under strong ignorability), it is possible to estimate precisely the average effect of the treatment in large samples. In other words, the average effect of the treatment is identified. Various specific methods have been proposed, including matching, subclassification and regression. See Rosenbaum (2010), Rubin (2006), Imbens (2004, 2014), Gelman and Hill (2006), Imbens and Rubin (2014) and Angrist and Pischke (2009) for general discussions and surveys. Robins and coauthors (Robins (1986); Gill and Robins (2001); Richardson and Robins (2013); Van der Laan and Robins, 2003) have extended this approach to settings with sequential treatments.

2.2 The Econometrics Literature: The Focus on Choice

In contrast to the statistics literature whose point of departure was the randomized experiment, the starting point in the economics and econometrics literatures for studying causal effects emphasizes the choices that led to the treatment received. Unlike the original applications in statistics where the units are passive, for example, plots of land, with no influence over their treatment exposure, units in economic analyses are typically economic agents, for example, individuals, families, firms or administrations. These are agents with objectives and the ability to pursue these objectives within constraints. The objectives are typically closely related to the outcomes under the various treatments. The constraints may be legal, financial or information-based.

The starting point of economic science is to model these agents as behaving optimally. More specifically, this implies that economists think of every one of these agents as choosing the level of the treatment to most efficiently pursue their objectives given the constraints they face.5 In practice, of course, there is often evidence that not all agents behave optimally. Nevertheless, the starting point is the presumption that optimal behavior is a reasonable approximation to actual behavior, and the models economists take to the data often reflect this.

2.3 Some Examples

Let us contrast the statistical and econometric approaches in a highly stylized example. Roy (1951) studies the problem of occupational choice and the implications for the observed distribution of earnings. He focuses on an example where individuals can choose between two occupations, hunting and fishing. Each individual has a level of productivity associated with each occupation, say, the total value of the catch per day. For individual i, the two productivity levels are Yi(h) and Yi(f), for the productivity level if hunting and fishing, respectively.6

Suppose the researcher is interested in the average difference in productivity in these two occupations, τ = E[Yi(f) − Yi(h)], where the averaging is over the population of individuals.7 The researcher observes for all units in the sample the occupation they chose

5 In principle, these objectives may include the effort it takes to find the optimal strategy, although it is rare that these costs are taken into account.

6 In this example, the no-interference (SUTVA) assumption that there are no effects of other individuals’ choices and, therefore, that the individual level potential outcomes are well defined is tenuous—if one hunter is successful that will reduce the number of animals available to other hunters—but I will ignore these issues here.

7 That is not actually the goal of Roy’s original study, but that is beside the point here.


(Xi, equal to h for hunters and f for fishermen) and the productivity in their chosen occupation,

Yi^obs = Yi(Xi) = { Yi(h) if Xi = h,
                    Yi(f) if Xi = f.

In the Fisher–Neyman–Rubin statistics tradition, one might start by estimating τ by comparing productivity levels by occupation:

τ̂ = Ȳ^obs_f − Ȳ^obs_h,

where

Ȳ^obs_f = (1/Nf) Σ_{i: Xi = f} Yi^obs,   Ȳ^obs_h = (1/Nh) Σ_{i: Xi = h} Yi^obs,

Nf = Σ_{i=1}^{N} 1{Xi = f} and Nh = N − Nf.

If there is concern that these unadjusted differences are not credible as estimates of the average causal effect, the next step in this approach would be to adjust for observed individual characteristics such as education levels or family background. This would be justified if individuals can be thought of as choosing, at least within homogenous groups defined by covariates, randomly which occupation to engage in.

Roy, in the economics tradition, starts from a very different place. Instead of assuming that individuals choose their occupation (possibly after conditioning on covariates) randomly, he assumes that each individual chooses her occupation optimally, that is, the occupation that maximizes her productivity:

Xi = { f if Yi(f) ≥ Yi(h),
       h otherwise.

There need not be a solution in all cases, especially if there is interference, and thus there are general equilibrium effects, but I will assume here that such a solution exists. If this assumption about the occupation choice were strictly true, it would be difficult to learn much about τ from data on occupations and earnings. In the spirit of research by Manski (1990, 2000b, 2001), Manski and Pepper (2000), and Manski et al. (1992), one can derive bounds on τ, exploiting the fact that if Xi = f, then the unobserved Yi(h) must satisfy Yi(h) ≤ Yi(f), with Yi(f) observed. For the Roy model, the specific calculations have been reported in Manski (1995), Section 2.6. Without additional information or restrictions, these bounds might be fairly wide, and often one would not learn much about τ. However, the original version of the Roy model, where individuals know ex ante the exact value of the potential outcomes and choose the level of the treatment corresponding to the maximum of those, is ultimately not plausible in practice. It is likely that individuals face uncertainty regarding their future productivity, and thus may not be able to choose the ex post optimal occupation; see for bounds under that scenario Manski and Nagin (1998). Alternatively, and this is emphasized in Athey and Stern (1998), individuals may have more complex objective functions taking into account heterogenous costs or nonmonetary benefits associated with each occupation. This creates a wedge between the outcomes that the researcher focuses on and the outcomes that the agent optimizes over. What is key here in relation to the statistics literature is that under the Roy model and its generalizations the very fact that two individuals have different occupations is seen as indicative that they have different potential outcomes, thus fundamentally calling into question the unconfoundedness assumption that individuals with similar pretreatment variables but different treatment levels are comparable. This concern about differences between individuals with the same values for pretreatment variables but different treatment levels underlies many econometric analyses of causal effects, specifically in the literature on selection models. See Heckman and Robb (1985) for a general discussion.
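As a concrete illustration of this last point, the following sketch simulates the Roy model (with hypothetical, made-up productivity distributions) and shows that the naive fisherman-minus-hunter comparison generally differs from τ = E[Yi(f) − Yi(h)]:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 100_000

# Hypothetical potential productivities in hunting and fishing.
Y_h = rng.lognormal(mean=0.0, sigma=0.5, size=N)
Y_f = rng.lognormal(mean=0.1, sigma=0.5, size=N)

# Roy's choice rule: each individual picks the more productive occupation.
X_f = Y_f >= Y_h
Y_obs = np.where(X_f, Y_f, Y_h)

tau_true = np.mean(Y_f - Y_h)                        # population estimand
tau_naive = Y_obs[X_f].mean() - Y_obs[~X_f].mean()   # comparison by occupation
print(f"true tau: {tau_true:.3f}, naive estimate: {tau_naive:.3f}")
```

Because both groups are selected on having the larger potential outcome, the observed occupational difference mixes the causal effect with selection, exactly the failure of unconfoundedness described above.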

Let me discuss two additional examples. There is a large literature in economics concerned with estimating the causal effect of educational achievement (measured as years of education) on earnings; see for general discussions Griliches (1977) and Card (2001). One starting point, and in fact the basis of a large empirical literature, is to compare earnings for individuals who look similar in terms of background characteristics, but who differ in terms of educational achievement. The concern in an equally large literature is that those individuals who chose to acquire higher levels of education did so precisely because they expected their returns to additional years of education to be higher than individuals who chose not to acquire higher levels of education expected their returns to be. In the terminology of the returns-to-education literature, the individuals choosing higher levels of education may have higher levels of ability, which lead to higher earnings for given levels of education.

Another canonical example is that of voluntary job training programs. One approach to estimate the causal effect of training programs on subsequent earnings would be to compare earnings for those participating in the program with earnings for those who did not. Again the concern would be that those who choose to participate did so because they expected bigger benefits (financial or otherwise) from doing so than individuals who chose not to participate.

These issues also arise in the missing data literature. The statistics literature (Rubin, 1976, 1987, 1996; Little and Rubin, 1987) has primarily focused on models that assume that units with item nonresponse are comparable to units with complete response, conditional on covariates that are always observed. The econometrics literature (Heckman, 1976, 1979) has focused more heavily on models that interpret the nonresponse as the result of systematic differences between units. Philipson (1997a, 1997b), Philipson and DeSimone (1997), and Philipson and Hedges (1998) take this even further, viewing survey response as a market transaction, where individuals not responding to the survey do so deliberately because the costs of responding outweigh the benefits to these nonrespondents. The Heckman-style selection models often assume strong parametric alternatives to the Little and Rubin missing-at-random or ignorability condition. This has often in turn led to estimators that are sensitive to small changes in the data generating process. See Little (1985).

These issues of nonrandom selection are of course not special to economics. Outside of randomized experiments, the exposure to treatment is typically also chosen to achieve some objectives, rather than randomly within homogenous populations. For example, physicians presumably choose treatments for their patients optimally, given their knowledge and given other constraints (e.g., financial). Similarly, in economics and other social sciences one may view individuals as making optimal decisions, but these are typically made given incomplete information, leading to errors that may make the ultimate decisions appear as good as random within homogenous subpopulations. What is important is that the starting point is different in the two disciplines, and this has led to the development of substantially different methods for causal inference.

2.4 Instrumental Variables

How do instrumental variables methods address the type of selection issues the Roy model raises?

At the core, instrumental variables change the incentives for agents to choose a particular level of the treatment, without affecting the potential outcomes associated with these treatment levels. Consider a job training program example where the researcher is interested in the average effect of the training program on earnings. Each individual is characterized by two potential earnings outcomes, earnings given the training and earnings in the absence of the training. Each individual chooses to participate or not based on their perceived net benefits from doing so. As pointed out in Athey and Stern (1998), it is important that these net benefits that enter into the individual’s decision differ from the earnings that are the primary outcome of interest to the researcher. They do so by the costs associated with participating in that regime. Suppose that there is variation in the costs individuals incur with participation in the training program. The costs are broadly defined, and may include travel time to the program facilities, or the effort required to become informed about the program. Furthermore, suppose that these costs are independent of the potential outcomes. This is a strong assumption, often made more plausible by conditioning on covariates. Measures of the participation cost may then serve as instrumental variables and aid in the identification of the causal effects of the program. Ultimately, we compare earnings for individuals with low costs of participation in the program with those for individuals with high costs of participation and attribute the difference in average earnings to the increased rate of participation in the program between the two groups.
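The comparison in the last sentence is the familiar Wald/IV ratio. A minimal sketch, assuming arrays z (a hypothetical binary low-cost indicator), d (participation) and y (earnings); these names are mine, not from the text:

```python
import numpy as np

def wald_estimate(y: np.ndarray, d: np.ndarray, z: np.ndarray) -> float:
    """IV (Wald) estimate: the earnings difference between low-cost (z == 1)
    and high-cost (z == 0) groups, scaled by the difference in their
    participation rates."""
    itt_y = y[z == 1].mean() - y[z == 0].mean()  # effect of incentive on earnings
    itt_d = d[z == 1].mean() - d[z == 0].mean()  # effect of incentive on participation
    return itt_y / itt_d
```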

In almost all cases, the assumption that there is no direct effect of the change in incentives on the potential outcomes is controversial, and it needs to be assessed on a case-by-case basis. The second part of the assumption, that the costs are independent of the potential outcomes, possibly after conditioning on covariates, is qualitatively very different. In some cases, it is satisfied by design, for example, if the incentives are randomized. In observational studies, it is a substantive, unconfoundedness-type assumption that may be more plausible, or at least approximately hold, after conditioning on covariates. For example, in a number of studies researchers have used physical distance to facilities as instruments for exposure to treatments available at such facilities. Such studies include McClellan and Newhouse (1994) and Baiocchi et al. (2010), who use distance to hospitals with particular capabilities as an instrument for treatments associated with those capabilities, after conditioning on distance to the nearest medical facility, and Card (1995), who uses distance to colleges as an instrument for attending college.

3. THE CLASSIC EXAMPLE: SUPPLY AND DEMAND

In this section, I will discuss the classic example of instrumental variables methods in econometrics, that is, simultaneous equations. Simultaneous equations models are both at the core of the econometrics canon and at the core of the confusion concerning instrumental variables methods in the statistics literature. More precisely, in this section I will look at supply and demand models that motivated the original research into instrumental variables. Here, the endogeneity, that is, the violation of unconfoundedness, arises from an equilibrium condition. I will discuss the model in a very specific example to make the issues clear, as I think that perhaps the level of abstraction used in the older econometric textbooks has hampered communication with researchers in other fields.

3.1 Discussions in the Statistics Literature

To show the level of frustration and confusion in the statistics literature with these models, let me present some quotes. In a comment on Pratt and Schlaifer (1984), Dawid (1984) writes “I despair of ever understanding the logic of simultaneous equations well enough to tackle them” (page 24). Cox (1992) writes in a discussion on causality “it seems reasonable that models should be specified in a way that would allow direct computer simulation of the data. . . . This, for example, precludes the use of y2 as an explanatory variable for y1 if at the same time y1 is an explanatory variable for y2” (page 294). This restriction appears to rule out the first model Haavelmo considers, that is, equations (1.1) and (1.2) (Haavelmo (1943), page 2):

Y = aX + ε1,   X = bY + ε2

(see also Haavelmo, 1944). In fact, the comment by Cox appears to rule out all simultaneous equations models of the type studied by economists. Holland (1988), in a comment on structural equation methods in econometrics, writes “why should [this disturbance] be independent of [the instrument]. . . when the very definition of [this disturbance] involves [the instrument]” (page 460). Freedman writes “Additionally, some variables are taken to be exogenous (independent of the disturbance terms) and some endogenous (dependent on the disturbance terms). The rationale is seldom clear, because—among other things—there is seldom any very clear description of what the disturbance terms mean, or where they come from” (Freedman (2006), page 699).

3.2 The Market for Fish

The specific example I will use in this section is the market for whiting (a particular white fish, often used in fish sticks) traded at the Fulton fish market in New York City. Whiting was sold at the Fulton fish market at the time by a small number of dealers to a large number of buyers. Kathryn Graddy collected data on quantities and prices of whiting sold by a particular trader at the Fulton fish market on 111 days between December 2, 1991, and May 8, 1992 (Graddy, 1995, 1996; Angrist, Graddy and Imbens (2000)). I will take as the unit of analysis a day, and interchangeably refer to this as a market. Each day, or market, during the period covered in this data set, indexed by t = 1, . . . , 111, a number of pounds of whiting are sold by this particular trader, denoted by Q^obs_t. Not every transaction on the same day involves the same price, but to focus on the essentials I will aggregate the total amount of whiting sold and the total amount of money it was sold for, and calculate a price per pound (in cents) for each of the 111 days, denoted by P^obs_t. Figure 1 presents a scatterplot of the observed log price and log quantity data. The average quantity sold over the 111 days was 6335 pounds, with a standard deviation of 4040 pounds, for an average of the average within-day prices of 88 cts per pound and a standard deviation of 34 cts. For example, on the first day of this period 8058 pounds were sold for an average of 65 cents, and the next day 2224 pounds were sold for an average of 100 cents. Table 1 presents averages of log prices and log quantities for the fish data.

Now suppose we are interested in predicting the effect of a tax in this market. To be specific, suppose the government is considering imposing a 100 × r% tax (e.g., a 10% tax) on all whiting sold, but before doing so it wishes to predict the average percentage change in the quantity sold as a result of the tax. We may formalize that by looking at the average effect on the logarithm of the quantity, τ = E[ln Qt(r) − ln Qt(0)], where Qt(r) is the quantity traded in market/day t if the tax rate were set at r. The problem, substantially worse than in the standard causal inference setting where for some units we observe one of the two potential outcomes and for other units we observe the other potential outcome, is that in all 111 markets we observe the quantity traded at tax rate 0, Q^obs_t = Qt(0), and we never see the quantity traded at the tax rate contemplated by the government, Qt(r). Because only E[ln Qt(0)] is directly estimable from data on the quantities we observe, the question is how to draw inferences about E[ln Qt(r)].

Fig. 1. Scatterplot of log prices and log quantities.

A naive approach would be to assume that a tax increase by 10% would simply raise prices by 10%. If one additionally is willing to make the unconfoundedness assumption that prices can be viewed as set independently of market conditions on a particular day, it follows that those markets after the introduction of the tax where the price net of taxes is $1.00 would on average be like those markets prior to the introduction of the 10% tax where the price was $1.10. Formally, this approach assumes that

E[ln Qt(r) | P^obs_t = p] = E[ln Qt(0) | P^obs_t = (1 + r) × p],   (3.1)

implying that

E[ln Qt(r) − ln Qt(0) | P^obs_t = p]
   = E[ln Q^obs_t | P^obs_t = (1 + r) × p] − E[ln Q^obs_t | P^obs_t = p]
   ≈ E[ln Q^obs_t | ln P^obs_t = r + ln p] − E[ln Q^obs_t | ln P^obs_t = ln p].

The last quantity is often estimated using linear regression methods. Typically, the regression function is assumed to be linear in logarithms with constant coefficients,

ln Q^obs_t = α_ls + β_ls × ln P^obs_t + ε_t.   (3.2)

Table 1
Fulton fish market data (N = 111)

              Number of      Logarithm of price       Logarithm of quantity
              observations   Average   (St. dev.)     Average   (St. dev.)
All           111            −0.19     (0.38)         8.52      (0.74)

Stormy        32             0.04      (0.35)         8.27      (0.71)
Not-stormy    79             −0.29     (0.35)         8.63      (0.73)

Stormy        32             0.04      (0.35)         8.27      (0.71)
Mixed         34             −0.16     (0.35)         8.51      (0.77)
Fair          45             −0.39     (0.37)         8.71      (0.69)


Ordinary least squares estimation with the Fulton fish market data collected by Graddy leads to

ln Q^obs_t = 8.42 − 0.54 × ln P^obs_t.
            (0.08)  (0.18)

The estimated regression line is also plotted in Figure 1. Interestingly, this is what Working (1927) calls the “statistical ‘demand curve’,” as opposed to the concept of a demand curve in economic theory. This simple regression, in combination with the assumption embodied in (3.1), suggests that the quantity traded would go down, on average, by 5.4% in response to a 10% tax:

τ̂ = −0.054 (s.e. 0.018).
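For concreteness, here is a sketch of this regression in Python. Since Graddy’s series is not reproduced here, the sketch uses simulated stand-in data with moments similar to Table 1; the coefficients 8.42 and −0.54 are the ones reported above:

```python
import numpy as np

rng = np.random.default_rng(2)

# Stand-in data for 111 days of log prices and log quantities
# (roughly matching the moments in Table 1; not Graddy's actual data).
log_p = rng.normal(-0.19, 0.38, size=111)
log_q = 8.42 - 0.54 * log_p + rng.normal(0.0, 0.6, size=111)

# OLS of log quantity on log price, as in (3.2).
X = np.column_stack([np.ones_like(log_p), log_p])
(alpha_ls, beta_ls), *_ = np.linalg.lstsq(X, log_q, rcond=None)

# Naive predicted effect of a 10% tax, using ln(1 + r) ~= r.
tau_hat = beta_ls * 0.10
print(alpha_ls, beta_ls, tau_hat)
```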

Why does this answer, or at least the method by which it was derived, not make any sense to an economist? The answer assumes that prices can be viewed as independent of the potential quantities traded, or, in other words, unconfounded. This assignment mechanism is unrealistic. In reality, it is likely that the markets/days, prior to the introduction of the tax, when the price was $1.10 were systematically different from those where the price was $1.00. From an economist’s perspective, the fact that the price was $1.10 rather than $1.00 implies that market conditions must have been different, and it is likely that these differences are directly related to the potential quantities traded. For example, on days where the price was high there may have been more buyers, or buyers may have been interested in buying larger quantities, or there may have been less fish brought ashore. In order to predict the effect of the tax, we need to think about the responses of both buyers and sellers to changes in prices, and about the determination of prices. This is where economic theory comes in.

3.3 The Supply of and Demand for Fish

So, how do economists go about analyzing questions such as this one if not by regressing quantities on prices? The starting point for economists is to think of an economic model for the determination of prices (the treatment assignment mechanism in Rubin’s potential outcome terminology). The first part of the simplest model an economist would consider for this type of setting is a pair of functions, the demand and supply functions. Think of the buyers coming to the Fulton fish market on a given market/day (say, day t) with a demand function Q^d_t(p). This function tells us, for that particular morning, how much fish all buyers combined would be willing to buy if the price on that day were p, for any value of p. This function is conceptually exactly like the potential outcomes setup commonly used in causal inference in the modern literature. It is more complicated than the binary treatment case with two potential outcomes, because there is a potential outcome for each value of the price, with more or less a continuum of possible price values, but it is in line with continuous treatment extensions such as those in Gill and Robins (2001). Common sense, and economic theory, suggests that this demand function is a downward sloping function: buyers would likely be willing to buy more pounds of whiting if it were cheaper. Traditionally, the demand function is specified parametrically, for example, linear in logarithms:

ln Q^d_t(p) = α^d + β^d × ln p + ε^d_t,   (3.3)

where β^d is the price elasticity of demand. This equation is not a regression function like (3.2). It is interpreted as a structural equation or behavioral equation, and in the treatment effect literature terminology, it is a model for the potential outcomes. Part of the confusion between the model for the potential outcomes in (3.3) and the regression function in (3.2) may stem from the traditional notation in the econometrics literature, where the same symbol (e.g., Qt) would be used for the observed outcomes (Q^obs_t in our notation) and the potential outcome function [Q^d_t(p) in our notation], and the same symbol (e.g., Pt) would be used for the observed value of the treatment (P^obs_t in our notation) and the argument in the potential outcome function (p in our notation). Interestingly, the pioneers in this literature, Tinbergen (1930) and Haavelmo (1943), did distinguish between these concepts in their notation, but the subsequent literature on simultaneous equations dropped that distinction and adopted a notation that did not distinguish between observed and potential outcomes. For a historical perspective see Christ (1994) and Stock and Trebbi (2003). My view is that dropping this distinction was merely incidental, and that implicitly the interpretation of the simultaneous equations models remained that in terms of potential outcomes.8

8 As a reviewer pointed out, once one views simultaneous equations in terms of potential outcomes, there is a natural normalization of the equations. This suggests that perhaps the discussions of issues concerning normalizations of equations in simultaneous equations models (e.g., Basmann, 1963a, 1963b, 1965; Hillier (1990)) implicitly rely on a different interpretation, for example, thinking of the endogeneity arising from measurement error. Throughout this discussion, I will interpret simultaneous equations in terms of potential outcomes, viewing the realized outcome notation simply as obscuring that.


Implicit (by the lack of a subscript on the coefficients) in the specification of the demand function in (3.3) is the strong assumption that the effect of a unit change in the logarithm of the price (equal to β^d) is the same for all values of the price, and that the effect is the same in all markets. This is clearly a very strong assumption, and the modern literature on simultaneous equations (see Matzkin (2007) for an overview) has developed less restrictive specifications allowing for nonlinear and nonadditive effects while maintaining identification. The unobserved component in the demand function, denoted by ε^d_t, represents unobserved determinants of the demand on any given day/market: a particular buyer may be sick on a particular day and not go to the market, or may be expecting a client wanting to purchase a large quantity of whiting. We can normalize this unobserved component to have expectation zero, where the expectation is taken over all markets or days:

E[ln Q^d_t(p)] = α^d + β^d × ln p.

The interpretation of this expectation is subtle, and again it is part of the confusion that sometimes arises. Consider the expected demand at p = 1, E[ln Q^d_t(1)], under the linear specification in (3.3) equal to α^d + β^d · ln(1) = α^d. This α^d is the average of all demand functions, evaluated at price equal to $1.00, irrespective of what the actual price in the market is, where the expectation is taken over all markets. It is not, and this is key, the conditional expectation of the observed quantity in markets where the observed price is equal to $1.00 (or, what is the same, the demand function at 1 in those markets), which is E[ln Q^obs_t | P^obs_t = 1] = E[ln Q^d_t(1) | P^obs_t = 1].

The original Tinbergen and Haavelmo notation and the modern potential outcome version are helpful in making this distinction, compared to the sixties econometrics textbook notation.9


9 Other notations have been recently proposed to stress the difference between the conditional expectation of the observed outcome and the expectation of the potential outcome. Pearl (2000) writes the expected demand when the price is set to $1.00 as E[ln Q^d_t | do(Pt = 1)], rather than conditional on the price being observed to be $1.00. Hernán and Robins (2006) write this average potential outcome as E[ln Q^d_t(Pt = 1)], whereas Lauritzen and Richardson (2002) write it as E[ln Q^obs_t ‖ P^obs_t = 1], where the double ‖ implies conditioning by intervention.

Similar to the demand function, the supply function Q^s_t(p) represents the quantity of whiting the sellers collectively are willing to sell at any given price p, on day t. Here, common sense would suggest that this function is sloping upward: the higher the price, the more the sellers are willing to sell. As with the demand function, the supply function is typically specified parametrically with constant coefficients:

ln Q^s_t(p) = α^s + β^s × ln p + ε^s_t,   (3.4)

where β^s is the price elasticity of supply. Again we can normalize the expectation of ε^s_t to zero (where the expectation is taken over markets), and write

E[ln Q^s_t(p)] = α^s + β^s × ln p.

Note that ε^d_t and ε^s_t are not assumed to be independent in general, although in some applications that may be a reasonable assumption. In this specific example, ε^d_t may represent random variation in the set or number of buyers coming to the market on a particular day, and ε^s_t may represent random variation in suppliers showing up at the market and in their ability to catch whiting during the preceding days. These components may well be uncorrelated, but there may be common components, for example, in traffic conditions around the market that make it difficult for both suppliers and buyers to come to the market.

3.4 Market Equilibrium

Now comes the second part of the simple economic model, the determination of the price, or, in the terminology of the treatment effect literature, the assignment mechanism. The conventional assumption in this type of market is that the price that is observed, that is, the price at which the fish is traded in market/day t, is the (unique) market clearing price at which demand and supply are equal. In other words, this is the price at which the market is in equilibrium, denoted by P^obs_t. This equilibrium price solves

Q^d_t(P^obs_t) = Q^s_t(P^obs_t).   (3.5)

The observed quantity on that day, that is, the quantity actually traded, denoted by Q^obs_t, is then equal to the demand function at the equilibrium price (or, equivalently, because of the equilibrium assumption, the supply function at that price):

Q^obs_t = Q^d_t(P^obs_t) = Q^s_t(P^obs_t).   (3.6)

Assuming that the demand function does slope downward and the supply function does slope upward, and both are linear in logarithms, the equilibrium price exists and is unique, and we can solve for the observed price and quantities in terms of the parameters of the model and the unobserved components:

ln P^obs_t = (α^d − α^s)/(β^s − β^d) + (ε^d_t − ε^s_t)/(β^s − β^d)

and

ln Q^obs_t = (β^s · α^d − β^d · α^s)/(β^s − β^d) + (β^s · ε^d_t − β^d · ε^s_t)/(β^s − β^d).
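The algebra behind these expressions is just equating (3.3) and (3.4) at the equilibrium price via (3.5); a compact derivation, in LaTeX notation:

```latex
\begin{aligned}
\alpha^d + \beta^d \ln P^{\mathrm{obs}}_t + \varepsilon^d_t
  &= \alpha^s + \beta^s \ln P^{\mathrm{obs}}_t + \varepsilon^s_t \\
(\beta^s - \beta^d)\,\ln P^{\mathrm{obs}}_t
  &= \alpha^d - \alpha^s + \varepsilon^d_t - \varepsilon^s_t \\
\ln P^{\mathrm{obs}}_t
  &= \frac{\alpha^d - \alpha^s}{\beta^s - \beta^d}
   + \frac{\varepsilon^d_t - \varepsilon^s_t}{\beta^s - \beta^d}.
\end{aligned}
```

Substituting this expression back into the demand equation (3.3) then yields the displayed solution for ln Q^obs_t.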

For economists, this is a more plausible model for the determination of realized prices and quantities than the model that assumes prices are independent of market conditions. It is not without its problems though. Chief among these from our perspective is the complication that, just as in the Roy model, we cannot necessarily infer the values of the unknown parameters in this model even if we have data on equilibrium prices and quantities P^obs_t and Q^obs_t for many markets.

Another issue is how buyers and sellers arrive at the equilibrium price. There is a theoretical economic literature addressing this question. Often the idea is that there is a sequential process of buyers making bids, and suppliers responding with offers of quantities at those prices, with this process repeating itself until it arrives at a price at which supply and demand are equal. In practice, economists often refrain from specifying the details of this process and simply assume that the market is in equilibrium. If the process is fast enough, it may be reasonable to ignore the specifics of the process and analyze the data as if equilibrium was instantaneous.10 A related issue is whether this model with an equilibrium price that equates supply and demand is a reasonable approximation to the actual process that determines prices and quantities. In fact, Graddy’s data contains information showing that the seller would trade at different prices on the same day, so strictly speaking this model does not hold. There is a long tradition in economics, however, of using such models as approximations to price determination and we will do so here.

10 See Shapley and Shubik (1977) and Giraud (2003), and for some experimental evidence, Plott and Smith (1987) and Smith (1982).

Finally, let me connect this to the textbook discussion of supply and demand models. In many textbooks, the demand and supply equations would be written directly in terms of the observed (equilibrium) quantities and prices as

ln Q^obs_t = α^s + β^s × ln P^obs_t + ε^s_t,   (3.7)

ln Q^obs_t = α^d + β^d × ln P^obs_t + ε^d_t.   (3.8)

This representation leaves out much of the structure that gives the demand and supply function their meaning, that is, the demand equation (3.3), the supply equation (3.4) and the equilibrium condition (3.5). As Strotz and Wold (1960) write, “Those who write such systems [(3.7) and (3.8)] do not, however, really mean what they write, but introduce an ellipsis which is familiar to economists” (page 425), with the ellipsis referring to the market equilibrium condition that is left out. See also Strotz (1960), Strotz and Wold (1965), and Wold (1960).

3.5 The Statistical Demand Curve

Given this setup, let me discuss two issues. First, let us explore, under this model, the interpretation of what Working (1927) called the “statistical ‘demand curve’.” The covariance between observed (equilibrium) log quantities and log prices is

cov(ln Q^obs_t, ln P^obs_t) = (β^s · σ²_d + β^d · σ²_s − ρ · σ_d · σ_s · (β^d + β^s))/(β^s − β^d)²,

where σ_d and σ_s are the standard deviations of ε^d_t and ε^s_t, respectively, and ρ is their correlation. Because the variance of ln P^obs_t is (σ²_s + σ²_d − 2 · ρ · σ_d · σ_s)/(β^s − β^d)², it follows that the regression coefficient in the regression of log quantities on log prices is

cov(ln Q^obs_t, ln P^obs_t)/var(ln P^obs_t) = (β^s · σ²_d + β^d · σ²_s − ρ · σ_d · σ_s · (β^d + β^s))/(σ²_s + σ²_d − 2 · ρ · σ_d · σ_s).

Working focuses on the interpretation of this relation between equilibrium quantities and prices. Suppose that the correlation between ε^d_t and ε^s_t, denoted by ρ, is zero. Then the regression coefficient is a weighted average of the two slope coefficients of the supply and demand function, weighted by the variances of the residuals:

cov(ln Q^obs_t, ln P^obs_t)/var(ln P^obs_t) = β^s · σ²_d/(σ²_s + σ²_d) + β^d · σ²_s/(σ²_s + σ²_d).


If σ²_d is small relative to σ²_s, then we estimate something close to the slope of the demand function, and if σ²_s is small relative to σ²_d, then we estimate something close to the slope of the supply function. In general, however, as Working stresses, the “statistical demand curve” is not informative about the demand function (or about the supply function); see also Leamer (1981).

3.6 The Effect of a Tax Increase

The second question is how this model with supply and demand functions and a market clearing price helps us answer the substantive question of interest. The specific question considered is the effect of the tax increase on the average quantity traded. In a given market, let p be the price sellers receive per pound of whiting, and let p̃ = p × (1 + r) be the price buyers pay after the tax has been imposed. The key assumption is that the only way buyers and sellers respond to the tax is through the effect of the tax on prices: they do not change how much they would be willing to buy or sell at any given price, and the process that determines the equilibrium price does not change. The technical econometric term for this is that the demand and supply functions are structural or invariant, in the sense that they are not affected by changes in the treatment, taxes in this case. This may not be a perfect assumption, but certainly in many cases it is reasonable: if I have to pay $1.10 per pound of whiting, I probably do not care whether 10 cts of that goes to the government and $1 to the seller, or all of it goes to the seller. If we are willing to make that assumption, we can solve for the new equilibrium price and quantity. Let Pt(r) be the new equilibrium price [net of taxes, that is, the price sellers receive, with (1 + r) · Pt(r) the price buyers pay], given a tax rate r, with in our example r = 0.1. This price solves

Q^d_t(Pt(r) × (1 + r)) = Q^s_t(Pt(r)).

Given the log linear specification for the demand and supply functions, this leads to

ln Pt(r) = (α^d − α^s)/(β^s − β^d) + β^d × ln(1 + r)/(β^s − β^d) + (ε^d_t − ε^s_t)/(β^s − β^d).

The result of the tax is that the average of the logarithm of the price that sellers receive with a positive tax rate r is less than what they would have received in the absence of the tax:

E[\ln P_t(r)] = \frac{\alpha^d - \alpha^s}{\beta^s - \beta^d} + \frac{\beta^d \ln(1 + r)}{\beta^s - \beta^d} \le \frac{\alpha^d - \alpha^s}{\beta^s - \beta^d} = E[\ln P_t(0)].

(Note that β^d < 0.) On the other hand, the buyers will pay more on average:

E[\ln((1 + r) \cdot P_t(r))] = \frac{\alpha^d - \alpha^s}{\beta^s - \beta^d} + \frac{\beta^s \ln(1 + r)}{\beta^s - \beta^d} \ge E[\ln P_t(0)].

The quantity traded after the tax increase is

\ln Q_t(r) = \frac{\beta^s \alpha^d - \beta^d \alpha^s}{\beta^s - \beta^d} + \frac{\beta^s \beta^d \ln(1 + r)}{\beta^s - \beta^d} + \frac{\beta^s \varepsilon_t^d - \beta^d \varepsilon_t^s}{\beta^s - \beta^d},

which is less than the quantity that would be traded in the absence of the tax increase. The causal effect is

\ln Q_t(r) - \ln Q_t(0) = \frac{\beta^s \beta^d \ln(1 + r)}{\beta^s - \beta^d},

the same in all markets, and proportional to the supply and demand elasticities and, for small r, proportional to the tax. What should we take away from this discussion? There are three points. First, the regression coefficient in the regression of log quantity on log prices does not tell us much about the effect of a new tax. The sign of this regression coefficient is ambiguous, depending on the variances and covariance of the unobserved determinants of supply and demand. Second, in order to predict the magnitude of the effect of a new tax we need to learn about the demand and supply functions separately, or, in the econometrics terminology, identify the supply and demand functions. Third, observations on equilibrium prices and quantities by themselves do not identify these functions.
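To make the tax algebra concrete, the sketch below evaluates these expressions for a 10% tax; the slopes and intercepts are assumed illustrative values, not estimates from the Fulton data.

    import numpy as np

    beta_s, beta_d = 1.0, -1.5      # assumed supply and demand slopes
    alpha_s, alpha_d = 0.0, 5.0     # assumed intercepts

    def mean_log_price(r):
        # E[ln P_t(r)]: average log price sellers receive under tax rate r
        return (alpha_d - alpha_s + beta_d * np.log(1 + r)) / (beta_s - beta_d)

    def mean_log_quantity(r):
        # E[ln Q_t(r)]: average log quantity traded under tax rate r
        return (beta_s * alpha_d - beta_d * alpha_s
                + beta_s * beta_d * np.log(1 + r)) / (beta_s - beta_d)

    r = 0.1
    print(mean_log_price(r) - mean_log_price(0))        # sellers receive less
    print(mean_log_quantity(r) - mean_log_quantity(0))  # quantity falls by
    # beta_s*beta_d*ln(1+r)/(beta_s-beta_d), about -0.057 here, in every market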

3.7 Identification with Instrumental Variables

Given this identification problem, how do we identify the demand and supply functions? This is where instrumental variables enter the discussion. To identify the demand function, we look for determinants of the supply of whiting that do not affect the demand for whiting, and, similarly, to identify the supply function we look for determinants of the demand for whiting that do not affect the supply. In this specific case, Graddy (1995, 1996) assumes that weather conditions at sea on the days prior to market t, denoted by Zt, affect supply but do not affect


demand. Certainly, it appears reasonable to think that weather is a direct determinant of supply: having high waves and strong winds makes it harder to catch fish. On the other hand, there does not seem to be any reason why demand on day t, at a given price p, would be correlated with wave height or wind speed on previous days. This assumption may be made more plausible by conditioning on covariates. For example, if one is concerned that weather conditions on land affect demand, one may wish to condition on those, and only look at variation in weather conditions at sea given similar weather conditions on land as an instrument. Formally, the key assumptions are that

Q_t^d(p) \perp Z_t \quad and \quad Q_t^s(p) \not\perp Z_t,

possibly conditional on covariates. If both of these conditions hold, we can use weather conditions as an instrument.

How do we exploit these assumptions? The traditional approach is to generalize the functional form of the supply function to explicitly incorporate the effect of the instrument on the supply of whiting. In our notation,

\ln Q_t^s(p, z) = \alpha^s + \beta^s \times \ln p + \gamma^s \times z + \varepsilon_t^s.

The demand function remains unchanged, capturing the fact that demand is not affected by the instrument:

\ln Q_t^d(p, z) = \alpha^d + \beta^d \times \ln p + \varepsilon_t^d.

We assume that the unobserved components of supply and demand are independent of (or at least uncorrelated with) the weather conditions:

(\varepsilon_t^d, \varepsilon_t^s) \perp Z_t.

The equilibrium price P_t^{obs} is the solution for p in the equation

Q_t^d(p, Z_t) = Q_t^s(p, Z_t),

which, in combination with the log linear specification for the demand and supply functions, leads to

\ln P_t^{obs} = \frac{\alpha^d - \alpha^s}{\beta^s - \beta^d} + \frac{\varepsilon_t^d - \varepsilon_t^s}{\beta^s - \beta^d} - \frac{\gamma^s Z_t}{\beta^s - \beta^d}

and

\ln Q_t^{obs} = \frac{\beta^s \alpha^d - \beta^d \alpha^s}{\beta^s - \beta^d} + \frac{\beta^s \varepsilon_t^d - \beta^d \varepsilon_t^s}{\beta^s - \beta^d} - \frac{\gamma^s \beta^d Z_t}{\beta^s - \beta^d}.

Now consider the expected value of the equilibrium price and quantity given the weather conditions:

E[\ln Q_t^{obs} \mid Z_t = z] = \frac{\beta^s \alpha^d - \beta^d \alpha^s}{\beta^s - \beta^d} - \frac{\gamma^s \beta^d}{\beta^s - \beta^d} \cdot z \qquad (3.9)

and

E[\ln P_t^{obs} \mid Z_t = z] = \frac{\alpha^d - \alpha^s}{\beta^s - \beta^d} - \frac{\gamma^s}{\beta^s - \beta^d} \cdot z. \qquad (3.10)

Equations (3.9) and (3.10) are what is called in econometrics the reduced form of the simultaneous equations model. It expresses the endogenous variables (those variables whose values are determined inside the model, price and quantity in this example) in terms of the exogenous variables (those variables whose values are not determined within the model, weather conditions in this example). The slope coefficients on the instrument in these reduced form equations are what in randomized experiments with noncompliance would be called the intention-to-treat effects. One can estimate the coefficients in the reduced form by least squares methods. The key insight is that the ratio of the coefficients on the weather conditions in the two regression functions, −γ^s · β^d/(β^s − β^d) in the quantity regression and −γ^s/(β^s − β^d) in the price regression, is equal to β^d, the slope coefficient in the demand function.
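This indirect least squares logic is easy to check by simulation; the sketch below (all parameter values assumed for illustration) generates equilibrium data with a supply shifter Z_t, fits the two reduced forms by least squares, and recovers β^d from the ratio of the instrument coefficients.

    import numpy as np

    rng = np.random.default_rng(1)
    n = 100_000
    beta_s, beta_d, gamma_s = 1.0, -1.5, -0.5   # assumed slopes and instrument effect
    alpha_s, alpha_d = 0.0, 5.0                 # assumed intercepts
    z = rng.binomial(1, 0.5, n)                 # binary supply shifter (e.g., stormy)
    eps_s = 0.5 * rng.standard_normal(n)
    eps_d = 0.2 * rng.standard_normal(n)

    log_p = (alpha_d - alpha_s + eps_d - eps_s - gamma_s * z) / (beta_s - beta_d)
    log_q = alpha_d + beta_d * log_p + eps_d

    # Reduced-form (intention-to-treat) coefficients on z, by least squares
    X = np.column_stack([np.ones(n), z])
    coef_q = np.linalg.lstsq(X, log_q, rcond=None)[0][1]
    coef_p = np.linalg.lstsq(X, log_p, rcond=None)[0][1]
    print(coef_q / coef_p)   # close to beta_d = -1.5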

For some purposes, the reduced-form or intention-to-treat effects may be of substantive interest. In the Fulton fish market example, people attempting to predict prices and quantities under the current regime may find these estimates of interest. They are of less interest to policy makers contemplating the introduction of a new tax. In simultaneous equations settings, the demand and supply functions are viewed as structural in the sense that they are not affected by interventions in the market such as new taxes. As such they, and not the reduced-form regression functions, are the key components of predictions of market outcomes under new regimes. This is somewhat different in many of the recent applications of instrumental variables methods in the statistics literature in the context of randomized experiments with noncompliance, where the intention-to-treat effects are traditionally of primary interest.

Let me illustrate this with the Fulton Fish Market data collected by Graddy. For ease of illustration, let me simplify the instrument to a binary one: the


Fig. 2. Scatterplot of log prices and log quantities by weather conditions.

weather conditions are good for catching fish (Z_t = 0, fair weather, corresponding to low wind speed and low wave height) or stormy (Z_t = 1, corresponding to relatively strong winds and high waves).11 The price is the average daily price in dollars for one dealer, and the quantity is the daily quantity in pounds. The two estimated reduced forms are

\ln Q_t^{obs} = 8.63 - 0.36 \times Z_t
               (0.08)  (0.15)

and

\ln P_t^{obs} = -0.29 + 0.34 \times Z_t,
               (0.04)  (0.07)

with standard errors in parentheses. Hence, the instrumental variables estimate of the slope of the demand function is

\hat\beta^d = \frac{-0.36}{0.34} = -1.08 \quad (s.e.\ 0.46),

where the ratio is computed from the unrounded reduced-form estimates.

Another, perhaps more intuitive way of looking at these estimates is to consider the location of the average log quantity and average log price separately by weather conditions. Figure 2 presents the scatterplot of log quantity and log prices, with the stars indicating stormy days and the plus signs indicating calm days. On fair weather days the average log price is −0.29, and the average log quantity is 8.6.

11 The formal definition I use, following Angrist, Graddy and Imbens (2000), is that stormy is defined as wind speed greater than 18 knots in combination with wave height of more than 4.5 ft, and fair weather is anything else.

On stormy days, the average log price is 0.04, and the average log quantity is 8.3. These two loci are marked by circles in Figure 2. On stormy days, the price is higher and the quantity traded is lower than on fair weather days. This is used to estimate the slope of the demand function. The figure also includes the estimated demand function based on using the indicator for stormy days as an instrument for the price: the estimated demand function goes through the two points defined by the average of the log price and log quantity for stormy and fair weather days.
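In code, the instrumental variables estimate is just the ratio of the two reduced-form coefficients on the instrument (a Wald-type calculation; the rounded values below give about −1.06, slightly different from the −1.08 reported above, which is presumably based on the unrounded estimates):

    itt_quantity = -0.36   # reduced-form effect of stormy weather on log quantity
    itt_price = 0.34       # reduced-form effect of stormy weather on log price
    print(itt_quantity / itt_price)   # about -1.06: the estimated demand slope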

With the data collected by Graddy, it is more difficult to point identify the supply curve. The traditional route toward identifying the supply curve would rely on finding an instrument that shifts demand without directly affecting supply. Without such an instrument, we cannot point identify the effect of the introduction of the tax on quantity and prices. It is possible under weaker assumptions to find bounds on these estimands (e.g., Leamer (1981); Manski (2003)), but we do not pursue this here.

3.8 Recent Research on Simultaneous Equations Models

The traditional econometric literature on simultaneous equations models is surveyed in Hausman (1983). Compared to the discussion in the preceding sections, this literature focuses on a more general case, allowing for multiple endogenous variables and multiple instruments. The modern econometric literature, starting in the 1980s, has relaxed the


linearity and additivity assumptions in specification (3.3) substantially. Key references to this literature are Brown (1983), Roehrig (1988), Newey and Powell (2003), Chesher (2003, 2010), Benkard and Berry (2006), Matzkin (2003, 2007), Altonji and Matzkin (2005), Imbens and Newey (2009), Hoderlein and Mammen (2007), Horowitz (2011) and Horowitz and Lee (2007). Matzkin (2007) provides a recent survey of this technically demanding literature. This literature has continued to use the observed outcome notation, making it more difficult to connect to the statistical literature. Here, I briefly review some of this literature. The starting point is a structural equation, in the potential outcome notation,

Y_i(x) = \alpha + \beta \cdot x + \varepsilon_i,

and an instrument Z_i that satisfies

Z_i \perp \varepsilon_i \quad and \quad Z_i \not\perp X_i.

The traditional econometric literature would formulate this in the observed outcome notation as

Y_i = \alpha + \beta \cdot X_i + \varepsilon_i, \qquad Z_i \perp \varepsilon_i \quad and \quad Z_i \not\perp X_i.

There are a number of generalizations considered in the modern literature. First, instead of assuming independence of the unobserved component and the instrument, part of the current literature assumes only that the conditional mean of the unobserved component given the instrument is free of dependence on the instrument, allowing the variance and other distributional aspects to depend on the value of the instrument; see Horowitz (2011). Another generalization of the linear model allows for general nonlinear functional forms of the type

Y_i = g(X_i) + \varepsilon_i, \qquad Z_i \perp \varepsilon_i \quad and \quad Z_i \not\perp X_i,

where the focus is on nonparametric identification and estimation of g(x); see Brown (1983), Roehrig (1988), Benkard and Berry (2006). Allowing for even more generality, researchers have studied nonadditive versions of these models with

Y_i = g(X_i, \varepsilon_i), \qquad Z_i \perp \varepsilon_i \quad and \quad Z_i \not\perp X_i,

with g(x, ε) strictly monotone in a scalar unobserved component ε. In these settings, point identification often requires strong assumptions on the support of the instrument and its relation to the endogenous regressor and, therefore, researchers have also explored bounds. See Matzkin (2003, 2007, 2008) and Imbens and Newey (2009).
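As a rough illustration of the model Y_i = g(X_i) + ε_i, one can approximate g by a low-order polynomial and use powers of the instrument as instruments for powers of the regressor. The sketch below is a toy series-type approximation in the spirit of this literature, not any specific published estimator; the data-generating process is entirely assumed.

    import numpy as np

    rng = np.random.default_rng(2)
    n = 200_000
    z = rng.standard_normal(n)                    # instrument
    u = rng.standard_normal(n)                    # unobserved confounder
    x = z + u + 0.3 * rng.standard_normal(n)      # endogenous regressor

    def g(x):                                     # true structural function
        return 1.0 + 2.0 * x - 0.5 * x**2

    y = g(x) + u + 0.3 * rng.standard_normal(n)

    # Just-identified IV: instrument (1, x, x^2) with (1, z, z^2)
    X = np.column_stack([np.ones(n), x, x**2])
    Z = np.column_stack([np.ones(n), z, z**2])
    beta = np.linalg.solve(Z.T @ X, Z.T @ y)
    print(beta)   # close to (1.0, 2.0, -0.5), recovering g despite confounding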

4. A MODERN EXAMPLE: RANDOMIZED EXPERIMENTS WITH NONCOMPLIANCE AND HETEROGENEOUS TREATMENT EFFECTS

In this section, I will discuss part of the modern literature on instrumental variables methods that has evolved simultaneously in the statistics and econometrics literatures. I will do so in the context of a second example. On the one hand, concern arose in the econometric literature about the restrictiveness of the functional form assumptions in the traditional instrumental variables methods, and in particular about the constant treatment effect assumption commonly used in the so-called selection models (Heckman (1979); Heckman and Robb (1985)). The initial results in this literature demonstrated the difficulties in establishing point identification (Heckman (1990); Manski (1990)), leading to the bounds approach developed by Manski (1995, 2003). At the same time, statisticians analyzed the complications arising from noncompliance in randomized experiments (Robins (1989)) and the merits of encouragement designs (Zelen, 1979, 1990). By adopting a common framework and notation in Imbens and Angrist (1994) and Angrist, Imbens and Rubin (1996), these literatures have become closely connected and influenced each other substantially.

4.1 The McDonald, Hiu and Tierney (1992) Data

The canonical example in this literature is that of a randomized experiment with noncompliance. To illustrate the issues, I will use here data previously analyzed in Hirano et al. (2000) and McDonald, Hiu and Tierney (1992). McDonald, Hiu and Tierney (1992) carried out a randomized experiment to evaluate the effect of an influenza vaccination on flu-related hospital visits. Instead of randomly assigning individuals to receive the vaccination, the researchers randomly assigned physicians to receive letters reminding them of the upcoming flu season and encouraging them to vaccinate their patients. This is what Zelen (1979, 1990) refers to as an encouragement design. I discuss this using the potential outcome notation used for this particular set up in Angrist, Imbens and Rubin (1996), and in general sometimes referred to as the Rubin Causal Model (Holland (1986)), although there are important antecedents in Splawa-Neyman (1990). I consider two


Table 2
Influenza data (N = 2861)

Hospitalized for      Influenza    Letter    Number of
flu-related reasons   vaccine                individuals
(Y_i^obs)             (X_i^obs)    (Z_i)

No                    No           No           1027
No                    No           Yes           935
No                    Yes          No            233
No                    Yes          Yes           422
Yes                   No           No             99
Yes                   No           Yes            84
Yes                   Yes          No             30
Yes                   Yes          Yes            31

distinct treatments: the first the receipt of the letter, and the second the receipt of the influenza vaccination. Let Z_i ∈ {0, 1} be the indicator for the receipt of the letter, and let X_i ∈ {0, 1} be the indicator for the receipt of the vaccination. We start by postulating the existence of four potential outcomes. Let Y_i(z, x) be the potential outcome corresponding to receipt of the letter equal to Z_i = z and receipt of the vaccination equal to X_i = x, for z = 0, 1 and x = 0, 1. In addition, we postulate the existence of two potential outcomes corresponding to the receipt of the vaccination as a function of the receipt of the letter, X_i(z), for z = 0, 1. We observe for each unit in a population of size N = 2861 the value of the assignment, Z_i, the treatment actually received, X_i^obs = X_i(Z_i), and the potential outcome corresponding to the assignment and treatment received, Y_i^obs = Y_i(Z_i, X_i(Z_i)). Table 2 presents the number of individuals for each of the eight values of the triple (Z_i, X_i^obs, Y_i^obs) in the McDonald, Hiu and Tierney data set. It should be noted that the randomization in this experiment is at the physician level. I do not have physician indicators and, therefore, ignore the clustering. This will tend to lead to underestimation of the standard errors.

4.2 Instrumental Variables Assumptions

There are four key assumptions underlying instrumental variables methods, beyond the no-interference assumption or SUTVA, with different versions for some of them. I will introduce these assumptions in this section, and in Section 5 discuss their substantive content in the context of some examples. The first assumption concerns the assignment of the instrument Z_i, in the flu example the receipt of the letter by the physician. The assumption requires that the instrument is as good as randomly assigned:

Z_i \perp (Y_i(0,0), Y_i(0,1), Y_i(1,0), Y_i(1,1), X_i(0), X_i(1)) \qquad (4.1)

(random assignment).

This assumption is often satisfied by design: if the assignment is physically randomized, as the letter in the flu example and as in many of the applications in the statistics literature (e.g., see the discussion in Robins (1989)), it is automatically satisfied. In other applications with observational data, common in the econometrics literature, this assumption is more controversial. It can in those cases be relaxed by requiring it to hold only within subpopulations defined by covariates V_i, assuming the assignment of the instrument is unconfounded:

Z_i \perp (Y_i(0,0), Y_i(0,1), Y_i(1,0), Y_i(1,1), X_i(0), X_i(1)) \mid V_i

(unconfounded assignment given V_i).

This is identical to the generalization from random assignment to unconfounded assignment in observational studies. Either version of this assumption justifies the causal interpretation of Intention-To-Treat (ITT) effects, the comparison of outcomes by assignment to the treatment. In many cases, these ITT effects are only of limited interest, however, and this motivates the consideration of additional assumptions that do allow the researcher to make statements about the causal effects of the treatment of interest. It should be stressed, however, that in order to draw inferences beyond ITT effects, additional assumptions will be used; whether the resulting inferences are credible will depend on the credibility of these assumptions.

The second class of assumptions limits or rules out completely direct effects of the assignment (the receipt of the letter in the flu example) on the outcome, other than through the effect of the assignment on the receipt of the treatment of interest (the receipt of the vaccine). This is the most critical, and typically most controversial, assumption underlying instrumental variables methods, sometimes viewed as the defining characteristic of instruments. One way of formulating this assumption is as

Y_i(0, x) = Y_i(1, x) \quad for\ x = 0, 1,\ for\ all\ i \qquad (exclusion restriction).


Robins (1989) formulates a similar assumption as requiring that the instrument is "not an independent causal risk factor" (Robins (1989), page 119). Under this assumption, we can drop the z argument of the potential outcomes and write the potential outcomes without ambiguity as Y_i(x). This assumption is typically a substantive one. In the flu example, one might be concerned that the physician, in response to the receipt of the letter, takes actions that affect the likelihood of the patient getting infected with the flu other than simply administering the flu vaccine. In randomized experiments with noncompliance, the exclusion restriction is sometimes made implicitly by indexing the potential outcomes only by the treatment x and not the instrument z (e.g., Zelen (1990)).

There are other, weaker versions of this assumption. Hirano et al. (2000) use a stochastic version of the exclusion restriction that only requires that the distribution of Y_i(0, x) is the same as the distribution of Y_i(1, x). Manski (1990) uses a weaker restriction that he calls a level set restriction, which requires that the average value of Y_i(0, x) is equal to the average value of Y_i(1, x). In another approach, Manski and Pepper (2000) consider monotonicity assumptions that restrict the sign of Y_i(1, x) − Y_i(0, x) across individuals without requiring that the effects are completely absent.

Imbens and Angrist (1994) combine the random assignment assumption and the exclusion restriction by postulating the existence of a pair of potential outcomes Y_i(x), for x = 0, 1, and directly assuming that

Z_i \perp (Y_i(0), Y_i(1)).

A disadvantage of this formulation is that it becomes less clear exactly what role randomization of the instrument plays. Another version of this combination of the exclusion restriction and random assignment assumption does not require full independence, but assumes that the conditional mean of Y_i(0) and Y_i(1) given the instrument is free of dependence on the instrument. A concern with such assumptions is that they are functional form dependent: if they hold in levels, they do not hold in logarithms unless full independence holds.

A third assumption that is often used, labeled monotonicity by Imbens and Angrist (1994), requires that

X_i(1) \ge X_i(0) \quad for\ all\ i \qquad (monotonicity).

This assumption rules out the presence of units who always do the opposite of their assignment [units with X_i(0) = 1 and X_i(1) = 0], and is therefore also referred to as the no-defiance assumption (Balke and Pearl (1995)). It is implicit in the latent index models often used in econometric evaluation models (e.g., Heckman and Robb, 1985). In randomized experiments such as the flu example, this assumption is often plausible. There it requires that, in response to the receipt of the letter by their physician, no patient is less likely to get the vaccine. Robins (1989) makes this assumption in the context of a randomized trial for the effect of AZT on AIDS, and describes the assumption as "often, but not always, reasonable" (Robins (1989), page 122).

Finally, we need the instrument to be correlated with the treatment, or the instrument to be relevant in the terminology of Phillips (1989) and Staiger and Stock (1997):

X_i \not\perp Z_i.

In practice, we need the correlation to be substantial in order to draw precise inferences. A recent literature on weak instruments is concerned with credible inference in settings where this correlation between the instrument and the treatment is weak; see Staiger and Stock (1997) and Andrews and Stock (2007).

The random assignment assumption and the exclusion restriction are conveniently captured by the graphical model below, although the monotonicity assumption does not fit in as easily. The unobserved component U has a direct effect on both the treatment X and the outcome Y (captured by arrows from U to X and to Y). The instrument Z is not related to the unobserved component U (captured by the absence of a link between U and Z), and is only related to the outcome Y through the treatment X (as captured by the arrow from Z to X and an arrow from X to Y, and the absence of an arrow between Z and Y).
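Schematically, the graph described is:

              U
             / \
            v   v
      Z --> X --> Y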

I will primarily focus on the case with all four assumptions maintained, random assignment, the exclusion restriction, monotonicity and instrument relevance, without additional covariates, because this


case has been the focus of, or a special case of the focus of, many studies, allowing me to compare different approaches. Methodological studies considering essentially this set of assumptions, sometimes without explicitly stating instrument relevance, and sometimes adding additional assumptions, include Robins (1989), Heckman (1990), Manski (1990), Imbens and Angrist (1994), Angrist, Imbens and Rubin (1996), Robins and Greenland (1996), Balke and Pearl (1995, 1997), Greenland (2000), Hernán and Robins (2006), Robins (1994), Robins and Rotnitzky (2004), Vansteelandt and Goetghebeur (2003), Vansteelandt et al. (2011), Hirano et al. (2000), Tan (2006, 2010), Abadie (2002, 2003), Duflo, Glennester and Kremer (2007), Brookhart et al. (2006), Martens et al. (2006), Morgan and Winship (2007), and others. Many more studies make the same assumptions in combination with a constant treatment effect assumption.

The modern literature analyzed this setting from a number of different approaches. Initially, the literature focused on the inability, under these four assumptions, to identify the average effect of the treatment. Some researchers, including prominently Manski (1990), Balke and Pearl (1995) and Robins (1989), showed that although one could not point-identify the average effect under these assumptions, there was information about the average effect in the data under these assumptions, and they derived bounds for it. Another strand of the literature, starting with Imbens and Angrist (1994) and Angrist, Imbens and Rubin (1996), abandoned the effort to do inference for the overall average effect, and focused on subpopulations for which the average effect could be identified, the so-called compliers, leading to the local average treatment effect. We discuss the bounds approach in the next section (Section 4.3) and the local average treatment effect approach in Sections 4.4–4.6.

4.3 Point Identification versus Bounds

In a number of studies, the primary estimand is the average effect of the treatment, or the average effect for the treated:

\tau = E[Y_i(1) - Y_i(0)] \quad and \quad \tau_t = E[Y_i(1) - Y_i(0) \mid X_i = 1]. \qquad (4.2)

With only the four assumptions, random assignment, the exclusion restriction, monotonicity and instrument relevance, Robins (1989), Manski (1990) and Balke and Pearl (1995) established that the average treatment effect can often not be consistently estimated even in large samples. In other words, it is often not point-identified.

Following this result, a number of different approaches have been taken. Heckman (1990) showed that if the instrument takes on values such that the probability of treatment given the instrument can be arbitrarily close to zero and one, then the average effect is identified. This is sometimes referred to as identification at infinity. Robins (1989) also formulates assumptions that allow for point identification, focusing on the average effect for the treated, τ_t. These assumptions restrict the average value of the potential outcomes when not observed in terms of average outcomes that are observed. For example, Robins formulates the condition that

E[Y_i(1) - Y_i(0) \mid Z_i = 1, X_i = 1] = E[Y_i(1) - Y_i(0) \mid Z_i = 0, X_i = 1],

which, in combination with the random assignment assumption and the exclusion restriction, allows for point identification of the average effect for the treated. Robins also formulates two other assumptions, including one where the effects are proportional to survival rates E[Y_i(1)|Z_i = 1, X_i = 1] and E[Y_i(1)|Z_i = 0, X_i = 1], respectively, that also point-identify the average effect for the treated. However, Robins questions the applicability of these results by commenting that "it would be hard to imagine that there is sufficient understanding of the biological mechanism. . . to have strong beliefs that any of the three conditions. . . is more likely to hold than either of the other two" (Robins (1989), page 122).

As an alternative to adding assumptions, Robins (1989), Manski (1990) and Balke and Pearl (1995) focused on the question of what can be learned about τ or τ_t given these four assumptions that do not allow for point identification. Here, I focus on the case where three assumptions, random assignment, the exclusion restriction and monotonicity, are maintained (without necessarily instrument relevance holding), although Robins (1989) and Manski (1990) also consider other combinations of assumptions. For ease of exposition, I focus on the bounds for the average treatment effect τ under these assumptions, in the case where Y_i(0) and Y_i(1) are binary. Then

E[Y_i(1) - Y_i(0)]
  \in \Big[ -(1 - E[X_i \mid Z_i = 1]) \cdot E[Y_i \mid Z_i = 1, X_i = 0]
        + E[Y_i \mid Z_i = 1] - E[Y_i \mid Z_i = 0]
        + E[X_i \mid Z_i = 0] \cdot (E[Y_i \mid Z_i = 0, X_i = 1] - 1),
      (1 - E[X_i \mid Z_i = 1]) \cdot (1 - E[Y_i \mid Z_i = 1, X_i = 0])
        + E[Y_i \mid Z_i = 1] - E[Y_i \mid Z_i = 0]
        + E[X_i \mid Z_i = 0] \cdot E[Y_i \mid Z_i = 0, X_i = 1] \Big],

which are known as the natural bounds. In this simple setting, this is a straightforward calculation. Work by Manski (1995, 2003, 2005, 2007), Robins (1989) and Hernán and Robins (2006) extends the partial identification approach to substantially more complex settings.

For the McDonald–Hiu–Tierney flu data, the estimated identified set for the population average treatment effect is

E[Y_i(1) - Y_i(0)] \in [-0.24, 0.64].

There is a growing literature developing methods for establishing confidence intervals for parameters in settings with partial identification, taking sampling uncertainty into account; see Imbens and Manski (2004) and Chernozhukov, Hong and Tamer (2007).
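A minimal sketch of the natural-bounds calculation, using the cell counts from Table 2 (binary Y, X and Z, with random assignment, the exclusion restriction and monotonicity maintained):

    import numpy as np

    # Counts from Table 2, indexed as counts[y, x, z]
    counts = np.zeros((2, 2, 2))
    counts[0, 0, 0] = 1027; counts[0, 0, 1] = 935
    counts[0, 1, 0] = 233;  counts[0, 1, 1] = 422
    counts[1, 0, 0] = 99;   counts[1, 0, 1] = 84
    counts[1, 1, 0] = 30;   counts[1, 1, 1] = 31

    n_z = counts.sum(axis=(0, 1))                        # totals by Z
    e_y = counts[1].sum(axis=0) / n_z                    # E[Y | Z]
    e_x = counts[:, 1].sum(axis=0) / n_z                 # E[X | Z]
    e_y_x0_z1 = counts[1, 0, 1] / counts[:, 0, 1].sum()  # E[Y | Z=1, X=0]
    e_y_x1_z0 = counts[1, 1, 0] / counts[:, 1, 0].sum()  # E[Y | Z=0, X=1]

    itt_y = e_y[1] - e_y[0]
    lower = -(1 - e_x[1]) * e_y_x0_z1 + itt_y + e_x[0] * (e_y_x1_z0 - 1)
    upper = (1 - e_x[1]) * (1 - e_y_x0_z1) + itt_y + e_x[0] * e_y_x1_z0
    print(round(lower, 2), round(upper, 2))              # -0.24 0.64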

4.4 Compliance Types

Imbens and Angrist (1994) and Angrist, Imbens and Rubin (1996) take a different approach. Rather than focusing on the average effect for the population, which is not identified under the three assumptions given in Section 4.2, they focus on different average causal effects. A first key step in the Angrist–Imbens–Rubin set up is that we can think of four different compliance types, defined by the pair of values of (X_i(0), X_i(1)), that is, defined by how individuals would respond to different assignments in terms of receipt of the treatment:12

T_i =
  n (never-taker)     if X_i(0) = X_i(1) = 0,
  c (complier)        if X_i(0) = 0, X_i(1) = 1,
  d (defier)          if X_i(0) = 1, X_i(1) = 0,
  a (always-taker)    if X_i(0) = X_i(1) = 1.

Given the existence of deterministic potential outcomes, this partitioning of the population into four

12 Frangakis and Rubin (2002) generalize this notion of subpopulations whose membership is not completely observed into their principal stratification approach; see also Section 7.2.

subpopulations is simply a definition.13 It clarifies immediately that it will be difficult to identify the average effect of the primary treatment (the receipt of the vaccine) for the entire population: never-takers and always-takers can only be observed exposed to a single level of the treatment of interest, and thus for these groups any point estimates of the causal effect of the treatment must be based on extrapolation.

We cannot infer without additional assumptions the compliance type of any unit: for each unit we observe X_i(Z_i), but the data contain no information about the value of X_i(1 − Z_i). For each unit, there are therefore two compliance types consistent with the observed behavior. We can also not identify the proportion of individuals of each compliance type without additional restrictions. The monotonicity assumption implies that there are no defiers. This, in combination with random assignment, implies that we can identify the population shares of the remaining three compliance types. The proportions of always-takers and never-takers are

\pi_a = pr(T_i = a) = pr(X_i = 1 \mid Z_i = 0) \quad and \quad \pi_n = pr(T_i = n) = pr(X_i = 0 \mid Z_i = 1),

respectively, and the proportion of compliers is the remainder:

\pi_c = pr(T_i = c) = 1 - \pi_a - \pi_n.

For the McDonald–Hiu–Tierney data these shares are estimated to be

\pi_a = 0.189, \quad \pi_n = 0.692, \quad \pi_c = 0.119,

although, as I discuss in Section 5.2, these shares may not be consistent with the exclusion restriction.

4.5 Local Average Treatment Effects

If, in addition to monotonicity, we also assume that the exclusion restriction holds, Imbens and Angrist (1994) and Angrist, Imbens and Rubin (1996) show that the local average treatment effect or complier average causal effect is identified:

\tau_{late} = E[Y_i(1) - Y_i(0) \mid T_i = c] = \frac{E[Y_i \mid Z_i = 1] - E[Y_i \mid Z_i = 0]}{E[X_i \mid Z_i = 1] - E[X_i \mid Z_i = 0]}. \qquad (4.3)

13 Outside of this framework, the existence of these four subpopulations would be an assumption.


The components of the right-hand side of this expression can be estimated consistently from a random sample (Z_i, X_i, Y_i), i = 1, ..., N. For the McDonald–Hiu–Tierney data, this leads to

\hat\tau_{late} = -0.125 \quad (s.e.\ 0.090).

Note that just as in the supply and demand example, the causal estimand is the ratio of the intention-to-treat effects of the letter on hospitalization and of the letter on the receipt of the vaccine. These intention-to-treat effects are

ITT_Y = -0.015 \quad (s.e.\ 0.011), \qquad ITT_X = \pi_c = 0.119 \quad (s.e.\ 0.016),

with the latter equal to the estimated proportion of compliers in the population.
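These numbers can be reproduced directly from the cell counts in Table 2; a minimal sketch:

    # Cell counts from Table 2: key (y, x, z) -> number of individuals
    counts = {(0,0,0): 1027, (0,0,1): 935, (0,1,0): 233, (0,1,1): 422,
              (1,0,0): 99,   (1,0,1): 84,  (1,1,0): 30,  (1,1,1): 31}

    n = {z: sum(v for (y, x, zz), v in counts.items() if zz == z) for z in (0, 1)}
    mean_y = {z: sum(v for (y, x, zz), v in counts.items() if zz == z and y == 1) / n[z]
              for z in (0, 1)}
    mean_x = {z: sum(v for (y, x, zz), v in counts.items() if zz == z and x == 1) / n[z]
              for z in (0, 1)}

    itt_y = mean_y[1] - mean_y[0]   # about -0.015
    itt_x = mean_x[1] - mean_x[0]   # about 0.12 (= estimated share of compliers)
    pi_a = mean_x[0]                # 0.189 (always-takers)
    pi_n = 1 - mean_x[1]            # 0.692 (never-takers)
    tau_late = itt_y / itt_x        # about -0.125
    print(itt_y, itt_x, pi_a, pi_n, tau_late)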

Without the monotonicity assumption, but maintaining the random assignment assumption and the exclusion restriction, the ratio of ITT effects still has a clear interpretation. In that case, it is equal to a linear combination of the average effect of the treatment for compliers and the average effect for defiers:

\frac{E[Y_i \mid Z_i = 1] - E[Y_i \mid Z_i = 0]}{E[X_i \mid Z_i = 1] - E[X_i \mid Z_i = 0]} = \frac{pr(T_i = c)}{pr(T_i = c) - pr(T_i = d)} \cdot E[Y_i(1) - Y_i(0) \mid T_i = c] - \frac{pr(T_i = d)}{pr(T_i = c) - pr(T_i = d)} \cdot E[Y_i(1) - Y_i(0) \mid T_i = d]. \qquad (4.4)

This estimand has a clear interpretation if the treatment effect is constant across all units, but if there is heterogeneity in the treatment effects it is a weighted average with some weights negative. This representation shows that if the monotonicity assumption is violated, but the proportion of defiers is small relative to that of compliers, the interpretation of the instrumental variables estimand is not severely impacted.

4.6 Do We Care About the Local Average Treatment Effect?

The local average treatment effect is an unusual estimand. It is an average effect of the treatment for a subpopulation that cannot be identified, in the sense that there are no units whom we know for sure to belong to this subpopulation, although there are some units whom we know do not belong to it. A more conventional approach is to start an analysis by clearly articulating the object of interest, say the average effect of a treatment for a well-defined population. There may be challenges in obtaining credible estimates of this object of interest, and along the way one may make more or less credible assumptions, but typically the focus remains squarely on the originally specified object of interest.

Here, the approach appears to be quite different. We started off by defining unit-level treatment effects for all units. We did not articulate explicitly what the target estimand was. In the McDonald–Hiu–Tierney influenza-vaccine application a natural estimand might be the population average effect of the vaccine. Then, apparently more or less by accident, the definition of the compliance types led us to focus on the average effects for compliers. In this example, the compliers were defined by the response in terms of the receipt of the vaccine to the receipt of the letter. It appears difficult to argue that this is a substantially interesting group, and in fact no attempt was made to do so.

This type of example has led distinguished researchers both in economics and in statistics to question whether and why one should care about the local average treatment effect. The economist Deaton writes, "I find it hard to make any sense of the LATE [local average treatment effect]" (Deaton (2010), page 430). Pearl similarly wonders, "Realizing that the population averaged treatment effect (ATE) is not identifiable in experiments marred by noncompliance, they have shifted attention to a specific response type (i.e., compliers) for which the causal effect was identifiable, and presented the latter [the local average treatment effect] as an approximation for ATE. . . . However, most authors in this category do not state explicitly whether their focus on a specific stratum is motivated by mathematical convenience, mathematical necessity (to achieve identification) or a genuine interest in the stratum under analysis" (Pearl (2011), page 3). Freedman writes, "In many circumstances, the instrumental-variables estimator turns out to be estimating some data-dependent average of structural parameters, whose meaning would have to be elucidated" (Freedman (2006), pages 700–701). Let me attempt to clear up this confusion. See also Imbens (2010). An instrumental variables analysis is an analysis in a second-best setting. It would have been preferable if one


had been able to carry out a well-designed randomized experiment. However, such an experiment was not carried out, and we have noncompliance. As a result, we cannot answer all the questions we might have wanted to ask. Specifically, if the noncompliance is substantial, we are limited in the questions we can answer credibly and precisely. Ultimately, there is only one subpopulation for which we can credibly (point-)identify the average effect of the treatment, namely, the compliers.

It may be useful to draw an analogy. Suppose a researcher is interested in evaluating a medical treatment, and suppose a randomized experiment had been carried out to estimate the average effect of this new treatment. However, the population of the randomized experiment included only men, and the researcher is interested in the average effect for the entire population, including both men and women. What should the researcher do? I would argue that the researcher should report the results for the men, and acknowledge the limitation of the results for the original question of interest. Similarly, in the instrumental variables setting I see the limitation of the results to the compliers as one that was unintended, but driven by the lack of identification for other subpopulations given the design of the study. This limitation should be acknowledged, but one should not drop the analysis simply because the original estimand cannot be identified. Note that our case with instrumental variables is slightly worse than in the gender example, because we cannot actually identify all individuals with certainty as compliers.

There are alternatives to this view. One approach is to focus solely or primarily on intention-to-treat effects. The strongest argument for that is in the context of randomized experiments with noncompliance. The causal interpretation of intention-to-treat effects is justified by the randomization. As Freedman writes, "Experimental data should therefore be analyzed first by comparing rates or averages, following the intention-to-treat principle. Such comparisons are justified because the treatment and control groups are balanced, within the limits of chance variation, by randomization" (Freedman (2006), page 701). Even in that case, one may wish to also report estimates of the local average treatment effects because they may correspond more closely to the object of ultimate interest. The argument for focusing on intention-to-treat or reduced-form estimates is weaker in other settings. For example, in the Fulton Fish Market demand and supply application, the intention-to-treat effects are the effects of weather conditions on prices and quantities. These effects may be of little substantive interest to policy makers interested in tax policy. The substantive interest for these policy makers is almost exclusively in the structural effects of price changes on demand and supply, and reduced form effects are only of interest insofar as they are informative about those structural effects. Of course, one should bear in mind that the reduced form or intention-to-treat effects rely on fewer assumptions.

A second alternative is associated with the partial identification approach by Manski (1990, 2002, 2003, 2007); see also Robins (1989) and Leamer (1981) for antecedents. In this setting, that approach suggests maintaining the focus on the original estimand, say the overall average effect. We cannot estimate that accurately, because we cannot estimate the average value of Y_i(0) for always-takers or the average value of Y_i(1) for never-takers, but we can bound the average effect of interest because we know a priori that the average value of Y_i(0) for always-takers and the average value of Y_i(1) for never-takers are restricted to lie in the unit interval. Manski's is a principled and coherent approach. One concern with the approach is that it has often focused on reporting solely these bounds, leading researchers to miss relevant information that is available given the maintained assumptions. Two different data sets may lead to the same bounds, even though in one case we may know that the average effect for one subpopulation (the compliers) is positive and statistically significantly different from zero, whereas in the other case there need not be any evidence of a nonzero effect for any subpopulation. It would appear to be useful to distinguish between such cases by reporting estimates of both the local average treatment effect and the bounds.

5. THE SUBSTANTIVE CONTENT OF THE INSTRUMENTAL VARIABLES ASSUMPTIONS

In this section, I will discuss the substantive content of the three key assumptions: random assignment, the exclusion restriction and the monotonicity assumption. I will not discuss here the fourth assumption, instrument relevance. In practice, the main issue with that assumption concerns the quality of inferences when the assumption is close to being violated. See Section 7.5 for more discussion, and Staiger and Stock (1997) for a detailed study.


5.1 Unconfoundedness of the Instrument

First, consider the random assignment or unconfoundedness assumption. In a slightly different setting, this is a very familiar assumption. Matching methods often rely on random assignment, either unconditionally or conditionally, for their justification.

In some of the leading applications of instrumental variables methods, this assumption is satisfied by design, when the instrument is physically randomized. For example, in the draft lottery example (Angrist, 1990), draft priority is used as an instrument for veteran status in an evaluation of the causal effect of veteran status on mortality and earnings. In that case, the instrument, the draft priority number, was assigned by randomization. Similarly, in the flu example (Hirano et al., 2000), the instrument for influenza vaccinations, the letter to the physician, was randomly assigned.

In other cases, the conditional version of this assumption is more plausible. In the McClellan and Newhouse (1994) study, proximity of an individual to a hospital with particular facilities is used as an instrument for the receipt of intensive treatment of acute myocardial infarction. This proximity measure is not randomly assigned, and McClellan and Newhouse use covariates to make the unconfoundedness assumption more plausible. For example, they worry about differences between individuals living in rural versus urban areas. To adjust for such differences, they use as one of the covariates the distance to the nearest hospital (regardless of the facilities at the nearest hospital).

A key issue is that although on its own this random assignment or unconfoundedness assumption justifies a causal interpretation of the intention-to-treat effects, it is not sufficient for a causal interpretation of the instrumental variables estimand, the ratio of the ITT effects for outcome and treatment.

5.2 The Exclusion Restriction

Second, consider the exclusion restriction. This is the most critical and typically most controversial assumption underlying instrumental variables methods.

First of all, it has some testable implications; see Balke and Pearl (1997) and the recent discussions in Kitagawa (2009) and Ramsahai and Lauritzen (2011). This testable restriction can be seen most easily in a binary outcome setting. Under the three assumptions, random assignment, the exclusion restriction and monotonicity, the intention-to-treat effect of the assignment on the outcome is the product of two causal effects: first, the average effect of the assignment on the outcome for compliers, and second, the intention-to-treat effect of the assignment on receipt of the treatment, which is equal to the population proportion of compliers. If the outcome is binary, the first factor is between −1 and 1. Hence, the intention-to-treat effect of the assignment on the outcome has to be bounded in absolute value by the intention-to-treat effect of the assignment on the receipt of the treatment. This is a testable restriction. If the outcomes are multivalued, there is in fact a range of restrictions implied by the assumptions. However, there exist no consistent tests that will reject the null hypothesis with probability going to one as the sample size increases in all scenarios where the null hypothesis is wrong.

Let us assess these restrictions in the flu example. Because

pr(Y_i = 1, X_i = 0 \mid Z_i = 1) = pr(Y_i(0) = 1 \mid T_i = n) \cdot pr(T_i = n)

and

pr(Y_i = 1, X_i = 0 \mid Z_i = 0)
  = pr(Y_i(0) = 1 \mid T_i \in \{n, c\}) \cdot pr(T_i \in \{n, c\})
  = pr(Y_i(0) = 1 \mid T_i = n) \cdot pr(T_i = n) + pr(Y_i(0) = 1 \mid T_i = c) \cdot pr(T_i = c),

it follows that

pr(Y_i = 1, X_i = 0 \mid Z_i = 1) \le pr(Y_i = 1, X_i = 0 \mid Z_i = 0). \qquad (5.1)

There are three more restrictions in this setting with a binary outcome, binary treatment and binary instrument; see Imbens and Rubin (1997b), Balke and Pearl (1997) and Richardson, Evans and Robins (2011) for details. For the flu data, the analogous restriction pr(Y_i = 1, X_i = 1 | Z_i = 0) ≤ pr(Y_i = 1, X_i = 1 | Z_i = 1) is slightly violated: the simple frequency estimate of the left-hand side is 30/1389 = 0.0216, and that of the right-hand side is 31/1472 = 0.0211, as pointed out in Richardson, Evans and Robins (2011) and Imbens and Rubin (2014). Although the violation is not statistically significant, it shows that these restrictions have content in practice.
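A quick check of all four restrictions from the Table 2 counts (a sketch: under monotonicity, random assignment and the exclusion restriction, the X = 0 subdensities must fall, and the X = 1 subdensities must rise, when moving from Z = 0 to Z = 1):

    # counts[(y, x, z)] as before; totals n0 = 1389 (Z=0), n1 = 1472 (Z=1)
    counts = {(0,0,0): 1027, (0,0,1): 935, (0,1,0): 233, (0,1,1): 422,
              (1,0,0): 99,   (1,0,1): 84,  (1,1,0): 30,  (1,1,1): 31}
    n = {0: 1389, 1: 1472}

    for y in (0, 1):
        # pr(Y=y, X=0 | Z=1) <= pr(Y=y, X=0 | Z=0)
        ok_x0 = counts[(y, 0, 1)] / n[1] <= counts[(y, 0, 0)] / n[0]
        # pr(Y=y, X=1 | Z=0) <= pr(Y=y, X=1 | Z=1)
        ok_x1 = counts[(y, 1, 0)] / n[0] <= counts[(y, 1, 1)] / n[1]
        print(y, ok_x0, ok_x1)
    # The (y=1, X=1) restriction fails: 30/1389 = 0.0216 > 31/1472 = 0.0211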


To assess the plausibility of the exclusion restriction, it is often helpful to do so separately in subpopulations defined by compliance status. Let us first consider the exclusion restriction for always-takers, who would receive the influenza vaccine irrespective of the receipt of the letter by their physician. Presumably, such patients are generally at higher risk for the flu. Why would such patients be affected by a letter warning their physicians about the upcoming flu season when they will get inoculated irrespective of this warning? It may be that the letter led the physician to take other actions beyond giving the flu vaccine, such as encouraging the patient to avoid exposure. These other actions may affect health outcomes, in which case the exclusion restriction would be violated. The exclusion restriction for never-takers has different content. These patients would not receive the vaccine in any case. If their physicians did not regard the risk of flu as sufficiently high to encourage their patients to have the vaccination, presumably the physician would not take other actions either. For these patients, the exclusion restriction may therefore be reasonable.

Consider the draft lottery example. In that case, the always-takers are individuals who volunteer for military service irrespective of their draft priority number. It seems plausible that the draft priority number has no causal effect on their outcomes. Never-takers are individuals who do not serve in the military irrespective of their draft priority number. If this is for medical reasons, or more generally reasons that make them ineligible to serve, this seems plausible. If, on the other hand, these are individuals fit but unwilling to serve, they may have had to take actions to stay out of the military that could have affected their subsequent civilian labor market careers. Such actions may include extending their educational career, or temporarily leaving the country. Note that these issues are not addressed by the random assignment of the instrument.

In general, the concern is that the instrument creates incentives not only to receive the treatment, but also to take additional actions that may affect the outcome of interest. The nature of these actions may well differ by compliance type. Most important is to keep in mind that this assumption is typically a substantive assumption, not satisfied by design outside of double-blind, single-dose placebo-control randomized experiments with noncompliance.

5.3 Monotonicity

Finally, consider the monotonicity or no-defiers assumption. Even though this assumption is often the least controversial of the three instrumental variables assumptions, it is still sometimes viewed with suspicion. For example, whereas Robins views the assumption as "often, but not always reasonable" (Robins (1989), page 122), Freedman (2006) wonders: "The identifying restriction for the instrumental-variables estimator is troublesome: just why are there no defiers?" (Freedman (2006), page 700). In many applications, it is perfectly clear why there should be no or at most few defiers. The instrument plays the role of an incentive for the individual to choose the active treatment, by either making it more attractive to take the active treatment or less attractive to take the control treatment. As long as individuals do not respond perversely to this incentive, monotonicity is plausible, with either no or a negligible proportion of defiers in the population. The term incentive is used broadly here: it may be a financial incentive, or the provision of information, or an imperfectly monitored legal requirement, but in all cases something that makes it more likely, at the individual level, that the individual participates in the treatment.

Let us consider some examples. If noncompliance is one-sided, and those assigned to the control group are effectively embargoed from receiving the treatment, monotonicity is automatically satisfied. In that case, X_i(0) = 0, and there are no always-takers or defiers. The example discussed in Sommer and Zeger (1991), Imbens and Rubin (1997a) and Greenland (2000) fits this set up.

In the flu application introduced in Section 4, the letter to the physician creates an additional incentive for the physician to provide the flu vaccine to a patient, something beyond any incentives the physician may have had already to provide the vaccine. Some individuals may already be committed to the vaccine, irrespective of the letter (the always-takers), and some may not be swayed by the receipt of the letter (the never-takers), and that is consistent with this assumption. Monotonicity only requires that there is no patient who, if their physician receives the letter, would not take the vaccine, whereas they would have taken the vaccine in the absence of the letter.

Consider a second example, the influential draft lottery application by Angrist (1990) (see also


Hearst, Newman and Hulley (1986)). Angrist is interested in evaluating the effect of military service on subsequent civilian earnings, using the draft priority established by the draft lottery as an instrument. Monotonicity requires that assigning an individual priority for the draft, rather than not, may induce them to serve in the military, or may not affect them, but cannot induce them to switch from serving to not serving in the military. Again, that seems plausible. Having high priority for the draft increases the cost of staying out of the military: that may not be enough to change behavior, but it would be unusual if the increased cost of staying out of the military induced an individual to switch from serving in the military to not serving.

As a third example, consider the Permutt and Hebel (1989) study of the effect of smoking on birthweight. Permutt and Hebel use the random assignment to a smoking-cessation program as an instrument for the amount of smoking. In this case, the monotonicity assumption requires that there are no individuals who, as a causal effect of the assignment to the smoking-cessation program, end up smoking more. There may be individuals who continue to smoke as much under either assignment and individuals who reduce smoking as a result of the assignment, but the assumption is that there is nobody who increases their smoking as a result of the smoking-cessation program. In all these examples, monotonicity requires individuals not to respond perversely to changes in incentives. Systematic and major violations in such settings seem unlikely.

In other settings, the assumption is less attractive. Suppose a program has assignment criteria that are checked by two administrators. Individuals entering the assignment process are assigned randomly to one of the two administrators. The assignment criteria may be interpreted slightly differently by the two administrators, with on average administrator A being more strict than administrator B. Monotonicity requires that anyone admitted by administrator A would also be admitted by administrator B, or vice versa. In this type of setting, monotonicity does not appear to be as plausible as it is in the settings where the instrument can be viewed as creating an incentive to participate in the treatment. For example, in an analysis of the effect of prison time on recidivism, Aizer and Doyle (2013) use random assignment of cases to judges, and in an analysis of the effect of bankruptcy, Dobbie and Song (2013) use random assignment of bankruptcy applications to judges.

The discussion in this section focuses primarily on the case with a binary treatment and a binary instrument. In cases with multivalued treatments, the monotonicity assumption can be generalized in two different ways. In both cases, it may be less plausible than in the binary case. Let X_i(z) be the potential treatment level associated with the assignment z. One can generalize the monotonicity assumption for the binary instrument case to this case as

X_i(z) \ is\ nondecreasing\ in\ z\ for\ all\ i \qquad (monotonicity in instrument).

This generalization is used in Angrist and Imbens (1995). It is consistent with the view of the instrument as changing the incentive to participate in the treatment: increasing the incentive cannot decrease the level of the treatment received. Angrist and Imbens show that this assumption has testable implications.

An alternative generalization is

if\ X_i(z) > X_j(z),\ then\ X_i(z') \ge X_j(z')\ for\ all\ z, z', i, j \qquad (monotonicity in unobservables).

This assumption, referred to as rank preservation in Robins (1986), implicitly ranks all units in terms of some unobservables (Imbens (2007)). It assumes this ranking is invariant to the level of the instrument. It implies that if X_i(z) > X_j(z), then it cannot be that X_j(z') > X_i(z'). It is equivalent to the "continuous prescribing preference" in Hernán and Robins (2006).

In both cases, the special case with a binary treatment is identical to the previously stated monotonicity. In settings with multivalued treatments, these assumptions are more restrictive than in the binary treatment case. In the demand and supply example in Section 3 with linear supply and demand functions, both the monotonicity in the instrument and the monotonicity in the unobservables conditions are satisfied.

6. THE LINK TO THE TEXTBOOK DISCUSSIONS OF INSTRUMENTAL VARIABLES

Most textbook discussions of instrumental variables use a framework that is quite different at first sight from the potential outcome set up used in Sections 4 and 5. These textbook discussions (graduate texts include Wooldridge, 2010; Angrist and


Pischke (2009); Greene (2011); and Hayashi (2000),and introductory undergraduate textbooks includeWooldridge (2008); and Stock and Watson (2010))are often closer to the simultaneous equations exam-ple from Section 3. An exception is Manski (2007)who uses the potential outcome set up used in thisdiscussion. In this section I will discuss the standardtextbook set up and relate it to the potential out-come framework and the simultaneous equations setup.

The textbook version of instrumental variables does not explicitly define the potential outcomes. Instead the starting point is a linear regression function describing the relation between the realized (observed) outcome Yi, the endogenous regressor of interest Xi and other regressors Vi:

$$Y_i^{\text{obs}} = \beta_0 + \beta_1 X_i + \beta_2' V_i + \varepsilon_i. \tag{6.1}$$

These other regressors as well as the instruments are often referred to in the econometric literature as exogenous variables. Although this term does not have a well-defined meaning, informally it includes variables that Cox (1992) called attributes, as well as potential causes whose assignment is unconfounded. This set up covers both the demand function setting and the randomized experiment example. Although this equation looks like a standard regression function, that similarity is misleading. Equation (6.1) is not an ordinary regression function in the sense that the first part does not represent the conditional expectation of the outcome Yi given the right-hand side variables Xi and Vi. Instead it is what is sometimes called a structural equation, representing the causal response to changes in the input Xi.

The key assumption in this formulation is that the unobserved component εi in this regression function is independent of the exogenous regressors Vi and the instruments Zi, or, formally,

$$\varepsilon_i \perp (Z_i, V_i). \tag{6.2}$$

The unobserved component is not independent of the endogenous regressor Xi, though. The value of the regressor Xi may be partly chosen by individual i to optimize some objective function, as in the noncompliance example, or be the result of an equilibrium condition, as in the supply and demand model. The precise relation between Xi and εi is often not fully specified.

How does this set up relate to the earlier discussion involving potential outcomes? Implicitly, there is in the background of this set up a causal, unit-level response function. In the potential outcome notation, let Yi(x) denote this causal response function for unit i, describing for each value of x the potential outcome corresponding to that level of the treatment for that unit. Suppose the conditional expectation of this causal response function is linear in x and some exogenous covariates:

$$E[Y_i(x) \mid V_i] = \beta_0 + \beta_1 x + \beta_2' V_i. \tag{6.3}$$

Moreover, let us make the (strong) assumption that the difference between the response function Yi(x) and its conditional expectation does not depend on x, so we can define the residual unambiguously as

$$\varepsilon_i = Y_i(x) - (\beta_0 + \beta_1 x + \beta_2' V_i),$$

with the equality holding for all x. The residual εi is now uncorrelated with Vi by definition. We will assume that it is in fact independent of Vi. Now suppose we have an instrument Zi such that

$$Y_i(x) \perp Z_i \mid V_i.$$

This assumption is, given the linear representation for Yi(x), equivalent to

$$\varepsilon_i \perp Z_i \mid V_i.$$

In combination with the assumption that εi ⊥ Vi, this gives us the textbook version of the assumption given in (6.2). We observe Vi, Xi, the instrument Zi, and the realized outcome

$$Y_i^{\text{obs}} = Y_i(X_i) = \beta_0 + \beta_1 X_i + \beta_2' V_i + \varepsilon_i,$$

which is the starting point in the econometric textbook discussion (6.1).
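To fix ideas, here is a minimal simulation sketch (the parameter values and variable names are illustrative assumptions, not from the paper) showing how least squares fails under endogeneity while the instrumental variables ratio recovers β1 when (6.2) holds:

```python
import numpy as np

# A minimal simulation (illustrative values) of the textbook model
# (6.1)-(6.2): Z satisfies (6.2), X is endogenous.
rng = np.random.default_rng(0)
n = 100_000
beta0, beta1 = 1.0, 2.0

z = rng.normal(size=n)                          # instrument, independent of eps
eps = rng.normal(size=n)                        # structural error
x = 0.5 * z + 0.8 * eps + rng.normal(size=n)    # endogenous: cov(x, eps) > 0
y = beta0 + beta1 * x + eps

b_ols = np.polyfit(x, y, 1)[0]                  # biased: cov(x, eps) != 0
b_iv = np.cov(y, z)[0, 1] / np.cov(x, z)[0, 1]  # consistent under (6.2)
print(f"OLS: {b_ols:.3f}  IV: {b_iv:.3f}  truth: {beta1}")
```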

This set up is more restrictive than it needs to be. For example, the assumption that the difference between the response function Yi(x) and its conditional expectation does not depend on x can be relaxed to allow for variation in the slope coefficient,

$$Y_i(x) - Y_i(0) = \beta_1 x + \eta_i x,$$

as long as the ηi satisfy conditions similar to those on εi. The modern literature (e.g., Matzkin (2007)) discusses such models in more detail.

One key feature of the textbook version is that there is no separate role for the monotonicity assumption. Because the linear model implicitly assumes that the per-unit causal effect is constant across units and levels of the treatment, violations of the monotonicity assumption do not affect the interpretation of the estimand. A second feature of the textbook version is that the exclusion restriction and the random assignment assumption are combined in (6.2). Implicitly, the exclusion restriction is captured by the absence of Zi in the equation (6.1), and the (conditional) random assignment is captured by (6.2).

7. EXTENSIONS AND GENERALIZATIONS

In this section, I will briefly review some of the other approaches taken in the instrumental variables literature. Some of these originate in the statistics literature, some in the econometrics literature. They reflect different concerns with the traditional instrumental variables methods, sometimes because of different applications, sometimes because of different traditions in econometrics and statistics. This discussion is not exhaustive. I will focus on highlighting the most interesting developments and provide some references to the relevant literature.

7.1 Model-Based Approaches to Estimation and Inference

Traditionally, instrumental variables analyses relied on linear regression methods. Additional explanatory variables are incorporated linearly in the regression function. The recent work in the statistics literature has explored more flexible approaches to include covariates. These approaches often involve modeling the conditional distribution of the endogenous regressor given the instruments and the exogenous variables. This is in contrast to the traditional econometric literature, which has focused on settings and methods that do not rely on such models.

Robins (1989, 1994), Hernán and Robins (2006), Greenland (2000), Robins and Rotnitzky (2004) and Tan (2010) developed an approach that allows for identification of average treatment effects by adding parametric modelling assumptions. This approach starts with the specification of what they call the structural mean, the expectation of Yi(x). This structural mean can be the conditional mean given covariates, or the marginal mean, labeled the marginal structural mean. The specification for this expectation is typically parametric. Then estimating equations for the parameters of these models are developed. In the simple setting considered here, this would typically lead to the same estimators considered already. An important virtue of the method is that it has been extended to much more general settings, in particular with time-varying covariates

and dynamic treatment regimes in a series of papers. In other settings, it has also led to the development of doubly robust estimators (Robins and Rotnitzky (2004)). A key feature of these models is that they are robust in a particular sense: the estimators for the average treatment effects are consistent irrespective of misspecification of the model, in the absence of intention-to-treat effects (what they call the conditional ITT null).

Imbens and Rubin (1997a) and Hirano et al. (2000) propose building a parametric model for the compliance status in terms of additional covariates, combined with models for the potential outcomes conditional on compliance status and covariates. Given the monotonicity assumption, there are three compliance types: never-takers, always-takers and compliers. A natural model for compliance status given individual characteristics Vi is therefore a trinomial logit model:

$$\operatorname{pr}(T_i = n \mid V_i = v) = \frac{\exp(v'\gamma_n)}{1 + \exp(v'\gamma_n) + \exp(v'\gamma_a)},$$

$$\operatorname{pr}(T_i = a \mid V_i = v) = \frac{\exp(v'\gamma_a)}{1 + \exp(v'\gamma_n) + \exp(v'\gamma_a)}$$

and

$$\operatorname{pr}(T_i = c \mid V_i = v) = \frac{1}{1 + \exp(v'\gamma_n) + \exp(v'\gamma_a)}.$$

With continuous outcomes, the conditional outcome distributions given compliance status and covariates may be normal:

$$Y_i(x) \mid T_i = t, V_i = v \sim \mathcal{N}(\beta_{tx}' v, \sigma_{tx}^2),$$

for (t, x) = (n, 0), (a, 1), (c, 0), (c, 1). With binary outcomes, one may wish to use logistic regression models here. This specification defines the likelihood function. Hirano et al. (2000) apply this to the flu data discussed before. Simulations in Richardson, Evans and Robins (2011) suggest that the modeling of the compliance status here is key. Specifically, they point out that even in the absence of ITT effects there can be biases if the model of the compliance status is misspecified.
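To make the structure of this likelihood concrete, the following sketch (an illustration under assumed parameter shapes, not the authors' code) evaluates the mixture likelihood implied by the trinomial logit and the normal outcome model, using the fact that under monotonicity the observed (Zi, Xi) cell determines which compliance types are possible:

```python
import numpy as np
from scipy.stats import norm

def type_probs(V, g_n, g_a):
    """Trinomial logit pr(T = n), pr(T = a), pr(T = c) given covariates V
    (n, p), with coefficient vectors g_n, g_a; compliers are the base type."""
    en, ea = np.exp(V @ g_n), np.exp(V @ g_a)
    denom = 1.0 + en + ea
    return en / denom, ea / denom, 1.0 / denom

def log_likelihood(Y, X, Z, V, g_n, g_a, beta, sigma):
    """Mixture log-likelihood under monotonicity. beta has shape (3, 2, p)
    and sigma shape (3, 2), indexed by type t (n=0, a=1, c=2) and
    treatment level x; only the realized (t, x) cells are used."""
    p_n, p_a, p_c = type_probs(V, g_n, g_a)
    f = lambda t, x: norm.pdf(Y, loc=V @ beta[t, x], scale=sigma[t, x])
    lik = np.where(
        Z == 1,
        np.where(X == 1, p_c * f(2, 1) + p_a * f(1, 1),  # Z=1, X=1: c or a
                 p_n * f(0, 0)),                          # Z=1, X=0: n only
        np.where(X == 1, p_a * f(1, 1),                   # Z=0, X=1: a only
                 p_c * f(2, 0) + p_n * f(0, 0)),          # Z=0, X=0: c or n
    )
    return np.log(lik).sum()
```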

Like Hirano et al. (2000), Richardson, Evans and Robins (2011) build parametric models only for the identified distributions. They use them to estimate the bounds, so that the parametric assumptions do not contain identifying information.

Little and Yau (1998) and Yau and Little (2001) similarly model the conditional expectation of the outcome given compliance status and covariates. In their application, there are no always-takers, only never-takers and compliers. They specify parametric forms for the conditional means given the compliance types and the treatment status:

$$E[Y_i(0) \mid T_i = n, V_i = v] = \beta_{n0} + \beta_{n1}' v,$$

$$E[Y_i(0) \mid T_i = c, V_i = v] = \beta_{c00} + \beta_{c01}' v$$

and

$$E[Y_i(1) \mid T_i = c, V_i = v] = \beta_{c10} + \beta_{c11}' v.$$

7.2 Principal Stratification

Frangakis and Rubin (2002) generalize the latent compliance type approach to instrumental variables in an important and novel way. Their focus is on the causal effect of a binary treatment on some outcome. However, it is not the average effect of the treatment they are interested in, but the average within a subpopulation. It is the way this subpopulation is defined that creates the complications, as well as the connection to instrumental variables. There is a post-treatment variable that may be affected by the treatment. Frangakis and Rubin postulate the existence of a pair of potential outcomes for this post-treatment variable. The subpopulation of interest is then defined by the values of the pair of potential outcomes for this post-treatment variable.

Let us consider two examples. First, the randomized experiment with noncompliance. The treatment here is the random assignment. The post-treatment variable is the actual receipt of the treatment. The pair of potential outcomes for this post-treatment variable captures the compliance status. The subpopulation of interest is the subpopulation of compliers.

The second example shows how principal stratification generalizes the instrumental variables set up to other cases. Examples of this type are considered in Zhang, Rubin and Mealli (2009), Frumento et al. (2012) and Robins (1986). Suppose we have a randomized experiment with perfect compliance. The primary outcome is survival after one year. For patients who survive, a quality of life measure is observed. We may be interested in the effect of the treatment on quality of life. This is only defined for patients who survive up to one year. The principal stratification approach suggests focusing on the subpopulation, or principal stratum, of patients who survive irrespective of the treatment assignment. Membership in this stratum is not observed, and so we cannot directly estimate the average effect of the treatment on quality of life for individuals in this stratum, but the data are generally still informative about such effects, particularly under monotonicity assumptions.

7.3 Randomization Inference with Instrumental Variables

Most of the work on inference in instrumental variables settings is model-based. After specifying a model relating the treatment to the outcome, the conditional distribution or conditional mean of outcomes given instruments is derived. The resulting inferences are conditional on the values of the instruments. A very different approach is taken in Rosenbaum (1996) and Imbens and Rosenbaum (2005).

Rosenbaum focuses on the distribution of statistics generated by the random assignment of the instruments. In the spirit of the work by Fisher (1925), confidence intervals for the parameter of interest, β1 in equation (6.3), are based on this randomization distribution. Similar to confidence intervals for treatment effects based on inverting conventional Fisher p-values, these intervals have exact coverage under the stated assumptions. However, these results rely on arguably restrictive constant treatment effect assumptions.
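A minimal sketch of this idea (my illustration, not Rosenbaum's code): under a constant-effect null with effect b, the adjusted outcomes Yi − b·Xi are fixed under re-randomization of the instrument, so the null can be tested by permuting Z:

```python
import numpy as np

def fisher_pvalue(Y, X, Z, b, n_perm=10_000, seed=0):
    """Permutation p-value for the constant-effect null with effect b.
    Under the null, A_i = Y_i - b * X_i is unaffected by the instrument,
    so its association with Z can be re-computed under re-randomization."""
    rng = np.random.default_rng(seed)
    A = Y - b * X
    stat = lambda z: abs(np.cov(A, z)[0, 1])
    t_obs = stat(Z)
    t_perm = np.array([stat(rng.permutation(Z)) for _ in range(n_perm)])
    return (t_perm >= t_obs).mean()

# A 95% interval collects the values of b not rejected at the 5% level,
# e.g. by scanning a grid of candidate effects.
```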

7.4 Matching and Instrumental Variables

In many observational studies using instrumental variables approaches, the instruments are not randomly assigned. In that case, adjustment for additional pretreatment variables can sometimes make causal inferences more credible. Even if the instrument is randomly assigned, such adjustments can make the inferences more precise. Traditionally, in econometrics these adjustments are based on regression methods. Recently, in the statistics literature matching methods have been proposed as a way to do the adjustment for pretreatment variables (Baiocchi et al., 2010).

7.5 Weak Instruments

One concern that has arisen in the econometrics literature is about weak instruments. For an instrument to be helpful in estimating the effect of the treatment, it not only needs to have no direct effect on the outcome, it also needs to be correlated with the treatment. Suppose this correlation is very close to zero. In the simple case, the IV estimator is the ratio of covariances,

$$\beta_{1,\text{iv}} = \frac{\operatorname{cov}(Y_i, Z_i)}{\operatorname{cov}(X_i, Z_i)} = \frac{(1/N)\sum_{i=1}^N (Y_i - \overline{Y})(Z_i - \overline{Z})}{(1/N)\sum_{i=1}^N (X_i - \overline{X})(Z_i - \overline{Z})}.$$

The distribution of this ratio can be approximated by a normal distribution in large samples, as long as the covariance in the denominator is nonzero in the population. If the population value of the covariance in the denominator is exactly zero, the distribution of the ratio β1,iv is Cauchy in large samples, rather than normal (Phillips (1989); Staiger and Stock (1997)). The weak instrument literature is concerned with the construction of confidence intervals in the case the covariance is close to zero. Interest in this problem rose sharply after a study by Angrist and Krueger (1991), which remains the primary empirical motivation for this literature. Angrist and Krueger were interested in estimating the causal effect of years of education on earnings. They exploited variation in educational achievement by quarter of birth attributed to differences in compulsory schooling laws. These differences in average years of education by quarter of birth were small, and they attempted to improve precision of their estimators by including interactions of the basic instruments, the three quarter of birth dummies, with indicators for year and state of birth. Bound, Jaeger and Baker (1995) showed that the estimates using the interactions as additional instruments were potentially severely affected by the weakness of the instruments. In one striking analysis, they reestimated the Angrist–Krueger regressions using randomly generated quarter of birth data (uncorrelated with earnings or years of education). One might have expected, and hoped, that in that case one would find an imprecisely estimated effect. Surprisingly, Bound, Jaeger and Baker (1995) found that the confidence intervals constructed by Angrist and Krueger suggested precisely estimated effects for the effect of years of education on earnings. It was subsequently found that with weak instruments the TSLS estimator, especially with many instruments, was biased, and that the standard variance estimator led to confidence intervals with substantial undercoverage (Bound, Jaeger and Baker (1995); Staiger and Stock (1997); Chamberlain and Imbens (2004)).
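The following simulation (my illustration, with arbitrary parameter values) shows the phenomenon: with a strong instrument the IV ratio is approximately normal, while with an irrelevant instrument it behaves like a Cauchy ratio, producing occasional extreme estimates:

```python
import numpy as np

# Simulation (illustrative values): sampling distribution of the IV ratio
# with a strong instrument versus an irrelevant one.
rng = np.random.default_rng(1)

def iv_draws(pi, n=500, reps=5_000):
    """IV estimates across repeated samples; pi is the first-stage slope."""
    est = np.empty(reps)
    for r in range(reps):
        z = rng.normal(size=n)
        eps = rng.normal(size=n)
        x = pi * z + eps + rng.normal(size=n)
        y = 1.0 + 2.0 * x + eps
        est[r] = np.cov(y, z)[0, 1] / np.cov(x, z)[0, 1]
    return est

# With pi = 0 the denominator is centered at zero and the ratio has
# Cauchy-like tails; quantiles, not the variance, are informative.
for name, pi in [("strong", 1.0), ("irrelevant", 0.0)]:
    print(name, np.percentile(iv_draws(pi), [5, 50, 95]).round(2))
```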

Motivated by the Bound–Jaeger–Baker findings, the weak and many instruments literature focused on point and interval estimators with better properties in settings with weak instruments. Starting with Staiger and Stock (1997), a literature developed to construct confidence intervals for the instrumental variables estimand that remain valid irrespective of the strength of the instruments. A key insight was that confidence intervals based on the inversion of Anderson–Rubin (1949) statistics have good properties in settings with weak instruments; see also Moreira (2003), Andrews and Stock (2007), Kleibergen (2002) and Andrews, Moreira and Stock (2006).

Let us look at the simplest case with a single endogenous regressor, a single instrument, no additional regressors, and normally distributed residuals:

$$Y_i(x) = \beta_0 + \beta_1 x + \varepsilon_i \quad \text{with } \varepsilon_i \mid Z_i \sim \mathcal{N}(0, \sigma_\varepsilon^2).$$

The Anderson–Rubin statistic is, for a given value of b,

$$AR(b) = \left( \frac{1}{\sqrt{N}} \sum_{i=1}^N (Z_i - \overline{Z})(Y_i - b X_i) \right)^2 \bigg/ \left( \frac{1}{N} \sum_{i=1}^N (Z_i - \overline{Z})^2 \cdot \hat\sigma_\varepsilon^2 \right),$$

where $\overline{Z} = \sum_{i=1}^N Z_i / N$ and $\hat\sigma_\varepsilon^2$ is some estimate of the residual variance. At the true value b = β1, the AR statistic has in large samples a chi-squared distribution with one degree of freedom. Staiger and Stock (1997) propose constructing a confidence interval by inverting this test statistic:

$$CI^{0.95}(\beta_1) = \{ b \mid AR(b) \leq 3.84 \}.$$

The subsequent literature has extended this by allowing for multiple instruments and developed various alternatives, all with the focus on methods that remain valid irrespective of the strength of the instruments; see Andrews and Stock (2007) for an overview of this literature.
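A minimal sketch of the test inversion for the single-instrument case (illustrative; the residual-variance estimate here is an assumption, computed at the candidate value b):

```python
import numpy as np

def ar_stat(b, Y, X, Z):
    """Anderson-Rubin statistic at a candidate value b (single instrument).
    The residual variance is estimated at b, up to the intercept."""
    n = len(Y)
    Zc = Z - Z.mean()
    e = Y - b * X
    num = (Zc * e).sum() / np.sqrt(n)
    s2 = e.var(ddof=1)
    return num ** 2 / ((Zc ** 2).mean() * s2)

def ar_confidence_set(Y, X, Z, grid):
    """Invert the AR test: keep all b on the grid with AR(b) <= 3.84."""
    return [b for b in grid if ar_stat(b, Y, X, Z) <= 3.84]

# With weak instruments the set can be wide or even unbounded, in which
# case a finite grid only shows its intersection with the grid.
```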

7.6 Many Instruments

Another strand of the literature motivated by the Angrist–Krueger study focused on settings with many weak instruments. The concern centered on the Bound, Jaeger and Baker (1995) finding that, in a setting similar to the Angrist–Krueger setting, using TSLS with many randomly generated instruments led to confidence intervals that had very low coverage rates.


To analyze this setting, Bekker (1994) considered the behavior of various estimators under an asymptotic sequence where the number of instruments increases with the sample size. Asymptotic approximations to sampling distributions based on this sequence turned out to be much more accurate than those based on conventional asymptotic approximations. A key finding in Bekker (1994) is that under such sequences one of the leading estimators, the Two-Stage-Least-Squares (TSLS; see the Appendix for details) estimator, is no longer consistent, whereas another estimator, the Limited Information Maximum Likelihood (LIML; again see the Appendix for details) estimator, remains consistent, although the variance under this asymptotic sequence differs from that under the standard sequence; see also Kunitomo (1980), Morimune (1983), Bekker and van der Ploeg (2005), Chamberlain and Imbens (2004), Chao and Swanson (2005), Hahn (2002), Hansen, Hausman and Newey (2008), Kolesár et al. (2013).

7.7 Proxies for Instruments

Hernán and Robins (2006) and Chalak (2011) explore settings where the instrument is not directly observed. Instead, a proxy variable Z*_i is observed. This proxy variable is correlated with the underlying instrument Zi, but not perfectly so. The potential outcomes Yi(z, x) are still defined in terms of the underlying, unobserved instrument Zi. The unobserved instrument Zi satisfies the instrumental variables assumptions: random assignment, the exclusion restriction and the monotonicity assumption. In addition, the observed proxy Z*_i satisfies

$$Z_i^* \perp Y_i(0,0), Y_i(0,1), Y_i(1,0), Y_i(1,1), X_i(0), X_i(1) \mid Z_i.$$

Chalak shows that the ratio of covariances (now no longer the ratio of intention-to-treat effects) still has an interpretation as an average causal effect.

7.8 Regression Discontinuity Designs

Regression Discontinuity (RD) designs attempt to estimate causal effects of a binary treatment in settings where the assignment mechanism is a deterministic function of a pretreatment variable. In the sharp version of the RD design, the assignment mechanism takes the form

$$X_i = \mathbf{1}_{\{V_i \geq c\}},$$

for some fixed threshold c: all units with a value of the covariate Vi exceeding c receive the treatment, and all units with a value of Vi less than c are in the control group. Under smoothness assumptions, it is possible in such settings to estimate the average effect of the treatment for units with a value of the pretreatment variable at the threshold, Vi = c:

$$E[Y_i(1) - Y_i(0) \mid V_i = c] = \lim_{w \downarrow c} E[Y_i \mid V_i = w] - \lim_{w \uparrow c} E[Y_i \mid V_i = w].$$

These designs were introduced by Thistlewaite and Campbell (1960), and have been used in psychology, sociology, political science and economics. For example, many educational programs have eligibility criteria that allow for the application of RD methods; see Cook (2008) for a recent historical perspective and Imbens and Wooldridge (2009) for a recent review.
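As a simple illustration (a local-constant sketch under an assumed bandwidth h; local linear regression on each side is standard in practice), the sharp RD estimate can be computed from means on either side of the threshold:

```python
import numpy as np

def sharp_rd(Y, V, c, h):
    """Sharp RD estimate: difference in mean outcomes just above and just
    below the threshold c, within bandwidth h. Local-constant fit; in
    practice a local linear regression on each side is standard."""
    above = (V >= c) & (V < c + h)
    below = (V < c) & (V >= c - h)
    return Y[above].mean() - Y[below].mean()
```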

A generalization of the sharp RD design is the Fuzzy Regression Discontinuity or FRD design. In this case, the probability of receipt of the treatment increases discontinuously at the threshold, but not necessarily from zero to one:

$$\lim_{w \downarrow c} \operatorname{pr}(X_i = 1 \mid V_i = w) \neq \lim_{w \uparrow c} \operatorname{pr}(X_i = 1 \mid V_i = w).$$

In that case, it is no longer possible to consistently estimate the average effect of the treatment for all units at the threshold. Hahn, Todd and Van der Klaauw (2001) demonstrate that there is a close link to the instrumental variables set up. Specifically, Hahn, Todd and Van der Klaauw show that one can estimate a local average treatment effect at the threshold. To be precise, one can identify the average effect of the treatment for those who are on the margin of getting the treatment:

$$E\Big[ Y_i(1) - Y_i(0) \,\Big|\, V_i = c,\ \lim_{w \uparrow c} X_i(w) = 0,\ \lim_{w \downarrow c} X_i(w) = 1 \Big] = \frac{\lim_{w \downarrow c} E[Y_i \mid V_i = w] - \lim_{w \uparrow c} E[Y_i \mid V_i = w]}{\lim_{w \downarrow c} E[X_i \mid V_i = w] - \lim_{w \uparrow c} E[X_i \mid V_i = w]}.$$

This estimand can be estimated as the ratio of an estimator for the discontinuity in the regression function for the outcome and an estimator for the discontinuity in the regression function for the treatment of interest.
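Continuing the sketch above (and reusing sharp_rd from it, with the same local-constant caveat), the FRD estimate is the ratio of the two discontinuities:

```python
def fuzzy_rd(Y, X, V, c, h):
    """FRD estimate: outcome discontinuity divided by the discontinuity
    in the treatment probability at c (uses sharp_rd from above)."""
    return sharp_rd(Y, V, c, h) / sharp_rd(X, V, c, h)
```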

8. CONCLUSION

In this paper, I review the connection between the recent statistics literature on instrumental variables and the older econometrics literature. Although the econometric literature on instrumental variables goes back to the 1920s, until recently it had not made much of an impact on the statistics literature. The recent statistics literature has combined some of the older insights from the econometrics instrumental variables literature with the separate literature on causality, enriching both in the process.

APPENDIX: ESTIMATION AND INFERENCE, TWO-STAGE-LEAST-SQUARES AND OTHER TRADITIONAL METHODS

A.1 Set up

In this section, I will discuss the traditional econometric approaches to estimation and inference in instrumental variables settings. Part of the aim of this section is to provide easier access to the econometric literature and terminology on instrumental variables, and to provide a perspective and context for the recent advances.

The textbook setting is the one discussed in the previous section, where a scalar outcome Yi is linearly related to a scalar covariate of interest Xi. In addition, there may be additional exogenous covariates Vi. The traditional model is

$$Y_i = \beta_0 + \beta_1 X_i + \beta_2' V_i + \varepsilon_i. \tag{A.1}$$

In addition, we have a vector of instrumental variables Zi, with dimension K.

An important distinction in the traditional econometric literature is between the case with a single instrument (K = 1) and the case with more than one instrument (K > 1). More generally, with more than one endogenous regressor, the distinction is between the case with the number of instruments equal to the number of endogenous regressors and the case with the number of instruments larger than the number of endogenous regressors. In the empirical literature, there are few credible examples with more than one endogenous regressor, so I focus here on the case with a single endogenous regressor. The first case, with a single instrument, is referred to as the just-identified case, and the second, with multiple instruments and a single endogenous regressor, as the over-identified case. In the textbook setting with a linear model and constant coefficients, this distinction has motivated different estimators and specification tests. In the modern literature, with its explicit allowance for heterogeneity in the treatment effects, these tests, and the distinction between the various estimators, are of less interest. In the recent statistics literature, little attention has been paid to the over-identified case with multiple instruments. An exception is Small (2007).

Obviously, it is often difficult in applications to find even a single variable that satisfies the conditions for it to be a valid instrument. This raises the question of how relevant the literature focusing on methods to deal with multiple instruments is for empirical practice. There are two classes of applications where multiple instruments could credibly arise. First, suppose one has a single continuous (or multivalued) instrument that satisfies the instrumental variables assumptions: monotonicity, random assignment and the exclusion restriction. Then any monotone function of the instrument also satisfies these assumptions, and one can use multiple monotone functions of the original instrument as instruments. Second, if one has a single instrument in combination with exogenous covariates, then one can use interactions of the instrument and the covariates to generate additional instruments.

Consider, for example, the Fulton fish market study by Graddy (1995, 1996). Graddy uses weather conditions as an instrument that affects supply but not demand. Specifically, she measures wind speed and wave height, giving her two basic instruments. She also constructs functions of these basic instruments, such as indicators that the wind speed or wave height exceeds some threshold.

A.2 The Just-Identified Case with No Additional Covariates

The traditional approach to estimation in this case is to use what is known in the econometrics literature as the instrumental variables estimator. In the case without additional exogenous covariates, the most widely used estimator is simply the ratio of two covariances:

$$\beta_{1,\text{iv}} = \frac{\operatorname{cov}(Y_i, Z_i)}{\operatorname{cov}(X_i, Z_i)} = \frac{(1/N)\sum_{i=1}^N (Y_i - \overline{Y})(Z_i - \overline{Z})}{(1/N)\sum_{i=1}^N (X_i - \overline{X})(Z_i - \overline{Z})},$$

where $\overline{Y}$, $\overline{Z}$ and $\overline{X}$ denote sample averages. If the instrument Zi is binary, this is also known as the Wald estimator:

$$\beta_{1,\text{iv}} = \frac{\overline{Y}_1 - \overline{Y}_0}{\overline{X}_1 - \overline{X}_0},$$


where for z = 0, 1,

$$\overline{Y}_z = \frac{1}{N_z} \sum_{i: Z_i = z} Y_i, \qquad \overline{X}_z = \frac{1}{N_z} \sum_{i: Z_i = z} X_i,$$

and $N_1 = \sum_{i=1}^N Z_i$ and $N_0 = \sum_{i=1}^N (1 - Z_i)$.
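As a quick illustration of the mechanics (a minimal sketch, not code from the paper), the Wald estimator can be computed directly from these group means:

```python
import numpy as np

def wald_estimator(Y, X, Z):
    """Wald estimator with a binary instrument Z: the ratio of the
    difference in mean outcomes to the difference in mean treatments."""
    y1, y0 = Y[Z == 1].mean(), Y[Z == 0].mean()
    x1, x0 = X[Z == 1].mean(), X[Z == 0].mean()
    return (y1 - y0) / (x1 - x0)
```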

One can interpret this estimator in two different ways. These interpretations are useful for motivating extensions to settings with multiple instruments and additional exogenous regressors. First, the indirect least squares interpretation. This relies on first estimating separately the two reduced form regressions: the regression of the outcome on the instrument,

$$Y_i = \pi_{10} + \pi_{11} Z_i + \varepsilon_{1i},$$

and the regression of the endogenous regressor on the instrument,

$$X_i = \pi_{20} + \pi_{21} Z_i + \varepsilon_{2i}.$$

The indirect least squares estimator is the ratio of the least squares estimates of π11 and π21, or $\hat\beta_{1,\text{ils}} = \hat\pi_{11}/\hat\pi_{21}$. Note that in the randomized experiment example where Xi and Zi are binary, π11 and π21 are the intention-to-treat effects, with $\hat\pi_{11} = \overline{Y}_1 - \overline{Y}_0$ and $\hat\pi_{21} = \overline{X}_1 - \overline{X}_0$.

Second, I discuss the two-stage-least-squares interpretation of the instrumental variables estimator. First, estimate the reduced form regression of the treatment on the instrument, and calculate the predicted value for the endogenous regressor from this regression:

$$\hat{X}_i = \hat\pi_{20} + \hat\pi_{21} Z_i.$$

Then estimate the regression of the outcome on the predicted endogenous regressor,

$$Y_i = \beta_0 + \beta_1 \hat{X}_i + \eta_i,$$

by least squares to get the TSLS estimator $\hat\beta_{\text{tsls}}$. In this just-identified setting, the three estimators for β1 are numerically identical: $\hat\beta_{1,\text{iv}} = \hat\beta_{1,\text{ils}} = \hat\beta_{1,\text{tsls}}$.
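A quick numerical check of this equivalence (an illustrative simulation with arbitrary values, not from the paper):

```python
import numpy as np

# Illustrative check (simulated data) that the IV ratio, ILS and TSLS
# coincide in the just-identified case.
rng = np.random.default_rng(2)
n = 10_000
z = rng.binomial(1, 0.5, size=n).astype(float)
eps = rng.normal(size=n)
x = 0.7 * z + 0.5 * eps + rng.normal(size=n)
y = 1.0 + 2.0 * x + eps

b_iv = np.cov(y, z)[0, 1] / np.cov(x, z)[0, 1]

pi11 = np.polyfit(z, y, 1)[0]                # reduced form slope
pi21, pi20 = np.polyfit(z, x, 1)             # first stage slope, intercept
b_ils = pi11 / pi21

x_hat = pi20 + pi21 * z                      # predicted endogenous regressor
b_tsls = np.polyfit(x_hat, y, 1)[0]

print(b_iv, b_ils, b_tsls)                   # identical up to rounding
```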

A.3 The Just-Identified Case with Additional Covariates

In most econometric applications, the instrument is not physically randomized. There is in those cases no guarantee that the instrument is independent of the potential outcomes. Often researchers use covariates to weaken the requirement on the instrument to conditional independence given the exogenous covariates. In addition, the additional exogenous covariates can serve to increase precision. In that case with additional covariates, the estimation strategy changes slightly. The two reduced form regressions now take the form

$$Y_i = \pi_{10} + \pi_{11} Z_i + \pi_{12}' V_i + \varepsilon_{1i},$$

and the regression of the endogenous regressor on the instrument,

$$X_i = \pi_{20} + \pi_{21} Z_i + \pi_{22}' V_i + \varepsilon_{2i}.$$

The indirect least squares estimator is again the ratio of the least squares estimates of π11 and π21, or $\hat\beta_{1,\text{ils}} = \hat\pi_{11}/\hat\pi_{21}$.

For the two-stage-least-squares estimator, we again first estimate the regression of the endogenous regressor on the instrument, now also including the exogenous regressors. The next step is to predict the endogenous covariate:

$$\hat{X}_i = \hat\pi_{20} + \hat\pi_{21} Z_i + \hat\pi_{22}' V_i.$$

Finally, the outcome is regressed on the predicted value of the endogenous regressor and the actual values of the exogenous variables:

$$Y_i = \beta_0 + \beta_1 \hat{X}_i + \beta_2' V_i + \eta_i.$$

The TSLS estimator is again identical to the ILS estimator.

For inference, the traditional approach is to assume homoskedasticity of the residuals $Y_i - \beta_0 - \beta_1 X_i - \beta_2' V_i$, with variance $\sigma_\varepsilon^2$. In large samples, the distribution of the estimator $\hat\beta_{\text{iv}}$ is approximately normal, centered around the true value β1. Typically, the variance is estimated as

$$\hat{\mathbb{V}} = \hat\sigma_\varepsilon^2 \cdot \left( \sum_{i=1}^N \begin{pmatrix} 1 \\ \hat{X}_i \\ V_i \end{pmatrix} \begin{pmatrix} 1 \\ \hat{X}_i \\ V_i \end{pmatrix}' \right)^{-1}.$$

See the textbook discussion in Wooldridge (2010).
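A compact sketch of the full procedure, including this variance estimate (illustrative; the function name and interface are my own assumptions):

```python
import numpy as np

def tsls(Y, X, Z, V):
    """TSLS with covariates, plus the homoskedastic variance estimate.
    Y, X, Z are (n,); V is (n, p). Returns (coefficients, variance)."""
    n = len(Y)
    W1 = np.column_stack([np.ones(n), Z, V])       # first-stage regressors
    x_hat = W1 @ np.linalg.lstsq(W1, X, rcond=None)[0]
    W2 = np.column_stack([np.ones(n), x_hat, V])   # second-stage regressors
    beta = np.linalg.lstsq(W2, Y, rcond=None)[0]
    # Residuals use the actual (not predicted) endogenous regressor.
    resid = Y - np.column_stack([np.ones(n), X, V]) @ beta
    s2 = resid @ resid / (n - W2.shape[1])
    return beta, s2 * np.linalg.inv(W2.T @ W2)
```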

A.4 The Over-Identified Case

The second case of interest is the over-identified case. The main equation remains

$$Y_i = \beta_0 + \beta_1 X_i + \beta_2' V_i + \varepsilon_i,$$

but now the instrument Zi has dimension K > 1. We continue to assume that the residuals εi are independent of the instruments, with mean zero and variance $\sigma_\varepsilon^2$. This case is the subject of a large literature, and many estimators have been proposed. I will briefly discuss two. For a more detailed discussion, see Wooldridge (2010).


A.5 Two-Stage-Least-Squares

The TSLS approach extends naturally to the setting with multiple instruments. First, estimate the reduced form regression of the endogenous variable Xi on the instruments Zi and the exogenous variables Vi,

$$X_i = \pi_{20} + \pi_{21}' Z_i + \pi_{22}' V_i + \varepsilon_{2i},$$

by least squares. Next, calculate the predicted value,

$$\hat{X}_i = \hat\pi_{20} + \hat\pi_{21}' Z_i + \hat\pi_{22}' V_i.$$

Finally, regress the outcome on the predicted value from this regression:

$$Y_i = \beta_0 + \beta_1 \hat{X}_i + \beta_2' V_i + \eta_i.$$

The fact that the dimension of the instrument Zi is greater than one does not affect the mechanics of the procedure.

To illustrate this, consider the Graddy Fulton Fish Market data. Instead of simply using the binary indicator stormy/not-stormy as the instrument, we can use the trivalued weather indicator, stormy/mixed/fair, to generate two instruments. This leads to TSLS estimates equal to

$$\hat\beta_{1,\text{tsls}} = -1.014 \quad (\text{s.e. } 0.384).$$

A.6 Limited-Information-Maximum-Likelihood

The second most popular estimator in this over-identified setting is the limited-information-maximum-likelihood (LIML) estimator, originally proposed by Anderson and Rubin (1949) in the statistics literature. The likelihood is based on joint normality of the joint endogenous variables $(Y_i, X_i)'$, given the instruments and exogenous variables $(Z_i, V_i)$:

$$\begin{pmatrix} Y_i \\ X_i \end{pmatrix} \Bigg|\, Z_i, V_i \sim \mathcal{N}\left( \begin{pmatrix} \pi_{10} + \beta_1 \pi_{21}' Z_i + \pi_{12}' V_i \\ \pi_{20} + \pi_{21}' Z_i + \pi_{22}' V_i \end{pmatrix}, \Omega \right).$$

The LIML estimator can be expressed in terms of some eigenvalue calculations, so that it is computationally fairly simple, though more complicated than the TSLS estimator, which only requires matrix inversion. Although motivated by a normal-distribution-based likelihood function, the LIML estimator is consistent under much weaker conditions, as long as $(\varepsilon_{1i}, \varepsilon_{2i})'$ are independent of $(Z_i, V_i)$ and the model (A.1) is correct with εi independent of $(Z_i, V_i)$.
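To make the eigenvalue computation concrete, here is a minimal sketch of LIML in its k-class form for the case without additional covariates (an illustration following the standard textbook formulas, not a production implementation):

```python
import numpy as np

def liml(Y, X, Z):
    """LIML for Y = b0 + b1 X + eps with instrument matrix Z (n, K) and no
    additional covariates, in its k-class form (kappa = 1 gives TSLS)."""
    n = len(Y)
    y, x = Y - Y.mean(), X - X.mean()            # partial out the constant
    Zc = Z - Z.mean(axis=0)
    P = Zc @ np.linalg.solve(Zc.T @ Zc, Zc.T)    # projection on instruments
    M = np.eye(n) - P
    W = np.column_stack([y, x])
    # kappa: smallest eigenvalue of (W' M W)^{-1} (W' W); kappa >= 1.
    kappa = np.linalg.eigvals(
        np.linalg.solve(W.T @ M @ W, W.T @ W)).real.min()
    A = np.eye(n) - kappa * M
    return (x @ A @ y) / (x @ A @ x)
```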

Both the TSLS and LIML estimators are consistent and asymptotically normally distributed with the same variance. In the just-identified case, the two estimators are numerically identical. The variance can be estimated as in the just-identified case, as

$$\hat{\mathbb{V}} = \hat\sigma_\varepsilon^2 \cdot \left( \sum_{i=1}^N \begin{pmatrix} 1 \\ \hat{X}_i \\ V_i \end{pmatrix} \begin{pmatrix} 1 \\ \hat{X}_i \\ V_i \end{pmatrix}' \right)^{-1}.$$

In practice, there can be substantial differences between the TSLS and LIML estimators when the instruments are weak (see Section 7.5) or when there are many instruments (see Section 7.6), that is, when the degree of overidentification is high.

For the fish data, the LIML estimates are

$$\hat\beta_{1,\text{liml}} = -1.016 \quad (\text{s.e. } 0.384).$$

A.7 Testing the Over-Identifying Restrictions

The indirect least squares procedure does not work well in the case with multiple instruments. The two reduced form regressions are

$$X_i = \pi_{20} + \pi_{21}' Z_i + \pi_{22}' V_i + \varepsilon_{2i}$$

and

$$Y_i = \pi_{10} + \pi_{11}' Z_i + \pi_{12}' V_i + \varepsilon_{1i}.$$

If the model is correctly specified, the K-component vector π11 should be equal to β1 · π21. However, there is nothing in the reduced form estimates that imposes proportionality of the estimates. In principle, we can use any element of the K-component vector of ratios $\hat\pi_{11,k}/\hat\pi_{21,k}$ as an estimator for β1. If the assumption that ε1i is independent of Zi is true for each component of the instrument, all estimators will estimate the same object, and differences between them should be due to sampling variation. Comparisons of these K estimators can therefore be used to test the assumption that all instruments are valid.
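A simple version of this idea in code (my illustration; a formal test such as Sargan's would weight the comparisons appropriately):

```python
import numpy as np

def per_instrument_estimates(Y, X, Z):
    """One IV estimate per column of the instrument matrix Z (n, K):
    the ratio of reduced-form to first-stage covariances. Disparities
    large relative to sampling noise suggest some instruments are invalid
    (or, with heterogeneous effects, different complier populations)."""
    return np.array([np.cov(Y, Z[:, k])[0, 1] / np.cov(X, Z[:, k])[0, 1]
                     for k in range(Z.shape[1])])
```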

Although such tests have been popular in the econometrics literature, they are also sensitive to the other maintained assumptions in the model, notably linearity in the endogenous regressor and the constant effect assumption. In the local-average-treatment-effect set up from Section 4.5, differences in estimators based on different instruments can simply be due to the fact that the different instruments correspond to different populations of compliers.


ACKNOWLEDGMENTS

Financial support for this research was generously provided through NSF Grants 0820361 and 0961707. I am grateful to Joshua Angrist, who got me interested in these topics many years ago and, over the years, has taught me much about the issues discussed in this manuscript; the editor of Statistical Science for suggesting this review; and three anonymous referees who wrote remarkably thoughtful reviews.

REFERENCES

Abadie, A. (2002). Bootstrap tests for distributional treatment effects in instrumental variable models. J. Amer. Statist. Assoc. 97 284–292. MR1947286

Abadie, A. (2003). Semiparametric instrumental variable estimation of treatment response models. J. Econometrics 113 231–263. MR1960380

Aizer, A. and Doyle, J. (2013). Juvenile incarceration, human capital, and future crime: Evidence from randomly assigned judges. Unpublished working paper, Dept. Economics, Brown Univ., Providence, RI.

Altonji, J. G. and Matzkin, R. L. (2005). Cross section and panel data estimators for nonseparable models with endogenous regressors. Econometrica 73 1053–1102. MR2149241

Anderson, T. W. and Rubin, H. (1949). Estimation of the parameters of a single equation in a complete system of stochastic equations. Ann. Math. Statistics 20 46–63. MR0028546

Andrews, D. W. K., Moreira, M. J. and Stock, J. H. (2006). Optimal two-sided invariant similar tests for instrumental variables regression. Econometrica 74 715–752. MR2217614

Andrews, D. and Stock, J. (2007). Inference with weak instruments. In Advances in Economics and Econometrics: Theory and Applications, Ninth World Congress, Vol. III (R. Blundell, W. Newey and T. Persson, eds.) 122–173. Cambridge Univ. Press, Cambridge.

Angrist, J. (1990). Lifetime earnings and the Vietnam era draft lottery: Evidence from social security administrative records. American Economic Review 80 313–335.

Angrist, J., Graddy, K. and Imbens, G. (2000). The interpretation of instrumental variables estimators in simultaneous equations models with an application to the demand for fish. Rev. Econom. Stud. 67 499–527.

Angrist, J. D. and Imbens, G. W. (1995). Two-stage least squares estimation of average causal effects in models with variable treatment intensity. J. Amer. Statist. Assoc. 90 431–442. MR1340501

Angrist, J., Imbens, G. and Rubin, D. (1996). Identification of causal effects using instrumental variables (with discussion). J. Amer. Statist. Assoc. 91 444–472.

Angrist, J. and Krueger, A. (1991). Does compulsory school attendance affect schooling and earnings. Quarterly Journal of Economics 106 979–1014.

Angrist, J. and Pischke, S. (2009). Mostly Harmless Econometrics. Princeton Univ. Press, Princeton, NJ.

Arellano, M. (2002). Sargan's instrumental variables estimation and the generalized method of moments. J. Bus. Econom. Statist. 20 450–459. MR1973797

Athey, S. and Stern, S. (1998). An empirical framework for testing theories about complementarity in organizational design. Working Paper 6600, NBER.

Baiocchi, M., Small, D. S., Lorch, S. and Rosenbaum, P. R. (2010). Building a stronger instrument in an observational study of perinatal care for premature infants. J. Amer. Statist. Assoc. 105 1285–1296. MR2796550

Balke, A. and Pearl, J. (1995). Counterfactuals and policy analysis in structural models. In Uncertainty in Artificial Intelligence 11 (P. Besnard and S. Hanks, eds.) 11–18. Morgan Kaufmann, San Francisco, CA.

Balke, A. and Pearl, J. (1997). Bounds on treatment effects from studies with imperfect compliance. J. Amer. Statist. Assoc. 92 1171–1176.

Barnow, B. S., Cain, G. G. and Goldberger, A. S. (1980). Issues in the analysis of selectivity bias. In Evaluation Studies, Vol. 5 (E. Stromsdorfer and G. Farkas, eds.). Sage, San Francisco, CA.

Basmann, R. (1963a). The causal interpretation of non-triangular systems of economic relations. Econometrica 31 439–448.

Basmann, R. (1963b). On the causal interpretation of non-triangular systems of economic relations: A rejoinder. Econometrica 31 451–453.

Basmann, R. (1965). Causal systems and stability: Reply to R. W. Clower. Econometrica 33 242–243.

Bekker, P. A. (1994). Alternative approximations to the distributions of instrumental variable estimators. Econometrica 62 657–681. MR1281697

Bekker, P. A. and van der Ploeg, J. (2005). Instrumental variable estimation based on grouped data. Statist. Neerlandica 59 239–267. MR2189771

Benkard, L. and Berry, S. (2006). On the nonparametric identification of nonlinear simultaneous equations models: Comment on Brown (1983) and Roehrig (1988). Econometrica 74 1429–1440.

Bound, J., Jaeger, D. and Baker, R. (1995). Problems with instrumental variables estimation when the correlation between the instruments and the endogenous explanatory variable is weak. J. Amer. Statist. Assoc. 90 443–450.

Bowden, R. J. and Turkington, D. A. (1984). Instrumental Variables. Econometric Society Monographs in Quantitative Economics 8. Cambridge Univ. Press, Cambridge. MR0798790

Brookhart, M., Wang, P., Solomon, D. and Schneeweiss, S. (2006). Evaluating short-term drug effects using a physician-specific prescribing preference as an instrumental variable. Epidemiology 17 268–275.

Brown, B. W. (1983). The identification problem in systems nonlinear in the variables. Econometrica 51 175–196. MR0694456

Card, D. (1995). Using geographic variation in college proximity to estimate the return to schooling. In Aspects of Labor Market Behaviour: Essays in Honour of John Vanderkamp (L. N. Christofides, E. K. Grant and R. Swidinsky, eds.). Univ. Toronto Press, Toronto.

Card, D. (2001). Estimating the return to schooling: Progress on some persistent econometric problems. Econometrica 69 1127–1160.

Chalak, K. (2011). Identification of local treatment effects using a proxy for an instrument. Unpublished manuscript, Dept. Economics, Boston College, Boston, MA.

Chamberlain, G. and Imbens, G. (2004). Random effects estimators with many instrumental variables. Econometrica 72 295–306. MR2031020

Chao, J. C. and Swanson, N. R. (2005). Consistent estimation with a large number of weak instruments. Econometrica 73 1673–1692. MR2156676

Chernozhukov, V., Hong, H. and Tamer, E. (2007). Estimation and confidence regions for parameter sets in econometric models. Econometrica 75 1243–1284. MR2347346

Chesher, A. (2003). Identification in nonseparable models. Econometrica 71 1405–1441. MR2000252

Chesher, A. (2010). Instrumental variable models for discrete outcomes. Econometrica 78 575–601. MR2656640

Christ, C. (1994). The Cowles Commission's contributions to econometrics at Chicago, 1939–1955. Journal of Economic Literature 32 30–59.

Cochran, W. G. (1968). The effectiveness of adjustment by subclassification in removing bias in observational studies. Biometrics 24 295–313. MR0228136

Cochran, W. and Rubin, D. (1973). Controlling bias in observational studies: A review. Sankhya 35 417–446.

Cook, T. D. (2008). "Waiting for life to arrive": A history of the regression-discontinuity design in psychology, statistics and economics. J. Econometrics 142 636–654. MR2416822

Cox, D. R. (1992). Causality: Some statistical aspects. J. Roy. Statist. Soc. Ser. A 155 291–301. MR1157712

Crépon, B., Duflo, E., Gurgand, M., Rathelot, M. and Zamoray, P. (2012). Do labor market policies have displacement effects? Evidence from a clustered randomized experiment. Unpublished manuscript.

Dawid, P. (1984). Causal inference from messy data. Comment on 'On the nature and discovery of structure'. J. Amer. Statist. Assoc. 79 22–24.

Deaton, A. (2010). Instruments, randomization, and learning about development. Journal of Economic Literature 48 424–455.

Dobbie, W. and Song, J. (2013). Debt relief and debtor outcomes: Measuring the effects of consumer bankruptcy protection. Unpublished working paper, Dept. Economics, Princeton Univ., Princeton, NJ.

Duflo, E., Glennester, R. and Kremer, M. (2007). Using randomization in development economics research: A toolkit. In Handbook of Development Economics, Vol. 4 (T. P. Schultz and J. Strauss, eds.) 3895–3962. North-Holland, Amsterdam.

Fisher, R. A. (1925). The Design of Experiments, 1st ed. Oliver & Boyd, London.

Frangakis, C. E. and Rubin, D. B. (2002). Principal stratification in causal inference. Biometrics 58 21–29. MR1891039

Freedman, D. A. (2006). Statistical models for causation: What inferential leverage do they provide? Eval. Rev. 30 691–713.

Frumento, P., Mealli, F., Pacini, B. and Rubin, D. B. (2012). Evaluating the effect of training on wages in the presence of noncompliance, nonemployment, and missing outcome data. J. Amer. Statist. Assoc. 107 450–466. MR2980057

Gelman, A. and Hill, J. (2006). Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge Univ. Press, Cambridge.

Gill, R. D. and Robins, J. M. (2001). Causal inference for complex longitudinal data: The continuous case. Ann. Statist. 29 1785–1811. MR1891746

Giraud, G. (2003). Strategic market games: An introduction. J. Math. Econom. 39 355–375. MR1996481

Graddy, K. (1995). Who pays more? Essays on bargaining and price discrimination. Ph.D. thesis, Dept. Economics, Princeton Univ., Princeton, NJ.

Graddy, K. (1996). Testing for imperfect competition at the Fulton fish market. RAND Journal of Economics 26 75–92.

Greene, W. (2011). Econometric Analysis, 7th ed. Prentice Hall, Upper Saddle River, NJ.

Greenland, S. (2000). An introduction to instrumental variables for epidemiologists. International Journal of Epidemiology 29 722–729.

Griliches, Z. (1977). Estimating the returns to schooling: Some econometric problems. Econometrica 45 1–22.

Haavelmo, T. (1943). The statistical implications of a system of simultaneous equations. Econometrica 11 1–12. MR0007954

Haavelmo, T. (1944). The probability approach in econometrics. Econometrica 12 (Supplement) 118 pages. MR0010953

Hahn, J. (2002). Optimal inference with many instruments. Econometric Theory 18 140–168. MR1885354

Hahn, J., Todd, P. and Van der Klaauw, W. (2001). Identification and estimation of treatment effects with a regression-discontinuity design. Econometrica 69 201–209.

Hansen, C., Hausman, J. and Newey, W. (2008). Estimation with many instrumental variables. J. Bus. Econom. Statist. 26 398–422. MR2459342

Hausman, J. (1983). Specification and estimation of simultaneous equations models. In Handbook of Econometrics, Vol. 1 (Z. Grilliches and M. D. Intrilligator, eds.). North-Holland, Amsterdam.

Hayashi, F. (2000). Econometrics. Princeton Univ. Press, Princeton, NJ. MR1881537

Hearst, N., Newman, T. and Hulley, S. (1986). Delayed effects of the military draft on mortality: A randomized natural experiment. N. Engl. J. Med. 314 620–624.

Heckman, J. (1976). The common structure of statistical models of truncation, sample selection and limited dependent variables and a simple estimator for such models. Annals of Economic and Social Measurement 5 475–492.

Heckman, J. (1979). Sample selection bias as a specification error. Econometrica 47 153–161. MR0518832

Heckman, J. (1990). Varieties of selection bias. American Economic Review: Papers and Proceedings 80 313–318.


Heckman, J. and Robb, R. (1985). Alternative methods for evaluating the impact of interventions. In Longitudinal Analysis of Labor Market Data (J. Heckman and B. Singer, eds.). Cambridge Univ. Press, Cambridge.

Hendry, D. and Morgan, M. (1992). The Foundations of Econometric Analysis. Cambridge Univ. Press, Cambridge.

Hernán, M. and Robins, J. (2006). Instruments for causal inference: An epidemiologist's dream? Epidemiology 17 360–372.

Hillier, G. H. (1990). On the normalization of structural equations: Properties of direction estimators. Econometrica 58 1181–1194. MR1079413

Hirano, K., Imbens, G., Rubin, D. and Zhou, X. (2000). Identification and estimation of local average treatment effects. Biostatistics 1 69–88.

Hoderlein, S. and Mammen, E. (2007). Identification of marginal effects in nonseparable models without monotonicity. Econometrica 75 1513–1518. MR2347352

Holland, P. W. (1986). Statistics and causal inference. J. Amer. Statist. Assoc. 81 945–970. MR0867618

Holland, P. (1988). Causal inference, path analysis, and recursive structural equations models. In Sociological Methodology, Chapter 13. American Sociological Association, Washington, DC.

Horowitz, J. L. (2011). Applied nonparametric instrumental variables estimation. Econometrica 79 347–394. MR2809374

Horowitz, J. L. and Lee, S. (2007). Nonparametric instrumental variables estimation of a quantile regression model. Econometrica 75 1191–1208. MR2333498

Imbens, G. (1997). Book review of 'The foundations of econometric analysis,' by David Hendry and Mary Morgan. J. Appl. Econometrics 12 91–94.

Imbens, G. W. (2000). The role of the propensity score in estimating dose-response functions. Biometrika 87 706–710. MR1789821

Imbens, G. (2004). Nonparametric estimation of average treatment effects under exogeneity: A review. Rev. Econom. Statist. 86 1–29.

Imbens, G. (2007). Nonadditive models with endogenous regressors. In Advances in Economics and Econometrics: Theory and Applications, Ninth World Congress, Vol. III (R. Blundell, W. Newey and T. Persson, eds.) 17–46. Cambridge Univ. Press, Cambridge.

Imbens, G. (2010). Better LATE than nothing: Some comments on Deaton (2009) and Heckman and Urzua (2009). Journal of Economic Literature 48 399–423.

Imbens, G. (2014). Matching in practice. Journal of Human Resources. To appear.

Imbens, G. and Angrist, J. (1994). Identification and estimation of local average treatment effects. Econometrica 61 467–476.

Imbens, G. W. and Newey, W. K. (2009). Identification and estimation of triangular simultaneous equations models without additivity. Econometrica 77 1481–1512. MR2561069

Imbens, G. W. and Rosenbaum, P. R. (2005). Robust, accurate confidence intervals with a weak instrument: Quarter of birth and education. J. Roy. Statist. Soc. Ser. A 168 109–126. MR2113230

Imbens, G. W. and Rubin, D. B. (1997a). Bayesian inference for causal effects in randomized experiments with noncompliance. Ann. Statist. 25 305–327. MR1429927

Imbens, G. W. and Rubin, D. B. (1997b). Estimating outcome distributions for compliers in instrumental variables models. Rev. Econom. Stud. 64 555–574. MR1485828

Imbens, G. and Rubin, D. (2014). Causal Inference for Statistics, Social and Biomedical Sciences: An Introduction. Cambridge Univ. Press, Cambridge.

Imbens, G. and Wooldridge, J. (2009). Recent developments in the econometrics of program evaluation. Journal of Economic Literature 47 5–86.

Kitagawa, T. (2009). Identification region of the potential outcome distributions under instrument independence. Manuscript, Dept. Economics, Univ. College London.

Kleibergen, F. (2002). Pivotal statistics for testing structural parameters in instrumental variables regression. Econometrica 70 1781–1803. MR1925156

Kolesár, M., Chetty, R., Friedman, J. N., Glaeser, E. and Imbens, G. W. (2013). Identification and inference with many invalid instruments. Unpublished manuscript.

Kunitomo, N. (1980). Asymptotic expansions of the distributions of estimators in a linear functional relationship and simultaneous equations. J. Amer. Statist. Assoc. 75 693–700. MR0590703

Lauritzen, S. L. and Richardson, T. S. (2002). Chain graph models and their causal interpretations. J. R. Stat. Soc. Ser. B Stat. Methodol. 64 321–361. MR1924296

Leamer, E. (1981). Is it a demand curve, or is it a supply curve? Partial identification through inequality constraints. Rev. Econom. Statist. 63 319–327.

Little, R. (1985). A note about models for selectivity bias. Econometrica 53 1469–1474.

Little, R. J. A. and Rubin, D. B. (1987). Statistical Analysis with Missing Data. Wiley, New York. MR0890519

Little, R. and Yau, L. (1998). Statistical techniques for analyzing data from prevention trials: Treatment of no-shows using Rubin's causal model. Psychological Methods 3 147–159.

Manski, C. (1990). Nonparametric bounds on treatment effects. American Economic Review: Papers and Proceedings 80 319–323.

Manski, C. (1995). Identification Problems in the Social Sciences. Harvard Univ. Press, Cambridge.

Manski, C. (2000a). Economic analysis of social interactions. Journal of Economic Perspectives 14 115–136.

Manski, C. (2000b). Identification problems and decisions under ambiguity: Empirical analysis of treatment response and normative analysis of treatment choice. J. Econometrics 95 415–442.

Manski, C. (2001). Designing programs for heterogenous populations: The value of covariate information. American Economic Review: Papers and Proceedings 91 103–106.

Manski, C. F. (2002). Treatment choice under ambiguity induced by inferential problems. J. Statist. Plann. Inference 105 67–82. MR1911559

Manski, C. F. (2003). Partial Identification of Probability Distributions. Springer, New York. MR2151380


Manski, C. F. (2004). Statistical treatment rules forheterogeneous populations. Econometrica 72 1221–1246.MR2064712

Manski, C. F. (2005). Social Choice with Partial Knowledgeof Treatment Response. Princeton Univ. Press, Princeton,NJ. MR2178946

Manski, C. (2007). Identification for Prediction and Deci-sion. Princeton Univ. Press, Princeton, NJ.

Manski, C. and Nagin, D. (1998). Bounding disagreementsabout treatment effects: A case study of sentencing andrecidivism. Sociological Methodology 28 99–137.

Manski, C. F. and Pepper, J. V. (2000). Monotone in-strumental variables: With an application to the returnsto schooling. Econometrica 68 997–1010. MR1771587

Manski, C., Sandefur, G., McLanahan, S. and Pow-

ers, D. (1992). Alternative estimates of the effect of fam-ily structure during adolescence on high school. J. Amer.Statist. Assoc. 87 25–37.

Martens, E., Pestman, W., de Boer, A., Belitser, S.

and Klungel, O. (2006). Instrumental variables: Appli-cation and limitations. Epidemiology 17 260–267.

Matzkin, R. L. (2003). Nonparametric estimation of non-additive random functions. Econometrica 71 1339–1375.MR2000250

Matzkin, R. (2007). Nonparametric identification. In Hand-book of Econometrics 6B (J. Heckman and E. Leamer,eds.). North-Holland, Amsterdam.

Matzkin, R. L. (2008). Identification in nonparametric si-multaneous equations models. Econometrica 76 945–978.MR2455118

McClellan, M. and Newhouse, J. P. (1994). Does moreintensive treatment of acute myocardial infarction in theelderly reduce mortality. Journal of the American MedicalAssociation 272 859–866.

McDonald, C., Hiu, S. and Tierney, W. (1992). Effects ofcomputer reminders for influenza vaccination on morbidityduring influenza epidemics. MD Computing 9 304–312.

Moreira, M. J. (2003). A conditional likelihood ratiotest for structural models. Econometrica 71 1027–1048.MR1995822

Morgan, S. and Winship, C. (2007). Counterfactuals andCausal Inference. Cambridge Univ. Press, Cambridge.

Morimune, K. (1983). Approximate distributions of k-classestimators when the degree of overidentifiability is largecompared with the sample size. Econometrica 51 821–841.MR0712372

Newey, W. K. and Powell, J. L. (2003). Instrumentalvariable estimation of nonparametric models. Economet-rica 71 1565–1578. MR2000257

Pearl, J. (2000). Causality: Models, Reasoning, and Infer-ence. Cambridge Univ. Press, Cambridge. MR1744773

Pearl, J. (2011). Principal stratification—A goal or a tool?Int. J. Biostat. 7 Art. 20. MR2787410

Permutt, T. and Hebel, J. (1989). Simultaneous-equationestimation in a clinical trial of the effect of smoking onbirth weight. Biometrics 45 619–622.

Philipson, T. (1997a). The evaluation of new health caretechnology: The labor economics of statistics. J. Econo-metrics 76 375–396.

Philipson, T. (1997b). Data markets and the production ofsurveys. Rev. Econom. Stud. 64 47–73.

Philipson, T. and DeSimone, J. (1997). Experiments andsubject sampling. Biometrika 84 619–630. MR1603928

Philipson, T. and Hedges, L. (1998). Subject evaluation insocial experiments. Econometrica 66 381–408.

Phillips, P. C. B. (1989). Partially identified econometricmodels. Econometric Theory 5 181–240. MR1006540

Plott, C. and Smith, V. (1987). An experimental examina-tion of two exchange institutions. Rev. Econom. Stud. 45133–153.

Pratt, J. and Shlaifer, R. (1984). On the nature and dis-covery of structure. J. Amer. Statist. Assoc. 79 9–21.

Ramsahai, R. R. and Lauritzen, S. L. (2011). Likeli-hood analysis of the binary instrumental variable model.Biometrika 98 987–994. MR2860338

Richardson, T., Evans, R. and Robins, J. (2011). Trans-parent parametrizations of models for potential outcomes.In Bayesian Statistics 9 (B. Bayarri, D. Berger andS. Heckerman, eds.). Oxford Univ. Press, Oxford.

Richardson, T. and Robins, J. (2013). Single World Inter-vention Graphs (SWIGs): A unification of the counterfac-tual and graphical approaches to causality. Working Paper128, Center for Statistics and the Social Sciences, Univ.Washington, Seattle, WA.

Robins, J. (1986). A new approach to causal inferencein mortality studies with a sustained exposure period—Application to control of the healthy worker survivor effect.Math. Modelling 7 1393–1512. MR0877758

Robins, J. M. (1989). The analysis of randomized and non-randomized AIDS treatment trials using a new approach tocausal inference in longitudinal studies. In Health ServiceResearch Methodology: A Focus on AIDS (L. Sechrest, H.Freeman and A. Bailey, eds.). NCHSR, U.S. Public HealthService, Washington, DC.

Robins, J. M. (1994). Correcting for non-compliance inrandomized trials using structural nested mean models.Comm. Statist. Theory Methods 23 2379–2412. MR1293185

Robins, J. and Greenland, S. (1996). Comment on: Identification of causal effects using instrumental variables. J. Amer. Statist. Assoc. 91 456–468.

Robins, J. and Rotnitzky, A. (2004). Estimation of treatment effects in randomised trials with non-compliance and a dichotomous outcome using structural mean models. Biometrika 91 763–783. MR2126032

Roehrig, C. S. (1988). Conditions for identification in nonparametric and parametric models. Econometrica 56 433–447. MR0935634

Rosenbaum, P. (1996). Comment on: Identification of causal effects using instrumental variables. J. Amer. Statist. Assoc. 91 465–468.

Rosenbaum, P. R. (2002). Observational Studies, 2nd ed. Springer, New York. MR1899138

Rosenbaum, P. R. (2010). Design of Observational Studies. Springer, New York. MR2561612

Rosenbaum, P. R. and Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika 70 41–55. MR0742974

Roy, A. (1951). Some thoughts on the distribution of earnings. Oxford Economic Papers 3 135–146.

Rubin, D. B. (1974). Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology 66 688–701.

Rubin, D. B. (1976). Inference and missing data. Biometrika 63 581–592. MR0455196

Rubin, D. B. (1978). Bayesian inference for causal effects: The role of randomization. Ann. Statist. 6 34–58. MR0472152

Rubin, D. B. (1987). Multiple Imputation for Nonresponse in Surveys. Wiley, New York. MR0899519

Rubin, D. B. (1990). Formal modes of statistical inference for causal effects. J. Statist. Plann. Inference 25 279–292.

Rubin, D. B. (1996). Multiple imputation after 18+ years. J. Amer. Statist. Assoc. 91 473–489.

Rubin, D. B. (2006). Matched Sampling for Causal Effects (D. B. Rubin, ed.). Cambridge Univ. Press, Cambridge. MR2307965

Rubin, D. B. and Thomas, N. (1992). Affinely invariant matching methods with ellipsoidal distributions. Ann. Statist. 20 1079–1093. MR1165607

Sargan, J. D. (1958). The estimation of economic relationships using instrumental variables. Econometrica 26 393–415. MR0110567

Shapley, L. and Shubik, M. (1977). Trade using one commodity as a means of payment. Journal of Political Economy 85 937–968.

Small, D. S. (2007). Sensitivity analysis for instrumental variables regression with overidentifying restrictions. J. Amer. Statist. Assoc. 102 1049–1058. MR2411664

Smith, V. (1982). Markets as economizers of information: Experimental examination of the Hayek hypothesis. Economic Inquiry 20 165–179.

Sommer, A. and Zeger, S. (1991). On estimating efficacy from clinical trials. Stat. Med. 10 45–52.

Splawa-Neyman, J. (1990). On the application of probability theory to agricultural experiments. Essay on principles. Section 9. Statist. Sci. 5 465–472. Translated from the Polish and edited by D. M. Dabrowska and T. P. Speed. MR1092986

Staiger, D. and Stock, J. H. (1997). Instrumental variables regression with weak instruments. Econometrica 65 557–586. MR1445622

Stock, J. and Trebbi, F. (2003). Who invented instrumental variable regression? Journal of Economic Perspectives 17 177–194.

Stock, J. and Watson, M. (2010). Introduction to Econometrics, 3rd ed. Addison-Wesley, Reading, MA.

Strotz, R. H. (1960). Interdependence as a specification error. Econometrica 28 428–442. MR0120035

Strotz, R. H. and Wold, H. O. A. (1960). Recursive vs. nonrecursive systems: An attempt at synthesis. Econometrica 28 417–427. MR0120034

Strotz, R. and Wold, H. (1965). The causal interpretability of structural parameters: A reply. Econometrica 31 449–450.

Tan, Z. (2006). Regression and weighting methods for causal inference using instrumental variables. J. Amer. Statist. Assoc. 101 1607–1618. MR2279483

Tan, Z. (2010). Marginal and nested structural models using instrumental variables. J. Amer. Statist. Assoc. 105 157–169. MR2757199

Thistlethwaite, D. and Campbell, D. (1960). Regression-discontinuity analysis: An alternative to the ex post facto experiment. Journal of Educational Psychology 51 309–317.

Tinbergen, J. (1930). Bestimmung und Deutung von Angebotskurven. Ein Beispiel. Zeitschrift für Nationalökonomie 1 669–679. Translated as: Determination and interpretation of supply curves. An example. In The Foundations of Econometric Analysis (D. Hendry and M. Morgan, eds.) 233–245. Cambridge Univ. Press, Cambridge.

Van der Laan, M. J. and Robins, J. M. (2003). Unified Methods for Censored Longitudinal Data and Causality. Springer, New York. MR1958123

Vansteelandt, S., Bowden, J., Babanezhad, M. and Goetghebeur, E. (2011). On instrumental variables estimation of causal odds ratios. Statist. Sci. 26 403–422. MR2917963

Vansteelandt, S. and Goetghebeur, E. (2003). Causal inference with generalized structural mean models. J. R. Stat. Soc. Ser. B Stat. Methodol. 65 817–835. MR2017872

Wold, H. O. A. (1960). A generalization of causal chain models. Econometrica 28 443–463. MR0120036

Wooldridge, J. (2008). Introductory Econometrics. South-Western College Pub., New York.

Wooldridge, J. M. (2010). Econometric Analysis of Cross Section and Panel Data, 2nd ed. MIT Press, Cambridge, MA. MR2768559

Working, E. (1927). What do statistical ‘demand curves’ show? Quarterly Journal of Economics 41 212–235.

Wright, P. (1928). The Tariff on Animal and Vegetable Oils. MacMillan, New York.

Yau, L. H. Y. and Little, R. J. (2001). Inference for the complier-average causal effect from longitudinal data subject to noncompliance and missing data, with application to a job training assessment for the unemployed. J. Amer. Statist. Assoc. 96 1232–1244. MR1973667

Zelen, M. (1979). A new design for randomized clinical trials. N. Engl. J. Med. 300 1242–1245.

Zelen, M. (1990). Randomized consent designs for clinical trials: An update. Stat. Med. 9 645–656.

Zhang, J. L., Rubin, D. B. and Mealli, F. (2009). Likelihood-based analysis of causal effects of job-training programs using principal stratification. J. Amer. Statist. Assoc. 104 166–176. MR2663040