Ec1123 Section 7: Instrumental Variables
Andrea Passalacqua
Harvard
November 16th, 2017
Andrea Passalacqua (Harvard) Ec1123 Section 7 Instrumental Variables November 16th, 2017 1 / 28
Outline
1 Simultaneous Causality
2 Instrumental Variable Regression: Introduction
Conditions
3 IV: Examples
4 Two-Stage Least Squares
5 Testing the Validity of Instruments
Simultaneous causality: What is that?
$Y_i = \beta_0 + \beta_1 X_i + \beta_2 W_{1i} + \beta_3 W_{2i} + u_i$
Simultaneous causality arises when
X → Y and X ← Y
OLS will fail because conditional mean independence (CMI) is violated.
  - Recall CMI: $E[u \mid X, W_1, W_2] = E[u \mid W_1, W_2]$. Conditional on $W_1$ and $W_2$, X is as good as randomly assigned.
However, it is not as “fixable” as OVB (i.e., when $E[u \mid X] \neq 0$), since adding controls won’t address the fact that X ← Y
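To see why no set of controls fixes this, here is a minimal two-equation sketch. The feedback equation and the symbols $\delta_0$, $\delta_1$, $e_i$ are assumptions added for illustration (they are not in the slides), the controls are dropped for brevity, and we assume $\mathrm{Cov}(u_i, e_i) = 0$ and $\delta_1 \beta_1 \neq 1$:

```latex
\begin{align*}
Y_i &= \beta_0 + \beta_1 X_i + u_i \\
X_i &= \delta_0 + \delta_1 Y_i + e_i \\
\Rightarrow\; X_i &= \frac{\delta_0 + \delta_1 \beta_0 + \delta_1 u_i + e_i}{1 - \delta_1 \beta_1} \\
\Rightarrow\; \mathrm{Cov}(X_i, u_i) &= \frac{\delta_1 \, \mathrm{Var}(u_i)}{1 - \delta_1 \beta_1} \neq 0
\quad \text{whenever } \delta_1 \neq 0
\end{align*}
```

Under these assumptions X is correlated with u by construction whenever Y feeds back into X, regardless of which W variables are included.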
Examples of simultaneous causality
$\text{QUANTITY}_i = \beta_0 + \beta_1 \text{PRICE}_i + u_i$
Changes in price affect quantities supplied and demanded
Changes in quantity supplied and demanded affect price
Price and quantity are jointly determined by a set of simultaneous equations (i.e., the demand and supply curves)
Other examples:
Democracy and growth
Parental involvement and child’s performance in school
Police and crime
Instrumental Variables
Instrumental Variables (IV) are useful for estimating models
with measurement error
with simultaneous causality, or
with omitted variable bias
  - IV is especially useful when we cannot plausibly control for all omitted variables
More generally, whenever conditional mean independence on X fails
  - Our estimated coefficient is biased and cannot be interpreted causally
  - IV relies on using a valid instrumental variable Z to recover as-if random assignment of X
Instrumental Variables
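$Y_i = \beta_0 + \beta_1 X_i + u_i$
We want to know the causal effect of X on Y
[Figure: diagram linking $X_i$ and $Y_i$, with additional nodes $S_{1i}$, $W_{1i}$, and $W_{2i}$]
but we are confounded by OVB and simultaneous causality → the OLS estimate of $\beta_1$ is biased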
Conditions for IV: Intuition
$Y_i = \beta_0 + \beta_1 X_i + u_i$
Z is an instrumental variable for X if:
Condition 1: Relevance
Z is related to X
Condition 2: Exogeneity of Z
Two ways of saying the Exogeneity condition:
  - Z is as-if randomly assigned
  - The only relationship between Z and Y goes through X
after conditioning on any control variables W.
Instrumental Variables
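$Y_i = \beta_0 + \beta_1 X_i + u_i$
[Figure: the same diagram of $X_i$, $Y_i$, $S_{1i}$, $W_{1i}$, and $W_{2i}$, now with an instrument $Z_{1i}$ added]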
Instrumental Variables
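$Y_i = \beta_0 + \beta_1 X_i + u_i$
[Figure: the same diagram of $X_i$, $Y_i$, $S_{1i}$, $W_{1i}$, and $W_{2i}$, now with two instruments $Z_{1i}$ and $Z_{2i}$]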
There can be multiple instruments Z1 and Z2 for the same X
IV - Instrument Relevance
In the single-variable case:
$Y_i = \beta_0 + \beta_1 X_i + u_i$
Suppose the model fails CMI (recall CMI: $E[u \mid X, W_1, W_2] = E[u \mid W_1, W_2]$), but we have a valid instrument Z:
$\beta_{IV} = \dfrac{\mathrm{Cov}(Y, Z)}{\mathrm{Cov}(X, Z)}$
Notice that clearly we need $\mathrm{Cov}(X, Z) \neq 0$. In fact, in our dataset, we really want $\mathrm{Cov}(X, Z)$ to be far away from zero.
An instrument where $\mathrm{Cov}(X, Z)$ (i.e., the “relevance” of Z) is close to zero is known as a weak instrument.
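The covariance ratio follows in a few lines from the two IV conditions. A short derivation, using only the single-variable model and the linearity of covariance:

```latex
\begin{align*}
\mathrm{Cov}(Y, Z) &= \mathrm{Cov}(\beta_0 + \beta_1 X + u,\; Z) \\
                   &= \beta_1 \, \mathrm{Cov}(X, Z) + \mathrm{Cov}(u, Z) \\
                   &= \beta_1 \, \mathrm{Cov}(X, Z)
                      \quad \text{(exogeneity: } \mathrm{Cov}(u, Z) = 0 \text{)} \\
\Rightarrow\; \beta_1 &= \frac{\mathrm{Cov}(Y, Z)}{\mathrm{Cov}(X, Z)}
                      \quad \text{(relevance: } \mathrm{Cov}(X, Z) \neq 0 \text{)}
\end{align*}
```

In a sample, the population covariances are replaced by sample covariances, which is why a sample $\mathrm{Cov}(X, Z)$ near zero makes the ratio unstable.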
IV - Exogeneity Condition
$Y_i = \beta_0 + \beta_1 X_i + u_i$
All of these would violate Condition 2 (exogeneity):
$Z_i \leftrightarrow Y_i$
$Z_i \leftrightarrow S_{1i}$
$Z_i \leftrightarrow W_{1i}$ and $Z_i \leftrightarrow W_{2i}$
  - But we can control for $W_{1i}$ and $W_{2i}$
In other words, Condition 2 is violated if Z is related to Y (causally or not) through any relationship other than through X.
In the figure, this would be any path from Z to Y that does not go through X.
Instrumental Variables – RECAP
Consider
$Y_i = \beta_0 + \beta_1 X_i + u_i$
where OLS yields a biased estimate of $\beta_1$
Conditions for IV
Z is an instrumental variable for X in this model if:
1 Relevance: Z is related to X
$\mathrm{Corr}(Z, X) \neq 0$
2 Exogeneity: The only relationship between Z and Y goes through X
$\mathrm{Corr}(Z, u) = 0$
Instrumental Variables Example: Roommate Assignment
Suppose we are analyzing the following model among first-semester college freshmen:
$\text{GPA}_i = \beta_0 + \beta_1 \times (\text{Hours Studying})_i + u_i$
$E[u \mid X] \neq 0$ because of omitted variable bias.
Proposed instrumental variable:
Z = whether randomly assigned roommate brought a video game to college
Relevant?
Exogenous?
Instrumental Variables Example: Roommate Assignment
$\text{GPA}_i = \beta_0 + \beta_1 \times (\text{Hours Studying})_i + u_i$
Condition 1: Relevance
Z is related to X, i.e., $\mathrm{Corr}(Z, X) \neq 0$
How to check? Examine the first-stage relationship between Z and X:
First stage: $(\text{Hours Studying})_i = \gamma_0 + \gamma_1 Z_i + v_i$
regress hours videogame
From lecture, $\hat{\gamma}_1 = -0.668^{**}$ (large and significant)
(later) Conduct an F-test to confirm
Instrumental Variables Example: Roommate Assignment
$\text{GPA}_i = \beta_0 + \beta_1 \times (\text{Hours Studying})_i + u_i$
Condition 2: Exogeneity
The only relationship between Z and Y goes through X: $\mathrm{Corr}(Z, u) = 0$
Condition 2 holds if roommates having video games (Z) only affects GPA (Y) through affecting hours studying (X)
Plausible? Are there other channels through which Z is related to Y?
What if being assigned a roommate with video games made students sleep less?
What if male students are more likely to bring video games and males have lower grades on average? (for argument’s sake)
Instrumental Variables Example: Roommate Assignment
The alternate channel from Z to Y does NOT have to be causal to violate Condition 2
More mathematically: hours of sleep and gender lie in the error term u, so we have $\mathrm{Corr}(Z, u) \neq 0$
So control for hours of sleep ($W_{1i}$) and gender ($W_{2i}$):
$\text{GPA}_i = \beta_0 + \beta_1 \times (\text{Hours Studying})_i + \beta_2 W_{1i} + \beta_3 W_{2i} + u_i$
Similar to OVB, we can never directly test whether the exogeneity condition is violated or satisfied since u is unobserved. We can only provide arguments (using theory or institutional knowledge)
Examples of Instrumental Variables
Z                                      X                  Y
How does prenatal health affect a child's long-run development?
In womb during Ramadan                 Prenatal health    Adult health & income
What effect does serving in the military have on future wages?
Military draft lottery #               Military service   Income
What is the effect of rioting on community development?
Rainfall on day of MLK assassination   Number of riots    Long-run property values
Each of these examples also requires some control variables W for the exogeneity condition to hold. In general, arguing the exogeneity condition can be very difficult.
IV in STATA
IV regression of Y on X using instrument Z :
ivregress 2sls y (x = z), robust
IV regression of Y on X using instrument Z and controls W1 and W2:
ivregress 2sls y w1 w2 (x = z), robust
IV regression of Y on X using instruments Z1 and Z2 and controls W1 and W2:
ivregress 2sls y w1 w2 (x = z1 z2), robust
where 2sls stands for Two-Stage Least Squares
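To see these commands at work, the following is a minimal simulation sketch rather than anything from the lecture. The data-generating process, the seed, the unobserved confounder `a`, and all coefficient values are assumptions chosen for illustration; the true effect of `x` on `y` is set to 2.

```stata
* Illustrative simulation: every name and coefficient below is an assumption
clear
set seed 1123
set obs 5000
generate double z = rnormal()                  // instrument, as-if randomly assigned
generate double a = rnormal()                  // unobserved confounder, ends up in u
generate double x = 0.8*z + a + rnormal()      // x is driven by z and by the confounder
generate double y = 1 + 2*x + 3*a + rnormal()  // true effect of x on y is 2

* OLS: biased because x is correlated with the omitted a
regress y x, robust
scalar b_ols = _b[x]

* IV: uses only the variation in x that comes from z
ivregress 2sls y (x = z), robust
scalar b_iv = _b[x]

display "OLS estimate = " b_ols "   IV estimate = " b_iv
```

Under this design the OLS slope should land well above 2 (its probability limit is 2 + 3/2.64, roughly 3.14), while the IV estimate should be close to 2.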
Two-Stage Least Squares (TSLS or 2SLS)
Goal of IV: estimate the causal relationship of X on Y using instrument Z
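$Y_i = \beta_0 + \beta_1 X_i + u_i$
2SLS estimates $\beta_1$ in “two stages”:
Stage 1: Regress X on Z and calculate predicted values $\hat{X}_i$
$\hat{X}_i = \hat{\gamma}_0 + \hat{\gamma}_1 Z_i$
Stage 2: Regress Y on $\hat{X}$ to get $\hat{\beta}_1 = \hat{\beta}_{2SLS} = \hat{\beta}_{IV}$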
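With a single instrument and no controls, the two stages collapse to the covariance-ratio formula. A short derivation, using the standard OLS slope formula and writing $s$ for sample variances and covariances (a notation introduced here for illustration):

```latex
\begin{align*}
\hat{\gamma}_1 &= \frac{s_{XZ}}{s_Z^2}, \qquad
\hat{X}_i = \hat{\gamma}_0 + \hat{\gamma}_1 Z_i \\
\hat{\beta}_{2SLS} &= \frac{s_{Y\hat{X}}}{s_{\hat{X}}^2}
  = \frac{\hat{\gamma}_1 \, s_{YZ}}{\hat{\gamma}_1^2 \, s_Z^2}
  = \frac{s_{YZ}}{\hat{\gamma}_1 \, s_Z^2}
  = \frac{s_{YZ}}{s_{XZ}}
\end{align*}
```

The last expression is the sample analogue of $\mathrm{Cov}(Y, Z) / \mathrm{Cov}(X, Z)$.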
2SLS – Intuition
$Y_i = \beta_0 + \beta_1 X_i + u_i$
We cannot causally interpret the OLS estimate of $\beta_1$ since X is not “randomly assigned”
Think of the variation (not variance) in X as coming from two separate sources:
Variation in X = As-if random part + Non-random part
The non-random part is giving us problems (OVB or simultaneous causality)
Stage 1 of 2SLS isolates the as-if random part of X, which is $\hat{X}$.
  - Since Z is a valid instrument for X in this model, Z is as-if randomly assigned by Condition 2
  - The variation in X related to Z is also as-if randomly assigned
Stage 2 uses the as-if random part to estimate the causal relationship of X on Y. Regressing Y on $\hat{X}$ is like regressing Y on the random variation in X.
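The two stages can also be run by hand to check that they reproduce the `ivregress` point estimate. This is a sketch on simulated data; the data-generating process, the confounder `a`, and the names `xhat`, `b_manual`, and `b_iv` are assumptions for illustration, not part of the lecture.

```stata
* Illustrative simulation: every name and coefficient below is an assumption
clear
set seed 1123
set obs 5000
generate double z = rnormal()
generate double a = rnormal()                  // unobserved confounder
generate double x = 0.8*z + a + rnormal()
generate double y = 1 + 2*x + 3*a + rnormal()  // true effect of x on y is 2

* Stage 1: regress x on z and keep the fitted values
regress x z
predict double xhat, xb

* Stage 2: regress y on the fitted values
regress y xhat
scalar b_manual = _b[xhat]

* One-step 2SLS for comparison
ivregress 2sls y (x = z)
scalar b_iv = _b[x]

display "manual two-stage = " b_manual "   ivregress = " b_iv
```

The two point estimates should agree up to rounding. The standard errors from the hand-run second stage are not valid, because that regression treats `xhat` as data rather than as an estimate; this is one reason to use `ivregress` in practice.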
Testing the Validity of Instruments
$Y_i = \beta_0 + \beta_1 X_i + \beta_2 W_{1i} + \beta_3 W_{2i} + u_i$
Conditions for IV
Z is an instrumental variable for X in this model if, controlling for the W variables:
1 Relevance: $\mathrm{Corr}(Z, X) \neq 0$
2 Exogeneity: $\mathrm{Corr}(Z, u) = 0$
Testing Condition 1 is straightforward, since we have data on both Z and X
Testing Condition 2 is trickier, because we never observe u. In fact, we can only test Condition 2 when we have more instruments Z than endogenous regressors X
Testing Condition 1: Relevance
Condition 1: Relevance
Z must be related to X, i.e., $\mathrm{Corr}(Z, X) \neq 0$
We need the relationship between X and Z to be “meaningfully large”
How to check?
Run first-stage regression with OLS
$X_i = \alpha_0 + \alpha_1 Z_{1i} + \alpha_2 Z_{2i} + \alpha_3 W_{1i} + \alpha_4 W_{2i} + \cdots + v_i$
Check the F-test on all the coefficients on the instruments
$H_0: \alpha_1 = \alpha_2 = 0$
If F > 10, we claim that Z is a strong instrument
If F ≤ 10, we have a weak instruments problem
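The first-stage check can be sketched in Stata as follows. The simulated data, the confounder `a`, the coefficient values, and the scalar name `F_first` are assumptions for illustration; only the variable names `x`, `z1`, `z2`, `w1`, `w2`, and `y` follow the slides.

```stata
* Illustrative simulation: every coefficient below is an assumption
clear
set seed 1123
set obs 5000
generate double w1 = rnormal()
generate double w2 = rnormal()
generate double z1 = rnormal()
generate double z2 = rnormal()
generate double a  = rnormal()                 // unobserved confounder
generate double x  = 0.5*z1 + 0.5*z2 + 0.3*w1 + 0.3*w2 + a + rnormal()
generate double y  = 1 + 2*x + w1 + w2 + 3*a + rnormal()

* First stage by OLS, then a joint F-test on the instruments only
regress x z1 z2 w1 w2, robust
test z1 z2
scalar F_first = r(F)
display "First-stage F on z1 and z2 = " F_first

* The same diagnostic is available after ivregress
ivregress 2sls y w1 w2 (x = z1 z2), robust
estat firststage
```

Note that the F-statistic of interest is the one on the instruments alone (`test z1 z2`), not the overall F reported at the top of the `regress` output, which also includes the controls.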
Testing Condition 2: Exogeneity
Condition 2: Exogeneity of Z
Z is as-if randomly assigned, i.e., $\mathrm{Corr}(Z, u) = 0$
To check exogeneity, we need more instruments Z than endogenous regressors X (i.e., our model is overidentified)
Suppose there is one treatment variable of interest X, multiple instruments Z, and, potentially, multiple control variables W.
$Y_i = \beta_0 + \beta_1 X_i + \beta_2 W_{1i} + \beta_3 W_{2i} + u_i$
Use $Z_1$ to estimate $\beta_1$ and predict the residuals $\hat{u}_i$. If $Z_1$ and $Z_2$ are both valid instruments, then:
$\mathrm{Corr}(Z_2, u) = 0$, so $Z_2$ should be close to uncorrelated with $\hat{u}_i$
This is the basic idea of a J-test
For a model with just one instrument and one endogenous variable, there is no formal test
Testing Condition 2: Exogeneity
J-test for overidentifying restrictions:
$H_0$: Both $Z_1$ and $Z_2$ satisfy the exogeneity condition
$H_a$: Either $Z_1$, $Z_2$, or both are invalid instruments
In STATA:
ivregress 2sls y w1 w2 (x = z1 z2), robust
estat overid
display "J-test = " r(score) " p-value = " r(p_score)
If the p-value < 0.05, then we reject the null hypothesis that all our instruments are valid
But just like an F-test, rejecting the null does not reveal which instrument is invalid, only that at least one fails the exogeneity condition
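As a rough illustration of what the test picks up, the sketch below simulates one outcome for which both instruments are valid and one for which `z2` also affects the outcome directly. The data-generating process, the names `y_valid`, `y_invalid`, `p_valid`, and `p_invalid`, and all coefficients are assumptions; the stored results `r(score)` and `r(p_score)` are those reported by `estat overid` after `ivregress 2sls` with robust standard errors.

```stata
* Illustrative simulation: every name and coefficient below is an assumption
clear
set seed 1123
set obs 5000
generate double z1 = rnormal()
generate double z2 = rnormal()
generate double a  = rnormal()                        // unobserved confounder
generate double x  = 0.5*z1 + 0.5*z2 + a + rnormal()
generate double y_valid   = 1 + 2*x + 3*a + rnormal() // z1, z2 affect y only through x
generate double y_invalid = y_valid + 1.5*z2          // z2 now has a direct effect on y

* Case 1: both instruments satisfy exogeneity
ivregress 2sls y_valid (x = z1 z2), robust
estat overid
scalar p_valid = r(p_score)

* Case 2: z2 violates exogeneity
ivregress 2sls y_invalid (x = z1 z2), robust
estat overid
scalar p_invalid = r(p_score)

display "p-value, valid case = " p_valid "   p-value, invalid case = " p_invalid
```

In the second case the p-value should be far below 0.05. In the first case the test should usually fail to reject, although by construction it will still reject about 5% of the time. The output alone does not indicate that `z2` is the offending instrument.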