Ec1123 Section 7 Instrumental Variables Andrea Passalacqua Harvard University [email protected] November 16th, 2017 Andrea Passalacqua (Harvard) Ec1123 Section 7 Instrumental Variables November 16th, 2017 1 / 28


Ec1123 Section 7 Instrumental Variables

Andrea Passalacqua

Harvard University [email protected]

November 16th, 2017


Outline

1 Simultaneous Causality

2 Instrumental Variable Regression: Introduction

Conditions

3 IV: Examples

4 Two-Stage Least Squares

5 Testing the Validity of Instruments


Simultaneous causality: What is that?

$Y_i = \beta_0 + \beta_1 X_i + \beta_2 W_{1i} + \beta_3 W_{2i} + u_i$

Simultaneous causality arises when X affects Y and Y also affects X:

$X \to Y$ and $X \leftarrow Y$

OLS will fail because conditional mean independence (CMI) is violated.

- Recall CMI: $E[u \mid X, W_1, W_2] = E[u \mid W_1, W_2]$, i.e. conditional on $W_1$ and $W_2$, X is as good as randomly assigned

However, it is not as "fixable" as OVB (i.e. when $E[u \mid X] \neq 0$), since adding controls won't address the fact that $X \leftarrow Y$


Examples of simultaneous causality

$\text{QUANTITY}_i = \beta_0 + \beta_1 \text{PRICE}_i + u_i$

Changes in price affect quantities supplied and demanded

Changes in quantity supplied and demanded affect price

Price and quantity are jointly determined by a set of simultaneous equations (i.e. the demand and supply curves)

Other examples:

Democracy and growth

Parental involvement and child’s performance in school

Police and crime
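The price and quantity example can be illustrated with a small simulation. The sketch below is in Python rather than Stata so that it is self-contained, and every number in it (intercepts, slopes, error variances) is made up for illustration. It draws market-clearing prices and quantities from a toy demand and supply system and then runs OLS of quantity on price. With these particular parameters the OLS slope comes out close to zero, which is neither the demand slope nor the supply slope.

```python
import random


def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x: Cov(x, y) / Var(x)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def simulate_market(n, seed=0):
    """Draw equilibrium (price, quantity) pairs from a toy market.

    Demand: Q = 10 - 1.0 * P + u_d   (true demand slope = -1)
    Supply: Q =  2 + 1.0 * P + u_s   (true supply slope = +1)
    All parameter values are hypothetical.
    """
    rng = random.Random(seed)
    prices, quantities = [], []
    for _ in range(n):
        u_d, u_s = rng.gauss(0, 1), rng.gauss(0, 1)
        # Set demand equal to supply and solve for the market-clearing price.
        p = (10 - 2 + u_d - u_s) / (1.0 + 1.0)
        q = 2 + 1.0 * p + u_s
        prices.append(p)
        quantities.append(q)
    return prices, quantities


prices, quantities = simulate_market(20000)
slope = ols_slope(prices, quantities)
print(f"OLS slope of quantity on price: {slope:.3f}")
print("True demand slope: -1, true supply slope: +1")
```

Because price responds to both the demand shock and the supply shock, price is correlated with the error term in either equation, and no set of controls for other variables would remove that correlation.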




Instrumental Variables

Instrumental Variables (IV) are useful for estimating models

with measurement error,

with simultaneous causality, or

with omitted variable bias

- IV is especially useful when we cannot plausibly control for all omitted variables

More generally, whenever conditional mean independence on X fails

- Our estimated coefficient is biased and cannot be interpreted causally

- IV relies on using a valid instrumental variable Z to recover as-if random assignment of X


Instrumental Variables

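$Y_i = \beta_0 + \beta_1 X_i + u_i$

We want to know the causal effect of X on Y

[Figure: diagram of $X_i$ and $Y_i$ together with the variables $S_{1i}$, $W_{1i}$, and $W_{2i}$]

but are confounded by OVB and simultaneous causality $\to$ $\hat{\beta}_1$ is biased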


Conditions for IV: Intuition

$Y_i = \beta_0 + \beta_1 X_i + u_i$

Z is an instrumental variable for X if:

Condition 1: Relevance

Z is related to X

Condition 2: Exogeneity of Z

Two ways of stating the exogeneity condition:

- Z is as-if randomly assigned

- The only relationship between Z and Y goes through X

after conditioning on any control variables W.


Instrumental Variables

$Y_i = \beta_0 + \beta_1 X_i + u_i$

[Figure: diagram of $X_i$, $Y_i$, $S_{1i}$, $W_{1i}$, and $W_{2i}$, with an instrument $Z_{1i}$ added]


Instrumental Variables

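$Y_i = \beta_0 + \beta_1 X_i + u_i$

[Figure: diagram of $X_i$, $Y_i$, $S_{1i}$, $W_{1i}$, and $W_{2i}$, with two instruments $Z_{1i}$ and $Z_{2i}$ added]

There can be multiple instruments $Z_1$ and $Z_2$ for the same X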


IV - Instrument Relevance

In the single-variable case:

$Y_i = \beta_0 + \beta_1 X_i + u_i$

Suppose the model fails CMI (recall CMI: $E[u \mid X, W_1, W_2] = E[u \mid W_1, W_2]$), but we have a valid instrument Z:

$\beta_{IV} = \dfrac{\operatorname{Cov}(Y, Z)}{\operatorname{Cov}(X, Z)}$

Notice that we need $\operatorname{Cov}(X, Z) \neq 0$. In fact, in our dataset, we really want $\operatorname{Cov}(X, Z)$ to be far away from zero.

An instrument where $\operatorname{Cov}(X, Z)$ (i.e. the "relevance" of Z) is close to zero is known as a weak instrument.
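The ratio formula can be derived by taking the covariance of both sides of the model with Z. The steps below use only the single-variable model and the two IV conditions, and they are written for the population covariances.

```latex
\begin{align*}
\operatorname{Cov}(Y_i, Z_i)
  &= \operatorname{Cov}(\beta_0 + \beta_1 X_i + u_i,\; Z_i) \\
  &= \beta_1 \operatorname{Cov}(X_i, Z_i) + \operatorname{Cov}(u_i, Z_i) \\
  &= \beta_1 \operatorname{Cov}(X_i, Z_i)
     && \text{if } \operatorname{Cov}(u_i, Z_i) = 0 \text{ (exogeneity)} \\
\Rightarrow \quad
\beta_1 &= \frac{\operatorname{Cov}(Y_i, Z_i)}{\operatorname{Cov}(X_i, Z_i)}
     && \text{if } \operatorname{Cov}(X_i, Z_i) \neq 0 \text{ (relevance)}
\end{align*}
```

The last step divides by $\operatorname{Cov}(X, Z)$, which is why a covariance near zero (a weak instrument) makes the ratio unstable in a finite sample.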


IV - Exogeneity Condition

$Y_i = \beta_0 + \beta_1 X_i + u_i$

All of these would violate Condition 2 (exogeneity):

$Z_i \leftrightarrow Y_i$

$Z_i \leftrightarrow S_{1i}$

$Z_i \leftrightarrow W_{1i}$ and $Z_i \leftrightarrow W_{2i}$

- But we can control for $W_{1i}$ or $W_{2i}$

In other words, exogeneity fails if Z is related to Y (causally or not) through any relationship other than through X.

In the figure, this would be any path from Z to Y that does not go through X.


Instrumental Variables – RECAP

Consider

$Y_i = \beta_0 + \beta_1 X_i + u_i$

where OLS yields a biased $\hat{\beta}_1$

Conditions for IV

Z is an instrumental variable for X in this model if:

1 Relevance: Z is related to X

$\operatorname{Corr}(Z, X) \neq 0$

2 Exogeneity: The only relationship between Z and Y goes through X

$\operatorname{Corr}(Z, u) = 0$
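A simulated example may help make the two conditions concrete. The sketch below (Python, standard library only) is not from the lecture, and all data-generating values are hypothetical. An unobserved confounder `a` sits in the error term and biases OLS, while `z` is constructed to be relevant (it moves `x`) and exogenous (it is independent of `a` and of the noise). The IV estimate is computed as the sample analog of Cov(Y, Z) / Cov(X, Z).

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / (n - 1)


def simulate(n, seed=0):
    """Simulate (z, x, y) with true beta_1 = 2 and an omitted confounder."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)   # instrument: as-if randomly assigned
        ai = rng.gauss(0, 1)   # unobserved confounder, ends up in u
        xi = 1.0 * zi + ai + rng.gauss(0, 1)               # relevance: Z moves X
        yi = 1.0 + 2.0 * xi + 3.0 * ai + rng.gauss(0, 1)   # true beta_1 = 2
        z.append(zi)
        x.append(xi)
        y.append(yi)
    return z, x, y


z, x, y = simulate(20000)
beta_ols = cov(x, y) / cov(x, x)   # biased: X is correlated with u through a
beta_iv = cov(y, z) / cov(x, z)    # uses only the variation in X related to Z

print(f"OLS estimate: {beta_ols:.3f}")
print(f"IV estimate:  {beta_iv:.3f}")
print("True beta_1:  2.000")
```

In this setup OLS converges to 3 rather than 2, because Cov(X, u) / Var(X) = 3 / 3 = 1 is added to the true coefficient, while the IV ratio recovers a value close to 2.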




Instrumental Variables Example: Roommate Assignment

Suppose we are analyzing, among first-semester college freshmen:

$\text{GPA}_i = \beta_0 + \beta_1 \times (\text{Hours Studying})_i + u_i$

$E[u \mid X] \neq 0$ because of omitted variable bias.

Proposed instrumental variable:

Z = whether the randomly assigned roommate brought a video game to college

Relevant?

Exogeneity?


Instrumental Variables Example: Roommate Assignment

$\text{GPA}_i = \beta_0 + \beta_1 \times (\text{Hours Studying})_i + u_i$

Condition 1: Relevance

Z is related to X, i.e. $\operatorname{Corr}(Z, X) \neq 0$

How to check? Examine the first-stage relationship between Z and X:

First stage: $(\text{Hours Studying})_i = \gamma_0 + \gamma_1 Z_i + v_i$

regress hours videogame

From lecture, $\hat{\gamma}_1 = -0.668^{**}$ (large and significant)

(later) Conduct an F-test to confirm


Instrumental Variables Example: Roommate Assignment

$\text{GPA}_i = \beta_0 + \beta_1 \times (\text{Hours Studying})_i + u_i$

Condition 2: Exogeneity

The only relationship between Z and Y goes through X: $\operatorname{Corr}(Z, u) = 0$

Condition 2 holds if roommates having video games (Z) only affects GPA (Y) through affecting hours studying (X)

Plausible? Are there other channels through which Z is related to Y?

What if being assigned a roommate with video games made students sleep less?

What if male students are more likely to bring video games and males have lower grades on average? (for argument's sake)


Instrumental Variables Example: Roommate Assignment

The Z to Y alternate channel does NOT have to be causal to violate Condition 2

More mathematically: hours of sleep and gender lie in the error term u, so we have $\operatorname{Corr}(Z, u) \neq 0$

So control for hours of sleep ($W_{1i}$) and gender ($W_{2i}$):

$\text{GPA}_i = \beta_0 + \beta_1 \times (\text{Hours Studying})_i + \beta_2 W_{1i} + \beta_3 W_{2i} + u_i$

Similar to OVB, we can never directly test whether the exogeneity condition is violated or satisfied, since u is unobserved. We can only provide arguments (using theory or institutional knowledge)
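To see what a violation costs, it may help to write out what the IV ratio converges to when $\operatorname{Cov}(Z, u) \neq 0$. The expression below is for the model without controls, and it follows from taking the covariance of the model with Z.

```latex
\[
\frac{\operatorname{Cov}(Y_i, Z_i)}{\operatorname{Cov}(X_i, Z_i)}
  = \beta_1 + \frac{\operatorname{Cov}(Z_i, u_i)}{\operatorname{Cov}(Z_i, X_i)}
\]
```

As an illustration only, suppose that less sleep lowers GPA, so that a roommate with video games gives $\operatorname{Cov}(Z, u) < 0$ when sleep is left in u. The first stage is also negative, so $\operatorname{Cov}(Z, X) < 0$, the second term is positive, and the IV ratio would overstate the effect of studying. The size of the bias grows as the first stage gets weaker, because the denominator shrinks.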


Examples of Instrumental Variables

Each example lists the instrument Z, the endogenous regressor X, and the outcome Y.

How does prenatal health affect a child's long-run development?

Z: In womb during Ramadan
X: Prenatal health
Y: Adult health & income

What effect does serving in the military have on future wages?

Z: Military draft lottery number
X: Military service
Y: Income

What is the effect of rioting on community development?

Z: Rainfall on day of MLK assassination
X: Number of riots
Y: Long-run property values

Each of these examples also requires some control variables W for the exogeneity condition to hold. In general, arguing the exogeneity condition can be very difficult.




IV in STATA

IV regression of Y on X using instrument Z :

ivregress 2sls y (x = z), robust

IV regression of Y on X using instrument Z and controls W1 and W2:

ivregress 2sls y w1 w2 (x = z), robust

IV regression of Y on X using instruments Z1 and Z2 and controls W1 and W2:

ivregress 2sls y w1 w2 (x = z1 z2), robust

where 2sls stands for Two-Stage Least Squares


Two-Stage Least Squares (TSLS or 2SLS)

Goal of IV: estimate the causal relationship of X on Y using instrument Z

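$Y_i = \beta_0 + \beta_1 X_i + u_i$

2SLS estimates $\beta_1$ in "two stages":

Stage 1: Regress X on Z and calculate predicted values $\hat{X}_i$

$\hat{X}_i = \hat{\gamma}_0 + \hat{\gamma}_1 Z_i$

Stage 2: Regress Y on $\hat{X}$ to get $\hat{\beta}_1 = \hat{\beta}_{2SLS} = \hat{\beta}_{IV}$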
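The two stages can be carried out by hand on simulated data. The sketch below (Python, standard library only, with hypothetical data-generating values) runs the first stage, forms the predicted values, runs the second stage, and checks that the result matches the ratio of the reduced-form slope to the first-stage slope, which equals Cov(Y, Z) / Cov(X, Z).

```python
import random


def mean(v):
    return sum(v) / len(v)


def ols(x, y):
    """Return (intercept, slope) from a simple OLS regression of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope


# Simulated data: true beta_1 = 2, confounder a is unobserved.
rng = random.Random(1)
n = 5000
z = [rng.gauss(0, 1) for _ in range(n)]
a = [rng.gauss(0, 1) for _ in range(n)]
x = [0.8 * zi + ai + rng.gauss(0, 1) for zi, ai in zip(z, a)]
y = [1.0 + 2.0 * xi + 3.0 * ai + rng.gauss(0, 1) for xi, ai in zip(x, a)]

# Stage 1: regress X on Z and compute the predicted values X-hat.
gamma0_hat, gamma1_hat = ols(z, x)
x_hat = [gamma0_hat + gamma1_hat * zi for zi in z]

# Stage 2: regress Y on X-hat.
_, beta_2sls = ols(x_hat, y)

# Cross-check: reduced-form slope (Y on Z) divided by first-stage slope (X on Z).
_, reduced_form_slope = ols(z, y)
beta_ratio = reduced_form_slope / gamma1_hat

print(f"Two-stage estimate: {beta_2sls:.4f}")
print(f"Ratio estimate:     {beta_ratio:.4f}")
```

One caveat on the design: the standard errors reported by a manually run second-stage regression are not correct for 2SLS, because they are computed from residuals that use the predicted values rather than the actual X. A command such as `ivregress 2sls` handles this, so the manual version is best treated as a way to see the mechanics.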


2SLS – Intuition

$Y_i = \beta_0 + \beta_1 X_i + u_i$

We cannot causally interpret $\beta_1$ since X is not "randomly assigned"

Think of the variation (not variance) in X as coming from two separate sources:

Variation in X = As-if random part + Non-random part

The non-random part is giving us problems (OVB or simultaneous causality)

Stage 1 of 2SLS isolates the as-if random part of X, which is $\hat{X}$.

- Since Z is a valid instrument for X in this model, Z is as-if randomly assigned by Condition 2

- The variation in X related to Z is also as-if randomly assigned

Stage 2 uses the as-if random part to estimate the causal relationship of X on Y. Regressing Y on $\hat{X}$ is like regressing Y on the random variation in X.
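For one instrument and no controls, the link between the two stages and the covariance ratio can be written out in a few lines. The notation $s_{AB}$ for a sample covariance and $s_A^2$ for a sample variance is introduced here for convenience and is not from the slides.

```latex
\begin{align*}
\hat{X}_i &= \hat{\gamma}_0 + \hat{\gamma}_1 Z_i,
  \qquad \hat{\gamma}_1 = \frac{s_{XZ}}{s_Z^2} \\[4pt]
\hat{\beta}_{2SLS}
  &= \frac{s_{Y\hat{X}}}{s_{\hat{X}}^2}
   = \frac{\hat{\gamma}_1\, s_{YZ}}{\hat{\gamma}_1^2\, s_Z^2}
   = \frac{s_{YZ}}{\hat{\gamma}_1\, s_Z^2}
   = \frac{s_{YZ}}{s_{XZ}}
\end{align*}
```

The second line uses the fact that $\hat{X}$ is a linear function of Z, so its covariance with Y is $\hat{\gamma}_1 s_{YZ}$ and its variance is $\hat{\gamma}_1^2 s_Z^2$. Only the part of X that moves with Z enters the estimate.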




Testing the Validity of Instruments

$Y_i = \beta_0 + \beta_1 X_i + \beta_2 W_{1i} + \beta_3 W_{2i} + u_i$

Conditions for IV

Z is an instrumental variable for X in this model if, controlling for the Ws:

1 Relevance: $\operatorname{Corr}(Z, X) \neq 0$

2 Exogeneity: $\operatorname{Corr}(Z, u) = 0$

Testing Condition 1 is straightforward, since we have data on both Z and X

Testing Condition 2 is trickier, because we never observe u. In fact, we can only test Condition 2 when we have more instruments Z than endogenous regressors X


Testing Condition 1: Relevance

Condition 1: Relevance

Z must be related to X, i.e. $\operatorname{Corr}(Z, X) \neq 0$

We need the relationship between X and Z to be "meaningfully large"

How to check?

Run the first-stage regression with OLS

$X_i = \alpha_0 + \alpha_1 Z_{1i} + \alpha_2 Z_{2i} + \alpha_3 W_{1i} + \alpha_4 W_{2i} + \cdots + v_i$

Check the F-test on all the coefficients on the instruments

$H_0: \alpha_1 = \alpha_2 = 0$

If F > 10, we claim that Z is a strong instrument

If F ≤ 10, we have a weak instruments problem
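The rule of thumb can be tried out on simulated first stages. The sketch below (Python, standard library only) is a simplified version of the check above: it assumes a single instrument and no controls, in which case the F-statistic equals the squared t-statistic on the instrument, and it uses the homoskedasticity-only formula rather than robust standard errors. The first-stage coefficients 1.0 and 0.02 are hypothetical.

```python
import random


def first_stage_f(z, x):
    """Homoskedasticity-only F-statistic for H0: slope = 0 in an OLS
    regression of x on a constant and one instrument z (F = t squared)."""
    n = len(z)
    mz, mx = sum(z) / n, sum(x) / n
    szz = sum((zi - mz) ** 2 for zi in z)
    slope = sum((zi - mz) * (xi - mx) for zi, xi in zip(z, x)) / szz
    intercept = mx - slope * mz
    ssr = sum((xi - intercept - slope * zi) ** 2 for zi, xi in zip(z, x))
    sigma2 = ssr / (n - 2)       # residual variance estimate
    var_slope = sigma2 / szz     # estimated variance of the slope
    return slope ** 2 / var_slope


def simulate_first_stage(gamma1, n=500, seed=0):
    """X = gamma1 * Z + noise, with Z and the noise standard normal."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]
    x = [gamma1 * zi + rng.gauss(0, 1) for zi in z]
    return z, x


f_strong = first_stage_f(*simulate_first_stage(gamma1=1.0))
f_weak = first_stage_f(*simulate_first_stage(gamma1=0.02))

for label, f in (("gamma1 = 1.00", f_strong), ("gamma1 = 0.02", f_weak)):
    verdict = "strong" if f > 10 else "weak"
    print(f"{label}: first-stage F = {f:.1f} -> {verdict} instrument")
```

With several instruments or with controls, the same idea applies, but the F-statistic is a joint test of the instrument coefficients and has to be computed from the full first-stage regression.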


Testing Condition 2: Exogeneity

Condition 2: Exogeneity of Z

Z is as-if randomly assigned, i.e. $\operatorname{Corr}(Z, u) = 0$

To check exogeneity, we need more instruments Z than endogenous regressors X (i.e. our model is overidentified)

Suppose there is one treatment variable of interest X, multiple instruments Z, and, potentially, multiple control variables W.

$Y_i = \beta_0 + \beta_1 X_i + \beta_2 W_{1i} + \beta_3 W_{2i} + u_i$

Use $Z_1$ to estimate $\beta_1$ and predict $\hat{u}_i$. If $Z_1$ and $Z_2$ are both valid instruments, then:

$\operatorname{Corr}(Z_2, u) = 0$

This is the basic idea of a J-test

For a model with just one instrument and one endogenous variable, there is no formal test


Testing Condition 2: Exogeneity

J-test for overidentifying restrictions:

H0 : Both Z1 and Z2 satisfy the exogeneity condition

Ha : Either Z1,Z2, or both are invalid instruments

In STATA:

ivregress 2sls y w1 w2 (x = z1 z2), robust

estat overid

display "J-test = " r(score) " p-value = " r(p_score)

If the p-value < 0.05, then we reject the null hypothesis that all our instruments are valid

But just like an F-test, rejecting the null does not reveal which instrument is invalid, only that at least one fails the exogeneity condition
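The logic of the overidentification test can be sketched on simulated data. The code below (Python, standard library only) is an illustration under stated assumptions rather than a reproduction of what `estat overid` reports: it uses Sargan's homoskedasticity-only form of the statistic, J = n times the R-squared from regressing the 2SLS residuals on the instruments, with one endogenous regressor, two instruments, and no controls. Under the null, J is approximately chi-squared with 1 degree of freedom, so the 5% critical value is about 3.84. All data-generating values are hypothetical.

```python
import random


def demean(v):
    m = sum(v) / len(v)
    return [a - m for a in v]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def fit_two(z1, z2, y):
    """Fitted values from OLS of y on z1 and z2 (all inputs already demeaned)."""
    s11, s12, s22 = dot(z1, z1), dot(z1, z2), dot(z2, z2)
    s1y, s2y = dot(z1, y), dot(z2, y)
    det = s11 * s22 - s12 * s12
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return [b1 * p + b2 * q for p, q in zip(z1, z2)]


def sargan_j(z1, z2, x, y):
    """n * R^2 from regressing 2SLS residuals on both instruments."""
    z1, z2, x, y = demean(z1), demean(z2), demean(x), demean(y)
    x_hat = fit_two(z1, z2, x)                          # first stage
    beta = dot(x_hat, y) / dot(x_hat, x_hat)            # second-stage slope
    u_hat = [yi - beta * xi for xi, yi in zip(x, y)]    # residuals use actual X
    u_fit = fit_two(z1, z2, u_hat)
    r2 = dot(u_fit, u_fit) / dot(u_hat, u_hat)
    return len(y) * r2


def simulate(direct_effect_z2, n=5000, seed=0):
    """True beta_1 = 2. Z2 is invalid whenever direct_effect_z2 != 0."""
    rng = random.Random(seed)
    z1, z2, x, y = [], [], [], []
    for _ in range(n):
        a, b, c = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        xi = a + b + c + rng.gauss(0, 1)                 # c is a confounder
        yi = 1.0 + 2.0 * xi + 3.0 * c + direct_effect_z2 * b + rng.gauss(0, 1)
        z1.append(a)
        z2.append(b)
        x.append(xi)
        y.append(yi)
    return z1, z2, x, y


j_valid = sargan_j(*simulate(direct_effect_z2=0.0))
j_invalid = sargan_j(*simulate(direct_effect_z2=2.0))
print(f"Both instruments valid: J = {j_valid:.2f}")
print(f"Z2 affects Y directly:  J = {j_invalid:.2f}")
print("5% critical value for chi-squared(1): 3.84")
```

The simulation also shows the limitation noted above: a large J says that the instruments disagree with each other, but not which of them is at fault.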
