
Econometrics using STATA: Part 2

Benjamin Monnery
EconomiX, Univ Paris Nanterre

M1 Economie du Droit, 2017-2018

FINDING DATA COVARIATE-ADJUSTMENT MATCHING

CONTENT OF PART 2

When an RCT is not an option, the only alternative is to use observational (real-life) data

1. How to retrieve data?
• public sources (data.gouv), data repositories, journal archives
• how to clean/manipulate data sets in Stata?

2. How to fix selection bias?
• when there is only selection on observables (part 2)
  i.e. easy problems where you know all the determinants of assignment correlated with Y
  Methods: stratification, covariate-adjustment and matching
• when there is also selection on unobservables (part 3)
  Methods: IV, panel, DID, RDD...

B. Monnery (EconomiX) Econometrics using Stata II 2 / 41

EXAMPLE OF SELECTION ON OBSERVABLES

What’s the effect of lawyers on judicial outcomes? e.g. Pr(conviction)

Among defendants, having a lawyer is “as good as random” conditional on ...

• strength of the case (evidence)
• wealth, ...

⇒ Among these determinants of treatment, strength of the case surely correlates with Pr(conviction)

What about wealth? (depends on the judicial system)

Assumption: there is selection on observables (only) if

\[ E[Y_i^1 \mid T = 1, X] = E[Y_i^1 \mid T = 0, X] \]

\[ E[Y_i^0 \mid T = 1, X] = E[Y_i^0 \mid T = 0, X] \]

Potential outcomes are the same on average for treated and untreated individuals with the same X
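
A short derivation sketch may help show why this assumption is enough. The notation Y_i for the observed outcome (equal to Y_i^1 when T = 1 and to Y_i^0 when T = 0) is introduced here for illustration. Within a cell of X, the observed difference in means then equals the average treatment effect for that cell:

```latex
\begin{align*}
E[Y_i \mid T = 1, X] - E[Y_i \mid T = 0, X]
  &= E[Y_i^1 \mid T = 1, X] - E[Y_i^0 \mid T = 0, X] \\
  &= E[Y_i^1 \mid X] - E[Y_i^0 \mid X] \\
  &= E[Y_i^1 - Y_i^0 \mid X]
\end{align*}
```

The second line uses the two equalities of the assumption: if the conditional mean of a potential outcome is the same for T = 1 and T = 0, it is also equal to the conditional mean given X alone.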

Finding Data

Access to data is necessary to answer questions
> know key sources, be able to manipulate their data

Access to novel data is (almost) necessary to publish in top scientific journals

• good data + good method + interesting topic = top science
• “competition” for data among researchers
• difficult to teach
> be curious, follow the news, learn to code

DATA GOUV

also look at INSEE, ministries’ websites...

HARVARD DATAVERSE AND JOURNAL ARCHIVES

Many top scientific journals (like the AER) now require online publication of datasets

https://www.aeaweb.org/articles?id=10.1257/aer.20161503

CIVIL SOCIETY INITIATIVES

We will use some of their data later in the course (Diff-in-Diff)

Covariate-adjustment

INTUITION

We want to estimate a causal treatment effect by comparing the observed outcomes of treated and untreated people

If we think we know all the determinants X of treatment assignment T that also relate to Y (selection on observables), we can simply compare treated and untreated outcomes conditional on X

How to “condition on X”?

1. statistically control for X in a regression model (covariate adjustment)
   estimate \( Y_i = \beta_0 + \beta_1 T_i + \beta_2 X_i + \varepsilon_i \)
2. use matching (e.g. propensity score matching)
3. use stratification (subclassification):
   compute differences within small groups (strata/cells) of X

⇒ Covariate-adjustment is the regression analog to stratification

In a problem of selection on observables, we want to compare treated and untreated within subgroups with similar potential outcomes

Ex: what’s the effect of lawyers on defendants’ probability of conviction?

⇒ True answer? Probably a reduction of Pr(conviction)

⇒ Problem (selection bias): propensity to hire a lawyer and probability of conviction are both related to the strength of evidence against the defendant

• if the court has strong evidence against a defendant, he is more likely to hire a lawyer to help him
• however, he is also more likely to be convicted eventually
⇒ hence a risk of selection bias due to differences in strength of evidence

If you can measure strength of evidence, selection bias can be “easily” eliminated by stratification, covariate-adjustment or matching

STRATIFICATION

Tab 1. Sample of Defendants            Tab 2. Numbers Convicted

X / T    Yes   No   All                X / T    Yes   No   All
Strong    40   10    70                Strong    30   10    40
Weak      10   20    30                Weak       5   15    20
All       50   50   100                All       35   25    60

(X = strength of evidence, T = has a lawyer)

Stata: tab X T (Tab 1) and tab X T if Convicted==1 (Tab 2)

• Naive estimator: compare rates of conviction between Yes & No
  Treated: 35/50 = 70%   Untreated: 25/50 = 50%

• Naive answer: detrimental “effect” of lawyers of +20 percentage points!

⇒ But strength of evidence is related to both Lawyers and Convictions: selection bias

Better estimator: stratify by (condition on) strength of evidence

• Among Strong cases
  Treated: 30/40 = 75%   Untreated: 10/10 = 100%
  Treatment effect: -25 pp

• Among Weak cases
  Treated: 5/10 = 50%   Untreated: 15/20 = 75%
  Treatment effect: -25 pp

⇒ Hence the stratified estimator gives a treatment effect of -25 pp
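
The within-stratum arithmetic above can be sketched in a few lines of plain Python (used here only to check the numbers outside Stata). The sketch takes the four cell counts quoted in the calculations above. Weighting each stratum by its share of the sample when averaging is an assumption of this example; with identical effects in both strata the weights do not change the result.

```python
# Cell counts quoted in the within-stratum calculations:
# stratum -> arm -> (number convicted, number of defendants)
CELLS = {
    "strong": {"treated": (30, 40), "untreated": (10, 10)},
    "weak": {"treated": (5, 10), "untreated": (15, 20)},
}


def stratum_effect(cell):
    """Difference in conviction rates (treated minus untreated) within one stratum."""
    conv_t, n_t = cell["treated"]
    conv_u, n_u = cell["untreated"]
    return conv_t / n_t - conv_u / n_u


def stratified_estimate(cells):
    """Average the stratum effects, weighting each stratum by its sample size."""
    total = sum(c["treated"][1] + c["untreated"][1] for c in cells.values())
    return sum(
        stratum_effect(c) * (c["treated"][1] + c["untreated"][1]) / total
        for c in cells.values()
    )


effects = {name: stratum_effect(cell) for name, cell in CELLS.items()}
print(effects)                      # {'strong': -0.25, 'weak': -0.25}
print(stratified_estimate(CELLS))   # -0.25
```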

STRATIFICATION VERSUS REGRESSIONS

Stratification solves problems of selection on observables

However, in practice it is only appropriate in the simplest situations:

• with few variables affecting T and Y
• which are all categorical
• e.g. 1 dummy (strong/weak), 2 dummies (+rich/poor), ...

In real life, assignment often depends on a large number of non-dichotomous variables, i.e. we need to stratify the sample into a lot of different groups (cells/strata)
⇒ problem known as the curse of dimensionality

Problem 1 with stratification: the curse of dimensionality

Assume we want to condition on (stratify by) k dummy variables: the number of different groups will be \( 2^k \)

With k = 10, we have \( 2^{10} = 1024 \) group-specific treatment effects to compare and average (\( 2^{11} = 2048 \), \( 3^{10} = 59049 \))

• computation can become long
• many cells will be empty or only contain treated or untreated observations: can’t compute a group-specific effect
> makes the estimated effect less general (i.e. local) as some observations are left out
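
As a rough illustration in plain Python, the sketch below counts the strata implied by a set of categorical covariates and lists the cells that cannot deliver a group-specific effect because they lack treated or untreated observations. The function names and the toy sample are hypothetical.

```python
from itertools import product


def n_cells(levels_per_variable):
    """Number of strata when every combination of covariate values is a cell."""
    count = 1
    for levels in levels_per_variable:
        count *= levels
    return count


def unusable_cells(observations, levels_per_variable):
    """Cells that lack treated or untreated observations (no within-cell contrast).

    observations: iterable of (cell, treated) pairs, where cell is a tuple of
    covariate values coded 0..levels-1 and treated is 0 or 1.
    """
    seen = {}
    for cell, treated in observations:
        seen.setdefault(cell, set()).add(treated)
    all_cells = product(*(range(levels) for levels in levels_per_variable))
    return [cell for cell in all_cells if seen.get(cell, set()) != {0, 1}]


print(n_cells([2] * 10))   # 1024
print(n_cells([2] * 11))   # 2048
print(n_cells([3] * 10))   # 59049

# Hypothetical toy sample with two dummies: only cell (0, 0) has both arms.
sample = [((0, 0), 1), ((0, 0), 0), ((0, 1), 1), ((1, 0), 0)]
print(unusable_cells(sample, [2, 2]))   # [(0, 1), (1, 0), (1, 1)]
```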

Problem 2 with stratification: continuous variables

In real life, many variables are not categorical but continuous

• strong/weak and rich/poor are statistical constructions to ease computation
• the true underlying variables are continuous in nature
⇒ stratification assumes homogeneity within groups

Regressions can easily solve both problems: many X and a mix of categorical and continuous variables

COVARIATE ADJUSTMENT

Goal: conditional on X, treatment should be “as good as random”

Key: control appropriately for the effect of wealth and case strength

• Flexible specification:
  - only a linear effect: \( Y_i = \beta_0 + \beta_1 \text{Lawyer}_i + \beta_2 \text{Wealth}_i + \varepsilon_i \)
  - or a more flexible form: logarithmic, polynomial (\( \text{Wealth}^2, \text{Wealth}^3, \dots \)), by categories/bins, linear + bins...

• Relevant data/variables:
  - Use data on the “best” variables explaining treatment assignment, instead of long-shot proxy variables
    annual pre-tax income, disposable income, net wealth, gross wealth? Family wealth (to account for possible family support)?
    > a (linear?) combination of several variables, or some index?

Recall: do not condition on potential mediators (e.g. length of trial) as they will capture part of the true causal effect of T on Y

ASSUMPTIONS

The key underlying assumptions:

• Conditional independence assumption (CIA, or unconfoundedness)

\[ (Y_i^1, Y_i^0) \perp T \mid X \]

  CIA is not directly testable (you need to argue why it’s credible)

• Common support (or overlap)

\[ \Pr(T = 1 \mid X) \in (0, 1) \]

  common support is easily testable

+ SUTVA

Then stratification, covariate-adjustment and matching will work

REGRESSION ANATOMY

Under those assumptions, why exactly does covariate-adjustment work, i.e. give a causal effect of T on Y?

⇒ what do multiple regressions do?

We know that a simple regression with OLS: \( Y_i = \beta_0 + \beta_1 X_{1i} + \varepsilon_i \)

... gives \( \hat{\beta}_1 = \dfrac{Cov(Y, X_1)}{Var(X_1)} \)

And a multiple regression with OLS: \( Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + u_i \)

... gives \( \hat{\beta} = (X'X)^{-1} X'Y \) ... ?

To understand what it means, let’s turn to the regression anatomy theorem
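
In its usual textbook form (stated here from the standard result, since the statement itself is not reproduced in this text), the theorem says that the coefficient on X_1 in a multiple regression equals Cov(Y, X̃_1) / Var(X̃_1), where X̃_1 is the residual from regressing X_1 on the other covariates. The plain-Python sketch below checks this on individual-level data built from the four cells used in the stratification example (lawyer as treatment, strong evidence as the single control). The helper names are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def cov(a, b):
    """Population covariance (dividing by n); the ratios below do not depend on n vs n-1."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)


def simple_ols(y, x):
    """Intercept and slope of a one-regressor OLS fit: slope = Cov(y, x) / Var(x)."""
    slope = cov(y, x) / cov(x, x)
    return mean(y) - slope * mean(x), slope


def anatomy_coefficient(y, x1, x2):
    """Coefficient on x1 in the regression of y on (1, x1, x2), via residualised x1."""
    a, b = simple_ols(x1, x2)
    x1_tilde = [v - (a + b * w) for v, w in zip(x1, x2)]
    return cov(y, x1_tilde) / cov(x1_tilde, x1_tilde)


def expand(cells):
    """Turn (lawyer, strong, convicted, count) rows into one record per defendant."""
    lawyer, strong, convicted = [], [], []
    for t, s, c, n in cells:
        lawyer += [t] * n
        strong += [s] * n
        convicted += [c] * n
    return lawyer, strong, convicted


# Individual-level data implied by the cell counts of the stratification example.
CELLS = [
    (1, 1, 1, 30), (1, 1, 0, 10),   # lawyer, strong evidence: 30 of 40 convicted
    (0, 1, 1, 10),                  # no lawyer, strong evidence: 10 of 10 convicted
    (1, 0, 1, 5), (1, 0, 0, 5),     # lawyer, weak evidence: 5 of 10 convicted
    (0, 0, 1, 15), (0, 0, 0, 5),    # no lawyer, weak evidence: 15 of 20 convicted
]
lawyer, strong, convicted = expand(CELLS)
beta_lawyer = anatomy_coefficient(convicted, lawyer, strong)
print(round(beta_lawyer, 6))   # -0.25
```

The coefficient matches the stratified estimate of -25 pp, which is one way to see covariate-adjustment as the regression analog to stratification in this simple case.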

SENSITIVITY TO CIA

We can estimate how sensitive the results are to potential confounders

Simulation approach:

• Simulate a “fake” variable F that is correlated with both T and Y

• Look at the effect of including this new covariate F on \( \hat{\beta}_T \)

• By comparing the \( \hat{\beta}_T \)'s under different constructions of F (variance-covariance), document the sensitivity of your findings to a violation of the CIA

⇒ If \( \hat{\beta}_T \) only disappears under “unrealistic” assumptions (very large correlations between F and T and between F and Y), then the effect is robust to potential selection on unobservables
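
One way to read this exercise is through the textbook omitted-variable-bias identity for OLS. The linear specification and the symbols γ and δ below are assumptions introduced for illustration. The estimate obtained without F differs from the one obtained with F by the coefficient on F times the partial slope of F on T:

```latex
\begin{align*}
\text{long regression:}\quad
  & Y_i = \beta_0 + \beta_T T_i + \beta_X X_i + \gamma F_i + \varepsilon_i \\
\text{auxiliary regression:}\quad
  & F_i = \delta_0 + \delta_T T_i + \delta_X X_i + v_i \\
\text{short regression (F omitted):}\quad
  & \hat{\beta}_T^{\,short} = \hat{\beta}_T^{\,long} + \hat{\gamma}\,\hat{\delta}_T
\end{align*}
```

Under this reading, the estimated effect only vanishes if the product of γ̂ and δ̂_T is as large as the original estimate, which is what "unrealistic" correlations would require.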

Matching

MATCHING

Another popular method to deal with selection on observables is matching

Matching = appariement (in French)

Idea: make many pairs of similar individuals (i, j), one treated & one non-treated, and look at their average differences in outcomes

\[ \widehat{ATT} = \frac{1}{N_1} \sum_{i:\, T_i = 1} \left( Y_i - Y_{j(i)} \right) \]

where \( Y_{j(i)} \) is the outcome of j, the non-treated individual closest to the treated i (i.e. the match for i)

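Note that we can also recover ATU and ATE with matching:

\[ \widehat{ATU} = \frac{1}{N_0} \sum_{i:\, T_i = 0} \left( Y_{j(i)} - Y_i \right) \]

where, for a non-treated i, \( Y_{j(i)} \) is the outcome of the closest treated individual (so each term is again treated minus non-treated)

\[ \widehat{ATE} = \frac{N_1}{N} \widehat{ATT} + \frac{N_0}{N} \widehat{ATU} \]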
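
The three estimators can be sketched in plain Python for the simplest case: one covariate, one nearest neighbour, matching with replacement. Every effect is computed as treated outcome minus non-treated outcome. The function names and the toy data are hypothetical.

```python
def nearest(x, pool):
    """Return the (x, y) pair in pool whose x is closest to the given x."""
    return min(pool, key=lambda unit: abs(unit[0] - x))


def matching_estimates(treated, untreated):
    """1-to-1 nearest-neighbour matching on one covariate, with replacement.

    treated, untreated: lists of (x, y) pairs.
    Returns (ATT, ATU, ATE); every effect is 'treated outcome minus untreated outcome'.
    """
    n1, n0 = len(treated), len(untreated)
    att = sum(y - nearest(x, untreated)[1] for x, y in treated) / n1
    atu = sum(nearest(x, treated)[1] - y for x, y in untreated) / n0
    ate = (n1 * att + n0 * atu) / (n1 + n0)
    return att, atu, ate


# Hypothetical toy data: (covariate x, outcome y)
treated = [(1.0, 5.0), (3.0, 8.0)]
untreated = [(1.2, 3.0), (2.8, 5.0), (5.0, 9.0)]

att, atu, ate = matching_estimates(treated, untreated)
print(att, atu, ate)
```

In this toy sample the ATU is pulled down by the non-treated unit at x = 5, whose closest treated match (x = 3) is far away: a reminder that poor matches can drive the estimate.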

SIMPLE EXTENSIONS

Note that we can

• match on many dimensions (many X)
  that’s preferable to make the CIA hold

• use several matches for a given i
  that’s preferred to reduce variance

\[ \widehat{ATT} = \frac{1}{N_1} \sum_{i:\, T_i = 1} \left( Y_i - \frac{1}{M} \sum_{m=1}^{M} Y_{j_m(i)} \right) \]

For now, the simplest case: 1x1 matching on one X

1X1 MATCHING ON ONE X

ANOTHER EXAMPLE : 1X1 MATCHING ON ONE X

The estimated ATT after matching is 16426 − 13982 = 2444,

whereas before matching: 16426 − 20724 = −4298

SEVERAL X

In practice, we usually need to match on many observable variables

⇒ difficult to find perfectly similar i and j on all X (exact matching)

Other methods:

• coarsened exact matching (“exact” matching within bins/ranges)

• distance-based matching
  - Euclidean distance: \( \lVert x_i - x_j \rVert = \sqrt{(x_i - x_j)'(x_i - x_j)} = \sqrt{\sum_{k=1}^{K} (x_{ki} - x_{kj})^2} \)
  - Normalized Euclidean distance, Mahalanobis distance

• propensity score matching

Distance-based and propensity score matching are most often used
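
A small plain-Python sketch of the first two distances follows. The normalized version assumed here divides each coordinate difference by that covariate's sample standard deviation, which is one common convention. The Mahalanobis distance, which also accounts for correlations between covariates, is left out. The units and variable names are hypothetical.

```python
import math
import statistics


def euclidean(xi, xj):
    """Plain Euclidean distance between two covariate vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xi, xj)))


def normalized_euclidean(xi, xj, sds):
    """Euclidean distance after dividing each coordinate difference by that covariate's SD."""
    return math.sqrt(sum(((a - b) / s) ** 2 for a, b, s in zip(xi, xj, sds)))


def column_sds(rows):
    """Sample standard deviation of each covariate across all units."""
    return [statistics.stdev(col) for col in zip(*rows)]


# Hypothetical units described by (age in years, income in euros)
units = [(30.0, 20000.0), (40.0, 21000.0), (50.0, 40000.0)]
sds = column_sds(units)

print(euclidean(units[0], units[1]))                  # dominated by the income gap
print(normalized_euclidean(units[0], units[1], sds))  # both covariates on a comparable scale
```

Without normalization, the covariate measured in the largest units (income here) drives the choice of matches almost entirely.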

SEVERAL MATCHES

In practice, we often want to increase precision by using several matches for each i

• Single nearest neighbor matching

• k-nearest neighbors matching (e.g. k=5 or 10)

• Caliper (or radius) matching (maximal distance between i and j)

• Kernel matching (different weights by distance)

• etc.

Asymptotically, they are all similar; but in practice, this choice can matter
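
To see how the choice can matter, the plain-Python sketch below computes an ATT on one covariate with k nearest neighbours and an optional caliper. Dropping treated units that have no neighbour within the caliper is an assumption of this sketch, and the data are hypothetical. Kernel matching is not covered.

```python
def knn_att(treated, untreated, k=1, caliper=None):
    """ATT from k-nearest-neighbour matching on one covariate, with replacement.

    treated, untreated: lists of (x, y) pairs.
    caliper: optional maximal |x_i - x_j|; treated units with no untreated
    neighbour inside the caliper are dropped. Returns None if nobody is matched.
    """
    gaps = []
    for x, y in treated:
        pool = sorted(untreated, key=lambda unit: abs(unit[0] - x))
        if caliper is not None:
            pool = [unit for unit in pool if abs(unit[0] - x) <= caliper]
        matches = pool[:k]
        if matches:
            counterfactual = sum(unit[1] for unit in matches) / len(matches)
            gaps.append(y - counterfactual)
    return sum(gaps) / len(gaps) if gaps else None


# Hypothetical toy data: (covariate x, outcome y)
treated = [(1.0, 5.0), (3.0, 8.0)]
untreated = [(1.2, 3.0), (2.8, 5.0), (3.5, 9.0)]

print(knn_att(treated, untreated, k=1))                # single nearest neighbour
print(knn_att(treated, untreated, k=2))                # average of the two closest
print(knn_att(treated, untreated, k=2, caliper=0.5))   # neighbours farther than 0.5 are ignored
```

On this tiny sample the three variants give three different estimates, which is the small-sample sensitivity the slide warns about.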

PROPENSITY SCORE MATCHING

As with distance-based matching, we want to aggregate all differences in X into a single index, the propensity score p(x)

p(x) measures the probability that individuals are treated (T = 1) based on their observables

• Among treated, some were very likely to be treated, some less so
• Among non-treated, some were very likely not to be treated, some less so

common support in p(x) between the two groups

Propensity score matching matches individuals with similar p(x) (but different actual treatment status)

⇒ need to estimate p(x)

To estimate p(x) for each individual (and then match neighbors), we usually use a probit (or logit) model:

\begin{align*}
\Pr(T = 1 \mid X) &= \Pr(T^* > 0) \\
                  &= \Pr(X'\beta + \varepsilon > 0) \\
                  &= \Pr(\varepsilon > -X'\beta) \\
                  &= 1 - \Phi(-X'\beta) \\
                  &= \Phi(X'\beta)
\end{align*}

where \( \Phi \) is the CDF of \( \varepsilon \)

⇒ \( \hat{p}_i(x_i) \) ranges from 0 to 1 (if probit or logit is used)

X are pre-determined variables (and interactions, polynomials, etc.) likely to explain T

and then predict the scores: \( \hat{p}_i(x_i) = \Phi(X_i'\hat{\beta}) \)

⇒ Hopefully with common support and balance of x between the two groups
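
The estimation step can be sketched in plain Python with a logit rather than a probit, because the logistic CDF has a closed form. The slide allows either. The fit uses simple gradient ascent on the log-likelihood, a choice made for self-containedness rather than what statistical software does internally. The data and function names are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic CDF, the logit counterpart of the probit's normal CDF."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logit(x, t, steps=10000, rate=0.5):
    """Fit Pr(T=1|x) = sigmoid(b0 + b1*x) by gradient ascent on the mean log-likelihood."""
    b0 = b1 = 0.0
    n = len(x)
    for _ in range(steps):
        residuals = [ti - sigmoid(b0 + b1 * xi) for xi, ti in zip(x, t)]
        b0 += rate * sum(residuals) / n
        b1 += rate * sum(r * xi for r, xi in zip(residuals, x)) / n
    return b0, b1


def propensity_scores(x, b0, b1):
    """Predicted probability of treatment for each individual."""
    return [sigmoid(b0 + b1 * xi) for xi in x]


# Hypothetical data: one pre-determined covariate x and treatment status t
x = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
t = [0, 0, 1, 0, 0, 1, 0, 1, 1, 1]

b0, b1 = fit_logit(x, t)
scores = propensity_scores(x, b0, b1)
for xi, ti, pi in zip(x, t, scores):
    print(xi, ti, round(pi, 3))
```

The predicted scores lie strictly between 0 and 1 and rise with x, so treated and non-treated individuals with similar scores can then be paired.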

MAIN PRACTICAL ISSUES

Check common support: compare the two distributions of p(x)

Check balance of covariates: use simple t-tests, tests of proportions, or the standardized bias:

if std bias > 20%, the difference is still “large”

Be careful about inference: propensity score matching is a two-step process, so you need to adjust your standard errors (using the bootstrap)

Many other choices to make: type of matching (1-1, 1-5, caliper, kernel, etc.), replacement or not...
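
The first two checks can be sketched in plain Python. The formula for the standardized bias is not reproduced in this text. The sketch assumes a commonly used definition: 100 times the difference in covariate means, divided by the square root of the average of the two group variances. The common-support check simply reports the range of scores covered by both groups. Names and data are hypothetical.

```python
import math
import statistics


def standardized_bias(x_treated, x_control):
    """Difference in means, in % of the square root of the average group variance."""
    pooled_sd = math.sqrt(
        (statistics.variance(x_treated) + statistics.variance(x_control)) / 2
    )
    return 100 * (statistics.mean(x_treated) - statistics.mean(x_control)) / pooled_sd


def common_support(p_treated, p_control):
    """Range of propensity scores covered by both groups, or None if they do not overlap."""
    low = max(min(p_treated), min(p_control))
    high = min(max(p_treated), max(p_control))
    return (low, high) if low <= high else None


# Hypothetical covariate values and propensity scores
age_treated = [31.0, 35.0, 39.0]
age_control = [30.0, 36.0, 42.0]
print(standardized_bias(age_treated, age_control))

print(common_support([0.3, 0.6, 0.9], [0.1, 0.4, 0.7]))   # (0.3, 0.7)
```

The bias would be computed for each covariate before and after matching. Values above 20% in absolute terms after matching suggest the matched groups are still not comparable on that covariate.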

BONUS 3 : PRISON-BASED EDUCATION AND RECIDIVISM

Goal: write a 1-page critical review of the paper/chapter
• brief summary of the paper (topic, method, main points, results)
• discuss method, experimental design, interpretations, conclusions
• relate it to the class
• criticisms, shortcomings?

Send the PDF by email before next Monday (noon) to bmonnery@parisnanterre.fr

EXAMPLE : PRISON-BASED EDUCATION AND RECIDIVISM

Data on 31,000 prisoners released in New York State between 2005 and 2008

Recidivism (rearrest) is followed within 3 years

Only 347 of them received a college degree in prison

Challenge: make those 347 graduates as comparable as possible to other prisoners not getting a college degree

Method: match prisoners based on their propensity to get a degree, predicted from 47 covariates
⇒ 1-1 nearest neighbor matching with a caliper of 0.01

APPLICATION ON STATA

Let’s imagine we want to estimate the effect of halfway houses (semi-liberté) instead of prison on recidivism in a sample of offenders sentenced to prison in France

• allows convicts to work, train, follow classes (probably good for reentry)

• requires them to return to “custody” every night (probably ok to monitor offenders)

• often perceived as less punitive (possibly bad for future deterrence)

⇒ what’s the net causal effect on recidivism, after accounting for selection?

Main assumption: the Conditional Independence Assumption holds after matching on the propensity score

In Stata, we can simply use psmatch2
