AFRICA IMPACT EVALUATION INITIATIVE, AFTRL Africa Program for Education Impact Evaluation David Evans Impact Evaluation Cluster, AFTRL Slides by Paul J. Gertler & Sebastian Martinez Impact Evaluation Methods: Difference in difference & Matching

Upload: russell-newton

Post on 04-Jan-2016

AFRICA IMPACT EVALUATION INITIATIVE, AFTRL

Africa Program for Education Impact Evaluation

David Evans

Impact Evaluation Cluster, AFTRL

Slides by Paul J. Gertler & Sebastian Martinez

Impact Evaluation Methods: Difference in difference & Matching

Measuring Impact

► Randomized Experiments
► Quasi-experiments
  • Randomized Promotion – Instrumental Variables
  • Regression Discontinuity
  • Double differences (Diff in diff)
  • Matching

Case 5: Diff in diff

► Compare the change in outcomes between the treatment and non-treatment groups
  • Impact is the difference in the change in outcomes

► $\text{Impact} = (Y_{t1} - Y_{t0}) - (Y_{c1} - Y_{c0})$
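As a minimal sketch of the formula above, the double difference can be computed from four group means. The function name and the numbers are hypothetical and are not taken from the slides; `t` and `c` stand for treatment and comparison, `0` and `1` for before and after.

```python
def diff_in_diff(y_t0, y_t1, y_c0, y_c1):
    """Return (Y_t1 - Y_t0) - (Y_c1 - Y_c0)."""
    change_treated = y_t1 - y_t0   # change over time in the treatment group
    change_control = y_c1 - y_c0   # change over time in the comparison group
    return change_treated - change_control


# Hypothetical group means (illustration only).
impact = diff_in_diff(y_t0=50.0, y_t1=70.0, y_c0=40.0, y_c1=48.0)

# A post-program comparison alone ignores the pre-existing gap between groups.
naive = 70.0 - 48.0

print(impact)  # 12.0
print(naive)   # 22.0
```

In this made-up case the post-only comparison overstates the effect because the treatment group already started 10 points higher.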

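[Figure: Outcome plotted against Time for the Treatment Group and the Control Group, with the start of Treatment marked on the time axis and the Average Treatment Effect indicated.]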

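[Figure: Outcome plotted against Time for the Treatment Group and the Control Group, with the start of Treatment marked on the time axis and the "Measured effect without pre-measurement" indicated.]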

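[Figure: Outcome plotted against Time for the Treatment Group and the Control Group, with the start of Treatment marked on the time axis and both the Estimated Average Treatment Effect and the Average Treatment Effect indicated.]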

Diff in diff

► What is the key difference between these two cases?

► Fundamental assumption that trends (slopes) are the same in treatments and controls (sometimes true, sometimes not)

► Need a minimum of three points in time (two of them pre-intervention) to verify this and estimate the treatment effect
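A small sketch of why three observations are needed: the two pre-intervention points give one pre-program change per group, which can be compared across groups before the double difference is computed. All numbers and names here are hypothetical, and a zero gap between two pre-period changes is only suggestive of equal trends, not proof.

```python
def slope(y_start, y_end, periods=1):
    """Average change in the outcome per period."""
    return (y_end - y_start) / periods


# Hypothetical group means at three observations:
# index 0 and 1 are pre-intervention, index 2 is post-intervention.
treated = [40.0, 45.0, 62.0]
control = [30.0, 35.0, 40.0]

# Pre-program trends: levels differ (40 vs 30) but the slopes should match.
pre_trend_gap = slope(treated[0], treated[1]) - slope(control[0], control[1])

# Double difference across the intervention (second to third observation).
did = slope(treated[1], treated[2]) - slope(control[1], control[2])

print(pre_trend_gap)  # 0.0
print(did)            # 12.0
```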

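[Figure: Outcome plotted against Time for the Treatment Group and the Control Group, with the First, Second, and Third observations marked, the start of Treatment marked on the time axis, and the Average Treatment Effect indicated.]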

Examples

► Two neighboring school districts
  • School enrollment or test scores are improving at the same rate before the program (even if at different levels)
  • One receives the program, one does not
► Neighboring _______

Case 5: Diff in Diff

Case 5 - Diff in Diff

                     Not Enrolled   Enrolled   t-stat
Mean change in CPC   8.26           35.92      10.31

Case 5 - Diff in Diff

                          Linear Regression   Multivariate Linear Regression
Estimated Impact on CPC   27.66**             25.53**
                          (2.68)              (2.77)

** Significant at 1% level
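The Case 5 figures can be cross-checked with a few lines of arithmetic. This sketch assumes the values in parentheses are standard errors (the usual convention, though the slide does not label them) and uses only numbers reported on the slide.

```python
# Values reported on the Case 5 slide.
mean_change_not_enrolled = 8.26
mean_change_enrolled = 35.92
reported_estimate = 27.66   # linear regression estimate
reported_std_err = 2.68     # assumed to be its standard error

# Double difference: change among enrolled minus change among not enrolled.
did = round(mean_change_enrolled - mean_change_not_enrolled, 2)

# Ratio of the estimate to its assumed standard error.
t_ratio = round(reported_estimate / reported_std_err, 2)

print(did)      # 27.66
print(t_ratio)  # 10.32
```

The difference in mean changes reproduces the simple linear regression estimate, and the ratio is close to the reported t-stat of 10.31; the small gap is presumably due to rounding in the published figures.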

Impact Evaluation Example – Summary of Results

Case                                Method                           Estimated Impact on CPC
Case 1 - Before and After           Multivariate Linear Regression   34.28** (2.11)
Case 2 - Enrolled/Not Enrolled      Multivariate Linear Regression   -4.15 (4.05)
Case 3 - Randomization              Multivariate Linear Regression   29.79** (3.00)
Case 4 - Regression Discontinuity   Multivariate Linear Regression   30.58** (5.93)
Case 5 - Diff in Diff               Multivariate Linear Regression   25.53** (2.77)

** Significant at 1% level

Example

► Old-age pensions and schooling in South Africa
  • Eligible if household member over 60
  • Not eligible if under 60
      • Used households with a member aged 55-60
  • Pensions for women and girls’ education

Measuring Impact

► Randomized Experiments
► Quasi-experiments
  • Randomized Promotion – Instrumental Variables
  • Regression Discontinuity
  • Double differences (Diff in diff)
  • Matching

Matching

► Pick the ideal comparison group that matches the treatment group from a larger survey.

► The matches are selected on the basis of similarities in observed characteristics. For example?

► This assumes no selection bias based on unobserved characteristics.
  • Example: income
  • Example: entrepreneurship

Source: Martin Ravallion

Propensity-Score Matching (PSM)

► Controls: non-participants with the same characteristics as participants
  • In practice, it is very hard. The entire vector of X observed characteristics could be huge.

► Match on the basis of the propensity score

  $P(X_i) = \Pr(\text{participation}_i = 1 \mid X_i)$

  • Instead of aiming to ensure that the matched control for each participant has exactly the same value of X, the same result can be achieved by matching on the probability of participation.
  • This assumes that participation is independent of outcomes given X (not true if important unobserved characteristics are affecting participation)
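To make the propensity score concrete, here is a minimal sketch that estimates Pr(participation = 1 | X) with a logit model fitted by plain gradient ascent. The function names, the learning rate, and the toy data are all assumptions for illustration; in practice a statistics package would be used to fit the logit or probit.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def propensity(beta, x):
    """Predicted probability of participation for covariate vector x."""
    return sigmoid(beta[0] + sum(b * xj for b, xj in zip(beta[1:], x)))


def fit_logit(X, d, lr=0.1, steps=5000):
    """Fit a logit of participation d on covariates X by gradient ascent
    on the average log-likelihood. Returns [intercept, coef_1, ...]."""
    n, k = len(X), len(X[0])
    beta = [0.0] * (k + 1)
    for _ in range(steps):
        grad = [0.0] * (k + 1)
        for x, di in zip(X, d):
            err = di - propensity(beta, x)
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


# Hypothetical data: one covariate (say, years of education of the head);
# participation is more common at low values. Not taken from the slides.
X = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0], [0.5], [4.5]]
d = [1, 1, 1, 0, 1, 0, 0, 0]

beta = fit_logit(X, d)
scores = [propensity(beta, x) for x in X]
print(beta)
print(scores)
```

Each unit's score collapses its whole covariate vector into a single number between 0 and 1, which is what makes matching feasible when X is large.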

Steps in Score Matching

1. Representative & highly comparable survey of non-participants and participants.
2. Pool the two samples and estimate a logit (or probit) model of program participation:
   • Gives the probability of participating for a person with X
3. Restrict samples to assure common support (important source of bias in observational studies)
4. For each participant find a sample of non-participants that have similar propensity scores
5. Compare the outcome indicators. The difference is the estimate of the gain due to the program for that observation.
6. Calculate the mean of these individual gains to obtain the average overall gain.
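Steps 3 to 6 can be sketched as follows, taking the propensity scores as already estimated. This is an illustration under stated assumptions, not the slides' procedure: common support is simplified to the range of non-participant scores, "similar" is defined by a hypothetical `caliper` of 0.05, and the data are invented.

```python
def psm_average_gain(participants, non_participants, caliper=0.05):
    """Each unit is a (propensity_score, outcome) pair.
    Returns the mean matched gain, or None if nobody could be matched."""
    # Step 3: common support, simplified here to the range of
    # propensity scores observed among non-participants.
    lo = min(p for p, _ in non_participants)
    hi = max(p for p, _ in non_participants)

    gains = []
    for score, outcome in participants:
        if not lo <= score <= hi:
            continue  # outside common support: dropped
        # Step 4: non-participants with similar propensity scores.
        matches = [y for p, y in non_participants if abs(p - score) <= caliper]
        if not matches:
            continue
        # Step 5: gain for this participant vs. its matched comparison mean.
        gains.append(outcome - sum(matches) / len(matches))

    # Step 6: average the individual gains.
    return sum(gains) / len(gains) if gains else None


# Hypothetical (score, outcome) pairs.
participants = [(0.30, 12.0), (0.50, 15.0), (0.95, 30.0)]
non_participants = [(0.28, 10.0), (0.32, 8.0), (0.52, 11.0), (0.10, 5.0)]

average_gain = psm_average_gain(participants, non_participants)
print(average_gain)  # 3.5
```

The participant with score 0.95 has no comparable non-participant and is excluded, so the estimate applies only to participants inside the region of common support.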

[Figure: Density of propensity scores, with the Propensity score axis running from 0 to 1. Labels: "Density of scores for participants", "Region of common support", and "High probability of participating given X".]

Steps in Score Matching

1. Representative & highly comparable survey of non-participants and participants.
2. Pool the two samples and estimate a logit (or probit) model of program participation:
   • Gives the probability of participating for a person with X
3. Restrict samples to assure common support (important source of bias in observational studies)
4. For each participant find a sample of non-participants that have similar propensity scores
5. Compare the outcome indicators. The difference is the estimate of the gain due to the program for that observation.
6. Calculate the mean of these individual gains to obtain the average overall gain.

PSM vs an experiment

► Pure experiment does not require the untestable assumption of independence conditional on observables

► PSM requires large samples and good data

Lessons on Matching Methods

► Typically used for IE when neither randomization, RD, nor other quasi-experimental options are possible (i.e. no baseline)
  • Be cautious of ex-post matching:
      • Matching on variables that change due to participation (i.e., endogenous)
      • What are some variables that won’t change?

► Matching helps control for OBSERVABLE differences

More Lessons on Matching Methods

► Matching at baseline can be very useful:
  • Estimation:
      • Combine with other techniques (i.e. diff in diff)
      • Know the assignment rule (match on this rule)
  • Sampling:
      • Selecting non-randomized control sample
► Need good quality data
  • Common support can be a problem

Case 7: Matching

Case 7 - PROPENSITY SCORE: Pr(treatment=1)

Variable      Coef.   Std. Err.
Age Head      -0.03   0.00
Educ Head     -0.05   0.01
Age Spouse    -0.02   0.00
Educ Spouse   -0.06   0.01
Ethnicity     0.42    0.04
Female Head   -0.23   0.07
Constant      1.6     0.10

P-score Quintiles (T = treatment, C = comparison)

              Quintile 1             Quintile 2             Quintile 3             Quintile 4             Quintile 5
Xi            T      C      t-score  T      C      t-score  T      C      t-score  T      C      t-score  T      C      t-score
Age Head      68.04  67.45  -1.2     53.61  53.38  -0.51    44.16  44.68  1.34     37.67  38.2   1.72     32.48  32.14  -1.18
Educ Head     1.54   1.97   3.13     2.39   2.69   1.67     3.25   3.26   -0.04    3.53   3.43   -0.98    2.98   3.12   1.96
Age Spouse    55.95  55.05  -1.43    46.5   46.41  0.66     39.54  40.01  1.86     34.2   34.8   1.84     29.6   29.19  -1.44
Educ Spouse   1.89   2.19   2.47     2.61   2.64   0.31     3.17   3.19   0.23     3.34   3.26   -0.78    2.37   2.72   1.99
Ethnicity     0.16   0.11   -2.81    0.24   0.27   -1.73    0.3    0.32   1.04     0.14   0.13   -0.11    0.7    0.66   -2.3
Female Head   0.19   0.21   0.92     0.42   0.16   -1.4     0.092  0.088  -0.35    0.35   0.32   -0.34    0.008  0.008  0.83
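The quintile table is a balance check: within each propensity-score quintile, the treatment and comparison means of each covariate are compared with a t-statistic. A rough sketch of one such comparison is below. The slides do not say which t-test or sign convention was used, so the Welch form and the comparison-minus-treatment sign are assumptions, and the ages are invented.

```python
import math
from statistics import mean, variance


def balance_t(treated, control):
    """Welch t-statistic for the comparison-minus-treatment difference in means."""
    std_err = math.sqrt(
        variance(treated) / len(treated) + variance(control) / len(control)
    )
    return (mean(control) - mean(treated)) / std_err


# Hypothetical ages of household heads within one propensity-score quintile.
age_head_treated = [68.0, 70.0, 66.0, 69.0]
age_head_control = [67.0, 69.0, 66.0, 68.0]

t_stat = balance_t(age_head_treated, age_head_control)
print(t_stat)

# A small |t| suggests the covariate is balanced within this quintile.
print(abs(t_stat) < 1.96)
```

Large t-scores inside a quintile, such as some of those reported for Quintile 1, flag covariates that remain unbalanced even after conditioning on the score.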

Case 7: Matching

Case 7 - Matching

                          Linear Regression   Multivariate Linear Regression
Estimated Impact on CPC   1.16                7.06+
                          (3.59)              (3.65)

** Significant at 1% level, + Significant at 10% level

Impact Evaluation Example – Summary of Results

Case                                Method                           Estimated Impact on CPC
Case 1 - Before and After           Multivariate Linear Regression   34.28** (2.11)
Case 2 - Enrolled/Not Enrolled      Multivariate Linear Regression   -4.15 (4.05)
Case 3 - Randomization              Multivariate Linear Regression   29.79** (3.00)
Case 4 - Regression Discontinuity   Multivariate Linear Regression   30.58** (5.93)
Case 5 - Diff in Diff               Multivariate Linear Regression   25.53** (2.77)
Case 6 - IV (TOT)                   2SLS                             30.44** (3.07)
Case 7 - Matching                   Multivariate Linear Regression   7.06+ (3.65)

** Significant at 1% level, + Significant at 10% level

Measuring Impact

► Experimental design/randomization
► Quasi-experiments
  • Regression Discontinuity
  • Double differences (Diff in diff)
  • Other options
      • Instrumental Variables
      • Matching
  • Combinations of the above

Remember…..

► Objective of impact evaluation is to estimate the CAUSAL effect of a program on outcomes of interest

► In designing the program we must understand the data generation process
  • behavioral process that generates the data
  • how benefits are assigned

► Fit the best evaluation design to the operational context

Design: Randomization
  When to use:
    ► Whenever possible
    ► When an intervention will not be universally implemented
  Advantages:
    ► Gold standard
    ► Most powerful
  Disadvantages:
    ► Not always feasible
    ► Not always ethical

Design: Random Promotion
  When to use:
    ► When an intervention is universally implemented
  Advantages:
    ► Learn and intervention
  Disadvantages:
    ► Only looks at sub-group of sample

Design: Regression Discontinuity
  When to use:
    ► If an intervention is assigned based on rank
  Advantages:
    ► Assignment based on rank is common
  Disadvantages:
    ► Only looks at sub-group of sample

Design: Double differences
  When to use:
    ► If two groups are growing at similar rates
  Advantages:
    ► Eliminates fixed differences not related to treatment
  Disadvantages:
    ► Can be biased if trends change

Design: Matching
  When to use:
    ► When other methods are not possible
  Advantages:
    ► Overcomes observed differences between treatment and comparison
  Disadvantages:
    ► Assumes no unobserved differences (often implausible)