![Page 1: Measuring Impact: Impact Evaluation Methods for Policy Makers](https://reader034.vdocuments.net/reader034/viewer/2022051518/568160e5550346895dd01742/html5/thumbnails/1.jpg)
Impact Evaluation
World Bank Institute
Human Development Network
Middle East and North Africa Region
Measuring Impact: Impact Evaluation Methods for Policy Makers
Paul Gertler
UC Berkeley
Note: slides by Sebastian Martinez, Christel Vermeersch and Paul Gertler. The content of this presentation reflects the views of the authors and not necessarily those of the World Bank. This version: November 2009.
![Page 2: Measuring Impact: Impact Evaluation Methods for Policy Makers](https://reader034.vdocuments.net/reader034/viewer/2022051518/568160e5550346895dd01742/html5/thumbnails/2.jpg)
Impact Evaluation
- Logical Framework: how the program works “in theory”
- Measuring Impact
  - Identification Strategy
  - Data
- Operational Plan
- Resources
![Page 3: Measuring Impact: Impact Evaluation Methods for Policy Makers](https://reader034.vdocuments.net/reader034/viewer/2022051518/568160e5550346895dd01742/html5/thumbnails/3.jpg)
Measuring Impact
1) Causal Inference
- Counterfactuals
- False Counterfactuals:
  - Before & After (pre & post)
  - Enrolled & Not Enrolled (apples & oranges)

2) IE Methods Toolbox:
- Random Assignment
- Random Promotion
- Discontinuity Design
- Difference in Differences (Diff-in-diff)
- Matching (P-score matching)
![Page 4: Measuring Impact: Impact Evaluation Methods for Policy Makers](https://reader034.vdocuments.net/reader034/viewer/2022051518/568160e5550346895dd01742/html5/thumbnails/4.jpg)
Our Objective
Estimate the CAUSAL effect (impact) of intervention P (program or treatment) on outcome Y (indicator, measure of success).
Example: what is the effect of a Health Insurance Subsidy Program (P) on Out of Pocket Health Expenditures (Y)?
![Page 5: Measuring Impact: Impact Evaluation Methods for Policy Makers](https://reader034.vdocuments.net/reader034/viewer/2022051518/568160e5550346895dd01742/html5/thumbnails/5.jpg)
Causal Inference
What is the impact of P on Y?
Answer: α = (Y | P=1) - (Y | P=0)
Can we all go home?
![Page 6: Measuring Impact: Impact Evaluation Methods for Policy Makers](https://reader034.vdocuments.net/reader034/viewer/2022051518/568160e5550346895dd01742/html5/thumbnails/6.jpg)
Problem of missing data
For a program beneficiary:
- we observe (Y | P=1): health expenditures (Y) with health insurance subsidy (P=1)
- but we do not observe (Y | P=0): health expenditures (Y) without health insurance subsidy (P=0)

α = (Y | P=1) - (Y | P=0)
![Page 7: Measuring Impact: Impact Evaluation Methods for Policy Makers](https://reader034.vdocuments.net/reader034/viewer/2022051518/568160e5550346895dd01742/html5/thumbnails/7.jpg)
Solution
Estimate what would have happened to Y in the absence of P.
We call this the… COUNTERFACTUAL
The key to a good impact evaluation is a valid counterfactual!
![Page 8: Measuring Impact: Impact Evaluation Methods for Policy Makers](https://reader034.vdocuments.net/reader034/viewer/2022051518/568160e5550346895dd01742/html5/thumbnails/8.jpg)
Estimating Impact of P on Y
- OBSERVE (Y | P=1): outcome with treatment
- ESTIMATE (Y | P=0): counterfactual
- α = (Y | P=1) - (Y | P=0)
- IMPACT = outcome with treatment - counterfactual
  - Intention to Treat (ITT): those offered treatment
  - Treatment on the Treated (TOT): those receiving treatment
- Use comparison or control group
![Page 9: Measuring Impact: Impact Evaluation Methods for Policy Makers](https://reader034.vdocuments.net/reader034/viewer/2022051518/568160e5550346895dd01742/html5/thumbnails/9.jpg)
Example: what is the impact of giving Fulanito additional pocket money (P) on Fulanito’s consumption of candies (Y)?
![Page 10: Measuring Impact: Impact Evaluation Methods for Policy Makers](https://reader034.vdocuments.net/reader034/viewer/2022051518/568160e5550346895dd01742/html5/thumbnails/10.jpg)
The perfect “Clone”
- Fulanito: 6 candies
- Fulanito’s clone: 4 candies

Impact = 6 - 4 = 2 candies
![Page 11: Measuring Impact: Impact Evaluation Methods for Policy Makers](https://reader034.vdocuments.net/reader034/viewer/2022051518/568160e5550346895dd01742/html5/thumbnails/11.jpg)
In reality, use statistics
- Treatment: average Y = 6 candies
- Comparison: average Y = 4 candies

Impact = 6 - 4 = 2 candies
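The group comparison can be sketched in a few lines of Python. This is only an illustration: the individual candy counts are made up (the slides give only the group averages of 6 and 4), and the function name `estimated_impact` is hypothetical.

```python
from statistics import mean

def estimated_impact(treatment_outcomes, comparison_outcomes):
    """Average outcome in the treatment group minus average in the comparison group."""
    return mean(treatment_outcomes) - mean(comparison_outcomes)

# Hypothetical candy counts, chosen so that the group averages are 6 and 4.
treatment = [5, 6, 7, 6]
comparison = [4, 3, 5, 4]

impact = estimated_impact(treatment, comparison)
print(impact)
```

The estimate is only as good as the comparison group: the subtraction is trivial, and all of the difficulty lies in making the comparison average a valid counterfactual.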
![Page 12: Measuring Impact: Impact Evaluation Methods for Policy Makers](https://reader034.vdocuments.net/reader034/viewer/2022051518/568160e5550346895dd01742/html5/thumbnails/12.jpg)
Finding Good Comparison Groups
- We want to find “clones” for the Fulanitos in our programs
- The treatment and comparison groups should have identical characteristics, except for benefiting from the intervention
- In practice, use program eligibility & assignment rules to construct valid counterfactuals

With a good comparison group, the only reason for different outcomes between treatments and controls is the intervention (P)
![Page 13: Measuring Impact: Impact Evaluation Methods for Policy Makers](https://reader034.vdocuments.net/reader034/viewer/2022051518/568160e5550346895dd01742/html5/thumbnails/13.jpg)
Case Study: HISP
- National Health System Reform
  - Closing gap in access and quality of services between rural and urban areas
  - Large expansion in supply of health services
  - Reduction of health care costs for rural poor
- Health Insurance Subsidy Program (HISP)
  - Pilot program
  - Covers costs for primary health care and drugs
  - Targeted to poor: eligibility based on poverty index
- Rigorous impact evaluation with rich data
  - 200 communities, 10,000 households
  - Baseline and follow-up data two years later
- Many outcomes of interest
  - Yearly out of pocket health expenditures per capita
- What is the effect of HISP (P) on health expenditures (Y)?
  - If impact is a reduction of $9 or more, then scale up nationally
![Page 14: Measuring Impact: Impact Evaluation Methods for Policy Makers](https://reader034.vdocuments.net/reader034/viewer/2022051518/568160e5550346895dd01742/html5/thumbnails/14.jpg)
Case Study: HISP
Eligibility and Enrollment
[Figure: households grouped as ineligible (non-poor) or eligible (poor), and as enrolled or not enrolled.]
![Page 15: Measuring Impact: Impact Evaluation Methods for Policy Makers](https://reader034.vdocuments.net/reader034/viewer/2022051518/568160e5550346895dd01742/html5/thumbnails/15.jpg)
![Page 16: Measuring Impact: Impact Evaluation Methods for Policy Makers](https://reader034.vdocuments.net/reader034/viewer/2022051518/568160e5550346895dd01742/html5/thumbnails/16.jpg)
Counterfeit Counterfactual #1: Before & After
[Figure: outcome Y against time, from baseline (T=0) to endline (T=1), with points A, B, and C (counterfactual); “IMPACT?” marks the gap in question.]
![Page 17: Measuring Impact: Impact Evaluation Methods for Policy Makers](https://reader034.vdocuments.net/reader034/viewer/2022051518/568160e5550346895dd01742/html5/thumbnails/17.jpg)
Case 1: Before & After
What is the effect of HISP (P) on health expenditures (Y)?
- Observe only beneficiaries (P=1)
- 2 observations in time:
  - expenditures at T=0
  - expenditures at T=1
- “Impact” = A - B = 7.8 - 14.4 = -6.6

[Figure: Y against time; B = 14.4 at T=0, A = 7.8 at T=1; α = A - B.]
![Page 18: Measuring Impact: Impact Evaluation Methods for Policy Makers](https://reader034.vdocuments.net/reader034/viewer/2022051518/568160e5550346895dd01742/html5/thumbnails/18.jpg)
Case 1: Before & After

| | Outcome with Treatment (After) | Counterfactual (Before) | Impact (Y \| P=1) - (Y \| P=0) |
|---|---|---|---|
| Health expenditures (Y) | 7.8 | 14.4 | -6.6** |

| | Linear Regression | Multivariate Linear Regression |
|---|---|---|
| Estimated impact on health expenditures (Y) | -6.59** | -6.65** |

Note: If the effect is statistically significant at the 1% significance level, we label the estimated impact with 2 stars (**).
![Page 19: Measuring Impact: Impact Evaluation Methods for Policy Makers](https://reader034.vdocuments.net/reader034/viewer/2022051518/568160e5550346895dd01742/html5/thumbnails/19.jpg)
Case 1: What’s the Problem?
- Economic boom: real impact = A - C; A - B is an underestimate
- Economic recession: real impact = A - D; A - B is an overestimate

[Figure: Y against time; B = 14.4 at T=0, A = 7.8 at T=1, α = A - B = -$6.6; C? and D? mark possible counterfactuals at T=1, each labeled “Impact?”.]
Problem with before & after: it doesn’t control for other time-varying factors!
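A small numeric sketch of the problem, in Python. A = 7.8 and B = 14.4 come from the slides; the counterfactual values C and D are invented purely for illustration, since the true counterfactual is exactly what a before & after comparison cannot observe.

```python
# Observed values from the HISP before & after comparison.
B = 14.4  # expenditures at baseline (T=0)
A = 7.8   # expenditures at endline (T=1)

before_after = A - B  # the naive "impact", about -6.6

# Hypothetical counterfactuals at T=1 (NOT from the slides):
C = 17.0  # assumed: without HISP, expenditures would have risen (boom)
D = 11.0  # assumed: without HISP, expenditures would have fallen (recession)

impact_if_boom = A - C       # larger reduction than A - B suggests
impact_if_recession = A - D  # smaller reduction than A - B suggests

print(round(before_after, 1), round(impact_if_boom, 1), round(impact_if_recession, 1))
```

Under these assumed values, the same observed pair (A, B) is compatible with quite different true impacts, which is why the baseline value B is not a credible stand-in for the counterfactual.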
![Page 20: Measuring Impact: Impact Evaluation Methods for Policy Makers](https://reader034.vdocuments.net/reader034/viewer/2022051518/568160e5550346895dd01742/html5/thumbnails/20.jpg)
![Page 21: Measuring Impact: Impact Evaluation Methods for Policy Makers](https://reader034.vdocuments.net/reader034/viewer/2022051518/568160e5550346895dd01742/html5/thumbnails/21.jpg)
False Counterfactual #2: Enrolled & Not Enrolled
- If we have post-treatment data on:
  - Enrolled: treatment group
  - Not enrolled: “control” group (counterfactual)
    - Those ineligible to participate
    - Those that choose NOT to participate
- Selection bias
  - Reason for not enrolling may be correlated with outcome (Y)
  - Control for observables, but not unobservables!
  - Estimated impact is confounded with other things
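Selection bias can be illustrated with a stylized simulation. Everything in this sketch is an assumption made for illustration: the true effect of -10, the unobserved `wealth` variable, the outcome equation, and the rule that poorer households enroll. None of it is taken from the HISP data.

```python
import random
from statistics import mean

TRUE_EFFECT = -10.0  # hypothetical true impact of the program

def naive_enrolled_vs_not(n_households=10_000, seed=0):
    """Compare enrolled and not-enrolled means when enrollment depends on an unobservable."""
    rng = random.Random(seed)
    enrolled_y, not_enrolled_y = [], []
    for _ in range(n_households):
        wealth = rng.gauss(0, 1)  # unobserved by the evaluator
        # Assumed outcome without the program: wealthier households spend more.
        y_without = 20 + 4 * wealth + rng.gauss(0, 2)
        if wealth < 0:  # assumed rule: poorer households enroll
            enrolled_y.append(y_without + TRUE_EFFECT)
        else:
            not_enrolled_y.append(y_without)
    return mean(enrolled_y) - mean(not_enrolled_y)

naive_estimate = naive_enrolled_vs_not()
print(round(naive_estimate, 1), "vs. true effect", TRUE_EFFECT)
```

In this setup the naive comparison mixes the program effect with the pre-existing gap between the two groups, so it overstates the reduction. Controlling for observed characteristics would not help here, because `wealth` is unobserved by construction.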
![Page 22: Measuring Impact: Impact Evaluation Methods for Policy Makers](https://reader034.vdocuments.net/reader034/viewer/2022051518/568160e5550346895dd01742/html5/thumbnails/22.jpg)
Case 2: Enrolled & Not Enrolled
Measure outcomes in post-treatment (T=1)
[Figure: ineligibles (non-poor) and eligibles (poor); Enrolled: Y = 7.8; Not Enrolled: Y = 21.8.]
In what ways might enrolled & not enrolled be different, other than their enrollment in the program?
![Page 23: Measuring Impact: Impact Evaluation Methods for Policy Makers](https://reader034.vdocuments.net/reader034/viewer/2022051518/568160e5550346895dd01742/html5/thumbnails/23.jpg)
Case 2: Enrolled & Not Enrolled

| | Outcome with Treatment (Enrolled) | Counterfactual (Not Enrolled) | Impact (Y \| P=1) - (Y \| P=0) |
|---|---|---|---|
| Health expenditures (Y) | 7.8 | 21.8 | -14** |

| | Linear Regression | Multivariate Linear Regression |
|---|---|---|
| Estimated impact on health expenditures (Y) | -13.9** | -9.4** |

Note: If the effect is statistically significant at the 1% significance level, we label the estimated impact with 2 stars (**).
![Page 24: Measuring Impact: Impact Evaluation Methods for Policy Makers](https://reader034.vdocuments.net/reader034/viewer/2022051518/568160e5550346895dd01742/html5/thumbnails/24.jpg)
Policy Recommendation?
Will you recommend scaling up HISP?
- Before-After: are there other time-varying factors that also influence health expenditures?
- Enrolled-Not Enrolled: are reasons for enrolling correlated with health expenditures? (Selection bias)

| | Case 1: Before and After | Case 1: Before and After | Case 2: Enrolled & Not-Enrolled | Case 2: Enrolled & Not-Enrolled |
|---|---|---|---|---|
| Method | Linear Regression | Multivariate Linear Regression | Linear Regression | Multivariate Linear Regression |
| Impact on health expenditures (Y) | -6.59** | -6.65** | -13.9** | -9.4** |
![Page 25: Measuring Impact: Impact Evaluation Methods for Policy Makers](https://reader034.vdocuments.net/reader034/viewer/2022051518/568160e5550346895dd01742/html5/thumbnails/25.jpg)
Keep in mind…
Two common comparisons to be avoided!
- Before & After (pre & post)
  - Compare: same individuals before and after they receive P
  - Problem: other things may have happened over time
- Enrolled & Not Enrolled (apples & oranges)
  - Compare: a group of individuals that enrolled in a program with a group that chooses not to enroll
  - Problem: selection bias; we don’t know why they are not enrolled

Both counterfactuals may lead to biased estimates of the impact
![Page 26: Measuring Impact: Impact Evaluation Methods for Policy Makers](https://reader034.vdocuments.net/reader034/viewer/2022051518/568160e5550346895dd01742/html5/thumbnails/26.jpg)
![Page 27: Measuring Impact: Impact Evaluation Methods for Policy Makers](https://reader034.vdocuments.net/reader034/viewer/2022051518/568160e5550346895dd01742/html5/thumbnails/27.jpg)
Choosing your IE method(s)…
Key information you will need for identifying the right method for your program:
- Prospective/retrospective evaluation?
- Eligibility rules and criteria?
  - Poverty targeting?
  - Geographic targeting?
- Roll-out plan (pipeline)?
- Is the number of eligible units larger than available resources at a given point in time?
  - Budget and capacity constraints?
  - Excess demand for program?
  - Etc.
![Page 28: Measuring Impact: Impact Evaluation Methods for Policy Makers](https://reader034.vdocuments.net/reader034/viewer/2022051518/568160e5550346895dd01742/html5/thumbnails/28.jpg)
Choosing your IE method(s)…
Best design = best comparison group you can find + least operational risk
- Have we controlled for “everything”?
  - Internal validity
  - Good comparison group
- Is the result valid for “everyone”?
  - External validity
  - Local versus global treatment effect
  - Evaluation results apply to population we’re interested in

Choose the “best” possible design given the operational context
![Page 29: Measuring Impact: Impact Evaluation Methods for Policy Makers](https://reader034.vdocuments.net/reader034/viewer/2022051518/568160e5550346895dd01742/html5/thumbnails/29.jpg)
![Page 30: Measuring Impact: Impact Evaluation Methods for Policy Makers](https://reader034.vdocuments.net/reader034/viewer/2022051518/568160e5550346895dd01742/html5/thumbnails/30.jpg)
Randomized Treatments and Controls
- When universe of eligibles > # benefits: Randomize!
  - Lottery for who is offered benefits
  - Fair, transparent and ethical way to assign benefits to equally deserving populations
- Oversubscription:
  - Give each eligible unit the same chance of receiving treatment
  - Compare those offered treatment with those not offered treatment (controls)
- Randomized phase-in:
  - Give each eligible unit the same chance of receiving treatment first, second, third…
  - Compare those offered treatment first with those offered treatment later (controls)
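Both assignment schemes can be sketched with a seeded shuffle. The function names `lottery` and `phase_in_order`, the unit labels, the seed, and the 200-unit, 100-benefit example are assumptions for illustration, not a description of how any actual program drew its sample.

```python
import random

def lottery(eligible_units, n_benefits, seed=None):
    """Oversubscription: every eligible unit has the same chance of being offered treatment."""
    rng = random.Random(seed)
    shuffled = list(eligible_units)
    rng.shuffle(shuffled)
    return shuffled[:n_benefits], shuffled[n_benefits:]  # (offered, controls)

def phase_in_order(eligible_units, n_phases, seed=None):
    """Randomized phase-in: every unit has the same chance of entering first, second, third..."""
    rng = random.Random(seed)
    shuffled = list(eligible_units)
    rng.shuffle(shuffled)
    return [shuffled[i::n_phases] for i in range(n_phases)]

# Hypothetical example: 200 eligible units, benefits for 100 of them.
communities = [f"community_{i:03d}" for i in range(200)]
treatment, control = lottery(communities, n_benefits=100, seed=2009)
waves = phase_in_order(communities, n_phases=3, seed=2009)
print(len(treatment), len(control), [len(w) for w in waves])
```

Fixing the seed makes the draw reproducible and auditable, which supports the transparency argument made for lotteries above.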
![Page 31: Measuring Impact: Impact Evaluation Methods for Policy Makers](https://reader034.vdocuments.net/reader034/viewer/2022051518/568160e5550346895dd01742/html5/thumbnails/31.jpg)
Randomized treatments and controls
[Figure: 1. Universe of eligible and ineligible units; 2. Random sample of eligibles (external validity); 3. Randomize into treatment and control (internal validity).]
![Page 32: Measuring Impact: Impact Evaluation Methods for Policy Makers](https://reader034.vdocuments.net/reader034/viewer/2022051518/568160e5550346895dd01742/html5/thumbnails/32.jpg)
Unit of Randomization
- Choose according to type of program:
  - Individual/Household
  - School/Health clinic/Catchment area
  - Block/Village/Community
  - Ward/District/Region
- Keep in mind:
  - Need “sufficiently large” number of units to detect minimum desired impact: power
  - Spillovers/contamination
  - Operational and survey costs

As a rule of thumb, randomize at the smallest viable unit of implementation
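To give a feel for what “sufficiently large” means, here is a rough sketch using the textbook normal-approximation formula for comparing two means, n per group = 2·((z₁₋α/₂ + z_power)·σ/δ)². It assumes independent observations and equal group sizes, so it understates what a community-randomized design needs. The standard deviation of 20 is invented; the minimum impact of 9 simply borrows the $9 scale-up threshold from the case study.

```python
from math import ceil
from statistics import NormalDist

def units_per_group(min_impact, outcome_sd, alpha=0.05, power=0.80):
    """Approximate units per group needed to detect `min_impact` (two-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) * outcome_sd / min_impact) ** 2)

# Hypothetical inputs: outcome_sd is assumed, not taken from the slides.
n = units_per_group(min_impact=9, outcome_sd=20)
n_half_impact = units_per_group(min_impact=4.5, outcome_sd=20)
print(n, n_half_impact)
```

Halving the minimum detectable impact roughly quadruples the required sample, which is one reason the choice of unit of randomization interacts so strongly with operational and survey costs.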
![Page 33: Measuring Impact: Impact Evaluation Methods for Policy Makers](https://reader034.vdocuments.net/reader034/viewer/2022051518/568160e5550346895dd01742/html5/thumbnails/33.jpg)
Case 3: Random Assignment
Health Insurance Subsidy Program (HISP)
- Unit of randomization: community
- 200 communities in the sample
- Randomized phase-in:
  - 100 treatment communities (5,000 households): started receiving transfers at baseline (T=0)
  - 100 control communities (5,000 households): receive transfers after follow-up (T=1) if program is scaled up
![Page 34: Measuring Impact: Impact Evaluation Methods for Policy Makers](https://reader034.vdocuments.net/reader034/viewer/2022051518/568160e5550346895dd01742/html5/thumbnails/34.jpg)
Case 3: Random Assignment
[Figure: timeline from T=0 to T=1 (the comparison period) for 100 treatment communities (5,000 HH) and 100 control communities (5,000 HH).]
![Page 35: Measuring Impact: Impact Evaluation Methods for Policy Makers](https://reader034.vdocuments.net/reader034/viewer/2022051518/568160e5550346895dd01742/html5/thumbnails/35.jpg)
How do we know we have good clones?
Case 3: Random Assignment
![Page 36: Measuring Impact: Impact Evaluation Methods for Policy Makers](https://reader034.vdocuments.net/reader034/viewer/2022051518/568160e5550346895dd01742/html5/thumbnails/36.jpg)
Case 3: Random Assignment
Case 3: Balance at Baseline

| | Control | Treatment | T-stat |
|---|---|---|---|
| Health expenditures ($ yearly per capita) | 14.57 | 14.48 | -0.39 |
| Head’s age (years) | 42.3 | 41.6 | 1.2 |
| Spouse’s age (years) | 36.8 | 36.8 | -0.38 |
| Head’s education (years) | 2.8 | 2.9 | -2.16** |
| Spouse’s education (years) | 2.6 | 2.7 | -0.006 |

** = significant at 1%
![Page 37: Measuring Impact: Impact Evaluation Methods for Policy Makers](https://reader034.vdocuments.net/reader034/viewer/2022051518/568160e5550346895dd01742/html5/thumbnails/37.jpg)
Case 3: Random Assignment
Case 3: Balance at Baseline

| | Control | Treatment | T-stat |
|---|---|---|---|
| Head is female = 1 | 0.07 | 0.07 | 0.66 |
| Indigenous = 1 | 0.42 | 0.42 | 0.21 |
| Number of household members | 5.7 | 5.7 | -1.21 |
| Bathroom = 1 | 0.56 | 0.57 | -1.04 |
| Hectares of land | 1.71 | 1.67 | 1.35 |
| Distance to hospital (km) | 106 | 109 | -1.02 |

** = significant at 1%
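A balance check of this kind compares baseline means between the two groups, one characteristic at a time. The sketch below computes a two-sample (Welch) t-statistic in plain Python. It is a simplification: the sample values are invented, and with community-level randomization a real analysis would also account for clustering, so this is not necessarily how the t-stats in the tables were produced.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(group_a, group_b):
    """t-statistic for the difference in means, allowing unequal variances."""
    standard_error = sqrt(variance(group_a) / len(group_a)
                          + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / standard_error

# Hypothetical baseline ages of household heads (NOT the HISP data).
control_age = [42, 45, 39, 44, 41, 40, 43]
treatment_age = [41, 43, 40, 45, 39, 42, 44]

t_stat = welch_t(control_age, treatment_age)
print(round(t_stat, 2))
```

A t-statistic close to zero for a baseline characteristic is what one hopes to see: it suggests the two groups looked alike on that dimension before the program started.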
![Page 38: Measuring Impact: Impact Evaluation Methods for Policy Makers](https://reader034.vdocuments.net/reader034/viewer/2022051518/568160e5550346895dd01742/html5/thumbnails/38.jpg)
Case 3: Random Assignment

| | Treatment Group (randomized to treatment) | Counterfactual (randomized to comparison) | Impact (Y \| P=1) - (Y \| P=0) |
|---|---|---|---|
| Baseline (T=0) health expenditures (Y) | 14.48 | 14.57 | -0.09 |
| Follow-up (T=1) health expenditures (Y) | 7.8 | 17.9 | -10.1** |

| | Linear Regression | Multivariate Linear Regression |
|---|---|---|
| Estimated impact on health expenditures (Y) | -10.1** | -10** |

Note: If the effect is statistically significant at the 1% significance level, we label the estimated impact with 2 stars (**).
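The simple linear regression estimate and the difference in follow-up means coincide for a reason: when Y is regressed on a binary treatment indicator P, the OLS slope equals mean(Y | P=1) - mean(Y | P=0). The sketch below checks this on invented data; the helper name `ols_slope` and the outcome values are assumptions, not the HISP sample.

```python
from statistics import mean

def ols_slope(p, y):
    """Slope from a simple OLS regression of y on p (with an intercept)."""
    p_bar, y_bar = mean(p), mean(y)
    covariance = sum((pi - p_bar) * (yi - y_bar) for pi, yi in zip(p, y))
    variance_p = sum((pi - p_bar) ** 2 for pi in p)
    return covariance / variance_p

# Hypothetical follow-up expenditures: three treated and three control households.
P = [1, 1, 1, 0, 0, 0]
Y = [7, 8, 9, 17, 18, 19]

slope = ols_slope(P, Y)
difference_in_means = (mean(y for p, y in zip(P, Y) if p == 1)
                       - mean(y for p, y in zip(P, Y) if p == 0))
print(slope, difference_in_means)
```

Adding baseline covariates, as in a multivariate regression, generally changes the estimate only slightly under random assignment, because the covariates are balanced across groups by design.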
![Page 39: Measuring Impact: Impact Evaluation Methods for Policy Makers](https://reader034.vdocuments.net/reader034/viewer/2022051518/568160e5550346895dd01742/html5/thumbnails/39.jpg)
HISP Policy Recommendation?

| | Case 1: Before and After | Case 2: Enrolled & Not-Enrolled | Case 2: Enrolled & Not-Enrolled | Case 3: Random Assignment |
|---|---|---|---|---|
| Method | Multivariate Linear Regression | Linear Regression | Multivariate Linear Regression | Multivariate Linear Regression |
| Impact of HISP on health expenditures (Y) | -6.65** | -13.9** | -9.4** | -10** |

** = significant at 1%
![Page 40: Measuring Impact: Impact Evaluation Methods for Policy Makers](https://reader034.vdocuments.net/reader034/viewer/2022051518/568160e5550346895dd01742/html5/thumbnails/40.jpg)
Keep in mind…
Random Assignment:
- With large enough samples, produces two groups that are statistically equivalent
- We have identified the perfect “clone”
- Feasible for prospective evaluations with over-subscription/excess demand
- Most pilots and new programs fall into this category!

[Figure: randomized beneficiary and randomized comparison.]
![Page 41: Measuring Impact: Impact Evaluation Methods for Policy Makers](https://reader034.vdocuments.net/reader034/viewer/2022051518/568160e5550346895dd01742/html5/thumbnails/41.jpg)
Remember…
- Objective of impact evaluation is to estimate the CAUSAL effect or IMPACT of a program on outcomes of interest
- To estimate impact, we need to estimate the counterfactual
  - What would have happened in the absence of the program
  - Use comparison or control groups
- We have a toolbox with 5 methods to identify good comparison groups
- Choose the best evaluation method that is feasible in the program’s operational context
![Page 42: Measuring Impact: Impact Evaluation Methods for Policy Makers](https://reader034.vdocuments.net/reader034/viewer/2022051518/568160e5550346895dd01742/html5/thumbnails/42.jpg)
THANK YOU!