Evaluation methods for measuring the impact of social protection programs
Joost de Laat, Menahem Prywes, Shafique Jamal
The World Bank
Objectives:
Understand:
o Principles of the difference in the differences method of project evaluation, and the weaknesses of the method.
o Principles of the randomized controlled trial (RCT) method.
o Limits and weaknesses of the randomized controlled trial method.
o Principles of the regression discontinuity design (RDD) method.
How donors evaluated development projects
o Donors often couldn’t evaluate development projects, and especially health projects, convincingly because:
  o No one bothered to collect baseline data.
  o No one tracked the treatment (beneficiary) group over time.
o Sometimes donors collected this information and then measured changes in the treatment group over time. However, it remained unclear whether this performance was better or worse than that of comparator (control) groups.
o Sometimes donors applied the difference in the differences method, which compares the change in results for the treatment group with the change in results for a control group. But it’s often unclear whether the comparator group really is comparable to the treatment group.
o Parliaments and donors increasingly demand credible evaluations!
Difference in differences methodology-1 (Source: Prashant Bharadwaj)
Difference in differences methodology-2 (Source: Prashant Bharadwaj)
Difference in differences methodology-3 (Source: Prashant Bharadwaj)
Difference in the differences methodology: simple numerical example (Source: Prashant Bharadwaj)
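The arithmetic behind a difference in the differences estimate can be sketched in a few lines of Python. This is a minimal illustration only: the four group means are hypothetical and are not the numbers from the slide's example, and the function name is invented for this sketch.

```python
def diff_in_diff(treat_before, treat_after, control_before, control_after):
    """Change in the treatment group minus change in the control group."""
    treatment_change = treat_after - treat_before
    control_change = control_after - control_before
    return treatment_change - control_change

# Hypothetical mean outcomes (for example, monthly consumption per capita).
estimate = diff_in_diff(treat_before=100.0, treat_after=130.0,
                        control_before=90.0, control_after=105.0)
print(estimate)  # (130 - 100) - (105 - 90) = 15.0
```

The estimate credits the program only with the part of the treatment group's change that exceeds the control group's change. It is valid only if both groups would have followed the same trend without the program.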
In contrast to the Difference in the Differences method, randomized controlled trials seek to make valid comparisons between outcomes for treatment and control groups.
Randomization establishes a control group that is, on average, statistically identical to the intervention group. This produces unbiased estimates of impact.
Randomization reduces selection bias, for example:
o Undercoverage: some parts of the population are under-represented in the sample.
o Self-selection: people who agree to participate in the trial have special characteristics, e.g. strong opinions on an issue.
o Nonresponse bias: participants who do not respond may have particular views or other characteristics.
Unit of Randomization
Choose according to type of program:
o Individual/Household
o School/Health Clinic/catchment area
o Block/Village/Community
o Ward/District/Region
Keep in mind:
o Need “sufficiently large” number of units to detect minimum desired impact: Power.
o Spillovers/contamination
o Operational and survey costs
As a rule of thumb, randomize at the smallest viable unit of implementation.
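To make “sufficiently large” concrete, here is a rough sketch of a power calculation. It uses the standard normal-approximation formula for comparing two means, plus the usual design effect for randomizing whole clusters such as villages. All inputs (minimum effect, standard deviation, cluster size, intra-cluster correlation) are hypothetical, and the function names are invented for this sketch.

```python
import math
from statistics import NormalDist

def units_per_arm(min_effect, sd, alpha=0.05, power=0.80):
    """Approximate units needed per arm to detect `min_effect` in a
    two-arm comparison of means (two-sided test, normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    n = 2 * ((z_alpha + z_power) * sd / min_effect) ** 2
    return math.ceil(n)

def design_effect(cluster_size, icc):
    """Sample-size inflation factor when whole clusters are randomized;
    `icc` is the intra-cluster correlation of the outcome."""
    return 1 + (cluster_size - 1) * icc

# Hypothetical inputs: detect a 10-unit impact when the outcome's SD is 50.
n_individual = units_per_arm(min_effect=10.0, sd=50.0)
# Same target, but randomizing villages of 30 households with ICC = 0.05.
n_clustered = math.ceil(n_individual * design_effect(cluster_size=30, icc=0.05))
print(n_individual, n_clustered)
```

Randomizing at a larger unit raises the required sample, which is one reason behind the rule of thumb above.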
Example: Randomized Assignment
Mexico Progresa Conditional Cash Transfer program
Unit of randomization: Community
506 communities in the evaluation sample
Randomized phase-in:
o 320 treatment communities (14,446 households): first transfers in April 1998.
o 186 comparison communities (9,630 households): first transfers in November 1999.
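A random split of this kind can be sketched as follows. This is an illustration only and not Progresa's actual assignment procedure: the community identifiers, the seed, and the function name are all hypothetical. Only the counts (506 communities, 320 assigned to treatment) come from the slide.

```python
import random

def assign_communities(community_ids, n_treatment, seed=0):
    """Shuffle the communities and take the first `n_treatment` as the
    treatment group; the rest form the comparison group."""
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    shuffled = list(community_ids)
    rng.shuffle(shuffled)
    treatment = set(shuffled[:n_treatment])
    comparison = set(shuffled[n_treatment:])
    return treatment, comparison

# Hypothetical identifiers for the 506 communities in the evaluation sample.
communities = [f"community_{i:03d}" for i in range(1, 507)]
treatment, comparison = assign_communities(communities, n_treatment=320)
print(len(treatment), len(comparison))  # 320 186
```

Every community has the same chance of landing in the treatment group, so assignment cannot depend on community characteristics.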
Example: Randomized Assignment
[Figure: timeline from T=0 to T=1 showing the 320 treatment communities and the 186 comparison communities; the interval between T=0 and T=1 is the comparison period.]
Example: Randomized Assignment
How do we know we have good clones?
In the absence of Progresa, the treatment and comparison groups should be statistically identical.
Let’s compare their characteristics at baseline (T=0)
Example: Balance at Baseline
Case 3: Randomized Assignment

                                      Treatment   Comparison   T-stat
Consumption ($ monthly per capita)    233.4       233.47       -0.39
Head’s age (years)                    41.6        42.3         -1.2
Spouse’s age (years)                  36.8        36.8         -0.38
Head’s education (years)              2.9         2.8          2.16**
Spouse’s education (years)            2.7         2.6          0.006

Note: If the effect is statistically significant at the 5% significance level, we label the estimated impact with 2 stars (**).
Example: Balance at Baseline
Case 3: Randomized Assignment

                                      Treatment   Comparison   T-stat
Head is female=1                      0.07        0.07         -0.66
Indigenous=1                          0.42        0.42         -0.21
Number of household members           5.7         5.7          1.21
Bathroom=1                            0.57        0.56         1.04
Hectares of Land                      1.67        1.71         -1.35
Distance to Hospital (km)             109         106          1.02

Note: If the effect is statistically significant at the 5% significance level, we label the estimated impact with 2 stars (**).
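A balance check of this kind can be sketched with a two-sample t-statistic. The sketch below uses Welch's formula on simulated data; it is an assumption for illustration, not the method behind the tables above, whose t-statistics may come from regressions that account for community-level randomization. The data, seed, and function name are hypothetical.

```python
import math
import random
from statistics import mean, variance

def t_statistic(group_a, group_b):
    """Welch t-statistic for the difference in means of two samples."""
    standard_error = math.sqrt(variance(group_a) / len(group_a)
                               + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / standard_error

# Simulated baseline ages for two groups drawn from the same population.
rng = random.Random(42)
treatment_age = [rng.gauss(42, 10) for _ in range(500)]
comparison_age = [rng.gauss(42, 10) for _ in range(500)]

t = t_statistic(treatment_age, comparison_age)
# |t| below about 1.96 means no significant difference at the 5% level.
print(f"t = {t:.2f}, significant at 5%: {abs(t) > 1.96}")
```

When many characteristics are tested, an occasional significant difference, such as the head's education in the first table, can arise by chance even under proper randomization.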
Example: Randomized Assignment

                                   Treatment Group   Counterfactual    Impact
                                   (Randomized to    (Randomized to    (Y | P=1) - (Y | P=0)
                                   treatment)        Comparison)
Baseline (T=0) Consumption (Y)     233.47            233.40            0.07
Follow-up (T=1) Consumption (Y)    268.75            239.5             29.25**

Estimated Impact on Consumption (Y)
Linear Regression                  29.25**
Multivariate Linear Regression     29.75**

Note: If the effect is statistically significant at the 5% significance level, we label the estimated impact with 2 stars (**).
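The impact column is a simple difference in group means, (Y | P=1) - (Y | P=0). A minimal sketch that reproduces the two differences from the table's reported means (the variable and function names are chosen for this sketch):

```python
# Mean monthly consumption per capita, as reported in the table above.
treatment = {"baseline": 233.47, "follow_up": 268.75}
comparison = {"baseline": 233.40, "follow_up": 239.5}

def difference(period):
    """Treatment mean minus comparison mean for the given period."""
    return round(treatment[period] - comparison[period], 2)

baseline_gap = difference("baseline")
impact = difference("follow_up")
print(baseline_gap, impact)  # 0.07 29.25
```

Because the groups were statistically equivalent at baseline, the follow-up difference can be read as the program's impact. The significance stars require standard errors, which the table does not report, so they are not recomputed here.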
Keep in Mind: Randomized Assignment
o Randomized assignment, with large enough samples, produces two statistically equivalent groups: randomized beneficiaries and a randomized comparison group. We have identified the perfect clone.
o Feasible for prospective evaluations with over-subscription/excess demand.
o Most pilots and new programs fall into this category.
Limits on randomized controlled trials
Out of sample generalization: Results from these trials are internally valid but cannot be generalized (extrapolated) out of sample. An inference of general validity of a result would require an internally consistent theory of causation and repeated randomized controlled trials in different countries, demographic rules, and natural environments.
Results are comparisons of averages. Therefore the results of a randomized controlled trial may not be valid for making policies for sub-groups or for individual households and people, especially if the policymaker has additional information.
Risks of bias in the randomized controlled trial methodology
Self-selection out of the control group. Randomized controlled trials in the social sciences are not double-blind, unlike pharmaceutical trials. The people who are not receiving the treatment (for example, tutoring or nutritional supplements) may decide to obtain it on their own, biasing the results.
Replacement of drop-outs may lead to bias.
Limits to use of randomized controlled trials
Randomized controlled trials are expensive. They can cost anywhere from $150,000 to several million dollars. A $500,000 cost is typical. This means the method cannot be applied to all development projects.
Many development projects do not address units that can be randomized. For instance states or provinces/oblasts cannot be meaningfully randomized.
Ethical rules are unclear. In medical research, participation in a randomized controlled trial requires informed consent. There are no general rules for economic development projects, although US universities and some developing countries have their own ethical rules.
Subtle conflicts of interest and biases can prejudice all evaluation studies, whatever the methodology.
Sponsors’ conflict of interest. Donors, governments, project units, and NGOs prefer to report positive findings because this helps to sustain their business and jobs. Sometimes, project units resist or even refuse payment to contractors who deliver negative evaluation reports.
Contractors’ conflict of interest. The contractors who carry out the evaluation studies may be influenced by their clients’ preferences.
Confirmation bias. Donors, governments, project units, and NGOs often believe that outcomes are positive and tend to perceive positive outcomes. Also, officials, managers, and development experts come to identify personally with the projects. Their psychological bias is, ‘I mean well, therefore the project is successful.’
Publication bias. Scholarly journals prefer to publish positive results and generally neglect negative results (‘no effect’ is not newsworthy). This may induce bias in academic work.
Economic and ethical questions
When should donors and governments insist on application of the randomized controlled trial methodology and when is this inappropriate?
When is it unethical to use the randomized controlled trial methodology in a development context?
Regression Discontinuity Design
Many social programs select beneficiaries using an index or score:
o Anti-poverty programs: targeted to households below a given poverty index/income
o Pensions: targeted to population above a certain age
o Education: scholarships targeted to students with high scores on standardized tests
o Agriculture: fertilizer program targeted to small farms of less than a given number of hectares
Regression Discontinuity Design
Example: Effect of social assistance program on nutrition
Goal: Reduce vulnerability and improve nutrition of poor families
Method:
o Households with a poverty score ≤50 are poor
o Households with a poverty score >50 are not poor
Intervention: Poor households receive social assistance transfers
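Regression Discontinuity Design-Baseline
[Figure: households marked as eligible and not eligible at baseline.]
Regression Discontinuity Design-Post Intervention
[Figure: post-intervention outcomes, with the IMPACT marked.]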
Regression Discontinuity Design
We have a continuous eligibility index with a defined cut-off:
o Households with a score ≤ cutoff are eligible
o Households with a score > cutoff are not eligible
o Or vice-versa
Intuitive explanation of the method:
o Units just above the cut-off point are very similar to units just below it, so they make a good comparison.
o Compare outcomes Y for units just above and below the cut-off point.
For a discontinuity design, you need:
1) Continuous eligibility index
2) Clearly defined eligibility cut-off
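The comparison just above and below the cut-off can be sketched on simulated data. This is a minimal illustration under stated assumptions: the data, the true impact of 5, the bandwidth of 10, and all function names are hypothetical. It fits a straight line on each side of the cut-off and measures the jump between the two lines at the cut-off, which is one common way to implement the comparison, not necessarily the one used in the presentation.

```python
import random
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return y_bar - slope * x_bar, slope

def rdd_estimate(scores, outcomes, cutoff, bandwidth):
    """Jump in the fitted outcome at the cut-off: eligible side
    (score <= cutoff) minus ineligible side, using only units whose
    score lies within `bandwidth` of the cut-off."""
    pairs = list(zip(scores, outcomes))
    below = [(s, y) for s, y in pairs if cutoff - bandwidth <= s <= cutoff]
    above = [(s, y) for s, y in pairs if cutoff < s <= cutoff + bandwidth]
    b_intercept, b_slope = fit_line(*zip(*below))
    a_intercept, a_slope = fit_line(*zip(*above))
    return (b_intercept + b_slope * cutoff) - (a_intercept + a_slope * cutoff)

# Simulated households: the outcome rises with the score, and households
# with a score <= 50 receive a program that adds TRUE_IMPACT to the outcome.
rng = random.Random(1)
TRUE_IMPACT = 5.0
scores = [rng.uniform(0, 100) for _ in range(5000)]
outcomes = [40 + 0.2 * s + (TRUE_IMPACT if s <= 50 else 0.0) + rng.gauss(0, 1)
            for s in scores]

estimate = rdd_estimate(scores, outcomes, cutoff=50, bandwidth=10)
print(round(estimate, 2))  # should be close to TRUE_IMPACT
```

Fitting a line on each side, rather than comparing raw averages, keeps the underlying trend in the score from being mistaken for program impact.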
THANK YOU!
Questions?
Next: Tajikistan Example