Causal Logic Models: Incorporating Change & Action Models; Fidelity-Adaptation Relationship; Stakeholder Engagement & Partnership Strategies
PADM 522—Summer 2012
Lecture 3
Professor Mario Rivera
[Diagram: Intervention and Other Factors each feed into Outcome]
A causal logic model clarifies the program’s theory, or change and action modeling of the way that interventions produce outcomes, by isolating program effects from other factors or influences.
Multiple methods may be used to establish the relative importance of various causative influences. These include experimental, quasi-experimental, and cross-case analysis, and range from quantitative to mixed-methods to purely qualitative methods. Most evaluations use mixed-methods designs.
Causal logic models—essential definitions, methods
Components of a causal logic model (in red) pertain to program theory; they augment regular logic modeling. Left to right on the graphic one would find, in some order:
- Inputs (fiscal and human resources invested; key programmatic initiatives)
- Assumptions, underlying conditions, premises (may specify ones under program control and outside program control, as in USAID’s Logical Framework or LogFrame)
- Causative (if-then) linkages among program functions, indicating change and action models or program theory
- Program activities, services
- Immediate or short-term outcomes (process measures)
- Intermediate or medium-term outcomes (outcome measures)
- Long-term results, long-term outcomes, or program impact (impact measures)
The Causal Logic Model Framework: Incorporating “If-then” Causative Linkages Among Program Components
[Diagram: a causal logic model reading left to right]
- Inputs: personnel resources; funding/other resources; curriculum
- Assumptions/Conditions: if condition A exists; if need B exists; if condition C exists
- Activities/Services: do AA; provide BB; provide training about CC
- Immediate Outcomes: something happens; population gets BB; # trained
- Intermediate Outcomes: a later result; condition ABC improves
- Long-Term Outcomes/Results
A Causal Model Worksheet (one format)
Columns: Inputs | Assumption or Underlying Condition | Activities | Immediate Outcomes | Intermediate Outcomes | Long-Term Outcomes/Results/Impact
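The worksheet columns can also be captured in a simple data structure, which is handy when tabulating several model rows. This is only an illustrative sketch; the class and field names are my own, not part of the worksheet format:

```python
from dataclasses import dataclass

@dataclass
class CausalLogicRow:
    """One row of a causal logic model worksheet.

    Field names are illustrative, mirroring the worksheet columns:
    inputs; assumption or underlying condition; activities; and the
    immediate, intermediate, and long-term outcome columns.
    """
    inputs: list
    assumption: str              # the "if ..." underlying condition
    activities: list             # services or activities provided
    immediate_outcomes: list     # process measures
    intermediate_outcomes: list  # outcome measures
    long_term_outcomes: list     # impact measures

# Hypothetical worksheet row, echoing the training example in the framework graphic:
row = CausalLogicRow(
    inputs=["curriculum", "personnel resources", "funding"],
    assumption="If need B exists among the target population",
    activities=["provide training about CC"],
    immediate_outcomes=["# trained"],
    intermediate_outcomes=["condition ABC improves"],
    long_term_outcomes=["lasting change in the targeted condition"],
)
print(row.assumption)
```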
Making program theory or a program’s change model explicit: an example from Chen
In a hypothetical spouse abuse treatment program that relies on group counseling, with a target group of “abusers convicted by a court,” Chen (page 18) proposes that the change model may work as follows: “[T]he designers decide that group counseling should be provided weekly for 10 weeks because they believe that 10 counseling sessions is a sufficient ‘dose’ for most people” who are similarly situated. Here the tacit theory or change model is bound up with the expectation that counseling is a sufficient intervention to elicit behavioral change in adjudicated abusers. So-called “zero-tolerance” automatic incarceration programs instead build on the premise that incarceration is required as a deterrent and as a prompt for behavioral change.
Action and change models and partnerships
Chen divides program theory into two component parts: an action model and a change model. An action model should incorporate both program ecological context and dimensions of inter-agency collaboration. Using the just-cited example of a domestic violence program, Chen argues that the program would fail if it lacked a working relationship with the courts, police, and community social agency partners and advocacy groups (p. 26). It is therefore important to align models as well as strategies in working in concert with other agencies, although that can be very difficult.
Partnered programs may have different change models at work, or they may operate on different concepts of a single model set. What if one partner agency in a domestic violence collaborative operates on one set of assumptions (e.g., a model based on zero-tolerance, and deterrence through incarceration) while another does so based on a rehabilitation & counseling model?
Such programs create complex effects chains, as the efforts of various partners have impact in different places, at different times.
Chen’s Stakeholder Engagement and Partnership Strategies
Chen provides another dimension of partnership in his evaluation framework, namely that of evaluator-stakeholder partnership, particularly in the development and assessment of partnered programs. This essentially occurs when program principals and stakeholders bring evaluators into the program coalition and program development effort as key partners. What are the pros and cons of this kind of evaluator involvement in program development? At what junctures of evaluator involvement are dilemmas likely to present themselves? Might it be possible for an evaluator to become involved in this way early on in a program but then detach himself or herself for the purposes of outcome evaluation? If not, why not? Can stakeholders empower evaluators? How?
“Integrative Validity”—from Chen, Huey T. (2010). “The bottom-up approach to integrative validity: A new perspective for program evaluation,” Evaluation and Program Planning, 33(3), 205-214.
“Evaluators and researchers have . . . increasingly recognized that in an evaluation, the over-emphasis on internal validity reduces that evaluation's usefulness and contributes to the gulf between academic and practical communities regarding interventions” (p. 205).
Chen proposes an alternative integrative validity model for program evaluation, premised on viability and “bottom-up” incorporation of stakeholders’ views and concerns. The integrative validity model and the bottom-up approach enable evaluators to meet scientific and practical requirements, help advance external validity, and offer a new perspective on methods. For integrative validity to obtain, stakeholders must be centrally involved, consistent with Chen’s emphasis on addressing both scientific and stakeholder validity.
Key Concepts in Impact Assessment
Linking interventions to outcomes:
- Establishing impact essentially amounts to establishing causality.
- Most causal relationships in social and behavioral science are expressed as probabilities.
Conditions limiting assessments of causality:
- External conditions and causes
- Internal conditions (such as biased selection)
- Other social programs with similar targets
Key Concepts in Impact Assessment
“Perfect” versus “good enough” impact assessments:
- The intervention and target may not allow a perfect design.
- Time and resource constraints apply.
- The importance of the program often determines the rigor warranted.
- Review design options to determine the most appropriate; mixed methods are most often used.
Quasi-experiments, cross-case or cross-site designs, and “natural experiments” are typically the closest one can come to true experimentation. These may provide as much or more rigor than efforts at randomized experiments on a clinical model.
Key Concepts in Impact Assessment
Gross versus net outcomes. Net outcomes and the counterfactual: net outcomes equal outcomes of the program minus projected outcomes without the program.
Gross Outcome = Effects of Intervention (net effect) + Effects of Other Processes (extraneous confounding factors) + Design Effects
Program impacts as comparative net outcomes
If, for example, one finds in an anti-smoking program that only 2 percent of targeted youth have quit or not taken up smoking by virtue of the program, the program appears ineffective. However, if in comparable populations not exposed to it there was a 1.5 percent increase in smoking behaviors, the program seems more effective: arguably, it was able to stem some of the naturally occurring increase in tobacco use (first or continued use). The critical distinction is between outcomes and impacts. In evaluation, an outcome is the value of any variable measured after an intervention. An impact is the difference between the outcome observed and what would have occurred without the intervention; i.e., an impact is the difference in outcomes attributable to the program. Impacts also must entail lasting changes in a targeted condition.
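The arithmetic behind this example can be made explicit. Treating the comparison population’s change as the counterfactual gives the net impact; a minimal sketch using the figures above:

```python
# Net impact = observed outcome change minus the counterfactual change.
# Figures from the anti-smoking example: a 2-percentage-point reduction
# in smoking uptake/continuation among targeted youth, versus a
# 1.5-point INCREASE in comparable populations not exposed to the program.
observed_change = -2.0        # change in smoking with the program
counterfactual_change = +1.5  # projected change without the program

net_impact = observed_change - counterfactual_change
print(net_impact)  # -3.5: percentage points of smoking averted, attributable to the program
```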
Key terminology re: attribution/causation
- Independent variables: direct policy/program interventions.
- Dependent variables: outcomes.
- Intervention variables: a special class of independent variables referring to policy/programming factors as discrete variables; these are endogenous (internal) factors.
- Exogenous factors: external to the program; contextual.
- Counterfactual: the state of affairs that would have occurred without the program.
- Gross impact: observed change in outcome or outcomes.
- Net impact: the portion of gross impact attributable to the program intervention; program intervention effects minus the counterfactual.
- Confounding variables: other factors contributing to the impact felt or measured within the program.
Confounding Factors—exogenous (external) & endogenous (internal)
Exogenous confounding factors: other programs and messages; socioeconomic context.
Endogenous effects of uncontrolled selection:
- Preexisting differences between treatment and control groups
- Self-selection
- Program location and access
- Deselection processes (attrition bias)
Endogenous change:
- Secular drift
- Interfering events internal to the program
- Maturational trends
Design Effects
Choice of outcome measures. A critical measurement problem in evaluations is selecting the best measures for assessing outcomes: conceptualization, reliability, feasibility, proxy and indirect measures.
Missing information. Missing information is generally not randomly distributed; it often must be compensated for by alternative survey items, unobtrusive measures, or estimates.
Design Strategies Compensating for Experimental Controls
Full- versus partial-coverage programs: full coverage means the absence of a control group. This is the norm for social programs, since it is unfeasible to deny the intervention or treatment to a control group of participants.
The evaluator must then use reflexive controls, for instance cross-case and cross-site comparisons internal to the program. “Reflexive controls” means program-specific approximations of experimental controls.
Realities of Randomized Experimental Design: Afterschool Science Program Example
One would need to recruit all interested and eligible middle school students to create a large enough subject pool, when it is hard enough to recruit adequately sized cohorts.
One would need to ask parents and students for permission to assign them randomly to one of two conditions, and then divide subjects into those conditions. But what would the conditions be? Denial of program benefits is unfeasible, and it would alienate everyone—parents, students, teachers. Try two curricula? Expensive, and it raises the question of what is really being evaluated.
One could focus outcome evaluation efforts on randomly assigned subjects while including all subjects in the process evaluation. However, it is not clear that one would learn any more than otherwise from all this effort. Quasi-experiments and cross-case designs would likely offer equal rigor.
Example: one experimental-design evaluation examined whether a home-based mentoring intervention forestalled a second birth for at least two years after an adolescent’s first birth
Does participation in the program reduce the likelihood of an early second birth?
- Randomized controlled trial involving first-time African-American adolescent mothers (n=181) younger than age 18
- Intervention based on social cognitive theory, focused on interpersonal negotiation skills, adolescent development, and parenting
- Delivered bi-weekly until the infant’s first birthday
- Mentors were African-American, college-educated single mothers
- Control group received usual care; no differences in baseline contraceptive use or other measures of ‘risk’
- Follow-up at 6, 13, and 24 months after recruitment at first delivery; response rate 82% at 24 months
- Intervention mothers were less likely than control mothers to have a second infant; two or more intervention visits more than tripled the odds of avoiding a second birth within two years of the first
Black et al. (2006). Delaying second births among adolescent mothers: A randomized, controlled trial of a home-based mentoring program. Pediatrics, 118, e1087-1099.
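The headline finding (“more than tripled the odds”) is an odds ratio. A sketch of the calculation with hypothetical cell counts, since the study’s actual 2×2 counts are not reproduced here:

```python
# Odds ratio for avoiding a second birth, intervention vs. control.
# The counts below are HYPOTHETICAL, chosen only to illustrate the
# computation; see Black et al. (2006) for the study's actual data.
intervention_avoided, intervention_second_birth = 72, 18
control_avoided, control_second_birth = 50, 40

odds_intervention = intervention_avoided / intervention_second_birth  # 4.0
odds_control = control_avoided / control_second_birth                 # 1.25

odds_ratio = odds_intervention / odds_control
print(odds_ratio)  # 3.2 with these illustrative counts
```

An odds ratio above 3 corresponds to the "more than tripled" language; the ratio compares the odds of the outcome in the two groups rather than their raw percentages.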
Incorporate Process Evaluation Measures in Outcome Analysis
Process evaluation measures assess qualitative and quantitative dimensions of program implementation, e.g.:
- Attendance data
- Participant feedback
- Program-delivery adherence to implementation guidelines
These measures facilitate replication, make possible greater understanding of outcome evaluation findings, and support program improvement.
They also help avoid a typical evaluation error: concluding that a program is not effective when in fact the program was not implemented as intended. Program stakeholders may point out that discrepancy if they are consulted about process, thereby “empowering” the outcome evaluation.
Source: USDHHS. (2002). Science-based prevention programs and principles, 2002. Rockville, MD: Author.
Example: Children’s Hospital Boston study to increase parenting skills and improve attitudes about parenting among parenting teens through a structured psycho-educational group model
- All parenting teens (n=91) were offered a 12-week group parenting curriculum.
- A comparison group (n=54) declined the curriculum but agreed to participate in the evaluation.
- Pre-test and post-test measures included the Adult-Adolescent Parenting Inventory (AAPI) and the Maternal Self-Report Inventory (MSRI).
- Analysis controlled for mother’s age, baby’s age, and demographics.
- Evaluation results: program participants, and those who attended more sessions, improved in their mothering role, perception of childbearing, developmental expectations of the child, and empathy for the baby, and they saw a reduced frequency of problems in child and family events.
Couldn’t comparable results have been attained without going to the trouble of experimental design?
Source: Woods et al. (2003). The parenting project for teen mothers: The impact of a nurturing curriculum … Ambul Pediatr, 3, 240-245.
Afterschool Science Program Causal Logic Model: Inputs,
Mediating and Moderating Factors, Outcomes, and Impacts
[Diagram: program logic model]
- Inputs: curriculum design; coaching & scientist visits; Science Camp hands-on program; tested program content; skilled program delivery; stimulating lab activities
- Mediators: best-practices-based curricular content both builds on and strengthens in-school science; increased student role-identification as a scientist and personal interest in learning science
- Moderators: poverty; family linguistic and education barriers; historic gender- and ethnicity-based constraints on educational and professional aspirations
- Short-term outcomes: increased student desire and capacity to engage in science; increased self-efficacy in science
- Medium-term outcomes: increased involvement in science; more opt for science courses and major in science; more consider a science career
- Long-term outcomes, or impacts: improved ability to succeed academically; greater school retention; more high school graduates going to college
- Process evaluation spans inputs and activities; outcome evaluation spans outcomes and impacts
Science Camp example
A randomized experimental design is unfeasible and undesirable. What would the comparison group be? It is not possible to identify close control groups; non-participants in the same middle schools are not really closely comparable (self-selection, demographics). Non-participants in other schools or in other local afterschool programs are not comparable either.
Instead, use other afterschool science programs for middle-school students nationally as the comparison group, especially those targeting or largely incorporating girls and students from historically underrepresented minorities. A targeted literature review with over 80 citations served as the basis of comparison. Most studies find negligible gains in science knowledge and academic performance, while a few find modest gains in interest in and self-efficacy in science.
Literature review as analytical synthesis
The extensive literature review developed for the 2010 evaluation set the backdrop for the outcome findings in the 2011 evaluation. The subject became the program itself, and its significant positive outcomes, against the baseline of limited-gain or ambiguous impact findings in dozens of other national and international evaluations. Findings for the 2010 and 2011 evaluations were considered together, showing that the Science Camp consistently produced major gains in knowledge, self-efficacy, and motivation toward as well as identification with science. This offered a more comprehensive standpoint than localized comparisons; the literature review itself became part of the evaluation methodology.
Science Camp Outcome Measures
The Science Camp evaluation found significant gains in science content knowledge, aspiration, and self-efficacy. Repeated-measures paired t-tests were used to gauge gains in knowledge for each subject-matter module. T-tests do not require (or allow for) randomization, but they do compare observed results against the results to be expected from chance variation alone.
The formula for the t-test is a ratio: the numerator is the difference between the two means or averages, and the denominator is a measure of the variability or dispersion of the scores.
A Science Attitude Survey developed as a synthesis of proven tests (in the 2011 Report) showed major motivation gains. Unpaired t-tests were used for this assessment.
Another Example: Strategic Prevention Framework State Incentive Grant (SPF SIG), New Mexico Community Causal Logic Model: Reducing alcohol-related youth traffic fatalities
[Diagram, reading from causal factors through substance use to consequences:]
Substance-Related Consequences:
- High rate of alcohol-related crash mortality among 15- to 24-year-olds
Substance Use:
- Underage DRINKING AND DRIVING; underage BINGE DRINKING
- Young adult DRINKING AND DRIVING; young adult BINGE DRINKING
Causal Factors:
- Low or discount PRICING of alcohol
- Easy RETAIL ACCESS to alcohol for youth
- Easy SOCIAL ACCESS to alcohol
- SOCIAL NORMS accepting and/or encouraging youth drinking
- PROMOTION of alcohol use (advertising, movies, music, etc.)
- Low ENFORCEMENT of alcohol laws
- Low PERCEIVED RISK of alcohol use
Strategies (examples):
- Media advocacy to increase community concern about underage drinking
- Restrictions on alcohol advertising in youth markets
- Social event monitoring and enforcement
- Bans on alcohol price promotions and happy hours
- Enforcement of underage retail sales laws
Chen: Program Implementation and Fidelity
Assessment of program fidelity is a part of impact evaluation. “Fidelity” means congruence between program outcomes and design:
- Consistency with goals articulated in funding proposals, position papers, or other reports and program sources
- Consistency with key stakeholder intent (e.g., the intent of a foundation, legislature, or other funding or authorizing source)
- Congruence in program design, implementation, and evaluation
Important dimensions of fidelity:
- Coverage of target populations as planned and promised
- Preservation of the causal mechanism underlying the program (e.g., childhood inoculations as a crucial initiative in improving children’s health outcomes)
- Preservation of the defining features of the program when scaling up in size and/or scope
The fidelity-adaptation relationship is important: maintaining fidelity requires creative adaptation to changing and unexpected circumstances, not rigid or formulaic conformance to the original plan.
Further definition of program fidelity from Chen
Fidelity means that the implemented model is substantially or essentially the same as the intended model. Fidelity means that normative theory (what should be accomplished), causative theory (anticipated causal processes), and implicit and explicit conceptions of these are mutually consistent:
- Normative theory (prescriptive model/theory): the “what” and “how” are and remain congruent; relationships among program activities, outputs, outcomes, and moderators remain relatively constant.
- Causative theory (causal theory, change model or theory): the “why” of the program does not essentially change; mediating factors or moderators, the factors making for conversion from action to outcome (from a systems perspective), remain reasonably constant.
Chen: articulating and testing program theory
Chen addresses the role of stakeholders in regard to program theory—recall Chen’s contrast between scientific validity and stakeholder validity. The evaluator can ascertain program theory by reviewing existing program documents and materials, interviewing stakeholders, and creating evaluation workgroups with them (a participatory and consultative mode of interaction). S/he may also facilitate discussions, on topics ranging from strategy to logic models to program theory.
Discussion of program theory entails forward reasoning and backward reasoning in some combination: either (1) projecting from program premises or (2) reasoning back from actual or desired program outcomes. The terms “feedback” and “feed-forward” are also used.
An action model may be articulated in draft form by the evaluator as a consequence of facilitated discussion, then distributed to stakeholders and program principals for further consideration and refinement. Evaluation design will involve incorporation of needs assessments and articulated program theory, with a plan to test single or multiple stages of the program. For instance, one might have yearly formative evaluations followed by a comprehensive, summative evaluation in the final program year.
Chen: Causal analysis and systems analysis
Inevitably, some evaluation must be carried out at the systems level. It is important to consider that systems dynamics are inherently complex:
- Governed by feedback; changeable, non-linear, and history-dependent
- Adaptive and evolving
- Characterized by trade-offs and shifting dynamics
- Characterized by complex causality—coordination complexity, sequencing complexity, causal complexity due to multiple actors and influences, and the like
Evaluation too often focuses on a single intervention as the unit of analysis; understanding connectivity between programs is important. Many complex interventions require programming (and therefore also evaluation) at multiple levels, e.g., at the community, neighborhood, school, and individual levels; multilevel alignment is required across interventions.
Relationship between a program’s strategic framework and evaluation indicators and measures
Every program should have a strategic framework comprised of a cascading series of goals, objectives, and activities. Every program evaluation should have a series of corresponding indicators and performance measures:
- Goals: impact indicators & measures
- Objectives: outcome indicators & measures
- Activities: process indicators & measures
Evaluating partnered, multi-causal programs
Program evaluation in collaborative network/partnership contexts:
- Does it matter to the functioning and success of a program that it involves different sectors, organizations, stakeholders, and standards?
- What level and breadth of consultation are needed to achieve program aims?
- How do we determine if partnerships have been strengthened or new linkages formed as a result of a particular program?
- How can we evaluate the development of partnered efforts and partnership capacity along with program outcomes and program capacity?
- To what extent have program managers and evaluators consulted with each other and with key constituencies in establishing goals and designing programs? In after-school programs, working partnerships between teachers and after-school personnel, and between these and parents, are essential.
Chen, pp. 240-241: Action Model for HIV/AIDS education
[Diagram: the action model, which along with the change model constitutes program theory, traces implementation from intervention through determinants to program outcomes.]
- Mediating variables (usually positive): e.g., help from supportive networks—support groups, family and friends, reinforcing messages, social and institutional cultural supports
- Moderating variables (usually less than positive): e.g., lack of partner support, and social and economic variables such as poverty, education, prejudice
- Impacts on individual subjects of the intervention, with “impacts” defined as the aggregate of comparative net outcomes
Implementation fidelity and change modeling
Models of systems change versus models of inducement of behavioral or social change; the stage-like nature of change management; the multi-level quality of directed change:
- Change can be conceptualized at the individual, group, programmatic, organizational, and social-system levels; these are interlocking levels of action.
- Change is not a discrete event but a continuum, a seamless process in which decisions and actions, and actions and their effects, affect one another continually and are difficult to separate while they are occurring.
- Change can be anticipated and managed on the basis of program design and the testing of implementation. Evaluation is in effect a test of change and action models.
Other Elements of Fidelity Assessment
The quality and efficacy of implementation is a critical element of program fidelity and fidelity evaluation. Fidelity-based evaluation is a form of merit evaluation. Context matters: does it make a difference that the program is being implemented in New Mexico or New York?
Considerations for conceptualizing fidelity:
- The multilevel nature of many interventions
- The level and intensity of measurement increase with the need for more probing evaluation
- What is the program’s capacity for monitoring fidelity? What is the burden of monitoring fidelity?
- Key elements of fidelity—e.g., alignment of program outcomes with desired outcomes—may focus or streamline fidelity-focused evaluation
Adaptive alignment with essential program goals (desired outcomes) is more important than slavish conformance to stated goals as such.