A Procedure for Assessing Fidelity of Implementation
in Experiments Testing Educational Interventions
Michael C. Nelson1, David S. Cordray1, Chris S. Hulleman2, Catherine L. Darrow1, & Evan C. Sommer1
1Vanderbilt University, 2James Madison University
Purposes of Paper:
• To argue for a model-based approach for assessing implementation fidelity
• To provide a template for assessing implementation fidelity that can be used by intervention developers, researchers, and implementers as a standard approach
Presentation Outline
I. What is implementation fidelity?
II. Why assess implementation fidelity?
III. A five-step process for assessing implementation fidelity
IV. Concluding points
A Note on Examples:
• Examples are drawn from our review of (mainly) elementary math intervention studies, which we are currently deepening and expanding to other subject areas
• Examples for many areas are imperfect or lacking
• As our argument depends on having good examples of the most complicated cases, we appreciate any examples you can refer us to ([email protected])
What Is Implementation Fidelity?
Implementation fidelity is the extent to which the intervention has been implemented as expected.
Assessing fidelity raises the question: Fidelity to what?
Our answer: Fidelity to the intervention model.
This approach has its background in "theory-based evaluations" (e.g., Chen, 1990; Donaldson & Lipsey, 2006).
Why Assess Implementation Fidelity?
Fidelity vs. the Black Box
The intent-to-treat (ITT) experiment identifies the effects of causes:

[Diagram] Assignment to Condition → Treatment "Black Box" (Intervention's Causal Processes → Outcomes) → Outcome Measures
[Diagram] Assignment to Condition → Control "Black Box" (Business-as-Usual Causal Processes → Outcomes) → Outcome Measures
Fidelity vs. the Black Box
…while fidelity assessment "opens up" the black box to explain the effects of causes:

[Diagram] Assignment to Condition → Intervention "Black Box": Intervention Component (Fidelity Measure 1) → Mediator (Fidelity Measure 2) → Outcome (Outcome Measure)
Fidelity assessment allows us to:
• Determine the extent of construct validity and external validity, contributing to the generalizability of results
• For significant results, describe what exactly did work (the actual difference between Tx and C)
• For non-significant results, potentially explain why, beyond simply "the intervention doesn't work"
• Potentially improve understanding of results and future implementation
Limitations of Fidelity Assessment:
• Not a causal analysis, but it does provide evidence for answering important questions
• Involves secondary questions
• The field is still developing and validating methods and tools for measurement and analysis
• Cannot be a specific, one-size-fits-all approach
A Five-Step Process for Assessing Fidelity of Implementation
1. Specify the intervention model
2. Identify fidelity indices
3. Determine index reliability and validity
4. Combine fidelity indices*
5. Link fidelity measures to outcomes*
*Not always possible or necessary
Step 1: Specify the Intervention Model
The Change Model
• A hypothetical set of constructs, and relationships among those constructs, representing the core components of the intervention and the causal processes that result in outcomes
• Should be based on theory, empirical findings, discussion with the developer, and actual implementation
• Start with the change model because it is sufficiently abstract to be generalizable, yet specifies the important components and processes, thus guiding operationalization, measurement, and analysis
Change Model: Generic Example
[Diagram] Teacher training in use of educational software (Intervention Component) → Teachers assist students in using educational software (Mediator) → Improved student learning (Outcome)
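Although the change model is presented here as a diagram, it can help to make it explicit in code so that constructs lacking fidelity indices are easy to flag. Below is a minimal Python sketch of that idea; the representation and all names are our own illustration, not part of the original procedure.

```python
# A minimal sketch (our illustration, not the authors' tooling): representing
# a change model explicitly so unmeasured constructs are easy to flag.

change_model = {
    "constructs": [
        "teacher training in educational software",   # intervention component
        "teachers assist students with software",     # mediator
        "improved student learning",                  # outcome
    ],
    "links": [
        ("teacher training in educational software",
         "teachers assist students with software"),
        ("teachers assist students with software",
         "improved student learning"),
    ],
    # Filled in during Steps 2-3: fidelity indices keyed by construct,
    # e.g. {"teacher training in educational software": ["attendance log"]}.
    "fidelity_indices": {},
}

# Every construct except the final outcome should eventually carry an index.
unmeasured = [c for c in change_model["constructs"][:-1]
              if c not in change_model["fidelity_indices"]]
print("Constructs still lacking a fidelity index:", unmeasured)
```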
Change Model: Project LINCS
Adapted from Swafford, Jones, and Thornton (1997)

[Diagram] Instruction in geometry → Increase in teacher knowledge of geometry → Improved teacher instructional practice
[Diagram] Instruction in student cognition of geometry → Increase in teacher knowledge of student cognition → Improved teacher instructional practice
The Logic Model
• The set of resources and activities that operationalize the change model for a particular implementation
• A roadmap for implementation
• Derived from the change model with input from the developer and other sources (literature, implementers, etc.)
Logic Model: Project LINCS
Adapted from Swafford, Jones, and Thornton (1997)

[Diagram] Geometry content course → Instruction in geometry → Increase in teacher knowledge of geometry
[Diagram] Research seminar on van Hiele model → Instruction in student cognition of geometry → Increase in teacher knowledge of student cognition
[Diagram] Both knowledge gains → Improved teacher instructional practice (what is taught, how it is taught, characteristics teachers display)
A Note on Models and Analysis:
Recall that one can specify models for both the treatment and control conditions.
The “true” cause is the difference between conditions, as reflected in the model for each.
Using the change model as a guide, one may design equivalent indices for each condition to determine the relative strength of the intervention (Achieved Relative Strength, ARS).
This approach will be discussed in the next presentation (Hulleman).
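To make the ARS idea concrete, here is a minimal sketch that treats ARS as a standardized mean difference in fidelity between conditions, in the spirit of Hulleman and Cordray (2009); the function name and the data are illustrative assumptions, not the authors' code.

```python
import numpy as np

def achieved_relative_strength(tx_fidelity, c_fidelity):
    """ARS as a standardized mean difference:
    (mean Tx fidelity - mean control fidelity) / pooled SD."""
    tx = np.asarray(tx_fidelity, dtype=float)
    c = np.asarray(c_fidelity, dtype=float)
    n_tx, n_c = len(tx), len(c)
    pooled_sd = np.sqrt(((n_tx - 1) * tx.var(ddof=1) +
                         (n_c - 1) * c.var(ddof=1)) / (n_tx + n_c - 2))
    return (tx.mean() - c.mean()) / pooled_sd

# Invented data: proportion of core components observed per classroom.
tx_scores = [0.90, 0.80, 0.85, 0.70, 0.95]
c_scores = [0.20, 0.35, 0.25, 0.30, 0.15]
print(f"ARS = {achieved_relative_strength(tx_scores, c_scores):.2f}")
```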
Steps 2 and 3: Develop Reliable and Valid Fidelity Indices and Apply to the Model
Examples of Fidelity Indices
• Self-report surveys
• Interviews
• Participant logs
• Observations
• Examination of permanent products created during the implementation process
Index Reliability and Validity
• Both are reported inconsistently
• Report reliability at a minimum, because unreliable indices cannot be valid
• Validity is probably best established from pre-existing information or side studies
• We should be as careful in measuring the cause as we are in measuring its effects!
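For a multi-item fidelity index, one common reliability statistic is Cronbach's alpha. The sketch below is our illustration (the ratings matrix is invented), not a tool from the presentation.

```python
import numpy as np

def cronbach_alpha(score_matrix):
    """Cronbach's alpha for an (observations x items) matrix of scores."""
    x = np.asarray(score_matrix, dtype=float)
    k = x.shape[1]
    item_variances = x.var(axis=0, ddof=1).sum()
    total_variance = x.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances / total_variance)

# Invented data: five classrooms rated on a four-item observation index.
ratings = [[3, 4, 3, 4],
           [2, 2, 3, 2],
           [4, 4, 4, 5],
           [1, 2, 1, 2],
           [3, 3, 4, 3]]
print(f"alpha = {cronbach_alpha(ratings):.2f}")
```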
Selecting Indices
• Guided foremost by the change model: identify core components as those that differ significantly between conditions and upon which the causal processes are thought to depend
• Use the logic model to determine fidelity indicator(s) for each change component
• Base the number and type of indices on the nature and importance of each component
Selecting Indices: Project LINCS
Adapted from Swafford, Jones, and Thornton (1997)

Change Model Construct | Logic Model Components | Indicators | Indices
Instruction in geometry | Geometry content course | None; proposed: teacher attendance, content delivery | None; proposed: head count, observation
Instruction in student cognition of geometry | Research seminar on van Hiele model | None; proposed: teacher attendance, content delivery | None; proposed: head count, observation
Increase in teacher knowledge of geometry | None | Ability to apply geometry knowledge | Pre/post test of geometry knowledge
Increase in teacher knowledge of student cognition | None | Ability to describe student cognition | Pre/post test of van Hiele levels
Improved teacher instructional practice | What is taught | Alignment of lesson content with van Hiele levels | Observations
Improved teacher instructional practice | How it is taught | Particular instructional behaviors of teachers | Observations
Improved teacher instructional practice | Characteristics teachers display | Reflecting knowledge of student cognition in planning | Lesson plan task
Step 4: Combining Fidelity Indices*
Why Combine Indices?
*May not be possible for the simplest models
*Depends on particular questions
• Combine within a component to assess fidelity to a construct
• Combine across components to assess a phase of implementation
• Combine across the model to characterize overall fidelity and facilitate comparison of studies
Some Approaches to Combining Indices:
• Total percentage of steps implemented
• Average number of steps implemented
HOWEVER: These approaches may underestimate or overestimate the importance of some components!
• Weighting components based on the intervention model
• Sensitivity analysis
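As a concrete illustration of these options, here is a minimal sketch, with invented component names, scores, and weights, contrasting an equal-weight aggregate with a model-based weighted one and a crude sensitivity check.

```python
import numpy as np

# Invented fidelity scores for one classroom: proportion of each core
# component implemented. Component names and weights are hypothetical.
scores = {"training": 0.9, "data_use": 0.5, "instruction": 0.7}

# Equal-importance aggregate.
print(f"Percent of steps implemented: {100 * np.mean(list(scores.values())):.0f}%")

# Model-based weighting: components the change model treats as most
# critical get larger weights (weights sum to 1).
weights = {"training": 0.2, "data_use": 0.3, "instruction": 0.5}
weighted = sum(weights[c] * scores[c] for c in scores)
print(f"Weighted fidelity: {weighted:.2f}")

# Crude sensitivity analysis: shift weight between components and see how
# much the combined index moves.
for bump in (-0.1, 0.1):
    w = dict(weights)
    w["instruction"] += bump
    w["data_use"] -= bump
    alt = sum(w[c] * scores[c] for c in scores)
    print(f"instruction weight {w['instruction']:.1f}: fidelity {alt:.2f}")
```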
MAP Example
Weighting of training sessions for the MAP intervention (Cordray et al., unpublished)

Training Session | Month | Content | Initial Weight | Adjusted Weight
Session 1 | September | Administration | .25 | .10
Session 2 | October | Data use | .25 | .30
Session 3 | November | Differentiated instruction | .25 | .50
Session 4 | May | Growth and planning | .25 | .10
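Applying the adjusted weights above shows how weighting changes the combined index; the session fidelity scores below are invented for illustration.

```python
# Adjusted weights for MAP sessions 1-4 from the table above; the session
# fidelity scores (fraction of each session's content delivered) are invented.
weights = [0.10, 0.30, 0.50, 0.10]
fidelity = [1.00, 0.80, 0.60, 1.00]

equal = sum(fidelity) / len(fidelity)                     # -> 0.85
weighted = sum(w * f for w, f in zip(weights, fidelity))  # -> 0.74
print(f"Equal-weight fidelity:   {equal:.2f}")
print(f"Model-weighted fidelity: {weighted:.2f}")
```

In this invented case the weighted index is lower because the most heavily weighted session (differentiated instruction) was the least fully implemented.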
Step 5: Linking Fidelity Measures to Outcomes*
Linking Fidelity and Outcomes
• *Not possible in (rare) cases of perfect fidelity (no covariation without variation)
• *Depends on particular questions
• Provide evidence supporting the model (or not)
• Identify "weak links" in implementation
• Point to opportunities for "boosting" strength
• Identify incorrectly specified components of the model
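One simple way to look for "weak links" is to correlate each component's fidelity with the outcome; the sketch below uses invented classroom-level numbers.

```python
import numpy as np

# Invented classroom-level data: fidelity on two components and a
# class-mean outcome.
training_fid = np.array([0.9, 0.7, 0.8, 0.4, 0.6, 1.0])
data_use_fid = np.array([0.2, 0.9, 0.5, 0.3, 0.8, 0.6])
outcome = np.array([0.55, 0.80, 0.62, 0.41, 0.75, 0.70])

# A component whose fidelity barely covaries with the outcome is a
# candidate "weak link" or a mis-specified part of the model.
for name, fid in [("training", training_fid), ("data use", data_use_fid)]:
    r = np.corrcoef(fid, outcome)[0, 1]
    print(f"fidelity-outcome correlation for {name}: r = {r:.2f}")
```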
Assessment to Instruction (A2i)
Teacher use of web-based software for differentiation of reading instruction

[Diagram] Professional development → Students use A2i → Teachers use A2i recommendations for grouping and lesson planning → Students improve learning

Measures: time teachers logged in, observation of instruction, pre/post reading assessment

(Connor, Morrison, Fishman, Schatschneider, & Underwood, 2007)
Assessment to Instruction (A2i)
• Used hierarchical linear modeling (HLM) for the analysis
• Overall effect size of .25 for Tx vs. C
• Pooling Tx + C, teacher time using A2i accounted for 15% of student performance
• Since gains were greatest among teachers who both attended PD and were logged in more, the authors concluded that both components were necessary for the outcome

(Connor, Morrison, Fishman, Schatschneider, & Underwood, 2007)
Some Other Approaches to Linking from the Literature
• Compare results of hypothesis testing (e.g., ANOVA) when “low fidelity” classrooms are included or excluded
• Correlate overall fidelity index with each student outcome
• Correlate each fidelity indicator with the single outcome
• Calculate Achieved Relative Strength (ARS) and use HLM to link to outcomes
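For the last approach, a two-level model with students nested in classrooms is typical. Below is a minimal sketch using statsmodels' MixedLM; the data, variable names, and effect sizes are invented, and an ARS score would simply replace the fidelity column.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

# Invented data: 20 classrooms x 15 students, one classroom-level fidelity
# (or ARS) score, and a student-level outcome.
n_class, n_stud = 20, 15
class_fidelity = rng.uniform(0.3, 1.0, n_class)
df = pd.DataFrame({
    "classroom": np.repeat(np.arange(n_class), n_stud),
    "fidelity": np.repeat(class_fidelity, n_stud),
})
df["outcome"] = 0.5 * df["fidelity"] + rng.normal(0, 0.3, len(df))

# Two-level model: random intercept per classroom, fidelity as a
# classroom-level predictor of the student outcome.
model = smf.mixedlm("outcome ~ fidelity", df, groups=df["classroom"])
print(model.fit().summary())
```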
Concluding Points
In Summary:
• If we do not know what we are testing, we cannot know what the results of our tests mean.
• Model-based (change and logic) assessment answers the question "Fidelity to what?"
• There is a need for a systematic approach to fidelity assessment, which we describe
• Most useful when research designs are able to incorporate this process from early stages
• Additional examples and refinement of measurement and analytical tools are needed
References
Chen, H.T. (1990). Theory-driven evaluation. Thousand Oaks, CA: Sage Publications.
Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences, 2nd ed. Hillsdale, NJ: Erlbaum.
Connor, C. M., Morrison, F. J., Fishman, B. J., Schatschneider, C., & Underwood, P. (2007). Algorithm-guided individualized reading instruction. Science, 315, 464-465.
Cook, T. (1985). Postpositivist critical multiplism. In R. L. Shotland & M. M. Marks (Eds.), Social science and social policy (pp. 21-62). Beverly Hills, CA: Sage.
Cordray, D.S. (2007). Assessing Intervention Fidelity in Randomized Field Experiments. Funded Goal 5 proposal to the Institute of Education Sciences.
Cordray, D.S., Pion, G.M., Dawson, M., and Brandt, C. (2008). The Efficacy of NWEA’s MAP Program. Institute of Education Sciences funded proposal.
Donaldson, S.I., & Lipsey, M.W. (2006). Roles for theory in contemporary evaluation practice: Developing practical knowledge. In I. Shaw, J.C. Greene, & M.M. Mark (Eds.), The Handbook of Evaluation: Policies, Programs, and Practices (pp. 56-75). London: Sage.
Fuchs, L.S., Fuchs, D., and Karns, K. (2001). Enhancing kindergarteners’ mathematical development: Effects of peer-assisted learning strategies. Elementary School Journal, 101, 495-510.
Fuchs, L.S., Fuchs, D., Yazdian, L., & Powell, S.R. (2002). Enhancing first-grade children's mathematical development with peer-assisted learning strategies. School Psychology Review, 31, 569-583.
Gamse, B.C., Jacob, R.T., Horst, M., Boulay, B., Unlu, F., Bozzi, L., Caswell, L., Rodger, C., Smith, W.C., Brigham, N., & Rosenblum, S. (2008). Reading First Impact Study Final Report. Washington, D.C.: Institute of Education Sciences.
Ginsburg-Block, M. & Fantuzzo, J. (1997). Reciprocal peer tutoring: An analysis of teacher and student interactions as a function of training and experience. School Psychology Quarterly, 12, 1-16.
Holland, P.W. (1986). Statistics and causal inference. Journal of the American Statistical Association, 81(396), 945-960.
Hulleman, C.S., & Cordray, D.S. (2009). Moving from the lab to the field: The role of fidelity and achieved relative intervention strength. Journal of Research on Educational Effectiveness, 2(1), 88-110.
Hulleman, C.S., Cordray, D.S., Nelson, M.C., Darrow, C.L., & Sommer, E.C. (2009, June). The State of Treatment Fidelity Assessment in Elementary Mathematics Interventions. Poster presented at the annual Institute of Education Sciences Conference, Washington, D.C.
Institute of Education Sciences (2004). Pre-doctoral training grant announcement. Washington, DC: US Department of Education.
Knowlton, L.W. and Phillips, C.C. (2009). The Logic Model Guidebook: Better Strategies for Great Results. Washington, D.C.: Sage.
Kutash, K., Duchnowski, A. J., Sumi, W. C., Rudo, Z. & Harris, K. M. (2002). A school, family, and community collaborative program for children who have emotional disturbances. Journal of Emotional and Behavioral Disorders, 10(2), 99-107.
McIntyre, L.L., Gresham, F.M., DiGennaro, F.D., and Reed, D.D. (2007). Treatment integrity of school-based interventions with children in the Journal of Applied Behavior Analysis 1991-2005. Journal of Applied Behavior Analysis. 40, 659-672.
Michalopoulos, C. (2005). Precedents and Prospects for Randomized Experiments. In H.S. Bloom (Ed.) Learning More from Social Experiments, (pp. 1-36). New York, NY: Russell Sage Foundation.
Noell, G.H., Witt, J.C., Slider, N.J., Connell, J.E., Gatti, S.L., Williams, K.L., Koenig, J.L. & Resetar, J.L. (2005). Treatment Implementation Following Behavioral Consultation in Schools: A Comparison of Three Follow-up Strategies. School Psychology Review, 34(1), 87-106.
O'Donnell, C. L. (2008). Defining, Conceptualizing, and Measuring Fidelity of Implementation and Its Relationship to Outcomes in K-12 Curriculum Intervention Research. Review of Educational Research, 78(1), 33-84.
Shadish, W.R., Cook, T.D., & Campbell, D.T. (2002). Experimental and Quasi-Experimental Designs for Generalized Causal Inference. Boston, MA: Houghton Mifflin.
Swafford, J.O., Jones, G.A., and Thornton, C.A. (1997). Increased Knowledge in Geometry and Instructional Practice. Journal for Research in Mathematics Education, 28(4), 467- 483.
Trochim, W., & Cook, J. (1992). Pattern matching in theory-driven evaluation: A field example from psychiatric rehabilitation. In H. Chen & P.H. Rossi (Eds.), Using Theory to Improve Program and Policy Evaluations (pp. 49-69). New York: Greenwood Press.