Accommodations Research: Reconsidering the Test Accommodation Validation Process: A Paradigm for Research Design With Initial Outcomes Gerald Tindal University of Oregon

Post on 26-Dec-2015

Accommodations Research: Reconsidering the Test Accommodation Validation Process:

A Paradigm for Research Design With Initial Outcomes

Gerald Tindal

University of Oregon

Accommodations: Issues and Options

A Bi-polar Condition

OR

Universal Design of Something

Marshall McLuhanisms

• Why is it so easy to acquire the solutions of past problems and so difficult to solve current ones?

• Mud sometimes gives the illusion of depth.

• The answers are always inside the problem, not outside.

• “I may be wrong, but I’m never in doubt.”

• You mean my whole fallacy’s wrong?

The Measurement Conundrum

• “Fixing a condition of measurement reduces error and increases the precision of measurements, but it does so at the expense of narrowing interpretations of measurements” (Brennan, 2001, p. 2).

• The reliability-validity paradox: Attempts to increase reliability through standardization can actually lead to a decrease in the validity of interpretations.

Distinguishing between Nouns and Verbs

• Constructs
– Meaning and Interpretation
– Construct Irrelevant Variance
– Construct Under or Misrepresentation

• To Construct: The Test Environment
– Contexts and Settings
– Expected Routines
– Enacted Behaviors

Research Basis

• Naturalistic Evaluations
– That it works

• Quasi-experimental Studies
– Sometimes it works

• Experimental Studies
– How it works

The Unit of Analysis

• Test Level
– Bundled Items
– Variation in Skills
– Reporting Categories

• Item Level
– Specific Skills
– Difficulty and Discrimination
– Differential Item Functioning
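The item-level statistics named above can be illustrated with a minimal sketch. It assumes classical test theory definitions: difficulty as the proportion of correct responses and discrimination as the point-biserial correlation between the item score and the rest-of-test score. The function names and the data are hypothetical and are not taken from the presentation.

```python
from statistics import mean, pstdev


def item_difficulty(item_scores):
    """Proportion of students who answered the item correctly (0/1 scores)."""
    return mean(item_scores)


def item_discrimination(item_scores, rest_scores):
    """Point-biserial correlation between a 0/1 item score and the
    score on the rest of the test (item excluded)."""
    sd_item = pstdev(item_scores)
    sd_rest = pstdev(rest_scores)
    if sd_item == 0 or sd_rest == 0:
        return 0.0  # correlation is undefined without variation; 0.0 is a placeholder
    mean_item = mean(item_scores)
    mean_rest = mean(rest_scores)
    covariance = mean(
        (i - mean_item) * (r - mean_rest)
        for i, r in zip(item_scores, rest_scores)
    )
    return covariance / (sd_item * sd_rest)


# Hypothetical responses from six students (illustration only).
item = [1, 1, 0, 1, 0, 0]   # 1 = correct, 0 = incorrect on one item
rest = [9, 8, 4, 7, 5, 3]   # score on the remaining items

difficulty = item_difficulty(item)
discrimination = item_discrimination(item, rest)
```

Computing these values separately for accommodated and standard administrations of the same item is one way to look for item-level differences that test-level totals can hide.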

Keeping Score For All

• The effects of inclusion and accommodation policies on large-scale educational assessment

• National Research Council, 2004

Chapter 1. Introduction: Two Questions

• “Do commonly used accommodations yield scores that are comparable to those obtained when accommodations are not used? Do they over- or under-correct for the impediment for which they are designed to compensate” (p. 13)?

• “Do commonly used accommodations alter the construct being tested? What methods should be used for evaluating the effects of a particular accommodation on the validity of test results” (p. 13)?

Criticisms of Previous Research

• Score Gains

• Differential score gain vs. overall score gain

• Quasi-experimental studies

• Descriptive research

• Population comparisons

Chapter 2. Characteristics of SWD and Assessments and Accommodations

• Identification rates

• Legal mandates

• State testing programs

• Allowable accommodations
– No mention of skill levels
– No mention of disaggregated performances
– No mention of rationale for accommodations given type of test

Chapter 3. Participation in NAEP

• Considerable increase in participation of SWD and ELL from 1992 to 1998

• Use of 3 cohorts to study the effects of accommodations (and maintain integrity to past data): (a) SWD excluded, (b) no accommodations, (c) accommodations permitted.

Conclusion on Participation

• “The increased use of accommodations with NAEP assessments has corresponded to increased participation rates for students with disabilities and English Language Learners.”

• So what?

What We Don’t Know

• What accommodations have been used in NAEP (singly and bundled).

• The SKILL of the student (versus the disability).

• Use of the accommodation in instruction.

• Teacher recommendation for accommodation (on NAEP or state test).

• Performance levels on accommodated versus non-accommodated items.

Chapter 4. Factors that Affect Accuracy of NAEP’s Estimates of Achievement

• Comparison of states that allow an accommodation and NAEP allowance of the accommodation.

• Representativeness of NAEP samples and comparison of NAEP with national samples.

• “Decision making regarding the inclusion or exclusion of students and the use of accommodations for NAEP is controlled at the school level. There is variability in the way these decisions are made, both across schools within a state and across states” (p. 83).

Recommendations

• Review criteria for inclusion and accommodation of SWD and ELL.

• Clarify, elaborate, and revise the criteria

• Standardize implementation of criteria at the school level.

• Make policies more consistent between state and NAEP.

• “More clearly define the characteristics of the population of students to whom the results are intended to generalize. This definition should serve as a guide for decision-making and the formulation of regulations regarding inclusion, exclusion, and reporting” (p. 84).

• Confirm the inclusion rates with state data.

Chapter 5. Available Research on Effects of Accommodations on Validity

• “The vast majority of studies pertaining to the interaction hypothesis showed that all student groups (SWD, ELL, and their general education peers) had score gains under accommodation conditions. Moreover, in general, the gains for SWD and ELL were greater than their general education peers under accommodation conditions. These conclusions varied somewhat across student groups and accommodation conditions, as we discuss below” …(p. 60)

Chapter 5. Available Research on Effects of Accommodations on Validity

• However, it appears that the interaction hypothesis needs qualification. When SWD or ELL students exhibit greater gains with accommodations than their general education peers, an interaction is present. When the gains experienced by SWD or ELL are significantly greater than the gains experienced by their general education peers, the fact that the general education students achieved higher scores with an accommodation condition does not imply that the accommodation is unfair. It could imply that the standardized test conditions are too stringent for all students (p. 60).
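The interaction described above can be made concrete with a small worked example. This is a minimal sketch, assuming the interaction is summarized as the target group's mean gain under accommodation minus the peer group's mean gain. The function names and all scores are hypothetical, and the sketch does not test statistical significance.

```python
from statistics import mean


def gain(standard_scores, accommodated_scores):
    """Mean score under accommodation minus mean score under standard conditions."""
    return mean(accommodated_scores) - mean(standard_scores)


def differential_boost(target_std, target_acc, peers_std, peers_acc):
    """Gain for the target group (for example SWD or ELL) minus the gain
    for general education peers. A positive value means the target group
    gained more from the accommodation."""
    return gain(target_std, target_acc) - gain(peers_std, peers_acc)


# Hypothetical scores (illustration only, not data from any study).
swd_standard = [10, 12, 11]
swd_accommodated = [15, 16, 17]
peers_standard = [20, 22, 21]
peers_accommodated = [22, 23, 24]

swd_gain = gain(swd_standard, swd_accommodated)        # 16 - 11 = 5
peer_gain = gain(peers_standard, peers_accommodated)   # 23 - 21 = 2
boost = differential_boost(
    swd_standard, swd_accommodated, peers_standard, peers_accommodated
)                                                       # 5 - 2 = 3
```

In this made-up case both groups gain, but the target group gains more, which is the pattern the qualified interaction hypothesis treats as compatible with a fair accommodation.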

Chapter 6. Articulating Validation Arguments

• Target and ancillary skills required by NAEP reading and math items

• Use of claims, data, and warrants

• Disconnect with previous literature

• No anchor to state and NAEP relationships

• No focus on item level

No Right Way to Do a Wrong Thing

• The NAEP database as it is structured can never address the question of accommodations.

• Research designs are lacking.

• Data are too global to answer any serious question.

• Construct validity at item level is lacking.

What We Know and Don’t Know

• Need to consider accommodations as complex packages

• Need different research designs than randomized experiments (because of low sample size and inappropriate use of group statistics)

• We need to study populations and items more carefully
– Smart about items
– Smart about people

Two Examples

• Kansas Computer-Based Testing
– Guidelines, highlight and erase, presentation of passage and item format, mark text with icons, cross-out, synthesized read aloud (in math), mark for review.

• Oregon Accommodation Station
– MATH: Reading skills analysis, comprehension, computation skills, trial changes: Read aloud in math, simplified, Spanish, and perception survey
– READING: word search, split screen, drag and drop, highlight

Design 1 of Research: Smart about Items

• Student is presented a standard item

• Can I solve the problem as presented?

[Flowchart: Yes and No branches from the question above; node labels are Standard, Accommodated, Correct, and Incorrect.]

Design 2 of Research: Smart about People

• Pre Measure Student Reading Fluency

• Pre Measure Student Basic Math Skill

[Flowchart: four student groups (Low Fluency, Intact Math Skill; Intact Fluency, Low Math Skill; Intact Fluency, Intact Math Skill; Low Fluency, Low Math Skill); item versions labeled Standard, Simplified, and Read Aloud; outcomes labeled Correct and Incorrect.]
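The grouping step in Design 2 can be sketched as follows. The two pre-measures and the four group labels come from the slide; the cut scores, the score scales, and the function name are hypothetical placeholders. The sketch does not encode which item version each group receives.

```python
def classify_student(fluency_score, math_score, fluency_cut=100, math_cut=20):
    """Place a student in one of the four Design 2 groups.

    fluency_cut and math_cut are hypothetical cut scores; a real study
    would set them from its own pre-measures.
    """
    fluency = "Intact Fluency" if fluency_score >= fluency_cut else "Low Fluency"
    math = "Intact Math Skill" if math_score >= math_cut else "Low Math Skill"
    return f"{fluency}, {math}"


# Hypothetical students as (reading fluency score, basic math skill score).
students = {
    "A": (80, 25),
    "B": (120, 10),
    "C": (130, 30),
    "D": (60, 5),
}

groups = {name: classify_student(f, m) for name, (f, m) in students.items()}
```

Grouping students by skill rather than by disability label matches the earlier point that the skill of the student is what the available accommodation data leave out.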

Assessment Adaptations beyond Research Findings

• The ASK Settlement in Oregon

• When the Sidewalk Ends: Practice in the Absence of Research
– Purpose: What is the construct irrelevant variance?
– Function: How does it work?
– Error: What are the false positives and false negatives?
– Systems: What are the implications for the whole?

• Modifications

Accommodations to Modified Achievement Standards

• System Level Uniformity

• Manipulation of Breadth and Depth
– Modified Achievement Standards (2%)
– Alternate Achievement Standards (1%)

• Meaning of Score Reporting Categories
– Exceeds, Meets, Does Not Meet

• Consequences as Validation Process
– Social policy versus construct validity

An Example in Grades 3-5

Know and explain common antonyms and synonyms.

Determine the meanings of words using knowledge of antonyms, synonyms, homophones, and homographs.

Apply knowledge of synonyms, antonyms, homographs, and idioms to determine the meaning of words and phrases.

An Example in Grades 6-8

Identify and/or summarize sequence of events, main ideas and supporting details in literary selections.

Identify and/or summarize sequence of events, main ideas, and supporting details in literary selections.

Identify and/or summarize sequence of events, main ideas, and supporting details in literary selections.

Manipulating Breadth and Depth

• Content
– Grade Level

• Context
– Applications and experience

• Concepts
– Attributes and Examples