
Cara Cahalan-Laitusis

Operational Data or Experimental Design? A Variety of Approaches to Examining the Validity of Test Accommodations

• Review types of evidence

• Review current research designs

• Pros/Cons for each approach

Types of Validity Evidence

• Psychometric research

• Experimental research

• Survey research

• Argument-based approach

Psychometric Indicators (National Academy of Sciences, 1982)

• Reliability

• Factor Structure

• Item Functioning

• Predicted Performance

• Admission Decisions

Psychometric Evidence

• Is the test as reliable when taken with and without accommodations? (Reliability)

• Does the test (or test items) appear to measure the same construct for each group? (Validity)

• Are test items of relatively equal difficulty for students with and without a disability who are matched on total test score? (Fairness/Validity)

Psychometric Evidence

• Are completion rates relatively equal between students with and without a disability who are matched on total test score? (Fairness)

• Is equal access provided to testing accommodations across different disability, racial/ethnic, language, gender, and socio-economic groups? (Fairness)

• Do test scores under- or over-predict an alternate measure of performance (e.g., grades, teacher ratings, other test scores, postgraduate success) for students with disabilities? (Validity)
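The over/under-prediction question above can be sketched as follows. This is a minimal illustration, not the presenter's procedure: it assumes a simple one-predictor linear regression fitted on a reference group (students without disabilities) and then applied to a focal group. The function names `fit_line` and `mean_prediction_error` and all data are hypothetical.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares for one predictor; returns (slope, intercept)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def mean_prediction_error(ref_scores, ref_criterion, focal_scores, focal_criterion):
    """Fit criterion ~ test score on the reference group, then average the
    residuals (actual minus predicted) for the focal group.
    Positive: the test under-predicts the focal group's criterion performance.
    Negative: the test over-predicts it."""
    slope, intercept = fit_line(ref_scores, ref_criterion)
    residuals = [
        actual - (intercept + slope * score)
        for score, actual in zip(focal_scores, focal_criterion)
    ]
    return mean(residuals)

# Hypothetical data: test scores and a later criterion (e.g., course grades).
error = mean_prediction_error(
    ref_scores=[1, 2, 3, 4], ref_criterion=[2, 4, 6, 8],
    focal_scores=[1, 2, 3], focal_criterion=[3, 5, 7],
)
print(f"mean residual for focal group: {error:.2f}")
```

A mean residual near zero would suggest the test predicts the criterion about equally well for both groups; an operational study would also examine the spread of residuals and sample sizes.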

Advantages of Operational Data

• Cost effective

• Quick results

• Easy to replicate

• Provides evidence of validity

• Large sample size

• Motivated test takers

Limitations of Operational Data

• Disability and accommodation are confounded

• Order effects cannot be controlled for

• Sample size can be insufficient

• Difficult to show reasons why data are not comparable between subgroups

• Disability and accommodation codes are not always accurate
  – Approved accommodations may not be used
  – Disability category may be too broad

Types of Analyses

• Correlations

• Factor Analysis

• Differential Item Functioning

• Descriptive analyses

Relationship Among Content Areas

• Correlation between content areas (e.g., reading and writing) can also assess a test's reliability.
  – Compare correlations among content areas by population (e.g., LD with read aloud vs. LD without an accommodation)
  – Does the accommodation alter the construct being measured? (e.g., correlations between reading and writing may be lower if read aloud is used for writing but not reading)
  – Is the correlation significantly lower for one population? (difference of .10 or greater)
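One way to compare correlations across two populations is sketched below. It applies the slide's rule of thumb (a difference of .10 or greater) and, as an added assumption not stated in the slides, a Fisher z test for two independent correlations. The function name `compare_correlations` and the example values are hypothetical.

```python
import math

def compare_correlations(r1, n1, r2, n2, threshold=0.10):
    """Compare the correlation between two content areas in two independent
    groups (e.g., reading-writing correlation with and without read aloud)."""
    diff = r1 - r2
    # Fisher r-to-z transform; the standard error depends only on sample sizes.
    z1, z2 = math.atanh(r1), math.atanh(r2)
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return {
        "difference": diff,
        # Rounding guards against floating-point noise at the .10 boundary.
        "meets_threshold": round(abs(diff), 6) >= threshold,
        "z": z,
        "p_value": p_value,
    }

# Hypothetical values: r = .70 (n = 500) without accommodation,
# r = .55 (n = 300) with read aloud.
result = compare_correlations(0.70, 500, 0.55, 300)
print(result)
```

With large operational samples a small difference can be statistically significant, which is one reason to report the size of the difference alongside any test.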

Reliability

• Examine internal consistency measures
  – with and without specific accommodations
  – with and without a disability

• Examine test-retest reliability between different populations
  – with and without specific accommodations
  – with and without a disability
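As an illustration of an internal consistency comparison, the sketch below computes Cronbach's alpha separately for each group. The slides do not name a specific index, so alpha is an assumption here, and the function name `cronbach_alpha`, the group labels, and the item responses are hypothetical.

```python
from statistics import pvariance

def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of examinees, each a list of item scores.
    alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)."""
    k = len(item_scores[0])
    item_vars = [pvariance([row[j] for row in item_scores]) for j in range(k)]
    total_var = pvariance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical scored responses (1 = correct, 0 = incorrect) for two groups.
groups = {
    "standard administration": [[1, 1, 1, 0], [0, 0, 1, 0], [1, 1, 1, 1], [0, 0, 0, 0]],
    "read aloud": [[1, 0, 1, 1], [0, 0, 0, 1], [1, 1, 1, 1], [0, 1, 0, 0]],
}
for name, responses in groups.items():
    print(name, round(cronbach_alpha(responses), 3))
```

Real comparisons would use full operational samples, and alpha is undefined when total scores have zero variance.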

Factor Structure

• Types of questions
  – Is the number of factors invariant?
  – Are the factor loadings invariant for each of the groups?
  – Are the intercorrelations of the factors invariant for each of the groups?
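A rough descriptive check on whether loadings look similar across groups is Tucker's congruence coefficient, sketched below. This is an assumption about one possible index, not the method used in the presentation; formal invariance testing is usually done with multi-group confirmatory factor analysis, which is not shown. The function name and the loadings are hypothetical.

```python
import math

def congruence_coefficient(loadings_a, loadings_b):
    """Tucker's congruence coefficient between two vectors of factor loadings
    for the same items on the same factor, estimated in two groups.
    Values near 1 indicate a similar loading pattern."""
    numerator = sum(a * b for a, b in zip(loadings_a, loadings_b))
    denominator = math.sqrt(
        sum(a * a for a in loadings_a) * sum(b * b for b in loadings_b)
    )
    return numerator / denominator

# Hypothetical loadings of five items on one factor in two groups.
no_accommodation = [0.72, 0.65, 0.80, 0.58, 0.69]
read_aloud = [0.70, 0.60, 0.78, 0.40, 0.71]
print(round(congruence_coefficient(no_accommodation, read_aloud), 3))
```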

Differential Item Functioning

• DIF refers to a difference in item performance between two comparable groups of test takers

• DIF exists if test takers who have the same underlying ability level are not equally likely to get an item correct

• Some recent DIF studies on accommodations/disability
  – Bielinski, Thurlow, Ysseldyke, Freidebach & Friedebach, 2001
  – Bolt, 2004
  – Barton & Finch, 2004
  – Cahalan-Laitusis, Cook, & Aicher, 2004

Issues Related to the Use of DIF Procedures for Students with Disabilities

• Group characteristics
  – Definition of group membership
  – Differences between ability levels of reference and focal groups

• The characteristics of the criterion
  – Unidimensional
  – Reliable
  – Same meaning across groups

Procedures/Sample

• DIF Procedures (e.g., Mantel-Haenszel, logistic regression, DIF analysis paradigm, SIBTEST)

• Reference/focal groups
  – Minimum of 100 per group; ETS uses a minimum of 300 for most operational tests
  – Select groups that are specific (e.g., LD with read aloud) rather than broad (e.g., all students with IEP or 504)
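The Mantel-Haenszel procedure named above can be sketched for a single item as follows. Test takers are matched on total score, a 2x2 table (group by correct/incorrect) is formed at each score level, and a common odds ratio is pooled across levels. The conversion to the delta scale (MH D-DIF = -2.35 ln α) follows the usual ETS convention. The function name and the counts are hypothetical, and this sketch omits the significance test and the minimum group sizes noted above.

```python
import math

def mantel_haenszel_dif(strata):
    """strata: one tuple per matched total-score level, in the order
    (ref_correct, ref_incorrect, focal_correct, focal_incorrect).
    Returns (common odds ratio alpha, MH D-DIF on the delta scale).
    Raises ZeroDivisionError if no stratum has both reference-incorrect
    and focal-correct responses."""
    numerator = 0.0
    denominator = 0.0
    for ref_right, ref_wrong, focal_right, focal_wrong in strata:
        n = ref_right + ref_wrong + focal_right + focal_wrong
        if n == 0:
            continue  # skip empty score levels
        numerator += ref_right * focal_wrong / n
        denominator += ref_wrong * focal_right / n
    alpha = numerator / denominator
    # Negative MH D-DIF: the item is relatively harder for the focal group.
    return alpha, -2.35 * math.log(alpha)

# Hypothetical counts at three matched score levels for one item.
alpha, d_dif = mantel_haenszel_dif([
    (30, 20, 20, 30),
    (40, 10, 30, 20),
    (45, 5, 40, 10),
])
print(f"alpha = {alpha:.2f}, MH D-DIF = {d_dif:.2f}")
```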

DIF with hypotheses

• Generate hypotheses on why items may function differently

• Code items based on hypotheses

• Compare DIF results with item coding

• Examine DIF results to generate new hypotheses

Other Psychometric Research

• DIF to examine fatigue on extended time

• Item completion rates between groups matched on ability

• Loglinear analysis to examine whether specific demographic subgroups (SES, race/ethnicity, geographic region, gender) are using specific accommodations less than other groups
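For the simplest case of the loglinear analysis mentioned above (one demographic variable crossed with accommodation use), the independence model can be checked with the likelihood-ratio statistic G² = 2 Σ O ln(O/E). The sketch below assumes that two-way case only; a fuller loglinear analysis with several demographic variables would need a statistics package. The function name and the counts are hypothetical.

```python
import math

def g_squared(table):
    """Likelihood-ratio statistic for independence in a two-way table of
    counts (rows: demographic subgroups, columns: accommodation used or not).
    Returns (G2, degrees of freedom)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    g2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            if observed > 0:  # cells with zero count contribute nothing
                expected = row_totals[i] * col_totals[j] / n
                g2 += 2 * observed * math.log(observed / expected)
    df = (len(table) - 1) * (len(table[0]) - 1)
    return g2, df

# Hypothetical counts: [used extended time, did not use it] per subgroup.
usage = [
    [120, 880],  # subgroup 1
    [60, 940],   # subgroup 2
    [90, 910],   # subgroup 3
]
g2, df = g_squared(usage)
print(f"G2 = {g2:.2f} on {df} degrees of freedom")
```

G² is compared with a chi-square distribution on the returned degrees of freedom; a large value suggests accommodation use differs across subgroups.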

Other Research Studies

• Experimental Research
  – Differential Boost

• Survey/Field Test Research

• Argument-based Evidence

Advantages of Collecting Data

• Disability and accommodation can be examined separately

• Form and order effects can be controlled

• Sample can be specific (e.g., reading-based LD rather than all LD or LD with or without ADHD)

• Opportunity to collect additional information

• Reasons for differences can be tested

• Data can be reused for psychometric analyses

Disadvantages

• Cost of large data collection

• Test takers may not be as motivated

• More time-consuming than psychometric research

• Over-testing of students

Differential Boost (Fuchs & Fuchs, 1999)

• Would students without disabilities benefit as much from the accommodation as students with disabilities?
  – If yes, then the accommodation is not valid.
  – If no, then the accommodation may be valid.
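The differential boost question can be expressed as a difference of differences: each group's gain from the accommodation, and then the gap between those gains. The sketch below is descriptive only and uses hypothetical scores; the condition labels "standard" and "audio" and the function name are assumptions for illustration. A real study would test the group-by-condition interaction statistically (for example with a repeated-measures analysis), which is not shown here.

```python
from statistics import mean

def differential_boost(scores):
    """scores maps (group, condition) to a list of test scores, with groups
    'LD' and 'non-LD' and conditions 'standard' and 'audio'.
    Returns (boost per group, LD boost minus non-LD boost)."""
    boost = {
        group: mean(scores[(group, "audio")]) - mean(scores[(group, "standard")])
        for group in ("LD", "non-LD")
    }
    return boost, boost["LD"] - boost["non-LD"]

# Hypothetical scores for illustration only.
scores = {
    ("LD", "standard"): [10, 12, 14],
    ("LD", "audio"): [16, 18, 20],
    ("non-LD", "standard"): [20, 22, 24],
    ("non-LD", "audio"): [21, 23, 25],
}
boost, difference = differential_boost(scores)
print(boost, difference)
```

A clearly larger boost for students with disabilities is the pattern consistent with the "may be valid" branch above; similar boosts in both groups correspond to the "not valid" branch.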

Example of Differential Boost

[Figure: average test score (vertical axis) for LD and non-LD students (horizontal axis) under the Standard and Audio conditions]

Differential Boost Design

                Session 1                   Session 2
Group   N*      Booklet   Accommodation     Booklet   Accommodation
1       350     A         Standard          B         Audio
2       350     A         Audio             B         Standard
3       350     B         Audio             A         Standard
4       350     B         Standard          A         Audio
Total   1400
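The counterbalanced design in the table can be written down as data, with students randomly assigned to the four groups in equal numbers. The sketch below assumes simple random assignment with a fixed seed for reproducibility; the slides do not say how assignment was done or what the asterisk on N refers to. The names `DESIGN` and `assign_groups` are hypothetical.

```python
import random

# Each group's (booklet, accommodation) for Session 1 and Session 2,
# transcribed from the design table.
DESIGN = {
    1: [("A", "Standard"), ("B", "Audio")],
    2: [("A", "Audio"), ("B", "Standard")],
    3: [("B", "Audio"), ("A", "Standard")],
    4: [("B", "Standard"), ("A", "Audio")],
}

def assign_groups(student_ids, seed=0):
    """Shuffle the students, then deal them into groups 1-4 in rotation so
    group sizes differ by at most one."""
    rng = random.Random(seed)
    ids = list(student_ids)
    rng.shuffle(ids)
    return {student: (position % 4) + 1 for position, student in enumerate(ids)}

assignment = assign_groups(range(1400))
first_student = 0
group = assignment[first_student]
print(f"student {first_student} is in group {group}: {DESIGN[group]}")
```

Across the four groups, each booklet and each accommodation condition appears equally often in each session, which is what allows form and order effects to be separated from the accommodation effect.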

Ways to reduce cost:

• Decrease sample size

• Randomly assign students to one of two conditions

• Use operational test data for one of the two sessions

Additional data to collect:

• Alternate measure of performance on construct being assessed

• Teacher survey (ratings of student performance, history of accommodation use)

• Student survey

• Observational data (how student used accommodation)

• Timing data

Additional Analyses

• Differential Boost
  – by subgroups
  – controlling for ability level

• Psychometric properties (e.g., DIF)

• Predictive Validity (alternate performance measure required)

Field Testing Survey

• How well does item type measure intended construct (e.g., reading comprehension, problem solving)?

• Did you have enough time to complete this item type?

• How clear were the directions (for this type of test question)?

Field Testing Survey

• How would you improve this item type?
  – To make the directions clearer
  – To measure the intended construct

• What specific accommodations would improve this item type?

• Which presentation approach did the test takers prefer?

Additional Types of Surveys

• How accommodation decisions are made

• Expert opinion on how/if accommodation interferes with construct being measured

• Information on how test scores with and without accommodations are interpreted

• Correlation between use of accommodations in class and on standardized tests

Additional Research Designs

• Think Aloud Studies or Cognitive Labs

• Item Timing Studies

• Scaffolded Accommodations

Argument-Based Validity

• Clearly Define Construct Assessed
  – Evidence Centered Design

• Decision Tree