1 Causal Rasch Models IOMW April 11-12, 2012 Vancouver, Canada A. Jackson Stenner Donald S. Burdick Mark H. Stone


1

Causal Rasch Models

IOMW April 11-12, 2012
Vancouver, Canada

A. Jackson Stenner
Donald S. Burdick
Mark H. Stone

2

Causal Rasch Models - Abstract

Rasch’s unidimensional models for measurement tell us how to connect object measures (e.g., reader abilities), measurement mechanisms (e.g., machine-generated cloze reading items) and measurement outcomes (counts correct on reading instruments). Substantive theory tells us which interventions on, or manipulations of, the measurement mechanism can be traded off against a change to the measure for an object of measurement so as to hold the measurement outcome constant. Integrating a Rasch model with a substantive theory dictates the form and substance of permissible interventions. Rasch analysis absent construct theory and an associated specification equation is a black box in which understanding may be more illusory than not.

Finally, the “quantitivity hypothesis” (Michel, 2004) can be tested by comparing theory-based trade-off relations with observed trade-off relations. It is asserted that only quantitative variables (as measured) support such trade-offs. Note that testing the quantitivity hypothesis requires more than manipulating the algebraic equivalencies in the Rasch model or descriptively fitting data to the model. What is required is an experimental intervention/manipulation on either reader ability or text complexity, or a conjoint intervention on both simultaneously, that yields a successful prediction of the resultant measurement outcome (count correct). When manipulations of this sort are introduced for individual reader-text encounters and model predictions are consistent with what is observed, the quantitivity hypothesis is sustained.

 

3

[Figure: the four attributes examined and their instruments. Reader Ability; Temperature; Short Term Memory (Knox Cube Test, scale Low to Hi); Vocabulary Knowledge (Peabody Picture Vocabulary Test, scale Low to Hi).]

4

Each of the instruments we examine in this presentation has been shown to detect within-person variation in its target attribute consistently across a wide range of person characteristics (e.g., age) and measurement mechanisms.

5

The central validity issue is how an instrument works, and this has nothing whatsoever to do with how useful the measures it produces are for commerce, research, or human well-being.

6

A measurement instrument consists in part of a mechanism that is sensitive to variation of a particular kind. Instrument validation is about specifying what this mechanism is and how it works.

7

An acid test for the existence of an attribute is the identification of multiple mechanisms for measuring that attribute.

8

In each case a specification equation “specifies” key features of the measurement mechanism and how these features act together to cause changes in the measurement outcome.
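One hedged way to read "specification equation" operationally is as a regression of observed item calibrations on theory-specified features of the mechanism. The sketch below assumes a single feature and ordinary least squares. The function name `fit_specification_equation` and the toy numbers are illustrative assumptions, not values from the presentation.

```python
def fit_specification_equation(features, calibrations):
    """Fit calibrations ~ intercept + slope * feature by ordinary least squares.

    Returns (intercept, slope, r_squared).
    """
    n = len(features)
    mean_x = sum(features) / n
    mean_y = sum(calibrations) / n
    sxx = sum((x - mean_x) ** 2 for x in features)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(features, calibrations))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - residual sum of squares / total sum of squares
    ss_res = sum((y - (intercept + slope * x)) ** 2
                 for x, y in zip(features, calibrations))
    ss_tot = sum((y - mean_y) ** 2 for y in calibrations)
    return intercept, slope, 1.0 - ss_res / ss_tot


# Toy data (hypothetical): one mechanism feature per item and its
# empirically observed calibration.
toy_features = [1.0, 2.0, 3.0, 4.0, 5.0]
toy_calibrations = [-2.1, -0.9, 0.1, 0.9, 2.0]
intercept, slope, r_squared = fit_specification_equation(
    toy_features, toy_calibrations)
print(f"intercept={intercept:.2f}, slope={slope:.2f}, R^2={r_squared:.3f}")
```

Once fitted, the equation can assign a theoretical calibration to a new item from its features alone, which is what allows instruments to be calibrated without first collecting response data.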

9

If interventions/manipulations on the measurement mechanism can be traded off against interventions/manipulations on the attribute to produce successful predictions of the measurement outcome (up and down the scale), then the quantitivity hypothesis for the attribute is sustained.

10

The model that links the difference between measurement mechanism and attribute measure to the measurement outcome differs by attribute: temperature/Guttman; short term memory/dichotomous Rasch; receptive vocabulary/dichotomous Rasch; reading ability/ensemble Rasch.
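The contrast between a Guttman model and a dichotomous Rasch model can be sketched as two response functions of the same difference between attribute measure and mechanism calibration. This is a minimal illustration under the assumption that both quantities sit on a common scale (logits for the Rasch case). The function names are hypothetical.

```python
import math


def guttman_outcome(measure, threshold):
    """Deterministic (Guttman) response: 1 if the measure exceeds the
    threshold, otherwise 0."""
    return 1 if measure > threshold else 0


def rasch_probability(measure, difficulty):
    """Dichotomous Rasch model: probability of a correct response,
    with measure and difficulty expressed in logits."""
    return 1.0 / (1.0 + math.exp(-(measure - difficulty)))


# Same difference (measure minus calibration), two kinds of outcome:
for diff in (-1.0, 0.5, 2.0):
    print(diff, guttman_outcome(diff, 0.0), round(rasch_probability(diff, 0.0), 3))
```

In the Guttman case the outcome is a step function of the difference. In the Rasch case the same difference sets a probability.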

11

Causal Rasch models are individually centered. In all four cases, the attribute detected within a person over time is the same attribute detected between persons at one point in time.

12

Temperature

Specification Equation = Fixed 3-component compound with a variable amount of an additive in each cavity

R² = .9999 (predicting the temperature at which the optical properties of each cavity change state from the amount of additive)

Strictly parallel forms: millions

Alternative mechanism: expansion of mercury in a glass tube

13

Short Term Memory

Specification Equation = Distance covered and number of taps

R² = .95

86,000 possible items, 2-8 taps in length

Strictly Parallel Forms: Hundreds

Alternative mechanism: oral recitation of number series

[Figure: picture of a Knox Cube set-up (cubes numbered 1-4) with a tapping sequence; Taps = 5, Distance Covered = 9.]
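The two variables in the short term memory specification equation can be computed directly from a tapping sequence. The sketch below assumes that "distance covered" means the sum of absolute position changes between consecutive taps, with cubes numbered by position. That definition and the example sequence are assumptions chosen to agree with the figure's Taps = 5 and Distance Covered = 9, not details given in the presentation.

```python
def taps_and_distance(sequence):
    """Return (number of taps, distance covered) for a Knox Cube item.

    `sequence` lists the cube positions in the order tapped. Distance
    covered is taken here to be the sum of absolute position changes
    between consecutive taps (an assumed definition).
    """
    taps = len(sequence)
    distance = sum(abs(b - a) for a, b in zip(sequence, sequence[1:]))
    return taps, distance


# Hypothetical 5-tap item on a 4-cube set-up.
example_item = [1, 4, 1, 3, 2]
print(taps_and_distance(example_item))
```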

14

Example of a Picture Vocabulary Item (1500L)

Answer: Bucolic

Specification Equation: Log word frequency and dispersion across content domains (2 variables)

R² = .72

Strictly Parallel Forms: Thousands

Alternative mechanism: use the target word in a written or spoken sentence

15

“Atom and Atomic Theory”

Specification Equation: log mean word frequency and mean log sentence length (2 variables)

R² = .94

Strictly Parallel Forms: Millions

16

Cloze Example: “Atom and Atomic Theory”

17

“Reading is a process in which information from the text and the knowledge possessed by the reader act together to produce meaning.”

Anderson, R.C., Hiebert, E.H., Scott, J.A., & Wilkinson, I.A.G. (1985). Becoming a nation of readers: The report of the Commission on Reading. Urbana, IL: University of Illinois.

18

A Causal Rasch Model

Conceptual: Comprehension = Reader Ability - Text Complexity

Statistical:

$$\text{Raw Score} = \sum_{i} \frac{e^{(RA - TC_i)}}{1 + e^{(RA - TC_i)}}$$

RA = Reading Ability

TC = Text Complexity
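Reading the statistical form as a sum over items of Rasch success probabilities, the expected raw score can be computed as in the following sketch. It assumes RA and each TC_i are expressed in logits rather than Lexiles. The function names and the item values are illustrative.

```python
import math


def success_probability(reader_ability, text_complexity):
    """Rasch probability of a correct response: e^(RA - TC) / (1 + e^(RA - TC))."""
    return 1.0 / (1.0 + math.exp(-(reader_ability - text_complexity)))


def expected_raw_score(reader_ability, item_complexities):
    """Expected count correct: the success probabilities summed over items."""
    return sum(success_probability(reader_ability, tc)
               for tc in item_complexities)


# Hypothetical ensemble of five items (logits) and one reader.
items = [-1.0, -0.5, 0.0, 0.5, 1.0]
print(round(expected_raw_score(0.0, items), 3))
```

When the reader measure equals the complexity of every item, each probability is 0.5 and the expected raw score is half the number of items.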

19

The Measurement Trade-off Property

[Figure: a Reader Ability dial and a Text Complexity dial, each marked from 200L to 1700L, feeding a Comprehension display that reads 72%.]
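The trade-off property follows from the model depending only on the difference between reader ability and text complexity: turning both dials by the same amount leaves the comprehension display unchanged. A minimal sketch, assuming both quantities are in logits (the conversion between Lexiles and logits is not given here, so no attempt is made to reproduce the 72% reading):

```python
import math


def comprehension_rate(reader_ability, text_complexity):
    """Rasch success probability as a function of RA - TC (logits)."""
    return 1.0 / (1.0 + math.exp(-(reader_ability - text_complexity)))


baseline = comprehension_rate(1.0, 0.0)

# Raise text complexity by 0.8 and offset it with an equal gain in
# reader ability: the predicted outcome is held constant.
offset = comprehension_rate(1.0 + 0.8, 0.0 + 0.8)

# Raise text complexity with no compensating change: the outcome drops.
harder_text_only = comprehension_rate(1.0, 0.0 + 0.8)

print(round(baseline, 4), round(offset, 4), round(harder_text_only, 4))
```

An experimental test of the trade-off compares such predicted outcomes with observed counts correct after the manipulation.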

20

Why has the notion of a causal Rasch model been largely ignored for 30 years? Answer: Pervasive discomfort with “causal” talk.

How Many Ways Can We Say X Causes Y?

X “elicited a greater” Y
X “impacts” Y
X “accounts for” Y
X “has been linked to” Y
Y “is the result of” X
X “didn’t diminish” Y
Y “because of” X
Y “depends on” X
X “has led to” Y
X “largely motivates” Y
Y “stemmed from” X
X “proved critical to” Y
X “fosters” Y
X “changes” Y
X “triggers” Y
X “affects” Y

21

“The best way to understand something is to try to change it.”

Kurt Lewin

22

Features and Uses of Specification Equations

1. “Test validity” is an answer to the question “Does the instrument measure what you intend to measure?” The specification equation provides a statement of intention that is independent of the instrument. Such independence is required to avoid a circular argument.

2. Specification equations provide theory-based instrument calibrations. Thus, absolute person measures are generally objective, i.e., instrument independent.

3. The specification equation enables the development of large numbers of strictly parallel instruments.

4. The specification equation can be used to calibrate non-test situations by imagining them to be tests.

23

Features and Uses of Specification Equations cont’d.

5. The specification equation maintains the unit of measurement over time, context, task type, etc.

6. The specification equation specifies the mechanism that transmits variation in the attribute to the measurement outcome.

7. The specification equation specifies those features of the mechanism that can be traded off for a change in the attribute measure to produce predictable changes in the measurement outcome (these trade-offs test the quantitivity hypothesis).

8. The specification equation enables individual-centered measurement by eliminating the dependence on other persons to make a measurement for an individual.

9. The specification equation facilitates cost savings and an order-of-magnitude reduction in measurement error.

24

Theoretical versus Empirical Text Complexity for 719 Articles*

Reliability = 0.997

SEM = 12.8L

r = 0.968

r” = 0.969

R²” = 0.938

RMSE” = 89.6L

* Inclusion criteria: 50 encounters and 1,000 items

Mean Theoretical = 884.4L (356.2)

Mean Empirical = 884.4L (355.0)

25

Artifactual Sources of Variance in Empirical Text Complexity Measures

1. Random measurement error
2. Sampling error
3. Range restriction
4. Systematic error in empirical complexity measures
5. Wrong functional form (not linear)
6. Variation in empirical text complexities across estimation algorithms

We have estimated that the first three of these artifactual sources of variance account for no more than 4% of the total variance in the system – leaving 2% still unexplained. Sources 4-6 may account for this remaining 2%.
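The 4% and 2% figures appear to follow from the variance explained in the 719-article comparison: an R² of 0.938 leaves about 6% of variance unexplained, and subtracting the share attributed to the first three sources leaves about 2%. The arithmetic below is a reading of the slides, not a computation given in the presentation. The variable names are illustrative.

```python
# Values as reported in the presentation.
r_squared = 0.938        # variance in empirical complexity explained by theory
artifact_share = 0.04    # upper bound attributed to sources 1-3

unexplained_total = 1.0 - r_squared
remaining = unexplained_total - artifact_share

print(f"unexplained: {unexplained_total:.1%}, "
      f"after sources 1-3: {remaining:.1%}")
```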

26

[Figure: reading record for Student 1528 (7th grade, male, Hispanic, paid lunch). Labels: May 2007 to April 2011; May 2016 (12th Grade); Text Demands for College and Career. Axis ticks: 1000, 1200, 1400, 1600.]

347 Encounters, 138,695 Words, 3,342 Items, 983 Minutes

27

Contact Info:

A. Jackson Stenner
Chairman & CEO, MetaMetrics
University of North Carolina, Chapel Hill
[email protected]