
Upload: belinda-riley

Post on 03-Jan-2016

Page 1: Generalizability Theory Nothing more practical than a good theory!

Generalizability Theory

Nothing more practical than a good theory!

Page 2: Generalizability Theory Nothing more practical than a good theory!

This presentation is made by Prof. Zhao

Page 3: Generalizability Theory Nothing more practical than a good theory!

Overview of Presentation

Classes of reliability theories
Generalizability Theory
  G-study
  D-study
Illustrations

Page 4: Generalizability Theory Nothing more practical than a good theory!

Three Reliability Theories

Classical Test Theory
Generalizability Theory
Item Response Theory

Page 6: Generalizability Theory Nothing more practical than a good theory!

Generalizability Theory

Like classical test theory, generalizability theory rests on the concept of parallel measures, but it allows for multiple sources of error.

Generalizability concept: Reliability depends on the inferences (generalizations) that the investigator wishes to make from the measurement data.

Page 7: Generalizability Theory Nothing more practical than a good theory!

Illustration

Essay test
7 vignette-based essay questions
2 markers independently marking all questions for all examinees

Reliability in a classical framework:
Cronbach's alpha: 0.66
Inter-rater reliability (i.e. kappa): 0.71

Page 8: Generalizability Theory Nothing more practical than a good theory!

Fundamental Equation

$$X = T + E$$

X = observed score
T = true score
E = error score

$$\text{Reliability} = \frac{\mathrm{Var}(T)}{\mathrm{Var}(X)}$$

The larger the variance of T in relation to the variance of X, the higher the reliability.

Page 10: Generalizability Theory Nothing more practical than a good theory!

Fundamental Equation

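Because $X = T + E$ (X = observed score, T = true score, E = error score), and assuming T and E are uncorrelated, the variance of X is the sum of the variances of T and E:

$$\text{Reliability} = \frac{\mathrm{Var}(T)}{\mathrm{Var}(X)} = \frac{\mathrm{Var}(T)}{\mathrm{Var}(T) + \mathrm{Var}(E)}$$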

Page 11: Generalizability Theory Nothing more practical than a good theory!

Multiple sources of error variance

$$\text{Reliability} = \frac{\mathrm{Var}(T)}{\mathrm{Var}(T) + \mathrm{Var}(E)}$$

Var(E) is made up of several sources: markers, essays, and unexplained variance.

Page 12: Generalizability Theory Nothing more practical than a good theory!

Two steps in G analysis

1) G(eneralizability)-study: Estimation of the sources of variance that influence the measurement (e.g., variance between examinees, essays and markers)

2) D(ecision)-study: Estimation of reliability indices as a function of concrete sample size(s) (e.g., number of essays, number of markers)

Page 13: Generalizability Theory Nothing more practical than a good theory!

G-study steps

Determine facets (factors of variance)
Determine design
  Random vs fixed
  Crossed vs nested

Page 14: Generalizability Theory Nothing more practical than a good theory!

Crossed vs nested designs

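[Figure: crossed design, in which the same conditions (A, B) are combined with every unit (1 to 6), versus nested design, in which each unit (1 to 6) has its own conditions (A to L)]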

Page 15: Generalizability Theory Nothing more practical than a good theory!

G-study

Determine facets (factors of variance)
Determine design
  Random vs fixed
  Crossed vs nested
Collect data
Analysis of variance (ANOVA)
Estimation of variance components

Page 16: Generalizability Theory Nothing more practical than a good theory!

Illustration 1

Essay test
7 vignette-based open-ended questions
100 students
One marker marked all essays for all students

G-study questions:
Number of factors/facets? One-facet design
Random or fixed facets? Random
Nested or crossed? Crossed

Page 17: Generalizability Theory Nothing more practical than a good theory!

Sources of Variance

Person x Items

[Figure: Venn diagram with variance sources p, i, and pi,e]

Page 21: Generalizability Theory Nothing more practical than a good theory!

Variance component estimation (one facet design)

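An observed score for a person on an item ($X_{pi}$):

$$
\begin{aligned}
X_{pi} = {} & \mu && \text{[overall mean]} \\
& + (\mu_p - \mu) && \text{[person effect]} \\
& + (\mu_i - \mu) && \text{[item effect]} \\
& + (X_{pi} - \mu_p - \mu_i + \mu) && \text{[residual]}
\end{aligned}
$$

Each of these effects has a mean of 0 and a variance ($\sigma^2$). These variances are the variance components.

The variance of the observed scores $X_{pi}$ across all persons and items:

$$\hat{\sigma}^2(X_{pi}) = \hat{\sigma}^2_p + \hat{\sigma}^2_i + \hat{\sigma}^2_{pi,e}$$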
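As an illustration of how the variance components of a one-facet crossed design might be estimated, the following minimal Python sketch uses the usual expected-mean-square equations of a two-way ANOVA without replication. The function name `variance_components_p_x_i` and the toy score matrix are hypothetical and not taken from the slides; complete, balanced data are assumed.

```python
def variance_components_p_x_i(scores):
    """Estimate variance components for a crossed person x item design.

    scores[p][i] is the score of person p on item i (balanced, no missing data).
    """
    n_p = len(scores)
    n_i = len(scores[0])
    grand = sum(sum(row) for row in scores) / (n_p * n_i)
    person_means = [sum(row) / n_i for row in scores]
    item_means = [sum(scores[p][i] for p in range(n_p)) / n_p for i in range(n_i)]

    # Mean squares of the two-way ANOVA without replication.
    ms_p = n_i * sum((m - grand) ** 2 for m in person_means) / (n_p - 1)
    ms_i = n_p * sum((m - grand) ** 2 for m in item_means) / (n_i - 1)
    ss_res = sum(
        (scores[p][i] - person_means[p] - item_means[i] + grand) ** 2
        for p in range(n_p)
        for i in range(n_i)
    )
    ms_res = ss_res / ((n_p - 1) * (n_i - 1))

    # Solve the expected-mean-square equations for the variance components.
    return {
        "p": (ms_p - ms_res) / n_i,
        "i": (ms_i - ms_res) / n_p,
        "pi,e": ms_res,
    }


# Hypothetical toy data: 3 persons x 2 items, purely additive (no residual).
toy_scores = [[0, 1], [2, 3], [4, 5]]
components = variance_components_p_x_i(toy_scores)
print(components)
```

With real data the estimates can come out slightly negative through sampling error; how to handle that is a separate decision and is not covered by this sketch.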

Page 22: Generalizability Theory Nothing more practical than a good theory!

Variance components

P x I design

Source    Estimated variance component    Standard error    % of total variance
p         97.57                           19.02             13.35
i         261.24                          112.98            35.75
pi,e      371.97                          17.60             50.90

Page 24: Generalizability Theory Nothing more practical than a good theory!

Sources of Variance

Items : Persons (items nested within persons)

[Figure: Venn diagram with variance sources p and i,pi,e]

Page 25: Generalizability Theory Nothing more practical than a good theory!

Variance components

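I : P design

Source    Estimated variance component    % of total variance
p         97.57                           13.35
i,pi,e    633.21                          86.65

In the nested design the item effect cannot be separated from the residual, so the i and pi,e components of the crossed design (261.24 and 371.97; 35.75% and 50.90%) are confounded into a single component: 261.24 + 371.97 = 633.21, or 86.65% of the total variance.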

Page 27: Generalizability Theory Nothing more practical than a good theory!

Sources of Variance

Person x Items x Judges

[Figure: Venn diagram with variance sources p, i, j, pi, pj, ij, and pij,e]

Page 28: Generalizability Theory Nothing more practical than a good theory!

Variance component estimation (two facet design)

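An observed score for a person on an item as scored by a judge ($X_{pij}$):

$$
\begin{aligned}
X_{pij} = {} & \mu && \text{[overall mean]} \\
& + (\mu_p - \mu) && \text{[person effect]} \\
& + (\mu_i - \mu) && \text{[item effect]} \\
& + (\mu_j - \mu) && \text{[judge effect]} \\
& + (\mu_{pj} - \mu_p - \mu_j + \mu) && \text{[person by judge effect]} \\
& + (\mu_{pi} - \mu_p - \mu_i + \mu) && \text{[person by item effect]} \\
& + (\mu_{ij} - \mu_i - \mu_j + \mu) && \text{[judge by item effect]} \\
& + (X_{pij} - \mu_{pj} - \mu_{pi} - \mu_{ij} + \mu_p + \mu_j + \mu_i - \mu) && \text{[residual]}
\end{aligned}
$$

The variance of the observed scores $X_{pij}$ across all persons, items and judges:

$$\hat{\sigma}^2(X_{pij}) = \hat{\sigma}^2_p + \hat{\sigma}^2_i + \hat{\sigma}^2_j + \hat{\sigma}^2_{pi} + \hat{\sigma}^2_{pj} + \hat{\sigma}^2_{ij} + \hat{\sigma}^2_{pij,e}$$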

Page 29: Generalizability Theory Nothing more practical than a good theory!

Variance components

P x I x J design

Source    Estimated variance component    % of total variance
p         48.71                           10.57
i         25.12                           5.45
j         15.00                           3.26
pi        185.87                          40.33
pj        33.18                           7.20
ij        80.00                           17.36
pij,e     72.94                           15.83

Page 32: Generalizability Theory Nothing more practical than a good theory!

Interpretation of scores

Norm-oriented perspective: Scores have relative meaning; scores have meaning in relation to each other.

Domain-oriented perspective: Scores have absolute meaning in relation to the domain of measurement.

Mastery-oriented perspective: Scores have meaning in relation to a cut-off score (reliability of decisions, not of scores).

Page 34: Generalizability Theory Nothing more practical than a good theory!

Illustration 1

Essay test
7 vignette-based essay questions
1 marker marked all questions for all examinees
Norm-referenced perspective

Calculate the generalizability coefficient!

Page 35: Generalizability Theory Nothing more practical than a good theory!

D-study (ni = 7; norm-referenced)

Using the P x I variance components (p = 97.57, i = 261.24, pi,e = 371.97):

$$G = \frac{\mathrm{Var}(T)}{\mathrm{Var}(T) + \mathrm{Var}(E)} = \frac{\hat{\sigma}^2_p}{\hat{\sigma}^2_p + \hat{\sigma}^2_{pi,e}/n_i} = \frac{97.57}{97.57 + 371.97/7} = 0.65$$
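The calculation above can be sketched in a few lines of Python. The function name `g_coefficient_one_facet` is a hypothetical helper introduced here for illustration; the variance components are the ones from the P x I table.

```python
def g_coefficient_one_facet(var_p, var_pi_e, n_items):
    """Generalizability coefficient for a p x i design (norm-referenced)."""
    # Relative error: only the person-by-item residual, averaged over items.
    relative_error = var_pi_e / n_items
    return var_p / (var_p + relative_error)


g7 = g_coefficient_one_facet(var_p=97.57, var_pi_e=371.97, n_items=7)
print(round(g7, 2))  # 0.65
```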

Page 36: Generalizability Theory Nothing more practical than a good theory!

Illustration 2

Essay test
7 vignette-based essay questions
1 marker marked all questions for all examinees
Domain-referenced perspective

Calculate the dependability coefficient!

Page 37: Generalizability Theory Nothing more practical than a good theory!

D-study (ni = 7; domain referenced)

Using the P x I variance components (p = 97.57, i = 261.24, pi,e = 371.97):

$$D = \frac{\hat{\sigma}^2_p}{\hat{\sigma}^2_p + \hat{\sigma}^2_i/n_i + \hat{\sigma}^2_{pi,e}/n_i} = \frac{97.57}{97.57 + 261.24/7 + 371.97/7} = 0.52$$

Page 38: Generalizability Theory Nothing more practical than a good theory!

Illustration 3

Essay test
7 vignette-based essay questions
1 marker marked all questions for all examinees
Domain-referenced perspective

Calculate the dependability coefficient for a sample of 10 essays!

Page 39: Generalizability Theory Nothing more practical than a good theory!

D-study (ni = 10; domain referenced)

Using the P x I variance components (p = 97.57, i = 261.24, pi,e = 371.97):

$$D = \frac{\hat{\sigma}^2_p}{\hat{\sigma}^2_p + \hat{\sigma}^2_i/n_i + \hat{\sigma}^2_{pi,e}/n_i} = \frac{97.57}{97.57 + 261.24/10 + 371.97/10} = 0.61$$

Page 40: Generalizability Theory Nothing more practical than a good theory!

D-studies for several item samples

N essays    Generalizability coefficient (G)    Dependability coefficient (D)
1           0.21                                0.13
5           0.57                                0.44
7           0.65                                0.52
10          0.72                                0.61
15          0.80                                0.70
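A D-study of this kind is easy to script. The following minimal Python sketch recomputes both coefficients for the item samples in the table above from the P x I variance components; the names `one_facet_d_study` and `table` are hypothetical.

```python
# Variance components from the P x I G-study.
VAR_P, VAR_I, VAR_PI_E = 97.57, 261.24, 371.97


def one_facet_d_study(n_items):
    """Return (G, D) for a p x i design with n_items items."""
    relative_error = VAR_PI_E / n_items                   # norm-referenced
    absolute_error = VAR_I / n_items + relative_error     # domain-referenced
    g = VAR_P / (VAR_P + relative_error)
    d = VAR_P / (VAR_P + absolute_error)
    return g, d


table = {n: one_facet_d_study(n) for n in (1, 5, 7, 10, 15)}
for n, (g, d) in table.items():
    print(f"{n:>2} essays: G = {g:.2f}, D = {d:.2f}")
```

D is never larger than G for the same number of essays, because the absolute error adds the item variance to the relative error.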

Page 41: Generalizability Theory Nothing more practical than a good theory!

Illustration 4

Essay test
7 vignette-based essay questions
2 markers independently marked all questions for all examinees
Norm-referenced perspective

Calculate the generalizability coefficient!

Page 42: Generalizability Theory Nothing more practical than a good theory!

D-study (ni=7; nj=2; norm referenced)

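Using the P x I x J variance components (p = 48.71, i = 25.12, j = 15.00, pi = 185.87, pj = 33.18, ij = 80.00, pij,e = 72.94):

$$G = \frac{\hat{\sigma}^2_p}{\hat{\sigma}^2_p + \hat{\sigma}^2_{pi}/n_i + \hat{\sigma}^2_{pj}/n_j + \hat{\sigma}^2_{pij,e}/(n_i n_j)} = \frac{48.71}{48.71 + 185.87/7 + 33.18/2 + 72.94/(2 \times 7)} = 0.50$$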

Page 43: Generalizability Theory Nothing more practical than a good theory!

Illustration 5

Essay test
7 vignette-based essay questions
2 markers independently marked all questions for all examinees
Domain-referenced perspective

Calculate the dependability coefficient!

Page 44: Generalizability Theory Nothing more practical than a good theory!

D-study (ni=7; nj=2; domain referenced)

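Using the P x I x J variance components (p = 48.71, i = 25.12, j = 15.00, pi = 185.87, pj = 33.18, ij = 80.00, pij,e = 72.94), with each component divided by the number of conditions it is averaged over ($n_i = 7$, $n_j = 2$, $n_i n_j = 14$):

$$D = \frac{48.71}{48.71 + 25.12/7 + 15.00/2 + 185.87/7 + 33.18/2 + 80.00/14 + 72.94/14} = 0.43$$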
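For the two-facet crossed design, both coefficients can be computed from the same set of variance components. The sketch below is a minimal illustration in Python; the helper `two_facet_crossed` and the dictionary layout are assumptions made here, while the numbers come from the P x I x J table.

```python
# Variance components from the P x I x J G-study.
COMPONENTS = {
    "p": 48.71, "i": 25.12, "j": 15.00,
    "pi": 185.87, "pj": 33.18, "ij": 80.00, "pij,e": 72.94,
}


def two_facet_crossed(vc, n_i, n_j):
    """Return (G, D) for a fully crossed p x i x j design."""
    # Relative error: every component that interacts with persons.
    relative_error = vc["pi"] / n_i + vc["pj"] / n_j + vc["pij,e"] / (n_i * n_j)
    # Absolute error: additionally the item, judge and item-by-judge components.
    absolute_error = (
        relative_error + vc["i"] / n_i + vc["j"] / n_j + vc["ij"] / (n_i * n_j)
    )
    g = vc["p"] / (vc["p"] + relative_error)
    d = vc["p"] / (vc["p"] + absolute_error)
    return g, d


g, d = two_facet_crossed(COMPONENTS, n_i=7, n_j=2)
print(round(g, 2), round(d, 2))  # 0.5 0.43
```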

Page 45: Generalizability Theory Nothing more practical than a good theory!

Illustration 6

Essay test
7 vignette-based essay questions
2 different markers independently marked each question for all examinees
Norm-referenced perspective

Calculate the generalizability coefficient!

Page 46: Generalizability Theory Nothing more practical than a good theory!

D-study (ni=7; nj=2; norm referenced)

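(Judges : Items) x Persons

Source      Estimated variance component    % of total variance
p           48.71                           10.57
i           25.12                           5.45
j,ij        95.00                           20.62
pi          185.87                          40.33
pj,pij,e    106.12                          23.03

$$G = \frac{\hat{\sigma}^2_p}{\hat{\sigma}^2_p + \hat{\sigma}^2_{pi}/n_i + \hat{\sigma}^2_{pj,pij,e}/(n_i n_j)} = \frac{48.71}{48.71 + 185.87/7 + 106.12/(2 \times 7)}$$

The slide reports G = 0.52; evaluating the expression as written gives approximately 0.59.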

Page 47: Generalizability Theory Nothing more practical than a good theory!

D-study summary table

Norm-referenced score interpretation

                    Same marker for all essays    Different marker for each essay
Number of essays    One marker    Two markers     One marker    Two markers
5                   0.36          0.44            0.39          0.46
7                   0.41          0.50            0.47          0.54
10                  0.45          0.56            0.56          0.63
15                  0.49          0.61            0.65          0.72

Page 48: Generalizability Theory Nothing more practical than a good theory!

Another reliability index

Reliability coefficient (G and D coefficients)
  Scale independent (0-1)
  Non-intuitive interpretation

Standard Error of Measurement (SEM)
  Intuitive interpretation
  Scale dependent

Page 49: Generalizability Theory Nothing more practical than a good theory!

Standard Error of Measurement

$$X = T + E$$

X = observed score
T = true score
E = error score

$$\text{Reliability index} = \frac{\mathrm{Var}(T)}{\mathrm{Var}(T) + \mathrm{Var}(E)}$$

$$\text{Standard Error of Measurement (SEM)} = \sqrt{\mathrm{Var}(E)}$$

Page 50: Generalizability Theory Nothing more practical than a good theory!

Interpretation of SEM

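Suppose an examinee has a score of 60% and the SEM is 5:

60 ± 5: from 55 to 65 (± 1 SEM, approximately a 68% CI)
60 ± 1.96 x 5 ≈ 60 ± 10: from 50 to 70 (95% CI)
60 ± 2.14 x 5 ≈ 60 ± 11: from 49 to 71 (95% CI)

[Figure: score scale from 45 to 75 with each interval marked around 60]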
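The intervals above follow from score ± multiplier x SEM. A minimal Python sketch, with a hypothetical helper `confidence_interval`, reproduces the 1.96 case before rounding:

```python
def confidence_interval(score, sem, multiplier):
    """Return (lower, upper) bounds of score +/- multiplier * SEM."""
    half_width = multiplier * sem
    return score - half_width, score + half_width


low, high = confidence_interval(score=60, sem=5, multiplier=1.96)
print(round(low, 1), round(high, 1))  # 50.2 69.8
```

The slide rounds the half-width of 9.8 to 10, which gives the interval from 50 to 70.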

Page 51: Generalizability Theory Nothing more practical than a good theory!

D-study (ni = 7; norm referenced)

Using the P x I variance components (p = 97.57, i = 261.24, pi,e = 371.97):

$$G = \frac{97.57}{97.57 + 371.97/7} = 0.65$$

$$\text{SEM} = \sqrt{371.97/7} = 7.29$$

Page 52: Generalizability Theory Nothing more practical than a good theory!

D-study (ni=7; nj=2; domain referenced)

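Using the P x I x J variance components (p = 48.71, i = 25.12, j = 15.00, pi = 185.87, pj = 33.18, ij = 80.00, pij,e = 72.94):

$$D = \frac{48.71}{48.71 + 25.12/7 + 15.00/2 + 185.87/7 + 33.18/2 + 80.00/14 + 72.94/14} = 0.43$$

$$\text{SEM} = \sqrt{25.12/7 + 15.00/2 + 185.87/7 + 33.18/2 + 80.00/14 + 72.94/14}$$

The slide reports SEM = 8.57; the error terms above sum to approximately 65.16, whose square root is approximately 8.07.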

Page 54: Generalizability Theory Nothing more practical than a good theory!

Scenario CEX

A clinical mini exercise (CEX) was developed in which examinees are periodically observed and rated on a rating form. An investigator analyzed a data set from 88 residents who were each observed on 4 occasions, each time by a single, different examiner (cf. Norcini JJ, Blank LL, Arnold GK, Kimball HR. The mini-CEX (Clinical Evaluation Exercise): A preliminary investigation. Annals of Internal Medicine 1995;123:795-799).

Variance components: p; o:p (o, op and e confounded)

$$G = \frac{\sigma^2_p}{\sigma^2_p + \sigma^2_{o:p}/4} = D$$

Page 55: Generalizability Theory Nothing more practical than a good theory!

Scenario OSCE I

An OSCE consisting of 15 stations was administered to 100 final-year students. Each station was scored by two independent examiners on a case-specific checklist. Different examiners were used in each station.

Variance components: p, s, j:s, ps, pj:s

$$G = \frac{\sigma^2_p}{\sigma^2_p + \sigma^2_{ps}/15 + \sigma^2_{pj:s}/(2 \times 15)}$$

Page 56: Generalizability Theory Nothing more practical than a good theory!

Scenario OSCE II

An experimental OSCE was administered to 20 residents. Each resident was tested on a different day. For each resident, 3 stations were organized around real patients who were available that day. Two examiners observed all residents in all stations and completed a generic rating scale.

Variance components: p, s:p, j, pj, sj:p (confounded with the residual)

$$D = \frac{\sigma^2_p}{\sigma^2_p + \sigma^2_{s:p}/3 + \sigma^2_j/2 + \sigma^2_{pj}/2 + \sigma^2_{sj:p}/(3 \times 2)}$$

Page 57: Generalizability Theory Nothing more practical than a good theory!

Scenario Clerkship Evaluation

An investigator wishes to evaluate the teaching quality of 10 clinical clerkships. She developed a questionnaire with 30 items on various quality aspects. In each clerkship the questionnaire was completed by 50 students.

Variance components: c, i, s:c, ci, cs:i

$$G = \frac{\sigma^2_c}{\sigma^2_c + \sigma^2_{s:c}/50 + \sigma^2_{ci}/30 + \sigma^2_{cs:i}/(50 \times 30)}$$

PS: It is doubtful that i is a random facet; i could be treated as fixed or ignored.

Page 58: Generalizability Theory Nothing more practical than a good theory!

Further reading & software

Literature

Cronbach LJ, Gleser GC, Nanda H, Rajaratnam N. The dependability of behavioral measurements: Theory of generalizability for scores and profiles. New York: Wiley, 1972.
  Original monograph on generalizability theory. Complete, but hardly accessible for any reader.

Brennan RL. Elements of Generalizability Theory. Iowa: ACT Publications, 1983.
  This is the resource book for most specialists. Not easy for readers without statistical training.

Shavelson RJ, Webb NM. Generalizability theory: A primer. Newbury Park, CA: Sage Publications, 1991.
  Good and accessible introduction to generalizability theory for any reader.

Software

GENOVA
  Conducts G and D studies and provides ample statistical information. Operates on any PC. The program is relatively old and not user friendly. Available from Dr. J. Crick, National Board of Medical Examiners, 3750 Market Street, Philadelphia, PA 19104-3190, USA.

SPSS
  SPSS General Linear Models, subprogram Variance Components, estimates variance components (also for unbalanced designs). D-studies need to be done manually.