generalizability theory
![Page 1: Generalizability Theory](https://reader030.vdocuments.net/reader030/viewer/2022012303/568130e3550346895d96f5c4/html5/thumbnails/1.jpg)
Generalizability Theory
Nothing more practical than a good theory!
![Page 2: Generalizability Theory](https://reader030.vdocuments.net/reader030/viewer/2022012303/568130e3550346895d96f5c4/html5/thumbnails/2.jpg)
This presentation is made by Prof. Zhao
![Page 3: Generalizability Theory](https://reader030.vdocuments.net/reader030/viewer/2022012303/568130e3550346895d96f5c4/html5/thumbnails/3.jpg)
Overview of Presentation
- Classes of reliability theories
- Generalizability Theory
  - G-study
  - D-study
- Illustrations
![Page 4: Generalizability Theory](https://reader030.vdocuments.net/reader030/viewer/2022012303/568130e3550346895d96f5c4/html5/thumbnails/4.jpg)
Three Reliability Theories
- Classical Test Theory
- Generalizability Theory
- Item Response Theory
![Page 6: Generalizability Theory](https://reader030.vdocuments.net/reader030/viewer/2022012303/568130e3550346895d96f5c4/html5/thumbnails/6.jpg)
Generalizability Theory
Like classical test theory, generalizability theory rests on the concept of parallel measures, but it allows for multiple sources of error.
Generalizability concept: reliability depends on the inferences (generalizations) that the investigator wishes to make from the measurement data.
![Page 7: Generalizability Theory](https://reader030.vdocuments.net/reader030/viewer/2022012303/568130e3550346895d96f5c4/html5/thumbnails/7.jpg)
Illustration
Essay test:
- 7 vignette-based essay questions
- 2 markers independently marking all questions for all examinees

Reliability in a classical framework:
- Cronbach's alpha: 0.66
- Inter-rater reliability (kappa): 0.71
![Page 8: Generalizability Theory](https://reader030.vdocuments.net/reader030/viewer/2022012303/568130e3550346895d96f5c4/html5/thumbnails/8.jpg)
Fundamental Equation
$$X = T + E$$

where $X$ = observed score, $T$ = true score, and $E$ = error score.

$$\text{Reliability} = \frac{\operatorname{Var}(T)}{\operatorname{Var}(X)}$$

The larger the variance of $T$ in relation to the variance of $X$, the higher the reliability.
![Page 10: Generalizability Theory](https://reader030.vdocuments.net/reader030/viewer/2022012303/568130e3550346895d96f5c4/html5/thumbnails/10.jpg)
Fundamental Equation
$$X = T + E$$

where $X$ = observed score, $T$ = true score, and $E$ = error score.

$$\text{Reliability} = \frac{\operatorname{Var}(T)}{\operatorname{Var}(X)} = \frac{\operatorname{Var}(T)}{\operatorname{Var}(T) + \operatorname{Var}(E)}$$
![Page 11: Generalizability Theory](https://reader030.vdocuments.net/reader030/viewer/2022012303/568130e3550346895d96f5c4/html5/thumbnails/11.jpg)
Multiple sources of error variance
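$$\text{Reliability} = \frac{\operatorname{Var}(T)}{\operatorname{Var}(T) + \operatorname{Var}(E)}$$

The error variance $\operatorname{Var}(E)$ has several sources: markers, essays, and unexplained variance.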
![Page 12: Generalizability Theory](https://reader030.vdocuments.net/reader030/viewer/2022012303/568130e3550346895d96f5c4/html5/thumbnails/12.jpg)
Two steps in G analysis
1) G(eneralizability)-study: estimation of the sources of variance that influence the measurement (e.g., variance between examinees, essays, and markers)
2) D(ecision)-study: estimation of reliability indices as a function of concrete sample size(s) (e.g., number of essays, number of markers)
![Page 13: Generalizability Theory](https://reader030.vdocuments.net/reader030/viewer/2022012303/568130e3550346895d96f5c4/html5/thumbnails/13.jpg)
G-study steps
- Determine facets (factors of variance)
- Determine design
  - Random vs fixed
  - Crossed vs nested
![Page 14: Generalizability Theory](https://reader030.vdocuments.net/reader030/viewer/2022012303/568130e3550346895d96f5c4/html5/thumbnails/14.jpg)
Crossed vs nested designs
[Figure: schematic contrasting a crossed design with a nested design]
![Page 15: Generalizability Theory](https://reader030.vdocuments.net/reader030/viewer/2022012303/568130e3550346895d96f5c4/html5/thumbnails/15.jpg)
G-study
- Determine facets (factors of variance)
- Determine design
  - Random vs fixed
  - Crossed vs nested
- Collect data
- Analysis of variance (ANOVA)
- Estimation of variance components
![Page 16: Generalizability Theory](https://reader030.vdocuments.net/reader030/viewer/2022012303/568130e3550346895d96f5c4/html5/thumbnails/16.jpg)
Illustration 1
Essay test:
- 7 vignette-based open-ended questions
- 100 students
- One marker marked all essays for all students

G-study questions:
- Number of factors/facets? One-facet design
- Random or fixed facets? Random
- Nested or crossed? Crossed
![Page 17: Generalizability Theory](https://reader030.vdocuments.net/reader030/viewer/2022012303/568130e3550346895d96f5c4/html5/thumbnails/17.jpg)
Sources of Variance
Person x Items
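[Venn diagram of the variance components: p (persons), i (items), and pi,e (person-by-item interaction confounded with residual error)]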
![Page 21: Generalizability Theory](https://reader030.vdocuments.net/reader030/viewer/2022012303/568130e3550346895d96f5c4/html5/thumbnails/21.jpg)
Variance component estimation (one facet design)
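An observed score for person $p$ on item $i$ ($X_{pi}$) decomposes as:

$$
\begin{aligned}
X_{pi} ={}& \mu && \text{[overall mean]} \\
&+ (\mu_p - \mu) && \text{[person effect]} \\
&+ (\mu_i - \mu) && \text{[item effect]} \\
&+ (X_{pi} - \mu_p - \mu_i + \mu) && \text{[residual]}
\end{aligned}
$$

Each of these effects has a mean of 0 and a variance ($\sigma^2$). These variances are the variance components.

The variance of the observed scores $X_{pi}$ across all persons and items is:

$$\sigma^2(X_{pi}) = \sigma^2_p + \sigma^2_i + \sigma^2_{pi,e}$$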
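The decomposition above can be turned into estimates with the usual ANOVA mean squares for a fully crossed persons × items table. The following is a minimal sketch, assuming a complete score matrix with exactly one score per person-item cell. The function name `estimate_pxi_components` and the toy data are illustrative and do not come from the presentation.

```python
def estimate_pxi_components(scores):
    """Estimate variance components for a crossed p x i design.

    `scores` is a list of rows (one per person), each a list of item scores.
    Uses the expected-mean-square equations of a two-way random-effects
    ANOVA with one observation per cell.
    """
    n_p = len(scores)
    n_i = len(scores[0])
    grand = sum(sum(row) for row in scores) / (n_p * n_i)
    person_means = [sum(row) / n_i for row in scores]
    item_means = [sum(row[k] for row in scores) / n_p for k in range(n_i)]

    # Sums of squares for persons, items, and the residual (pi,e).
    ss_p = n_i * sum((m - grand) ** 2 for m in person_means)
    ss_i = n_p * sum((m - grand) ** 2 for m in item_means)
    ss_res = sum(
        (scores[a][b] - person_means[a] - item_means[b] + grand) ** 2
        for a in range(n_p)
        for b in range(n_i)
    )

    # Mean squares.
    ms_p = ss_p / (n_p - 1)
    ms_i = ss_i / (n_i - 1)
    ms_res = ss_res / ((n_p - 1) * (n_i - 1))

    # Solve the expected-mean-square equations for the components.
    return {
        "p": (ms_p - ms_res) / n_i,
        "i": (ms_i - ms_res) / n_p,
        "pi,e": ms_res,
    }


# Hypothetical toy data: 3 persons x 2 items.
toy = [[1.0, 2.0], [3.0, 5.0], [5.0, 6.0]]
components = estimate_pxi_components(toy)
print(components)
```

With real data these estimates can come out negative when a mean square is smaller than the residual mean square. This sketch returns such values unchanged.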
![Page 22: Generalizability Theory](https://reader030.vdocuments.net/reader030/viewer/2022012303/568130e3550346895d96f5c4/html5/thumbnails/22.jpg)
Variance components
P x I design

| Source | Estimated variance component | Standard error | Percentage of total variance |
|---|---|---|---|
| p | 97.57 | 19.02 | 13.35 |
| i | 261.24 | 112.98 | 35.75 |
| pi,e | 371.97 | 17.60 | 50.90 |
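The percentage column is each component divided by the sum of all components. As a small check on the reported values, assuming only the three P x I estimates given on this slide:

```python
# Estimated variance components for the P x I design (from the slide).
components = {"p": 97.57, "i": 261.24, "pi,e": 371.97}

total = sum(components.values())
percentages = {
    source: round(100 * variance / total, 2)
    for source, variance in components.items()
}
print(percentages)
```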
![Page 24: Generalizability Theory](https://reader030.vdocuments.net/reader030/viewer/2022012303/568130e3550346895d96f5c4/html5/thumbnails/24.jpg)
Sources of Variance
Items : Persons
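[Venn diagram of the variance components: p (persons) and i,pi,e (items nested within persons, confounded with residual error)]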
![Page 25: Generalizability Theory](https://reader030.vdocuments.net/reader030/viewer/2022012303/568130e3550346895d96f5c4/html5/thumbnails/25.jpg)
Variance components
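I : P design

| Source | Estimated variance component | Percentage of total variance |
|---|---|---|
| p | 97.57 | 13.35 |
| i,pi,e | 633.21 | 86.65 |

In the nested design, the item component (i: 261.24, 35.75%) and the residual component (pi,e: 371.97, 50.90%) of the crossed design cannot be separated. They combine into the single component i,pi,e (261.24 + 371.97 = 633.21).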
![Page 27: Generalizability Theory](https://reader030.vdocuments.net/reader030/viewer/2022012303/568130e3550346895d96f5c4/html5/thumbnails/27.jpg)
Sources of Variance
Person x Items x Judges
[Venn diagram of the variance components: p, i, j, pi, pj, ij, and pij,e]
![Page 28: Generalizability Theory](https://reader030.vdocuments.net/reader030/viewer/2022012303/568130e3550346895d96f5c4/html5/thumbnails/28.jpg)
Variance component estimation (two facet design)
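An observed score for person $p$ on item $i$ rated by judge $j$ ($X_{pij}$) decomposes as:

$$
\begin{aligned}
X_{pij} ={}& \mu && \text{[overall mean]} \\
&+ (\mu_p - \mu) && \text{[person effect]} \\
&+ (\mu_i - \mu) && \text{[item effect]} \\
&+ (\mu_j - \mu) && \text{[judge effect]} \\
&+ (\mu_{pi} - \mu_p - \mu_i + \mu) && \text{[person by item effect]} \\
&+ (\mu_{pj} - \mu_p - \mu_j + \mu) && \text{[person by judge effect]} \\
&+ (\mu_{ij} - \mu_i - \mu_j + \mu) && \text{[item by judge effect]} \\
&+ (X_{pij} - \mu_{pi} - \mu_{pj} - \mu_{ij} + \mu_p + \mu_i + \mu_j - \mu) && \text{[residual]}
\end{aligned}
$$

The variance of the observed scores $X_{pij}$ across all persons, items, and judges is:

$$\sigma^2(X_{pij}) = \sigma^2_p + \sigma^2_i + \sigma^2_j + \sigma^2_{pi} + \sigma^2_{pj} + \sigma^2_{ij} + \sigma^2_{pij,e}$$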
![Page 29: Generalizability Theory](https://reader030.vdocuments.net/reader030/viewer/2022012303/568130e3550346895d96f5c4/html5/thumbnails/29.jpg)
Variance components
P x I x J design

| Source | Estimated variance component | Percentage of total variance |
|---|---|---|
| p | 48.71 | 10.57 |
| i | 25.12 | 5.45 |
| j | 15.00 | 3.26 |
| pi | 185.87 | 40.33 |
| pj | 33.18 | 7.20 |
| ij | 80.00 | 17.36 |
| pij,e | 72.94 | 15.83 |
![Page 32: Generalizability Theory](https://reader030.vdocuments.net/reader030/viewer/2022012303/568130e3550346895d96f5c4/html5/thumbnails/32.jpg)
Interpretation of scores
- Norm-oriented perspective: scores have relative meaning; they have meaning in relation to each other
- Domain-oriented perspective: scores have absolute meaning with respect to the domain of measurement
- Mastery-oriented perspective: scores have meaning in relation to a cut-off score (reliability of decisions, not of scores)
![Page 34: Generalizability Theory](https://reader030.vdocuments.net/reader030/viewer/2022012303/568130e3550346895d96f5c4/html5/thumbnails/34.jpg)
Illustration 1
Essay test:
- 7 vignette-based essay questions
- 1 marker marked all questions for all examinees
- Norm-referenced perspective

Calculate the generalizability coefficient!
![Page 35: Generalizability Theory](https://reader030.vdocuments.net/reader030/viewer/2022012303/568130e3550346895d96f5c4/html5/thumbnails/35.jpg)
D-study (ni = 7; norm-referenced)
Variance components of the P x I design: $\sigma^2_p = 97.57$, $\sigma^2_i = 261.24$, $\sigma^2_{pi,e} = 371.97$.

$$G = \frac{\sigma^2_T}{\sigma^2_T + \sigma^2_E} = \frac{\sigma^2_p}{\sigma^2_p + \sigma^2_{pi,e}/n_i} = \frac{97.57}{97.57 + 371.97/7} = 0.65$$
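The calculation above is easy to reproduce. This is a minimal sketch for the one-facet crossed design, assuming the variance components reported on the slide; the function name `g_coefficient` is illustrative.

```python
def g_coefficient(var_p, var_pi_e, n_items):
    """Generalizability coefficient (norm-referenced) for a crossed p x i design."""
    # Only the person-by-item residual contributes to relative error.
    relative_error = var_pi_e / n_items
    return var_p / (var_p + relative_error)


g7 = g_coefficient(var_p=97.57, var_pi_e=371.97, n_items=7)
print(round(g7, 2))  # 0.65
```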
![Page 36: Generalizability Theory](https://reader030.vdocuments.net/reader030/viewer/2022012303/568130e3550346895d96f5c4/html5/thumbnails/36.jpg)
Illustration 2
Essay test:
- 7 vignette-based essay questions
- 1 marker marked all questions for all examinees
- Domain-referenced perspective

Calculate the dependability coefficient!
![Page 37: Generalizability Theory](https://reader030.vdocuments.net/reader030/viewer/2022012303/568130e3550346895d96f5c4/html5/thumbnails/37.jpg)
D-study (ni = 7; domain referenced)
Variance components of the P x I design: $\sigma^2_p = 97.57$, $\sigma^2_i = 261.24$, $\sigma^2_{pi,e} = 371.97$.

$$D = \frac{\sigma^2_p}{\sigma^2_p + \sigma^2_i/n_i + \sigma^2_{pi,e}/n_i} = \frac{97.57}{97.57 + 261.24/7 + 371.97/7} = 0.52$$
![Page 38: Generalizability Theory](https://reader030.vdocuments.net/reader030/viewer/2022012303/568130e3550346895d96f5c4/html5/thumbnails/38.jpg)
Illustration 3
Essay test:
- 7 vignette-based essay questions
- 1 marker marked all questions for all examinees
- Domain-referenced perspective

Calculate the dependability coefficient for a sample of 10 essays!
![Page 39: Generalizability Theory](https://reader030.vdocuments.net/reader030/viewer/2022012303/568130e3550346895d96f5c4/html5/thumbnails/39.jpg)
D-study (ni = 10; domain referenced)
Variance components of the P x I design: $\sigma^2_p = 97.57$, $\sigma^2_i = 261.24$, $\sigma^2_{pi,e} = 371.97$.

$$D = \frac{\sigma^2_p}{\sigma^2_p + \sigma^2_i/n_i + \sigma^2_{pi,e}/n_i} = \frac{97.57}{97.57 + 261.24/10 + 371.97/10} = 0.61$$
![Page 40: Generalizability Theory](https://reader030.vdocuments.net/reader030/viewer/2022012303/568130e3550346895d96f5c4/html5/thumbnails/40.jpg)
D-studies for several item samples
| Number of essays | Generalizability coefficient (G) | Dependability coefficient (D) |
|---|---|---|
| 1 | 0.21 | 0.13 |
| 5 | 0.57 | 0.44 |
| 7 | 0.65 | 0.52 |
| 10 | 0.72 | 0.61 |
| 15 | 0.80 | 0.70 |
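The table follows from evaluating both coefficients at each number of essays. A minimal sketch, assuming the P x I variance components reported earlier in the presentation; the names `d_study` and `table` are illustrative.

```python
# Variance components of the P x I design (from the G-study slide).
VAR_P, VAR_I, VAR_PI_E = 97.57, 261.24, 371.97


def d_study(n_items):
    """Return (G, D) rounded to two decimals for a given number of essays."""
    relative_error = VAR_PI_E / n_items            # norm-referenced error
    absolute_error = (VAR_I + VAR_PI_E) / n_items  # domain-referenced error
    g = VAR_P / (VAR_P + relative_error)
    d = VAR_P / (VAR_P + absolute_error)
    return round(g, 2), round(d, 2)


table = {n: d_study(n) for n in (1, 5, 7, 10, 15)}
for n, (g, d) in table.items():
    print(n, g, d)
```

Because the item component enters only the absolute error, D is never larger than G for the same number of essays.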
![Page 41: Generalizability Theory](https://reader030.vdocuments.net/reader030/viewer/2022012303/568130e3550346895d96f5c4/html5/thumbnails/41.jpg)
Illustration 4
Essay test:
- 7 vignette-based essay questions
- 2 markers independently marked all questions for all examinees
- Norm-referenced perspective

Calculate the generalizability coefficient!
![Page 42: Generalizability Theory](https://reader030.vdocuments.net/reader030/viewer/2022012303/568130e3550346895d96f5c4/html5/thumbnails/42.jpg)
D-study (ni=7; nj=2; norm referenced)
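Variance components of the P x I x J design: $\sigma^2_p = 48.71$, $\sigma^2_i = 25.12$, $\sigma^2_j = 15.00$, $\sigma^2_{pi} = 185.87$, $\sigma^2_{pj} = 33.18$, $\sigma^2_{ij} = 80.00$, $\sigma^2_{pij,e} = 72.94$.

$$G = \frac{\sigma^2_p}{\sigma^2_p + \sigma^2_{pi}/n_i + \sigma^2_{pj}/n_j + \sigma^2_{pij,e}/(n_i n_j)} = \frac{48.71}{48.71 + 185.87/7 + 33.18/2 + 72.94/(2 \times 7)} = 0.50$$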
![Page 43: Generalizability Theory](https://reader030.vdocuments.net/reader030/viewer/2022012303/568130e3550346895d96f5c4/html5/thumbnails/43.jpg)
Illustration 5
Essay test:
- 7 vignette-based essay questions
- 2 markers independently marked all questions for all examinees
- Domain-referenced perspective

Calculate the dependability coefficient!
![Page 44: Generalizability Theory](https://reader030.vdocuments.net/reader030/viewer/2022012303/568130e3550346895d96f5c4/html5/thumbnails/44.jpg)
D-study (ni=7; nj=2; domain referenced)
Variance components of the P x I x J design: $\sigma^2_p = 48.71$, $\sigma^2_i = 25.12$, $\sigma^2_j = 15.00$, $\sigma^2_{pi} = 185.87$, $\sigma^2_{pj} = 33.18$, $\sigma^2_{ij} = 80.00$, $\sigma^2_{pij,e} = 72.94$.

$$D = \frac{48.71}{48.71 + 25.12/7 + 15.00/2 + 185.87/7 + 33.18/2 + 80.00/14 + 72.94/14} = 0.43$$
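For the crossed two-facet design, each error component is divided by the number of conditions of the facets it contains (7 essays, 2 markers, or 7 × 2 = 14 for components involving both). The sketch below assumes the P x I x J variance components from the G-study slide; the function name `two_facet_d_study` and the dictionary keys are illustrative.

```python
def two_facet_d_study(vc, n_i, n_j):
    """Return (G, D) for a crossed p x i x j design.

    `vc` maps source names to estimated variance components.
    """
    # Relative error: components that involve the person facet.
    relative_error = (
        vc["pi"] / n_i + vc["pj"] / n_j + vc["pij,e"] / (n_i * n_j)
    )
    # Absolute error adds the item, judge, and item-by-judge components.
    absolute_error = (
        relative_error + vc["i"] / n_i + vc["j"] / n_j + vc["ij"] / (n_i * n_j)
    )
    g = vc["p"] / (vc["p"] + relative_error)
    d = vc["p"] / (vc["p"] + absolute_error)
    return g, d


vc = {
    "p": 48.71, "i": 25.12, "j": 15.00,
    "pi": 185.87, "pj": 33.18, "ij": 80.00, "pij,e": 72.94,
}
g, d = two_facet_d_study(vc, n_i=7, n_j=2)
print(round(g, 2), round(d, 2))  # 0.5 0.43
```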
![Page 45: Generalizability Theory](https://reader030.vdocuments.net/reader030/viewer/2022012303/568130e3550346895d96f5c4/html5/thumbnails/45.jpg)
Illustration 6
Essay test:
- 7 vignette-based essay questions
- 2 different markers independently marked each question for all examinees
- Norm-referenced perspective

Calculate the generalizability coefficient!
![Page 46: Generalizability Theory](https://reader030.vdocuments.net/reader030/viewer/2022012303/568130e3550346895d96f5c4/html5/thumbnails/46.jpg)
D-study (ni=7; nj=2; norm referenced)
(Judges : Items) x Persons

| Source | Estimated variance component | Percentage of total variance |
|---|---|---|
| p | 48.71 | 10.57 |
| i | 25.12 | 5.45 |
| j,ij | 95.00 | 20.62 |
| pi | 185.87 | 40.33 |
| pj,pij,e | 106.12 | 23.03 |
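$$G = \frac{\sigma^2_p}{\sigma^2_p + \sigma^2_{pi}/n_i + \sigma^2_{pj,pij,e}/(n_i n_j)} = \frac{48.71}{48.71 + 185.87/7 + 106.12/(2 \times 7)} \approx 0.59$$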
![Page 47: Generalizability Theory](https://reader030.vdocuments.net/reader030/viewer/2022012303/568130e3550346895d96f5c4/html5/thumbnails/47.jpg)
D-study summary table
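Norm-referenced score interpretation

| Number of essays | Same marker for all essays: one marker | Same marker for all essays: two markers | Different marker for each essay: one marker | Different marker for each essay: two markers |
|---|---|---|---|---|
| 5 | 0.36 | 0.44 | 0.39 | 0.46 |
| 7 | 0.41 | 0.50 | 0.47 | 0.54 |
| 10 | 0.45 | 0.56 | 0.56 | 0.63 |
| 15 | 0.49 | 0.61 | 0.65 | 0.72 |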
![Page 48: Generalizability Theory](https://reader030.vdocuments.net/reader030/viewer/2022012303/568130e3550346895d96f5c4/html5/thumbnails/48.jpg)
Another reliability index
Reliability coefficient (G and D coefficients):
- Scale independent (0-1)
- Non-intuitive interpretation

Standard Error of Measurement (SEM):
- Intuitive interpretation
- Scale dependent
![Page 49: Generalizability Theory](https://reader030.vdocuments.net/reader030/viewer/2022012303/568130e3550346895d96f5c4/html5/thumbnails/49.jpg)
Standard Error of Measurement
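$$X = T + E$$

where $X$ = observed score, $T$ = true score, and $E$ = error score.

$$\text{Reliability index} = \frac{\operatorname{Var}(T)}{\operatorname{Var}(T) + \operatorname{Var}(E)}$$

$$\text{Standard Error of Measurement (SEM)} = \sqrt{\operatorname{Var}(E)}$$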
![Page 50: Generalizability Theory](https://reader030.vdocuments.net/reader030/viewer/2022012303/568130e3550346895d96f5c4/html5/thumbnails/50.jpg)
Interpretation of SEM
Suppose an examinee has a score of 60% and the SEM is 5:
- 68% CI: 60 ± 5, i.e. 55 to 65
- 95% CI: 60 ± 1.96 x 5 ≈ 60 ± 10, i.e. 50 to 70
- 95% CI: 60 ± 2.14 x 5 ≈ 60 ± 11, i.e. 49 to 71
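The intervals are the observed score plus or minus a multiple of the SEM. A minimal sketch, assuming normally distributed measurement error, so that a multiplier of 1 covers roughly 68% and 1.96 covers 95%; the function name `confidence_interval` is illustrative.

```python
def confidence_interval(score, sem, multiplier):
    """Return (lower, upper) bounds of score +/- multiplier * SEM."""
    half_width = multiplier * sem
    return score - half_width, score + half_width


low68, high68 = confidence_interval(60, 5, 1.0)
low95, high95 = confidence_interval(60, 5, 1.96)
print(low68, high68)
print(low95, high95)
```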
![Page 51: Generalizability Theory](https://reader030.vdocuments.net/reader030/viewer/2022012303/568130e3550346895d96f5c4/html5/thumbnails/51.jpg)
D-study (ni = 7; norm referenced)
Variance components of the P x I design: $\sigma^2_p = 97.57$, $\sigma^2_i = 261.24$, $\sigma^2_{pi,e} = 371.97$.

$$G = \frac{97.57}{97.57 + 371.97/7} = 0.65$$

$$\text{SEM} = \sqrt{371.97/7} = 7.29$$
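The SEM is the square root of the error variance that appears in the denominator of the coefficient. A minimal sketch for the norm-referenced one-facet case, assuming the residual component reported on the slide; the function name `sem_relative` is illustrative.

```python
import math


def sem_relative(var_pi_e, n_items):
    """Norm-referenced SEM for a crossed p x i design."""
    return math.sqrt(var_pi_e / n_items)


sem = sem_relative(371.97, 7)
print(round(sem, 2))  # 7.29
```

Unlike G, the SEM is expressed in the units of the score scale, so it can be used directly to build an interval around an observed score.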
![Page 52: Generalizability Theory](https://reader030.vdocuments.net/reader030/viewer/2022012303/568130e3550346895d96f5c4/html5/thumbnails/52.jpg)
D-study (ni=7; nj=2; domain referenced)
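Variance components of the P x I x J design: $\sigma^2_p = 48.71$, $\sigma^2_i = 25.12$, $\sigma^2_j = 15.00$, $\sigma^2_{pi} = 185.87$, $\sigma^2_{pj} = 33.18$, $\sigma^2_{ij} = 80.00$, $\sigma^2_{pij,e} = 72.94$.

$$D = \frac{48.71}{48.71 + 25.12/7 + 15.00/2 + 185.87/7 + 33.18/2 + 80.00/14 + 72.94/14} = 0.43$$

$$\text{SEM} = \sqrt{25.12/7 + 15.00/2 + 185.87/7 + 33.18/2 + 80.00/14 + 72.94/14} = \sqrt{65.16} \approx 8.07$$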
![Page 54: Generalizability Theory](https://reader030.vdocuments.net/reader030/viewer/2022012303/568130e3550346895d96f5c4/html5/thumbnails/54.jpg)
Scenario CEX
A mini clinical evaluation exercise (mini-CEX) was developed in which examinees are periodically observed and rated on a rating form. An investigator analyzed a data set from 88 residents who were each observed on 4 occasions, each time by a single, different examiner (cf. Norcini JJ, Blank LL, Arnold GK, Kimball HR. The mini-CEX (Clinical Evaluation Exercise): A preliminary investigation. Annals of Internal Medicine 1995;123:795-799).

Variance components: p and o:p (o, op, and e confounded)

$$G = \frac{\sigma^2_p}{\sigma^2_p + \sigma^2_{o:p}/4} = D$$
![Page 55: Generalizability Theory](https://reader030.vdocuments.net/reader030/viewer/2022012303/568130e3550346895d96f5c4/html5/thumbnails/55.jpg)
Scenario OSCE I
An OSCE consisting of 15 stations was administered to 100 final-year students. Each station was scored by two independent examiners on a case-specific checklist. Different examiners were used in each station.

Variance components: p, s, j:s, ps, pj:s

$$G = \frac{\sigma^2_p}{\sigma^2_p + \sigma^2_{ps}/15 + \sigma^2_{pj:s}/(2 \times 15)}$$
![Page 56: Generalizability Theory](https://reader030.vdocuments.net/reader030/viewer/2022012303/568130e3550346895d96f5c4/html5/thumbnails/56.jpg)
Scenario OSCE II
An experimental OSCE was administered to 20 residents. Each resident was tested on a different day. For each resident, 3 stations were organized consisting of real patients who were available that day. Two examiners observed all residents in all stations and completed a generic rating scale.

Variance components: p, s:p, j, pj, js:p (confounded with e)

$$D = \frac{\sigma^2_p}{\sigma^2_p + \sigma^2_{s:p}/3 + \sigma^2_j/2 + \sigma^2_{pj}/2 + \sigma^2_{js:p,e}/(2 \times 3)}$$
![Page 57: Generalizability Theory](https://reader030.vdocuments.net/reader030/viewer/2022012303/568130e3550346895d96f5c4/html5/thumbnails/57.jpg)
Scenario Clerkship Evaluation
An investigator wishes to evaluate the teaching quality of 10 clinical clerkships. She developed a questionnaire with 30 items on various quality aspects. The questionnaire was completed by 50 students in each clerkship.

Variance components: c, i, s:c, ci, si:c (confounded with e)

$$G = \frac{\sigma^2_c}{\sigma^2_c + \sigma^2_{s:c}/50 + \sigma^2_{ci}/30 + \sigma^2_{si:c,e}/(50 \times 30)}$$

PS: It is doubtful that i is a random facet; i could instead be treated as fixed or ignored.
![Page 58: Generalizability Theory](https://reader030.vdocuments.net/reader030/viewer/2022012303/568130e3550346895d96f5c4/html5/thumbnails/58.jpg)
Further reading & software
Literature:
- Cronbach LJ, Gleser GC, Nanda H, Rajaratnam N. The dependability of behavioral measurements: Theory of generalizability for scores and profiles. New York: Wiley, 1972. The original monograph on generalizability theory. Complete, but hardly accessible for most readers.
- Brennan RL. Elements of Generalizability Theory. Iowa City, IA: ACT Publications, 1983. The resource book for most specialists. Not easy for readers without statistical training.
- Shavelson RJ, Webb NM. Generalizability theory: A primer. Newbury Park, CA: Sage Publications, 1991. A good and accessible introduction to generalizability theory for any reader.

Software:
- GENOVA: Conducts G and D studies and provides ample statistical information. Operates on any PC. The program is relatively old and not user friendly. Available from Dr. J. Crick, National Board of Medical Examiners, 3750 Market Street, Philadelphia, PA 19104-3190, USA.
- SPSS: The Variance Components subprogram of SPSS General Linear Models estimates variance components (also for unbalanced designs). D-studies need to be done manually.