![Page 1: Reliability Chapter 3. Every observed score is a combination of true score and error Obs. = T + E Reliability = Classical Test Theory](https://reader034.vdocuments.net/reader034/viewer/2022051620/56649eb65503460f94bbee23/html5/thumbnails/1.jpg)
Reliability
Chapter 3
Every observed score is a combination of true score and error
Obs. = T + E
$$\text{Reliability} = \frac{s_T^2}{s_O^2} = 1 - \frac{s_E^2}{s_O^2}$$
Classical Test Theory
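As a rough illustration of the classical test theory definition, the sketch below simulates observed scores as true score plus random error and computes reliability both ways: as true-score variance over observed-score variance, and as one minus error variance over observed-score variance. The function name, the means, and the standard deviations are assumptions chosen for the demonstration, not values from the chapter.

```python
import random
import statistics


def simulate_reliability(n=10_000, true_sd=10.0, error_sd=5.0, seed=0):
    """Simulate Obs = T + E and return both forms of the reliability ratio.

    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    true_scores = [rng.gauss(50.0, true_sd) for _ in range(n)]
    errors = [rng.gauss(0.0, error_sd) for _ in range(n)]  # unsystematic error
    observed = [t + e for t, e in zip(true_scores, errors)]

    var_true = statistics.pvariance(true_scores)
    var_error = statistics.pvariance(errors)
    var_observed = statistics.pvariance(observed)

    # Reliability = s_T^2 / s_O^2 = 1 - s_E^2 / s_O^2
    return var_true / var_observed, 1.0 - var_error / var_observed


ratio_form, complement_form = simulate_reliability()
print(round(ratio_form, 2), round(complement_form, 2))
```

With these assumed values the theoretical reliability is 100 / (100 + 25) = .80. The two sample estimates land close to that figure and differ slightly from each other only because true scores and errors are not perfectly uncorrelated in a finite sample.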
Systematic versus unsystematic error
Reliability only takes unsystematic error into account
Reliability
Reliability & Correlation
Reliability often based on consistency between two sets of scores
Correlation: Statistical technique used to examine consistency
Positive Correlation
Negative Correlation
Correlation coefficient: a numerical indicator of the relationship between two sets of data
Pearson product-moment correlation coefficient is most common
Pearson Product-Moment Correlation Coefficient
$$r = \frac{\sum z_1 z_2}{N}$$
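The z-score formula can be sketched in a few lines of Python: convert each set of scores to z-scores, multiply them pairwise, and average the products over N. The function name and the sample data are hypothetical. The sketch uses the population standard deviation (dividing by N) so that it matches the formula as written.

```python
import statistics


def pearson_r(x, y):
    """Pearson r as the mean product of paired z-scores (population SD)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be equal-length sequences of 2+ scores")
    n = len(x)
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sd_x, sd_y = statistics.pstdev(x), statistics.pstdev(y)
    if sd_x == 0 or sd_y == 0:
        raise ValueError("r is undefined when a set of scores has no variance")
    z_x = [(v - mean_x) / sd_x for v in x]
    z_y = [(v - mean_y) / sd_y for v in y]
    return sum(a * b for a, b in zip(z_x, z_y)) / n


# Hypothetical scores: the second set rises in step with the first.
first = [1, 2, 3, 4, 5]
second = [2, 4, 6, 8, 10]
print(round(pearson_r(first, second), 3))        # perfectly positive
print(round(pearson_r(first, second[::-1]), 3))  # perfectly negative
```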
The percentage of shared variance between two sets of data
Coefficient of Determination
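The coefficient of determination is the square of the correlation coefficient, expressed here as a percentage. A minimal sketch follows; the function name is hypothetical and the value r = .80 is only an example.

```python
def shared_variance_percent(r: float) -> float:
    """Return the coefficient of determination (r squared) as a percentage."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    return 100.0 * r * r


# A correlation of .80 means the two sets of scores share 64% of their variance.
print(round(shared_variance_percent(0.80), 1))  # 64.0
```

Because r is squared, a correlation of -.80 corresponds to the same percentage of shared variance as +.80.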
Test-Retest
Alternate/Parallel Forms
Internal Consistency Measures
Types of Reliability
Correlating performance on first administration with performance on the second
Coefficient of stability
Test-Retest
Two forms of instrument, administered to same individuals
Alternate/Parallel Forms
Split-half reliability (Spearman-Brown formula)
Kuder-Richardson formulas (KR-20, KR-21)
Coefficient Alpha
Internal Consistency Measures
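The slide names these measures without giving formulas. The sketch below uses their standard textbook forms, which is an assumption about what the chapter intends: the Spearman-Brown correction 2r / (1 + r) for a split-half correlation, KR-20 for dichotomously scored items, and coefficient alpha for items on any scale. Function names and the response matrix are hypothetical, and population variances (dividing by N) are used throughout. KR-21 is omitted.

```python
import statistics


def spearman_brown(r_half):
    """Step a split-half correlation up to an estimate for the full-length test."""
    return 2.0 * r_half / (1.0 + r_half)


def kr20(responses):
    """KR-20 for items scored 0/1; responses holds one row per examinee."""
    n, k = len(responses), len(responses[0])
    p = [sum(col) / n for col in zip(*responses)]  # proportion correct per item
    sum_pq = sum(p_i * (1.0 - p_i) for p_i in p)
    total_var = statistics.pvariance([sum(row) for row in responses])
    return k / (k - 1) * (1.0 - sum_pq / total_var)


def coefficient_alpha(responses):
    """Coefficient alpha; works for any item scoring, not only 0/1."""
    k = len(responses[0])
    item_vars = [statistics.pvariance(col) for col in zip(*responses)]
    total_var = statistics.pvariance([sum(row) for row in responses])
    return k / (k - 1) * (1.0 - sum(item_vars) / total_var)


# Hypothetical data: 5 examinees answering 4 right/wrong items.
responses = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
print(round(spearman_brown(0.60), 2))
print(round(kr20(responses), 2), round(coefficient_alpha(responses), 2))
```

For 0/1 items the variance of an item equals p(1 - p), so KR-20 and coefficient alpha give the same value on this data set.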
Typical methods for determining reliability may not be suitable for:
Speed tests
Criterion-referenced tests
Subjectively scored instruments: interrater reliability
Nontypical Situations
Examine purpose for using instrument
Be knowledgeable about reliability coefficients of other instruments in that area
Examine characteristics of particular clients against reliability coefficients
Coefficients may vary based on SES, age, culture/ethnicity, etc.
Evaluating Reliability Coefficients
$$SEM = s\sqrt{1 - r}$$
Standard Error of Measurement
Provides estimate of range of scores if someone were to take instrument repeatedly
Based on premise that when individuals take a test multiple times, scores fall into normal distribution
Sam’s SAT Verbal = 550; r = .91; s = 100
$$SEM = 100\sqrt{1 - .91} = 100\sqrt{.09} = 100(.3) = 30$$
68% of the time, Sam’s true score would fall between 520 and 580
95% of the time, Sam’s true score would fall between 490 and 610
99.7% of the time, Sam’s true score would fall between 460 and 640
SEM: Example
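The arithmetic in this example (observed score 550, r = .91, s = 100) can be checked with a short sketch. The function names are hypothetical. The bands are simply the observed score plus or minus one, two, and three SEMs.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """SEM = s * sqrt(1 - r)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


def score_band(observed, sem, n_sems):
    """Range of scores n_sems standard errors either side of the observed score."""
    return observed - n_sems * sem, observed + n_sems * sem


sem = standard_error_of_measurement(sd=100, reliability=0.91)
print("SEM:", round(sem))
for n_sems in (1, 2, 3):
    low, high = score_band(550, sem, n_sems)
    print(f"+/- {n_sems} SEM: {round(low)} to {round(high)}")
```

Values are rounded for display because floating-point arithmetic leaves the SEM a hair under 30.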
Determining Range of Scores Using SEM
Method to determine if difference between two scores is significant
Takes into account SEM of both scores
Standard Error of Difference
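The slide does not give a formula. One common formulation, assumed here, combines the two SEMs as the square root of the sum of their squares. The function names, the SEM values, and the two-standard-error cutoff are illustrative choices rather than anything stated in the chapter.

```python
import math


def standard_error_of_difference(sem_1, sem_2):
    """Assumed formulation: SE_diff = sqrt(SEM_1^2 + SEM_2^2)."""
    return math.sqrt(sem_1 ** 2 + sem_2 ** 2)


def difference_exceeds(score_1, score_2, sem_1, sem_2, n_se=2.0):
    """True if the score gap is larger than n_se standard errors of difference."""
    se_diff = standard_error_of_difference(sem_1, sem_2)
    return abs(score_1 - score_2) > n_se * se_diff


# Hypothetical subtest scores with SEMs of 3 and 4 points.
print(standard_error_of_difference(3.0, 4.0))      # 5.0
print(difference_exceeds(110, 98, 3.0, 4.0))       # gap of 12 vs. cutoff of 10
print(difference_exceeds(110, 103, 3.0, 4.0))      # gap of 7 vs. cutoff of 10
```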
Generalizability or Domain Sampling Theory
Focus is on estimating the extent to which specific sources of variation under defined conditions are contributing to the score on the instrument
Alternative Theoretical Model