Definitions: Correlation, Reliability, Validity, Measurement Error; Theories of Reliability
![Page 1: Definitions Correlation, Reliability, Validity, Measurement error Theories of Reliability](https://reader036.vdocuments.net/reader036/viewer/2022062316/56814c4f550346895db95e17/html5/thumbnails/1.jpg)
Quality of Measures
• Definitions
  • Correlation, Reliability, Validity, Measurement error
• Theories of Reliability
• Types of Reliability
  • Standard Error of Measurement
• Types of Validity
• Article
• Exercise
![Page 2: Definitions Correlation, Reliability, Validity, Measurement error Theories of Reliability](https://reader036.vdocuments.net/reader036/viewer/2022062316/56814c4f550346895db95e17/html5/thumbnails/2.jpg)
Definitions
• Correlation
  • Reflects the direction (+/-) and strength (0 to 1 in absolute value) of the relation between two variables
• Variance explained
  • Reflects the strength of the relation between two variables
  • Square of the correlation
  • Varies from 0 to 1
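To make the two definitions concrete, here is a minimal sketch that computes a Pearson correlation and the variance explained (its square). The helper name `pearson_r` and the height/weight values are invented for illustration and are not taken from the slides.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical heights (cm) and weights (pounds), invented for illustration.
heights = [160, 165, 170, 175, 180, 185, 190]
weights = [120, 150, 140, 165, 160, 200, 190]

r = pearson_r(heights, weights)
print(f"r = {r:.2f}, variance explained = {r ** 2:.0%}")
```

The sign of `r` carries the direction of the relation, while `r ** 2` drops the sign and keeps only the strength.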
![Page 3: Definitions Correlation, Reliability, Validity, Measurement error Theories of Reliability](https://reader036.vdocuments.net/reader036/viewer/2022062316/56814c4f550346895db95e17/html5/thumbnails/3.jpg)
[Figure: scatterplot of weight (pounds) against height (cm); labeled points: Tom Cruise, Vince Carter, Calista Flockhart, Julia Roberts]
![Page 4: Definitions Correlation, Reliability, Validity, Measurement error Theories of Reliability](https://reader036.vdocuments.net/reader036/viewer/2022062316/56814c4f550346895db95e17/html5/thumbnails/4.jpg)
[Figure: scatterplot of weight (pounds) against height (cm); labeled points: Tom Cruise, Vince Carter, Calista Flockhart, Julia Roberts; r = .76, r² = 58%]
![Page 5: Definitions Correlation, Reliability, Validity, Measurement error Theories of Reliability](https://reader036.vdocuments.net/reader036/viewer/2022062316/56814c4f550346895db95e17/html5/thumbnails/5.jpg)
Effect of Measurement Error on Correlations
![Page 6: Definitions Correlation, Reliability, Validity, Measurement error Theories of Reliability](https://reader036.vdocuments.net/reader036/viewer/2022062316/56814c4f550346895db95e17/html5/thumbnails/6.jpg)
[Figure: scatterplot of height (cm) against height (cm); r = 1.00, r² = 100%]
![Page 7: Definitions Correlation, Reliability, Validity, Measurement error Theories of Reliability](https://reader036.vdocuments.net/reader036/viewer/2022062316/56814c4f550346895db95e17/html5/thumbnails/7.jpg)
[Figure: scatterplot of self-reported height (cm) against objective height (cm); r = .98, r² = 96%]
![Page 8: Definitions Correlation, Reliability, Validity, Measurement error Theories of Reliability](https://reader036.vdocuments.net/reader036/viewer/2022062316/56814c4f550346895db95e17/html5/thumbnails/8.jpg)
[Figure: scatterplot of self-reported weight (pounds) against objective weight (pounds); r = .92, r² = 85%]
![Page 9: Definitions Correlation, Reliability, Validity, Measurement error Theories of Reliability](https://reader036.vdocuments.net/reader036/viewer/2022062316/56814c4f550346895db95e17/html5/thumbnails/9.jpg)
Definitions
• Reliability
  • Consistency & stability of measurement
  • Reliability is necessary but not sufficient for validity
    • E.g., a measuring tape is not a valid way to measure weight, although the tape reliably measures height and height correlates with weight
• Validity
  • Accuracy/meaning of measurement
  • Example: unstructured vs. structured job interviews
![Page 10: Definitions Correlation, Reliability, Validity, Measurement error Theories of Reliability](https://reader036.vdocuments.net/reader036/viewer/2022062316/56814c4f550346895db95e17/html5/thumbnails/10.jpg)
Theories of Reliability
• Classical Test Theory explains random variation in a person's scores on a measure
  • Effects of learning, mood, changes in understanding, etc.
• Test score = true score + error
  • Errors have zero mean
  • Errors are uncorrelated with each other
  • Errors are uncorrelated with true score
  • Constant error is part of true score
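The classical test theory assumptions can be checked in a small simulation. The sketch below is only an illustration: the sample size, the true-score and error standard deviations, and all variable names are assumptions, not values from the slides. It also uses the standard classical-test-theory result that reliability equals true-score variance divided by observed-score variance, which the slides do not state explicitly.

```python
import random
from math import sqrt
from statistics import mean, pvariance

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

random.seed(1)  # fixed seed so the sketch is repeatable

N = 20000
TRUE_SD, ERROR_SD = 10.0, 5.0  # assumed values, chosen only for illustration

true_scores = [random.gauss(100, TRUE_SD) for _ in range(N)]
# Two administrations share the true score but draw independent errors.
errors_1 = [random.gauss(0, ERROR_SD) for _ in range(N)]
errors_2 = [random.gauss(0, ERROR_SD) for _ in range(N)]
observed_1 = [t + e for t, e in zip(true_scores, errors_1)]
observed_2 = [t + e for t, e in zip(true_scores, errors_2)]

# Theoretical reliability: var(true) / (var(true) + var(error)) = 100 / 125.
expected = TRUE_SD ** 2 / (TRUE_SD ** 2 + ERROR_SD ** 2)
variance_ratio = pvariance(true_scores) / pvariance(observed_1)
retest_r = pearson_r(observed_1, observed_2)

print(f"mean error                = {mean(errors_1):.3f}")
print(f"r(error, true score)      = {pearson_r(errors_1, true_scores):.3f}")
print(f"r(error 1, error 2)       = {pearson_r(errors_1, errors_2):.3f}")
print(f"var(true)/var(observed)   = {variance_ratio:.3f} (theory {expected:.2f})")
print(f"r(observed 1, observed 2) = {retest_r:.3f}")
```

With random error only, the correlation between two administrations approximates the variance ratio, which is why a correlation between repeated measurements can serve as a reliability estimate.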
![Page 11: Definitions Correlation, Reliability, Validity, Measurement error Theories of Reliability](https://reader036.vdocuments.net/reader036/viewer/2022062316/56814c4f550346895db95e17/html5/thumbnails/11.jpg)
Types of Reliability
• Test-retest
  • Consistency across time
• Parallel forms
  • Consistency across versions
• Internal
  • Consistency across items
• Scorer (inter-rater)
  • Consistency across raters/judges
![Page 12: Definitions Correlation, Reliability, Validity, Measurement error Theories of Reliability](https://reader036.vdocuments.net/reader036/viewer/2022062316/56814c4f550346895db95e17/html5/thumbnails/12.jpg)
Example: The Satisfaction with Life Scale (SWLS)
1. In most ways my life is close to ideal.
2. The conditions of my life are excellent.
3. I am satisfied with my life.
4. So far I have gotten the important things I want in my life.
5. If I could live my life over, I would change almost nothing.
Response scale: 1 (Strongly Disagree) to 7 (Strongly Agree)
![Page 13: Definitions Correlation, Reliability, Validity, Measurement error Theories of Reliability](https://reader036.vdocuments.net/reader036/viewer/2022062316/56814c4f550346895db95e17/html5/thumbnails/13.jpg)
Types of Reliability
• Test-retest reliability
  • Correlation of scores on the same measure taken at two different times
  • Assumes no memory/learning effects across the time interval
• Parallel-forms
  • Correlation of scores on similar versions of the measure
  • Forms are equivalent on mean, standard deviation, and inter-correlations
  • Can have a time interval between administrations of the two forms
![Page 14: Definitions Correlation, Reliability, Validity, Measurement error Theories of Reliability](https://reader036.vdocuments.net/reader036/viewer/2022062316/56814c4f550346895db95e17/html5/thumbnails/14.jpg)
Types of Reliability
I = item, P = participant

Test-retest Reliability
      Time 1                 Time 2
      I1  I2  I3  AvgT1      I1  I2  I3  AvgT2
P1
P2
P3
Correlate AvgT1 with AvgT2 to get reliability

Parallel Forms
      Version 1              Version 2
      I1  I2  I3  AvgV1      I1  I2  I3  AvgV2
P1
P2
P3
Correlate AvgV1 with AvgV2
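The layout above translates directly into a computation: average each participant's items at each occasion, then correlate the two columns of averages. The sketch below uses invented 1-7 ratings for five hypothetical participants; the same code applies to parallel forms if the two arrays hold Version 1 and Version 2 responses.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical ratings: one row per participant, one column per item (I1-I3).
time1 = [[6, 5, 6], [3, 4, 3], [5, 5, 4], [2, 3, 2], [7, 6, 6]]
time2 = [[5, 5, 6], [4, 4, 3], [5, 4, 4], [3, 2, 3], [6, 6, 7]]

avg_t1 = [mean(row) for row in time1]  # AvgT1 per participant
avg_t2 = [mean(row) for row in time2]  # AvgT2 per participant

retest_r = pearson_r(avg_t1, avg_t2)
print(f"test-retest reliability = {retest_r:.2f}")
```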
![Page 15: Definitions Correlation, Reliability, Validity, Measurement error Theories of Reliability](https://reader036.vdocuments.net/reader036/viewer/2022062316/56814c4f550346895db95e17/html5/thumbnails/15.jpg)
[Figure: scatterplot of SWLS at Time 2 (end of semester) against SWLS at Time 1 (beginning of semester), both on the 1-7 scale; r = .73; r² = 50%]
![Page 16: Definitions Correlation, Reliability, Validity, Measurement error Theories of Reliability](https://reader036.vdocuments.net/reader036/viewer/2022062316/56814c4f550346895db95e17/html5/thumbnails/16.jpg)
Test-retest reliability of SWLS
• Good test-retest reliability
  • Participants have similar scores at Time 1 (beginning of semester) and at Time 2 (end of semester).
  • Retest reliability is useful for constructs assumed to be stable
  • Current mood (e.g., how you feel right now) shows low retest correlations, but that does not mean that the mood measure is unreliable
![Page 17: Definitions Correlation, Reliability, Validity, Measurement error Theories of Reliability](https://reader036.vdocuments.net/reader036/viewer/2022062316/56814c4f550346895db95e17/html5/thumbnails/17.jpg)
Types of Reliability
• Internal consistency
  • Correlation of scores on two halves of the measure
  • Length of measure increases reliability
• Inter-rater
  • Correlation of raters' scores
    • E.g., scores on a structured job interview
  • Can also include a time interval
    • E.g., ratings of the worth of jobs across time & across judges
![Page 18: Definitions Correlation, Reliability, Validity, Measurement error Theories of Reliability](https://reader036.vdocuments.net/reader036/viewer/2022062316/56814c4f550346895db95e17/html5/thumbnails/18.jpg)
Types of Reliability

Internal Reliability
      Half 1                 Half 2
      I1  I2  I3  AvgH1      I4  I5  I6  AvgH2
P1
P2
P3
Correlate AvgH1 with AvgH2

Inter-rater Reliability
      Rater 1                Rater 2
      I1  I2  I3  AvgR1      I1  I2  I3  AvgR2
P1
P2
P3
Correlate AvgR1 with AvgR2
![Page 19: Definitions Correlation, Reliability, Validity, Measurement error Theories of Reliability](https://reader036.vdocuments.net/reader036/viewer/2022062316/56814c4f550346895db95e17/html5/thumbnails/19.jpg)
[Figure: scatterplot of SWLS items 3, 4, & 5 against SWLS items 1 & 2, both on the 1-7 scale; r = .70; r² = 49%]
![Page 20: Definitions Correlation, Reliability, Validity, Measurement error Theories of Reliability](https://reader036.vdocuments.net/reader036/viewer/2022062316/56814c4f550346895db95e17/html5/thumbnails/20.jpg)
Internal consistency of SWLS
• Satisfactory internal consistency
  • Participants respond similarly to items that are supposed to measure the same variable.
  • Should be .70 or higher
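  • The two halves share about half of their variance (r² = 49%). Under classical test theory the split-half correlation itself (.70) estimates the share of true-score variance, so measurement error accounts for roughly 30% of the variance in each half's scores.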
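A split-half correlation describes a measure only half as long as the full scale. The slides note that length increases reliability; one standard way to quantify that is the Spearman-Brown formula, which the slides do not name, so treat its use here as an assumption. The sketch applies it to the split-half r of .70 reported for the SWLS. Because the SWLS halves are unequal (2 vs. 3 items), the result is only approximate.

```python
def spearman_brown(r, length_factor=2.0):
    """Predicted reliability after changing test length by length_factor."""
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return length_factor * r / (1 + (length_factor - 1) * r)

half_r = 0.70                                # split-half correlation from the SWLS slide
full_scale = spearman_brown(half_r)          # both halves combined
doubled = spearman_brown(full_scale, 2.0)    # hypothetical scale twice as long

print(f"half: {half_r:.2f}, full scale: {full_scale:.2f}, doubled: {doubled:.2f}")
```

Each doubling raises the predicted reliability, but with diminishing returns as the value approaches 1.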
![Page 21: Definitions Correlation, Reliability, Validity, Measurement error Theories of Reliability](https://reader036.vdocuments.net/reader036/viewer/2022062316/56814c4f550346895db95e17/html5/thumbnails/21.jpg)
Types of Reliability
• Test-retest
• Parallel forms
• Internal
• Scorer (inter-rater)
![Page 22: Definitions Correlation, Reliability, Validity, Measurement error Theories of Reliability](https://reader036.vdocuments.net/reader036/viewer/2022062316/56814c4f550346895db95e17/html5/thumbnails/22.jpg)
Standard Error of Measurement
• SD of scores when a measure is completed several times by the same individual
• Mostly used in selection contexts
  • Decide which of two individuals is hired
  • Decide whether a test score is significantly higher/lower than a cutoff score
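In practice the standard error of measurement is usually estimated from the test's standard deviation and reliability as SEM = SD × √(1 − reliability), a standard formula that the slide does not spell out. The sketch below applies it to both selection decisions; the test SD, reliability, score, and cutoff are hypothetical, as are the function names.

```python
from math import sqrt

def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * sqrt(1.0 - reliability)

def score_band(score, sd, reliability, z=1.96):
    """Approximate 95% band around an observed score."""
    sem = standard_error_of_measurement(sd, reliability)
    return score - z * sem, score + z * sem

def se_of_difference(sd, reliability):
    """Standard error of the difference between two individuals' scores,
    assuming both scores have the same SEM."""
    return standard_error_of_measurement(sd, reliability) * sqrt(2.0)

# Hypothetical selection test: SD = 15, reliability = .91, cutoff = 100.
sem = standard_error_of_measurement(15.0, 0.91)
low, high = score_band(104.0, 15.0, 0.91)
cutoff = 100.0
clearly_above_cutoff = low > cutoff

print(f"SEM = {sem:.2f}, band = ({low:.1f}, {high:.1f})")
print(f"score of 104 clearly above cutoff: {clearly_above_cutoff}")
print(f"SE of a difference between two applicants = {se_of_difference(15.0, 0.91):.2f}")
```

A score only 4 points over the cutoff falls inside its own error band here, so the test alone would not justify treating it as reliably above the cutoff.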
![Page 23: Definitions Correlation, Reliability, Validity, Measurement error Theories of Reliability](https://reader036.vdocuments.net/reader036/viewer/2022062316/56814c4f550346895db95e17/html5/thumbnails/23.jpg)
Correction for Attenuation
• Estimated true correlation between two variables after removing the unreliability of each measure
  • Divide the observed correlation by the product of the square roots of the individual reliabilities
  • Note: Selection research corrects only for unreliability in the criterion because we are more interested in the value of the predictor given a perfectly reliable criterion
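The rule described on this slide can be written as a one-line function. In the sketch below the observed validity of .30 and the reliabilities of .80 (predictor) and .60 (criterion) are hypothetical; leaving a reliability at its default of 1.0 skips the correction for that measure, which gives the criterion-only correction used in selection research.

```python
from math import sqrt

def correct_for_attenuation(r_xy, r_xx=1.0, r_yy=1.0):
    """Observed correlation divided by the product of the square roots
    of the two reliabilities (r_xx: predictor, r_yy: criterion)."""
    if not (0.0 < r_xx <= 1.0 and 0.0 < r_yy <= 1.0):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_xy / (sqrt(r_xx) * sqrt(r_yy))

observed = 0.30  # hypothetical observed predictor-criterion correlation

both = correct_for_attenuation(observed, r_xx=0.80, r_yy=0.60)
criterion_only = correct_for_attenuation(observed, r_yy=0.60)

print(f"corrected for both measures:  {both:.3f}")
print(f"corrected for criterion only: {criterion_only:.3f}")
```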
![Page 24: Definitions Correlation, Reliability, Validity, Measurement error Theories of Reliability](https://reader036.vdocuments.net/reader036/viewer/2022062316/56814c4f550346895db95e17/html5/thumbnails/24.jpg)
Quality of Measures
• Definitions
  • Correlation, Reliability, Validity, Measurement error
• Theories of Reliability
• Types of Reliability
• Standard Error of Measurement
• Types of Validity
![Page 25: Definitions Correlation, Reliability, Validity, Measurement error Theories of Reliability](https://reader036.vdocuments.net/reader036/viewer/2022062316/56814c4f550346895db95e17/html5/thumbnails/25.jpg)
Validity
Evidence that a measure assesses the construct
Reasons for Invalid Measures
• Different understanding of items
• Different use of the scale (Response Styles)
• Intentionally presenting false information (socially desirable responding, other-deception)
• Unintentionally presenting false information (self-deception)
![Page 26: Definitions Correlation, Reliability, Validity, Measurement error Theories of Reliability](https://reader036.vdocuments.net/reader036/viewer/2022062316/56814c4f550346895db95e17/html5/thumbnails/26.jpg)
Types of Validity
• Content Validity
• Criterion Validity
  • Predictive Validity
  • Concurrent Validity
• Construct Validity
  • Convergent Validity
  • Discriminant Validity
Adapted from Sekaran, 2004
![Page 27: Definitions Correlation, Reliability, Validity, Measurement error Theories of Reliability](https://reader036.vdocuments.net/reader036/viewer/2022062316/56814c4f550346895db95e17/html5/thumbnails/27.jpg)
Types of Validity
• Content Validity
  • Extent to which items on the measure are a good representation of the construct
    • E.g., is your job interview based on what is required for the job?
  • Content validity ratio based on judges' assessments of a measure's content
    • E.g., expert (supervisors, incumbents) ratings of the job relevance of interview questions
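The slide does not say which content validity ratio is meant. A common choice is Lawshe's CVR, in which each judge rates an item as essential or not and CVR = (n_e − N/2) / (N/2), where n_e is the number of judges rating the item essential and N is the number of judges. The sketch below assumes that formula; the interview questions and judge counts are invented.

```python
def content_validity_ratio(n_essential, n_judges):
    """Lawshe's CVR: ranges from -1 (no judge says essential) to +1 (all do)."""
    if n_judges <= 0 or not 0 <= n_essential <= n_judges:
        raise ValueError("need 0 <= n_essential <= n_judges and n_judges > 0")
    half = n_judges / 2
    return (n_essential - half) / half

# Hypothetical ratings by 10 expert judges (supervisors, incumbents):
# number of judges who rated each interview question as essential.
essential_counts = {
    "handles customer complaints": 10,
    "explains scheduling conflict": 8,
    "favorite hobby": 2,
}

for question, count in essential_counts.items():
    print(f"{question}: CVR = {content_validity_ratio(count, 10):+.2f}")
```

Items with a CVR near or below zero are candidates for removal, since no more than about half of the judges consider them essential.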
![Page 28: Definitions Correlation, Reliability, Validity, Measurement error Theories of Reliability](https://reader036.vdocuments.net/reader036/viewer/2022062316/56814c4f550346895db95e17/html5/thumbnails/28.jpg)
Types of Validity
• Criterion-related Validity
  • Extent to which a new measure relates to another known measure
    • Validity coefficient = size of the relation (i.e., the correlation) between the new measure (predictor) and the known measure (criterion)
    • E.g., do scores on your job interview predict performance evaluation scores?
![Page 29: Definitions Correlation, Reliability, Validity, Measurement error Theories of Reliability](https://reader036.vdocuments.net/reader036/viewer/2022062316/56814c4f550346895db95e17/html5/thumbnails/29.jpg)
Types of Criterion Validity
• Concurrent
  • Scores on predictor and criterion are collected simultaneously (e.g., police officer study)
  • Distinguishes between participants in the sample who are already known to be different from each other
  • Weaknesses
    • Range restriction
      • Does not include those who were not hired, were fired, or were promoted
    • Differences in test-taking motivation (employees vs. applicants)
    • Experience with the job can affect scores on the criterion
![Page 30: Definitions Correlation, Reliability, Validity, Measurement error Theories of Reliability](https://reader036.vdocuments.net/reader036/viewer/2022062316/56814c4f550346895db95e17/html5/thumbnails/30.jpg)
Types of Criterion Validity
• Predictive
  • Scores on predictor (e.g., selection test) collected some time before scores on criterion (e.g., job performance)
  • Able to differentiate individuals on a criterion assessed in the future
  • Weaknesses
    • Due to management pressures, applicants can be chosen based on scores on the predictor (can produce range restriction, but this can be corrected)
    • Often, special measures of job performance are developed for the validation study
![Page 31: Definitions Correlation, Reliability, Validity, Measurement error Theories of Reliability](https://reader036.vdocuments.net/reader036/viewer/2022062316/56814c4f550346895db95e17/html5/thumbnails/31.jpg)
Correction for range restriction
• When the full range of scores on the predictor variable is available
  • Use the unrestricted and restricted standard deviations of the predictor variable & the observed correlation between predictor & criterion
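The slide lists the ingredients but not the formula. A common choice for direct restriction on the predictor is Thorndike's Case II correction, r_c = rU / √(1 − r² + r²U²), with U = SD_unrestricted / SD_restricted. The sketch below assumes that formula applies; the correlation and standard deviations are hypothetical.

```python
from math import sqrt

def correct_range_restriction(r, sd_unrestricted, sd_restricted):
    """Thorndike Case II correction for direct range restriction on the predictor."""
    if sd_unrestricted <= 0 or sd_restricted <= 0:
        raise ValueError("standard deviations must be positive")
    u = sd_unrestricted / sd_restricted
    return r * u / sqrt(1 - r ** 2 + (r ** 2) * (u ** 2))

# Hypothetical: applicants have SD = 10 on the test, hired employees SD = 5,
# and the observed validity among those hired is .30.
corrected = correct_range_restriction(0.30, sd_unrestricted=10.0, sd_restricted=5.0)
print(f"observed r = .30, corrected r = {corrected:.2f}")
```

When the two standard deviations are equal (U = 1) the function returns the observed correlation unchanged.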
![Page 32: Definitions Correlation, Reliability, Validity, Measurement error Theories of Reliability](https://reader036.vdocuments.net/reader036/viewer/2022062316/56814c4f550346895db95e17/html5/thumbnails/32.jpg)
Types of Validity (cont'd)
• Construct Validity
  • Extent to which hypotheses about the construct are supported by data
    1. Define the construct, generate hypotheses about the construct's relation to other constructs
    2. Develop a comprehensive measure of the construct & assess its reliability
    3. Examine the relationship of the measure of the construct to other, similar and dissimilar constructs
  • Examples: height & weight; Learning Style Orientation measure; networking; career outcomes
![Page 33: Definitions Correlation, Reliability, Validity, Measurement error Theories of Reliability](https://reader036.vdocuments.net/reader036/viewer/2022062316/56814c4f550346895db95e17/html5/thumbnails/33.jpg)
Establishing Construct Validity
• Multi-trait multi-method matrix
  • Convergent validity coefficient
    • Absolute size of the correlation between different measures of the same construct
    • Should be large and significantly different from zero
  • Discriminant validity coefficient
    • Relative size of the correlations between the same construct measured by different methods, compared to
      • Different constructs measured by different methods
      • Different constructs measured by the same method (method bias)
![Page 34: Definitions Correlation, Reliability, Validity, Measurement error Theories of Reliability](https://reader036.vdocuments.net/reader036/viewer/2022062316/56814c4f550346895db95e17/html5/thumbnails/34.jpg)
Correlations between Objective (O) & Self-Reports (SR) of Height (H) & Weight (W)

        O-H    SR-H   O-W    SR-W
O-H     1.00
SR-H     .98   1.00
O-W      .55    .56   1.00
SR-W     .68    .69    .92   1.00
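The height/weight matrix can be read as a small multi-trait multi-method matrix: two traits (height, weight) crossed with two methods (objective, self-report). The sketch below sorts the six off-diagonal correlations from the matrix into the three comparison groups and checks that every convergent coefficient exceeds every discriminant one. The grouping labels and variable names are my own.

```python
O, SR = "objective", "self-report"
H, W = "height", "weight"

# Off-diagonal correlations from the matrix; keys are (method, trait) pairs.
correlations = {
    ((O, H), (SR, H)): 0.98,
    ((O, H), (O, W)): 0.55,
    ((O, H), (SR, W)): 0.68,
    ((SR, H), (O, W)): 0.56,
    ((SR, H), (SR, W)): 0.69,
    ((O, W), (SR, W)): 0.92,
}

CONVERGENT = "same trait, different method"
SAME_METHOD = "different trait, same method"
DIFF_BOTH = "different trait, different method"

def classify(a, b):
    """Assign a pair of (method, trait) measures to its comparison group."""
    same_method = a[0] == b[0]
    same_trait = a[1] == b[1]
    if same_trait and not same_method:
        return CONVERGENT
    if same_method and not same_trait:
        return SAME_METHOD
    return DIFF_BOTH

groups = {}
for (a, b), r in correlations.items():
    groups.setdefault(classify(a, b), []).append(r)

discriminant = groups[SAME_METHOD] + groups[DIFF_BOTH]
discriminant_ok = min(groups[CONVERGENT]) > max(discriminant)

for label, values in groups.items():
    print(f"{label}: {sorted(values)}")
print(f"convergent > all discriminant correlations: {discriminant_ok}")
```

Here the weakest convergent coefficient (.92) is well above the strongest discriminant one (.69), which is the pattern the matrix is meant to demonstrate.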
![Page 35: Definitions Correlation, Reliability, Validity, Measurement error Theories of Reliability](https://reader036.vdocuments.net/reader036/viewer/2022062316/56814c4f550346895db95e17/html5/thumbnails/35.jpg)
Establishing Construct Validity
• Multi-trait multi-method matrix
  • Different measures of the same construct should be more highly correlated than different measures of different constructs
    • E.g., perceived career success & promotion vs. networking vs. promotion/salary
  • Different measures of different constructs should have the lowest correlations
    • E.g., networking vs. promotion/salary
![Page 36: Definitions Correlation, Reliability, Validity, Measurement error Theories of Reliability](https://reader036.vdocuments.net/reader036/viewer/2022062316/56814c4f550346895db95e17/html5/thumbnails/36.jpg)
Learning Style Orientation Measure
• Item Development Study (generate critical incidents)
  • N = 67
  • Yes/no responses to statements
  • Recall of learning events
    • Two types of learning: theoretical, practical
    • Two types of outcomes: success, failure
    • 2 x 2 events per participant
    • 112 items constructed in total
![Page 37: Definitions Correlation, Reliability, Validity, Measurement error Theories of Reliability](https://reader036.vdocuments.net/reader036/viewer/2022062316/56814c4f550346895db95e17/html5/thumbnails/37.jpg)
Learning Style Orientation Measure
• Item Development Study (questionnaire)
  • N = 154
  • 112 items, 5-point Likert scale (agree/disagree)
    • 5-factor solution w/factor analyses
    • 54 items
    • Content validity sorting by 8 grad students
  • Goldberg personality scale
![Page 38: Definitions Correlation, Reliability, Validity, Measurement error Theories of Reliability](https://reader036.vdocuments.net/reader036/viewer/2022062316/56814c4f550346895db95e17/html5/thumbnails/38.jpg)
Learning Style Orientation Measure
• Item Development Study
  • Correlations between LSO & personality
    • Only 1 significant correlation between the 5 factors of the LSOM!
    • High reliabilities of the LSOM subscales (.81-.91)
    • Construct (not really convergent) validity
      • r between LSOM & personality subscales: -.26 to .42
![Page 39: Definitions Correlation, Reliability, Validity, Measurement error Theories of Reliability](https://reader036.vdocuments.net/reader036/viewer/2022062316/56814c4f550346895db95e17/html5/thumbnails/39.jpg)
Learning Style Orientation Measure
• Validation Study
  • N=350 -193
  • LSOM, personality, old LSI, preferences for instructional & assessment methods
• Construct validity
  • r between LSOM subscales & old LSI = .01 to .31
  • r between LSOM & personality subscales = .01 to .55
  • Confirmatory factor analysis
    • 5 dimensions confirmed
    • High reliability
![Page 40: Definitions Correlation, Reliability, Validity, Measurement error Theories of Reliability](https://reader036.vdocuments.net/reader036/viewer/2022062316/56814c4f550346895db95e17/html5/thumbnails/40.jpg)
Learning Style Orientation Measure
• Validation Study
  • Incremental validity
    • Additional variance explained (LSOM vs LSI)

DV                          LSOM   LSI
Subjective assessment       .15    .01
Interactional instruction   .21    .04
Informational instruction   .06    .00
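Incremental validity is the gain in variance explained when one predictor is added to a model that already contains another. The study behind the table presumably used multi-scale regressions; the sketch below is a simplified stand-in with one score per instrument, using the standard two-predictor R² formula. All three correlations are hypothetical and do not reproduce the table.

```python
def r_squared_two_predictors(r_y1, r_y2, r_12):
    """R² for a criterion regressed on two predictors, from their correlations."""
    if abs(r_12) >= 1.0:
        raise ValueError("predictors must not be perfectly correlated")
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)

def incremental_r_squared(r_y_new, r_y_old, r_new_old):
    """Variance explained by the new predictor over and above the old one."""
    return r_squared_two_predictors(r_y_new, r_y_old, r_new_old) - r_y_old ** 2

# Hypothetical correlations with one outcome: new measure .40, old measure .10,
# and .20 between the two measures.
new_over_old = incremental_r_squared(0.40, 0.10, 0.20)
old_over_new = incremental_r_squared(0.10, 0.40, 0.20)

print(f"new measure adds {new_over_old:.3f} beyond the old one")
print(f"old measure adds {old_over_new:.3f} beyond the new one")
```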
![Page 41: Definitions Correlation, Reliability, Validity, Measurement error Theories of Reliability](https://reader036.vdocuments.net/reader036/viewer/2022062316/56814c4f550346895db95e17/html5/thumbnails/41.jpg)
In-class Exercise
• Brainstorm constructs to develop measures
  • E.g., dimensions of CIR professor effectiveness, CIR student effectiveness
• Choose two constructs that can be measured similarly and be defined clearly
• Example measures
  • Self-report (rating scales)
  • Peer/informant reports
  • Observation
  • Archival measures
  • Trace measures, etc.
![Page 42: Definitions Correlation, Reliability, Validity, Measurement error Theories of Reliability](https://reader036.vdocuments.net/reader036/viewer/2022062316/56814c4f550346895db95e17/html5/thumbnails/42.jpg)
In-class Exercise
• Form two-person groups to
  • Generate items of the 2 different measures for each of the two constructs
• Appointed person collects all items for both measures for both constructs
  • Compiles & distributes measures to class
• Class gathers data on both measures & both constructs
• Class enters data into SPSS format
  • Compute reliabilities, means, correlations
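The exercise itself uses SPSS. For readers without it, the reliability step can be sketched in Python. The slides do not say which reliability coefficient to compute; the sketch assumes Cronbach's alpha, the usual internal-consistency statistic, computed as k/(k − 1) × (1 − Σ item variances / variance of total scores). The response data are invented.

```python
from statistics import mean, variance

def cronbach_alpha(rows):
    """rows: one list of item scores per participant, all of equal length."""
    k = len(rows[0])
    if k < 2 or any(len(row) != k for row in rows):
        raise ValueError("need at least 2 items and equal-length rows")
    # Sample variance of each item across participants.
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of participants' total scores.
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Hypothetical 1-7 responses from six participants to a five-item measure.
responses = [
    [6, 5, 6, 5, 4],
    [3, 4, 3, 4, 2],
    [5, 5, 4, 5, 5],
    [2, 3, 2, 3, 1],
    [7, 6, 6, 6, 5],
    [4, 4, 5, 3, 3],
]

alpha = cronbach_alpha(responses)
scale_means = [mean(row) for row in responses]  # per-participant scale scores
print(f"alpha = {alpha:.2f}, mean scale score = {mean(scale_means):.2f}")
```

The per-participant scale scores are what would then be correlated across measures and constructs to fill in the matrix.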
![Page 43: Definitions Correlation, Reliability, Validity, Measurement error Theories of Reliability](https://reader036.vdocuments.net/reader036/viewer/2022062316/56814c4f550346895db95e17/html5/thumbnails/43.jpg)
Fill in the correlations
[Blank multi-trait multi-method matrix: constructs C1 and C2, each measured by methods M1 and M2]
![Page 44: Definitions Correlation, Reliability, Validity, Measurement error Theories of Reliability](https://reader036.vdocuments.net/reader036/viewer/2022062316/56814c4f550346895db95e17/html5/thumbnails/44.jpg)
Types of Validity
• Content Validity
• Criterion Validity
  • Predictive Validity
  • Concurrent Validity
• Construct Validity
  • Convergent Validity
  • Discriminant Validity
Adapted from Sekaran, 2004