Class 4

Psychometric Characteristics Part I: Sources of Error, Variability, Reliability, Interpretability

October 12, 2006

Anita L. Stewart
Institute for Health & Aging
University of California, San Francisco


Overview of Class 4

Concepts of error
Basic psychometric characteristics
– Variability
– Reliability
– Interpretability

Components of an Individual’s Observed Item Score

(NOTE: Simplistic view)

Observed item score = true score + error

Components of Variability in Item Scores of a Group of Individuals

Observed score variance = true score variance + error variance

Observed score variance is the total variance (the variation across all observed item scores)

Combining Items into Multi-Item Scales

When items are combined into a scale score, error cancels out to some extent
– Error variance is reduced as more items are combined
– As you reduce random error, the proportion of “true score” variance increases

Sources of Error

Subjects
Observers or interviewers
Measure or instrument

Measuring Weight in Pounds of Children: Weight without shoes

An observed score is a linear combination of many sources of variation for an individual

Measuring Weight in Pounds of Children: Weight without shoes

Observed weight = True weight
  + Amount of water past 30 min
  + Weight of clothes
  + Scale is miscalibrated
  + Person weighing children is not very precise

Measuring Weight in Pounds of Children: Weight without shoes

Observed weight (83 lbs) = True weight (80 lbs)
  + Amount of water past 30 min (+.25 lb)
  + Weight of clothes (+.75 lb)
  + Scale is miscalibrated (+1 lb)
  + Person weighing children is not very precise (+1 lb)

83 = 80 + .25 + .75 + 1 + 1
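The decomposition of the observed weight can be checked with a few lines of arithmetic. This is only a sketch; the component values are the ones on the slide, while the variable names are illustrative.

```python
# Components of one child's observed weight, in pounds (values from the slide).
true_weight = 80.0
error_components = {
    "water drunk in past 30 min": 0.25,
    "weight of clothes": 0.75,
    "miscalibrated scale": 1.0,
    "imprecise person weighing": 1.0,
}

# The observed score is the true score plus the sum of all error sources.
total_error = sum(error_components.values())
observed_weight = true_weight + total_error

print(f"total error = {total_error} lb")
print(f"observed weight = {observed_weight} lb")
```

Here the four error sources add up to 3 lb, which is the gap between the true weight of 80 lb and the observed 83 lb.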

Sources of Error

Weight of clothes
– Subject source of error
Person weighing child is not precise
– Observer source of error
Scale is miscalibrated
– Instrument source of error

Measuring Depressive Symptoms in Asian and Latino Men

Observed depression score = “True” depression
  + Hard to choose one number on the 1-6 response choice scale
  + Unwillingness to tell interviewer
  + Measure not culturally sensitive

Measuring Depressive Symptoms in Asian and Latino Men

Observed depression score (13) = “True” depression (16)
  + Hard to choose one number on the 1-6 response choice scale (+2)
  + Unwillingness to tell interviewer (-3)
  + Measure not culturally sensitive (-2)

13 = 16 + 2 - 3 - 2

Return to Components of an Individual’s Observed Item Score

Observed item score = true score + error

Components of an Individual’s Observed Item Score

Observed item score = true score + error (random, systematic)

Sources of Error in Measuring Weight

Weight of clothes
– Subject source of random error
Scale is miscalibrated
– Instrument source of systematic error
Person weighing child is not precise
– Observer source of random error
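One way to see the difference between the two kinds of error is to weigh the same child several times. The sketch below assumes a constant +1 lb systematic error (the miscalibrated scale from the slides) and a made-up set of random errors chosen to average to zero. The repeated-weighing setup and the random error values are hypothetical, not from the lecture.

```python
from statistics import mean

true_weight = 80.0
systematic_error = 1.0  # scale reads 1 lb heavy on every weighing (from the slides)
# Hypothetical random errors for six repeated weighings; chosen to average to zero.
random_errors = [0.5, -0.5, 1.0, -1.0, 0.25, -0.25]

observed = [true_weight + systematic_error + e for e in random_errors]

# Averaging cancels the random errors but leaves the systematic error in place.
mean_observed = mean(observed)
bias = mean_observed - true_weight

print(mean_observed, bias)
```

In this illustration the average of the six readings is 81 lb: the random part has cancelled, and the remaining 1 lb is the systematic error.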

Sources of Error in Measuring Depression

Hard to choose one number on 1-6 response scale
– Subject source of random error
Unwillingness to tell interviewer
– Subject source of systematic error (underreporting true depression)
Instrument is not culturally sensitive (missing some components)
– Instrument source of systematic error

Memory Errors – From Cognitive Psychology

Error remembering “when” and “how often” something occurred within some time frame
Memory and emotion – tend to remember
– positive more than negative experiences
– more emotionally intense than neutral experiences
Memory for threatening, sensitive events is more error prone than for non-threatening events

AA Stone et al. (eds), The Science of Self-Report, London: Lawrence Erlbaum, 2000.

Overview

Concepts of error
Basic psychometric characteristics
– Variability
– Reliability
– Interpretability

Variability

Good variability
– All (or nearly all) scale levels are represented
– Distribution approximates bell-shaped normal
Variability is a function of the sample
– Need to understand variability of measure of interest in sample similar to one you are studying
Review criteria
– Adequate variability in a range that is relevant to your study

Common Indicators of Variability

Range of scores (possible, observed)
Mean, median, mode
Standard deviation (standard error)
Skewness
% at floor (lowest score)
% at ceiling (highest score)
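These indicators can all be computed from a list of scores plus the scale's possible range. The sketch below uses only the Python standard library. The function name, the moment-based skewness formula, and the example responses are assumptions made for illustration; statistical packages may use a slightly different skewness estimator.

```python
from statistics import mean, median, mode, pstdev

def variability_summary(scores, possible_min, possible_max):
    """Summarize the common indicators of variability for one measure."""
    n = len(scores)
    m = mean(scores)
    sd = pstdev(scores)  # population SD; an assumption for this sketch
    # Moment-based skewness: mean cubed deviation divided by SD cubed.
    skew = sum((x - m) ** 3 for x in scores) / (n * sd ** 3) if sd > 0 else 0.0
    return {
        "possible_range": (possible_min, possible_max),
        "observed_range": (min(scores), max(scores)),
        "mean": m,
        "median": median(scores),
        "mode": mode(scores),
        "sd": sd,
        "skewness": skew,
        "pct_floor": 100 * sum(x == possible_min for x in scores) / n,
        "pct_ceiling": 100 * sum(x == possible_max for x in scores) / n,
    }

scores = [1, 1, 1, 1, 2, 2, 3, 4, 5, 5]  # hypothetical responses to a 1-5 item
summary = variability_summary(scores, possible_min=1, possible_max=5)
for name, value in summary.items():
    print(name, value)
```

In this made-up sample, 40% of respondents sit at the floor and the skewness is positive, which matches a distribution bunched at the low end.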

Range of Scores

Especially important for multi-item measures
Possible and observed
Example of difference:
– CES-D possible range is 0-30
– Wong et al. study of mothers of young children: observed range was 0-23
» missing entire high end of the distribution (none had high levels of depression)

Mean, Median, Mode

Mean - average
Median - midpoint
Mode - most frequent score
In normally distributed measures, these are all the same
In non-normal distributions, they will vary

Mean and Standard Deviation

Most information on variability is from mean and standard deviation
– Can envision how it is distributed on the possible range

Normal Distributions (Or Approximately Normal)

Mean, SD tell the entire story of the distribution
± 1 SD on each side of the mean = about 68% of the scores
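The share of a normal distribution lying within a given number of SDs of the mean can be checked numerically. This sketch uses the standard identity P(|Z| < k) = erf(k / √2); the function name is illustrative.

```python
from math import erf, sqrt

def share_within(k_sd):
    """Proportion of a normal distribution within k_sd SDs of the mean."""
    return erf(k_sd / sqrt(2))

within_1sd = share_within(1)  # roughly 0.68
within_2sd = share_within(2)  # roughly 0.95

print(f"within 1 SD: {within_1sd:.4f}")
print(f"within 2 SD: {within_2sd:.4f}")
```

For one SD on each side of the mean this gives roughly 68% of scores, and for two SDs roughly 95%.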

Skewness

Positive skew - scores bunched at low end, long tail to the right
Negative skew - opposite pattern
Coefficient ranges from - infinity to + infinity
– the closer to zero, the more normal
Test whether skewness coefficient is significantly different from zero
– thus depends on sample size
Scores beyond ± 2.0 are cause for concern

Skewed Distributions

Mean and SD are not as useful
– Mean ± 1 SD often goes out beyond the maximum or minimum possible

Ceiling and Floor Effects: Similar to Skewness Information

Ceiling effects: substantial number of people get highest possible score
Floor effects: opposite
Not very meaningful for continuous scales
– there will usually be very few at either end
More helpful for single-item measures or coarse scales with only a few levels

… to what extent did health problems limit you in everyday physical activities (such as walking and climbing stairs)?

[Bar chart: % of respondents (axis 0-50) choosing each response: Not at all, Slightly, Moderately, Quite a bit, Extremely]

49% not limited at all (can’t improve)


SF-36 Variability Information in Patients with Chronic Conditions (N=3,445)

                 Physical   Role-      Mental    Vitality
                 function   physical   health    (energy)
Possible range   0-100      0-100      0-100     0-100
Mean             80         75         71        54
SD               27         41         21        22
Skewness         -.99       -.26       -.83      -.24
% floor          <1         24         <1        <1
% ceiling        19         37         4         <1

McHorney C et al. Med Care. 1994;32:40-66.
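The point that mean and SD are less useful for skewed scales can be checked against the SF-36 figures in the table above. This sketch flags scales where the mean plus or minus one SD falls outside the possible 0-100 range. The means and SDs are from the table; the function name and the one-SD rule of thumb are illustrative assumptions.

```python
# (mean, SD) on the 0-100 scale, from the table above (McHorney et al., 1994).
sf36 = {
    "Physical function": (80, 27),
    "Role-physical": (75, 41),
    "Mental health": (71, 21),
    "Vitality": (54, 22),
}

def sd_band_exceeds_range(mean, sd, lo=0, hi=100):
    """True if mean - SD or mean + SD falls outside the possible range."""
    return mean - sd < lo or mean + sd > hi

flagged = [name for name, (m, sd) in sf36.items() if sd_band_exceeds_range(m, sd)]
print(flagged)
```

Physical function (80 + 27 = 107) and role-physical (75 + 41 = 116) are flagged, and these are also the two scales with the largest ceiling percentages.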


Reasons for Poor Variability

Low variability in construct being measured in that “sample” (true low variation)
Items not adequately tapping construct
– If only one item, especially hard
Items not detecting important differences in construct at one or the other end of the continuum
Solutions if one is in the process of developing measures: add items


Advantages of multi-item scales revisited

Using multi-item scales minimizes likelihood of ceiling/floor effects

When items are skewed, multi-item scale “normalizes” the skew

Percent with Highest (Best) Score: MOS 5-Item Mental Health Index

Items (6 pt scale - all of the time to none of the time):
– Very nervous person - 34% none of the time
– Felt calm and peaceful - 4% all of the time
– Felt downhearted and blue - 33% none of the time
– Happy person - 10% all of the time
– So down in the dumps nothing could cheer you up - 63% none of the time
Summated 5-item scale (0-100 scale)
– Only 5% had highest score

Stewart A. et al., MOS book, 1992

Overview

Concepts of error
Basic psychometric characteristics
– Variability
– Reliability
– Interpretability

Reliability

Extent to which an observed score is free of random error
– Produces the same score each time it is administered (all else being equal)
Population-specific; reliability increases with:
– sample size
– variability in scores (dispersion)
– a person’s level on the scale

Components of Variability in Item Scores of a Group of Individuals

Observed score variance = true score variance + error variance

Observed score variance is the total variance (the variation across all observed item scores)

Reliability Depends on True Score Variance

Reliability is a group-level statistic
Reliability:
– Reliability = 1 – (error variance as a proportion of total variance)
– Reliability is the proportion of total variance due to true score:

Reliability = true score variance / total variance

Reliability Depends on True Score Variance

Reliability = true score variance / total variance (proportion of variance due to true score)

Reliability = total variance – error variance (as proportions of total variance)
.70 = 100% - 30%

Reliability Depends on True Score Variance

Reliability = true score variance / total variance (proportion of variance due to true score)

Reliability = total variance – error variance (as proportions of total variance)

Reliability of .70 means 30% of the variance in the observed score is explained by error
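The relationship between reliability and the two variance components can be written as a short calculation. This is a sketch of the classical definition used on these slides; the variance values of 70 and 30 are illustrative numbers chosen to reproduce the .70 example.

```python
def reliability(true_variance, error_variance):
    """Proportion of total variance that is due to true score."""
    total_variance = true_variance + error_variance
    return true_variance / total_variance

r = reliability(true_variance=70.0, error_variance=30.0)
error_share = 1 - r  # proportion of observed variance explained by error

print(f"reliability = {r:.2f}, error share = {error_share:.2f}")
```

With these numbers the reliability is .70 and the error share is .30, matching the example on the slide.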

Importance of Reliability

Necessary for validity (but not sufficient)
– Low reliability attenuates correlations with other variables (harder to detect true correlations among variables)
– May conclude that two variables are not related when they are
Greater reliability, greater power
– Thus the more reliable your scales, the smaller sample size you need to detect an association
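Attenuation can be illustrated with the classical correction-for-attenuation relationship, in which the expected observed correlation equals the true correlation multiplied by the square root of the product of the two reliabilities. That formula is not on the slide and is brought in here as an assumption; the true correlation of .50 and the reliability values are hypothetical.

```python
from math import sqrt

def attenuated_correlation(true_r, reliability_x, reliability_y):
    """Expected observed correlation given the reliabilities of both measures."""
    return true_r * sqrt(reliability_x * reliability_y)

# Same hypothetical true correlation, measured with more vs. less reliable scales.
r_high = attenuated_correlation(0.50, 0.90, 0.90)
r_low = attenuated_correlation(0.50, 0.60, 0.60)

print(f"reliabilities .90: observed r is about {r_high:.2f}")
print(f"reliabilities .60: observed r is about {r_low:.2f}")
```

Under these assumptions a true correlation of .50 would appear as roughly .45 with reliabilities of .90 but only about .30 with reliabilities of .60, which is why less reliable scales need larger samples to detect the same association.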

Reliability Coefficient

Typically ranges from .00 - 1.00
Higher scores indicate better reliability

How Do You Know if a Scale or Measure Has Adequate Reliability?

Adequacy of reliability judged according to standard criteria
– Criteria depend on type of coefficient

Types of Reliability Tests

Internal-consistency
Test-retest
Inter-rater
Intra-rater


Internal Consistency Reliability: Cronbach’s Alpha

Requires multiple items supposedly measuring same construct to calculate

Extent to which all items measure the same construct (same latent variable)

Internal-Consistency Reliability

For multi-item scales
Cronbach’s alpha
– ordinal scales
Kuder Richardson 20 (KR-20)
– for dichotomous items
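Cronbach's alpha can be computed directly from item-level data using the usual formula alpha = k/(k-1) x (1 - sum of item variances / variance of the total score). The sketch below is a minimal standard-library version that assumes complete data (no skipped items); the function name and the example responses are hypothetical.

```python
from statistics import variance

def cronbach_alpha(items):
    """items: list of per-item score lists, one entry per respondent in each."""
    k = len(items)
    sum_item_variances = sum(variance(item) for item in items)
    # Total (summated) score for each respondent.
    totals = [sum(scores) for scores in zip(*items)]
    return (k / (k - 1)) * (1 - sum_item_variances / variance(totals))

# Hypothetical responses of 5 people to 3 items.
items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 4, 4],
    [1, 3, 3, 5, 5],
]
alpha = cronbach_alpha(items)
print(f"alpha = {alpha:.3f}")
```

Because the three made-up items rise and fall together across respondents, alpha comes out high here (about .95).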

Minimum Standards for Internal Consistency Reliability

For group comparisons (e.g., regression, correlational analyses)
– .70 or above is minimum (Nunnally, 1978)
– .80 is optimal
– above .90 is unnecessary
For individual assessment (e.g., treatment decisions)
– .90 or above (.95) is preferred (Nunnally, 1978)

Internal-Consistency Reliability Can be Spurious

Based on only those who answered all questions in the measure
– If a lot of people are having trouble with the items and skip some, they are not included in test of reliability

Internal-Consistency Reliability is a Function of Number of Items in Scale

Increases with the number of items
Very large scales (20 or more items) can have high reliability without other good scaling properties
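The effect of scale length on reliability is commonly illustrated with the Spearman-Brown prophecy formula, which assumes the added items are comparable to the existing ones. The formula is not on the slide and is used here as an assumption; the starting reliability of .70 for a 5-item scale is hypothetical.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability when a scale is lengthened by length_factor."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)

r5 = 0.70                      # hypothetical reliability of a 5-item scale
r10 = spearman_brown(r5, 2)    # doubled to 10 items
r20 = spearman_brown(r5, 4)    # quadrupled to 20 items

print(f"5 items: {r5:.2f}, 10 items: {r10:.2f}, 20 items: {r20:.2f}")
```

Under that assumption, lengthening the scale from 5 to 20 items raises the predicted reliability from .70 to about .90 without any change in item quality, which is the caution raised on the slide.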

Example: 20 item Beck Depression Inventory (BDI)

BDI 1978 version (past week)
– reliability .86
– 3 items correlated < .30 with other items in the scale

Beck AT et al. J Clin Psychol. 1984;40:1365-1367

Test-Retest Reliability

Repeat assessment on individuals who are not expected to change
Time between assessments should be:
– Short enough so no change occurs
– Long enough so subjects don’t recall first response
Coefficient is a correlation between two measurements
For single item measures, the only way to test reliability

Appropriate Test-Retest Coefficients by Type of Measure

Continuous scales (ratio or interval scales, multi-item Likert scales):
– Pearson
Ordinal or non-normally distributed scales:
– Spearman
– Kendall’s tau
Dichotomous (categorical) measures:
– Phi
– Kappa


Minimum Standards for Test-Retest Reliability

Significance of a test-retest correlation has NOTHING to do with the adequacy of the reliability

Criteria: similar to those for internal consistency

– >.70 is desirable

– >.80 is optimal

Observer or Rater Reliability

Inter-rater reliability (across two or more raters)
– Consistency (correlation) between two or more observers on the same subjects (one point in time)
Intra-rater reliability (within one rater)
– A test-retest within one observer
– Correlation among repeated values obtained by the same observer (over time)

Observer or Rater Reliability

Sometimes Pearson correlations are used - correlate one observer with another
– Assesses association only
.65 to .95 are typical correlations
>.85 is considered acceptable

McDowell and Newell

Association vs. Agreement When Correlating Two Times or Ratings

Association is degree to which one score linearly predicts other score
Agreement is extent to which same score is obtained on second measurement (retest, second observer)
Can have high correlation and poor agreement
– If second score is consistently higher for all subjects, can obtain high correlation
– Need second test of mean differences

Hypothetical Scores on 4 Subjects by 2 Observers

[Chart: scores (axis 1-7) for subjects S1-S4, plotted for each of the two observers]

Example of Association and Agreement

Scores by observer 1 are exactly 2 points above scores by observer 2
– Correlation (association) would be perfect (r=1.0)
– Agreement is poor (no agreement on score in all cases - a difference of 2 between scores on each subject)
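This example can be reproduced in a few lines. The sketch below computes the Pearson correlation by hand, then the mean difference and the share of exact agreement between the two observers. The four scores for observer 2 are hypothetical; only the constant 2-point offset comes from the slide.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

observer2 = [2, 3, 4, 5]                 # hypothetical scores for S1-S4
observer1 = [s + 2 for s in observer2]   # exactly 2 points higher

r = pearson_r(observer1, observer2)
differences = [a - b for a, b in zip(observer1, observer2)]
mean_difference = mean(differences)
exact_agreement = sum(d == 0 for d in differences) / len(differences)

print(f"r = {r:.2f}, mean difference = {mean_difference}, agreement = {exact_agreement:.0%}")
```

The correlation is perfect while exact agreement is zero, which is why a test of mean differences is needed alongside the correlation.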

Intraclass Correlation Coefficient for Testing Inter-rater Reliability (Kappa)

Coefficient indicates level of agreement of two or more judges, exceeding that which would be expected by chance
Appropriate for dichotomous (categorical) scales and ordinal scales
Several forms of kappa:
– e.g., Cohen’s kappa is for 2 judges, dichotomous scale
Sensitive to number of observations, distribution of data
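For the two-judge case, Cohen's kappa is (observed agreement - chance agreement) / (1 - chance agreement), where chance agreement comes from each judge's marginal proportions. The sketch below is a minimal version of that calculation; the function name and the ratings are hypothetical, and it does not handle the degenerate case where chance agreement equals 1.

```python
def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring the same subjects."""
    n = len(rater_a)
    categories = set(rater_a) | set(rater_b)
    # Proportion of subjects on which the two raters gave the same rating.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal proportions.
    expected = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
    )
    return (observed - expected) / (1 - expected)

# Hypothetical yes/no (1/0) ratings of 10 subjects by two judges.
judge_a = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
judge_b = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]
kappa = cohens_kappa(judge_a, judge_b)
print(f"kappa = {kappa:.2f}")
```

The judges agree on 8 of 10 subjects (80%), but half of that agreement would be expected by chance, so kappa is only about .60.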

Interpreting Kappa: Level of Reliability

<0.00        Poor
.00 - .20    Slight
.21 - .40    Fair
.41 - .60    Moderate
.61 - .80    Substantial
.81 - 1.00   Almost perfect

.60 or higher is acceptable (Landis, 1977)

Reliable Scale?

NO! There is no such thing as a “reliable” scale
We accumulate “evidence” of reliability in a variety of populations in which it has been tested

Reliability Often Poorer in Lower SES Groups

More random error due to
– Reading problems
– Difficulty understanding complex questions
– Unfamiliarity with questionnaires and surveys


Advantages of multi-item scales revisited

Using multi-item scales improves reliability

Random error is “canceled out” across multiple items

Overview

Concepts of error
Basic psychometric characteristics
– Variability
– Reliability
– Interpretability

Interpretability of Scale Scores: What does a Score Mean?

Meaning of scores
What are the endpoints?
Direction of scoring - what does a high score mean?
Compared to norms - is score average, low, or high compared to norms?

Single items, more easily interpretable
Multi-item scales, no inherent meaning to scores

Endpoints

What is minimum and maximum possible?
– To enable interpretation of mean score
Endpoints of summated scales depend on number of items & number of response choices
– 5 items, 4 response choices = 5 - 20
– 3 items, 5 response choices = 3 - 15

Direction of Scoring

What does a high score mean?
Where in the range does this mean score lie?
– Toward top, bottom?
– In the middle?

Descriptive Statistics for 3193 Women

           M (SD)       Min    Max
Age        46.2 (2.7)   44.0   52.9
Activity   7.7 (1.8)    3.0    14.0
Stress     8.6 (2.9)    4.0    19.0

Avis NE et al. Med Care, 2003;41:1262-1276

Sample Results: Mean Scores in a Sample of Older Adults

                       Mean
Physical functioning   45.0
Sleep                  28.1
Disability             35.7

Example of Table Labeling Scores: Making it Easier to Interpret

                       Mean*
Physical functioning   45.0
Sleep                  28.1
Disability             35.7

* All scores 0-100

Example of Table Labeling Scores: Making it Easier to Interpret

                           Mean*
Physical functioning (+)   45.0
Sleep (-)                  28.1
Disability (-)             35.7

* All scores 0-100
(+) indicates higher score is better health
(-) indicates lower score is better health

Solutions

Can include in label (+) or (-)
– Can label scale so that higher score is more of “label”
Can easily put score range next to label if they differ in one table

Mean Has to be Interpreted Within the Possible Range

                                      M      SD
Parents’ harsh discipline practices*
  Interviewers’ ratings of mother     2.55   .74
  Husbands’ reports of wife           5.32   3.30

*Note: high score indicates more harsh practices

Mean Has to be Interpreted Within the Possible Range

                                          M      SD
Parents’ harsh discipline practices*
  Interviewers’ ratings of mother (1-5)   2.55   .74
  Husbands’ reports of wife (1-7)         5.32   3.30

*Note: high score indicates more harsh practices

Mean Has to be Interpreted Within the Possible Range

                                          M      SD
Parents’ harsh discipline practices*
  Interviewers’ ratings of mother (1-5)   2.55   .74
  Husbands’ reports of wife (1-7)         5.32   3.30

Interviewer: 1 2 3 4 5 (mean of 2.55 marked on the scale)
Husband: 1 2 3 4 5 6 7 (mean of 5.32 marked on the scale)

*Note: high score indicates more harsh practices

Mean Has to be Interpreted Within the Possible Range: Adding SD Information

                                          M      SD
Parents’ harsh discipline practices*
  Interviewers’ ratings of mother (1-5)   2.55   .74
  Husbands’ reports of wife (1-7)         5.32   3.30

Interviewer: 1 2 3 4 5 (mean of 2.55 and SD marked on the scale)
Husband: 1 2 3 4 5 6 7 (mean of 5.32 and SD marked on the scale)

*Note: high score indicates more harsh practices

Transforming a Summated Scale to 0-100 Scale

Works with any ordinal or summated scale
Transforms it so 0 is the lowest possible and 100 is the highest possible
Eases interpretation across numerous scales

Transformed score = 100 x (observed score - minimum possible score) / (maximum possible score - minimum possible score)
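The transformation is a one-line function. The sketch below also includes a helper for the endpoints of a summated scale, assuming each item is coded 1 up to the number of response choices, and applies the transformation to the two harsh-discipline means from the earlier slides. Function names are illustrative.

```python
def summated_range(n_items, n_choices):
    """Possible (min, max) of a summated scale whose items are coded 1..n_choices."""
    return n_items, n_items * n_choices

def to_0_100(observed, possible_min, possible_max):
    """Rescale a score so the lowest possible is 0 and the highest possible is 100."""
    return 100 * (observed - possible_min) / (possible_max - possible_min)

lo, hi = summated_range(n_items=5, n_choices=4)   # 5 items, 4 choices: 5 to 20
midpoint = to_0_100(12.5, lo, hi)                 # middle of the 5-20 range

# Means from the harsh discipline example, on their original 1-5 and 1-7 scales.
interviewer = to_0_100(2.55, 1, 5)
husband = to_0_100(5.32, 1, 7)

print(lo, hi, midpoint)
print(f"interviewer: {interviewer:.2f}, husband: {husband:.2f}")
```

On the common 0-100 metric the interviewer mean is about 39 and the husband mean about 72, so the two can be compared directly even though the original ranges differ.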

Homework for Next Class

Complete rows in matrix for your two measures
– Rows 13-18: Nature of samples on which it has been tested, data quality
– Rows 19-26: Variability, reliability, interpretability

Next Class (Class 5)

Guest lecture: Steve Gregorich
Factor analysis

Two Readings for Next Week

Selected by Steve Gregorich
– Kline
– Mulaik
Suggest reading them ahead to be able to ask questions