epidemiologic methods- fall 2002. course administration format –lectures: tuesdays 8:15 am, except...


Upload: zain-tufts

Post on 16-Dec-2015


Page 1: Epidemiologic Methods- Fall 2002. Course Administration Format –Lectures: Tuesdays 8:15 am, except for Dec. 10 at 1:30 pm –Small Group Sections: Tuesdays

Epidemiologic Methods - Fall 2002

Lecture  Title
1        Understanding Measurement: Reproducibility & Validity
2        Study Design
3        Measures of Disease Occurrence I
4        Measures of Disease Occurrence II
5        Measures of Disease Association I
6        Measures of Disease Association II
7        Bias in Clinical Research: Selection and Measurement Bias
8        Confounding and Interaction I: General Principles
9        Confounding and Interaction II: Assessing Interaction
10       Confounding and Interaction II: Stratified Analysis
11       Conceptual Approach to Multivariable Analysis I
12       Conceptual Approach to Multivariable Analysis II

Course Administration

• Format
  – Lectures: Tuesdays 8:15 am, except for Dec. 10 at 1:30 pm
  – Small Group Sections: Tuesdays 1:00 pm, except for the last section, Dec. 3, from 10:30 to 11:30. Sections begin next week.
    • Content: Overview and discussion of lectures, and review of assignments.

• Textbooks
  – Epidemiology: Beyond the Basics by Szklo and Nieto (S & N)
  – Multivariable Analysis: A Practical Guide for Clinicians by M. Katz

• Grading
  – Based on points achieved on homework (~80%) & final (~20%)
  – Late assignments are not accepted.

• Missed sessions
  – All material distributed in class is posted on the website.

Definitions of Epidemiology

• The study of the distribution and determinants (causes) of disease
  – e.g. cardiovascular epidemiology

• The method used to conduct human subject research
  – the methodologic foundation of any research where individual humans or groups of humans are the unit of observation

Understanding Measurement: Aspects of Reproducibility and Validity

• Review Measurement Scales

• Reproducibility vs Validity

• Reproducibility
  – importance
  – sources of measurement variability
  – methods of assessment
    • by variable type: interval vs categorical

Clinical Research

[Diagram of the steps of clinical research: Sample, Measure (Intervene), Analyze, Infer]


A study can only be as good as the data . . .

-Martin Bland

Measurement Scales

Scale             Example
Interval
  continuous      weight
  discrete        WBC count
Categorical
  ordinal         tumor stage
  nominal         race
  dichotomous     death

Reproducibility vs Validity

• Reproducibility
  – the degree to which a measurement provides the same result each time it is performed on a given subject or specimen

• Validity
  – from the Latin validus - strong
  – the degree to which a measurement truly measures (represents) what it purports to measure (represent)

Reproducibility vs Validity

• Reproducibility
  – aka: reliability, repeatability, precision, variability, dependability, consistency, stability

• Validity
  – aka: accuracy

Relationship Between Reproducibility and Validity

[Figure, two panels: (1) good reproducibility, poor validity; (2) poor reproducibility, good validity]

Relationship Between Reproducibility and Validity

[Figure, two panels: (1) good reproducibility, good validity; (2) poor reproducibility, poor validity]

Why Care About Reproducibility? Impact on Validity

• Mathematically, the upper limit of a measurement’s validity is a function of its reproducibility

• Consider a study to measure height in the community:

  – Assume the measurement has imperfect reproducibility: if we measure height twice on a given person, we get two different values; at least 1 of the 2 values must be wrong (imperfect validity)

  – If the study measures everyone only once, these errors, despite being random, will lead to biased inferences when the measurements are used (i.e. the measurements lack validity)

Truth:

          Good B-Ball   Poor B-Ball   Total
>6 ft         10            30          40
<6 ft         10            50          60
Total         20            80         100

10% Misclassification of height: 10% of each cell is recorded in the wrong height row (1 good and 3 poor players who are truly >6 ft are recorded as <6 ft; 1 good and 5 poor players who are truly <6 ft are recorded as >6 ft).

Observed:

          Good B-Ball   Poor B-Ball   Total
>6 ft         10            32          42
<6 ft         10            48          58
Total         20            80         100

Truth: Prevalence Ratio = (10/40) / (10/60) = 1.5

Observed: Prevalence Ratio = (10/42) / (10/58) = 1.38
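The arithmetic of this example can be retraced with a short sketch. It assumes that the 10% misclassification applies equally to every cell of the true table (which is what the cell counts on the slide imply); the function and variable names are illustrative only.

```python
def table_prevalence_ratio(table):
    """Prevalence of 'good' among tall players divided by that among short players."""
    tall_good, tall_poor = table["tall"]
    short_good, short_poor = table["short"]
    tall_prev = tall_good / (tall_good + tall_poor)
    short_prev = short_good / (short_good + short_poor)
    return tall_prev / short_prev


def misclassify_height(table, fraction):
    """Move `fraction` of every cell into the other height row (same rate for all cells)."""
    tall_good, tall_poor = table["tall"]
    short_good, short_poor = table["short"]
    keep = 1 - fraction
    return {
        "tall": (tall_good * keep + short_good * fraction,
                 tall_poor * keep + short_poor * fraction),
        "short": (short_good * keep + tall_good * fraction,
                  short_poor * keep + tall_poor * fraction),
    }


# True table from the slide: (good, poor) counts per height group.
truth = {"tall": (10, 30), "short": (10, 50)}
observed = misclassify_height(truth, 0.10)

true_pr = table_prevalence_ratio(truth)
observed_pr = table_prevalence_ratio(observed)
print(f"true PR = {true_pr:.2f}, observed PR = {observed_pr:.2f}")
```

Even though the error is random with respect to basketball ability, the observed ratio moves toward 1, which is the point of the slide.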

Impact of Reproducibility on Statistical Precision

• Classical Measurement Theory:

  – observed value (O) = true value (T) + measurement error (E)

  – If we assume E is random and normally distributed:

    $E \sim N(0, \sigma^2_E)$

[Figure: histogram of measurement error; x-axis: Error (-3 to 3), y-axis: Fraction]

Impact of Reproducibility on Statistical Precision

– observed value (O) = true value (T) + measurement error (E)

– E is random and $\sim N(0, \sigma^2_E)$

When measuring a group of subjects, the variability of observed values is a combination of the variability in their true values and the variability in the measurement error:

$\sigma^2_O = \sigma^2_T + \sigma^2_E$
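A small simulation can make the variance decomposition concrete. This is only a sketch: the true-value mean and SD, the error SD, and the sample size are hypothetical numbers chosen for illustration, and the error is assumed to be independent of the true value.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

N_SUBJECTS = 50_000
TRUE_MEAN, TRUE_SD = 170.0, 10.0  # hypothetical true heights (cm)
ERROR_SD = 5.0                    # hypothetical measurement-error SD (cm)

true_values = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N_SUBJECTS)]
errors = [random.gauss(0.0, ERROR_SD) for _ in range(N_SUBJECTS)]
observed = [t + e for t, e in zip(true_values, errors)]  # O = T + E

var_true = statistics.pvariance(true_values)
var_error = statistics.pvariance(errors)
var_observed = statistics.pvariance(observed)

print(f"var(T) + var(E) = {var_true + var_error:.1f}")
print(f"var(O)          = {var_observed:.1f}")
```

The two printed values agree up to sampling noise, because the covariance between T and E is close to zero when the error is random.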

Why Care About Reproducibility?

$\sigma^2_O = \sigma^2_T + \sigma^2_E$

• More measurement error means more variability in observed measurements
  – e.g. measure height in a group of subjects
  – If no measurement error
  – If measurement error

[Figure: distribution of observed height in each case; x-axis: Height]

Why Care About Reproducibility?

$\sigma^2_O = \sigma^2_T + \sigma^2_E$

• More variability of observed measurements has profound influences on statistical precision/power:
  – Descriptive studies: wider confidence intervals
  – RCTs: power to detect a treatment difference is reduced
  – Observational studies: power to detect an influence of a particular risk factor upon a given disease is reduced.

Mathematical Definition of Reproducibility

• Reproducibility $= \dfrac{\sigma^2_T}{\sigma^2_O} = \dfrac{\sigma^2_T}{\sigma^2_T + \sigma^2_E}$

• Varies from 0 (poor) to 1 (optimal)

• As $\sigma^2_E$ approaches 0 (no error), reproducibility approaches 1
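As a quick numerical illustration of this definition, the sketch below evaluates the ratio for a fixed true variance and a shrinking error variance. The variance values are hypothetical, not taken from the lecture.

```python
def reproducibility(var_true, var_error):
    """Share of the observed variance that is due to true between-subject variance."""
    var_observed = var_true + var_error  # sigma^2_O = sigma^2_T + sigma^2_E
    return var_true / var_observed


# Hypothetical true variance of 100 with progressively smaller error variance.
for var_error in (100.0, 25.0, 1.0, 0.0):
    print(f"error variance {var_error:6.1f} -> reproducibility "
          f"{reproducibility(100.0, var_error):.3f}")
```

The output rises toward 1 as the error variance shrinks toward 0.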

[Figure: Power (y-axis). Source: Phillips and Smith, J Clin Epi 1993]

Sources of Measurement Error

• Observer
  – within-observer (intrarater)
  – between-observer (interrater)

• Instrument
  – within-instrument
  – between-instrument

Sources of Measurement Error

• e.g. plasma HIV viral load
  – observer: measurement-to-measurement differences in tube filling, time before processing
  – instrument: run-to-run differences in reagent concentration, PCR cycle times, enzymatic efficiency

Within-Subject Variability

• Although not the fault of the measurement process, moment-to-moment biological variability can have the same effect as errors in the measurement process

• Recall that:
  – observed value (O) = true value (T) + measurement error (E)
  – T = the average of measurements taken over time
  – E is always in reference to T
  – Therefore, lots of moment-to-moment within-subject biologic variability will serve to increase the variability in the error term and thus increase overall variability because

    $\sigma^2_O = \sigma^2_T + \sigma^2_E$

Assessing Reproducibility

Depends on measurement scale

• Interval Scale
  – within-subject standard deviation
  – coefficient of variation

• Categorical Scale
  – Cohen’s Kappa

Reproducibility of an Interval Scale Measurement: Peak Flow

• Assessment requires >1 measurement per subject

• Peak Flow Rate in 17 adults (Bland & Altman)

Subject  Meas. 1  Meas. 2
1        494      490
2        395      397
3        516      512
4        434      401
5        476      470
6        557      611
7        413      415
8        442      431
9        650      638
10       433      429
11       417      420
12       656      633
13       267      275
14       478      492
15       178      165
16       423      372
17       427      421

Assessment by Simple Correlation

[Figure: scatter plot of Meas. 2 (y-axis) against Meas. 1 (x-axis), both axes 200 to 800]

Pearson Product-Moment Correlation Coefficient

• r (rho) ranges from -1 to +1

• $r = \dfrac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \sum (Y - \bar{Y})^2}}$

• r describes the strength of linear association

• $r^2$ = proportion of variance (variability) of one variable accounted for by the other variable

[Figure: four example scatter plots, with r = 1.0, r = -1.0, r = 0.8, and r = 0.0]

Correlation Coefficient for Peak Flow Data

r (meas. 1, meas. 2) = 0.98
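The coefficient can be recomputed from the 17 peak flow pairs listed earlier. The sketch below is a plain-Python version of the product-moment formula; the function name is illustrative.

```python
import math

# Peak flow pairs (l/min) for the 17 subjects, as listed in the data table.
MEAS1 = [494, 395, 516, 434, 476, 557, 413, 442, 650,
         433, 417, 656, 267, 478, 178, 423, 427]
MEAS2 = [490, 397, 512, 401, 470, 611, 415, 431, 638,
         429, 420, 633, 275, 492, 165, 372, 421]


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(MEAS1, MEAS2)
print(f"r = {r:.2f}")
```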

Limitations of Simple Correlation for Assessment of Reproducibility

• Depends upon range of data
  – e.g. Peak Flow
    • r (full range of data) = 0.98
    • r (peak flow <450) = 0.97
    • r (peak flow >450) = 0.94


Limitations of Simple Correlation for Assessment of Reproducibility

• Depends upon ordering of data

• Measures linear association only

[Figure: scatter plot of Meas. 2 (y-axis) against Meas. 1 (x-axis), both axes 100 to 1700]

Limitations of Simple Correlation for Assessment of Reproducibility

• Gives no meaningful parameter using the same scale as the original measurement

Within-Subject Standard Deviation

subject  meas1  meas2  mean  s
1        494    490    492   2.83
2        395    397    396   1.41
3        516    512    514   2.83
.        .      .      .     .
.        .      .      .     .
.        .      .      .     .
15       178    165    172   9.19
16       423    372    398   36.06
17       427    421    424   4.24

• Mean within-subject standard deviation (sw):

  $s_w = \sqrt{\dfrac{\sum_i s_i^2}{n}} = \sqrt{\dfrac{2.83^2 + \dots + 4.24^2}{17}} = 15.3$ l/min
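The calculation of sw can be sketched directly from the 17 pairs of peak flow readings. The code below follows the formula on the slide (the root of the mean squared per-subject SD); variable names are illustrative.

```python
import math
import statistics

# (meas1, meas2) peak flow readings in l/min for the 17 subjects.
PAIRS = [(494, 490), (395, 397), (516, 512), (434, 401), (476, 470),
         (557, 611), (413, 415), (442, 431), (650, 638), (433, 429),
         (417, 420), (656, 633), (267, 275), (478, 492), (178, 165),
         (423, 372), (427, 421)]

# Sample SD of each subject's replicates; for a pair this equals |difference| / sqrt(2).
subject_sds = [statistics.stdev(pair) for pair in PAIRS]

# Sum of the squared per-subject SDs, then the root of their mean.
sum_squared_sds = sum(s ** 2 for s in subject_sds)
s_w = math.sqrt(sum_squared_sds / len(PAIRS))

print(f"first subject s = {subject_sds[0]:.2f}")
print(f"s_w = {s_w:.1f} l/min")
```

With two replicates per subject, the sum of squared per-subject SDs is the same quantity as the within-group sum of squares in a one-way ANOVA with subject as the grouping factor.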

Computationally easier with ANOVA table:

                    Analysis of Variance
Source            SS           df   MS           F        Prob > F
-----------------------------------------------------------------------
Between groups    441598.529   16   27599.9081   117.80   0.0000
Within groups     3983.00      17   234.294118
-----------------------------------------------------------------------
Total             445581.529   33   13502.4706

• Mean within-subject standard deviation (sw):

  $\sum_i s_i^2$ = within-group sum of squares

  $s_w^2 = \dfrac{\sum_i s_i^2}{17}$ = within-group mean square = 234

  $s_w = \sqrt{\text{within-group mean square}} = 15.3$ l/min

sw: Further Interpretation

• If we assume that replicate results:
  – are normally distributed
  – have a mean that estimates the true value

• then 95% of replicates are within (1.96)(sw) of the true value

[Figure: normal curve of Measured Value centered on the true value, marking sw and (1.96)(sw)]

sw: Peak Flow Data

• If we assume that replicate results:
  – are normally distributed
  – have a mean that estimates the true value

• then 95% of replicates are within (1.96)(15.3) = 30 l/min of the true value

[Figure: normal curve of Measured Value centered on the true value, marking sw = 15.3 l/min and (1.96)(sw) = (1.96)(15.3) = 30]

sw: Further Interpretation

• Difference between any 2 replicates for same person = diff = meas1 - meas2

• Because the two measurement errors are independent, var(diff) = var(meas1) + var(meas2); therefore,

  $s^2_{diff} = s_w^2 + s_w^2 = 2 s_w^2$

  $s_{diff} = \sqrt{2 s_w^2} = \sqrt{2}\, s_w = 1.41\, s_w$

sw: Difference Between Two Replicates

• If we assume that differences:
  – are normally distributed and the mean of differences is 0
  – sdiff estimates the standard deviation

• then the difference between 2 measurements for the same subject is expected to be less than (1.96)(sdiff) = (1.96)(1.41)sw = 2.77sw for 95% of all pairs of measurements

[Figure: normal curve of the difference centered on 0, marking sdiff and (1.96)(sdiff)]

sw: Further Interpretation

• For Peak Flow data:

• The difference between 2 measurements for the same subject is expected to be less than 2.77sw = (2.77)(15.3) = 42.4 l/min for 95% of all pairs

• Bland and Altman refer to this as the “repeatability” of the measurement
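The repeatability figure follows from sw in two steps (scale by the square root of 2, then by 1.96). A minimal sketch, with an illustrative function name:

```python
import math


def repeatability(s_w, z=1.96):
    """Bound expected to contain 95% of absolute differences between two replicates."""
    s_diff = math.sqrt(2.0) * s_w  # SD of the difference between two replicates
    return z * s_diff


print(f"repeatability = {repeatability(15.3):.1f} l/min")
```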

One Common Underlying sw

• Appropriate only if there is one sw

• i.e., sw does not vary with the true underlying value

[Figure: scatter plot of Within-Subject Std Deviation (y-axis, 0 to 40) against Subject Mean Peak Flow (x-axis, 100 to 700). Kendall’s correlation coefficient = 0.17, p = 0.36]
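One way to run this check is to rank-correlate each subject's SD with the subject's mean. The sketch below implements Kendall's tau-b by brute force over all pairs of subjects; which tau variant the slide used is not stated, so treating it as tau-b is an assumption, and the p-value is not computed here.

```python
import math

# (meas1, meas2) peak flow readings in l/min for the 17 subjects.
PAIRS = [(494, 490), (395, 397), (516, 512), (434, 401), (476, 470),
         (557, 611), (413, 415), (442, 431), (650, 638), (433, 429),
         (417, 420), (656, 633), (267, 275), (478, 492), (178, 165),
         (423, 372), (427, 421)]


def kendall_tau_b(x, y):
    """Kendall's tau-b: (concordant - discordant) pairs, with a correction for ties."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied on both variables: contributes to neither count
            if dx == 0:
                tied_x_only += 1
            elif dy == 0:
                tied_y_only += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    untied = concordant + discordant
    denom = math.sqrt((untied + tied_x_only) * (untied + tied_y_only))
    return (concordant - discordant) / denom


subject_means = [(a + b) / 2 for a, b in PAIRS]
# For two replicates the within-subject SD is |difference| / sqrt(2).
subject_sds = [abs(a - b) / math.sqrt(2) for a, b in PAIRS]

tau = kendall_tau_b(subject_means, subject_sds)
print(f"Kendall's tau-b = {tau:.2f}")
```

A value near 0 is consistent with a within-subject SD that does not change with the level of the measurement.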

Another Interval Scale Example

• Salivary cotinine in children (Bland-Altman)

• n = 20 participants measured twice

subject  trial 1  trial 2
1        0.1      0.1
2        0.2      0.1
3        0.2      0.3
.        .        .
.        .        .
.        .        .
18       4.9      1.4
19       4.9      3.9
20       7.0      4.0

Cotinine: Absolute Difference vs. Mean

[Figure: scatter plot of Subject Absolute Difference (y-axis, 0 to 4) against Subject Mean Cotinine (x-axis, 0 to 6). Kendall’s tau = 0.62, p = 0.001]

Logarithmic Transformation

subject  trial 1  trial 2  log trial 1  log trial 2
1        0.1      0.1      -1           -1
2        0.2      0.1      -0.69897     -1
3        0.2      0.3      -0.69897     -0.52288
.        .        .        .            .
.        .        .        .            .
.        .        .        .            .
18       4.9      1.4      0.690196     0.146128
19       4.9      3.9      0.690196     0.591065
20       7        4        0.845098     0.60206

Log Transformed: Absolute Difference vs. Mean

[Figure: scatter plot of Subject abs log diff (y-axis, 0 to .6) against Subject mean log cotinine (x-axis, -1 to 1). Kendall’s tau = 0.07, p = 0.7]

sw for log-transformed cotinine data

• $s_w = \sqrt{0.0305} = 0.175$

• back-transforming to native scale:

• antilog(sw) = antilog(0.175) = $10^{0.175}$ = 1.49

Coefficient of Variation

• On the natural scale, there is not one common within-subject standard deviation for the cotinine data

• Therefore, there is not one absolute number that can represent the difference any replicate is expected to be from the true value or from another replicate

• Instead, within-subject standard deviation varies with the level of the measurement and it is reasonable to depict the within-subject standard deviation as a % of the level

  coefficient of variation $= \dfrac{\text{within-subject standard deviation}}{\text{within-subject mean}} = \text{antilog}(s_w) - 1$

Cotinine Data

• Coefficient of variation = 1.49 - 1 = 0.49

• At any level of cotinine, the within-subject standard deviation of repeated measures is 49% of the level
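The back-transformation can be sketched in a few lines, assuming (as in the slides) that the logarithms are base 10 and that 0.0305 is the within-subject variance on the log scale. The function name is illustrative.

```python
import math


def cv_from_log10_sw(s_w_log10):
    """Coefficient of variation implied by a within-subject SD on the log10 scale."""
    return 10 ** s_w_log10 - 1  # antilog(s_w) - 1


s_w_log = math.sqrt(0.0305)  # within-subject SD of log10 cotinine
cv = cv_from_log10_sw(s_w_log)

print(f"s_w on log10 scale = {s_w_log:.3f}")
print(f"coefficient of variation = {cv:.3f}")
```

The result is roughly 0.49 to 0.50 depending on where intermediate values are rounded, in line with the 49% quoted on the slide.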

Coefficient of Variation for Peak Flow Data

• By definition, when the within-subject standard deviation is not proportional to the mean value, as in the Peak Flow data, there is not a constant ratio between the within-subject standard deviation and the mean.

• Therefore, there is not one common coefficient of variation

• Estimating the “average” coefficient of variation is not very meaningful

Peak Flow Data: Use of Coefficient of Variation when sw is Constant

Mean of replicates  sw    C.V.
100                 15.3  0.153
200                 15.3  0.077
300                 15.3  0.051
400                 15.3  0.038
500                 15.3  0.031
600                 15.3  0.026
700                 15.3  0.022

Which Index to Use?

Pattern of within-subject standard deviation over range of measurement, and the index to use:

• Constant: “common” within-subject standard deviation (and its derivatives)

• Proportional to the magnitude of the measurement: coefficient of variation

• Neither constant nor proportional: family of coefficients of variation over range of measurement

Reproducibility of a Categorical Measurement: Kappa Statistic

• Agreement above that expected by chance

• (observed agreement - chance agreement) is the amount of agreement above chance

• If the maximum amount of agreement is 1.0, then (1 - chance agreement) is the maximum amount of agreement above chance that is possible

• Therefore, kappa is the ratio of “agreement beyond chance” to “maximal possible agreement beyond chance”

  $\text{kappa} = \dfrac{\text{observed agreement} - \text{chance agreement}}{1 - \text{chance agreement}}$
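A minimal sketch of the kappa calculation for two ratings of the same subjects follows. The 2x2 table is a made-up example, not data from the lecture, and chance agreement is computed in the usual Cohen's kappa way from the row and column totals.

```python
def cohens_kappa(table):
    """Cohen's kappa for a square agreement table (rows: rating 1, columns: rating 2)."""
    k = len(table)
    n = sum(sum(row) for row in table)
    observed = sum(table[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance if the two ratings were independent.
    chance = sum(row_totals[i] * col_totals[i] for i in range(k)) / n ** 2
    return (observed - chance) / (1 - chance)


# Hypothetical example: 100 subjects each rated twice as positive/negative.
EXAMPLE_TABLE = [[40, 10],
                 [5, 45]]

kappa = cohens_kappa(EXAMPLE_TABLE)
print(f"kappa = {kappa:.2f}")
```

In this made-up table the observed agreement is 0.85 and the chance agreement is 0.50, so kappa is the 0.35 of agreement beyond chance divided by the 0.50 that was possible.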

Sources of Measurement Variability: Which to Assess?

• Observer
  – within-observer (intrarater)
  – between-observer (interrater)

• Instrument
  – within-instrument
  – between-instrument

• Subject
  – within-subject

• Which to assess depends upon the use of the measurement and how/when the measurement will be made:
  – For clinical use: all of the above are needed
  – For research: depends upon logistics of study (e.g., only within-observer and within-instrument are needed if just one person/instrument is used throughout the study)

Assessing Validity

• Measures can be assessed for validity in 3 ways:

  – Content validity
    • Face
    • Sampling

  – Construct validity

  – Empirical validity (aka criterion)
    • Concurrent (i.e. when gold standards are present)
      – Interval scale measurement: 95% limits of agreement
      – Categorical scale measurement: sensitivity & specificity
    • Predictive

Conclusions

• Measurement reproducibility plays a key role in determining validity and statistical precision in all different study designs

• When assessing reproducibility, for interval scale measurements:
  – avoid correlation coefficients
  – use within-subject standard deviation if constant
  – or coefficient of variation if within-subject sd is proportional to the magnitude of measurement

• For categorical scale measurements, use Kappa

• What is acceptable reproducibility depends upon desired use

• Assessment of validity depends upon whether or not gold standards are present, and can be a challenge when they are absent

Assessing Validity - With Gold Standards

• A new and simpler device to measure peak flow becomes available (Bland-Altman)

subject  gold std  new
1        494       512
2        395       430
3        516       520
.        .         .
.        .         .
.        .         .
15       178       259
16       423       350
17       427       451

Plot of Difference vs. Gold Standard

[Figure: scatter plot of Difference (y-axis, -200 to 200) against Gold standard (x-axis, 100 to 700)]

Examine the Differences

[Figure: scatter plot of Difference (y-axis, -200 to 200) against Gold standard (x-axis, 100 to 700), with three example differences labeled: d1 = -81, d2 = 7, d3 = -35]

Are the Differences Normally Distributed?

[Figure: histogram of diff (x-axis, -100 to 100) with Frequency on the y-axis (0 to 8)]

• The mean difference describes any systematic difference between the gold standard and the new device:

  $\bar{d} = \dfrac{1}{n} \sum_i d_i = \dfrac{1}{17}\left[(494 - 512) + \dots + (427 - 451)\right] = -2.3$

• The standard deviation of the differences:

  $s_d = \sqrt{\dfrac{\sum_i (d_i - \bar{d})^2}{n - 1}} = 38.8$

• 95% of differences will lie between -2.3 ± (1.96)(38.8), or from -78 to 74 l/min.

• These are the 95% limits of agreement
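The limits of agreement follow directly from the two summary statistics quoted on the slide. Because the full table of gold standard and new-device readings is elided in the transcript, the sketch below starts from the mean and SD of the differences rather than from raw data; the function name is illustrative.

```python
def limits_of_agreement(mean_diff, sd_diff, z=1.96):
    """Interval expected to contain 95% of differences, assuming they are normal."""
    half_width = z * sd_diff
    return mean_diff - half_width, mean_diff + half_width


# Summary statistics for (gold standard - new device), in l/min, from the slide.
lower, upper = limits_of_agreement(mean_diff=-2.3, sd_diff=38.8)
print(f"95% limits of agreement: {lower:.0f} to {upper:.0f} l/min")
```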

d