bias can get by us november 2 2004 epidemiology 511 w. a. kukull

Bias can get by us
November 2, 2004
Epidemiology 511

W. A. Kukull

Bias

• Systematic error that leads to incorrect estimate of an association
  – anticipate and eliminate or minimize in the study design phase
  – may be impossible to account for in analysis
  – usually introduced by the investigator (or subjects)
• Main categories: Selection bias and Information bias

Bias is a systematic error (diagram after Rothman, 2002)

• Random error decreases with study size; systematic error remains

[Figure: error (y-axis) versus study size (x-axis), with curves for random error and systematic error]

Direction of Bias

True odds ratio    Observed odds ratio    Direction of bias
2.0                8.0                    away from null (1.0)
0.9                0.5                    away from null
5.0                1.3                    toward null
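One way to make the "direction of bias" idea concrete is to compare how far the true and observed odds ratios sit from the null (1.0) on the log scale. The sketch below is only an illustration: the function name `bias_direction` and the extra "past the null" category (for an observed OR on the opposite side of 1.0) are assumptions that do not appear on the slide.

```python
import math


def bias_direction(true_or: float, observed_or: float, tol: float = 1e-9) -> str:
    """Classify bias by distance from the null (OR = 1.0) on the log scale."""
    log_true = math.log(true_or)
    log_obs = math.log(observed_or)
    # Assumed extra category: the observed OR lies on the other side of 1.0.
    if log_true * log_obs < 0:
        return "past the null"
    if abs(abs(log_obs) - abs(log_true)) <= tol:
        return "no bias"
    return "away from null" if abs(log_obs) > abs(log_true) else "toward null"


# The three (true OR, observed OR) pairs from the slide.
rows = [(2.0, 8.0), (0.9, 0.5), (5.0, 1.3)]
directions = [bias_direction(t, o) for t, o in rows]
for (t, o), d in zip(rows, directions):
    print(f"true OR {t}, observed OR {o}: {d}")
```

Working on the log scale treats an OR of 0.5 and an OR of 2.0 as equally far from the null, which is why the 0.9 to 0.5 row counts as bias away from the null.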

Control of Bias

• Careful study design is primary
  – Selection bias: permanent flaw
• Choice of study groups
• Data collection; data sources
  – objective, closed-ended questions
  – trained interviewers: reliability assessment
  – wide variety of factors to “blind” interviewer and subject to hypothesis

Selection Bias

• Selection of “cases” or “controls” leads to apparent disease-exposure association

• Selection or follow-up (f/u) and diagnosis (dx) of “exposed” or “unexposed” leads to apparent disease-exposure association

• “Apparent” association is due to a systematic error in design or conduct of the study

Selection bias

• Common element:
  – The association between exposure and disease is different for those who are studied than it is for those who would be eligible but are not studied
  – Case-control: subject selection is influenced by probability of exposure history
  – Cohort: non-random loss to follow-up influences association measure (RR)

[Figure: a “population” base (Framinghammer City) yields study enrollees, who are followed over time; loss, death and refusals occur before disease develops; those remaining end as disease cases or non-diseased]

Selection Bias

[Figure: reference population and study sample, each shown as a 2×2 table (Exp / Not Exp by Dis / No Dis). Label: non-reference probabilities of being included in the study within exposure (or disease)]

Example: selection bias (after Szklo & Nieto, 2000)

True reference population

            Disease    No disease
Exp         500        1800
Not Exp     500        7200

OR = 4.0

Unbiased sample with respect to exposure status: 50% of diseased and 10% of not diseased are sampled, but the true reference proportions of “exposed” are kept in each

            D          Not D
Exp         250        180
Not Exp     250        720

OR = 4.0

Biased exposure probability sampling among “diseased” ONLY (60% exposed, not the true 50%) due to a flawed design or strategy

            Dis        Not Dis
Exp         300        180
Not Exp     200        720

OR = 6.0
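The three tables in this example can be reproduced by applying a sampling fraction to each cell of the reference population. The sketch below is illustrative only: the helper names (`odds_ratio`, `sample_table`) are assumptions, and the biased sampling fractions (0.6 for exposed diseased, 0.4 for unexposed diseased) are simply what the slide's counts imply (300/500 and 200/500).

```python
def odds_ratio(a, b, c, d):
    """OR for a 2x2 table.

    a = exposed & diseased,   b = exposed & not diseased,
    c = unexposed & diseased, d = unexposed & not diseased.
    """
    return (a * d) / (b * c)


def sample_table(table, fractions):
    """Apply a per-cell sampling fraction to a 2x2 table (a, b, c, d)."""
    return tuple(count * frac for count, frac in zip(table, fractions))


reference = (500, 1800, 500, 7200)

# Sampling depends on disease status only: 50% of diseased, 10% of not diseased.
unbiased = sample_table(reference, (0.5, 0.1, 0.5, 0.1))

# Among the diseased, sampling also depends on exposure (implied 60% vs 40%).
biased = sample_table(reference, (0.6, 0.1, 0.4, 0.1))

or_reference = odds_ratio(*reference)
or_unbiased = odds_ratio(*unbiased)
or_biased = odds_ratio(*biased)
print(round(or_reference, 2), round(or_unbiased, 2), round(or_biased, 2))
```

The OR survives sampling that depends on disease status alone because those fractions cancel in the cross-product; it changes once the fraction within the diseased also depends on exposure.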

Basic example: Case-control study (after Hernan et al, 2004)

• Is prior HRT use associated with MI?
• Select women with incident MI (cases)
• Select controls from women with high frequency of hip fracture (unintentionally)
• HRT is known to decrease osteoporosis
• Is the HRT–MI association likely to be biased? Why/how?

Hospital-based case-control study: Berkson’s bias (after Schwartzbaum et al, 2003)

• Premise: diseases have different probabilities of hospital admission
  – Pr(brain injury) > Pr(allergic rhinitis)
  – Pr(>2 diseases) > Pr(1 disease)
  – Diseases unassociated in the population could be associated in hospitalized patients
• Then, a risk factor for one disease could appear to be a risk factor for the other

Berkson’s bias/Admission bias (after Sackett, 1979)

                   Gen. Pop.              Hospitalized in last 6 months
                   Bone disease           Bone disease
Resp. disease      Yes       No           Yes       No
Yes                17        207          5         15
No                 184       2376         18        219

                   OR = 1.06              OR = 4.06
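The two odds ratios can be checked directly from the cell counts, and dividing each hospitalized cell by its general-population cell shows where the distortion comes from. This is a rough sketch: it assumes the hospitalized group is a subset of the general-population table, and the cell labels (`resp_only`, `bone_only`) reflect one reading of the flattened table. The OR is the same under either reading.

```python
# Cell counts from the slide (after Sackett, 1979).
general = {"both": 17, "resp_only": 207, "bone_only": 184, "neither": 2376}
hospital = {"both": 5, "resp_only": 15, "bone_only": 18, "neither": 219}


def odds_ratio(table):
    """Cross-product ratio for a 2x2 table stored as a dict of cells."""
    return (table["both"] * table["neither"]) / (
        table["resp_only"] * table["bone_only"]
    )


or_general = odds_ratio(general)
or_hospital = odds_ratio(hospital)

# Fraction of each general-population cell that was hospitalized
# (assumes the hospitalized are a subset of the general-population sample).
hospitalized_fraction = {cell: hospital[cell] / general[cell] for cell in general}

print(round(or_general, 2), round(or_hospital, 2))
for cell, frac in hospitalized_fraction.items():
    print(f"{cell}: {frac:.1%} hospitalized")
```

Under that assumption, people with both conditions were hospitalized at roughly three to four times the rate of the other cells, which is what produces an association among inpatients that is absent in the general population.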

Loss to follow-up: Selection bias in a Cohort study

• Effects of anti-retroviral therapy hx on AIDS risk in HIV+ patients.

• Pts. with more symptoms may drop early
  – Pts. with more therapy side effects may drop
• Restricting analysis to non-drop outs can produce biased result
• Subject drop out is rarely “at random”
  – Statistical missing data strategies

Selection Biases

• Non-response/Missing data bias: characteristics may differ between early, late and nonresponders
  – Missing data proportions differ
  – Analyses restricted to complete data will be biased
  – Non-responders in case-control studies may have different exposure histories

Healthy Worker selection bias

• Do rubber industry workers have excess mortality compared with U.S. population of the same age and sex?
  – SMR = 82 for rubber workers
• General population includes people who are unable to work because of illness
  – All cause death rates are usually higher in the general pop. than among workers
  – Use unexposed workers as a comparison group

Contributors to selection bias

• Choice of comparison group or sampling frame
• Self-selection, volunteers
• Loss to follow-up (cohort)
• Initial non-response
  – primarily case-control studies
• Selective survival
• Differences in disease detection (surveillance or detection bias)

Examples

• Unmasking bias:
  – physicians followed OC users more closely because of use-related cautions and thus detected more thrombophlebitis
  – Frequent visits => more comorbidity
• Prevalent case and Survival bias
  – Smoking and Alzheimer’s disease
  – Among AD cases smokers may have shorter survival than non-smokers

Prevalent case bias

Longer disease duration increases chance of selection

[Figure: time axis with a cross-sectional sample]

Example: volunteer/self-selection

• Leukemia in troops present at atomic test site
  – 76% of all troops were traced
  – of the 76%, 82% were tracked down by investigators
  – of the 76%, 18% contacted investigators on their own initiative
  – 4 leukemia cases were among the 18% and 4 among the 82%
  – Self-referral bias?
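The size of the imbalance can be read off the percentages alone, because the unknown number of traced troops cancels when the two groups are compared. A minimal sketch, using only the figures on the slide (the variable names are illustrative):

```python
# Shares of the traced troops and leukemia case counts, as given on the slide.
share_self_referred = 0.18
share_investigator_traced = 0.82
cases_self_referred = 4
cases_investigator_traced = 4

# Cases per unit share of traced troops; the total number traced cancels
# out of the ratio, so it does not need to be known.
relative_frequency = (cases_self_referred / share_self_referred) / (
    cases_investigator_traced / share_investigator_traced
)
print(f"Leukemia is about {relative_frequency:.1f} times as frequent "
      "among self-referred troops")
```

With only eight cases the ratio is unstable, but it shows why self-referral is a plausible source of selection bias here.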

Information Bias

• Inadequacies and inaccuracies in data collection or measurement

• Common to all subjects?
  – Will reduce observed association
• Different in each comparison group?
  – may exaggerate association

Information Bias

• Systematic errors in obtaining needed exposure (or diagnosis) information
  – non-differential misclassification, “random” error
    • usually biases toward the “null”
  – differential misclassification: different between the study groups
    • may cause estimated effect error in either direction

Example: True classification of family history for a hypothetical disease ‘X’

                      Disease X    No Disease
Positive Family Hx    240          80
No Family Hx          160          320
Total                 400          400

OR = 6.0

Example: Non-differential misclassification
Family Hx accuracy: cases 65%; controls 65%

                Disease X    No X
Family Hx       156          52
No Fam Hx       244          348
Total           400          400

OR = 4.3

Example: Differential misclassification
Family Hx accuracy: cases 85%; controls 25%

                Disease X    No X
Family Hx       204          20
No Family Hx    196          380
Total           400          400

OR = 19.8
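The misclassified tables follow from the true table if "accuracy" is read as the fraction of truly positive family histories that get reported as positive, with truly negative histories always reported correctly. That reading is an assumption, but it reproduces the slide's counts. The function names below are illustrative.

```python
# True family-history counts from the slide: (positive, negative).
TRUE_CASES = (240, 160)
TRUE_CONTROLS = (80, 320)


def misclassify(true_pos, true_neg, accuracy):
    """Report only `accuracy` of true positives as positive.

    Assumes true negatives are never reported as positive.
    """
    reported_pos = true_pos * accuracy
    reported_neg = true_neg + (true_pos - reported_pos)
    return reported_pos, reported_neg


def observed_or(case_accuracy, control_accuracy):
    """Odds ratio after applying recall accuracy to cases and controls."""
    case_pos, case_neg = misclassify(*TRUE_CASES, case_accuracy)
    ctrl_pos, ctrl_neg = misclassify(*TRUE_CONTROLS, control_accuracy)
    return (case_pos * ctrl_neg) / (case_neg * ctrl_pos)


or_true = observed_or(1.0, 1.0)
or_nondifferential = observed_or(0.65, 0.65)
or_differential = observed_or(0.85, 0.25)
print(round(or_true, 1), round(or_nondifferential, 1), round(or_differential, 1))
```

Equal accuracy in both groups pulls the OR toward the null (6.0 to about 4.3), while better recall among cases than controls pushes it away (to about 19.8).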

Cohort study: true classification of persons who hypothetically develop ER (after Koepsell & Weiss, Chapt 10)

                 Esoph. reflux    No esoph. reflux    Total
Chew tobacco     10               990                 1,000
Do not chew      10               9,990               10,000

RR = 10.0

What if only 90% of the true cases were identified due to diagnostic inaccuracy?

                 Esoph. reflux    No esoph. reflux      Total
Chew tobacco     10(0.9) = 9      990 + 1 = 991         1,000
Do not chew      10(0.9) = 9      9,990 + 1 = 9,991     10,000

RR = 10.0

What if 1.0% of the well persons were misdiagnosed as having ER when they did not?

                 Esoph. reflux      No esoph. reflux       Total
Chew tobacco     10 + 10 = 20       990(0.99) = 980        1,000
Do not chew      10 + 100 = 110     9,990(0.99) = 9,890    10,000

RR = 1.82
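The two "what if" scenarios correspond to imperfect sensitivity (missed true cases) and imperfect specificity (well persons labeled as cases), applied equally to both exposure groups. The sketch below assumes that framing. The function name `observed_rr` is illustrative, and it keeps fractional persons rather than rounding to whole counts as the slide does.

```python
# True cohort counts from the slide: (cases, well persons).
COHORTS = {"chew": (10, 990), "no_chew": (10, 9990)}


def observed_rr(sensitivity=1.0, specificity=1.0):
    """Relative risk after non-differential outcome misclassification."""
    risks = {}
    for group, (cases, well) in COHORTS.items():
        # Detected true cases plus well persons falsely labeled as cases.
        observed_cases = cases * sensitivity + well * (1.0 - specificity)
        risks[group] = observed_cases / (cases + well)
    return risks["chew"] / risks["no_chew"]


rr_true = observed_rr()
rr_missed_cases = observed_rr(sensitivity=0.9)
rr_false_positives = observed_rr(specificity=0.99)
print(round(rr_true, 2), round(rr_missed_cases, 2), round(rr_false_positives, 2))
```

Without rounding, the false-positive scenario gives an RR of about 1.81; the slide's 1.82 comes from rounding the cells to 20 and 110 first. Either way, a small loss of specificity in a rare-disease cohort distorts the RR far more than a 10% loss of sensitivity, which leaves it unchanged.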

Information Bias

• Example: MI and smoking
  – smokers with new MI may be less likely to respond to a mailed questionnaire than non-smokers with new MI
  – if the non-response is related to exposure and disease, the potential for bias exists
• Proxy reports of exposure
  – Relationship, proximity influence agreement

Information Biases (after Sackett)

• Diagnostic suspicion bias: knowledge of subject’s prior history influences intensity of diagnostic effort

• Exposure suspicion bias: disease with “known” cause may increase search for that cause

Information Biases (after Sackett)

• Recall bias: cases more (or less) likely to report than controls

• Family information bias: information from a family is stimulated by a new case in the family, and their need to explain why

Exposure–Disease viewed through (after Maclure & Schneeweiss, 2001)

• Background random factors (chance)
• Correlated causes, confounding
• Diagnostic inaccuracy
• Exposure accuracy
• Missing data, database errors
• Group/hypothesis formation
• Case-control selection
• Cohort loss to f/u
• Analysis, modeling, interpretation
• Publication bias
  – Editors and experts

Evaluation of Bias: What would the RR look like if ???

• What is the direction and likely effect if bias is active?
  – IS A TRUE ASSOCIATION MASKED?
  – IS A SPURIOUS ASSOCIATION REPORTED?
• Can the potential for recall bias be estimated?
  – second control group with another illness?

Is Selection Bias Present? (after Grimes and Schulz, Lancet 2002;359:248-52)

• In a cohort study, are participants in the exposed and unexposed groups similar in all respects except for exposure?

• In a case control study, are cases and controls similar in important respects except for the disease in question?

Is Information Bias Present? (after Grimes and Schulz, Lancet 2002;359:248-52)

• In a cohort study, is information about outcome obtained in the same way for those exposed and unexposed?

• In a case control study, is information about exposure gathered in the same way for cases and controls?

Is Confounding Present? (after Grimes and Schulz, Lancet 2002;359:248-52)

• Could the results be accounted for by the presence of another factor (e.g., age, smoking, sexual behavior, diet) associated with the exposure and outcome but not directly in the causal pathway?

• Confounding is the subject of another lecture…

If not bias or confounding, are results due to “chance”? (after Grimes and Schulz, Lancet 2002;359:248-52)

• What is the RR or OR and the 95% confidence interval? Does the CI include 1.0?
• Is the difference (association) statistically significant, and if not, did the study have adequate power to find a clinically important difference (association)?
  – What is the p-value?
  – Is the p-value inflated by multiple comparisons?
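As a rough illustration of the confidence-interval check, the sketch below computes an approximate 95% CI for an odds ratio with the log-based (Woolf) method. It reuses the true family-history counts from the earlier example (240 and 160 among cases, 80 and 320 among controls). The choice of method and the function name are assumptions for illustration, not something the slides specify.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """OR and approximate 95% CI via the log (Woolf) method.

    a = exposed cases,   b = exposed controls,
    c = unexposed cases, d = unexposed controls.
    """
    or_hat = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_hat) - z * se_log_or)
    upper = math.exp(math.log(or_hat) + z * se_log_or)
    return or_hat, lower, upper


or_hat, lower, upper = odds_ratio_ci(240, 80, 160, 320)
includes_null = lower <= 1.0 <= upper
print(f"OR = {or_hat:.1f}, 95% CI {lower:.2f} to {upper:.2f}, "
      f"includes 1.0: {includes_null}")
```

A CI that excludes 1.0 addresses chance only; it says nothing about whether selection or information bias produced the estimate.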

Bias and study designs: Important sources

• Case-control
  – Knowledge of disease status may influence determination of exposure status
  – Knowledge of exposure status influenced the subjects selected
  – Recall bias
• Cohort
  – loss to follow-up; differential misdiagnosis
  – Information bias

Epidemiologic Reasoning

• Use the tools, statistics and calculations
• Use knowledge of biology, behavior and disease pathogenesis
• Make educated guesses about effect of bias and confounding to guide study design and analysis and eliminate untoward effects

• Try to make causal inferences

Conclusion

• What sources of Bias are common to which study designs?

• How can we evaluate bias?
• “Sensitivity analysis”: “What if….”
• Confounding may still impact results even if bias is eliminated, but it can be dealt with in analysis.
