Effect Modification & Confounding
Kostas Danis
EPIET Introductory course,
Menorca 2012
Analytical epidemiology
Study design: cohort, case-control and cross-sectional studies
Choice of a reference group
Biases
Impact
Causal inference
Stratification: effect modification and confounding
Matching
Multivariable analysis
Cohort studies marching towards outcomes
Cohort study
              Cases   Non-cases   Total   Risk %
Exposed         50        50       100     50 %
Not exposed     10        90       100     10 %
Risk ratio = 50% / 10% = 5
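The risk ratio arithmetic in the cohort table can be sketched in a few lines of Python. This is only an illustration; the function name `risk_ratio` and its argument names are assumptions made for this example, not part of the course material.

```python
def risk_ratio(cases_exp, total_exp, cases_unexp, total_unexp):
    """Risk ratio from cohort counts: risk in exposed / risk in unexposed."""
    risk_exp = cases_exp / total_exp        # 50 / 100 in the slide
    risk_unexp = cases_unexp / total_unexp  # 10 / 100 in the slide
    return risk_exp / risk_unexp


# Counts from the cohort study table above.
rr = risk_ratio(50, 100, 10, 100)
print(f"Risk ratio = {rr:.1f}")  # prints "Risk ratio = 5.0"
```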
[Figure: source population with exposed and unexposed cases; controls drawn as a sample]
Controls: sample of the denominator, representative with regard to exposure
Controls are non-cases
Low attack rate: non-cases likely to represent exposure in the source population
High attack rate: non-cases unlikely to represent exposure in the source population
[Figure: cases and non-cases in the source population, from start to end]
Case-control study
                   Cases   Controls
Exposed              a        b
Not exposed          c        d
Total               a+c      b+d
Odds of exposure    a/c      b/d
OR = (a/c) / (b/d) = ad / bc
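As a small worked example of OR = ad / bc, the sketch below applies the formula to the counts from the asbestos and lung cancer slide later in this transcript (693 and 320 exposed, 307 and 680 unexposed). The function name `odds_ratio` is an assumption made for this illustration.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 case-control table.

    a = exposed cases, b = exposed controls,
    c = unexposed cases, d = unexposed controls.
    """
    return (a * d) / (b * c)


# Asbestos and lung cancer, unstratified data.
print(f"OR = {odds_ratio(693, 320, 307, 680):.1f}")  # prints "OR = 4.8"
```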
Who are the right controls?
Controls may not be easy to find
Cross-sectional study: sampling
Target population, sampling population, sample
Cross-sectional study
              Cases   Non-cases   Total   Prevalence %
Exposed        500       500      1,000      50 %
Not exposed    100       900      1,000      10 %
Prevalence ratio (PR) = 50% / 10% = 5
Should I believe my measurement?
Exposure    Outcome    (RR = 4)
Chance? Bias? Confounding?
True association: causal or non-causal
Exposure    Outcome
Third variable
Two main complications
(1) Effect modifier: useful information
(2) Confounding factor: bias
To analyse effect modification and to eliminate confounding
Solution = stratification (stratified analysis)
Create strata according to categories inside the range of values taken by the third variable
Effect modification
Variation in the magnitude of measure of effect across levels of a third variable.
Effect modifier
Happens when RR or OR is different between strata (subgroups of population)
Effect modifier
To identify a subgroup with a lower or higher risk ratio
To target public health action
To study interaction between risk factors
Effect modification
Factor A (asbestos)
Disease (lung cancer)
Factor B (smoking)
Effect modifier = Interaction
Asbestos (As) and lung cancer (Ca)
Case-control study, unstratified data
As       Cases (Ca)   Controls   OR
Yes         693          320     4.8
No          307          680     Ref.
Total      1000         1000
Asbestos Lung cancer
Smoking
Asbestos (As), smoking and lung cancer (Ca)
As     Smoking   Cases   Controls   OR
Yes    Yes        517      160      8.9
Yes    No         176      160      3.0
No     Yes        183      340      1.5
No     No         124      340      Ref.
1.5 * 3.0 = 4.5 < 8.9, so 1.5 * 3.0 * interaction = 8.9
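The relation 1.5 * 3.0 * interaction = 8.9 can be solved for the interaction term. A minimal sketch, assuming interaction is measured on the multiplicative scale as the slide implies; the function name is hypothetical.

```python
def multiplicative_interaction(or_both, or_a_only, or_b_only):
    """Ratio of the joint OR to the product of the two single-exposure ORs.

    A value of 1 means the joint effect equals the product of the separate
    effects; values above 1 mean the joint effect is stronger than that.
    """
    return or_both / (or_a_only * or_b_only)


# Asbestos and smoking: joint OR 8.9, asbestos only 3.0, smoking only 1.5.
factor = multiplicative_interaction(8.9, 3.0, 1.5)
print(f"interaction factor = {factor:.2f}")
```

A factor close to 2 here says that the combined exposure roughly doubles what the two separate odds ratios would predict together.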
Physical activity and MI
Physical activity    Infarction
Gender
Vaccine efficacy
VE = (ARU - ARV) / ARU
VE = 1 - RR
(ARU: attack rate in the unvaccinated; ARV: attack rate in the vaccinated)
Vaccine efficacy
Status   Pop.       Cases   Cases per 1000   RR
V        301 545     150        0.49         0.28
NV       298 655     515        1.72         Ref.
Total    600 200     665        1.11
VE = 1 - RR = 1 - 0.28
VE = 72%
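The vaccine efficacy calculation can be reproduced from the raw counts in the table. This is a sketch under the definition VE = (ARU - ARV) / ARU; the function name is an assumption. Working from unrounded attack rates gives a VE of about 71%, slightly below the 72% on the slide, which starts from the RR already rounded to 0.28.

```python
def vaccine_efficacy(cases_vacc, pop_vacc, cases_unvacc, pop_unvacc):
    """Vaccine efficacy as a proportion: (ARU - ARV) / ARU = 1 - RR."""
    arv = cases_vacc / pop_vacc      # attack rate in the vaccinated
    aru = cases_unvacc / pop_unvacc  # attack rate in the unvaccinated
    return (aru - arv) / aru


# Counts from the vaccine efficacy table above.
ve = vaccine_efficacy(150, 301_545, 515, 298_655)
print(f"VE = {ve:.1%}")
```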
Vaccine Disease
Age
Vaccine efficacy by age group
Effect modification
Different effects (RR) in different strata (age groups)
VE is modified by age
Test for homogeneity among strata (Woolf test)
Any statistical test to help us?
• Breslow-Day
• Woolf test
• Test for trends: Chi square
Homogeneity
How to conduct a stratified analysis?
Crude analysis
Stratified analysis
1. Do stratum-specific estimates look different?
2. Do the 95% CIs of the OR/RR NOT overlap?
3. Is the test of homogeneity significant?
YES: effect modification (report estimates by stratum)
NO: check for confounding (compare crude RR/OR with MH RR/OR)
Stratified analysis: Effect Modification
ORs / RRs different across strata
• 95% CIs of ORs / RRs do not overlap: effect modification
• 95% CIs of ORs / RRs do overlap: use Woolf's test
  – Woolf's test significant: effect modification
  – Woolf's test not significant: effect modification unlikely; discuss lack of power of Woolf's test
Diarrhea Controls OR (95% CI)
No breast feeding 120 136 3.6 (2.4-5.5)
Breast feeding 50 204 Ref
Death from diarrhea according to breast feeding, Brazil, 1980s
(Crude analysis)
No breast feeding    Diarrhoea
Age
Infants < 1 month of age
Cases Controls OR (95% CI)
No breast feeding 10 3 32 (6-203)
Breast feeding 7 68 Ref
Infants ≥ 1 month of age
Cases Controls OR (95% CI)
No breast feeding 110 133 2.6 (1.7-4.1)
Breast feeding 43 136 Ref
Death from diarrhea according to breast feeding, Brazil, 1980s
Woolf test (test of homogeneity):p=0.03
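A test of homogeneity of this kind can be sketched as follows. This is one common textbook form of Woolf's test (inverse-variance weighted log odds ratios compared with their pooled value); it is an assumption that it matches the variant used for the slide, so it is not guaranteed to reproduce the p-value quoted above. The function names are hypothetical, and the p-value helper only covers the two-stratum case (1 degree of freedom).

```python
import math


def woolf_homogeneity(strata):
    """Woolf chi-square statistic for homogeneity of odds ratios.

    strata: list of (a, b, c, d) tuples, one per stratum, where
    a = exposed cases, b = exposed controls,
    c = unexposed cases, d = unexposed controls.
    Assumes no cell is zero (the log OR and its variance need all four).
    """
    log_ors, weights = [], []
    for a, b, c, d in strata:
        log_ors.append(math.log((a * d) / (b * c)))
        # Weight = inverse of the variance of the log odds ratio.
        weights.append(1.0 / (1 / a + 1 / b + 1 / c + 1 / d))
    pooled = sum(w * l for w, l in zip(weights, log_ors)) / sum(weights)
    chi2 = sum(w * (l - pooled) ** 2 for w, l in zip(weights, log_ors))
    return chi2, len(strata) - 1  # statistic and degrees of freedom


def chi2_pvalue_df1(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Breast feeding and death from diarrhoea, by age stratum (tables above).
strata = [(10, 3, 7, 68), (110, 133, 43, 136)]
chi2, df = woolf_homogeneity(strata)
print(f"chi2 = {chi2:.2f}, df = {df}, p = {chi2_pvalue_df1(chi2):.4f}")
```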
Risk of gastroenteritis by exposure, Outbreak X, Place, time X (crude analysis)
             Exposed: Yes        Exposed: No
Exposure     n     AR (%)*       n     AR (%)*     RR† (95% CI‡)
pasta        94    77            7     4.2         18.0 (8.8-38)
tuna         49    68            49    24          2.9 (2.1-3.8)
* AR = attack rate
† RR = risk ratio
‡ 95% CI = 95% confidence interval of the RR
Tuna gastroenteritis
Pasta
Risk of gastroenteritis by exposure, Outbreak X, Place, time X (stratified analysis)
Pasta Yes
           Cases   Total   AR (%)   RR (95% CI)
Tuna         43      52      83     1.1 (0.9-1.3)
No tuna      46      60      77     Ref
Pasta No
           Cases   Total   AR (%)   RR (95% CI)
Tuna          4      17      24     11 (2.6-46)
No tuna       3     144       2     Ref
Woolf test (test of homogeneity): p=0.0007
Tuna, pasta and gastroenteritis
Tuna Pasta Cases AR(%) RR
Yes Yes 43 83 42
Yes No 4 23 12
No Yes 46 76 38
No No 3 2 Ref.
38 * 12 = 456 > 42, so 38 * 12 * interaction = 42 (interaction below 1)
Risk of HIV by injecting drug use (idu), surveillance data, Spain, 1988-2004
Cases Total AR (%) RR (95% CI)
Idu 268 2,732 9.8 3.9 (3.3-4.4)
No idu 484 18,822 2.5 Ref
idu hiv
gender
Risk of HIV by injecting drug use (idu), Spain, 1988-2004 (stratified analysis)
Males
          Cases   Total    AR (%)   RR (95% CI)
idu         86      693     12      20 (14-28)
No idu      52    8,306      0.6    Ref
Females
          Cases   Total    AR (%)   RR (95% CI)
idu        182    2,039      8.9    2.3 (1.9-2.6)
No idu     432   10,576      4.1    Ref
Woolf test (test of homogeneity): p=0.00000
Idu, gender and hiv
Idu Male Cases AR(%) RR
Yes Yes 86 12.4 3.0
Yes No 182 8.9 2.2
No Yes 52 0.6 0.14
No No 432 4.1 Ref.
0.14 * 2.2 = 0.31 < 3.0, so 0.14 * 2.2 * interaction = 3.0 (interaction above 1)
Confounding
Confounding
Distortion of measure of effect because of a third factor
Should be prevented
Needs to be controlled for
Confounding
Age
Skate-boarding    Chlamydia
Age not evenly distributed between the 2 exposure groups
- Skate-boarders: 90% young
- Non skate-boarders: 20% young
Exposure (coffee)    Outcome (lung cancer)
Third variable (smoking)
Grey hair    Stroke
Age
[Figure: Cases of Down syndrome by birth order. x-axis: birth order (1 to 5); y-axis: cases per 100 000 live births]
[Figure: Cases of Down syndrome by age group. x-axis: age groups (< 20, 20-24, 25-29, 30-34, 35-39, 40+); y-axis: cases per 100 000 live births]
Birth order    Down syndrome
Age of mother
[Figure: Cases of Down syndrome by birth order and mother's age. x-axis: birth order (1 to 5); y-axis: cases per 100 000]
Confounding
Exposure Outcome
Third variable
To be a confounding factor, 2 conditions must be met:
Be associated with exposure - without being the consequence of exposure
Be associated with outcome - independently of exposure
Exposure: Hypercholesterolaemia    Outcome: Myocardial infarction
Third factor: Atheroma
Any factor which is a necessary step in the causal chain is not a confounder
Salt Myocardial infarction
Hypertension
The nuisance introduced by confounding factors
• May simulate an association
• May hide an association that does exist
• May alter the strength of the association
  – Increased
  – Decreased
Confounding factor
Ethnicity Pneumonia
Crowding
Apparent association
Crowding Pneumonia
Malnutrition
Altered strength of association
How to prevent/control confounding?
Prevention
  – Randomization (experiment)
  – Restriction to one stratum
  – Matching
Control
  – Stratified analysis
  – Multivariable analysis
Are Mercedes more dangerous than Porsches?
Type Total Accidents AR % RR
Porsche 1 000 300 30 1.5
Mercedes 1 000 200 20 Ref.
Total 2 000 500 25
95% CI = 1.3 - 1.8
Car type    Accidents
Confounding factor: Age of driver
Crude RR = 1.5
Adjusted RR = 1.1 (0.94 - 1.27)
Incidence of malaria according to the presence of a radio set,
Kahinbhi Pradesh
Crude data Malaria Total AR% RR
Radio set 80 520 15 0.7
No radio 220 1080 20 Ref
RR: 0.7; 95% CI: 0.6 - 0.9; p < 0.02
Radio    Malaria
Confounding factor: Mosquito net
Crude RR = 0.7
Adjusted RR = 1.01
To identify confounding
Compare the crude measure of effect (RR or OR) to the adjusted (weighted) measure of effect (Mantel-Haenszel RR or OR)
A difference of more than 10 - 20 % between the two suggests confounding
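The crude-versus-adjusted comparison can be written as a relative change. A minimal sketch: the formula (adjusted - crude) / crude follows the "Adjusted/crude relative change" line in the Stata output later in this transcript, while the function names and the default 10% cut-off (the lower end of the 10 - 20 % rule of thumb) are assumptions for illustration.

```python
def relative_change_pct(crude, adjusted):
    """Relative change of the adjusted estimate versus the crude one, in %."""
    return (adjusted - crude) / crude * 100.0


def suggests_confounding(crude, adjusted, threshold_pct=10.0):
    """True when the estimate moves by more than threshold_pct (assumed 10%)."""
    return abs(relative_change_pct(crude, adjusted)) > threshold_pct


# Car type and accidents: crude RR 1.5, adjusted for age of driver 1.1.
print(f"cars:  {relative_change_pct(1.5, 1.1):+.1f} %")
# Radio set and malaria: crude RR 0.7, adjusted for mosquito net 1.01.
print(f"radio: {relative_change_pct(0.7, 1.01):+.1f} %")
```

Both examples move by well over 20%, which is why the slides treat age of driver and mosquito nets as confounders.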
Any statistical test to help us?
When is ORMH different from crude OR ?
Mantel-Haenszel summary measure
Adjusted or weighted RR or OR
Advantages of MH
• Zeroes allowed
OR MH = SUM (ai di / ni) / SUM (bi ci / ni)
Mantel-Haenszel summary measure
• Mantel-Haenszel (adjusted or weighted) OR
OR MH = SUM (ai di / ni) / SUM (bi ci / ni)
Stratum 1 (total n1)
         Cases   Controls
Exp+      a1       b1
Exp-      c1       d1
Stratum 2 (total n2)
         Cases   Controls
Exp+      a2       b2
Exp-      c2       d2
OR MH = [(a1 x d1) / n1 + (a2 x d2) / n2] / [(b1 x c1) / n1 + (b2 x c2) / n2]
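The Mantel-Haenszel formula translates directly into code. The sketch below applies it to the two season strata from the egg storage table in the back-up slides (summer: 12, 2, 52, 64; other seasons: 7, 3, 32, 36); the function name `mh_odds_ratio` is an assumption made for this example.

```python
def mh_odds_ratio(strata):
    """Mantel-Haenszel odds ratio: SUM(ai*di/ni) / SUM(bi*ci/ni).

    strata: list of (a, b, c, d) tuples, one per stratum, where
    a = exposed cases, b = exposed controls,
    c = unexposed cases, d = unexposed controls.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d  # stratum total
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator


# Egg storage >= 2 weeks and Salmonella enteritidis, stratified by season.
season_strata = [(12, 2, 52, 64), (7, 3, 32, 36)]
or_mh = mh_odds_ratio(season_strata)
print(f"OR MH = {or_mh:.1f}")  # prints "OR MH = 4.5"
```

Here the adjusted value equals the crude OR of 4.5 reported for all seasons, so on these data season does not act as a confounder. Because each stratum contributes a sum rather than its own ratio, a stratum with a zero cell can still be included.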
How to conduct a stratified analysis?
Crude analysis
Stratified analysis
1. Do stratum-specific estimates look different?
2. Do the 95% CIs of the OR/RR NOT overlap?
3. Is the test of homogeneity significant?
YES: effect modification (report estimates by stratum)
NO: check for confounding (compare crude RR/OR with MH RR/OR)
Risk of gastroenteritis by exposure, Outbreak X, Place, time X (crude analysis)
. cstable case pesto pasta
                  Exposed                 Unexposed
Exposure    Total  Cases   AR%      Total  Cases   AR%     Risk Ratio [95% CI]     P
pasta        121     94   77.69      165      7    4.24    18.31 [8.81-38.04]    0.000
pesto         79     45   56.96      212     58   27.36     2.08 [1.56-2.79]     0.000
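The risk ratios and confidence intervals in the cstable output can be approximated by hand. A minimal sketch, assuming the usual log-based (Wald) interval for a risk ratio; whether Stata's cstable uses exactly this method is an assumption, although the numbers agree to two decimals for these rows. The function name is hypothetical.

```python
import math


def risk_ratio_ci(cases_exp, total_exp, cases_unexp, total_unexp, z=1.96):
    """Risk ratio with an approximate 95% CI computed on the log scale."""
    rr = (cases_exp / total_exp) / (cases_unexp / total_unexp)
    # Standard error of log(RR) for cumulative-incidence data.
    se_log = math.sqrt(
        1 / cases_exp - 1 / total_exp + 1 / cases_unexp - 1 / total_unexp
    )
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


# Rows of the crude analysis: (cases, total) exposed, then unexposed.
for name, counts in {"pasta": (94, 121, 7, 165), "pesto": (45, 79, 58, 212)}.items():
    rr, lo, hi = risk_ratio_ci(*counts)
    print(f"{name}: RR = {rr:.2f} [{lo:.2f}-{hi:.2f}]")
```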
. csinter case pesto, by(pasta)
pasta = Exposed
pesto        Total   Cases   Risk %
Exposed        56      43    76.79
UnExposed      65      51    78.46
Risk difference    -0.02 [-0.17-0.13]
Risk Ratio          0.98 [0.81-1.19]
Attrib.risk.exp     0.02 [-0.19-0.19]
Attrib.risk.pop     0.01 [.-.]
pasta = Unexposed
pesto        Total   Cases   Risk %
Exposed        20       1     5.00
UnExposed     145       6     4.14
Risk difference     0.01 [-0.09-0.11]
Risk Ratio          1.21 [0.15-9.53]
Attrib.risk.exp     0.17 [-5.52-0.90]
Attrib.risk.pop     0.02 [.-.]
Test of Homogeneity (M-H) : pvalue : 0.8366301
Crude RR for pesto : 2.08 [1.56-2.79]
MH RR for pesto adjusted for pasta : 0.99 [0.81-1.20]
Adjusted/crude relative change : -52.67 %
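The adjusted figure in the csinter output can be checked with the Mantel-Haenszel risk ratio, RR MH = SUM(ai x N0i / ni) / SUM(ci x N1i / ni), where ai and ci are exposed and unexposed cases and N1i and N0i the exposed and unexposed totals in stratum i. This is a sketch of the standard formula; the function name is an assumption, and it does not attempt the confidence interval.

```python
def mh_risk_ratio(strata):
    """Mantel-Haenszel risk ratio for cumulative-incidence data.

    strata: list of (cases_exp, total_exp, cases_unexp, total_unexp).
    """
    numerator = 0.0
    denominator = 0.0
    for a, n1, c, n0 in strata:
        n = n1 + n0  # stratum total
        numerator += a * n0 / n
        denominator += c * n1 / n
    return numerator / denominator


# Pesto and gastroenteritis, stratified by pasta (csinter output above).
pasta_strata = [(43, 56, 51, 65), (1, 20, 6, 145)]
rr_mh = mh_risk_ratio(pasta_strata)

# Crude RR for pesto from the crude analysis (45/79 versus 58/212).
rr_crude = (45 / 79) / (58 / 212)
change = (rr_mh - rr_crude) / rr_crude * 100.0

print(f"MH RR = {rr_mh:.2f}, crude RR = {rr_crude:.2f}, change = {change:.2f} %")
```

The stratum-specific risk ratios are both close to 1 and the adjusted value drops by about half relative to the crude one, the pattern the slides describe as confounding (here by pasta).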
Stratified Analysis
> 10-20%
Examples of stratified analysis
Effect modifier
• Belongs to nature
• Different effects in different strata
• Simple
• Useful
• Increases knowledge of biological mechanism
• Allows targeting of PH action
Confounding factor
• Belongs to study
• Weighted RR different from crude RR
• Distortion of effect
• Creates confusion in data
• Prevent (protocol)
• Control (analysis)
Analyzing a third factor
Examine crude OR / RR
Examine ORs / RRs in each stratum
• Different ORs / RRs across strata: effect modification
  – Stop the analysis. DO NOT adjust!
  – Report MULTIPLE ORs / RRs, one for each stratum
• Identical ORs / RRs across strata
  – Strata ORs / RRs different from crude (crude value does not fall between strata): confounding factor
    Adjust using the M-H technique to eliminate the confounding
    Report ONE adjusted OR / RR
  – Strata ORs / RRs similar to crude (crude value falls between strata): third factor does not play a role
    Report ONE crude OR / RR
How to conduct a stratified analysis
Perform crude analysis
Measure the strength of association
List potential effect modifiers and confounders
Stratify data according to potential modifiers or confounders
Check for effect modification
If effect modification present, show the data by stratum
If no effect modification present, check for confounding
If confounding, show adjusted data
If no confounding, show crude data
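The decision sequence above can be sketched as a small function. This is a simplification, not the course's procedure in full: it looks only at the homogeneity test p-value and the crude-versus-adjusted change, whereas the slides also ask whether stratum-specific estimates look different and whether their confidence intervals overlap. The function name, the 0.05 significance level and the 10% cut-off are assumptions.

```python
def interpret_third_factor(homogeneity_p, crude, adjusted=None,
                           alpha=0.05, threshold_pct=10.0):
    """Suggest what to report after stratifying on a third factor."""
    # Step 1: effect modification takes priority; never adjust in that case.
    if homogeneity_p < alpha:
        return "effect modification: report stratum-specific estimates"
    if adjusted is None:
        raise ValueError("an adjusted (MH) estimate is needed to check confounding")
    # Step 2: compare the crude with the adjusted (Mantel-Haenszel) estimate.
    change_pct = abs(adjusted - crude) / crude * 100.0
    if change_pct > threshold_pct:
        return "confounding: report the adjusted estimate"
    return "no confounding: report the crude estimate"


# Pesto adjusted for pasta: homogeneity p = 0.84, crude RR 2.08, MH RR 0.99.
print(interpret_third_factor(0.8366301, 2.08, 0.99))
# Breast feeding stratified by age: homogeneity p = 0.03.
print(interpret_third_factor(0.03, 3.6))
```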
How to define the strata?
• Strata defined according to third variable:
  – ‘Usual’ confounders (e.g. age, sex, socio-economic status)
  – Any other suspected confounder, effect modifier or additional risk factor
  – Stratum of public health interest
• For two risk factors:
  – stratify on one to study the effect of the second on outcome
• Two or more exposure categories:
  – each is a stratum
• Residual confounding?
Logical order of data analysis
How to deal with multiple risk factors:
Crude analysis
Multivariable analysis
1. stratified analysis
2. modelling
linear regression
logistic regression
Multivariate analysis
• Mathematical model
• Simultaneous adjustment of all confounding and risk factors
• Can address effect modification
A train can mask a second train
A variable can mask another variable
Back-up slides
Risk factors for Salmonella enteritidis infections, France, 1995
Delarocque-Astagneau et al Epidemiol. Infect 1998:121:561-7
Cases of Salmonella enteritidis gastroenteritis according to egg storage and season
Season          Duration of storage   Cases   Controls   OR (95% CI)
Summer          >= 2 weeks              12        2      7.4 (1.5-69.9)
                < 2 weeks               52       64
Other seasons   >= 2 weeks               7        3      2.6 (0.5-16.8)
                < 2 weeks               32       36
All seasons     >= 2 weeks              19        5      4.5 (1.5-16.1)
                < 2 weeks               84      100
Duration of storage    Salmonellosis
Season
Cases of Salmonella enteritidis gastroenteritis according to egg storage and season
Summer (A)   “Long” storage (B)   Cases   Controls   OR
Yes          Yes                    12        2      ORAB 6.8
Yes          No                     52       64      ORA 0.9
No           Yes                     7        3      ORB 2.6
No           No                     32       36      Ref
Advantages & Disadvantages of Stratified Analysis
• Advantages
  – straightforward to implement and comprehend
  – easy way to evaluate interaction
• Disadvantages
  – only one exposure-disease association at a time
  – requires continuous variables to be grouped
    • Loss of information; possible “residual confounding”
  – deteriorates with multiple confounders
    • e.g. suppose 4 confounders with 3 levels
      – 3x3x3x3=81 strata needed
      – unless huge sample, many cells have “0” and strata have undefined effect measures