Measuring covariate data_Presentation (November 14, 2007) 1
Measuring covariate data in subsets of study populations:
Design options
Jean-François Boivin, MD, ScD
McGill University
19 August 2007
2
3
16th International Conference on Pharmacoepidemiology
Barcelona 2000
4
What about missing covariate data?
5
Option #1
Do not research that topic
6
Option #2
• Conduct study without covariates
• Scientifically reasonable for certain questions
• Example: Sharpe et al. 2000
7
British Journal of Cancer 2002
The effects of tricyclic antidepressants on breast cancer risk
• Genotoxicity in Drosophila
• Comparison of antidepressants:
  – 6 genotoxic vs 4 nongenotoxic
• Confounding unlikely
8
Option #3
“Confounding by other determinants was studied in analyses with data obtained by interviewing samples of subjects…”
9
List 4 - 6 different sampling strategies:
“Confounding by other determinants was studied in analyses with data obtained by interviewing samples of subjects…”
a) ?
b) ?
c) ?
d) ?
10
11
12
Two-stage sampling
13
Entire population (=truth)

                    E+        E-
Obese       D+      2,000      40       OR=0.5
            D-     10,000     100
            Total  12,000     140
Not obese   D+        200     400       OR=0.5
            D-     10,000  10,000
            Total  10,200  10,400
All         D+      2,200     440       OR=2.5
            D-     20,000  10,100
            Total  22,200  10,540       32,740
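The odds ratios on this slide can be checked by hand; a minimal sketch follows. The counts are the slide's own. The Mantel-Haenszel summary odds ratio is not mentioned on the slide and is added here only as one standard way to combine the strata; the function names are illustrative.

```python
# 2x2 tables copied from the slide, as (a, b, c, d) =
# (exposed cases, unexposed cases, exposed non-cases, unexposed non-cases).
strata = {
    "obese": (2000, 40, 10000, 100),
    "not obese": (200, 400, 10000, 10000),
}


def odds_ratio(a, b, c, d):
    """Cross-product odds ratio of a single 2x2 table."""
    return (a * d) / (b * c)


def collapse(tables):
    """Add the tables cell by cell to obtain the crude ("All") table."""
    return tuple(sum(cells) for cells in zip(*tables))


def mantel_haenszel_or(tables):
    """Mantel-Haenszel summary OR (an addition, not stated on the slide)."""
    tables = list(tables)
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den


stratum_ors = {name: odds_ratio(*t) for name, t in strata.items()}
crude_table = collapse(strata.values())
crude_or = odds_ratio(*crude_table)
adjusted_or = mantel_haenszel_or(strata.values())

print(stratum_ors)            # stratum-specific odds ratios
print(round(crude_or, 3))     # crude odds ratio from the collapsed table
print(round(adjusted_or, 3))  # summary odds ratio across strata
```

Both stratum-specific odds ratios are 0.5 while the crude odds ratio is about 2.5, which is the confounding by obesity that the slide illustrates.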
14
                    E+        E-
Obese       D+      not available
            D-      not available
Not obese   D+      not available
            D-      not available
All         D+      2,200     440       computerized databases
            D-     20,000  10,100
            Total  22,200  10,540
15
Two-stage sampling
16
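Two-stage sampling

Second-stage sample: 250 subjects from each cell of the "All" table
(250/2,200, 250/440, 250/20,000, 250/10,100; total population 32,740)

                    E+     E-
Obese       D+     227     23       OR1 biased
            D-     125      2
Not obese   D+      23    227       OR2 biased
            D-     125    248
All         D+     250    250       (250 x 250)/(250 x 250) = 1
            D-     250    250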
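Why the second-stage odds ratios are biased, and how the known first-stage totals can correct them, can be sketched as follows. This assumes the simplest correction, weighting each second-stage count by the inverse of its sampling fraction; the methods in the papers cited for the statistical analysis may differ. Expected (unrounded) second-stage counts are used, so the naive odds ratios differ slightly from those implied by the rounded counts on the slide.

```python
# Full-population counts from the "Entire population (=truth)" slide,
# keyed by (stratum, disease status, exposure status).
population = {
    ("obese", "D+", "E+"): 2000, ("obese", "D+", "E-"): 40,
    ("obese", "D-", "E+"): 10000, ("obese", "D-", "E-"): 100,
    ("not obese", "D+", "E+"): 200, ("not obese", "D+", "E-"): 400,
    ("not obese", "D-", "E+"): 10000, ("not obese", "D-", "E-"): 10000,
}
SECOND_STAGE_SIZE = 250  # subjects sampled from each disease-by-exposure cell


def odds_ratio(table, stratum):
    """Cross-product odds ratio within one stratum of a keyed table."""
    a = table[(stratum, "D+", "E+")]
    b = table[(stratum, "D+", "E-")]
    c = table[(stratum, "D-", "E+")]
    d = table[(stratum, "D-", "E-")]
    return (a * d) / (b * c)


# Stage 1: disease-by-exposure totals, known for everyone (the "All" table).
cell_totals = {}
for (stratum, disease, exposure), n in population.items():
    key = (disease, exposure)
    cell_totals[key] = cell_totals.get(key, 0) + n

# Stage 2: expected counts when 250 subjects are drawn from each cell.
sample = {
    key: SECOND_STAGE_SIZE * n / cell_totals[(key[1], key[2])]
    for key, n in population.items()
}

# Correction: weight each sampled count by 1 / (sampling fraction of its cell).
weighted = {
    key: n * cell_totals[(key[1], key[2])] / SECOND_STAGE_SIZE
    for key, n in sample.items()
}

naive_or = {s: odds_ratio(sample, s) for s in ("obese", "not obese")}
weighted_or = {s: odds_ratio(weighted, s) for s in ("obese", "not obese")}

print(naive_or)     # biased: well below the true value of 0.5
print(weighted_or)  # recovers 0.5 in both strata
```

Because expected counts are used, the weighting recovers the truth exactly; in a real second-stage sample it would do so only up to sampling error.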
17
White. AJE 1982
Walker. Biometrics 1982
Cain, Breslow. AJE 1988
Weinberg, Wacholder. Biometrics 1990
Weinberg, Sandler. AJE 1991
Statistical analysis; further design issues
18
19
Option 1: No study
Option 2: No covariate measurement
Option 3: 2-stage sampling
Option 4: Case only measurement
20
Ray et al. Archives of Internal Medicine 1991
21
Cyclic antidepressants and the risk of hip fracture
22
Confounding: Quick review

                  E+          E-
Obese             N1=1,000    N2=1,000    RR=0.5
Not obese         N3=1,000    N4=1,000    RR=0.5
All                                       RR=0.5

cross-product ratio =1
23
Case-control study

                  E+       E-
Obese             500      1,500     OR=0.5
Not obese         1,000    3,000     OR=0.5
All                                  OR=0.5

cross-product ratio =1
24
Cyclic antidepressants and the risk of hip fracture
25
                    E+        E-
Obese       D+      2,000     400       medical record review
            D-          ?       ?
Not obese   D+        200      40       medical record review
            D-          ?       ?
All         D+      2,200     440       computerized database
            D-     20,000  10,100
            Total  22,200  10,540
Covariate data on cases only
26
                    E+        E-
Obese       D+      2,000     400       OR1
            D-          ?       ?
Not obese   D+        200      40       OR2
            D-          ?       ?
All         D+      2,200     440
            D-     20,000  10,100
            Total  22,200  10,540

• Assume OR1 = OR2
• Then: cross-product ratio among cases, (2,000 x 40)/(400 x 200) = 1, implies no confounding
Covariate data on cases only
27
What if confounding seems to be present?
Extensions
28
29
Option 1: No study
Option 2: No covariate measurement
Option 3: 2-stage sampling
Option 4: Case only measurements
Suissa, Edwardes. 1997
30
Confounder data on cases only
                    E+        E-
Obese       D+      2,000     220
            D-          ?       ?
Not obese   D+        200     220
            D-          ?       ?

Cross-product ratio =10
Confounding plausible
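The case-only check uses nothing but the four case counts. A minimal sketch, applied to the two sets of case counts shown in the slides (the function and variable names are illustrative):

```python
def case_only_cross_product(obese_exposed, obese_unexposed,
                            nonobese_exposed, nonobese_unexposed):
    """Cross-product ratio of exposure by covariate, computed among cases only."""
    return (obese_exposed * nonobese_unexposed) / (obese_unexposed * nonobese_exposed)


# Cases from the hip fracture slides: obese 2,000 / 400, not obese 200 / 40.
no_confounding_example = case_only_cross_product(2000, 400, 200, 40)

# Cases from this slide: obese 2,000 / 220, not obese 200 / 220.
confounding_example = case_only_cross_product(2000, 220, 200, 220)

print(no_confounding_example)  # ratio of 1: no confounding, given OR1 = OR2
print(confounding_example)     # ratio far from 1: confounding plausible
```

The interpretation depends on the assumption stated earlier in the slides that the stratum-specific odds ratios are equal (OR1 = OR2).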
31
Epidemiology 1997
• Extensions of Ray’s method to presence of confounding
• Requires additional data from external sources
32
Exposure: Theophylline

                    E+      E-      Total
Smoker      D+      14       5      19
            D-                      24% of 4,080
Nonsmoker   D+       3       8      11
            D-                      76% of 4,080
All         D+      17      13      30
            D-     956   3,154      4,080

Smoking distribution (24% / 76%) obtained from population survey
Confounding; no interaction
33
• Extensions of Ray’s method to presence of interaction
• Requires further additional data from external sources
Suissa, Edwardes. 1997
34
No interaction

                    E+        E-
Obese       D+      2,000      40       OR=0.5
            D-     10,000     100
            Total  12,000     140
Not obese   D+        200     400       OR=0.5
            D-     10,000  10,000
            Total  10,200  10,400
35
Option 1: No study
Option 2: No covariate measurement
Option 3: 2-stage sampling
Option 4: Case only measurements
Suissa, Edwardes. 1997
Others:
• Multi-stage sampling
• Partial questionnaires
• Propensity score adjustments
36
37
38
Monotone missingness
39
Wacholder S, et al.
40
[Figure: data grid of subjects 1 to n (rows) by covariates 1 to 8 (columns), shown as a sequence of animation frames]
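The exact pattern drawn in the figure is not recoverable here. As a reminder of what "monotone" means, the sketch below uses the usual definition: with covariates in a fixed order, once a subject is missing one covariate, all later covariates are missing too. The two small grids are hypothetical and are not data from Wacholder et al.

```python
def is_monotone(rows):
    """Return True if every subject's missing values form a trailing block.

    rows: one list per subject, covariates in a fixed order,
    with None marking a covariate that was not measured.
    """
    for row in rows:
        seen_missing = False
        for value in row:
            if value is None:
                seen_missing = True
            elif seen_missing:
                # An observed value after a missing one breaks monotonicity.
                return False
    return True


# Hypothetical grids: 4 subjects by 4 covariates (1 = measured value).
monotone_grid = [
    [1, 1, 1, 1],
    [1, 1, 1, None],
    [1, 1, None, None],
    [1, None, None, None],
]
non_monotone_grid = [
    [1, 1, 1, 1],
    [1, None, 1, 1],  # covariate 2 missing but covariates 3 and 4 observed
]

print(is_monotone(monotone_grid))
print(is_monotone(non_monotone_grid))
```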
41
Wacholder S, et al.
Restricted to a small number of discrete covariates
42
Methodologic research
Stürmer et al. AJE 2005, 2007
Propensity score calibration
43
• Summarizes information about several covariates into a single number
• Used for matching, stratification, regression
Propensity score
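To make "a single number" concrete: the propensity score is the estimated probability of exposure given the covariates. The sketch below fits a logistic regression by plain gradient ascent on a small hypothetical data set (two binary covariates, counts chosen for illustration only). It shows the score itself, not the calibration method of Stürmer et al.

```python
import math

# Hypothetical cohort: (obese, smoker) -> (number exposed, number unexposed).
counts = {
    (1, 1): (40, 10),
    (1, 0): (30, 20),
    (0, 1): (20, 30),
    (0, 0): (10, 40),
}

# Expand the counts into one row per subject.
X, y = [], []
for covariates, (n_exposed, n_unexposed) in counts.items():
    X += [covariates] * (n_exposed + n_unexposed)
    y += [1] * n_exposed + [0] * n_unexposed


def predict(weights, covariates):
    """Logistic model: probability of exposure given the covariates."""
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], covariates))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, learning_rate=1.0, epochs=2000):
    """Maximum likelihood fit by gradient ascent (intercept stored first)."""
    weights = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        gradient = [0.0] * len(weights)
        for covariates, exposed in zip(X, y):
            error = exposed - predict(weights, covariates)
            gradient[0] += error
            for j, x in enumerate(covariates):
                gradient[j + 1] += error * x
        weights = [w + learning_rate * g / len(X) for w, g in zip(weights, gradient)]
    return weights


weights = fit_logistic(X, y)

# One propensity score per covariate pattern: several covariates, one number.
scores = {covariates: predict(weights, covariates) for covariates in counts}
for covariates, score in sorted(scores.items(), key=lambda item: item[1]):
    print(covariates, round(score, 2))
```

Subjects can then be matched or stratified on the score, or the score can be entered as a single term in a regression model, in place of the individual covariates.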
44
• Main cohort:
  – selected covariates
  – “error-prone” scores estimated
  – regression coefficients estimated
• Sample:
  – additional covariates
  – gold standard scores
  – regression calibration
• Advantage: multivariable technique
Stürmer et al. 2005
45
“Until the validity and limitation of… [propensity score calibration] have been assessed in different settings, the method should be seen as a sensitivity analysis.”
Stürmer et al. 2005
46
47
48
Stage 1: 278 cases in 4561 pregnancies
Stage 2: 244 cases + 728 non-cases
49
50
“Relatively few examples of two- and three-phase sampling designs for case-control studies have appeared to date in the epidemiologic literature. This is unfortunate, because the stratified designs are easy to implement and can result in substantial savings.”
NE Breslow (2000)
51
Consent for second-stage interviews:
• Cases: 49%
• Controls: 39%