Design and Analysis of Clinical Study 11. Analysis of Cohort Study
Dr. Tuan V. Nguyen
Garvan Institute of Medical Research
Sydney, Australia
Overview
• Incidence, person-years, hazard
• Relative risk
• Logistic regression analysis
• Lifetable
• Cox’s regression analysis
• Diagnosis and prognosis
Person-time
• Person-time = # persons x duration
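[Figure: follow-up timelines of subjects 1 to 5 on a time axis from 0 to 8 years; person-years contributed: 2, 4, 4, 8 and 2]
• Incidence rate (IR): during (2+4+4+8+2) = 20 person-years there were 2 incident cases, so IR = 2/20 = 0.1 per person-year.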
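The person-time arithmetic can be reproduced in a few lines of R. This is a minimal sketch: the follow-up times and the count of 2 cases come from the slide, while the variable names are illustrative.

```r
# Follow-up time (years) contributed by each of the five subjects
follow_up <- c(2, 4, 4, 8, 2)

# Number of incident cases observed during follow-up (from the slide)
n_cases <- 2

# Person-time is the sum of individual follow-up times
person_years <- sum(follow_up)

# Incidence rate = cases / person-time
incidence_rate <- n_cases / person_years

cat("Person-years:", person_years, "\n")
cat("Incidence rate:", incidence_rate, "per person-year\n")
```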
Incidence
[Figure: disease status of subjects 1 to 5 on a time axis from 0 to 2 years]
• Incidence proportion (IP): during a 2-year period, 3 out of 5 subjects developed the disease, so IP = 3/5 = 0.6.
Estimation of Incidence Rates
• Consider a study in which patients were followed for a total of P patient-years and N cases (e.g. deaths, survivors, diseased) were recorded.
• Assumption: N follows a Poisson distribution.
• The estimate of the incidence rate is: $I = N / P$
• The standard error of I is: $SE(I) = \sqrt{N} / P$
• 95% confidence interval of the “true” incidence rate: $I \pm 1.96 \times SE(I)$
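As a hedged illustration of these formulas, the sketch below plugs in the counts for the lower-intake group of the IHD example used later in these slides (28 cases in 1858 person-years). The variable names are illustrative, and the interval is the simple normal approximation described above.

```r
# Example counts taken from the IHD table in these slides
N <- 28      # number of cases
P <- 1858    # person-years of follow-up

rate <- N / P              # incidence rate per person-year
se   <- sqrt(N) / P        # standard error under the Poisson assumption
ci   <- rate + c(-1, 1) * 1.96 * se

# Express everything per 1000 person-years for readability
per1000 <- round(1000 * c(rate = rate, se = se, lower = ci[1], upper = ci[2]), 2)
print(per1000)
```

Per 1000 person-years, the rate and its standard error come out at about 15.1 and 2.8, matching the table.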
Relative Risk
Incidence rate of ischemic heart disease (IHD)

| | <2750 kcal | >2750 kcal |
|---|---|---|
| Person-years | 1858 | 2769 |
| New cases | 28 | 17 |
| Estimated rate (per 1000 person-years) | 15.1 | 6.1 |
| SD of estimated rate | 2.8 | 1.5 |

• Relative risk (RR): $RR = \dfrac{\text{Risk(exposed)}}{\text{Risk(unexposed)}} = \dfrac{15.1}{6.1} = 2.48$
• $L = \log(RR) = 0.908$
• Standard error of log(RR): $SE = \sqrt{\dfrac{1}{N_1} + \dfrac{1}{N_2}} = \sqrt{\dfrac{1}{28} + \dfrac{1}{17}} = 0.3075$
• 95% CI of L: L ± 1.96 × SE = 0.908 ± 1.96 × 0.3075 = 0.3055, 1.51
• 95% CI of RR: exp(0.3055), exp(1.51) = 1.36, 4.53
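The relative risk calculation can be sketched in R directly from the counts. The variable names are illustrative. Because the code works from unrounded rates, it gives an RR of about 2.45 rather than the 2.48 obtained from the rounded rates 15.1 and 6.1, and the interval shifts slightly.

```r
# Counts from the IHD table: lower vs higher energy intake
cases  <- c(low = 28, high = 17)        # new IHD cases
pyears <- c(low = 1858, high = 2769)    # person-years of follow-up

rate <- cases / pyears                  # incidence rates per person-year
rr   <- rate[["low"]] / rate[["high"]]  # relative risk (rate ratio)

# Work on the log scale, where the estimate is closer to normal
log_rr    <- log(rr)
se_log_rr <- sqrt(sum(1 / cases))       # sqrt(1/N1 + 1/N2)

# Back-transform the 95% limits to the RR scale
ci_rr <- exp(log_rr + c(-1, 1) * 1.96 * se_log_rr)

cat(sprintf("RR = %.2f, 95%% CI %.2f to %.2f\n", rr, ci_rr[1], ci_rr[2]))
```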
Analysis of Difference in Incidence Rates
Incidence rate of ischemic heart disease (IHD)

| | <2750 kcal | >2750 kcal |
|---|---|---|
| Person-years | 1858 | 2769 |
| New cases | 28 | 17 |
| Estimated rate (per 1000 person-years) | 15.1 | 6.1 |
| SD of estimated rate | 2.8 | 1.5 |

• Difference: D = 15.07 − 6.14 = 8.93 per 1000 person-years
• Standard error (SE) of D: $SE = \sqrt{\dfrac{28}{1858^2} + \dfrac{17}{2769^2}} = 0.0032$ per person-year, i.e. 3.21 per 1000 person-years
• 95% CI of D: D ± 1.96 × SE = 8.93 ± 1.96 × 3.21 ≈ 2.6, 15.2 per 1000 person-years
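A short R sketch of the rate-difference calculation, working from the raw counts and with illustrative variable names. With the 1.96 multiplier the 95% interval is roughly 2.6 to 15.2 per 1000 person-years. A multiplier of 1.645 would instead give a narrower 90% interval of roughly 3.6 to 14.2, so the code prints both for comparison.

```r
# Counts from the IHD table
cases  <- c(28, 17)
pyears <- c(1858, 2769)

rate <- cases / pyears
d    <- rate[1] - rate[2]                 # rate difference per person-year

# SE of a difference of two independent Poisson rates
se_d <- sqrt(sum(cases / pyears^2))

ci95 <- d + c(-1, 1) * qnorm(0.975) * se_d   # z = 1.96
ci90 <- d + c(-1, 1) * qnorm(0.95)  * se_d   # z = 1.645

# Report per 1000 person-years
print(round(1000 * c(D = d, SE = se_d), 2))
print(round(1000 * rbind(ci95, ci90), 2))
```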
Logistic Regression Analysis
• Example: a prospective study of the association between BMI, BMD, bone turnover markers and fracture in 139 men. The risk factors were measured at baseline, and fractures were recorded during the 10-year follow-up:
id fx age bmi bmd ictp pinp
1 1 79 24.7252 0.818 9.170 37.383
2 1 89 25.9909 0.871 7.561 24.685
3 1 70 25.3934 1.358 5.347 40.620
4 1 88 23.2254 0.714 7.354 56.782
5 1 85 24.6097 0.748 6.760 58.358
6 0 68 25.0762 0.935 4.939 67.123
7 0 70 19.8839 1.040 4.321 26.399
8 0 69 25.0593 1.002 4.212 47.515
9 0 74 25.6544 0.987 5.605 26.132
10 0 79 19.9594 0.863 5.204 60.267
...
137 0 64 38.0762 1.086 5.043 32.835
138 1 80 23.3887 0.875 4.086 23.837
139 0 67 25.9455 0.983 4.328 71.334
Logistic Regression: Model
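• p = probability of fracture
• Odds: $\text{Odds} = \dfrac{p}{1-p}$
• Logit of p: $L = \log\dfrac{p}{1-p}$
• X is a risk factor. Linear logistic model: $L = \alpha + \beta X + \varepsilon$
• Expected value of $\varepsilon$ = 0. Expected value of L is: $L = \alpha + \beta X$
• Odds = $e^{\alpha + \beta X}$
• Odds ratio (OR): $OR = \dfrac{\text{odds}(p \mid x = x_0 + 1)}{\text{odds}(p \mid x = x_0)} = \dfrac{e^{\alpha + \beta (x_0 + 1)}}{e^{\alpha + \beta x_0}} = e^{\beta}$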
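The relationship between odds, logit and odds ratio can be checked numerically. In this sketch the intercept and slope are illustrative values borrowed from the BMD model fitted later in these slides, and `x0` is an arbitrary starting value.

```r
# Illustrative coefficients (taken from the BMD model fitted later in the slides)
alpha <- 1.063
beta  <- -2.270

# Odds and probability implied by the linear logistic model
odds <- function(x) exp(alpha + beta * x)
prob <- function(x) odds(x) / (1 + odds(x))

x0 <- 0.9                          # arbitrary value of the risk factor
or <- odds(x0 + 1) / odds(x0)      # odds ratio for a one-unit increase

cat("OR for a one-unit increase:", or, "\n")
cat("exp(beta):                 ", exp(beta), "\n")

# qlogis() is the logit function; it recovers the linear predictor
cat("logit of p at x0:", qlogis(prob(x0)), "\n")
cat("alpha + beta*x0: ", alpha + beta * x0, "\n")
```

Whatever value of `x0` is chosen, the odds ratio equals `exp(beta)`, which is why a single coefficient summarises the effect of the risk factor.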
Logistic Regression Analysis using R
# Read the data; "." marks missing values
fracture <- read.table("fracture.txt", header=TRUE, na.strings=".")
attach(fracture)
# Logistic regression of fracture on BMD
results <- glm(fx ~ bmd, family="binomial")
summary(results)
Deviance Residuals:
Min 1Q Median 3Q Max
-1.0287 -0.8242 -0.7020 1.3780 2.0709
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 1.063 1.342 0.792 0.428
bmd -2.270 1.455 -1.560 0.119
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 157.81 on 136 degrees of freedom
Residual deviance: 155.27 on 135 degrees of freedom
AIC: 159.27
Model of Prediction
> sd(bmd)
[1] 0.1406543
• OR per SD increase in BMD: $e^{-2.27 \times 0.1406} = 0.7267$
• Predictive model: $\hat{p} = \dfrac{e^{1.063 - 2.27 \times bmd}}{1 + e^{1.063 - 2.27 \times bmd}}$
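The predictive model can be evaluated without the original data file by using the published coefficients. This is a sketch only: `predict_fracture` is a hypothetical helper name, and the BMD values are arbitrary points within the plotted range.

```r
# Coefficients and SD copied from the output shown in these slides
intercept <- 1.063
slope     <- -2.270
sd_bmd    <- 0.1406543

# Odds ratio per one-SD increase in BMD
or_per_sd <- exp(slope * sd_bmd)

# Hypothetical helper: predicted fracture probability at a given BMD.
# plogis(x) computes exp(x) / (1 + exp(x)).
predict_fracture <- function(bmd) plogis(intercept + slope * bmd)

bmd_values <- c(0.6, 0.8, 1.0, 1.2)
print(round(or_per_sd, 4))
print(round(data.frame(bmd = bmd_values,
                       p_hat = predict_fracture(bmd_values)), 3))
```

The predicted probability falls as BMD rises, consistent with the negative coefficient.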
Model of Prediction
plot(bmd, fitted(glm(fx ~ bmd, family="binomial")))
[Figure: fitted probability of fracture from glm(fx ~ bmd, family = "binomial") plotted against bmd; x-axis 0.6 to 1.2, y-axis 0.15 to 0.40]
Problem of Time-to-event Data
• Non-normal distribution
• Loss to follow-up
• Censored observations (e.g. patients still alive at the last follow-up)
• Survival analysis: a class of statistical methods for studying the occurrence and timing of events.
• Its applications are found in medicine and engineering science.
– Lifetime of machine components
– Time from diagnosis to death
– Time from infection to disease onset (latency time)
Definition of “Failure Time”
• Time origin
– Time origin = starting time of the experiment/study
• Scale of measurement
– Usually chronological time, but not necessarily
– Must be non-negative
• Precise definition of failure
– Death
– Death from a specified cause
[Figure: follow-up timelines for subjects 1 to 6, distinguishing censored observations (c) from observed failures]
Construction of Lifetable
• Survival time (in weeks) of 18 patients after diagnosis of parathyroid cancer:
10 13* 18* 19 23* 30 36 38* 54* 56* 59 75 93 97 104* 107 107* 107*
*: censored (= survived)
• Arrange the observed failure times in increasing order ($t_j$)
• Calculate the number of failures ($d_j$) during [$t_{j-1}$, $t_j$]
• Calculate the number of censored observations ($c_j$) during [$t_{j-1}$, $t_j$]
• Calculate the number of subjects at risk up to time $t_{j-1}$
• Compute the proportion of deaths for each interval
• Compute the estimate of the survivor function
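The steps above can be sketched in base R, without the survival package, to make the arithmetic explicit. This is an illustrative product-limit calculation on the 18 observations listed above; the variable names are assumptions.

```r
# Follow-up times and status (1 = failure, 0 = censored) for the 18 patients
weeks  <- c(10, 13, 18, 19, 23, 30, 36, 38, 54,
            56, 59, 75, 93, 97, 104, 107, 107, 107)
status <- c(1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 0)

# Step 1: distinct observed failure times in increasing order
t_j <- sort(unique(weeks[status == 1]))

# Step 2: number at risk just before each failure time
n_j <- sapply(t_j, function(tt) sum(weeks >= tt))

# Step 3: number of failures at each failure time
d_j <- sapply(t_j, function(tt) sum(weeks == tt & status == 1))

# Step 4: proportion failing, then cumulative product of survival proportions
h_j <- d_j / n_j
S_j <- cumprod(1 - h_j)

lifetable <- data.frame(time = t_j, n.risk = n_j, n.event = d_j,
                        hazard = round(h_j, 4), survival = round(S_j, 4))
print(lifetable)
```

Censored patients leave the risk set between failure times, which is why the number at risk drops by more than the number of failures.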
Lifetable of Example Data
| Interval (t) | Duration (in weeks) | Number at risk at the start of duration ($n_t$) | Number of failures during the duration ($d_t$) | Probability of failure h(t) | Probability of survival $p_t$ | Cumulative probability of survival S(t) |
|---|---|---|---|---|---|---|
| 1 | 0 – 9 | 18 | 0 | 0.0000 | 1.0000 | 1.0000 |
| 2 | 10 – 18 | 18 | 1 | 0.0556 | 0.9444 | 0.9444 |
| 3 | 19 – 29 | 15 | 1 | 0.0667 | 0.9333 | 0.8815 |
| 4 | 30 – 35 | 13 | 1 | 0.0769 | 0.9231 | 0.8137 |
| 5 | 36 – 58 | 12 | 1 | 0.0833 | 0.9167 | 0.7459 |
| 6 | 59 – 74 | 8 | 1 | 0.1250 | 0.8750 | 0.6526 |
| 7 | 75 – 92 | 7 | 1 | 0.1429 | 0.8571 | 0.5594 |
| 8 | 93 – 96 | 6 | 1 | 0.1667 | 0.8333 | 0.4662 |
| 9 | 97 – 106 | 5 | 1 | 0.2000 | 0.8000 | 0.3729 |
| 10 | 107 – | 3 | 1 | 0.3333 | 0.6667 | 0.2486 |
Lifetable analysis using R
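library(survival)
# Follow-up time (weeks) and status (1 = failure, 0 = censored)
weeks <- c(10, 13, 18, 19, 23, 30, 36, 38, 54,
           56, 59, 75, 93, 97, 104, 107, 107, 107)
status <- c(1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 0)
data <- data.frame(weeks, status)
survtime <- Surv(weeks, status==1)
# Kaplan-Meier estimate; survfit() needs a formula, "~ 1" means a single group
kp <- survfit(survtime ~ 1)
summary(kp)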
time n.risk n.event survival std.err lower 95% CI upper 95% CI
10 18 1 0.944 0.0540 0.844 1.000
19 15 1 0.881 0.0790 0.739 1.000
30 13 1 0.814 0.0978 0.643 1.000
36 12 1 0.746 0.1107 0.558 0.998
59 8 1 0.653 0.1303 0.441 0.965
75 7 1 0.559 0.1412 0.341 0.917
93 6 1 0.466 0.1452 0.253 0.858
97 5 1 0.373 0.1430 0.176 0.791
107 3 1 0.249 0.1392 0.083 0.745
Lifetable analysis using R
plot(kp, xlab="Time (weeks)", ylab="Cumulative survival probability")
[Figure: Kaplan-Meier curve; x-axis: Time (weeks), 0 to 100; y-axis: Cumulative survival probability, 0 to 1]
Example of Cox’s Survival Data
Treatment group
id episodes time infected
1 12 8 1
3 10 12 0
6 7 52 0
7 10 28 1
8 6 44 1
10 8 14 1
12 8 3 1
14 9 52 1
15 11 35 1
18 13 6 1
20 7 12 1
23 13 7 0
24 9 52 0
26 12 52 0
28 13 36 1
31 8 52 0
33 10 9 1
34 16 11 0
36 6 52 0
39 14 15 1
40 13 13 1
42 13 21 1
44 16 24 0
46 13 52 0
48 9 28 1
Control group
id episodes time infected
2 9 15 1
4 10 44 0
5 12 2 0
9 7 8 1
11 7 12 1
13 7 52 0
16 7 21 1
17 11 19 1
19 16 6 1
21 16 10 1
22 6 15 0
25 15 4 1
27 9 9 0
29 10 27 1
30 17 1 1
32 8 12 1
35 8 20 1
37 8 32 0
38 8 15 1
41 14 5 1
43 13 35 1
45 9 28 1
47 15 6 1
Time to infection among patients with herpes. 25 patients were treated with gd2 and 23 patients were not treated.
Risk factor is the number of infectious episodes in previous year.
Cox’s Regression Model: Theory
• Setting: a prospective study (or randomized clinical trial)
– Risk factors were measured at baseline
– Patients were followed up for time T
– Events occurred during that time
– Question: was the risk of having the event related to the baseline risk factors?
• Let $x_1, x_2, x_3, \ldots, x_p$ be risk factors. Each x can be a continuous or a discrete variable.
• Model: Risk = (base risk) × (risk factor)
$h(t) = \lambda(t)\, e^{\beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \ldots + \beta_p x_p}$
– h(t): hazard / risk of having the event
– $\lambda(t)$: base risk
– $\beta_1, \beta_2, \ldots, \beta_p$: coefficients associated with each risk factor
Cox’s Regression Model: Data
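• Model for the example data: $h(t \mid group, episode) = \lambda(t)\, e^{\beta_1 group + \beta_2 episode}$
• Relative risk (relative hazards, RH): $RH = \dfrac{h(t \mid group = 2)}{h(t \mid group = 1)} = e^{2\beta_1 - \beta_1} = e^{\beta_1}$
• $\beta_1$ represents the log relative hazard, i.e. the treatment effect; $e^{\beta_1}$ is the relative hazard itself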
Cox’s Regression Model Using R
# Group: 1 = treatment (25 patients), 2 = control (23 patients)
group <- c(rep(1, 25), rep(2, 23))
# Number of infectious episodes in the previous year
episode <- c(12, 10, 7, 10, 6, 8, 8, 9, 11, 13, 7, 13, 9, 12, 13, 8, 10, 16, 6, 14, 13, 13, 16, 13, 9,
             9, 10, 12, 7, 7, 7, 7, 11, 16, 16, 6, 15, 9, 10, 17, 8, 8, 8, 8, 14, 13, 9, 15)
# Follow-up time and infection status (1 = infected, 0 = censored)
time <- c(8, 12, 52, 28, 44, 14, 3, 52, 35, 6, 12, 7, 52, 52, 36, 52, 9, 11, 52, 15, 13, 21, 24, 52, 28,
          15, 44, 2, 8, 12, 52, 21, 19, 6, 10, 15, 4, 9, 27, 1, 12, 20, 32, 15, 5, 35, 28, 6)
infected <- c(1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1,
              1, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1)
data <- data.frame(group, episode, time, infected)
Cox’s Regression Model Using R
library(survival)
kp.by.group <- survfit(Surv(time, infected==1) ~ group)
# Kaplan Meier curve
summary(kp.by.group)
plot(kp.by.group,
xlab="Time",
ylab="Cum. survival probability",
col=c("black", "red"))
# Cox’s regression model 1
analysis <- coxph(Surv(time, infected==1) ~ group)
summary(analysis)
# Cox’s regression model 2
analysis <- coxph(Surv(time, infected==1) ~ group + episode)
summary(analysis)
Cox.model <- survfit(coxph(Surv(time, infected==1)~episode+strata(group)))
Survival Curves
[Figure: survival curves; x-axis: Time, 0 to 50; y-axis: Cumulative survival probability, 0 to 1]
Cox’s Regression Model Using R
analysis <- coxph(Surv(time, infected==1) ~ group + episode)
summary(analysis)
coef exp(coef) se(coef) z p
group 0.874 2.40 0.3712 2.35 0.0190
episode 0.172 1.19 0.0648 2.66 0.0079

exp(coef) exp(-coef) lower .95 upper .95
group 2.40 0.417 1.16 4.96
episode 1.19 0.842 1.05 1.35

Rsquare= 0.196 (max possible= 0.986 )
Likelihood ratio test= 10.5 on 2 df, p=0.00537
Wald test = 10.4 on 2 df, p=0.00555
Score (logrank) test = 10.6 on 2 df, p=0.00489
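To show how the hazard ratios and their confidence limits follow from the coefficients, the sketch below recomputes them in base R from the rounded values printed above. The variable names are illustrative, and small differences from the printed p-values are expected because the inputs are rounded.

```r
# Coefficients and standard errors copied from the coxph() output above
coef_hat <- c(group = 0.874, episode = 0.172)
se_hat   <- c(group = 0.3712, episode = 0.0648)

# Hazard ratio = exp(coefficient); limits are exponentiated Wald limits
hr    <- exp(coef_hat)
lower <- exp(coef_hat - 1.96 * se_hat)
upper <- exp(coef_hat + 1.96 * se_hat)

# Wald z statistic and two-sided p-value
z <- coef_hat / se_hat
p <- 2 * pnorm(-abs(z))

print(round(cbind(HR = hr, lower, upper, z, p), 3))
```

Read this way, patients in group 2 (control) have about 2.4 times the hazard of infection of those in group 1 (treatment) at the same number of previous episodes, and each additional previous episode raises the hazard by about 19%.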