survival analysis - university of manchester · survival analysis is concerned with the length of...
TRANSCRIPT
IntroductionCensoring
Describing SurvivalComparing Survival
Modelling Survival
Survival Analysis
Mark Lunt
Centre for Epidemiology Versus ArthritisUniversity of Manchester
03/12/2019
IntroductionCensoring
Describing SurvivalComparing Survival
Modelling Survival
Introduction
Survival Analysis is concerned with the length of timebefore an event occurs.Initially, developed for events that can only occur once (e.g.death)Using time to event is more efficient that just whether ornot the event has occured.It may be inconvenient to wait until the event occurs in allsubjects.Need to include subjects whose time to event is not known(censored).
IntroductionCensoring
Describing SurvivalComparing Survival
Modelling Survival
Plan of Talk
CensoringDescribing SurvivalComparing SurvivalModelling Survival
IntroductionCensoring
Describing SurvivalComparing Survival
Modelling Survival
Censoring
Exact time that event occured (or will occur) is unknown.Most commonly right-censored: we know the event has notoccured yet.Maybe because the subject is lost to follow-up, or study isover.Makes no difference provided loss to follow-up is unrelatedto outcome.
IntroductionCensoring
Describing SurvivalComparing Survival
Modelling Survival
Censoring Examples: Chronological Time
AccrualPatient
Observation
10987654321 q a aa q aq aaq
0 6 12 18Time(months)
IntroductionCensoring
Describing SurvivalComparing Survival
Modelling Survival
Censoring Examples: Followup Time
10
9
8
7
6
5
4
3
2
1 q a aa q aq aaq
0 6 12 18
Time(months)
Followup Time
IntroductionCensoring
Describing SurvivalComparing Survival
Modelling Survival
Other types of censoring
Left Censoring:Event had already occured before the study started.Subject cannot be included in study.May lead to bias.
Interval Censoring:We know event occured between two fixed times, but notexactly when.E.g. Radiological damage: only picked up when film istaken.
IntroductionCensoring
Describing SurvivalComparing Survival
Modelling Survival
Survivor functionStata Commands
Describing Survival: Survival Curves
Survivor function: S(t) probability of surviving to time t .If there are rk subjects at risk during the k th time-period, ofwhom fk fail, probability of surviving this time-period forthose who reach it is
rk − fkrk
Probability of surviving the end of the k th time-period is theprobability of surviving to the end of the (k − 1)th
time-period, times the probability of surviving the k th
time-period. i.e
S(k) = S(k − 1)× rk − fkrk
IntroductionCensoring
Describing SurvivalComparing Survival
Modelling Survival
Survivor functionStata Commands
Motion Sickness Study
21 subjects put in a cabin on a hydraulic piston,Bounced up and down for 2 hours, or until they vomited,whichever occured first.Time to vomiting is our survival time.Two subjects insisted on ending the experiment early,although they had not vomited (censored).
Is censoring independent of expected event time ?
14 subjects completed the 2 hours without vomiting.5 subjects failed
IntroductionCensoring
Describing SurvivalComparing Survival
Modelling Survival
Survivor functionStata Commands
Motion Sickness Study Life-Table
ID Time Censored rk fk S(t)1 30 No 21 1 20/21 = 0.9522 50 No 20 1 19/20× S(30) = 0.9053 50 Yes 19 0 19/19× S(50) = 0.9054 51 No 18 1 17/18× S(50) = 0.8555 66 Yes 17 0 17/17× S(51) = 0.8556 82 No 16 1 15/16× S(66) = 0.8017 92 No 15 1 14/15× S(82) = 0.7488 120 Yes 14 0 14/14× S(92) = 0.748...21 120 Yes 14 0 14/14× S(92) = 0.748
IntroductionCensoring
Describing SurvivalComparing Survival
Modelling Survival
Survivor functionStata Commands
Kaplan Meier Survival Curves
Plot of S(t) against (t).Always start at (0, 1).Can only decrease.Drawn as a step function, with a downwards step at eachfailure time.
IntroductionCensoring
Describing SurvivalComparing Survival
Modelling Survival
Survivor functionStata Commands
Stata commands for Survival Analysis
stset: sets data as survivalTakes one variable: followup timeOption failure = 1 if event occurred, 0 if censored
sts list: produces life tablests graph: produces Kaplan Meier plot
IntroductionCensoring
Describing SurvivalComparing Survival
Modelling Survival
Survivor functionStata Commands
Stata Output
sts list if group == 1
failure _d: failanalysis time _t: time
Beg. Net Survivor Std.Time Total Fail Lost Function Error [95% Conf. Int.]
-------------------------------------------------------------------------------30 21 1 0 0.9524 0.0465 0.7072 0.993250 20 1 1 0.9048 0.0641 0.6700 0.975351 18 1 0 0.8545 0.0778 0.6133 0.950766 17 0 1 0.8545 0.0778 0.6133 0.950782 16 1 0 0.8011 0.0894 0.5519 0.920692 15 1 0 0.7477 0.0981 0.4946 0.8868
120 14 0 14 0.7477 0.0981 0.4946 0.8868-------------------------------------------------------------------------------
IntroductionCensoring
Describing SurvivalComparing Survival
Modelling Survival
Survivor functionStata Commands
Kaplan Meier Curve: example0
.00
0.2
50
.50
0.7
51
.00
0 50 100 150analysis time
Kaplan−Meier survival estimate
IntroductionCensoring
Describing SurvivalComparing Survival
Modelling Survival
Comparing Survivor Functions
Null Hypothesis Survival in both groups is the sameAlternative Hypothesis
1 Groups are different2 One group is consistently better3 One group is better at fixed time t4 Groups are the same until time t , one group is better after5 One group is worse up to time t , better afterwards.
No test is equally powerful against all alternatives.
IntroductionCensoring
Describing SurvivalComparing Survival
Modelling Survival
Comparing Survivor Functions
Can useLogrank test
Most powerful against consistent differenceModified Wilcoxon Test
Most powerful against early differences
Regression
Should decide which one to use beforehand.
IntroductionCensoring
Describing SurvivalComparing Survival
Modelling Survival
Motion Sickness Revisited
Less than 1/3 of subjects experienced an endpoint in firststudy.Further 28 subjects recruitedFreqency and amplitude of vibration both doubledIntention was to induce vomiting soonerWere they successful ?
IntroductionCensoring
Describing SurvivalComparing Survival
Modelling Survival
Comparing Survival Curves0
.00
0.2
50
.50
0.7
51
.00
0 50 100 150analysis time
group = 1 group = 2
Kaplan−Meier survival estimates, by group
IntroductionCensoring
Describing SurvivalComparing Survival
Modelling Survival
Comparison of Survivor Functions
sts test group gives logrank test for differencesbetween groupssts test group, wilcoxon gives Wilcoxon test
Test χ2 pLogrank 3.21 0.073Wilcoxon 3.18 0.075
IntroductionCensoring
Describing SurvivalComparing Survival
Modelling Survival
What to avoid
Compare mean survival in each group.Censoring makes this meaningless
Overinterpret the tail of a survival curve.There are generally few subjects in tails
Compare proportion surviving in each group at a fixedtime.
Depends on arbitrary choice of timeLacks power compared to survival analysisFine for description, not for hypothesis testing
IntroductionCensoring
Describing SurvivalComparing Survival
Modelling Survival
The hazard functionCox RegressionProportional Hazards Assumption
Modelling Survival
Cannot often simply compare groups, must adjust for otherprognostic factors.Predicting survival function S is tricky.Easier to predict the hazard function.
Hazard function h(t) is the risk of dying at time t , given thatyou’ve survived until then.Can be calculated from the survival function.Survival function can be calculated from the hazardfunction.Hazard function easier to model
IntroductionCensoring
Describing SurvivalComparing Survival
Modelling Survival
The hazard functionCox RegressionProportional Hazards Assumption
The Hazard Function
Hazard for all cause mortality for time since birth
IntroductionCensoring
Describing SurvivalComparing Survival
Modelling Survival
The hazard functionCox RegressionProportional Hazards Assumption
Options for Modelling Hazard Function
Parametric ModelSemi-parametric models
Cox Regression (unrestricted baseline hazard)Smoothed baseline hazard
IntroductionCensoring
Describing SurvivalComparing Survival
Modelling Survival
The hazard functionCox RegressionProportional Hazards Assumption
Comparing Hazard Functions
01
23
45
0 .5 1 1.5 2time
Untreated Treated
Exponential Distribution
01
23
45
0 .5 1 1.5 2time
Untreated Treated
Log−Logistic Distribution
01
23
45
0 .5 1 1.5 2
time
Untreated Treated
Unknown Baseline Hazard
01
23
45
0 .5 1 1.5 2
time
Untreated Treated
Non−constant Hazard Ratio
IntroductionCensoring
Describing SurvivalComparing Survival
Modelling Survival
The hazard functionCox RegressionProportional Hazards Assumption
Parametric Regression
Assumes that the shape of the hazard function is known.Estimates parameters that define the hazard function.Need to test that the hazard function is the correct shape.Was only option at one time.Now that semi-parametric regression is available, not usedunless there are strong a priori grounds to assume aparticular distribution.More powerful than semi-parametric if distribution is known
IntroductionCensoring
Describing SurvivalComparing Survival
Modelling Survival
The hazard functionCox RegressionProportional Hazards Assumption
Cox (Proportional Hazards) Regression
Assumes shape of hazard function is unknownGiven covariates x, assumes that the hazard at time t ,
h(t , x) = h0(t)×Ψ(x)
where Ψ = exp(β1x1 + β2x2 + . . .).Semi-parametric: h0 is non-parametric, Ψ is parametric.t affects h0, not Ψ
x affects Ψ, not h0
IntroductionCensoring
Describing SurvivalComparing Survival
Modelling Survival
The hazard functionCox RegressionProportional Hazards Assumption
Cox Regression: Interpretation
Suppose x1 increases from x0 to x0 + 1,
h(t , x0) = h0(t)× e(β1x0)
h(t , x0 + 1) = h0(t)× e(β1(x0+1))
= h0(t)× e(β1x0) × eβ1
= h(t , x0)× eβ1
⇒ h(t ,x0+1)h(t ,x0)
= eβ1
i.e. the Hazard Ratio is eβ1
IntroductionCensoring
Describing SurvivalComparing Survival
Modelling Survival
The hazard functionCox RegressionProportional Hazards Assumption
Results may be presented as β or eβ
β > 0⇒ eβ > 1⇒ risk increasedβ < 0⇒ eβ < 1⇒ risk decreasedShould include a confidence interval.
IntroductionCensoring
Describing SurvivalComparing Survival
Modelling Survival
The hazard functionCox RegressionProportional Hazards Assumption
Cox Regression: Testing Assumptions
We assume hazard ratio is constant over time: should test.Possible tests:
Plot observed and predicted survival curves: should besimilar.Plot − log(− log (S(t))) against log(t) for each group: shouldgive parallel lines.Formal statistical test:
OverallEach variable
May need to fit interaction between time period andpredictor: assume constant hazard ratio on short intervals,not over entire period.
IntroductionCensoring
Describing SurvivalComparing Survival
Modelling Survival
The hazard functionCox RegressionProportional Hazards Assumption
Cox Regression in Stata
stcox varlist performs regression using varlist aspredictorsOption nohr gives coefficients in place of hazard ratios
IntroductionCensoring
Describing SurvivalComparing Survival
Modelling Survival
The hazard functionCox RegressionProportional Hazards Assumption
Testing Proportional Hazards
stcoxkm produced plots of observed and predictedsurvival curvesstphplot produces − log(− log (S(t))) against log(t)(log-log plot)estat phtest gives overall test of proportional hazardsestat phtest, detail gives test of proportionalhazards for each variable.
IntroductionCensoring
Describing SurvivalComparing Survival
Modelling Survival
The hazard functionCox RegressionProportional Hazards Assumption
Cox Regression: Example
. stcox i.group
Cox regression -- Breslow method for ties
No. of subjects = 49 Number of obs = 49No. of failures = 19Time at risk = 4457
LR chi2(1) = 3.32Log likelihood = -67.296458 Prob > chi2 = 0.0685
------------------------------------------------------------------------------_t | Haz. Ratio Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------2.group | 2.45073 1.277744 1.72 0.086 .8820678 6.809087
------------------------------------------------------------------------------
IntroductionCensoring
Describing SurvivalComparing Survival
Modelling Survival
The hazard functionCox RegressionProportional Hazards Assumption
Testing Assumptions: Kaplan-Meier Plot
0.5
00.6
00.7
00.8
00.9
01.0
0S
urv
ival P
robabili
ty
0 50 100 150analysis time
Observed: group = 1 Observed: group = 2
Predicted: group = 1 Predicted: group = 2
IntroductionCensoring
Describing SurvivalComparing Survival
Modelling Survival
The hazard functionCox RegressionProportional Hazards Assumption
Testing Assumptions: log-log plot
01
23
4−
ln[−
ln(S
urv
ival P
robabili
ty)]
1 2 3 4 5ln(analysis time)
group = 1 group = 2
IntroductionCensoring
Describing SurvivalComparing Survival
Modelling Survival
The hazard functionCox RegressionProportional Hazards Assumption
Testing Assumptions: Formal Test
. estat phtest
Test of proportional hazards assumption----------------------------------------------
| chi2 df Prob>chi2------------+---------------------------------global test | 0.03 1 0.8585----------------------------------------------
IntroductionCensoring
Describing SurvivalComparing Survival
Modelling Survival
The hazard functionCox RegressionProportional Hazards Assumption
Allowing for Non-Proportional Hazards
Effect of covariate varies with timeNeed to produce different estimates of effects at differenttimesUse stsplit to split one record per person into severalFit covariate of interest in each time period separately
IntroductionCensoring
Describing SurvivalComparing Survival
Modelling Survival
The hazard functionCox RegressionProportional Hazards Assumption
Non-Proportional Hazards Example
0.0
00.2
00.4
00.6
00.8
01.0
0S
urv
ival P
robabili
ty
0 10 20 30 40analysis time
Observed: treatment2 = Standard Observed: treatment2 = Drug B
Predicted: treatment2 = Standard Predicted: treatment2 = Drug B
IntroductionCensoring
Describing SurvivalComparing Survival
Modelling Survival
The hazard functionCox RegressionProportional Hazards Assumption
Non-Proportional Hazards Example
. stcox i.treatment2
------------------------------------------------------------------------------_t | Haz. Ratio Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------treatment2 | .7462828 .3001652 -0.73 0.467 .3392646 1.641604
------------------------------------------------------------------------------
. estat phtest
Test of proportional hazards assumption
Time: Time----------------------------------------------------------------
| chi2 df Prob>chi2------------+---------------------------------------------------global test | 10.28 1 0.0013----------------------------------------------------------------
IntroductionCensoring
Describing SurvivalComparing Survival
Modelling Survival
The hazard functionCox RegressionProportional Hazards Assumption
Non-Proportional Hazards Example: Fittingtime-varying effect
stsplit period, at(10)gen t1 = treatment2*(period == 0)gen t2 = treatment2*(period == 10)
. stcox t1 t2
------------------------------------------------------------------------------_t | Haz. Ratio Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------t1 | 1.836938 .8737408 1.28 0.201 .7231357 4.666262t2 | .1020612 .0853529 -2.73 0.006 .0198156 .5256703
------------------------------------------------------------------------------
. estat phtest
Test of proportional hazards assumption
Time: Time----------------------------------------------------------------
| chi2 df Prob>chi2------------+---------------------------------------------------global test | 1.34 2 0.5114----------------------------------------------------------------
IntroductionCensoring
Describing SurvivalComparing Survival
Modelling Survival
The hazard functionCox RegressionProportional Hazards Assumption
Non-Proportional Hazards Example0
.20
0.4
00
.60
0.8
01
.00
Su
rviv
al P
rob
ab
ility
0 10 20 30 40analysis time
Observed: t1 = 0 Observed: t1 = 1
Predicted: t1 = 0 Predicted: t1 = 1
0.0
00
.20
0.4
00
.60
0.8
01
.00
Su
rviv
al P
rob
ab
ility
0 10 20 30 40analysis time
Observed: t2 = 0 Observed: t2 = 1
Predicted: t2 = 0 Predicted: t2 = 1
IntroductionCensoring
Describing SurvivalComparing Survival
Modelling Survival
The hazard functionCox RegressionProportional Hazards Assumption
Time varying covariates
Normally, survival predicted by baseline covariatesCovariates may change over timeCan have several records for each subject, with differentcovariatesEach record ends with a censoring event, unless the eventof interest occurred at that timeNeed to have unique identifier for each individual so thatstata knows which observations belong togetherOption tvc() is for variables that increase linearly withtime