bayesian structured hazard regression · { only a small number of covariates is available...

33
Bayesian Structured Hazard Regression Andrea Hennerfeind, Ludwig Fahrmeir & Thomas Kneib Department of Statistics Ludwig-Maximilians-University Munich 29.05.2006

Upload: others

Post on 01-Nov-2019

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Bayesian Structured Hazard Regression · { Only a small number of covariates is available (unobserved heterogeneity).) Model the transition intensities in analogy to survival models

Bayesian Structured Hazard Regression

Andrea Hennerfeind, Ludwig Fahrmeir & Thomas KneibDepartment of Statistics

Ludwig-Maximilians-University Munich

29.05.2006

Page 2: Bayesian Structured Hazard Regression · { Only a small number of covariates is available (unobserved heterogeneity).) Model the transition intensities in analogy to survival models

Thomas Kneib Outline

Outline

• Leukemia survival data.

• Structured hazard regression for survival times.

• Bayesian inference in structured hazard regression.

– Full Bayesian inference based on MCMC.

– Empirical Bayes inference using mixed model methodology.

• Multi-state models for the analysis of human sleep.

Bayesian Structured Hazard Regression 1

Page 3: Bayesian Structured Hazard Regression · { Only a small number of covariates is available (unobserved heterogeneity).) Model the transition intensities in analogy to survival models

Thomas Kneib Leukemia Survival Data

Leukemia Survival Data

• Survival time of adults after diagnosis of acute myeloid leukemia.

• 1,043 cases diagnosed between 1982 and 1998 in Northwest England.

• 16 % (right) censored.

• Continuous and categorical covariates:

age age at diagnosis,wbc white blood cell count at diagnosis,sex sex of the patient,tpi Townsend deprivation index.

• Spatial information in different resolution.

Bayesian Structured Hazard Regression 2

Page 4: Bayesian Structured Hazard Regression · { Only a small number of covariates is available (unobserved heterogeneity).) Model the transition intensities in analogy to survival models

Thomas Kneib Leukemia Survival Data

• Classical Cox proportional hazards model:

λ(t; x) = λ0(t) exp(x′γ).

• Baseline hazard λ0(t) is a nuisance parameter and remains unspecified.

• Estimate γ based on the partial likelihood.

• Questions / Limitations:

– Simultaneous estimation of baseline hazard rate and covariate effects.

– Flexible modelling of covariate effects (e.g. nonlinear effects, interactions).

– Spatially correlated survival times.

– Non-proportional hazards models / time-varying effects.

⇒ Structured hazard regression models.

Bayesian Structured Hazard Regression 3

Page 5: Bayesian Structured Hazard Regression · { Only a small number of covariates is available (unobserved heterogeneity).) Model the transition intensities in analogy to survival models

Thomas Kneib Leukemia Survival Data

• Replace usual parametric predictor with a flexible semiparametric predictor

λ(t; ·) = λ0(t) exp[f1(age) + f2(wbc) + f3(tpi) + fspat(si) + β1sex]

and absorb the baseline

λ(t; ·) = exp[g0(t) + f1(age) + f2(wbc) + f3(tpi) + fspat(si) + β1sex]

where

– g0(t) = log(λ0(t)) is the log-baseline hazard,

– f1, f2, f3 are nonparametric functions of age, white blood cell count anddeprivation, and

– fspat is a spatial function.

Bayesian Structured Hazard Regression 4

Page 6: Bayesian Structured Hazard Regression · { Only a small number of covariates is available (unobserved heterogeneity).) Model the transition intensities in analogy to survival models

Thomas Kneib Leukemia Survival Data

−5

−2

14

0 7 14time in years

(a) log(baseline) Log-baseline hazard.

Effect of age at diagnosis.

−1.5

−.7

50

.75

1.5

14 27 40 53 66 79 92age in years

(b) effect of age

Bayesian Structured Hazard Regression 5

Page 7: Bayesian Structured Hazard Regression · { Only a small number of covariates is available (unobserved heterogeneity).) Model the transition intensities in analogy to survival models

Thomas Kneib Leukemia Survival Data

−.5

0.5

11.5

2

0 125 250 375 500white blood cell count

(c) effect of white blood cell count Effect of white blood cell count.

Effect of deprivation.

−.5

−.2

50

.25

.5

−6 −2 2 6 10townsend deprivation index

(d) effect of townsend deprivation index

Bayesian Structured Hazard Regression 6

Page 8: Bayesian Structured Hazard Regression · { Only a small number of covariates is available (unobserved heterogeneity).) Model the transition intensities in analogy to survival models

Thomas Kneib Leukemia Survival Data

-0.44 0 0.3

District-level analysis−0.44 0.3

Individual-level analysis

Bayesian Structured Hazard Regression 7

Page 9: Bayesian Structured Hazard Regression · { Only a small number of covariates is available (unobserved heterogeneity).) Model the transition intensities in analogy to survival models

Thomas Kneib Structured Hazard Regression

Structured Hazard Regression

• A general structured hazard regression model consists of an arbitrary combination ofthe following model terms:

– Log baseline hazard g0(t) = log(λ0(t)).

– Time-varying effects gl(t)ul of covariates ul.

– Nonparametric effects fj(xj) of continuous covariates xj.

– Spatial effects fspat(s) of a spatial location variable s.

– Interaction surfaces fj,k(xj, xk) of two continuous covariates.

– Varying coefficient interactions ujfk(xk) or ujfspat(s).

– Frailty terms bg (random intercept) or xjbg (random slopes).

Bayesian Structured Hazard Regression 8

Page 10: Bayesian Structured Hazard Regression · { Only a small number of covariates is available (unobserved heterogeneity).) Model the transition intensities in analogy to survival models

Thomas Kneib Structured Hazard Regression

• Penalised splines for the baseline effect, time-varying effects, and nonparametriceffects:

– Approximate f(x) (or g(t)) by a weighted sum of B-spline basis functions

f(x) =∑

ξjBj(x).

– Employ a large number of basis functions to enable flexibility.

– Penalise differences between parameters of adjacent basis functions to ensuresmoothness:

Pen(ξ|τ2) =1

2τ2

∑(∆kξj)2.

– Bayesian interpretation: Assume a k-th order random walk prior for ξj, e.g.

ξj = ξj−1 + uj, uj ∼ N(0, τ2) (RW1).

ξj = 2ξj−1 − ξj−2 + uj, uj ∼ N(0, τ2) (RW2).

Bayesian Structured Hazard Regression 9

Page 11: Bayesian Structured Hazard Regression · { Only a small number of covariates is available (unobserved heterogeneity).) Model the transition intensities in analogy to survival models

Thomas Kneib Structured Hazard Regression

−2

−1

01

2

−3 −1.5 0 1.5 3

−2

−1

01

2

−3 −1.5 0 1.5 3

−2

−1

01

2

−3 −1.5 0 1.5 3

Bayesian Structured Hazard Regression 10

Page 12: Bayesian Structured Hazard Regression · { Only a small number of covariates is available (unobserved heterogeneity).) Model the transition intensities in analogy to survival models

Thomas Kneib Structured Hazard Regression

• Bivariate Tensor product P-splines for interaction surfaces:

– Define bivariate basis functions (Tensor products of univariate basis functions).

– Extend random walks on the line to random walks on a regular grid.

f f f f f

f f f f f

f f f f f

f f f f f

f f f f f

v v

v

v

Bayesian Structured Hazard Regression 11

Page 13: Bayesian Structured Hazard Regression · { Only a small number of covariates is available (unobserved heterogeneity).) Model the transition intensities in analogy to survival models

Thomas Kneib Structured Hazard Regression

• Spatial effects for regional data s ∈ {1, . . . , S}: Markov random fields.

– Bivariate extension of a first order random walk on the real line.

– Define appropriate neighbourhoods for the regions.

– Assume that the expected value of fspat(s) = ξs is the average of the functionevaluations of adjacent sites:

ξs|ξs′, s′ 6= s, τ2 ∼ N

1

Ns

s′∈∂s

ξs′,τ2

Ns

.

τ2

2

t−1 t t+1

f(t−1)

E[f(t)|f(t−1),f(t+1)]

f(t+1)

Bayesian Structured Hazard Regression 12

Page 14: Bayesian Structured Hazard Regression · { Only a small number of covariates is available (unobserved heterogeneity).) Model the transition intensities in analogy to survival models

Thomas Kneib Structured Hazard Regression

• Spatial effects for point-referenced data: Stationary Gaussian random fields.

– Well-known as Kriging in the geostatistics literature.

– Spatial effect follows a zero mean stationary Gaussian stochastic process.

– Correlation of two arbitrary sites is defined by an intrinsic correlation function.

– Can be interpreted as a basis function approach with radial basis functions.

Bayesian Structured Hazard Regression 13

Page 15: Bayesian Structured Hazard Regression · { Only a small number of covariates is available (unobserved heterogeneity).) Model the transition intensities in analogy to survival models

Thomas Kneib Structured Hazard Regression

• Cluster-specific frailty terms:

– Account for unobserved heterogeneity.

– Easiest case: i.i.d Gaussian frailty.

• All covariates in the discussed model terms are allowed to be piecewise constanttime-varying.

Bayesian Structured Hazard Regression 14

Page 16: Bayesian Structured Hazard Regression · { Only a small number of covariates is available (unobserved heterogeneity).) Model the transition intensities in analogy to survival models

Thomas Kneib Bayesian Inference

Bayesian Inference

• Generic representation of structured hazard regression models:

λ(t) = exp [x(t)′γ + f1(z1(t)) + . . . + fp(zp(t))]

• For example:

f(z(t)) = g(t) z(t) = t log-baseline effect,

f(z(t)) = u(t)g(t) z(t) = (u, t) time-varying effect of u(t),f(z(t)) = f(x(t)) z(t) = x(t) smooth function of a continuous

covariate x(t),f(z(t)) = fspat(s) z(t) = s spatial effect,

f(z(t)) = f(x1(t), x2(t)) z(t) = (x1(t), x2(t)) interaction surface,

f(z(t)) = bg z(t) = g i.i.d. frailty bg, g is a groupingindex.

• The generic representation facilitates description of inferential details.

Bayesian Structured Hazard Regression 15

Page 17: Bayesian Structured Hazard Regression · { Only a small number of covariates is available (unobserved heterogeneity).) Model the transition intensities in analogy to survival models

Thomas Kneib Bayesian Inference

• All vectors of function evaluations fj can be expressed as

fj = Zjξj

with design matrix Zj, constructed from zj(t), and regression coefficients ξj.

• Generic form of the prior for ξj:

p(ξj|τ2j ) ∝ (τ2

j )−kj2 exp

(− 1

2τ2j

ξ′jKjξj

)

• Kj ≥ 0 acts as a penalty matrix, rank(Kj) = kj ≤ dj = dim(ξj).

• τ2j ≥ 0 can be interpreted as a variance or (inverse) smoothness parameter.

• Relation to penalized likelihood: Penalty terms

Pλj(ξj) = log[p(ξj|τ2

j )] = −12λjξ

′jKjξj, λj =

1τ2j

.

Bayesian Structured Hazard Regression 16

Page 18: Bayesian Structured Hazard Regression · { Only a small number of covariates is available (unobserved heterogeneity).) Model the transition intensities in analogy to survival models

Thomas Kneib Bayesian Inference

• Likelihood for right censored survival times under the assumption of noninformativecensoring:

n∏

i=1

λi(Ti)δi exp

(−

∫ Ti

0

λi(t)dt

),

where δi is the censoring indicator.

• In general, numerical integration has to be used to evaluate the cumulative hazardrate (e.g. the trapezoidal rule).

Bayesian Structured Hazard Regression 17

Page 19: Bayesian Structured Hazard Regression · { Only a small number of covariates is available (unobserved heterogeneity).) Model the transition intensities in analogy to survival models

Thomas Kneib Fully Bayesian inference based on MCMC

Fully Bayesian inference based on MCMC

• Assign inverse gamma prior to τ2j :

p(τ2j ) ∝ 1

(τ2j )aj+1 exp

(− bj

τ2j

).

Proper for aj > 0, bj > 0 Common choice: aj = bj = ε small.

Improper for bj = 0, aj = −1 Flat prior for variance τ2j ,

bj = 0, aj = −12 Flat prior for standard deviation τj.

• Conditions for proper posteriors in structured hazard regression: Enough uncensoredobservations and either

– proper priors for the variances or

– aj < bj = 0 and rank deficiency in the prior for ξj not too large.

Bayesian Structured Hazard Regression 18

Page 20: Bayesian Structured Hazard Regression · { Only a small number of covariates is available (unobserved heterogeneity).) Model the transition intensities in analogy to survival models

Thomas Kneib Fully Bayesian inference based on MCMC

• MCMC sampling scheme:

– Metropolis-Hastings update for ξj|·:Propose new state from a multivariate Gaussian distribution with precision matrixand mean

Pj = Z ′jWZj +1τ2j

Kj and mj = P−1j Z ′jW (y − η−j).

IWLS-Proposal with appropriately defined working weights W and workingobservations y.

– Gibbs sampler for τ2j |·:

Sample from an inverse Gamma distribution with parameters

a′j = aj +12rank(Kj) and b′j = bj +

12ξ′jKjξj.

• Efficient algorithms make use of the sparse matrix structure of Pj and Kj.

Bayesian Structured Hazard Regression 19

Page 21: Bayesian Structured Hazard Regression · { Only a small number of covariates is available (unobserved heterogeneity).) Model the transition intensities in analogy to survival models

Thomas Kneib Empirical Bayes inference based on mixed model methodology

Empirical Bayes inference based on mixed model methodology

• Consider the variances τ2j as unknown constants to be estimated.

• Idea: Consider ξj a correlated random effect with multivariate Gaussian distributionand use mixed model methodology.

• Problem: In most cases partially improper random effects distribution.

• Mixed model representation: Decompose

ξj = Xjβj + Zjbj,

wherep(βj) ∝ const and bj ∼ N(0, τ2

j Ikj).

⇒ βj is a fixed effect and bj is an i.i.d. random effect.

Bayesian Structured Hazard Regression 20

Page 22: Bayesian Structured Hazard Regression · { Only a small number of covariates is available (unobserved heterogeneity).) Model the transition intensities in analogy to survival models

Thomas Kneib Empirical Bayes inference based on mixed model methodology

• This yields the variance components model

λ(t; ·) = exp [x′β + z′b] ,

where in turnp(β) ∝ const and b ∼ N(0, Q).

• Obtain empirical Bayes estimates / penalized likelihood estimates via iterating

– Penalized maximum likelihood for the regression coefficients β and b.

– Restricted Maximum / Marginal likelihood for the variance parameters in Q:

L(Q) =∫

L(β, b,Q)p(b)dβdb → maxQ

.

• Involves Laplace approximation to the marginal likelihood (similar as in the previoustalk by Havard Rue).

• Corresponds to REML estimation of variances in Gaussian mixed models.

Bayesian Structured Hazard Regression 21

Page 23: Bayesian Structured Hazard Regression · { Only a small number of covariates is available (unobserved heterogeneity).) Model the transition intensities in analogy to survival models

Thomas Kneib Human Sleep Data

Human Sleep Data

• Consider individual human sleep data as independent realisations of time-continuousstochastic processes with discrete state space {awake, REM, non-REM}.

• Compact description of such a process in terms of transition intensities between thesestates.

• Simple approaches: Markov or Semi-Markov processes.

• Limitations / Questions:

– Changing dynamics of human sleep over night.

– Individual sleeping habits to be described by covariates.

– Only a small number of covariates is available (unobserved heterogeneity).

⇒ Model the transition intensities in analogy to survival models.

Bayesian Structured Hazard Regression 22

Page 24: Bayesian Structured Hazard Regression · { Only a small number of covariates is available (unobserved heterogeneity).) Model the transition intensities in analogy to survival models

Thomas Kneib Human Sleep Data

stat

e

awak

eN

on−

RE

MR

EM

time since sleep onset (in hours)

cort

isol

0 2 4 6 8

050

100

150

Bayesian Structured Hazard Regression 23

Page 25: Bayesian Structured Hazard Regression · { Only a small number of covariates is available (unobserved heterogeneity).) Model the transition intensities in analogy to survival models

Thomas Kneib Human Sleep Data

• Simplified structure for the transitions:

Awake

Non-REM REM

Sleep

?

6

-

¾

λRN(t)

λNR(t)

λAS(t) λSA(t)

Bayesian Structured Hazard Regression 24

Page 26: Bayesian Structured Hazard Regression · { Only a small number of covariates is available (unobserved heterogeneity).) Model the transition intensities in analogy to survival models

Thomas Kneib Human Sleep Data

• Model for the transitions:

λAS,i(t) = exp[γ

(AS)0 (t) + siβ

(AS) + b(AS)i

]

λSA,i(t) = exp[γ

(SA)0 (t) + siβ

(SA) + b(SA)i

]

λNR,i(t) = exp[γ

(NR)0 (t) + ci(t)γ

(NR)1 (t) + siβ

(NR) + b(NR)i

]

λRN,i(t) = exp[γ

(RN)0 (t) + ci(t)γ

(RN)1 (t) + siβ

(RN) + b(RN)i

]

where

ci(t) =

{1 cortisol > 60 n mol/l at time t

0 cortisol ≤ 60 n mol/l at time t,

si =

{1 male

0 female,

b(j)i = transition- and individual-specific frailty.

Bayesian Structured Hazard Regression 25

Page 27: Bayesian Structured Hazard Regression · { Only a small number of covariates is available (unobserved heterogeneity).) Model the transition intensities in analogy to survival models

Thomas Kneib Human Sleep Data

• Use penalized splines for the baseline and time-varying effects.

• I.i.d. Gaussian priors for the frailty terms (with transition-specific variances).

• The likelihood contribution for individual i can be constructed based on a countingprocess formulation of the model:

li =k∑

h=1

[∫ Ti

0

log(λhi(t))dNhi(t)−∫ Ti

0

λhi(t)Yhi(t)dt

]

=ni∑

j=1

k∑

h=1

[δhi(tij) log(λhi(tij))− Yhi(tij)

∫ tij

ti,j−1

λhi(t)dt

].

Nhi(t) counting process for type h event.

Yhi(t) at risk indicator for type h event.

tij event times of individual i.

ni number of events for individal i.

δhi(tij) transition indicator for type h transition.

Bayesian Structured Hazard Regression 26

Page 28: Bayesian Structured Hazard Regression · { Only a small number of covariates is available (unobserved heterogeneity).) Model the transition intensities in analogy to survival models

Thomas Kneib Human Sleep Data

⇒ Concepts from survival analysis can be adapted.

• In particular:

– Fully Bayesian inference based on MCMC and

– Mixed model based empirical Bayes inference.

Bayesian Structured Hazard Regression 27

Page 29: Bayesian Structured Hazard Regression · { Only a small number of covariates is available (unobserved heterogeneity).) Model the transition intensities in analogy to survival models

Thomas Kneib Human Sleep Data

• Baseline effects:

0 2 4 6 8

−2.

4−

2.0

−1.

6−

1.2

awake −> sleep

0 2 4 6 8

−4.

5−

4.0

−3.

5−

3.0

−2.

5−

2.0

−1.

5

sleep −> awake

0 2 4 6 8

−15

−10

−5

0

Non−REM −> REM

0 2 4 6 8

−5.

5−

5.0

−4.

5−

4.0

−3.

5−

3.0

−2.

5

REM −> Non−REM

Bayesian Structured Hazard Regression 28

Page 30: Bayesian Structured Hazard Regression · { Only a small number of covariates is available (unobserved heterogeneity).) Model the transition intensities in analogy to survival models

Thomas Kneib Human Sleep Data

• Time-varying effects for a high level of cortisol:

0 2 4 6 8

−10

−5

05

Non−REM −> REM

0 2 4 6 8

−3

−2

−1

01

2

REM −> Non−REM

Bayesian Structured Hazard Regression 29

Page 31: Bayesian Structured Hazard Regression · { Only a small number of covariates is available (unobserved heterogeneity).) Model the transition intensities in analogy to survival models

Thomas Kneib Software

Software

• Estimation was carried out using BayesX.

• Public domain software package for Bayesian inference in geoadditive and relatedmodels.

• Available from

http://www.stat.uni-muenchen.de/~bayesx

Bayesian Structured Hazard Regression 30

Page 32: Bayesian Structured Hazard Regression · { Only a small number of covariates is available (unobserved heterogeneity).) Model the transition intensities in analogy to survival models

Thomas Kneib Conclusions

Conclusions

• Unified framework for general regression models describing the hazard rate of survivalmodels.

• Bayesian inference based on MCMC or mixed model methodology.

• Extendable to models for transition intensities in multi state models.

• Future work:

– More general censoring mechanisms.

– Conditions for propriety of posteriors.

– Joint modelling of covariates and duration times.

Bayesian Structured Hazard Regression 31

Page 33: Bayesian Structured Hazard Regression · { Only a small number of covariates is available (unobserved heterogeneity).) Model the transition intensities in analogy to survival models

Thomas Kneib References

References

• Brezger, Kneib & Lang (2005). BayesX: Analyzing Bayesian structured additiveregression models. Journal of Statistical Software, 14 (11).

• Fahrmeir, Kneib & Lang (2004): Penalized structured additive regression forspace-time data: a Bayesian perspective. Statistica Sinica 14, 731–761.

• Hennerfeind, Brezger, and Fahrmeir (2006): Geoadditive Survival Models.Journal of the American Statistical Association, to appear.

• Kneib & Fahrmeir (2006): A mixed model approach for geoadditive hazardregression. Scandinavian Journal of Statistics, to appear.

• Kneib (2006): Geoadditive hazard regression for interval censored survival times.Computational Statistics and Data Analysis, conditionally accepted.

Bayesian Structured Hazard Regression 32