Liang Li Department of Quantitative Health Sciences Cleveland Clinic Joint Modeling of Longitudinal and Survival Data Presented at ASA North Illinois Chapter Spring Meeting, March 5 2009


  • Liang Li

    Department of Quantitative Health Sciences, Cleveland Clinic

    Joint Modeling of Longitudinal and Survival Data

    Presented at ASA North Illinois Chapter Spring Meeting, March 5 2009

  • Outline of the Talk

    What is joint modeling of longitudinal & survival data?

    The shared parameter model

    The measurement error perspective

    Our proposal

    Why it works (theoretical properties)

    How it works (empirical performance)

    Extension and on-going work

  • Longitudinal Data

    Each subject is followed over a period of time; a series of measurements made.

    e.g., after lung transplant, FEV1 is measured every week for a month, and every month afterwards until the end of the study

    [Figure: FEV1 plotted against months (0 to 18).]

  • Survival Data

    Time to event such as death, machine failure, disease relapse, PFS, etc.

    Could be censored (partially observed)

    [Figure: Kaplan-Meier curve of survival (%) against years (0 to 6).]

    [Figure: subject timelines against time in months (2 to 10), each marked as death or censored.]
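    The Kaplan-Meier curve named on this slide can be sketched in a few lines. This is a minimal illustration, not code from the talk: the function name `kaplan_meier` and the toy follow-up data are assumptions made up for the example.

```python
import numpy as np

def kaplan_meier(time, event):
    """Return the distinct event times and the Kaplan-Meier estimate just after each."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    event_times = np.unique(time[event])
    surv, s = [], 1.0
    for t in event_times:
        at_risk = np.sum(time >= t)            # subjects still under observation at t
        deaths = np.sum((time == t) & event)   # events occurring exactly at t
        s *= 1.0 - deaths / at_risk            # product-limit update
        surv.append(s)
    return event_times, np.array(surv)

# Hypothetical toy data: follow-up in months, 1 = death, 0 = censored
months = [2, 3, 3, 5, 8, 10]
died = [1, 0, 1, 1, 0, 1]
t, s = kaplan_meier(months, died)
```

    Censored subjects contribute to the at-risk count until they leave the study, which is how partially observed follow-up still informs the curve.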

  • Joint Modeling of Longitudinal and Survival Data

    Question: how does the change in the (earlier) longitudinal profile of a subject relate to the risk of the (later) survival event?

    Example 1: Rate of change of glomerular filtration rate (GFR) & time to end stage renal disease (ESRD) or death

    Example 2: FEV1 & survival among cystic fibrosis patients

    Widespread use and an active research field, e.g., surrogate endpoints

  • Longitudinal Profile

    [Figure: FEV1 plotted against months (0 to 18).]

    longitudinal profile = signal + noise

    Linear profile: subject-specific (random) intercept & slope

    Relates intercept & slope to survival

    Can we use raw data profile and avoid joint modeling?

    Nonlinear profile: time-dependent covariate curve

  • Data Structure

    [Diagram: longitudinal data (e.g., linear mixed model) and survival data (e.g., Cox model), linked through the subject-specific intercept & slope; the two levels are labeled Stage 1 and Stage 2.]

    Two-stage hierarchical model

    Longitudinal part and survival part are conditionally independent given the subject-specific intercept and slope

  • Shared Parameter Model

    Two-stage hierarchical structure suggests the shared parameter model

    Review by Tsiatis & Davidian (2004), and Tseng, Hsieh, Wang (2005), Liu & Ying (2007), among others

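    Almost all based on the following Fisher likelihood:

    $$\sum_{i=1}^{n} \log \left\{ \int f(\text{long} \mid \text{int}, \text{slope})\, f(\text{surv} \mid \text{int}, \text{slope})\, f(\text{int}, \text{slope})\, d[\text{int}, \text{slope}] \right\}$$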

    Pros: maximum likelihood estimator

    Cons: computationally intensive, distributional assumptions needed
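    To make the computational cost concrete, the integral over the random effects can be evaluated numerically for one subject. The sketch below is an illustration under simplifying assumptions that are not in the talk: a single random intercept `b ~ N(0, tau^2)`, longitudinal measurements `y_j ~ N(b, sigma^2)`, and a constant hazard `exp(a0 + a1 * b)`. The function name and all parameter values are hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss

def subject_loglik(y, surv_time, delta, sigma, tau, a0, a1, n_nodes=40):
    """log of the integral f(long | b) f(surv | b) f(b) db, by Gauss-Hermite quadrature."""
    x, w = hermgauss(n_nodes)              # nodes and weights for the weight exp(-x^2)
    b = np.sqrt(2.0) * tau * x             # change of variable so that b ~ N(0, tau^2)
    # Longitudinal part: independent normal measurements around b
    resid = y[:, None] - b[None, :]
    log_f_long = (-0.5 * y.size * np.log(2 * np.pi * sigma**2)
                  - 0.5 * np.sum(resid**2, axis=0) / sigma**2)
    # Survival part: constant hazard exp(a0 + a1 * b), possibly censored
    log_haz = a0 + a1 * b
    log_f_surv = delta * log_haz - surv_time * np.exp(log_haz)
    # The normal density of b is absorbed by the quadrature weights
    return np.log(np.sum(w * np.exp(log_f_long + log_f_surv)) / np.sqrt(np.pi))

# Hypothetical subject: two measurements, event observed at time 2
y_obs = np.array([0.3, -0.2])
ll = subject_loglik(y_obs, surv_time=2.0, delta=1, sigma=1.5, tau=1.0, a0=-1.0, a1=0.5)
```

    With a random intercept and slope the integral becomes two-dimensional, and it has to be recomputed for every subject at every iteration of the maximization, which is where the computational burden arises.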

  • A New Perspective

    Can we use a two-step approach for the two-stage problem?

    step 1: estimate the intercept and slope for each subject

    step 2: relate them to survival

    [Figure: the true line and fitted lines plotted against time (0 to 10).]
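    Step 1 of the two-step approach can be sketched as an ordinary least-squares fit per subject. This is a minimal illustration, assuming one subject with a linear profile; the function name `subject_ols` and the numerical values are made up for the example.

```python
import numpy as np

def subject_ols(times, values):
    """Least-squares intercept and slope for one subject's measurements."""
    D = np.column_stack([np.ones_like(times, dtype=float), times])
    coef, *_ = np.linalg.lstsq(D, values, rcond=None)
    return coef  # [intercept, slope]

rng = np.random.default_rng(0)
true_intercept, true_slope, noise_sd = 12.0, -0.4, 1.0   # hypothetical values
times = np.arange(0.0, 11.0, 2.0)
values = true_intercept + true_slope * times + rng.normal(0, noise_sd, times.size)
est = subject_ols(times, values)

# Sampling covariance of the estimate around the truth: noise_sd^2 * (D^T D)^{-1}
D = np.column_stack([np.ones_like(times), times])
cov = noise_sd**2 * np.linalg.inv(D.T @ D)
```

    The fitted line differs from the true line by an amount whose covariance is known up to the noise variance, which is what allows the estimates to be treated as error-prone measurements in step 2.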

  • The Measurement Error Perspective

    Do a regression of survival using true subject-specific intercepts and slopes

    true intercept & slope unknown

    estimated intercept & slope act as surrogates

    measurement error may cause bias in regression

    [Figure: Y plotted against Z (red) or X (blue).]

    Y = b0 + b1 Z + e

    X = Z + U

    Measurement error causes attenuation in the regression Y ~ X
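    The attenuation can be reproduced with a small simulation. This is a sketch under assumed values (the coefficients, variances, and sample size are hypothetical): regressing Y on the error-prone X shrinks the slope by the classical factor var(Z) / (var(Z) + var(U)).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
b0, b1 = 0.5, 1.0            # hypothetical true coefficients
var_z, var_u = 1.0, 0.5      # variance of the true covariate and of the error

Z = rng.normal(0, np.sqrt(var_z), n)            # true covariate
U = rng.normal(0, np.sqrt(var_u), n)            # measurement error
X = Z + U                                       # observed surrogate
Y = b0 + b1 * Z + rng.normal(0, 0.3, n)

slope_true = np.polyfit(Z, Y, 1)[0]             # regression on the true covariate
slope_naive = np.polyfit(X, Y, 1)[0]            # regression on the surrogate
reliability = var_z / (var_z + var_u)           # classical attenuation factor
```

    Here `slope_naive` lands near `b1 * reliability` (about two thirds of the true slope) while `slope_true` stays near `b1`.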

  • The Example

    HEMO study: a clinical trial coordinated at Cleveland Clinic

    2 by 2 design: standard or high dose of dialysis, low or high flux dialyzer

    Neither treatment was found to significantly affect time to all-cause mortality (Rocco et al., 2004)

    We want to study a secondary question: whether the decline of albumin levels is a strong predictor of mortality

    Challenge: albumin measurements need to be calibrated to remove artificial differences due to variations in total body water.

    Monday-Wednesday-Friday

    Tuesday-Thursday-Saturday

  • Model & Notation - Longitudinal Part

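    Longitudinal sub-model: linear mixed model

    Stage 1: $$W_{ij} = V_{ij}^T \beta + D_{ij}^T \theta_i + \varepsilon_{ij}$$

    Stage 2: $$\theta_i = X_i \gamma + \eta_i$$

    In the context of the application:

    $$\mathrm{ALB}_{ij} = \text{Mon/Tues}_{ij}\,\beta + \theta_{i0} + \mathrm{Time}_{ij}\,\theta_{i1} + \mathrm{Noise}_{ij}$$

    $$\theta_{i0} = \mathrm{Intercept}_0 + \mathrm{Dose}_i A_0 + \mathrm{Flux}_i B_0 + \eta_{i0}$$

    $$\theta_{i1} = \mathrm{Intercept}_1 + \mathrm{Dose}_i A_1 + \mathrm{Flux}_i B_1 + \eta_{i1}$$

    We start with $\eta_i \sim N(0, \Sigma)$ and show later that the conclusion holds even when this assumption is dropped.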

  • Model & Notation - Survival Part

    Survival sub-model: Cox proportional hazard model

    T: time to event; C: time to censoring

    $$Y = \min(T, C), \qquad \delta = 1\{T < C\}$$

    Log hazard function:

    $$h(t; Z_i, \eta_i) = h_0(t) + Z_i^T a_1 + \eta_i^T a_2$$

    In the context of the example:

    $$h(t; Z_i, \eta_i) = h_0(t) + \mathrm{Dose}_i\, a_{11} + \mathrm{Flux}_i\, a_{12} + \eta_{i0}\, a_{20} + \eta_{i1}\, a_{21}$$

    The proposed model includes as special cases the models considered by Wang (2006), Ratcliffe et al. (2004), Hsieh, Tseng, Wang (2006), Tsiatis & Davidian (2004), among others

    Stage 2

  • Poissonization of Cox Model

    Step 1: use B-spline to approximate log baseline hazard

    $$h_0(t) \approx \sum_{k=1}^{K} a_{0k}\, \phi_k(t)$$

    Step 2: use full likelihood of Cox model, not partial likelihood

    $$\delta_i\, U_i(Y_i)^T \alpha - \int_0^{Y_i} \exp\{U_i(t)^T \alpha\}\, dt$$

    Step 3: use trapezoidal rule for numerical integration

    Finally: we can fit a Cox model using Poisson regression
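    The link between the trapezoidal rule and Poisson regression can be checked numerically for one subject. This is a sketch, not the authors' implementation: the function names, the grid size, and the linear log hazard are assumptions chosen for illustration. Once the integral is replaced by a weighted sum over grid points, each grid point behaves like a Poisson observation with offset log(c_g) and a pseudo-response equal to the event indicator at the last grid point.

```python
import numpy as np

def trapezoid_weights(grid):
    """Weights c_g such that sum(c_g * f(t_g)) is the trapezoidal approximation of the integral."""
    h = np.diff(grid)
    c = np.zeros_like(grid)
    c[:-1] += h / 2
    c[1:] += h / 2
    return c

def cox_full_loglik(log_hazard, y, delta, n_grid=201):
    """delta * log h(y) minus the integral of h over [0, y], by the trapezoidal rule."""
    grid = np.linspace(0.0, y, n_grid)
    c = trapezoid_weights(grid)
    return delta * log_hazard(y) - np.sum(c * np.exp(log_hazard(grid)))

def poisson_form(log_hazard, y, delta, n_grid=201):
    """The same quantity written as a Poisson log likelihood kernel with offset log(c_g)."""
    grid = np.linspace(0.0, y, n_grid)
    c = trapezoid_weights(grid)
    resp = np.zeros(n_grid)
    resp[-1] = delta                        # pseudo-response: event indicator at the last grid point
    eta = log_hazard(grid) + np.log(c)      # linear predictor plus offset
    return np.sum(resp * eta - np.exp(eta))

log_h = lambda t: -1.0 + 0.3 * t            # hypothetical log hazard
full = cox_full_loglik(log_h, 2.0, 1)
pois = poisson_form(log_h, 2.0, 1)
```

    The two values differ only by `delta * log(c_last)`, a constant that does not involve the model parameters, so maximizing one maximizes the other.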

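  • The joint log likelihood (for one subject)

    Key observation: $\theta_i$ appears in linear, quadratic or exponential terms

    Longitudinal (Stage 1):

    $$LL_i(\theta_i) = -\frac{n_i}{2} \log(2\pi\sigma_\varepsilon^2) - \frac{\| W_i - V_i\beta - D_i\theta_i \|^2}{2\sigma_\varepsilon^2}$$

    Survival/Poisson:

    $$LS_i(\theta_i) = \sum_{g=0}^{M_i} \Big[ Y_{ig} \{ U_{1ig}^T \alpha_1 + \theta_i^T \alpha_2 - \alpha_2^T X_i \gamma + \log(c_{ig}) \} - \exp\{ U_{1ig}^T \alpha_1 + \theta_i^T \alpha_2 - \alpha_2^T X_i \gamma + \log(c_{ig}) \} \Big]$$

    $$LM_i(\theta_i) = -\frac{q}{2} \log(2\pi) - \frac{1}{2} \log|\Sigma| - \frac{1}{2} (\theta_i - X_i\gamma)^T \Sigma^{-1} (\theta_i - X_i\gamma)$$

    True likelihood → corrected version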

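  • From linear model theory, $\hat\theta_i$ is a measurement of $\theta_i$

    $$\hat\theta_i = (D_i^T D_i)^{-1} D_i^T (W_i - V_i\beta)$$

    $$\hat\theta_i \mid \theta_i \sim N\{\theta_i,\ \sigma_\varepsilon^2 (D_i^T D_i)^{-1}\}$$

    If $W \sim N(X, \sigma_u^2)$ then the following replacements are unbiased:

    Linear: $\sum_i X_i \leftarrow \sum_i W_i$

    Quadratic: $\sum_i X_i^2 \leftarrow \sum_i (W_i^2 - \sigma_u^2)$

    Exponential: $\sum_i \exp(X_i) \leftarrow \sum_i \exp(W_i - \tfrac{1}{2}\sigma_u^2)$

    Joint log likelihood:

    $$\sum_{i=1}^{n} LL_i(\theta_i) + \sum_{i=1}^{n} LS_i(\theta_i) + \sum_{i=1}^{n} LM_i(\theta_i)$$

    Do correction to the joint log likelihood (formula omitted) to obtain the corrected likelihood, with $\theta_i$ replaced by $\hat\theta_i$

    [Figure: plot against time (0 to 10).]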
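    The quadratic and exponential corrections can be checked by simulation. This is a sketch under assumed values (the error standard deviation, sample size, and distribution of the true values are hypothetical): the naive plug-in sums are biased, while the corrected sums match the sums based on the unobserved true values.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 400_000
sigma_u = 0.5                          # hypothetical error SD, assumed known

X = rng.uniform(-1.0, 1.0, n)          # unobserved true values
W = X + rng.normal(0, sigma_u, n)      # observed surrogates, W | X ~ N(X, sigma_u^2)

target_quad = np.mean(X**2)
naive_quad = np.mean(W**2)                              # biased upward by sigma_u^2
corrected_quad = np.mean(W**2 - sigma_u**2)

target_exp = np.mean(np.exp(X))
naive_exp = np.mean(np.exp(W))                          # biased upward by exp(sigma_u^2 / 2)
corrected_exp = np.mean(np.exp(W - 0.5 * sigma_u**2))
```

    The linear term needs no correction because W is already unbiased for X.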

  • A Few Remarks

    The proposed estimators are maximizers of the corrected joint log likelihood function

    Variance components estimated separately in a side step.

    Mis-specification allowed, like GEE

    Result not sensitive to the B-spline approximation

    Statistical inference based on sandwich variance estimator

  • Summary on Proposed Method

    Key idea: find a corrected joint log likelihood that looks like the true joint log likelihood with the unknowns eliminated

    This is possible because the unknowns reside in linear, quadratic or exponential terms (Li and Greene, Biometrics 2008)

    Combine three pieces of log likelihood together, similar in spirit to the h-likelihood (1996), but different from the classical Fisher likelihood (1922)

    Compared with Wang (2006, Stat Sinica), our method:

    is more general (unknown parameters in both sub-models), including most published models as special cases

    uses exact correction with full likelihood instead of approximate correction with partial likelihood

    has a concave likelihood (next page)

  • Theoretical Properties

    The estimators of the unknown parameters are maximizers of the corrected joint log likelihood

    As sample size becomes large:

    the estimator is consistent

    the estimator is asymptotically normal

    the corrected joint log likelihood is concave

    These properties remain valid even when the random effects do not have normal distribution or their variance matrix is misspecified (robust)

  • Simulation Results

    We conducted extensive computer simulations to investigate the empirical performance of the proposed method

    Bias, variance, coverage of confidence interval: Good

    Result not sensitive to number of knots of B-spline

    The computation is much faster than competing methods based on maximum likelihood

    The algorithm is stable and always converges (concavity)

    Estimator expected to be less efficient than maximum likelihood based methods, a trade-off for robustness

  • Parameter          Bias, uncorrected (two-step)   Bias, proposed   CI coverage of proposed (%)
    L 1 = 1            0.00197                        0.00299          94.5
    L 2 = 2            -0.00370                       -0.00571         94.0
    L 3 = 1            0.00591                        0.00659          94.0
    L 4 = 0.5          -0.0104                        -0.0118          97.0
    intercept = 0.5    -0.347                         0.0196           96.0
    slope = 1          -0.471                         0.0552           95.5

    n = 250

  • Application to HEMO Study Data

    1628 patients with between 3 and 15 repeated measurements

    Parameter            Estimator   p-value
    intercept            3.7         < 0.001
    high dose            0.0012      0.94
    high flux            -0.007      0.67
    time (years)         -0.058      < 0.001
    high dose by time    -0.014      0.311
    high flux by time    -0.01       0.468
    Monday / Tuesday     -0.026      0.017

    high dose            -0.061      0.5
    high flux            -0.069      0.44
    random intercept     -1.5        < 0.001
    random slope         -3.7        < 0.001

    [Figure: trajectories against time (0 to 10) contrasting a smaller slope (-0.4) with a larger slope (-0.2).]

  • Estimated baseline survival function and its 95% point-wise confidence interval

    [Figure: survival (%) against years (0 to 6); smooth curve compared with the step function from partial likelihood.]

  • Summary

    A new method for joint modeling

    A general model that includes most published models as special cases

    Theoretically appealing properties and reliable and easy computation

    Robust against certain model mis-specification

    Numerical integration methods other than the trapezoidal rule may be used (Poissonization is not essential)

    Limitation:

    Need at least three repeated measurements per subject

    Trade efficiency for robustness, best for large sample size

  • Nonlinear Longitudinal Data

    In a lung transplant study at Cleveland Clinic, investigators want to use FEV1 profile after lung transplant to predict mortality

    The profile is clearly nonlinear

  • [Figure: mean FEV1 trajectory, subject-clustering ignored; FEV1 against months after transplant (0 to 100).]

  • [Figure: Subject-Specific Fitted Curves; fitted curves against months after transplant (0 to 100), with curves labeled by subject number 1 to 311.]

  • Replace subject-specific intercept or slope with time-dependent covariate

    Want error correction?

    [Figure: true curve and estimated curve plotted against time (0 to 6).]

  • Proposed Model & Method

    Cox model with time-dependent covariate and time-dependent hazard ratios (varying coefficients)

    $$W_i(t) = X_i(t) + \varepsilon_i(t)$$

    Constant coefficients: $$h_i(t; X(t)) = \exp\{\beta_0(t) + \beta_1^T X_i(t)\}$$

    Varying coefficients: $$h_i(t; X(t)) = \exp\{\beta_0(t) + \beta_1(t)^T X_i(t)\}$$

    Why varying coefficients: constant hazard ratios unlikely for surgical data

    Use whatever method to fit each subject's longitudinal profile separately

    Get estimated curve & its variation; do the measurement error correction

    Deal with varying coefficients

  • Proposed Method

    Local linear method: estimate the curves piece by piece in local neighborhoods.

    [Figure: two panels of Y plotted against time (0 to 10).]
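    The local linear idea can be sketched for an ordinary curve-fitting problem. This is an illustration only: the talk applies the idea to the Cox full likelihood, whereas the sketch below uses weighted least squares, and the Gaussian kernel, the bandwidth, and the function name `local_linear` are assumptions made for the example.

```python
import numpy as np

def local_linear(t0, t, y, h):
    """Local linear estimate of the curve value and slope at t0, Gaussian kernel with bandwidth h."""
    k = np.exp(-0.5 * ((t - t0) / h) ** 2)             # kernel weights K_h(t - t0)
    D = np.column_stack([np.ones_like(t), t - t0])     # local intercept and local slope
    sw = np.sqrt(k)
    coef, *_ = np.linalg.lstsq(D * sw[:, None], y * sw, rcond=None)
    return coef[0], coef[1]                            # fitted value and slope at t0

# Hypothetical noisy curve observed on a grid
rng = np.random.default_rng(3)
t = np.linspace(0.0, 10.0, 101)
y = np.sin(t / 2.0) + rng.normal(0, 0.2, t.size)
fit = np.array([local_linear(t0, t, y, h=1.0)[0] for t0 in t])
```

    Each evaluation point gets its own small weighted regression, so the curve is pieced together from many local straight-line fits.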

  • Proposed Method

    Local linear method for the full likelihood of Cox model

    Our proposal differs from all previous methods in that we do not use the partial likelihood (for exact correction)

    [Figure: two panels against time (2 to 10), labeled "artificially censored" and "removed".]

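  • The Evolution of Likelihoods

    Cox log likelihood:

    $$\sum_{i=1}^{n} \Big[ \delta_i \{ X_i(Y_i)^T \beta(Y_i) \} - \int_0^{Y_i} \exp\{ X_i(t)^T \beta(t) \}\, dt \Big]$$

    Cox local likelihood:

    $$\sum_{i=1}^{n} \Big[ K_h(Y_i - t_0)\, \delta_i \{ X_i(Y_i)^T \beta(Y_i) \} - \int_0^{Y_i} K_h(t - t_0) \exp\{ X_i(t)^T \beta(t) \}\, dt \Big]$$

    with correction:

    $$\sum_{i=1}^{n} \Big[ K_h(Y_i - t_0)\, \delta_i \{ W_i(Y_i)^T \beta(Y_i) \} - \int_0^{Y_i} K_h(t - t_0) \exp\{ W_i(t)^T \beta(t) - \tfrac{1}{2} \beta(t)^T \Sigma(t) \beta(t) \}\, dt \Big]$$

    with local linear approx.: replace $\beta(t)$ by intercept + slope × t

    under construction ...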

  • References

    Liang Li, Bo Hu, Tom Greene (2009) A semiparametric joint model for longitudinal and survival data with application to hemodialysis study. Biometrics, in press.

    Liang Li. Semiparametric joint modeling of nonlinear time-dependent covariate process and time to event outcome with varying coefficients. Working paper.
