
Motivation Example Complete Longitudinal Data Missing outcome Missing Covariates Missing both Response and Covariates

Doubly robust estimates for longitudinal data analysis with missing response and missing covariates

Xiao-Hua Andrew Zhou, Ph.D

Co-Investigator and Senior Biostatistician, NACC
Professor, Department of Biostatistics

University of Washington

October, 2009

Xiao-Hua Zhou Doubly robust estimates for longitudinal data analysis with missing response and missing covariates ADC, 2009 1 / 43


1 NACC UDS

2 Analysis of Complete Longitudinal Data

3 Estimating Equations for Missing Outcome

4 Methods for Handling Missing Covariates

5 New Method
    Model Formulation for Missing Response and Covariates
    Estimation and Inference

6 Simulations and Applications
    Simulations
    Applications

7 Summary


A NACC example

Using the National Alzheimer’s Coordinating Center (NACC) Uniform Data Set (UDS), we are interested in assessing the association between patients' characteristics and the onset of dementia.

The response is the diagnosis of dementia (Yes/No).

The covariates that may be related to the status of dementia include sex, congestive heart failure (CVCHF, yes/no), family history of dementia (FHDEM, yes/no), diabetes (yes/no), behavioral assessment (depression or dysphoria, yes/no), hypertension (yes/no), education (years), Mini-Mental State Exam (MMSE) score, and age.


A NACC example, continued

There are 16223 subjects from 29 Alzheimer’s Disease Centers included at the entry of this study.

Follow-up visits for subjects are scheduled at approximately one-year intervals, with up to three follow-ups at present.


An example, continued

For various reasons, some data are missing for the response and for the behavioral assessment covariate.

There are 8724 subjects with complete data on scheduled visits.

About 11.9% of subjects are missing both the response and the behavioral assessment; about 31.2% are missing the response but have the behavioral assessment observed; about 3.2% are missing the behavioral assessment but have the response observed; and about 53.7% have both the response and the behavioral assessment covariate observed.


GEE Approach with Complete Longitudinal Data

The method of generalized estimating equations (GEE) is a popular method for analyzing longitudinal data.

It requires only the specification of a model for the marginal mean and variance of each measurement and of a "working" matrix for the correlation between measurements in a cluster.


Notations

Let $Y_{ij}$ denote the response of individual $i$ at time $j$ ($i = 1, \ldots, N$; $j = 1, \ldots, M_i$). Let $Y_i = (Y_{i1}, \ldots, Y_{iM_i})^T$.

Let $x_{ij}$ denote a vector of covariates for individual $i$ at time $j$, and $x_i = (x_{i1}^T, \ldots, x_{iM_i}^T)^T$.

Let $\mu_{ij} = E(Y_{ij} \mid x_{ij})$, $g(\mu_{ij}) = \beta^T x_{ij}$; let $\mu_i = (\mu_{i1}, \ldots, \mu_{iM_i})^T$.


GEE for Complete Data Analysis

The GEE for complete data are

\[ \sum_{i=1}^{N} U_i(\beta, \rho; Y_i, x_i) = 0, \]

where

\[ U_i(\beta, \rho; Y_i, x_i) = \frac{\partial \mu_i^T}{\partial \beta} V_i(\rho)^{-1} (Y_i - \mu_i), \]

and $V_i(\rho)$ is the working covariance matrix of $Y_i$.
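The complete-data GEE can be sketched numerically. The example below is only an illustration under stated assumptions: a logit link, two regression coefficients (intercept and one covariate), and an independence working covariance, in which case each cluster's contribution reduces to the sum of x_ij (Y_ij - mu_ij). The function names and the toy clusters are hypothetical and do not come from the slides.

```python
import math


def expit(t):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-t))


def gee_score(beta, clusters):
    """Sum over clusters of U_i(beta) for a logit link and an
    independence working covariance V_i = diag(mu_ij * (1 - mu_ij)).

    clusters: list of clusters, each a list of (y, x) pairs, where x is a
    covariate vector [1.0, x1] (intercept included).
    """
    score = [0.0, 0.0]
    for cluster in clusters:
        for y, x in cluster:
            mu = expit(beta[0] * x[0] + beta[1] * x[1])
            # (d mu / d beta) * V^{-1} * (y - mu) simplifies to x * (y - mu)
            score[0] += x[0] * (y - mu)
            score[1] += x[1] * (y - mu)
    return score


def solve_gee_logit(clusters, n_iter=25):
    """Newton-Raphson iterations for the two-parameter score equations."""
    beta = [0.0, 0.0]
    for _ in range(n_iter):
        u = gee_score(beta, clusters)
        a = b = d = 0.0  # entries of the 2x2 matrix sum of w * x x'
        for cluster in clusters:
            for _, x in cluster:
                mu = expit(beta[0] * x[0] + beta[1] * x[1])
                w = mu * (1.0 - mu)
                a += w * x[0] * x[0]
                b += w * x[0] * x[1]
                d += w * x[1] * x[1]
        det = a * d - b * b
        beta = [beta[0] + (d * u[0] - b * u[1]) / det,
                beta[1] + (a * u[1] - b * u[0]) / det]
    return beta


# Toy data: four subjects, two visits each, binary covariate equal to 0 then 1.
toy_clusters = [
    [(0, [1.0, 0.0]), (1, [1.0, 1.0])],
    [(0, [1.0, 0.0]), (0, [1.0, 1.0])],
    [(1, [1.0, 0.0]), (1, [1.0, 1.0])],
    [(0, [1.0, 0.0]), (1, [1.0, 1.0])],
]
beta_hat = solve_gee_logit(toy_clusters)
```

With a non-independence working correlation the score would also involve the inverse of V_i(rho); that case is left out here to keep the sketch short.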


Asymptotic results

When $x_i$ contains only time-independent covariates, under some regularity conditions, the GEE yields estimators that are consistent.

If $x_i$ includes some time-dependent covariates, the GEE still yields consistent estimators under the additional assumption that $E(Y_{ij} \mid x_i) = E(Y_{ij} \mid x_{ij})$. If this is not the case, then an independence working correlation should be used to ensure consistency.


Time-dependent Covariates

Let $L_{ij}$ denote all the data that should be collected on individual $i$ at time $j$.

Let $\bar{L}_{ij}$ denote the data available on individual $i$ by time $j$.

Let $\underline{L}_{ij}$ denote the data not yet available by time $j$.

Note that $L_{ij}$ includes both $Y_{ij}$ and $x_{ij}$.


Drop-out

Let $R_{ij} = 1$ if measurement $j$ on individual $i$ is observed and $R_{ij} = 0$ otherwise.

Assume monotone drop-out: $R_{ij} = 0$ implies $R_{ik} = 0$ for all times $k > j$.

Let $C_{ij} = 1$ if subject $i$'s last observed measurement is at time $j$ and 0 otherwise.

We assume that the covariates included in $L_{ij}$ are chosen so that the data can be assumed to be Missing at Random (MAR):

\[ P(R_{ij} = 1 \mid \bar{L}_{iM_i}, R_{i,j-1} = 1) = P(R_{ij} = 1 \mid \bar{L}_{i,j-1}, R_{i,j-1} = 1), \]

i.e., the probability of missingness depends only on the observed data.
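The drop-out indicators can be made concrete with a small sketch. It assumes each subject's observation indicators are stored as a list of 0/1 values in visit order; the helper names are hypothetical.

```python
def is_monotone(r):
    """True if R_ij = 0 implies R_ik = 0 for every later time k > j."""
    seen_missing = False
    for rij in r:
        if rij == 0:
            seen_missing = True
        elif seen_missing:
            # an observed visit after a missing one: intermittent missingness
            return False
    return True


def last_observed_indicator(r):
    """C_ij: 1 at the last observed visit and 0 elsewhere.

    Only defined here for monotone patterns, as assumed on the slide.
    """
    if not is_monotone(r):
        raise ValueError("pattern is not monotone drop-out")
    c = [0] * len(r)
    n_observed = sum(r)
    if n_observed > 0:
        c[n_observed - 1] = 1
    return c
```

For example, the pattern [1, 1, 0, 0] is monotone with the last observed visit at time 2, while [1, 0, 1] is intermittent and falls outside this drop-out setting.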


GEE for Complete-Data

\[ \sum_{i=1}^{N} U_i(\beta, \rho; Y_i, x_i) = 0, \]

where

\[ U_i(\beta, \rho; Y_i, x_i) = \frac{\partial \mu_i^T}{\partial \beta} V_i(\rho)^{-1} (Y_i - \mu_i), \]

and $V_i(\rho)$ is the working covariance matrix of $Y_i$.

These equations yield estimates that are consistent if the data are Missing Completely at Random (MCAR), but not necessarily if they are MAR.


Re-weighting

With missing data, we can base our estimates on the complete cases, but re-weight them according to the probability of being observed.

The estimating equations are then

\[ \sum_{i=1}^{N} \frac{\partial \mu_i^T}{\partial \beta} V_i(\rho)^{-1} \Delta_i(\alpha) (Y_i - \mu_i) = 0, \]

where $\Delta_i(\alpha) = \mathrm{diag}(R_{i1}/\pi_{i1}, \ldots, R_{iM_i}/\pi_{iM_i})$ and $\pi_{ij} = \pi_{ij}(\alpha)$ is the probability, according to a specified dropout model, that measurement $j$ on subject $i$ is observed.

Under drop-out missingness,

\[ \pi_{ij}(\alpha) = (1 - \lambda_{i1}(\alpha)) \cdots (1 - \lambda_{ij}(\alpha)), \]

where $\lambda_{ij}(\alpha) = P(R_{ij} = 0 \mid \bar{L}_{i,j-1}, R_{i,j-1} = 1)$ is the probability of dropping out at time $j$ given the data observed through time $j - 1$.

The resulting estimates are consistent if the data are MAR, as long as the probability model for the missingness is correctly specified.
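The product form of the observation probabilities and the resulting inverse-probability weights can be sketched as follows. The drop-out hazards are taken as given numbers here; in practice they would come from a fitted drop-out model, which this sketch does not attempt. Function names are hypothetical.

```python
def observation_probabilities(hazards):
    """pi_ij = (1 - lambda_i1) * ... * (1 - lambda_ij) for j = 1, ..., M_i."""
    pis = []
    still_in_study = 1.0
    for lam in hazards:
        still_in_study *= (1.0 - lam)
        pis.append(still_in_study)
    return pis


def ipw_weights(r, hazards):
    """Diagonal of Delta_i: R_ij / pi_ij for each visit j."""
    pis = observation_probabilities(hazards)
    return [rij / pij for rij, pij in zip(r, pis)]


# A subject with drop-out hazards 0, 0.2, 0.5 who is observed at visits 1 and 2.
example_pis = observation_probabilities([0.0, 0.2, 0.5])
example_weights = ipw_weights([1, 1, 0], [0.0, 0.2, 0.5])
```

Observed visits with a small probability of being observed receive large weights, which is why a correctly specified missingness model matters for this approach.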


Imputation

Alternatively, we can impute, or “guess”, what the missing values are based on some probability model.

Then the estimates are based on both the observed data and the imputed data.

The complete case estimating equations are used, but after imputing missing responses with their expected values:

\[ E(Y_{ij} \mid \bar{L}_{ik}, R_{ik} = 1), \quad \text{for } j > k. \]

The imputations are based on specified regression models.

The resulting estimates are consistent if the data are MAR, as long as the probability model for the imputations is correct.
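A minimal sketch of conditional-mean imputation, assuming only two visits and a simple linear regression of the second response on the first, fitted on subjects observed at both visits. The linear model and the function names are illustrative assumptions, not the regression models used in the talk.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def impute_second_visit(pairs):
    """Replace a missing second response (None) by its fitted conditional mean.

    pairs: list of (y1, y2) with y2 set to None when the subject dropped out.
    """
    completers = [(y1, y2) for y1, y2 in pairs if y2 is not None]
    intercept, slope = fit_line([p[0] for p in completers],
                                [p[1] for p in completers])
    return [(y1, y2 if y2 is not None else intercept + slope * y1)
            for y1, y2 in pairs]


imputed = impute_second_visit([(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, None)])
```

The imputed data set is then passed to the complete-data estimating equations; consistency depends on the imputation model being correct.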


Doubly-robust Estimating Equations

The inverse probability weighting estimates make no use of the available data on subjects with missing measurements.

Let $d(\bar{L}_M, \beta) = U(\beta, \rho; Y, x)$ be the contribution of a fully observed subject to the estimating equations.

For drop-out missing data, the IPW estimating equations can be augmented by a term $F(C, \bar{L}_C, \beta)$ satisfying $E_C\{F(C, \bar{L}_C, \beta) \mid \bar{L}_M\} = 0$.

The resulting augmented estimating equations are

\[ \sum_{i=1}^{N} \left\{ \frac{R_{iM_i}}{\pi_{iM_i}} \, d(\bar{L}_{iM_i}, \beta) + F(C_i, \bar{L}_{iC_i}, \beta) \right\} = 0. \]


Doubly-robust Estimating Equations (2)

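The optimal choice of augmentation term is

\[ F_{opt}(C, \bar{L}_C, \beta) = \sum_{j=1}^{M-1} \left( \frac{C_j - \lambda_{j+1} R_j}{\pi_{j+1}} \right) H_j(\beta), \]

where $H_j(\beta) = E_{\underline{L}_j}\{ d(\bar{L}_M, \beta) \mid \bar{L}_j, R_j = 1 \}$.

We specify models for $H_j(\beta)$, $j = 1, \ldots, M - 1$, which involve parameters $\gamma$.

Let $\hat{\alpha}$ and $\hat{\gamma}$ denote consistent estimators of $\alpha$ and $\gamma$.

Then, in the estimating equations, replace $\lambda_j$, $\pi_j$, and $H_j$ with $\lambda_j(\hat{\alpha})$, $\pi_j(\hat{\alpha})$, and $H_j(\beta, \hat{\gamma})$.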


Properties of DR Estimating Equations

If:

the data are MAR,

the marginal model is correct, $g(\mu_{ij}) = \beta^T x_{ij}$, and

either the dropout model $\pi_j$ or the model for $H_j$ (or both) is correctly specified,

then the solution $\hat{\beta}$ to the estimating equations is consistent for $\beta$.

Furthermore, if both the dropout model and the model for $H_j$ are correct, then this solution $\hat{\beta}$ is optimal in the sense that it has the smallest asymptotic variance among estimates from augmented estimating equations. A consistent estimate of this variance exists in closed form.
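The double-robustness property can be illustrated with a deliberately simplified simulation. It is cross-sectional rather than longitudinal, targets a mean rather than regression coefficients, and plugs in known (not estimated) working models, so it is only a sketch of the idea and not the estimator from the talk. All names and the data-generating values are assumptions made for this example.

```python
import random


def simulate(n, seed=0):
    """Binary covariate X, outcome Y with E[Y] = 2, and MAR missingness in Y."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0
        y = 1.0 + 2.0 * x + rng.gauss(0.0, 1.0)
        p_obs = 0.3 if x == 1 else 0.8          # true P(R = 1 | X)
        r = 1 if rng.random() < p_obs else 0
        data.append((x, y, r))
    return data


def aipw_mean(data, pi, m):
    """Augmented IPW estimate of E[Y]: mean of R*Y/pi(X) + (1 - R/pi(X)) * m(X)."""
    total = 0.0
    for x, y, r in data:
        w = r / pi(x)
        # w is 0 when r == 0, so unobserved outcomes contribute nothing
        total += w * y + (1.0 - w) * m(x)
    return total / len(data)


def true_pi(x):
    return 0.3 if x == 1 else 0.8


def wrong_pi(x):
    return 0.5


def true_m(x):
    return 1.0 + 2.0 * x


def wrong_m(x):
    return 0.0


data = simulate(50000)
est_pi_ok = aipw_mean(data, true_pi, wrong_m)        # missingness model correct
est_m_ok = aipw_mean(data, wrong_pi, true_m)         # outcome model correct
est_both_wrong = aipw_mean(data, wrong_pi, wrong_m)  # neither model correct
```

In this setup the first two estimates should land close to the true mean of 2, while the third converges to 1.7, mirroring the three cases listed above.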


Methods for Handling Missing Covariates

Lipsitz et al. (1999) considered the doubly robust estimate in the cross-sectional study with a missing covariate.

Notations:

$y_i$: response
$x_i$: covariate vector that is always observed
$z_i$: covariate that is subject to missingness
$r_i$: missing indicator for $z_i$

Joint density of $(r_i, y_i, z_i \mid x_i)$:

\[
\begin{aligned}
p(r_i, y_i, z_i \mid x_i) &= p(r_i \mid y_i, z_i, x_i, \omega) \, p(y_i \mid z_i, x_i, \beta) \, p(z_i \mid x_i, \alpha) \\
&= p(r_i \mid y_i, x_i, \omega) \, p(y_i \mid z_i, x_i, \beta) \, p(z_i \mid x_i, \alpha) \quad \text{(MAR)}
\end{aligned}
\]


Score Equation for Complete Data

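The likelihood-based score equation:

\[ \sum_{i=1}^{n} \begin{pmatrix} u_{1i}(\beta) \\ u_{2i}(\alpha) \\ u_{3i}(\omega) \end{pmatrix} = 0, \]

where

\[ u_{1i}(\beta; y_i, x_i, z_i) = \frac{\partial \log p(y_i \mid x_i, z_i, \beta)}{\partial \beta}, \]

\[ u_{2i}(\alpha; z_i, x_i) = \frac{\partial \log p(z_i \mid x_i, \alpha)}{\partial \alpha}, \]

\[ u_{3i}(\omega; y_i, x_i, r_i) = \frac{\partial \log p(r_i \mid x_i, y_i, z_i, \omega)}{\partial \omega}. \]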


Methods for Handling Missing Covariates

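With missing data, the maximum likelihood estimate $\hat{\gamma} = (\hat{\beta}', \hat{\alpha}', \hat{\omega}')'$ solves

\[ u^*(\hat{\gamma}) = \sum_{i=1}^{n} u_i^*(\hat{\gamma}) = \sum_{i=1}^{n} E\left[ \begin{pmatrix} u_{1i}(\hat{\beta}) \\ u_{2i}(\hat{\alpha}) \\ u_{3i}(\hat{\omega}) \end{pmatrix} \,\middle|\, \text{observed data} \right] = 0. \]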


Methods for Handling Missing Covariates

We can further show that

\[ u^*(\gamma) = \sum_{i=1}^{n} \begin{pmatrix} r_i u_{1i}(\beta; y_i, x_i, z_i) + (1 - r_i) E_{z_i \mid y_i, x_i}[u_{1i}(\beta; y_i, x_i, z_i)] \\ r_i u_{2i}(\alpha; z_i, x_i) + (1 - r_i) E_{z_i \mid y_i, x_i}[u_{2i}(\alpha; z_i, x_i)] \\ u_{3i}(\omega; y_i, x_i, r_i) \end{pmatrix}. \]

Solving $u^*(\hat{\gamma}) = 0$ gives the MLE.

The asymptotic properties of $(\hat{\beta}, \hat{\alpha})'$ do not depend on the missing data model.

If $p(y_i \mid x_i, z_i)$ and $p(z_i \mid x_i)$ are correctly specified, solving $u^*(\hat{\gamma}) = 0$ gives a consistent estimate $(\hat{\beta}, \hat{\alpha})'$.

If $p(y_i \mid x_i, z_i)$ and/or $p(z_i \mid x_i)$ are misspecified, then $\hat{\beta}$ will not be consistent.


Methods for Handling Missing Covariates

Weighted GEE

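\[ S(\gamma) = \sum_{i=1}^{n} \begin{pmatrix} \dfrac{r_i}{\pi_i} u_{1i}(\beta; y_i, x_i, z_i) + \left(1 - \dfrac{r_i}{\pi_i}\right) E_{z_i \mid y_i, x_i}[u_{1i}(\beta; y_i, x_i, z_i)] \\ \dfrac{r_i}{\pi_i} u_{2i}(\alpha; z_i, x_i) + \left(1 - \dfrac{r_i}{\pi_i}\right) E_{z_i \mid y_i, x_i}[u_{2i}(\alpha; z_i, x_i)] \\ u_{3i}(\omega; y_i, x_i, r_i) \end{pmatrix}, \]

where $\pi_i = P(r_i = 1 \mid y_i, x_i)$.

Doubly robust estimate: solving $S(\hat{\gamma}) = 0$ gives an asymptotically unbiased estimate of $\beta$ when either $\pi_i$ or $p(z_i \mid x_i)$ is correctly specified.

EM algorithm for the estimate.

Asymptotic variance:

\[ \mathrm{Var}(\hat{\gamma}) = \left\{ \sum_{i=1}^{n} E\left[ \frac{\partial S_i(\gamma)}{\partial \gamma'} \right] \right\}^{-1} \sum_{i=1}^{n} E\left[ S_i(\gamma) S_i'(\gamma) \right] \left\{ \sum_{i=1}^{n} E\left[ \frac{\partial S_i(\gamma)}{\partial \gamma} \right] \right\}^{-1} \]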


Model Formulation

Notations

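Response: $Y_i = (Y_{i1}, Y_{i2}, \ldots, Y_{iJ_i})'$

Covariate: $X_i = (X_{i1}, X_{i2}, \ldots, X_{iJ_i})'$

\[ R_{ij} = \begin{cases} 0 & Y_{ij} \text{ and } X_{ij} \text{ are missing} \\ 1 & Y_{ij} \text{ is missing and } X_{ij} \text{ is observed} \\ 2 & Y_{ij} \text{ is observed and } X_{ij} \text{ is missing} \\ 3 & Y_{ij} \text{ and } X_{ij} \text{ are observed} \end{cases} \]

Covariate: $Z_i$ [always observed]

Response model: $\mu_{ij} = E(Y_{ij} \mid X_i, Z_i)$, $\mathrm{var}(Y_{ij} \mid X_i, Z_i) = \kappa f(\mu_{ij})$,

\[ g(\mu_{ij}) = X_{ij} \beta_x + Z_{ij}' \beta_z \]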
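The four-level missingness indicator can be computed directly from the data. The sketch below assumes each visit is stored as a (y, x) pair with None marking a missing value; this storage convention and the function names are hypothetical.

```python
from collections import Counter


def missingness_code(y_observed, x_observed):
    """R_ij in {0, 1, 2, 3}, following the coding on the slide."""
    if y_observed and x_observed:
        return 3
    if y_observed:
        return 2  # response observed, covariate missing
    if x_observed:
        return 1  # response missing, covariate observed
    return 0      # both missing


def pattern_percentages(visits):
    """Percentage of visits falling in each missingness category."""
    codes = [missingness_code(y is not None, x is not None) for y, x in visits]
    counts = Counter(codes)
    return {k: 100.0 * counts.get(k, 0) / len(codes) for k in range(4)}


example_patterns = pattern_percentages([(1, 0), (None, 1), (0, None), (None, None)])
```

Because the indicator is defined per visit rather than per subject, intermittent patterns (a missing visit followed by an observed one) are represented without any extra notation.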


Model Formulation (Continued)

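Missing data models: $\lambda_{ijk} = P(R_{ij} = k \mid \bar{R}_{ij}, Y_i, X_i, Z_i)$, $k = 0, 1, 2, 3$,

\[ \log\left( \frac{\lambda_{ijk}}{\lambda_{ij0}} \right) = u_{ijk}' \alpha_k, \quad k = 1, 2, 3 \]

$\bar{R}_{ij}$: missing response indicator history

Covariate model: $\omega_{ij} = E(X_{ij} \mid \bar{X}_{ij}, Z_i)$,

\[ h(\omega_{ij}) = v_{ij}' \gamma \]

$\bar{X}_{ij}$: covariate history

$\theta = (\beta', \alpha', \gamma')'$, where $\beta$ is of interest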
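The baseline-category logit model for the missingness indicator determines all four category probabilities once the three linear predictors are known. A minimal sketch, taking the linear predictors as given numbers (the values below are arbitrary) and using a hypothetical function name:

```python
import math


def category_probabilities(etas):
    """Return [lambda_0, lambda_1, lambda_2, lambda_3].

    etas = [eta_1, eta_2, eta_3], where eta_k stands for the linear predictor
    u_ijk' alpha_k = log(lambda_k / lambda_0); category 0 is the baseline.
    """
    exps = [1.0] + [math.exp(eta) for eta in etas]
    total = sum(exps)
    return [e / total for e in exps]


example_lambdas = category_probabilities([0.5, -1.0, 2.0])
```

The probabilities sum to one by construction, and taking the log ratio of any category to the baseline recovers the corresponding linear predictor.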


Model Formulation (Continued)

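MAR assumption:

\[ P(R_{ij} = k \mid \bar{R}_{ij}, Y_i, X_i, Z_i) = P(R_{ij} = k \mid \bar{R}_{ij}, Y_i^{(o)}, X_i^{(o)}, Z_i), \]

where $Y_i = (Y_i^{(o)}, Y_i^{(m)})$ and $X_i = (X_i^{(o)}, X_i^{(m)})$ are partitioned into observed (o) and missing (m) components.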


Model Formulation (Continued)

Weighted GEE (WGEE) for β:

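\[ S_1(\theta) = \sum_{i=1}^{n} \left[ D_i M_i (Y_i - \mu_i) + E_{Y_i^{(m)}, X_i^{(m)} \mid Y_i^{(o)}, X_i^{(o)}, Z_i} \left[ D_i N_i (Y_i - \mu_i) \right] \right] = 0 \]

\[ M_i = \kappa^{-1} F_i^{-1/2} \left[ C_i^{-1} \bullet \Delta_i \right] F_i^{-1/2} \]

\[ N_i = \kappa^{-1} F_i^{-1/2} \left[ C_i^{-1} \bullet (11' - \Delta_i) \right] F_i^{-1/2} \]

$F_i = \mathrm{diag}(\mathrm{var}(Y_{ij} \mid X_{ij}, Z_{ij}),\ j = 1, \ldots, J_i)$

$C_i$: working correlation matrix

$\Delta_i = [\delta_{ijk}]$ with

\[ \delta_{ijk} = \left[ I(R_{ij} = 1, R_{ik} = 3) + I(R_{ij} = 3, R_{ik} = 3) \right] / \pi_{ijk} \quad \text{for } j \neq k \]

and

\[ \delta_{ijj} = I(R_{ij} = 3) / \pi_{ij} \]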


Model Formulation (Continued)

Weighted GEE (WGEE) for γ:

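\[ S_2(\theta) = \sum_{i=1}^{n} \left[ v_i \Delta_i^* (X_i - \omega_i) + E_{X_i^{(m)} \mid X_i^{(o)}, Z_i} \left[ v_i (I - \Delta_i^*) (X_i - \omega_i) \right] \right] = 0 \]

\[ \Delta_i^* = \mathrm{diag}\left( I(R_{ij} = 1 \text{ or } 3) / \pi_{ij}^x,\ j = 1, \ldots, J_i \right) \]

\[ \pi_{ij}^x = P(R_{ij} = 1 \text{ or } 3 \mid Y_i, Z_i, X_i) \]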


Model Formulation (Continued)

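Estimating function for the missing data parameter $\alpha$:

\[ S_3(\alpha) = \sum_{i=1}^{n} \sum_{j=1}^{J_i} \sum_{k=0}^{3} \frac{I(R_{ij} = k)}{\lambda_{ijk}} \frac{\partial \lambda_{ijk}}{\partial \alpha} = 0 \]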


Estimation and Inference

Solve estimating equations

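\[ S(\hat{\theta}) = \begin{pmatrix} S_1(\hat{\theta}) \\ S_2(\hat{\theta}) \\ S_3(\hat{\alpha}) \end{pmatrix} = \sum_{i=1}^{n} S_i(\hat{\theta}) = 0 \]

EM algorithm for the estimation

Variance estimate

\[ \mathrm{Var}(\hat{\theta}) = \left\{ \sum_{i=1}^{n} E\left[ \frac{\partial S_i(\theta)}{\partial \theta} \right] \right\}^{-1} \sum_{i=1}^{n} E\left[ S_i(\theta) S_i'(\theta) \right] \left\{ \sum_{i=1}^{n} E\left[ \frac{\partial S_i(\theta)}{\partial \theta} \right]' \right\}^{-1}. \]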
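The sandwich form of the variance can be illustrated in the simplest scalar case. The sketch below uses the estimating function S_i(theta) = y_i - theta for a mean, and replaces the expectations by empirical sums evaluated at the estimate; that plug-in step is a common choice assumed here, not something stated on the slide. The function name is hypothetical.

```python
def sandwich_variance_mean(ys):
    """Estimate theta = E[Y] and its sandwich variance.

    Estimating function: S_i(theta) = y_i - theta, so dS_i/dtheta = -1.
    Variance = bread^{-1} * meat * bread^{-1} with
    bread = sum_i dS_i/dtheta and meat = sum_i S_i(theta)^2.
    """
    n = len(ys)
    theta_hat = sum(ys) / n                      # solves sum_i S_i(theta) = 0
    bread = -float(n)
    meat = sum((y - theta_hat) ** 2 for y in ys)
    return theta_hat, meat / (bread * bread)


theta_hat, var_hat = sandwich_variance_mean([1.0, 2.0, 3.0, 4.0])
```

For a vector parameter the same structure applies with matrix inverses in place of the scalar divisions.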


Estimation and Inference (Continued)

Doubly robust estimate

If the missing data model is correctly specified, we get an asymptotically unbiased estimate of $\beta$ whether or not the covariate model is correctly specified.

If the covariate model is correctly specified, we get an asymptotically unbiased estimate of $\beta$ whether or not the missing data model is correctly specified.


Simulations

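Response model is $\mathrm{logit}(\mu_{ij}) = \beta_0 + \beta_1 X_{ij} + \beta_2 Z_{ij}$, $j = 1, 2, 3$, with exchangeable correlation $\rho$.

Covariate model

\[ \mathrm{logit}\, \omega_{ij} = \gamma_0 + \gamma_1 X_{i,j-1} + \gamma_2 Z_{ij} \]

Missing data model

\[ \log\left( \frac{\lambda_{ijk}}{\lambda_{ij0}} \right) = \alpha_{0k} + \alpha_{1k1} I(R_{i,j-1} = 1) + \alpha_{1k2} I(R_{i,j-1} = 2) + \alpha_{1k3} I(R_{i,j-1} = 3) + \alpha_{2k} y_{i,j-1}^{(o)} + \alpha_{3k} x_{i,j-1}^{(o)} \]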


Simulations (Continued)

Methods considered

1 EM(x+): EM with correct covariate model

2 WGEE(x+, r+): WGEE with correct covariate and missing data models

3 WGEE(x−, r+): WGEE with incorrect covariate and correct missing data models

4 WGEE(x+, r−): WGEE with correct covariate and incorrect missing data models

5 WGEE(x−, r−): WGEE with incorrect covariate and incorrect missing data models

6 cc: complete case MLE


Simulations (Continued)

Table: Empirical bias, standard deviation and coverage probabilities for six approaches to estimation and inference with incomplete covariate and response data (ρ = 0.6, α2 = γ2 = −2)

                    β0                      β1                      β2
Method      Bias%    SD     CP%     Bias%    SD     CP%     Bias%    SD     CP%
EM(x+)       -0.3   0.102   94.9     -1.1   0.077   94.3      0.5   0.091   94.8
(x+, r+)      0.7   0.104   95.1      0.8   0.080   94.5     -0.9   0.093   94.9
(x+, r−)     -1.0   0.110   95.2     -1.6   0.088   94.9      1.6   0.102   95.0
(x−, r+)      0.4   0.105   94.4      1.0   0.084   94.8     -0.3   0.096   94.5
(x−, r−)    -20.1   0.094   91.4     12.0   0.081   92.9      3.0   0.096   93.9
cc         -302.0   0.876   53.8     49.9   1.077   96.8      0.4   1.218   94.6
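The three summary columns can be computed from simulation replicates along the following lines. The slides do not define the columns, so this sketch assumes Bias% is the relative bias 100 × (mean estimate − truth) / truth, SD is the empirical standard deviation of the estimates, and CP% is the coverage of nominal 95% Wald intervals. The function name and the toy replicates are hypothetical.

```python
import statistics


def summarize(estimates, std_errors, truth, z=1.96):
    """Return (bias_pct, sd, coverage_pct) over simulation replicates."""
    mean_est = statistics.mean(estimates)
    bias_pct = 100.0 * (mean_est - truth) / truth
    sd = statistics.stdev(estimates)
    covered = sum(
        1 for est, se in zip(estimates, std_errors)
        if est - z * se <= truth <= est + z * se
    )
    coverage_pct = 100.0 * covered / len(estimates)
    return bias_pct, sd, coverage_pct


# Four toy replicates of an estimator whose target value is 1.0.
toy_bias, toy_sd, toy_cp = summarize(
    [0.9, 1.1, 1.0, 1.6], [0.1, 0.1, 0.1, 0.1], truth=1.0
)
```

Under these definitions, a coverage well below 95% together with a large Bias%, as in the cc row for β0, signals that the interval is centred in the wrong place.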


Table: Empirical bias, standard deviation and coverage probabilities for six approaches to estimation and inference with incomplete covariate and response data (ρ = 0.3, α2 = γ2 = −2)

                    β0                      β1                      β2
Method      Bias%    SD     CP%     Bias%    SD     CP%     Bias%    SD     CP%
EM(x+)       -1.6   0.058   94.4     -0.2   0.067   95.3      1.1   0.084   94.4
(x+, r+)      0.1   0.060   95.4      0.1   0.072   95.1      0.3   0.086   94.6
(x+, r−)      0.0   0.066   94.3      0.8   0.071   94.9      0.2   0.091   94.7
(x−, r+)      1.2   0.062   94.7      0.6   0.079   94.8     -0.9   0.087   94.5
(x−, r−)    -12.4   0.076   93.4      8.4   0.077   94.1      2.0   0.087   94.2
cc         -219.6   0.784   78.6    -27.0   1.065   97.2      0.0   0.930   94.9


Simulations (Continued)

Summary of the Simulations:

The EM algorithm gives a consistent and the most efficient estimate when the covariate model is correctly specified.

The proposed method yields negligible biases when either the covariate model or the missing data model is correctly specified.

If both the covariate model and the missing data model are misspecified, the proposed method yields biased results.

The complete case analysis gives biased estimates.


Impact of model misspecification

Figure: Asymptotic percent relative bias of β2 with misspecified covariate model and missing data model (horizontal axis: γ1, from −4 to 4; vertical axis: % relative bias for β2; one curve for each of α2 = −2, −1, 0, 1, 2)


Application to the NACC UDS

Table: Frequency (%) of the missingness patterns (X, Y) for the covariate and the response at each time (m = missing, o = observed)

Time    (m, m)    (o, m)    (m, o)    (o, o)
1         6.0      28.8       8.9      56.3
2        10.3      31.7       3.9      54.1
3        12.8      31.1       2.7      53.4
4        14.1      31.3       1.6      52.9


Application to the NACC UDS

Table: Parameter estimates for the NACC UDS: proposed method, n = 16223

Parameter        Est.      SE       p
(Intercept)    -0.136    0.106    0.198
SEX(F)         -0.203    0.025   <0.001
CVCHF          -0.031    0.063    0.618
DEPRESSION      0.679    0.029   <0.001
MMSE           -0.002    0.001   <0.001
FHDEM           0.181    0.028   <0.001
DIABETE        -0.124    0.038    0.001
HYPERT         -0.195    0.026   <0.001
EDUC           -0.002    0.001    0.040
AGE             0.006    0.001   <0.001
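The p column is consistent with two-sided Wald tests of each coefficient against zero, although the slides do not say how the p-values were computed. Under that assumption, a p-value can be approximately recovered from a rounded estimate and standard error as sketched below; the function name is hypothetical.

```python
import math


def wald_p_value(estimate, std_error):
    """Two-sided p-value for H0: coefficient = 0 using a normal reference."""
    z = estimate / std_error
    # P(|N(0, 1)| > |z|) expressed through the complementary error function
    return math.erfc(abs(z) / math.sqrt(2.0))


p_intercept = wald_p_value(-0.136, 0.106)
p_depression = wald_p_value(0.679, 0.029)
```

Small discrepancies from the tabulated values are expected because the estimates and standard errors shown are rounded to three decimals.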


Application to the NACC UDS

Table: Parameter estimates for the NACC UDS: missing response only, n = 15416

Parameter        Est.      SE       p
(Intercept)    -0.272    0.110    0.013
SEX(F)         -0.113    0.026   <0.001
CVCHF           0.123    0.066    0.063
DEPRESSION      0.505    0.030   <0.001
MMSE           -0.007    0.001   <0.001
FHDEM          -0.004    0.029    0.897
DIABETE        -0.176    0.038   <0.001
HYPERT         -0.220    0.027   <0.001
EDUC            0.000    0.001    0.670
AGE             0.013    0.001   <0.001


Application to the NACC UDS

Table: Parameter estimates for the NACC UDS: missing covariate only, n = 10755

Parameter        Est.      SE       p
(Intercept)     0.198    0.142    0.163
SEX(F)         -0.040    0.032    0.215
CVCHF           0.044    0.080    0.579
DEPRESSION      0.451    0.034   <0.001
MMSE           -0.019    0.001   <0.001
FHDEM          -0.048    0.036    0.177
DIABETE        -0.177    0.047   <0.001
HYPERT         -0.212    0.034   <0.001
EDUC           -0.000    0.002    0.904
AGE             0.011    0.002   <0.001


Application to the NACC UDS

Table: Parameter estimates for the NACC UDS: complete case analysis, n = 8724

Parameter        Est.      SE       p
(Intercept)     0.283    0.162    0.081
SEX(F)         -0.022    0.037    0.551
CVCHF          -0.019    0.092    0.834
DEPRESSION      0.416    0.039   <0.001
MMSE           -0.021    0.001   <0.001
FHDEM          -0.067    0.040    0.099
DIABETE        -0.168    0.054    0.002
HYPERT         -0.212    0.039   <0.001
EDUC            0.002    0.002    0.252
AGE             0.013    0.002   <0.001


Summary

The likelihood-based method is robust to misspecification of the missing data process model.

The weighted GEE method is robust to misspecification of the covariate model.

Our proposed method is robust to misspecification of either the missing data process model or the covariate model.

Our proposed method can deal with intermittent missingness patterns for longitudinal data with both missing response and missing covariate.


Questions?

Thank You!!!
