empirical likelihood with arbitrary censored/truncated data by constrained em algorithm

2004 ENAR Spring Meeting, Pittsburgh, PA

Empirical Likelihood with Arbitrary Censored/Truncated Data by Constrained EM Algorithm

Min Chen, Jingyu Luan, Mai Zhou

Department of Statistics, University of Kentucky

817 Patterson Office Tower, Lexington, KY 40506

[email protected]@ms.uky.edu


Outline

• Introduction

• Maximize Empirical Likelihood under parameters of weighted hazard

• Maximize Empirical Likelihood under parameters of weighted hazards for data with covariates

• Examples and Numerical results

• Conclusion

We can compute the empirical likelihood ratio for arbitrary censored/truncated data, with or without covariates, under a weighted hazard parameter.


Introduction (1)

• Empirical Likelihood Ratio Method

– Empirical distribution for n iid observations

$$F_n(x) = \frac{1}{n} \sum_{i=1}^{n} I_{[X_i \le x]}$$

– Empirical likelihood function

$$L(F) = \prod_{i=1}^{n} \{ F(X_i) - F(X_i-) \}$$

– Empirical likelihood ratio

$$R(F) = \frac{L(F)}{L(F_n)}$$
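The three definitions above can be illustrated with a minimal Python sketch. It assumes a discrete candidate distribution F passed as a dictionary from support point to jump size, with positive mass on every observed value; the function names are illustrative only and do not come from the slides.

```python
from collections import Counter
from math import log


def empirical_cdf(sample):
    """Return F_n, the empirical distribution function of the sample."""
    n = len(sample)

    def F_n(x):
        return sum(1 for xi in sample if xi <= x) / n

    return F_n


def log_likelihood(jumps, sample):
    """log L(F) = sum_i log{F(X_i) - F(X_i-)} for a discrete F given as {point: jump}."""
    return sum(log(jumps[xi]) for xi in sample)


def log_el_ratio(jumps, sample):
    """log R(F) = log L(F) - log L(F_n)."""
    n = len(sample)
    empirical_jumps = {x: c / n for x, c in Counter(sample).items()}
    return log_likelihood(jumps, sample) - log_likelihood(empirical_jumps, sample)
```

Since F_n maximizes L(F), the log ratio is zero at F = F_n and negative for any other F supported on the sample.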


Introduction (1)

– Empirical likelihood ratio function by Owen (1988)

$$R(\theta) = \sup_F \left\{ R(F) \;\Big|\; \int g(x)\, dF(x) = \theta,\ F \ll F_n \right\}$$
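For uncensored data, the profile ratio R(θ) can be computed by a one-dimensional Lagrange multiplier search. The sketch below takes g(x) = x (a mean constraint) as an assumed special case; the function name and the bisection solver are illustrative choices, not the authors' implementation.

```python
from math import log


def log_el_ratio_mean(sample, theta, n_iter=200):
    """log R(theta) under the constraint sum_i w_i * x_i = theta, i.e. g(x) = x.

    The maximizing weights are w_i = 1 / (n * (1 + lam * (x_i - theta))),
    where lam solves sum_i d_i / (1 + lam * d_i) = 0 with d_i = x_i - theta.
    """
    d = [x - theta for x in sample]
    if max(d) <= 0 or min(d) >= 0:
        # theta is not strictly inside the range of the data
        return 0.0 if all(v == 0 for v in d) else float("-inf")

    # lam must keep every 1 + lam * d_i positive
    lo, hi = -1.0 / max(d), -1.0 / min(d)

    def score(lam):
        return sum(v / (1.0 + lam * v) for v in d)

    # score is strictly decreasing in lam, so bisection finds its root
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    # log R = sum_i log(n * w_i) = -sum_i log(1 + lam * d_i)
    return -sum(log(1.0 + lam * v) for v in d)
```

At θ equal to the sample mean the multiplier is zero and log R(θ) = 0; the value decreases as θ moves away from the mean.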


Introduction (2)

• Arbitrary Censored/Truncated Data

– According to Turnbull (1976) and Frydman (1994), any observation in arbitrary censored/truncated data can be described by $(A_i, B_i)$, with $A_i \subseteq B_i$, where

$$A_i = \bigcup_{j=1}^{k_i} [L_{ij}, R_{ij}], \qquad B_i = \bigcup_{j=1}^{l_i} (V_{ij}, U_{ij})$$

We associate each observation with $Z_i$, the covariate, if any.


Introduction (2)

For example

• Any exact observation without covariate could be described as

$$A_i = [x_i, x_i], \quad B_i = (-\infty, \infty)$$

• Any right censored observation with one covariate could be described as

$$A_i = [x_i, \infty), \quad B_i = (-\infty, \infty), \quad Z_i = z_i$$
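One possible in-memory representation of the $(A_i, B_i, Z_i)$ description is sketched below. The `Observation` class and the two constructor helpers are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from math import inf
from typing import List, Optional, Tuple

Interval = Tuple[float, float]


@dataclass
class Observation:
    A: List[Interval]          # censoring set: union of closed intervals [L, R]
    B: List[Interval]          # truncation set: union of open intervals (V, U)
    Z: Optional[float] = None  # covariate, if any


def exact(x: float) -> Observation:
    """Exact observation without covariate: A = [x, x], B = (-inf, inf)."""
    return Observation(A=[(x, x)], B=[(-inf, inf)])


def right_censored(x: float, z: Optional[float] = None) -> Observation:
    """Right censored observation: A = [x, inf), B = (-inf, inf), Z = z."""
    return Observation(A=[(x, inf)], B=[(-inf, inf)], Z=z)
```

For instance, `right_censored(5.0, z=1.2)` encodes a subject known only to have survived past 5.0, with covariate value 1.2.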


Introduction (2)

The likelihood for arbitrary censored and/or truncated data is proportional to

$$L(F) = \prod_{i=1}^{N} \frac{\sum_{j=1}^{k_i} \{ F(R_{ij}+) - F(L_{ij}-) \}}{\sum_{j=1}^{l_i} \{ F(U_{ij}-) - F(V_{ij}+) \}}$$
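The ratio form of this likelihood can be checked numerically with a short sketch. It assumes a discrete F given as a dictionary of jumps, censoring sets made of disjoint closed intervals, and truncation sets made of disjoint open intervals; each observation is passed as a pair `(A, B)` of interval lists. The helper names are illustrative.

```python
from math import log


def prob_closed(jumps, lo, hi):
    """Mass that a discrete F (given as {point: jump}) puts on [lo, hi]."""
    return sum(mass for t, mass in jumps.items() if lo <= t <= hi)


def prob_open(jumps, lo, hi):
    """Mass that a discrete F puts on the open interval (lo, hi)."""
    return sum(mass for t, mass in jumps.items() if lo < t < hi)


def log_likelihood(jumps, observations):
    """Sum over observations of log P_F(A_i) - log P_F(B_i)."""
    total = 0.0
    for A, B in observations:
        p_A = sum(prob_closed(jumps, L, R) for L, R in A)
        p_B = sum(prob_open(jumps, V, U) for V, U in B)
        total += log(p_A) - log(p_B)
    return total
```

With jumps 0.2, 0.3, 0.5 at 1, 2, 3, an exact observation at 2 contributes log 0.3, while an exact observation at 3 that is left truncated at 1 contributes log(0.5 / 0.8).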


Introduction (3)

Based on Turnbull (1976) and Alioum and Commenges (1996), for arbitrary censored/truncated data, we could construct a set

$$C = \bigcup_{j=1}^{m} [q_j, p_j]$$

such that

• Any c.d.f. that jumps outside of C cannot be the MLE of the unknown distribution function

• The likelihood is independent of the behavior of the distribution inside each interval $[q_j, p_j]$.
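A minimal sketch of how the intervals $[q_j, p_j]$ can be found is given below for the simplified case of censoring only, with one closed interval per observation. Truncation endpoints, which the full construction also has to account for, are deliberately ignored here, and the function name is hypothetical.

```python
def turnbull_intervals(censoring_intervals):
    """Innermost intervals [q, p]: a left endpoint immediately followed by a right endpoint.

    Simplification: each observation is one closed interval [L, R] and there is
    no truncation.
    """
    LEFT, RIGHT = 0, 1
    endpoints = []
    for lo, hi in censoring_intervals:
        endpoints.append((lo, LEFT))
        endpoints.append((hi, RIGHT))
    # Sorting tuples puts LEFT before RIGHT at ties, which is what closed intervals need.
    endpoints.sort()

    result = []
    for (x, kind), (y, next_kind) in zip(endpoints, endpoints[1:]):
        if kind == LEFT and next_kind == RIGHT:
            result.append((x, y))
    return result
```

For the intervals [0, 2], [1, 3] and [4, 5], the sketch returns [1, 2] and [4, 5]: the overlap of the first two observations, and the third observation on its own.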


Introduction (3)

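Now the problem of maximizing the empirical likelihood reduces to maximizing

$$L^*(s_1, \ldots, s_m) = \prod_{i=1}^{N} \left( \sum_{j=1}^{m} \alpha_{ij} s_j \Big/ \sum_{j=1}^{m} \beta_{ij} s_j \right)$$

where $s_j$ is the mass assigned to $[q_j, p_j]$, and $\alpha_{ij}$, $\beta_{ij}$ indicate whether $[q_j, p_j]$ lies in $A_i$ and $B_i$, respectively (notation as in Turnbull, 1976).

The MLE of $s_1, \ldots, s_m$ can be obtained by the self-consistency/EM algorithm.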
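The unconstrained self-consistency iteration can be sketched as follows. The update is the one given by Turnbull (1976), written with the indicator matrices `alpha` and `beta` as nested lists (rows are observations, columns are the intervals); the fixed iteration count and the uniform starting value are arbitrary choices made for this illustration.

```python
def self_consistency(alpha, beta, n_iter=500):
    """Self-consistency (EM) iteration for the masses s_1, ..., s_m.

    alpha[i][j] = 1 if interval j lies in the censoring set A_i, else 0.
    beta[i][j]  = 1 if interval j lies in the truncation set B_i, else 0.
    """
    m = len(alpha[0])
    s = [1.0 / m] * m
    for _ in range(n_iter):
        expected = [0.0] * m
        for a_i, b_i in zip(alpha, beta):
            p_A = sum(a * sj for a, sj in zip(a_i, s))
            p_B = sum(b * sj for b, sj in zip(b_i, s))
            for j in range(m):
                mu = a_i[j] * s[j] / p_A        # share of observation i falling in interval j
                nu = (1 - b_i[j]) * s[j] / p_B  # expected unseen observations lost to truncation
                expected[j] += mu + nu
        total = sum(expected)
        s = [v / total for v in expected]
    return s
```

Without truncation every `beta[i][j]` is 1, the `nu` terms vanish, and the update reduces to averaging the conditional interval probabilities over observations.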


Maximum of Empirical Likelihood under parameters of weighted hazard

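• The empirical log likelihood for $(A_i, B_i)$, i=1,…,N is proportional to

$$\log L^*(s) = \sum_{i=1}^{N} \left[ \log \sum_{j=1}^{m} \alpha_{ij} s_j - \log \sum_{j=1}^{m} \beta_{ij} s_j \right]$$

• and could be written in terms of the hazard jumps $p_1, \ldots, p_m$ at the jump points $t_1, \ldots, t_m$ as

$$\log L^*(p_1, \ldots, p_m) = \sum_{i=1}^{N} \left[ \log \sum_{j=1}^{m} \alpha_{ij}\, p_j\, e^{-\sum_{k:\, t_k \le t_j} p_k} - \log \sum_{j=1}^{m} \beta_{ij}\, p_j\, e^{-\sum_{k:\, t_k \le t_j} p_k} \right]$$

Here $\alpha_{ij}$ and $\beta_{ij}$ indicate whether the j-th interval lies in $A_i$ and $B_i$, respectively.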
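The change of variables from masses to hazard jumps can be sketched in a few lines. The sketch assumes the jump points are sorted in increasing order and that the cumulative sum in the exponent includes the jump at $t_j$ itself; the latter is an assumption about the formula, and the function names are illustrative.

```python
from math import exp, log


def masses_from_hazards(p):
    """s_j = p_j * exp(-sum of p_k over jump points up to and including t_j)."""
    masses, cumulative = [], 0.0
    for p_j in p:
        cumulative += p_j  # assumption: the sum includes t_j itself
        masses.append(p_j * exp(-cumulative))
    return masses


def log_lik_hazard(p, alpha, beta):
    """Log likelihood in terms of hazard jumps p, with indicator matrices alpha and beta."""
    s = masses_from_hazards(p)
    total = 0.0
    for a_i, b_i in zip(alpha, beta):
        total += log(sum(a * sj for a, sj in zip(a_i, s)))
        total -= log(sum(b * sj for b, sj in zip(b_i, s)))
    return total
```

Working with hazard jumps makes a constraint of the form $\int g(t)\,dH(t) = \theta$ linear in the unknowns, since it becomes a weighted sum of the $p_j$.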


Maximum Empirical Likelihood under parameters of weighted hazard

• Under the hazard constraint $\int g(t)\, dH(t) = \theta$, we may think of using the Lagrange multiplier method to find the maximum log likelihood, but it turns out to be intractable.



• Modified self-consistency/EM algorithm

E-Step: Given current Estimate of H(.), we can compute a weight on each pseudo jump point of H(.).

M-Step: For the pseudo jump points associated with weights in E-Step, we could compute a new hazard jump.
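The slides do not spell out the M-step, so the sketch below shows only one way such a step could look. It assumes the E-step has produced, for each pseudo jump point, a weight `d[j]` and an at-risk total `R[j]` (both hypothetical inputs), that the weighted log likelihood takes the form $\sum_j [d_j \log v_j - R_j v_j]$, and that $g \ge 0$ and $\theta > 0$. Under those assumptions the constrained maximizer is $v_j = d_j / (R_j + \lambda g_j)$, with $\lambda$ found by bisection.

```python
def constrained_hazard_jumps(d, R, g, theta, n_iter=200):
    """Maximize sum_j [d_j * log(v_j) - R_j * v_j] subject to sum_j g_j * v_j = theta.

    Assumes g_j >= 0 with at least one g_j > 0, d_j > 0, R_j > 0 and theta > 0.
    Returns the hazard jumps v and the Lagrange multiplier lam.
    """
    def constraint_gap(lam):
        return sum(gj * dj / (Rj + lam * gj) for dj, Rj, gj in zip(d, R, g)) - theta

    # Below this value some denominator R_j + lam * g_j would stop being positive.
    lo = max(-Rj / gj for Rj, gj in zip(R, g) if gj > 0)
    hi = max(1.0, abs(lo))
    while constraint_gap(hi) > 0:  # the gap tends to -theta < 0 as lam grows
        hi *= 2.0

    # The gap is strictly decreasing in lam on (lo, infinity), so bisection applies.
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if constraint_gap(mid) > 0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return [dj / (Rj + lam * gj) for dj, Rj, gj in zip(d, R, g)], lam
```

When θ equals the unconstrained value $\sum_j g_j d_j / R_j$, the multiplier is zero and the jumps reduce to $d_j / R_j$; jump points with $g_j = 0$ are never affected by the constraint.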



Theorem 1

Under the hazard type constraint $\int g(t)\, dH(t) = \theta$, the NPMLE of hazard jumps obtained by the modified EM algorithm for arbitrary censored and/or truncated data is equivalent to the solution by the Lagrange multiplier method.


Empirical ML under parameters of weighted hazards for data with covariates

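• For arbitrary censored/truncated data with covariates, under the Cox proportional hazards regression model with regression coefficient $\beta$, the log likelihood could be written as

$$\log L^*(p_1, \ldots, p_m, \beta) = \sum_{i=1}^{N} \left( \log \sum_{j=1}^{m} \alpha_{ij}\, p_j\, e^{-e^{\beta z_i} \sum_{k:\, t_k \le t_j} p_k} - \log \sum_{j=1}^{m} \beta_{ij}\, p_j\, e^{-e^{\beta z_i} \sum_{k:\, t_k \le t_j} p_k} \right)$$

where $\alpha_{ij}$ and $\beta_{ij}$ (with subscripts) are the censoring and truncation indicators, and $p_j$ is the baseline hazard jump at $t_j$.

• Under the hazard constraint $\int g(t)\, dH(t) = \theta$ and a hypothesis about $\beta$, obtaining the maximum log likelihood by the Lagrange multiplier method is even more complicated.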
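The covariate version of the log likelihood can be evaluated with the sketch below. It assumes one scalar covariate per observation, sorted jump points, and a cumulative baseline hazard that includes the jump at $t_j$; the multiplicative factor $e^{\beta z_i}$ in front of each hazard jump is omitted because it cancels between the two logarithms for each observation. The regression coefficient is named `coef` to keep it apart from the indicator matrix `beta`.

```python
from math import exp, log


def log_lik_cox(p, coef, z, alpha, beta):
    """Log likelihood with baseline hazard jumps p and Cox regression coefficient coef.

    z[i] is the covariate of observation i; alpha and beta are the censoring
    and truncation indicator matrices (rows: observations, columns: jump points).
    """
    cumulative, running = [], 0.0
    for p_j in p:
        running += p_j  # assumption: the sum includes t_j itself
        cumulative.append(running)

    total = 0.0
    for a_i, b_i, z_i in zip(alpha, beta, z):
        risk = exp(coef * z_i)
        terms = [p_j * exp(-risk * H_j) for p_j, H_j in zip(p, cumulative)]
        total += log(sum(a * t for a, t in zip(a_i, terms)))
        total -= log(sum(b * t for b, t in zip(b_i, terms)))
    return total
```

Setting `coef` to zero recovers the log likelihood without covariates, which gives a quick consistency check.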



Theorem 2

Under the hazard type constraint, the NPMLE of hazard jumps obtained by the modified EM algorithm for arbitrary censored and/or truncated data with covariates is equivalent to the solution by the Lagrange multiplier method.


Examples and Numerical results (1)

Left truncated, right censored data without covariates under the hazard constraint $\int I_{[t \le 960]}\, dH(t) = \theta$

The maximum log likelihood is achieved at $\theta = 0.33$



Left truncated, right censored data without covariates under the hazard constraint $\int I_{[t \le 1020]}\, dH(t) = \theta$

The maximum log likelihood is achieved at $\theta = 0.71$



Right censored data with one covariate under the hazard constraint $\int I_{[t \le 95]}\, dH(t) = 2.05$

The maximum log likelihood is achieved at $\beta = -0.01$



Interval censored data with one covariate and no hazard constraint

The maximum log likelihood is achieved at $\beta = 0.1$


Conclusion

We can compute the empirical likelihood ratio for arbitrary censored/truncated data, with or without covariates, under a weighted hazard parameter.


References

• Alioum A. and Commenges D. (1996) A proportional Hazards Model for Arbitrarily Censored and Truncated Data. Biometrics, 52, 512-524.

• Cox, D.R. (1972) Regression models and life tables (with discussion). J. of the Royal Statistical Society, Series B, 34, 187-220.

• Frydman, H. (1994) A note on nonparametric estimation of the distribution function from interval-censored and truncated observations. J. of the Royal Statistical Society, Series B, 56, 71-74.

• Ihaka, R. and Gentleman, R. (1996) R: A language for data analysis and graphics. J. of Computational and Graphical Statistics, 5, 299-314.

• Luan J.Y., Chen M. and Zhou M. (2003) Empirical Likelihood Ratio with Right Censoring and Left Truncation Data. Technical Report.


References

• Klein and Moeschberger (1997) Survival Analysis: Techniques for Censored and Truncated Data. Springer, New York.

• Owen, A. (2001) Empirical Likelihood. Chapman & Hall, London.

• Pan, X.R. and Zhou, M. (1999). Using one parameter sub-family of distributions in empirical likelihood with censored data. J. Statist. Planning and Infer. 75, 379-392.

• Thomas, D. R. and Grunkemeier, G.L. (1975). Confidence interval estimation of survival probabilities for censored data. J. Amer. Statist. Assoc. 70, 865-871.

• Turnbull, B. (1976) The empirical distribution function with arbitrary grouped, censored and truncated data. J. of the Royal Statistical Society, Series B, 290-295.

• Zhou M. (2003). Empirical likelihood ratio with arbitrary censored/truncated data by EM algorithm. Technical Report.
