controlling false positive rate due to multiple analyses unstratified vs. stratified logrank test

1

Controlling False Positive Rate Due to Multiple Analyses

Unstratified vs. Stratified Logrank Test

Peiling Yang, Gang Chen, George Y.H. Chi

DBI/OB/OPaSS/CDER/FDA

The views expressed in this talk are those of the authors and do not necessarily represent those of the Food and Drug Administration.

2

Motivation: Example of Drug X

Primary endpoint: Survival

Hypothesis: Overall constant H.R. ≤ 1 vs. > 1

Primary Analysis: Unstratified logrank

Results:

Logrank test    Observed statistic    P-value (1-sided)
Unstratified    1.762                 0.039
Stratified      2.228                 0.013

Q: Is this finding statistically significant?

3

Issues to Explore

• Implication of these tests/analyses.

• Eligibility of efficacy claim based on these tests/analyses.

• Practicability of multiple testing/analyses.

4

Outline

• Notations / Settings

• Introduction to logrank test

– Unstratified, stratified

• Comparisons

– Hypotheses, test statistic, test procedure, inference

• Practicability of hypotheses testing

• Multiple testing/analyses

• Example of Drug X

• Summary

5

Settings / Notations

• 2 arms (control: j = 1; experimental: j = 2).

• K strata: k = 1, …, K.

• Patients randomized within strata.

• $t_1 < t_2 < \dots < t_D$: distinct death times.

• $d_{ijk}$: # of deaths & $Y_{ijk}$: # of patients at risk at death time $t_i$, in jth arm & kth stratum.

6


                 # of deaths at time $t_i$               # of patients at risk at time $t_i$
In stratum k:    $d_{i.k} = \sum_{j=1}^{2} d_{ijk}$      $Y_{i.k} = \sum_{j=1}^{2} Y_{ijk}$
In arm j:        $d_{ij.} = \sum_{k=1}^{K} d_{ijk}$      $Y_{ij.} = \sum_{k=1}^{K} Y_{ijk}$
Total:           $d_{i..} = \sum_{j=1}^{2} d_{ij.}$      $Y_{i..} = \sum_{j=1}^{2} Y_{ij.}$
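The dot notation above replaces a summed-over index with a dot. A minimal Python sketch of those sums follows; the dictionary layout keyed by (i, j, k), the toy counts, and the function name `margins` are assumptions made only for illustration.

```python
from collections import defaultdict

# Hypothetical toy counts keyed by (i, j, k):
# death-time index i, arm j (1 = control, 2 = experimental), stratum k.
d = {(1, 1, 1): 1, (1, 2, 1): 0, (1, 1, 2): 0, (1, 2, 2): 1}
Y = {(1, 1, 1): 5, (1, 2, 1): 6, (1, 1, 2): 4, (1, 2, 2): 4}

def margins(counts):
    """Return (per-stratum, per-arm, total) sums of counts[(i, j, k)]."""
    by_stratum = defaultdict(int)  # (i, k) -> sum over arms j
    by_arm = defaultdict(int)      # (i, j) -> sum over strata k
    total = defaultdict(int)       # i      -> sum over arms and strata
    for (i, j, k), value in counts.items():
        by_stratum[(i, k)] += value
        by_arm[(i, j)] += value
        total[i] += value
    return dict(by_stratum), dict(by_arm), dict(total)

d_i_k, d_ij, d_i = margins(d)
Y_i_k, Y_ij, Y_i = margins(Y)
print(d_i_k, d_ij, d_i)
print(Y_i_k, Y_ij, Y_i)
```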

7


• Hazard ratio (ctrl./exper.): constant

– Across strata: $c$

– Within stratum: $c_k$

• Non-informative censoring

8

Introduction: Unstratified Logrank

$H_0^u$: $c \le 1$ vs. $H_1^u$: $c > 1$

Test statistic:

\[
W_u = \frac{d_{.1.} - E_u[d_{.1.}]}{\sqrt{\mathrm{VAR}_u[d_{.1.}]}},
\]

where

\[
E_u[d_{.1.}] = \sum_i Y_{i1.}\,\frac{d_{i..}}{Y_{i..}},
\qquad
\mathrm{VAR}_u[d_{.1.}] = \sum_i \frac{Y_{i1.}}{Y_{i..}}\,\frac{Y_{i2.}}{Y_{i..}}\,\frac{Y_{i..} - d_{i..}}{Y_{i..} - 1}\,d_{i..}.
\]
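The unstratified statistic can be sketched in Python as follows, taking $d_{.1.}$ to be the total number of control-arm deaths, as the dot notation suggests. The record layout `(time, event, arm)`, the function names, and the toy data are assumptions for illustration, not part of the talk; a death time with only one patient at risk contributes zero variance and is skipped to avoid dividing by zero.

```python
from math import sqrt

def unstratified_terms(records):
    """Return (observed, expected, variance) for control-arm deaths.

    records: iterable of (time, event, arm); event is 1 for a death and 0 for
    a censored time; arm is 1 (control) or 2 (experimental).
    """
    records = list(records)
    death_times = sorted({t for t, e, _ in records if e == 1})
    observed = expected = variance = 0.0
    for t in death_times:
        y1 = sum(1 for u, _, arm in records if u >= t and arm == 1)  # Y_i1.
        y2 = sum(1 for u, _, arm in records if u >= t and arm == 2)  # Y_i2.
        d1 = sum(1 for u, e, arm in records if u == t and e == 1 and arm == 1)
        d_all = sum(1 for u, e, _ in records if u == t and e == 1)   # d_i..
        y = y1 + y2                                                  # Y_i..
        observed += d1
        expected += y1 * d_all / y
        if y > 1:
            variance += (y1 / y) * (y2 / y) * ((y - d_all) / (y - 1)) * d_all
    return observed, expected, variance

def unstratified_logrank(records):
    """W_u = (observed - expected) / sqrt(variance)."""
    observed, expected, variance = unstratified_terms(records)
    return (observed - expected) / sqrt(variance)

# Hypothetical toy data: two control patients, two experimental patients.
toy = [(1, 1, 1), (2, 1, 1), (3, 1, 2), (4, 0, 2)]
print(unstratified_terms(toy))
print(unstratified_logrank(toy))
```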

9

Introduction: Unstratified Logrank

• $W_u \sim N(0,1)$ under the least favorable parameter configuration ($c = 1$) in $H_0^u$.

• Reject $H_0^u$ if $W_u > z_\alpha$.

• Type I error rate is controlled at level $\alpha$.

10

Introduction: Stratified Logrank

$H_0^s$: $c_k \le 1$ for all $k$ vs. $H_1^s$: $c_k > 1$ for at least one $k$.

Test statistic:

\[
W_s = \frac{d_{.1.} - E_s[d_{.1.}]}{\sqrt{\mathrm{VAR}_s[d_{.1.}]}},
\]

where

\[
E_s[d_{.1.}] = \sum_k \sum_i Y_{i1k}\,\frac{d_{i.k}}{Y_{i.k}},
\qquad
\mathrm{VAR}_s[d_{.1.}] = \sum_k \sum_i \frac{Y_{i1k}}{Y_{i.k}}\,\frac{Y_{i2k}}{Y_{i.k}}\,\frac{Y_{i.k} - d_{i.k}}{Y_{i.k} - 1}\,d_{i.k}.
\]
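The stratified statistic uses the same observed count $d_{.1.}$ but accumulates the expectation and variance stratum by stratum. A minimal Python sketch, assuming records of the form `(time, event, arm, stratum)`; the layout, function names, and toy data are illustrative assumptions.

```python
from math import sqrt

def stratum_terms(rows):
    """(observed, expected, variance) of control-arm deaths in one stratum.

    rows: list of (time, event, arm) with event 1 = death, arm 1 = control.
    """
    obs = exp = var = 0.0
    for t in sorted({u for u, e, _ in rows if e == 1}):
        risk = [arm for u, _, arm in rows if u >= t]            # at risk at t
        deaths = [arm for u, e, arm in rows if u == t and e == 1]
        y, y1, d = len(risk), risk.count(1), len(deaths)
        obs += deaths.count(1)
        exp += y1 * d / y
        if y > 1:  # a lone patient at risk contributes no variance
            var += (y1 / y) * ((y - y1) / y) * ((y - d) / (y - 1)) * d
    return obs, exp, var

def stratified_logrank(records):
    """W_s: sum the per-stratum terms over k, then standardize."""
    totals = [0.0, 0.0, 0.0]
    for k in sorted({r[3] for r in records}):
        rows = [(t, e, arm) for t, e, arm, s in records if s == k]
        for idx, term in enumerate(stratum_terms(rows)):
            totals[idx] += term
    obs, exp, var = totals
    return (obs - exp) / sqrt(var)

# Hypothetical toy data with two strata.
toy = [
    (1, 1, 1, 1), (2, 1, 1, 1), (3, 1, 2, 1), (4, 0, 2, 1),  # stratum 1
    (1, 1, 2, 2), (2, 1, 1, 2),                              # stratum 2
]
print(stratified_logrank(toy))
```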

11

Introduction: Stratified Logrank

• $W_s \sim N(0,1)$ under the least favorable parameter configuration ($c_k = 1$ for all $k$) in $H_0^s$.

• Reject $H_0^s$ if $W_s > z_\alpha$.

• Type I error rate is controlled at level $\alpha$.

12

Comparison of Hypotheses

• Different hypotheses formulations:

– Unstratified: $H_0^u: c \le 1$ vs. $H_1^u: c > 1$

– Stratified: $H_0^s: c_k \le 1$ for all $k$ vs. $H_1^s: c_k > 1$ for at least one $k$.

13

Comparison of Test Statistics

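• Corr($W_u$, $W_s$) = 1 because of the same r.v. $d_{.1.}$.

• $W_s = a W_u + b$, where

\[
a = \sqrt{\frac{\mathrm{VAR}_u[d_{.1.}]}{\mathrm{VAR}_s[d_{.1.}]}}
\quad \& \quad
b = \frac{E_u[d_{.1.}] - E_s[d_{.1.}]}{\sqrt{\mathrm{VAR}_s[d_{.1.}]}}.
\]

• $W_u \sim N(0, 1) \Rightarrow W_s \sim N(b, a^2)$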
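The linear relation follows from adding and subtracting $E_u[d_{.1.}]$ in the numerator of $W_s$. A short derivation, treating the expectations and variances as fixed constants, which is the reading this slide appears to use:

```latex
\begin{aligned}
W_s &= \frac{d_{.1.} - E_s[d_{.1.}]}{\sqrt{\mathrm{VAR}_s[d_{.1.}]}}
     = \frac{\bigl(d_{.1.} - E_u[d_{.1.}]\bigr) + \bigl(E_u[d_{.1.}] - E_s[d_{.1.}]\bigr)}{\sqrt{\mathrm{VAR}_s[d_{.1.}]}} \\
    &= \sqrt{\frac{\mathrm{VAR}_u[d_{.1.}]}{\mathrm{VAR}_s[d_{.1.}]}}\;
       \frac{d_{.1.} - E_u[d_{.1.}]}{\sqrt{\mathrm{VAR}_u[d_{.1.}]}}
       + \frac{E_u[d_{.1.}] - E_s[d_{.1.}]}{\sqrt{\mathrm{VAR}_s[d_{.1.}]}}
     = a\,W_u + b .
\end{aligned}
```

Because $a > 0$, $W_s$ is an increasing linear function of $W_u$, which is why the correlation equals 1.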

14

Comparison of Test Procedure

To test $H_0^u$: $c \le 1$ vs. $H_1^u$: $c > 1$:

– Use $W_u$ and reject $H_0^u$ if $W_u > z_\alpha$.

– If $W_s$ is used, an adjusted critical value ($a z_\alpha + b$) is required for a valid level-$\alpha$ test.

15

Comparison of Test Procedure

To test $H_0^s$: $c_k \le 1$ for all $k$ vs. $H_1^s$: $c_k > 1$ for at least one $k$:

– Use $W_s$ and reject $H_0^s$ if $W_s > z_\alpha$.

– If $W_u$ is used, an adjusted critical value $(z_\alpha - b)/a$ is required for a valid level-$\alpha$ test.
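Both adjustments come from mapping the critical value through $W_s = aW_u + b$. The Python sketch below tabulates the four critical values and checks that each adjusted test has size $\alpha$ at its least favorable configuration. The function name and the values of `a` and `b` are hypothetical, chosen only for illustration.

```python
from statistics import NormalDist

def critical_values(alpha, a, b):
    """One-sided level-alpha critical values, keyed by (statistic, null).

    a and b are the constants in W_s = a * W_u + b (a > 0).
    """
    z = NormalDist().inv_cdf(1 - alpha)  # z_alpha
    return {
        ("W_u", "H0u"): z,
        ("W_s", "H0u"): a * z + b,      # adjusted: W_s used for H0u
        ("W_s", "H0s"): z,
        ("W_u", "H0s"): (z - b) / a,    # adjusted: W_u used for H0s
    }

# Hypothetical constants, not taken from any trial.
alpha, a, b = 0.025, 1.2, 0.3
cv = critical_values(alpha, a, b)

# At c = 1, W_u ~ N(0, 1), hence W_s ~ N(b, a^2).
size_ws = 1 - NormalDist(mu=b, sigma=a).cdf(cv[("W_s", "H0u")])
# At c_k = 1 for all k, W_s ~ N(0, 1), hence W_u ~ N(-b/a, 1/a^2).
size_wu = 1 - NormalDist(mu=-b / a, sigma=1 / a).cdf(cv[("W_u", "H0s")])
print(cv)
print(size_ws, size_wu)
```

Both computed sizes should agree with `alpha` up to floating-point error, since the adjustment simply re-expresses the same rejection region in terms of the other statistic.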

16

Comparison of Inference

• Rejection of $H_0^u$:

– Infer overall positive treatment effect in entire population.

• Rejection of $H_0^s$:

– Can only infer positive treatment effect in "at least one stratum".

– Further testing is required to identify those strata before making a claim, & the error rate for identifying the wrong strata also needs to be controlled.

17

Practicability of Hypotheses Testing

• Test the unstratified hypotheses when the goal is to infer an overall positive treatment effect in the entire population.

• Test the stratified hypotheses when the goal is to infer a positive treatment effect in certain strata.

• Multiple testing of both the unstratified & stratified hypotheses is acceptable when it is unclear whether the treatment is effective in the entire population or only in certain strata (but both nulls need to be prespecified in the protocol).

18

Multiple Testing/Analyses

• Multiple testing of unstratified (use $W_u$) & stratified (use $W_s$) hypotheses.

• Error to control: strong familywise error (SFE), including the following:

– When $c \le 1$ & all $c_k \le 1$: falsely infer $c > 1$ or some $c_k > 1$.

– When $c \le 1$ & some $c_k > 1$: falsely infer $c > 1$ or the wrong $c_k > 1$.

Note: parameter space of "all $c_k \le 1$ but $c > 1$" is impossible.

19

Multiple Testing/Analyses

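[Diagram: partition of the parameter space into the regions "$c \le 1$ & all $c_k \le 1$", "$c \le 1$ & at least one $c_k > 1$", and "$c > 1$ & at least one $c_k > 1$", plus an "impossible space". Labels: FE; Nested FE ("Which $c_k > 1$?").]

Property of SFE: FE nested in another FE.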

20

Example -- Drug X

• $W_s = aW_u + b$, where $a = 1.039$, $b = 0.409$.

• To test $H_0^u$: $c \le 1$ vs. $H_1^u$: $c > 1$, the critical value using $W_s$ should be adjusted to $a z_\alpha + b$.

• False positive error rate using $W_s$ w/o adjustment = 0.066;

– Inflation = 0.066 − 0.025 = 0.041.

• Ans.: This finding is not statistically significant.

Logrank test          Observed statistic    P-value (1-sided)
Unstratified $W_u$    1.762                 0.039
Stratified $W_s$      2.228                 0.013 (for $H_0^s$)
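The Drug X conclusion can be retraced numerically from the reported $a$, $b$, and observed statistics. In the Python sketch below, the one-sided level 0.025 is inferred from the inflation arithmetic above, and the variable and function names are illustrative. With the rounded constants shown here, the unadjusted rate lands within a few thousandths of the reported 0.066; the small gap is presumably due to rounding of $a$ and $b$.

```python
from statistics import NormalDist

A, B = 1.039, 0.409        # reported constants in W_s = a*W_u + b
W_U, W_S = 1.762, 2.228    # reported observed statistics
ALPHA = 0.025              # assumed one-sided level (from "0.066 - 0.025")

std_normal = NormalDist()

def unadjusted_rate(alpha, a=A, b=B):
    """P(W_s > z_alpha) at c = 1, where W_u ~ N(0, 1) and W_s ~ N(b, a^2)."""
    z = std_normal.inv_cdf(1 - alpha)
    return 1 - NormalDist(mu=b, sigma=a).cdf(z)

z_alpha = std_normal.inv_cdf(1 - ALPHA)
adjusted_cv = A * z_alpha + B          # valid critical value for W_s under H0u
rate = unadjusted_rate(ALPHA)          # false positive rate without adjustment
significant_ws = W_S > adjusted_cv     # decision using the adjusted W_s test
significant_wu = W_U > z_alpha         # decision using the primary W_u test

print(f"adjusted critical value for W_s: {adjusted_cv:.3f}")
print(f"unadjusted false positive rate:  {rate:.3f}")
print(f"significant (W_s, adjusted): {significant_ws}")
print(f"significant (W_u):           {significant_wu}")
```

Evaluating `unadjusted_rate` over a grid of levels gives the kind of curve described by a plot of false positive rate against desired level.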

21

Figure 1: False positive rate vs. desired level (w/o adjustment)

22

Summary

• Hypotheses (unstratified or stratified or both)

– should reflect what is desired to claim.

– need to be prespecified in protocol.

• If the stratified null is rejected, further testing is required to identify in which strata the treatment effect is positive.

• Strong familywise error rate needs to be controlled regardless of single or multiple testing.
