
Estimation and Confidence Intervals after Adjusting the Maximum Information**

John Lawrence* and H. M. James Hung

Division of Biometrics I, OB/CDER, Food and Drug Administration, HFD-710, 15B-45, 5600 Fishers Lane, Rockville, MD 20857

Abstract

In a comparative clinical trial, if the maximum information is adjusted on the basis of unblinded data, the usual test statistic should be avoided due to possible type I error inflation. An adaptive test can be used as an alternative. The usual point estimate of the treatment effect and the usual confidence interval should also be avoided. In this article, we construct a point estimate and a confidence interval that are motivated by an adaptive test statistic. The estimator is consistent for the treatment effect and the confidence interval asymptotically has correct coverage probability.

Key words: Adaptive test; Interim analysis; Consistent; Coverage probability; Sequential monitoring.

1. Introduction

A key design element to the success of a comparative clinical trial is the sample size or, more generally, the total amount of statistical information required for detecting a treatment effect with sufficient power. This amount of information is a function of the expected treatment effect and the variance of the outcome variable of interest, both of which are unknown and difficult to guess in practice. Thus, in the past decade, much research has been devoted to the sample size re-estimation problem. The literature on this topic is abundant; see, for example, Bauer (1989), Bauer and Köhne (1994), Gould and Shih (1992), Proschan and Hunsberger (1995), and Lan and Trost (1997), among others. Cui, Hung and Wang (1999) proposed an adaptive test procedure for testing for a treatment effect lower than expected that results in a sample size adjustment based on the observed sample path at an interim time of the trial. The adaptive test procedure has the type I error probability preserved at the target level and can provide a substantial gain in power with the increase in sample size.

Biometrical Journal 45 (2003) 2, 143–152

© 2003 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim 0323-3847/03/0203-0143 $ 17.50+.50/0

* Corresponding author: [email protected]

** The views expressed are those of the authors and not necessarily those of the U.S. Food and Drug Administration.


The focus of this article is the problem of estimating the treatment effect following such adaptive testing. A difficulty arises because the final sample size is a random variable depending on the interim estimate of the treatment effect. The reader is referred to the articles by Lehmacher and Wassmer (1999) and Brannath, Posch, and Bauer (2002) for further discussion of the issues and related approaches to adaptive trial designs and estimation of the treatment effect following the adaptation.

We will consider the scenario where a clinical trial is designed to compare a new treatment to a control. Initially, the total amount (denoted by $M$) of statistical information is planned to detect an expected treatment difference at a specified significance level with desired statistical power. Time is measured on the information scale and is rescaled so that the study as originally planned will have maximum information equal to one. One interim look is planned at time $t_1$ ($0 < t_1 < 1$). For simplicity, we consider the scenario where the study will never be terminated at this interim look and thus no alpha is spent at that time. The investigator may decide to adjust the maximum information based on the data observed at this interim look. As a result of possible adjustment to the maximum information, the new maximum information time will be denoted by $t^*$ ($t^* > t_1$).

Of interest is the test statistic at the end of the study. We will assume that a Brownian motion process can represent the sequence of test statistics calculated over the duration of the study, at least asymptotically. An introduction to this general framework can be found in Lan and Zucker (1993). The mean of the statistic at time $t = 1$ is the drift parameter $\theta = \kappa\delta$, where $\kappa$ is a constant depending on $M$ and $\delta$ is the treatment effect. The constant $\kappa$ may depend on nuisance parameters. Examples include the logrank test for a clinical trial with a time-to-event endpoint and the t-test for comparing means. In the case of the logrank test, assuming proportional hazards, the treatment effect is the log-hazard ratio and the drift parameter is $\delta\sqrt{d\,p(1-p)}$, where $d$ is the expected number of events at time $t = 1$ and $p$ is the proportion of patients in the new treatment group at the start of the study (Schoenfeld, 1981). When using the t-test, the treatment effect is the difference in population means and the drift parameter is $\delta \big/ \sqrt{\sigma^2/n_1 + \sigma^2/n_2}$, where $n_1$ and $n_2$ denote the number of patients in each group and $\sigma^2$ is the common within-subject variance.

Cui, Hung, and Wang (1999) proposed an adaptive test statistic that is asymptotically standard normal under the hypothesis $\delta = 0$. In this article, this adaptive test statistic is generalized to the testing situation $H_0: \delta = \delta_0$ versus the general alternative. This generalization makes it possible to derive a confidence interval and a point estimate for the treatment effect. The point estimate is consistent for $\delta$ under reasonable conditions and agrees with the usual estimator of $\delta$ when the maximum information is not adjusted. The hypothesis test will asymptotically have the correct significance level. Consequently, the associated confidence interval will asymptotically have the correct coverage probability.
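As a small illustration of the two drift formulas quoted in the Introduction, the following sketch evaluates them for made-up inputs. The function names and all numeric inputs (hazard ratio 0.75, 300 expected events, a mean difference of 2 with standard deviation 10 and 200 patients per group) are hypothetical and are not taken from the article.

```python
import math


def logrank_drift(log_hazard_ratio, expected_events, p_treated):
    """Drift delta * sqrt(d * p * (1 - p)) for the logrank test.

    log_hazard_ratio is the treatment effect delta, expected_events is d
    (expected number of events at t = 1), p_treated is the allocation
    proportion p.
    """
    return log_hazard_ratio * math.sqrt(
        expected_events * p_treated * (1.0 - p_treated)
    )


def ttest_drift(mean_difference, sigma, n1, n2):
    """Drift delta / sqrt(sigma^2/n1 + sigma^2/n2) for comparing two means."""
    return mean_difference / math.sqrt(sigma**2 / n1 + sigma**2 / n2)


# Hypothetical inputs, chosen only to exercise the formulas.
theta_logrank = logrank_drift(math.log(0.75), expected_events=300, p_treated=0.5)
theta_ttest = ttest_drift(mean_difference=2.0, sigma=10.0, n1=200, n2=200)
print(theta_logrank, theta_ttest)
```

In both cases the drift is the treatment effect multiplied by a constant that grows with the planned amount of information, which is the role played by $\kappa$ in the text.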

2. Generalization of the adaptive test statistic

For testing $H_0: \delta = 0$, let $Z^{(1)}$ denote the value of the normalized test statistic at the interim look and $Z^*$ denote the final value of the test statistic (i.e., at the time when maximum information is attained). From Appendix A, the statistic

$$Z^{(2)} = \frac{1}{\sqrt{1 - t_1/t^*}}\, Z^* - \frac{\sqrt{t_1/t^*}}{\sqrt{1 - t_1/t^*}}\, Z^{(1)}$$

is independent of $Z^{(1)}$, and intuitively it captures all the information about the treatment effect in $Z^*$ that is not in $Z^{(1)}$. The class of adaptive test statistics for this null hypothesis is $\{Z_\lambda \mid Z_\lambda = \sqrt{\lambda}\, Z^{(1)} + \sqrt{1 - \lambda}\, Z^{(2)}\}$, where $0 \le \lambda < 1$ is a constant weight that is chosen before any data are observed. It is helpful to think of $Z^{(1)}$ and $Z^{(2)}$ as the first and second stage Z-statistics and the final statistic, $Z_\lambda$, as a pooled Z-statistic. The weights defined by $\lambda = t_1$ are optimal when there is no change in the maximum amount of information. Hence, we will use this value of $\lambda$ and denote the resulting test statistic by $Z$. The null hypothesis of no treatment effect is rejected in favor of the general alternative at level $\alpha$ if $|Z|$ exceeds $z_{\alpha/2}$, the upper $\alpha/2$ percentile of the standard normal distribution. This adaptive test is essentially the basis of the group sequential adaptive test procedure of Cui, Hung, and Wang.

For the general treatment effect $\delta$, the expected value of $Z^{(1)}$ is $\sqrt{t_1}\,\kappa\delta$. For any fixed value of $t^*$, the expected value of $Z^{(2)}$ is $\sqrt{t^* - t_1}\,\kappa\delta$. As shown in Appendix A, the statistic $Z^{(2)} - \sqrt{t^* - t_1}\,\kappa\delta$ asymptotically has a standard normal distribution and is statistically independent of $Z^{(1)} - \sqrt{t_1}\,\kappa\delta$. Hence, the statistic

$$Z = \sqrt{t_1}\,\{Z^{(1)} - \sqrt{t_1}\,\kappa\delta_0\} + \sqrt{1 - t_1}\,\{Z^{(2)} - \sqrt{t^* - t_1}\,\kappa\delta_0\}$$

is asymptotically standard normal under the null hypothesis $H_0: \delta = \delta_0$. The null hypothesis is rejected if $|Z|$ exceeds $z_{\alpha/2}$. This extends the testing procedure of Cui, Hung, and Wang to the more general testing scenario.

By equating the observed values and their means, the natural method of moments estimators $\hat\delta^{(1)} = Z^{(1)} / (\sqrt{t_1}\,\kappa)$ and $\hat\delta^{(2)} = Z^{(2)} / (\sqrt{t^* - t_1}\,\kappa)$ are found. For instance, when using the t-test, the estimator $\hat\delta^{(1)}$ is $Z^{(1)} \sqrt{\sigma^2/(t_1 n_1) + \sigma^2/(t_1 n_2)}$, assuming $\sigma$ is known. This is the difference in sample means using the data on the patients observed at the time of the interim analysis. Similarly, $\hat\delta^{(2)}$ is the difference in sample means using the data on the patients observed after the time of the interim analysis.

There is an alternative way to write the adaptive test statistic as a function of these estimators,

$$Z = t_1\kappa\,\{\hat\delta^{(1)} - \delta_0\} + \sqrt{1 - t_1}\,\sqrt{t^* - t_1}\;\kappa\,\{\hat\delta^{(2)} - \delta_0\}.$$

It is interesting to compare this to the usual (unadapted) test statistic that would be used to test these hypotheses, namely,

$$Z^* = t_1\kappa\,\{\hat\delta^{(1)} - \delta_0\} + (t^* - t_1)\,\kappa\,\{\hat\delta^{(2)} - \delta_0\}.$$

The only difference is that the weight given to the information after the interim look is decreased from $t^* - t_1$ when calculating $Z^*$ to $\sqrt{(1 - t_1)(t^* - t_1)}$ when calculating $Z$. The weight in the adaptive test statistic is the geometric mean of the amount of information initially expected after the interim analysis and the amount of information actually obtained after the interim analysis. In particular, if the maximum information is not changed, then the statistics are identical.
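The construction in this section can be sketched in a few lines of code. The function names below are hypothetical, and the input values are arbitrary; the sketch simply evaluates the second-stage statistic and the adaptive statistic as defined above and checks that, when the maximum information is unchanged ($t^* = 1$) and the null value is zero, the adaptive statistic coincides with the final statistic $Z^*$.

```python
import math


def second_stage_z(z1, z_final, t1, t_star):
    """Z^(2): the part of the final statistic Z* not already contained in Z^(1)."""
    r = t1 / t_star
    return (z_final - math.sqrt(r) * z1) / math.sqrt(1.0 - r)


def adaptive_z(z1, z2, t1, t_star, kappa=1.0, delta0=0.0):
    """Adaptive statistic with weight lambda = t1 for testing H0: delta = delta0.

    kappa is the constant linking the drift to the treatment effect; it only
    matters when delta0 is nonzero.
    """
    centred_1 = z1 - math.sqrt(t1) * kappa * delta0
    centred_2 = z2 - math.sqrt(t_star - t1) * kappa * delta0
    return math.sqrt(t1) * centred_1 + math.sqrt(1.0 - t1) * centred_2


# Arbitrary illustrative values (not from the article).
z1, z_final = 1.2, 2.3
z2 = second_stage_z(z1, z_final, t1=0.5, t_star=1.0)
z_unchanged = adaptive_z(z1, z2, t1=0.5, t_star=1.0)

# Same interim and final statistics, but with the information tripled.
z2_tripled = second_stage_z(z1, z_final, t1=0.5, t_star=3.0)
z_tripled = adaptive_z(z1, z2_tripled, t1=0.5, t_star=3.0)
print(z_unchanged, z_tripled)
```

When $t^* \neq 1$ the two statistics differ, because the second stage keeps the pre-specified weight $\sqrt{1 - t_1}$ regardless of how much information was actually collected after the interim look.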

3. Confidence interval and point estimate

A $100(1 - \alpha)\%$ confidence interval consists of the set of all $\delta_0$ such that the null hypothesis $H_0: \delta = \delta_0$ would not be rejected. Therefore, the upper and lower limits of this confidence interval are found by equating $Z$ to $\pm z_{\alpha/2}$ and solving for $\delta_0$. Hence, the upper and lower limits of this confidence interval are

$$\frac{t_1\hat\delta^{(1)} + \sqrt{1 - t_1}\,\sqrt{t^* - t_1}\;\hat\delta^{(2)} \pm \kappa^{-1} z_{\alpha/2}}{t_1 + \sqrt{1 - t_1}\,\sqrt{t^* - t_1}}. \tag{1}$$

The natural point estimate, $\hat\delta$, is found by equating $Z$ to its expected value, 0, under the null hypothesis and solving for $\delta_0$. This estimator is the midpoint of the confidence interval, that is,

$$\hat\delta = \frac{t_1\hat\delta^{(1)} + \sqrt{1 - t_1}\,\sqrt{t^* - t_1}\;\hat\delta^{(2)}}{t_1 + \sqrt{1 - t_1}\,\sqrt{t^* - t_1}}. \tag{2}$$

Again, it is interesting to observe that this is equal to the traditional estimator

$$\tilde\delta = \frac{t_1\hat\delta^{(1)} + (t^* - t_1)\,\hat\delta^{(2)}}{t^*},$$

when the maximum information is not changed and in general differs only in the weights. Moreover,

$$\hat\delta - \delta = \frac{t_1\{\hat\delta^{(1)} - \delta\} + \sqrt{1 - t_1}\,\sqrt{t^* - t_1}\;\{\hat\delta^{(2)} - \delta\}}{t_1 + \sqrt{1 - t_1}\,\sqrt{t^* - t_1}},$$

where the numerator multiplied by $\kappa$ is a standard normal for sufficiently large $M$. Therefore, as $M \to \infty$, under the conditions that $t_1$ converges to a fixed positive constant less than one and that $t^*$ is bounded in probability and $t^* > t_1$, the estimator $\hat\delta$ is consistent for $\delta$ and asymptotically normal with variance that can be consistently estimated by $1 / \{\kappa^2 (t_1 + \sqrt{1 - t_1}\,\sqrt{t^* - t_1})^2\}$. In fact, the adaptive test $Z$ is the normalized test based on $\hat\delta$, that is,

$$Z = \kappa\,\bigl\{t_1 + \sqrt{1 - t_1}\,\sqrt{t^* - t_1}\bigr\}\,(\hat\delta - \delta).$$

It is worth noting that, in contrast, the estimator $\tilde\delta$ is not normal, though it is also consistent for $\delta$.
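Formulas (1) and (2) translate directly into code. The following is a minimal sketch, assuming $\kappa$ is known or has been estimated separately; the function names and the example inputs are hypothetical. It also evaluates the traditional estimator so that the two can be compared when the maximum information is left unchanged.

```python
import math
from statistics import NormalDist


def adaptive_estimate_and_ci(d1, d2, t1, t_star, kappa, alpha=0.05):
    """Point estimate (2) and confidence limits (1).

    d1, d2 : stage-wise estimates of the treatment effect
    t1     : information time of the interim look
    t_star : final (possibly adjusted) maximum information time
    kappa  : constant linking the drift to the treatment effect
    """
    w2 = math.sqrt(1.0 - t1) * math.sqrt(t_star - t1)
    denom = t1 + w2
    estimate = (t1 * d1 + w2 * d2) / denom
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z_crit / (kappa * denom)
    return estimate, estimate - half_width, estimate + half_width


def traditional_estimate(d1, d2, t1, t_star):
    """Usual information-weighted estimator (the one the article advises against
    after an adjustment)."""
    return (t1 * d1 + (t_star - t1) * d2) / t_star


# Hypothetical inputs: no adjustment (t_star = 1), so both estimators agree.
est, lower, upper = adaptive_estimate_and_ci(0.05, 0.07, t1=0.5, t_star=1.0, kappa=30.0)
usual = traditional_estimate(0.05, 0.07, t1=0.5, t_star=1.0)
print(est, lower, upper, usual)
```

The interval is symmetric about the point estimate, and its half-width shrinks as $t^*$ grows, but more slowly than the half-width of the usual interval would, reflecting the reduced weight on the second stage.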

4. Example

We take the case study described in Cui, Hung and Wang (1999). The clinical trial was conducted to evaluate the effect of a new drug for prevention of myocardial infarction in patients undergoing coronary artery bypass graft surgery. A total of 300 subjects per treatment group was originally planned to detect a 50% reduction of incidence from 22% for placebo to 11% for the drug with 95% power, based on a two-sample test of proportions. Half-way through the trial, the interim analysis based on the data from approximately 150 patients per group revealed that the incidence rate was about 21.5% in the placebo group, in line with expectations, but was about 16.5% (about 25% less than anticipated) for the drug group. The trial was not up-sized to ensure sufficient power and eventually failed to show a statistically significant effect for the drug.

As Cui, Hung and Wang (1999) articulated, if the per-group sample size were increased to about 1400, treating the observed incidence rates as if they were the true rates, the proposed adaptive test would have approximately 93% power to detect the difference. For illustration, assume that the total sample size per arm was increased to 1400 and the observed incidence rates were 23% and 17% for the placebo and the drug, respectively, at the end of the study. Details of the calculations that follow are provided in Appendix B. By applying formula (2), an estimate of the difference in incidence rates is $\hat\delta = 5.8\%$. Based on formula (1), the 95% confidence interval for the difference in incidence rates is (2.6%, 9.1%). Cui, Hung, and Wang's adaptive test statistic for testing the null hypothesis of no treatment effect is $Z = 3.49$. If we interpret the confidence interval as a point estimate plus or minus $z_{\alpha/2}$ standard errors, then the point estimate divided by the "standard error" is equal to this value of $Z$. This further illustrates how the confidence interval is compatible with the test.

5. Simulation Study

A simulation study was done to compare the actual coverage probability of the confidence interval proposed in Section 3 with the usual confidence interval. The simulated data are similar to the example in Section 4. One run consists of the simulation of an experiment where, initially, 300 patients per group are planned for the entire experiment to detect a postulated treatment effect of 11%. The rate in the placebo group is always 22%, but we varied the true rate in the treatment group to see the impact of various differences in rates. After 150 patients are observed in each group, the total sample size is re-estimated using the observed treatment effect. After calculating the observed treatment difference $\hat\delta^{(1)}$ among the first 150 patients in each group, the total sample size is changed to $(11\% / \hat\delta^{(1)})^2 \times 300$. No sample size adjustment is made in the simulation unless the observed treatment effect is less than 90% of the postulated treatment effect. This has the effect of putting a lower bound on the total sample size. Moreover, the total sample size is truncated at an upper limit of 4800 for practicality. The results of the simulation study appear in Table 1.

One million runs were used for each row in Table 1. Consequently, the Monte Carlo standard error is approximately 0.0002. The coverage probability of the adaptive confidence interval is very close to 0.95 in each of the scenarios studied. On the other hand, the coverage probability of the naive confidence interval fluctuates depending on the true effect size and differs by up to 1% from the target coverage probability. The purpose of this simulation study was not to show that the naive confidence interval behaves very badly. One could argue that 94% to 96% coverage probability is adequate. In other scenarios, one does not know how bad the coverage probability of the traditional confidence interval may be. However, what the simulation study does show is that the confidence interval described in Section 3 always maintains the correct coverage probability.

Table 1

Estimated coverage probability of 95% confidence interval

| True Difference in Rates | Adaptive CI | Unadaptive CI |
|---|---|---|
| 0 | 0.950 | 0.941 |
| 0.02 | 0.950 | 0.939 |
| 0.05 | 0.950 | 0.944 |
| 0.11 | 0.950 | 0.958 |

In order to investigate the properties of the point estimate, a similar simulation study was done. The relative bias and relative root-mean-squared error were estimated for three different estimators. The relative bias is defined as the average difference between the estimator and the true parameter divided by the true parameter. The relative root-mean-squared error is defined as the square root of the mean-squared error divided by the true parameter. The first estimator in Table 2 is $\hat\delta$ defined by equation (2) and the naive estimator (unadapted estimator) is the difference in the sample event rates. The middle column is the estimator defined by the recursive combination test (RCT) of Brannath, Posch, and Bauer corresponding to Fisher's combination test. This estimator is defined as the value of $\delta_0$ such that

$$p_1(\delta_0)\, p_2(\delta_0)\, \bigl[1 - \ln\{p_1(\delta_0)\, p_2(\delta_0)\}\bigr] = 0.5.$$

Here, $p_1(\delta_0)$ is the p-value for testing the null hypothesis that the difference is $\delta_0$ using only the data before the interim analysis and $p_2(\delta_0)$ is the corresponding p-value using only the data after the interim analysis. This estimator, like $\hat\delta$, is by definition a median-unbiased estimator. Brannath, Posch, and Bauer (2002) point out that the adaptive test of Cui, Hung, and Wang (1999) is a special case of their recursive combination test using the inverse Gaussian transformation. Consequently, $\hat\delta$ is the estimator defined by these authors' procedure with a different combination function.

Under these scenarios, the first estimator has uniformly smaller bias than the naive estimator. The RCT estimate generally has smaller bias than the naive estimate when the interim look is at $t_1 = 0.5$ or $t_1 = 0.75$, but larger bias when $t_1 = 0.25$. When the true difference is much smaller than the initial guess, i.e., the true difference is only 0.02, then the naive estimator is a more efficient estimator as measured by the mean-squared error. This is because both of the first two estimators put less weight on the larger amount of data observed after the interim analysis than the naive estimator. In the other cases, when the initial guess at the true treatment effect is not too far from the truth, the mean-squared errors of $\hat\delta$ and the naive estimate are roughly equivalent. In these same scenarios, the mean-squared error of the RCT estimate is slightly higher compared to the naive estimate in some cases and slightly lower in other cases.

Table 2

Relative bias and relative root-mean-squared error (RMSE) of $\hat\delta$, the recursive combination test (RCT) estimate described by Brannath et al., and the naive estimate

| Interim look $t_1$ | True Difference in Rates | $\hat\delta$ Bias | $\hat\delta$ RMSE | RCT Estimate Bias | RCT Estimate RMSE | Naive Estimate Bias | Naive Estimate RMSE |
|---|---|---|---|---|---|---|---|
| 0.25 | 0.02 | 0.181 | 0.937 | 0.366 | 1.277 | 0.213 | 0.933 |
| 0.25 | 0.05 | 0.084 | 0.423 | 0.151 | 0.524 | 0.105 | 0.425 |
| 0.25 | 0.07 | 0.062 | 0.323 | 0.104 | 0.382 | 0.080 | 0.325 |
| 0.25 | 0.11 | 0.030 | 0.226 | 0.048 | 0.256 | 0.042 | 0.226 |
| 0.5 | 0.02 | 0.196 | 0.934 | 0.246 | 0.998 | 0.222 | 0.897 |
| 0.5 | 0.05 | 0.107 | 0.431 | 0.114 | 0.437 | 0.136 | 0.434 |
| 0.5 | 0.07 | 0.078 | 0.328 | 0.079 | 0.329 | 0.107 | 0.333 |
| 0.5 | 0.11 | 0.003 | 0.230 | 0.003 | 0.237 | 0.050 | 0.228 |
| 0.75 | 0.02 | 0.183 | 0.981 | 0.188 | 0.873 | 0.206 | 0.859 |
| 0.75 | 0.05 | 0.109 | 0.447 | 0.098 | 0.405 | 0.148 | 0.441 |
| 0.75 | 0.07 | 0.080 | 0.338 | 0.073 | 0.324 | 0.123 | 0.343 |
| 0.75 | 0.11 | 0.029 | 0.233 | 0.035 | 0.251 | 0.055 | 0.225 |
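The coverage comparison can be imitated with a much simplified simulation. The sketch below is not the authors' simulation: it works directly on the asymptotic normal scale rather than generating binomial outcomes, treats $\kappa$ as a known constant (the value $1/0.03$ is an assumption chosen to be of the same order as in the example), uses far fewer runs, and all names are hypothetical. It applies the same re-estimation rule (adjust only when the interim estimate is below 90% of the postulated 11%, with the information capped at $4800/300 = 16$) and counts how often each interval covers the true difference.

```python
import math
import random


def simulate_coverage(delta, n_runs=100_000, t1=0.5, delta_plan=0.11,
                      kappa=1 / 0.03, t_max=16.0, seed=0):
    """Estimate coverage of the adaptive and naive 95% intervals (normal model)."""
    rng = random.Random(seed)
    z_crit = 1.959964
    hits_adaptive = 0
    hits_naive = 0
    for _ in range(n_runs):
        # Stage 1 estimate: variance 1 / (kappa^2 * t1).
        d1 = delta + rng.gauss(0.0, 1.0) / (kappa * math.sqrt(t1))
        # Re-estimation rule on the information scale.
        if d1 >= 0.9 * delta_plan:
            t_star = 1.0
        elif d1 > 0:
            t_star = min((delta_plan / d1) ** 2, t_max)
        else:
            t_star = t_max
        # Stage 2 estimate: variance 1 / (kappa^2 * (t_star - t1)).
        d2 = delta + rng.gauss(0.0, 1.0) / (kappa * math.sqrt(t_star - t1))
        # Adaptive estimate and interval, formulas (2) and (1).
        w2 = math.sqrt((1.0 - t1) * (t_star - t1))
        est = (t1 * d1 + w2 * d2) / (t1 + w2)
        hits_adaptive += int(abs(est - delta) <= z_crit / (kappa * (t1 + w2)))
        # Naive estimate with the usual standard error.
        naive = (t1 * d1 + (t_star - t1) * d2) / t_star
        hits_naive += int(abs(naive - delta) <= z_crit / (kappa * math.sqrt(t_star)))
    return hits_adaptive / n_runs, hits_naive / n_runs


cov_adaptive, cov_naive = simulate_coverage(delta=0.05)
print(cov_adaptive, cov_naive)
```

Because of these simplifications the output should not be expected to reproduce Table 1 digit for digit; it is meant only to show the mechanics of the comparison.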

6. Discussion

When the maximum information is changed, the test based on the adaptive test statistic is guaranteed to have the correct significance level, while the usual test does not have this property. Following the adaptive testing, the traditional point estimate and the confidence interval should be avoided. One of the problems with the naive estimator in the scenario discussed in this article is that it is no longer normally distributed. Therefore, hypothesis tests or confidence intervals that use the normal distribution are no longer valid. We constructed a consistent estimator for the treatment effect and a test-based confidence interval that has correct coverage probability regardless of the true value of the treatment effect. The point estimate and confidence interval are compatible with the adaptive test. A simulation study showed that the point estimate has smaller bias than the naive estimator, but the mean-squared errors of both estimators are comparable.

There are two important features of the methods presented in this article. First, no knowledge is required regarding the reasons behind the adjustment in the maximum information or the specification of a function relating the amount of adjustment to the data observed at the interim analysis. Second, if an adjustment in the maximum information was considered, but not done, then the methods described here are identical to the methods without adjustment. Although this article only discussed the two-sided testing scenario, with obvious changes one can easily obtain an adaptive test statistic for one-sided (non-inferiority) testing and an associated one-sided confidence bound.

The focus of this article was on scenarios where no hypothesis testing was done at the interim look. The point estimate and confidence interval defined here can be extended to the two-stage testing scenario. One possible way to accomplish this is described in Brannath, Posch, and Bauer (2002) using the inverse normal transformation.

Appendix A

Consider a sequence of the normalized test statistics $Z(t)$, indexed by $t > 0$, which can be expressed asymptotically as a Brownian motion process $B(t)$. That is, $B(t) = Z(t)\sqrt{t}$. For any $t > s > 0$, the random vector $(B(t), B(s))$ is jointly bivariate normal under the null hypothesis with mean $(0, 0)$, variances $(t, s)$ and covariance $s$. At some fixed $s > 0$, the realization of $B(s)$ is obtained. Suppose that $t^*$ is a positive function of $B(s)$ and increases in $t$. Then, conditional on $B(s)$, the difference process $B(t^*) - B(s)$ indexed by $t^* - s$ is also a Brownian motion process. It follows that conditional on $Z(s)$, the random variable

$$Z(t^* - s) \equiv \frac{B(t^*) - B(s)}{\sqrt{t^* - s}} = \frac{Z(t^*) - \sqrt{s/t^*}\; Z(s)}{\sqrt{1 - s/t^*}}$$

is a standard normal. Hence, under the null hypothesis, $Z(t^* - s)$ is statistically independent of the standard normal $Z(s)$ and unconditionally a standard normal.
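The algebra linking the increment of the Brownian motion to the normalized statistics can be spelled out in one step, using only the relation $B(t) = Z(t)\sqrt{t}$ and dividing numerator and denominator by $\sqrt{t^*}$:

```latex
\begin{align*}
\frac{B(t^*) - B(s)}{\sqrt{t^* - s}}
  &= \frac{\sqrt{t^*}\, Z(t^*) - \sqrt{s}\, Z(s)}{\sqrt{t^* - s}} \\
  &= \frac{Z(t^*) - \sqrt{s/t^*}\; Z(s)}{\sqrt{1 - s/t^*}} .
\end{align*}
```

Setting $s = t_1$ and writing $Z^* = Z(t^*)$, $Z^{(1)} = Z(t_1)$ gives the second-stage statistic $Z^{(2)}$ used in Section 2.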

Appendix B

$t_1 = \frac{150}{300} = 0.5$; $t^* = \frac{1400}{300}$; $\hat\delta^{(1)} = 21.5\% - 16.5\% = 5\%$; and

$$\hat\delta^{(2)} = \frac{\{23\% \times 1400 - 21.5\% \times 150\} - \{17\% \times 1400 - 16.5\% \times 150\}}{1250} \approx 6.12\%.$$

From equation (2), with $\sqrt{1 - t_1}\,\sqrt{t^* - t_1} = \sqrt{1250/600}$, we have

$$\hat\delta = \frac{0.5 \times 5\% + \sqrt{1250/600} \times 6.12\%}{0.5 + \sqrt{1250/600}} \approx 5.83\%.$$

The inverse of the constant $\kappa$ is by definition equal to the standard deviation of the difference in sample proportions with 300 patients per group. Estimates of individual group rates are needed to estimate $\kappa^{-1}$. Equation (2) can be used with individual group rates in place of differences to obtain estimates that are in agreement with $\hat\delta$. For example, using the data before the interim analysis, an estimate of the rate in the placebo group is 21.5%, and an estimate using the data after the interim analysis is

$$\frac{23\% \times 1400 - 21.5\% \times 150}{1250} \approx 23.2\%.$$

These are combined to obtain the estimate

$$\frac{0.5 \times 21.5\% + \sqrt{1250/600} \times 23.2\%}{0.5 + \sqrt{1250/600}} \approx 22.8\%.$$

The consistent estimator of $\kappa^{-1}$ is

$$\sqrt{\frac{22.8\% \times 77.2\%}{300} + \frac{16.9\% \times 83.1\%}{300}} \approx 3.24\%.$$

The lower and upper limits of the 95% confidence interval are then found from equation (1). Also, $Z^{(1)} = \sqrt{t_1}\,\kappa\,\hat\delta^{(1)} \approx 1.09$, $Z^* = \sqrt{t^*}\,\kappa \times 6\% \approx 4.00$, and $Z^{(2)} \approx 3.85$. So, the adaptive test statistic for testing the null hypothesis of no difference in rates is $Z = \sqrt{t_1}\, Z^{(1)} + \sqrt{1 - t_1}\, Z^{(2)} \approx 3.49$. Finally, the estimate of the standard error based on the confidence interval is

$$\frac{\kappa^{-1}}{t_1 + \sqrt{1 - t_1}\,\sqrt{t^* - t_1}} \approx 1.67\%$$

and $\hat\delta / 1.67\% \approx 3.49$.
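The arithmetic of this appendix can be retraced in code. The sketch below uses only the rates and sample sizes quoted in Section 4; the variable and function names are hypothetical. Intermediate quantities are carried at full precision rather than rounded, so the last digit of some results may differ slightly from the rounded values printed above.

```python
import math

n_plan, n_interim, n_final = 300, 150, 1400   # patients per group
t1 = n_interim / n_plan                        # 0.5
t_star = n_final / n_plan                      # 1400 / 300
w2 = math.sqrt((1.0 - t1) * (t_star - t1))     # equals sqrt(1250 / 600)


def stage2(rate_final, rate_interim):
    """Rate among the patients observed after the interim analysis."""
    return (rate_final * n_final - rate_interim * n_interim) / (n_final - n_interim)


def combine(stage1_value, stage2_value):
    """Weights of equation (2), applied to a difference or to a single rate."""
    return (t1 * stage1_value + w2 * stage2_value) / (t1 + w2)


# Stage-wise differences in incidence (placebo minus drug).
d1 = 0.215 - 0.165
d2 = stage2(0.23, 0.215) - stage2(0.17, 0.165)
d_hat = combine(d1, d2)

# Group rates combined with the same weights, used to estimate 1 / kappa.
p_placebo = combine(0.215, stage2(0.23, 0.215))
p_drug = combine(0.165, stage2(0.17, 0.165))
kappa_inv = math.sqrt(p_placebo * (1 - p_placebo) / n_plan
                      + p_drug * (1 - p_drug) / n_plan)

std_err = kappa_inv / (t1 + w2)
lower, upper = d_hat - 1.96 * std_err, d_hat + 1.96 * std_err
z_adaptive = d_hat / std_err
print(d_hat, lower, upper, z_adaptive)
```

The results should agree with the point estimate, confidence interval and test statistic reported in Section 4 to within rounding.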

References

Bauer, P., 1989: Multistage testing with adaptive designs (with discussion). Biometrie und Informatik in Medizin und Biologie 20, 130–148.

Bauer, P. and Köhne, K., 1994: Evaluation of experiments with adaptive interim analyses. Biometrics 50, 1029–1041.

Brannath, W., Posch, M., and Bauer, P., 2002: Recursive combination tests. Journal of the American Statistical Association 97, 236–244.

Cui, L., Hung, H. M. J. and Wang, S. J., 1999: Modifications of sample size in group sequential trials. Biometrics 55, 853–857.

Gould, A. L. and Shih, W. J., 1992: Sample size re-estimation without unblinding for normally distributed outcomes with unknown variance. Communications in Statistics – Theory and Methods 21 (10), 2833–2853.

Jennison, C. and Turnbull, B. W., 1983: Repeated confidence intervals for group sequential trials. Controlled Clinical Trials 5, 33–45.

Lan, K. K. G. and Trost, D. C., 1997: Estimation of parameters and sample size reestimation. Proceedings of Biopharmaceutical Section, American Statistical Association, 48–51.

Lan, K. K. G. and Zucker, M., 1993: Sequential monitoring of clinical trials: the role of information and Brownian motion. Statistics in Medicine 12, 753–765.

Lehmacher, W. and Wassmer, G., 1999: Adaptive sample size calculations in group sequential trials. Biometrics 55, 1286–1290.

Proschan, M. A. and Hunsberger, S. A., 1995: Designed extension of studies based on conditional power. Biometrics 51, 1315–1324.

Schoenfeld, D., 1981: The asymptotic properties of nonparametric tests for comparing survival distributions. Biometrika 68, 316–319.

Received, November 2001
Revised, May 2002
Revised, August 2002
Accepted, September 2002