
PubH 8482: Sequential Analysis
Course Introduction

Joseph S. Koopmeiners

Division of Biostatistics
University of Minnesota

Week 1

Who am I?

• Joe Koopmeiners
• Assistant Professor in Division of Biostatistics
• PhD - University of Washington, 2009
• Research Interests:
  • Sequential and adaptive methods for translational cancer research
  • Statistical evaluation of diagnostic tests
• Collaborative Projects:
  • Tobacco cessation
  • MRI as a diagnostic tool for prostate cancer
  • Statistical support for early phase clinical trials

Who are you?

• Name
• Year in program (program, if not in biostatistics)
• Advisor, if known

Pre-requisites

• Stat 8101-8102
• Students must be comfortable with the multivariate normal distribution

Text Books

There are no required texts for this course, but lecture notes will draw heavily from:

• Jennison, C. and Turnbull, B. (1999) Group Sequential Methods with Applications to Clinical Trials, Boca Raton: CRC Press. ISBN 0849303168

• Berry, S.M., Carlin, B.P., Lee, J.J. and Müller, P. (2010) Bayesian Adaptive Methods for Clinical Trials, Boca Raton: CRC Press. ISBN 1439825483

Text Books

Other useful textbooks on reserve in the biostat reading room include:

• Whitehead, J. (1997) The Design and Analysis of Sequential Clinical Trials, 2nd Ed., New York: John Wiley & Sons. ISBN 0471975508

• Proschan, M. A., Lan, K.K.G. and Wittes, J.T. (2006) Statistical Monitoring of Clinical Trials: A Unified Approach, New York: Springer. ISBN 0387300597

• Yin, G. (2012) Clinical Trial Design: Bayesian and Frequentist Adaptive Methods, Hoboken: John Wiley & Sons. ISBN 0470581719

Evaluation and Grades

• Homework - 40%
• Mid-term exam - 30%
• Final project - 30%

Homework

• About 5 homework assignments
• You will have at least 2 weeks to work on each assignment
• Homework will be due on Tuesdays

Mid-term Exam

• Take-home mid-term about halfway through the course
• Will cover what I consider “classic” group sequential methodology
• You will have one week to complete the exam

Class Project

• Pick a specific topic related to sequential and adaptive clinical trials
• Written report and 20 - 30 minute presentation
• Discuss statistical challenges specific to your topic
• Literature review of current research
• Identify open research areas

Office Hours

• Monday 3:00 - 4:00 p.m.
• Thursday 1:00 - 2:00 p.m.

Course Website

• Course Website

http://www.biostat.umn.edu/ josephk/courses/pubh8482 fall2012/

• Linked from my faculty webpage

Course Title

• Current Title:
  • Sequential Analysis
• More Accurate Title:
  • Sequential and Adaptive Methods for Clinical Trials

What is a clinical trial?

• Clinical Trial: A controlled experiment to test the safety or efficacy of a treatment or intervention
• Usually randomized, although this is not always the case, especially in phases 1 and 2

Fixed-sample Design

• Most studies utilize fixed-sample designs
• Fixed-sample design: collect a pre-specified number of subjects and test our hypothesis
  • Randomize 200 patients to either a novel treatment or placebo and compare survival using a log-rank test

Sequential Design

• An alternate approach is a sequential design
• Sequential design: sequentially monitor the primary endpoint and continue to enroll subjects based on the interim results
  • Randomize an initial cohort of patients to a novel treatment or placebo and compare survival using a log-rank test
  • Determine if more patients should be enrolled based on a pre-specified stopping rule

Continuous Monitoring

• Early sequential methods focused on continuous monitoring
  • Evaluating the endpoint after each new experimental unit
  • Quality control for bombs during WWII
• These methods are not practical in the setting of clinical trials

Group Sequential Methods

• Sequential clinical trials generally rely on group sequential methodology
• Group sequential methods: interim analyses are completed at pre-specified intervals throughout the study
  • Randomize the first 50 patients to either a novel treatment or placebo and compare survival using a log-rank test
  • Determine if more patients should be enrolled based on a pre-specified stopping rule
  • Re-evaluate the endpoint after every 50 patients until a total of 200 have been enrolled

Adaptive Design

• An adaptive design is a design that uses accumulating data from the ongoing trial to modify certain aspects of the study
  • Sample size
  • Treatment dose
  • Randomization ratio
  • Study arms

Sequential vs. Adaptive Designs

• There is no clear distinction between what constitutes a sequential design and what constitutes an adaptive design
• Both rely on interim analyses to modify the design
• Sequential designs generally only modify the sample size (by stopping early), while the term adaptive design is used to describe designs with broader modifications
• Both face similar statistical challenges

Sequential and Adaptive Designs: Challenges

• Clinical trials are designed to achieve desired operating characteristics
  • Type-I error
  • Power
• Sequential and adaptive methods alter the operating characteristics of the study
• Challenge: Incorporate sequential and adaptive methods while maintaining the desired operating characteristics
• Goal: Show that sequential and adaptive methods have the same type-I error rate and power as a fixed-sample design but a smaller sample size or other desirable property

Phases of Drug Development

• Phase 1: Safety trials
• Phase 2: Efficacy trials
• Phase 3: Confirmatory trials
• Phase 4: Post-marketing surveillance

Phase 1 Clinical Trials

• First-in-human trials
• Primary objective is to evaluate safety
• Efficacy is a secondary concern
• Small trials: 10 - 50 subjects

Phase 1 Trials Example

Phase 1 clinical trial in oncology

• Estimate the probability of dose limiting toxicity (DLT)
• We assume that as dose increases, the probability of DLT and the probability of efficacy will also increase
• Maximum tolerated dose (MTD): Highest dose with probability of DLT less than some pre-specified cut-off (usually 0.2 or 0.33)

Dose Escalation

• We would like to do a randomized study
• Too dangerous for first-in-human studies
• Instead, we complete non-randomized dose escalation studies
• Patients receive progressively higher doses until the MTD has been identified
• Adaptive designs are used to guide dose escalation

Patients vs. Healthy Volunteers

• Healthy volunteers are used in other settings where toxicities are less severe
• Phase 1 oncology trials include patients for whom standard treatments have failed
• Added goal of treating patients with an efficacious dose
• Adaptive designs are used to treat patients at dose levels that are more likely to be efficacious


Phase 2 Clinical Trials

• Goal of Phase 2: evaluate the efficacy of a novel therapeutic agent
• Surrogate endpoints are often used in place of hard endpoints
  • Phase 2 oncology trial: tumor response instead of overall survival
• Continue to evaluate the safety of the new drug

Stopping for futility

• The majority of novel therapeutic agents will not be adequately efficacious
• Clinical trials for ineffective treatments are expensive and fail to provide adequate care for study subjects
• Early termination for futility allows ineffective treatments to be abandoned if initial estimates of treatment efficacy are not promising

Dropping Study Arms

• The optimal dose/treatment schedule is unlikely to be known after Phase 1
• We might run a multi-arm study to investigate multiple dose levels/treatment schedules and adaptively drop arms to save time/money/etc.

Safety Monitoring

• The safety profile of a new drug continues to be evaluated in Phase 2
• Dual goals of answering the scientific question and protecting study subjects
• Sequential stopping rules are used to monitor the rate of adverse events

Personalized Medicine

• Personalized medicine refers to customized treatment decisions based on patient characteristics such as genetic or other information
• A phase 2 trial could be designed to investigate the effectiveness of a new treatment in different subpopulations
• We might design a study to adaptively assign subjects to one of several treatments or drop subgroups for which the drug is not effective


Phase 3 Clinical Trials

• Final, confirmatory trial for a new therapeutic agent
• Much larger than phase 2
• Hard endpoints
  • Overall survival instead of clinical response

Sequential Monitoring in Phase 3

• Most common setting for sequential monitoring
• Stopping rules set in advance to allow early termination for efficacy or futility
• Get new treatments onto the market faster
• Save time and money when treatments are not promising

Adaptive Randomization

• Typically, subjects are randomized to treatment or control using a fixed randomization ratio
• Alternately, we could change the randomization ratio over time so that more subjects are assigned to the “better” treatment
• Better outcomes for study subjects

Sample Size Re-estimation

• Sample size is often calculated based on nuisance parameters for which there is little information
• Incorrectly specified nuisance parameters can lead to under-powered studies
• Re-estimate sample size using updated estimates of nuisance parameters at interim analyses

Sample Size Re-estimation

• What if the true effect size is smaller than we anticipated?
• We can re-estimate the sample size using an updated estimate of the effect size at interim analyses
• This is controversial

In Summary

Motivations for using sequential/adaptive designs can be grouped into the following:

• Ethical
• Economic
• Administrative

Ethical

• Minimize the number of subjects treated with ineffective treatments
• Make new treatments available to the public more quickly
• Protect study subjects

Economic

• Save money on the trial by terminating early
• Early termination allows the company to profit from a new drug sooner

Administrative

• Evaluate composition of study population
• Determine if study procedures are being followed correctly
• Check model assumptions

Course Objectives

• Students will be familiar with standard group sequential methodology
• Students will be exposed to adaptive methods in clinical trials
• Students will understand the challenges of applying sequential and adaptive methods to clinical trials
• Students will understand the advantages and disadvantages of sequential and adaptive designs

Course Outline

• Week 1: Course Introduction
• Weeks 2 and 3: Sequential testing of Normal Random Variables
• Week 4: Brownian Motion and Asymptotically Normal test statistics
• Week 5: Estimation after a sequential trial
• Week 6: Confidence intervals and p-values
• Week 7: Bayesian Sequential methods
• Weeks 8 and 9: Adaptive methods for Phase 1 clinical trials
• Weeks 10 and 11: Adaptive methods for Phase 2 clinical trials
• Weeks 12 and 13: Adaptive methods for Phase 3 clinical trials
• Week 14: Student Presentations

Frequentist vs. Bayesian

It is worth pointing out...

• I don’t consider myself a Frequentist or a Bayesian
• I am comfortable with both paradigms and do what I think is best for the specific problem
• Both approaches will be discussed in this course
• In general,
  • The first half of the class will focus on sequential designs and be more Frequentist
  • The second half of the class will focus on adaptive designs and be more Bayesian

Comparing Normal Means with Known Variance

• Let $X_1, X_2, \ldots, X_n$ be i.i.d. $N(\mu_x, \sigma^2)$
• Let $Y_1, Y_2, \ldots, Y_n$ be i.i.d. $N(\mu_y, \sigma^2)$
• $\sigma^2$ known

Null hypothesis

Consider a two-sided test of

$H_0: \mu_x = \mu_y$ vs. $H_a: \mu_x \neq \mu_y$

Fixed-sample test

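• Collect n subjects in each group
• Test the null hypothesis using the following test statistic, where $\bar{X}_n$ and $\bar{Y}_n$ are the sample means

$$Z_n = \frac{\bar{X}_n - \bar{Y}_n}{\sqrt{2\sigma^2 / n}}$$

• Under the null, $Z_n \sim N(0, 1)$
• Reject if $|Z_n| > Z_{1-\alpha/2}$
• Results in a type-1 error rate of $\alpha$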
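The fixed-sample test above can be sketched in a few lines of Python. This is only an illustration under the slide's assumptions (equal group sizes, known common σ); the function name `fixed_sample_z_test` and the toy data are hypothetical and not part of the course material.

```python
from math import sqrt
from statistics import NormalDist, fmean


def fixed_sample_z_test(x, y, sigma, alpha=0.05):
    """Two-sided z-test of H0: mu_x = mu_y with a known common sd sigma.

    Returns the statistic Z_n and whether |Z_n| exceeds the upper
    alpha/2 quantile of the standard normal distribution.
    """
    if len(x) != len(y):
        raise ValueError("this sketch assumes n subjects in each group")
    n = len(x)
    # Z_n = (mean(x) - mean(y)) / sqrt(2 * sigma^2 / n)
    z_n = (fmean(x) - fmean(y)) / sqrt(2 * sigma**2 / n)
    critical_value = NormalDist().inv_cdf(1 - alpha / 2)
    return z_n, abs(z_n) > critical_value


# Toy data, invented purely for illustration.
z_n, reject = fixed_sample_z_test([1.2, 0.8, 1.5, 1.1], [0.9, 1.0, 0.7, 1.2], sigma=1.0)
print(f"Z_n = {z_n:.3f}, reject H0: {reject}")
```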

Power

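Power ($1 - \beta$) for the fixed-sample test can be calculated from the following formula:

$$1 - \beta = 1 - \Phi\left(Z_{1-\alpha/2} - \frac{\sqrt{n}\,\delta}{\sqrt{2\sigma^2}}\right)$$

where $\delta$ is the difference in means under the alternative hypothesis. With $\delta = 0.5\sigma$ and $\alpha = 0.05$,

• n = 50 results in $1 - \beta = 0.70$
• n = 100 results in $1 - \beta = 0.94$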
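As a quick numerical check of the power formula, here is a minimal sketch using only the Python standard library. The function name `fixed_sample_power` is made up for this example, and δ is passed in units of σ so that σ cancels.

```python
from math import sqrt
from statistics import NormalDist


def fixed_sample_power(n, delta_over_sigma, alpha=0.05):
    """Power of the fixed-sample z-test with n subjects per group.

    Implements 1 - Phi(Z_{1-alpha/2} - sqrt(n) * delta / sqrt(2 * sigma^2)),
    with delta expressed as a multiple of sigma.
    """
    std_normal = NormalDist()
    critical_value = std_normal.inv_cdf(1 - alpha / 2)
    drift = sqrt(n) * delta_over_sigma / sqrt(2)
    return 1 - std_normal.cdf(critical_value - drift)


for n in (50, 100):
    print(f"n = {n}: power = {fixed_sample_power(n, 0.5):.4f}")
```

The two values come out near 0.705 and 0.942, which the slide reports as 0.70 and 0.94. Like the formula on the slide, the sketch ignores the negligible chance of rejecting in the wrong tail.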

Operating Characteristics

The fixed sample design has:

• Total sample size of 2n
• Type-1 error equal to $\alpha$
• Power for $\delta = 0.5\sigma$ and $\alpha = 0.05$:
  • 0.70 for n = 50
  • 0.94 for n = 100

Group Sequential Design

• We could also consider a group sequential design
• Ideally, we would find a sequential design with
  • The same type-1 error rate and power as the fixed-sample design
  • Smaller expected sample size

One Interim Analysis

• Consider the simplest case of one interim analysis:
  • Collect an initial sample of $n_1$ subjects in each group
  • Test the null hypothesis using $Z_{n_1}$, defined analogously to $Z_n$
  • Reject if $|Z_{n_1}| > Z_{1-\alpha/2}$
  • Otherwise, collect $n - n_1$ additional subjects in each group and test the null hypothesis using $Z_n$

Operating Characteristics

• Evaluate the operating characteristics of the sequential design described in the previous slide:
  • Type-1 error rate
  • Power
  • Expected sample size
• How?
  • Consider the joint distribution of $(Z_{n_1}, Z_n)$

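Joint Distribution of $(\bar{X}_{n_1} - \bar{Y}_{n_1}, \bar{X}_{n-n_1} - \bar{Y}_{n-n_1})$

Let $\bar{X}_{n_1}, \bar{Y}_{n_1}$ be the sample means of the first $n_1$ subjects in each group and $\bar{X}_{n-n_1}, \bar{Y}_{n-n_1}$ the sample means of the remaining $n - n_1$ subjects. The two differences are based on disjoint sets of subjects, so they are independent, and their joint distribution is

$$\begin{pmatrix} \bar{X}_{n_1} - \bar{Y}_{n_1} \\ \bar{X}_{n-n_1} - \bar{Y}_{n-n_1} \end{pmatrix} \sim N\left( \begin{pmatrix} \mu_x - \mu_y \\ \mu_x - \mu_y \end{pmatrix}, \begin{pmatrix} \frac{2\sigma^2}{n_1} & 0 \\ 0 & \frac{2\sigma^2}{n - n_1} \end{pmatrix} \right)$$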

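Joint Distribution of $(\bar{X}_{n_1} - \bar{Y}_{n_1}, \bar{X}_n - \bar{Y}_n)$

It is immediate from the last slide that the joint distribution of $(\bar{X}_{n_1} - \bar{Y}_{n_1}, \bar{X}_n - \bar{Y}_n)$ is

$$\begin{pmatrix} \bar{X}_{n_1} - \bar{Y}_{n_1} \\ \bar{X}_n - \bar{Y}_n \end{pmatrix} \sim N\left( \begin{pmatrix} \mu_x - \mu_y \\ \mu_x - \mu_y \end{pmatrix}, \begin{pmatrix} \frac{2\sigma^2}{n_1} & \frac{2\sigma^2}{n} \\ \frac{2\sigma^2}{n} & \frac{2\sigma^2}{n} \end{pmatrix} \right)$$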
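The step described as immediate can be spelled out as follows. This derivation is added here as a sketch and is not taken from the original slides; it writes the full-sample difference as a weighted average of the two independent stage-wise differences.

```latex
\bar{X}_n - \bar{Y}_n
  = \frac{n_1}{n}\left(\bar{X}_{n_1} - \bar{Y}_{n_1}\right)
  + \frac{n - n_1}{n}\left(\bar{X}_{n-n_1} - \bar{Y}_{n-n_1}\right)

\operatorname{Var}\left(\bar{X}_n - \bar{Y}_n\right)
  = \frac{n_1^2}{n^2}\cdot\frac{2\sigma^2}{n_1}
  + \frac{(n - n_1)^2}{n^2}\cdot\frac{2\sigma^2}{n - n_1}
  = \frac{2\sigma^2}{n}

\operatorname{Cov}\left(\bar{X}_{n_1} - \bar{Y}_{n_1},\; \bar{X}_n - \bar{Y}_n\right)
  = \frac{n_1}{n}\operatorname{Var}\left(\bar{X}_{n_1} - \bar{Y}_{n_1}\right)
  = \frac{n_1}{n}\cdot\frac{2\sigma^2}{n_1}
  = \frac{2\sigma^2}{n}
```

The covariance therefore equals the variance of the full-sample difference, which is why the off-diagonal entries match the lower-right entry of the covariance matrix.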

Definitions

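Define
• $\delta_{n_1} = \bar{X}_{n_1} - \bar{Y}_{n_1}$
• $\delta_n = \bar{X}_n - \bar{Y}_n$
• $\delta = \mu_x - \mu_y$
• $I_{n_1} = \frac{n_1}{2\sigma^2}$ be the information for $\delta_{n_1}$
• $I_n = \frac{n}{2\sigma^2}$ be the information for $\delta_n$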

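Joint Distribution of $(\bar{X}_{n_1} - \bar{Y}_{n_1}, \bar{X}_n - \bar{Y}_n)$

The joint distribution of $(\bar{X}_{n_1} - \bar{Y}_{n_1}, \bar{X}_n - \bar{Y}_n)$ can be re-written as:

$$\begin{pmatrix} \delta_{n_1} \\ \delta_n \end{pmatrix} \sim N\left( \begin{pmatrix} \delta \\ \delta \end{pmatrix}, \begin{pmatrix} I_{n_1}^{-1} & I_n^{-1} \\ I_n^{-1} & I_n^{-1} \end{pmatrix} \right)$$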

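Joint Distribution of $(\delta_{n_1}, \delta_n)$

That is, $(\delta_{n_1}, \delta_n)$ follows a bivariate normal distribution with:

• $\delta_{n_1} \sim N(\delta, I_{n_1}^{-1})$
• $\delta_n \sim N(\delta, I_n^{-1})$
• $\operatorname{Cov}(\delta_{n_1}, \delta_n) = I_n^{-1}$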

A note on the joint distribution of $\delta_{n_1}$ and $\delta_n$

• Many commonly used estimators asymptotically follow a joint distribution similar to that of $\delta_{n_1}$ and $\delta_n$
• This allows us to develop sequential methodology in a common framework that can be broadly applied
• We will discuss this further in the future

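Joint Distribution of $(Z_{n_1}, Z_n)$

Under our new notation:

• $Z_{n_1} = \delta_{n_1} \sqrt{I_{n_1}}$
• $Z_n = \delta_n \sqrt{I_n}$

which results in the following joint distribution for $(Z_{n_1}, Z_n)$:

$$\begin{pmatrix} Z_{n_1} \\ Z_n \end{pmatrix} \sim N\left( \begin{pmatrix} \delta\sqrt{I_{n_1}} \\ \delta\sqrt{I_n} \end{pmatrix}, \begin{pmatrix} 1 & \sqrt{I_{n_1}/I_n} \\ \sqrt{I_{n_1}/I_n} & 1 \end{pmatrix} \right)$$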
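For readers who want the intermediate algebra, the entries of this distribution follow directly from the moments of the two estimators. The lines below are an added sketch, not part of the original slides.

```latex
E\left(Z_{n_1}\right) = \sqrt{I_{n_1}}\,E\left(\delta_{n_1}\right) = \delta\sqrt{I_{n_1}},
\qquad
\operatorname{Var}\left(Z_{n_1}\right) = I_{n_1}\cdot I_{n_1}^{-1} = 1

E\left(Z_n\right) = \sqrt{I_n}\,E\left(\delta_n\right) = \delta\sqrt{I_n},
\qquad
\operatorname{Var}\left(Z_n\right) = I_n\cdot I_n^{-1} = 1

\operatorname{Cov}\left(Z_{n_1}, Z_n\right)
  = \sqrt{I_{n_1} I_n}\,\operatorname{Cov}\left(\delta_{n_1}, \delta_n\right)
  = \sqrt{I_{n_1} I_n}\cdot I_n^{-1}
  = \sqrt{I_{n_1}/I_n}
```

Since $I_{n_1}/I_n = n_1/n$ in this example, the correlation between the interim and final statistics is simply the square root of the fraction of subjects enrolled at the interim analysis.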

Finding the type-I error rate

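We can find the type-I error rate by integrating over the joint distribution of $(Z_{n_1}, Z_n)$:

$$\begin{aligned} \text{type-I error rate} &= 1 - P\left(|Z_{n_1}| < Z_{1-\alpha/2} \;\&\; |Z_n| < Z_{1-\alpha/2} \mid \delta = 0\right) \\ &= 1 - \int_{Z_{\alpha/2}}^{Z_{1-\alpha/2}} \int_{Z_{\alpha/2}}^{Z_{1-\alpha/2}} f(z_{n_1}, z_n \mid \delta = 0)\, dz_n\, dz_{n_1} \end{aligned}$$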

Two chances to make a type-1 error

Alternately (and perhaps more instructively), we can consider the two ways to make a type-1 error:

• Incorrectly reject the null hypothesis at the interim analysis
• Incorrectly reject the null hypothesis at study completion, given that you did not stop at the interim analysis
• The type-1 error rate is the probability of incorrectly rejecting at the interim analysis plus the probability of continuing past the interim analysis and then incorrectly rejecting at study completion

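Probability of making a type-1 error at the interim analysis

It is straightforward to calculate the probability of making a type-1 error at the interim analysis:

$$\begin{aligned} P\left(|Z_{n_1}| > Z_{1-\alpha/2}\right) &= 1 - \int_{Z_{\alpha/2}}^{Z_{1-\alpha/2}} \int_{-\infty}^{\infty} f(z_{n_1}, z_n \mid \delta = 0)\, dz_n\, dz_{n_1} \\ &= 1 - \int_{Z_{\alpha/2}}^{Z_{1-\alpha/2}} f(z_{n_1} \mid \delta = 0)\, dz_{n_1} \\ &= \alpha \end{aligned}$$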

Probability of making a type-1 error at study completion

We can calculate the probability of making a type-1 error at study completion by multiplying the conditional probability of a type-I error at study completion, given that the trial reached full enrollment, by the probability of reaching full enrollment

• Let C be an indicator function taking the value 1 if $|Z_{n_1}| < Z_{1-\alpha/2}$
• Under the null, $P(C = 1) = 1 - \alpha$, so the probability of making a type-1 error at study completion is

$$(1 - \alpha)\left[1 - \int_{Z_{\alpha/2}}^{Z_{1-\alpha/2}} f(z_n \mid C = 1, \delta = 0)\, dz_n\right]$$

What is $f(z_n \mid C = 1, \delta = 0)$?

• Marginally, $Z_n$ follows a normal distribution: $Z_n \sim N(\delta\sqrt{I_n}, 1)$
• Allowing for early termination alters the distribution of $Z_n$ conditional on C = 1
• What is $f(z_n \mid C = 1, \delta = 0)$?

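Density of $Z_n$ conditional on C = 1

For a general $\delta$,

$$f(z_n \mid C = 1, \delta) = \frac{\Phi\left(\frac{\sqrt{I_n}\, Z_{1-\alpha/2} - \sqrt{I_{n_1}}\, z_n}{\sqrt{I_n - I_{n_1}}}\right) - \Phi\left(\frac{\sqrt{I_n}\, Z_{\alpha/2} - \sqrt{I_{n_1}}\, z_n}{\sqrt{I_n - I_{n_1}}}\right)}{\Phi\left(Z_{1-\alpha/2} - \delta\sqrt{I_{n_1}}\right) - \Phi\left(Z_{\alpha/2} - \delta\sqrt{I_{n_1}}\right)} \cdot \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(z_n - \delta\sqrt{I_n}\right)^2}$$

• $f(z_n \mid C = 1, \delta)$ is equal to the marginal density $f(z_n)$ multiplied by a factor to account for the possibility of early termination; setting $\delta = 0$ gives the density under the null
• $f(z_n \mid C = 1, \delta)$ depends on $\delta$, $\alpha$, $I_n$ and $I_{n_1}$
• The most important factor is $I_{n_1}/I_n$, the ratio of information at the interim analysis to the information at study completion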
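The conditional density can be evaluated numerically to see the effect of the correction factor. The sketch below is an illustration only: the names `conditional_density` and `simpson` are invented here, the density is parameterized by the ratio $I_{n_1}/I_n$ and by the mean of $Z_n$ (that is, $\delta\sqrt{I_n}$), and a plain Simpson's rule stands in for a proper integration library.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def conditional_density(z_n, info_ratio, alpha=0.05, mean_z_n=0.0):
    """f(z_n | C = 1, delta), where info_ratio = I_n1 / I_n and
    mean_z_n = delta * sqrt(I_n) is the marginal mean of Z_n."""
    lower = STD_NORMAL.inv_cdf(alpha / 2)
    upper = STD_NORMAL.inv_cdf(1 - alpha / 2)
    rho = sqrt(info_ratio)          # sqrt(I_n1 / I_n)
    cond_sd = sqrt(1 - info_ratio)  # sqrt((I_n - I_n1) / I_n)
    # Numerator: P(lower < Z_n1 < upper | Z_n = z_n); dividing the slide's
    # arguments by sqrt(I_n) gives (bound - rho * z_n) / cond_sd.
    numerator = (STD_NORMAL.cdf((upper - rho * z_n) / cond_sd)
                 - STD_NORMAL.cdf((lower - rho * z_n) / cond_sd))
    # Denominator: P(C = 1), using delta * sqrt(I_n1) = rho * mean_z_n.
    mean_z_n1 = rho * mean_z_n
    denominator = (STD_NORMAL.cdf(upper - mean_z_n1)
                   - STD_NORMAL.cdf(lower - mean_z_n1))
    marginal = STD_NORMAL.pdf(z_n - mean_z_n)
    return numerator / denominator * marginal


def simpson(func, lower, upper, steps=400):
    """Composite Simpson's rule on [lower, upper]; steps must be even."""
    width = (upper - lower) / steps
    total = func(lower) + func(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * func(lower + i * width)
    return total * width / 3


# Sanity checks under the null with I_n1 / I_n = 0.5.
total_mass = simpson(lambda z: conditional_density(z, 0.5), -8.0, 8.0)
ratio_at_3 = conditional_density(3.0, 0.5) / STD_NORMAL.pdf(3.0)
print(f"total mass = {total_mass:.6f}, density ratio at z_n = 3: {ratio_at_3:.3f}")
```

The density integrates to one, and the ratio to the standard normal density at $z_n = 3$ is below one, so the conditional density carries less mass in the tails than the marginal density of $Z_n$.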

Density of Zn conditional on C = 1

[Figure: $f(z_n \mid C = 1, \delta = 0)$ plotted against $z_n$ (from −3 to 3, density from 0.0 to 0.4) in four panels, for $I_{n_1}/I_n$ = 0.25, 0.50, 0.75 and 0.95]

$f(z_n \mid C = 1, \delta = 0)$ as a function of $I_{n_1}/I_n$

• $f(z_n \mid C = 1, \delta = 0)$ has lighter tails than $f(z_n)$
• The mass in the tails decreases as $I_{n_1}/I_n$ increases
• The probability of making a type-1 error at study completion decreases as $I_{n_1}/I_n$ increases

What is the type-1 error rate?

• Type-1 error depends on $I_{n_1}/I_n$
  • $I_{n_1}/I_n = 0.25$: type-1 error equals 0.091
  • $I_{n_1}/I_n = 0.50$: type-1 error equals 0.083
  • $I_{n_1}/I_n = 0.75$: type-1 error equals 0.073
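The inflated error rates above can be approximated with a one-dimensional integral: integrate the density of $Z_{n_1}$ over the continuation region, weighting by the probability that $Z_n$ also falls inside the boundaries. The code below is a sketch with hypothetical helper names (`simpson`, `overall_type1_error`) and uses Simpson's rule from scratch instead of a numerical library.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def simpson(func, lower, upper, steps=400):
    """Composite Simpson's rule on [lower, upper]; steps must be even."""
    width = (upper - lower) / steps
    total = func(lower) + func(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * func(lower + i * width)
    return total * width / 3


def overall_type1_error(info_ratio, nominal_alpha=0.05):
    """Overall type-1 error of the two-look design when both looks
    reject if |Z| exceeds the upper nominal_alpha/2 normal quantile."""
    crit = STD_NORMAL.inv_cdf(1 - nominal_alpha / 2)
    rho = sqrt(info_ratio)          # correlation of (Z_n1, Z_n)
    cond_sd = sqrt(1 - info_ratio)  # sd of Z_n given Z_n1

    def continue_and_accept(z1):
        # Under delta = 0, Z_n | Z_n1 = z1 ~ N(rho * z1, 1 - rho^2).
        accept_final = (STD_NORMAL.cdf((crit - rho * z1) / cond_sd)
                        - STD_NORMAL.cdf((-crit - rho * z1) / cond_sd))
        return STD_NORMAL.pdf(z1) * accept_final

    # 1 - P(no rejection at either look)
    return 1 - simpson(continue_and_accept, -crit, crit)


for ratio in (0.25, 0.50, 0.75):
    print(f"I_n1/I_n = {ratio:.2f}: type-1 error = {overall_type1_error(ratio):.3f}")
```

The printed values should land close to the figures on the slide; small differences can arise from rounding.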

Correcting the type-1 error rate

• The type-1 error rate is inflated regardless of $I_{n_1}/I_n$
• How do we correct the type-1 error rate?
• The simplest approach is to find $\alpha^*$ such that the overall type-1 error rate equals $\alpha$ when each analysis is tested at level $\alpha^*$
• These are known as Pocock stopping boundaries

Corrected stopping boundaries

• $\alpha^*$ for various values of $I_{n_1}/I_n$
  • $I_{n_1}/I_n = 0.25$: $\alpha^* = 0.027$
  • $I_{n_1}/I_n = 0.50$: $\alpha^* = 0.029$
  • $I_{n_1}/I_n = 0.75$: $\alpha^* = 0.033$
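One way to find $\alpha^*$ is a simple bisection search on the nominal level, since the overall error rate increases with the nominal level. The slides do not say how the values were computed, so treat this as an assumed approach; `nominal_alpha_star` and the helpers are names invented for this sketch.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def simpson(func, lower, upper, steps=400):
    """Composite Simpson's rule on [lower, upper]; steps must be even."""
    width = (upper - lower) / steps
    total = func(lower) + func(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * func(lower + i * width)
    return total * width / 3


def overall_type1_error(info_ratio, nominal_alpha):
    """Overall type-1 error when both looks use the same nominal level."""
    crit = STD_NORMAL.inv_cdf(1 - nominal_alpha / 2)
    rho = sqrt(info_ratio)
    cond_sd = sqrt(1 - info_ratio)

    def continue_and_accept(z1):
        accept_final = (STD_NORMAL.cdf((crit - rho * z1) / cond_sd)
                        - STD_NORMAL.cdf((-crit - rho * z1) / cond_sd))
        return STD_NORMAL.pdf(z1) * accept_final

    return 1 - simpson(continue_and_accept, -crit, crit)


def nominal_alpha_star(info_ratio, target_alpha=0.05, tolerance=1e-6):
    """Bisection for the nominal level alpha* whose overall error is target_alpha."""
    low, high = tolerance, target_alpha  # overall error at `high` exceeds the target
    while high - low > tolerance:
        mid = (low + high) / 2
        if overall_type1_error(info_ratio, mid) > target_alpha:
            high = mid
        else:
            low = mid
    return (low + high) / 2


for ratio in (0.25, 0.50, 0.75):
    print(f"I_n1/I_n = {ratio:.2f}: alpha* = {nominal_alpha_star(ratio):.3f}")
```

For $I_{n_1}/I_n = 0.5$ the search converges to roughly 0.029, in line with the slide.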

Type-1 error summary

• Interim looks inflate the type-1 error rate
• The amount of inflation depends on the proportion of information available at the interim analysis
• We can correct the type-1 error rate by altering the stopping boundaries

Power

• Recall that for $\delta = 0.5\sigma$ and $\alpha = 0.05$, the fixed-sample design has:
  • 0.70 power with n = 50
  • 0.94 power with n = 100
• How does altering the stopping boundaries impact power?

Power

• For $I_{n_1}/I_n = 0.50$, $\alpha^* = 0.029$ results in a type-1 error of 0.05
• Power to detect $\delta = 0.5\sigma$ for n = 50 is 0.66
• Power to detect $\delta = 0.5\sigma$ for n = 100 is 0.92

Power

• Power has decreased. Why?
  • More stringent criteria for rejecting the null hypothesis
• We have to increase the maximum sample size to assure adequate power
• Keeping $I_{n_1}/I_n = 0.50$
  • n = 56 results in power of 0.70
  • n = 110 results in power of 0.94
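The power of the two-stage design can be approximated with the same one-dimensional integration idea, now with a non-zero mean for $(Z_{n_1}, Z_n)$. This is a sketch rather than the author's calculation: the function name `two_stage_power` is hypothetical, δ is given in units of σ, and the nominal level defaults to the rounded value 0.029 from the slides.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def simpson(func, lower, upper, steps=400):
    """Composite Simpson's rule on [lower, upper]; steps must be even."""
    width = (upper - lower) / steps
    total = func(lower) + func(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * func(lower + i * width)
    return total * width / 3


def two_stage_power(n, delta_over_sigma, info_ratio=0.5, nominal_alpha=0.029):
    """P(reject at the interim or final analysis) with n subjects per group at most."""
    crit = STD_NORMAL.inv_cdf(1 - nominal_alpha / 2)
    mean_final = delta_over_sigma * sqrt(n / 2)      # delta * sqrt(I_n)
    mean_interim = sqrt(info_ratio) * mean_final     # delta * sqrt(I_n1)
    rho = sqrt(info_ratio)
    cond_sd = sqrt(1 - info_ratio)

    def never_reject(z1):
        # Z_n | Z_n1 = z1 ~ N(mean_final + rho * (z1 - mean_interim), 1 - rho^2)
        cond_mean = mean_final + rho * (z1 - mean_interim)
        accept_final = (STD_NORMAL.cdf((crit - cond_mean) / cond_sd)
                        - STD_NORMAL.cdf((-crit - cond_mean) / cond_sd))
        return STD_NORMAL.pdf(z1 - mean_interim) * accept_final

    return 1 - simpson(never_reject, -crit, crit)


for n in (50, 56, 100, 110):
    print(f"n = {n}: power = {two_stage_power(n, 0.5):.3f}")
```

The results should be close to the powers quoted on the slides, with small differences possible because $\alpha^*$ is rounded.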

Power as a function of $I_{n_1}/I_n$

• Power increases as $I_{n_1}/I_n$ increases
• For $\delta = 0.5\sigma$ and n = 100
  • $I_{n_1}/I_n = 0.25$ results in a power of 0.91
  • $I_{n_1}/I_n = 0.75$ results in a power of 0.93

Maximum sample size as a function of $I_{n_1}/I_n$

• The sample size inflation factor is also related to $I_{n_1}/I_n$
• For $\delta = 0.5\sigma$ and overall type-1 error of 0.05
  • $I_{n_1}/I_n = 0.25$ requires a maximum sample size of 113 to maintain power = 0.94
  • $I_{n_1}/I_n = 0.75$ requires a maximum sample size of 106 to maintain power = 0.94

Summary

• We have developed a two-stage design with the same type-I error and power as a fixed-sample design
  • Changed $\alpha$ to $\alpha^*$ to maintain the type-I error rate
  • Increased the maximum sample size to maintain power
• What is the benefit?

Sample Size

• Sample size in a sequential study is a random variable
• The sample size is either $n_1$ or $n$ depending on $Z_{n_1}$
• We evaluate the benefit of a sequential design by considering the expected sample size

Expected Sample Size

The expected sample size is calculated as

$$E(SS) = n_1 \left(1 - P(C = 1)\right) + n\, P(C = 1)$$

where C is an indicator that the trial reached full enrollment:

$$P(C = 1) = \int_{Z_{\alpha/2}}^{Z_{1-\alpha/2}} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(z_{n_1} - \delta\sqrt{I_{n_1}}\right)^2}\, dz_{n_1}$$

Note that the expected sample size depends on $\delta$ and will be different for the null and alternative hypotheses
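The expected sample size formula is easy to evaluate directly. In the sketch below the function name `expected_sample_size` is hypothetical, δ is given in units of σ, sample sizes are per group, and the integration limits use the corrected nominal level $\alpha^* = 0.029$ (an assumption made so that the boundaries match the corrected two-stage design).

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def expected_sample_size(n, delta_over_sigma, info_ratio=0.5, nominal_alpha=0.029):
    """E(SS) = n1 * (1 - P(C = 1)) + n * P(C = 1), per group."""
    n1 = info_ratio * n
    crit = STD_NORMAL.inv_cdf(1 - nominal_alpha / 2)
    mean_interim = delta_over_sigma * sqrt(n1 / 2)  # delta * sqrt(I_n1)
    # P(C = 1): the interim statistic Z_n1 ~ N(mean_interim, 1) stays inside (-crit, crit)
    p_continue = (STD_NORMAL.cdf(crit - mean_interim)
                  - STD_NORMAL.cdf(-crit - mean_interim))
    return n1 * (1 - p_continue) + n * p_continue


print(f"null:        E(SS) = {expected_sample_size(110, 0.0):.1f}")
print(f"alternative: E(SS) = {expected_sample_size(110, 0.5):.1f}")
```

With a maximum of 110 subjects per group and half the information at the interim analysis, this gives an expected sample size of about 108 under the null and about 73 under $\delta = 0.5\sigma$.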

Expected Sample Size

• For our two-stage design, we have
  • Overall type-1 error rate equal to 0.05
  • Power of 0.94
  • Maximum sample size of 110
  • $I_{n_1}/I_n = 0.5$
• What is the expected sample size?
  • Under the null, E(SS) = 108
  • Under the alternative of $\delta = 0.5\sigma$, E(SS) = 73

Expected Sample Size as a function of $I_{n_1}/I_n$

• Expected sample size is also related to $I_{n_1}/I_n$
• Assuming overall $\alpha$ equal to 0.05 and power equal to 0.94
• For $I_{n_1}/I_n = 0.25$
  • Under the null, E(SS) = 110
  • Under the alternative of $\delta = 0.5\sigma$, E(SS) = 82
• For $I_{n_1}/I_n = 0.75$
  • Under the null, E(SS) = 105
  • Under the alternative of $\delta = 0.5\sigma$, E(SS) = 83

Comparison

• Fixed-sample design
  • $\alpha$ = 0.05
  • Power = 0.94
  • N = 100
• Two-stage group sequential design
  • $\alpha$ = 0.05
  • Power = 0.94
  • E(N) = 108 under the null
  • E(N) = 73 under the alternative
• Which is the better design?
  • Is the slight increase in sample size under the null (i.e. the worst case scenario) worth a substantial reduction under the alternative?

Summary

• Adding interim analyses increases the type-I error rate
  • This can be fixed by changing the stopping boundaries
• Correcting the stopping boundaries results in a decrease in power
  • Increase the maximum sample size to achieve the desired power
• Sample size is now stochastic
  • Sequential designs result in dramatic reductions in the expected sample size in some cases
