Pubh 8482: Sequential Analysis
Course Introduction
Joseph S. Koopmeiners
Division of Biostatistics
University of Minnesota
Week 1
Who am I?
• Joe Koopmeiners
• Assistant Professor in Division of Biostatistics
• PhD - University of Washington, 2009
• Research Interests:
  • Sequential and adaptive methods for translational cancer research
  • Statistical evaluation of diagnostic tests
• Collaborative Projects:
  • Tobacco cessation
  • MRI as a diagnostic tool for prostate cancer
  • Statistical support for early phase clinical trials
Who are you?
• Name
• Year in program (program, if not in biostatistics)
• Advisor, if known
Pre-requisites
• Stat 8101-8102
• Students must be comfortable with the multivariate normal distribution
Text Books
There are no required texts for this course but lecture notes will draw heavily from:
• Jennison, C. and Turnbull, B. (1999) Group Sequential Methods with Applications to Clinical Trials, Boca Raton: CRC Press. ISBN 0849303168
• Berry, S.M., Carlin, B.P., Lee, J.J. and Muller, P. (2010) Bayesian Adaptive Methods for Clinical Trials, Boca Raton: CRC Press. ISBN 1439825483
Text Books
Other useful textbooks on reserve in the biostat reading room include:
• Whitehead, J. (1997) The Design and Analysis of Sequential Clinical Trials, 2nd Ed., New York: John Wiley & Sons. ISBN 0471975508
• Proschan, M. A., Lan, K.K.G. and Wittes, J.T. (2006) Statistical Monitoring of Clinical Trials: A Unified Approach, New York: Springer. ISBN 0387300597
• Yin, G. (2012) Clinical Trial Design: Bayesian and Frequentist Adaptive Methods, Hoboken: John Wiley & Sons. ISBN 0470581719
Evaluation and Grades
• Homework - 40%
• Mid-term exam - 30%
• Final project - 30%
Homework
• About 5 homework assignments
• You will have at least 2 weeks to work on the assignments
• Homework will be due on Tuesdays
Mid-term Exam
• Take-home mid-term about halfway through the course
• Will cover what I consider “classic” group sequential methodology
• You will have one week to complete the exam
Class Project
• Pick a specific topic related to sequential and adaptive clinical trials
• Written report and 20 - 30 minute presentation
• Discuss statistical challenges specific to your topic
• Literature review of current research
• Identify open research areas
Office Hours
• Monday 3:00 p.m. - 4:00 p.m.
• Thursday 1:00 p.m. - 2:00 p.m.
Course Website
• Course Website
http://www.biostat.umn.edu/ josephk/courses/pubh8482 fall2012/
• Linked from my faculty webpage
Course Title
• Current Title:
  • Sequential Analysis
• More Accurate Title:
  • Sequential and Adaptive Methods for Clinical Trials
What is a clinical trial?
• Clinical Trial: A controlled experiment to test the safety or efficacy of a treatment or intervention
• Usually randomized, although this is not always the case, especially in phases 1 and 2
Fixed-sample Design
• Most studies utilize fixed-sample designs
• Fixed-sample design: collect a pre-specified number of subjects and test our hypothesis
  • Randomize 200 patients to either a novel treatment or placebo and compare survival using a log-rank test
Sequential Design
• An alternate approach is a sequential design
• Sequential design: sequentially monitor the primary endpoint and continue to enroll subjects based on the interim results
  • Randomize an initial cohort of patients to a novel treatment or placebo and compare survival using a log-rank test
  • Determine if more patients should be enrolled based on a pre-specified stopping rule
Continuous Monitoring
• Early sequential methods focused on continuous monitoring
  • Evaluating the endpoint after each new experimental unit
  • Quality control for bombs during WWII
• These methods are not practical in the setting of clinical trials
Group Sequential Methods
• Sequential clinical trials generally rely on group sequential methodology
• Group sequential methods: interim analyses are completed at pre-specified intervals throughout the study
  • Randomize the first 50 patients to either a novel treatment or placebo and compare survival using a log-rank test
  • Determine if more patients should be enrolled based on a pre-specified stopping rule
  • Re-evaluate the endpoint after every 50 patients until a total of 200 have been enrolled
Adaptive Design
• An adaptive design is a design that uses accumulating data from the ongoing trial to modify certain aspects of the study
  • Sample size
  • Treatment dose
  • Randomization ratio
  • Study arms
Sequential vs. Adaptive Designs
• There is no clear distinction between what constitutes a sequential design and what constitutes an adaptive design
• Both rely on interim analyses to modify the design
• Sequential designs generally only modify the sample size (by stopping early), while the term adaptive is used for designs with broader modifications
• Both face similar statistical challenges
Sequential and Adaptive Designs: Challenges
• Clinical trials are designed to achieve desired operating characteristics
  • Type-I error
  • Power
• Sequential and adaptive methods alter the operating characteristics of the study
• Challenge: Incorporate sequential and adaptive methods while maintaining the desired operating characteristics
• Goal: Show that sequential and adaptive methods have the same type-I error rate and power as a fixed-sample design but smaller sample size or other desirable property
Phases of Drug Development
• Phase 1: Safety trials
• Phase 2: Efficacy trials
• Phase 3: Confirmatory trials
• Phase 4: Post-marketing surveillance
Phase 1 Clinical Trials
• First-in-human trials
• Primary objective is to evaluate safety
• Efficacy is a secondary concern
• Small trials: 10 - 50 subjects
Phase 1 Trials Example
Phase 1 clinical trial in oncology
• Estimate the probability of dose limiting toxicity (DLT)
• We assume that as dose increases, the probability of DLT and the probability of efficacy will also increase
• Maximum tolerated dose (MTD): Highest dose with probability of DLT less than some pre-specified cut-off (usually 0.2 or 0.33)
Dose Escalation
• We would like to do a randomized study
• Too dangerous for first-in-human studies
• Instead, we complete non-randomized dose escalation studies
• Patients receive progressively higher doses until MTD has been identified
• Adaptive designs are used to guide dose escalation
Patients vs. Healthy Volunteers
• Healthy volunteers are used in other settings where toxicities are less severe
• Phase 1 oncology trials include patients for whom standard treatments have failed
• Added goal of treating patients with an efficacious dose
• Adaptive designs are used to treat patients at dose levels that are more likely to be efficacious
Phase 2 Clinical Trials
• Goal of Phase 2: evaluate the efficacy of a novel therapeutic agent
• Surrogate endpoints are often used in place of hard endpoints
  • Phase 2 oncology trial: tumor response instead of overall survival
• Continue to evaluate safety of new drug
Stopping for futility
• The majority of novel therapeutic agents will not be adequately efficacious
• Clinical trials for ineffective treatments are expensive and fail to provide adequate care for study subjects
• Early termination for futility allows ineffective treatments to be abandoned if initial estimates of treatment efficacy are not promising
Dropping Study Arms
• The optimal dose/treatment schedule is unlikely to be known after Phase 1
• We might run a multi-arm study to investigate multiple dose levels/treatment schedules and adaptively drop arms to save time/money/etc.
Safety Monitoring
• The safety profile of a new drug continues to be evaluated in Phase 2
• Dual goals of answering scientific question/protecting study subjects
• Sequential stopping rules are used to monitor the rate of adverse events
Personalized Medicine
• Personalized medicine refers to customized treatment decisions based on patient characteristics such as genetic or other information
• A phase 2 trial could be designed to investigate the effectiveness of a new treatment in different subpopulations
• We might design a study to adaptively assign subjects to one of several treatments or drop subgroups for which the drug is not effective
Phase 3 Clinical Trials
• Final, confirmatory trial for new therapeutic agent
• Much larger than phase 2
• Hard endpoints
  • Overall survival instead of clinical response
Sequential Monitoring in Phase 3
• Most common setting for sequential monitoring
• Stopping rules set in advance to allow early termination for efficacy or futility
• Get new treatments onto the market faster
• Save time and money when treatments are not promising
Adaptive Randomization
• Typically, subjects are randomized to treatment or control using a fixed randomization ratio
• Alternately, we could change the randomization ratio over time so that more subjects are assigned to the “better” treatment
• Better outcomes for study subjects
Sample Size Re-estimation
• Sample size is often calculated based on nuisance parameters for which there is little information
• Incorrectly specified nuisance parameters can lead to under-powered studies
• Re-estimate sample size using updated estimates of nuisance parameters at interim analyses
Sample Size Re-estimation
• What if the true effect size is smaller than we anticipated?
• We can re-estimate the sample size using an updated estimate of the effect size at interim analyses
• This is controversial
In Summary
Motivations for using sequential/adaptive designs can be grouped into the following categories:
• Ethical
• Economic
• Administrative
Ethical
• Minimize the number of subjects treated with ineffective treatments
• Make new treatments available to the public more quickly
• Protect study subjects
Economic
• Save money on the trial by terminating early
• Early termination allows company to profit from new drug sooner
Administrative
• Evaluate composition of study population
• Determine if study procedures are being followed correctly
• Check model assumptions
Course Objectives
• Students will be familiar with standard group sequential methodology
• Students will be exposed to adaptive methods in clinical trials
• Students will understand the challenges of applying sequential and adaptive methods to clinical trials
• Students will understand the advantages and disadvantages of sequential and adaptive designs
Course Outline
• Week 1: Course Introduction
• Weeks 2 and 3: Sequential testing of Normal Random Variables
• Week 4: Brownian Motion and Asymptotically Normal test statistics
• Week 5: Estimation after a sequential trial
• Week 6: Confidence intervals and p-values
• Week 7: Bayesian Sequential methods
• Weeks 8 and 9: Adaptive methods for Phase 1 clinical trials
• Weeks 10 and 11: Adaptive methods for Phase 2 clinical trials
• Weeks 12 and 13: Adaptive methods for Phase 3 clinical trials
• Week 14: Student Presentations
Frequentist vs. Bayesian
It is worth pointing out...
• I don’t consider myself a Frequentist or a Bayesian
• I am comfortable with both paradigms and do what I think is best for the specific problem
• Both approaches will be discussed in this course
• In general,
  • The first half of the class will focus on sequential designs and be more Frequentist
  • The second half of the class will focus on adaptive designs and be more Bayesian
Comparing Normal Means with Known Variance
• Let $X_1, X_2, \ldots, X_n$ be i.i.d. $N(\mu_x, \sigma^2)$
• Let $Y_1, Y_2, \ldots, Y_n$ be i.i.d. $N(\mu_y, \sigma^2)$
• $\sigma^2$ known
Null hypothesis
Consider a two-sided test of
\[ H_0: \mu_x = \mu_y \quad \text{vs.} \quad H_a: \mu_x \neq \mu_y \]
Fixed-sample test
• Collect n subjects in each group
• Test null hypothesis using the following test statistic
\[ Z_n = \frac{\bar{X}_n - \bar{Y}_n}{\sqrt{2\sigma^2/n}} \]
• Under the null, $Z_n \sim N(0,1)$
• Reject if $|Z_n| > Z_{1-\alpha/2}$
• Results in type-1 error rate of α
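The fixed-sample test above can be sketched in a few lines of Python. This is only an illustration using the standard library; the function names and the data are hypothetical, and the known standard deviation is passed in as `sigma`.

```python
from math import sqrt
from statistics import NormalDist, mean


def two_sample_z(x, y, sigma):
    """Z_n = (mean(x) - mean(y)) / sqrt(2 * sigma^2 / n), equal group sizes."""
    if len(x) != len(y):
        raise ValueError("both groups must have the same size n")
    n = len(x)
    return (mean(x) - mean(y)) / sqrt(2 * sigma**2 / n)


def reject_null(z, alpha=0.05):
    """Two-sided test: reject when |z| exceeds the 1 - alpha/2 normal quantile."""
    return abs(z) > NormalDist().inv_cdf(1 - alpha / 2)


# Hypothetical data, chosen only for illustration
x = [5.1, 4.8, 6.0, 5.5]
y = [4.9, 4.2, 5.1, 4.6]
z = two_sample_z(x, y, sigma=1.0)
print(z, reject_null(z))
```

The critical value comes from the standard normal quantile function, so the same helper works for any α.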
Power
Power (1 − β) for the fixed-sample test can be calculated from the following formula:
\[ 1 - \beta = 1 - \Phi\left( Z_{1-\alpha/2} - \sqrt{n}\,\frac{\delta}{\sqrt{2\sigma^2}} \right) \]
where δ = µx − µy is the difference in means under the alternative hypothesis. With δ = 0.5σ and α = 0.05,
• n = 50 results in 1 − β = 0.70
• n = 100 results in 1 − β = 0.94
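As a quick check of the power formula, the following sketch evaluates it with the Python standard library. The function name is hypothetical, and `sigma=1.0` is assumed so that δ = 0.5 corresponds to δ = 0.5σ.

```python
from math import sqrt
from statistics import NormalDist


def fixed_sample_power(n, delta, sigma=1.0, alpha=0.05):
    """Power of the two-sided fixed-sample z-test with n subjects per group.

    Follows the formula on the slide, which ignores the negligible chance
    of rejecting in the wrong tail.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    return 1 - std_normal.cdf(z_crit - sqrt(n) * delta / sqrt(2 * sigma**2))


for n in (50, 100):
    print(n, fixed_sample_power(n, delta=0.5))
```

The printed values should land close to the 0.70 and 0.94 quoted above.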
Operating Characteristics
The fixed sample design has:
• Total sample size of 2n
• Type-1 error equal to α
• Power for δ = 0.5σ and α = 0.05:
  • 0.70 for n = 50
  • 0.94 for n = 100
Group Sequential Design
• We could also consider a group sequential design
• Ideally, we would find a sequential design with
  • The same type-1 error rate and power as the fixed-sample design
  • Smaller expected sample size
One Interim Analysis
• Consider the simplest case of one interim analysis:
  • Collect an initial sample of $n_1$ subjects in each group
  • Test the null hypothesis using $Z_{n_1}$, defined analogously to $Z_n$
  • Reject if $|Z_{n_1}| > Z_{1-\alpha/2}$
  • Otherwise, collect $n - n_1$ additional subjects in each group and test the null hypothesis using $Z_n$
Operating Characteristics
• Evaluate the operating characteristics of the sequential design described in the previous slide:
  • Type-1 error rate
  • Power
  • Expected sample size
• How?
  • Consider the joint distribution of $(Z_{n_1}, Z_n)$
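Joint Distribution of $(\bar{X}_{n_1} - \bar{Y}_{n_1}, \bar{X}_{n-n_1} - \bar{Y}_{n-n_1})$
The joint distribution of $(\bar{X}_{n_1} - \bar{Y}_{n_1}, \bar{X}_{n-n_1} - \bar{Y}_{n-n_1})$ is
\[
\begin{pmatrix} \bar{X}_{n_1} - \bar{Y}_{n_1} \\ \bar{X}_{n-n_1} - \bar{Y}_{n-n_1} \end{pmatrix}
\sim N\left(
\begin{pmatrix} \mu_x - \mu_y \\ \mu_x - \mu_y \end{pmatrix},
\begin{pmatrix} \frac{2\sigma^2}{n_1} & 0 \\ 0 & \frac{2\sigma^2}{n-n_1} \end{pmatrix}
\right)
\]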
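Joint Distribution of $(\bar{X}_{n_1} - \bar{Y}_{n_1}, \bar{X}_n - \bar{Y}_n)$
It is immediate from the last slide that the joint distribution of $(\bar{X}_{n_1} - \bar{Y}_{n_1}, \bar{X}_n - \bar{Y}_n)$ is
\[
\begin{pmatrix} \bar{X}_{n_1} - \bar{Y}_{n_1} \\ \bar{X}_n - \bar{Y}_n \end{pmatrix}
\sim N\left(
\begin{pmatrix} \mu_x - \mu_y \\ \mu_x - \mu_y \end{pmatrix},
\begin{pmatrix} \frac{2\sigma^2}{n_1} & \frac{2\sigma^2}{n} \\ \frac{2\sigma^2}{n} & \frac{2\sigma^2}{n} \end{pmatrix}
\right)
\]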
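The step from the previous slide can be spelled out. The sketch below assumes, as the subscripts suggest, that the quantity indexed by $n - n_1$ is the difference in means over the last $n - n_1$ subjects in each group, so it is independent of the interim difference.

```latex
\begin{align*}
\bar{X}_n - \bar{Y}_n
  &= \frac{n_1}{n}\left(\bar{X}_{n_1} - \bar{Y}_{n_1}\right)
   + \frac{n - n_1}{n}\left(\bar{X}_{n-n_1} - \bar{Y}_{n-n_1}\right) \\
\mathrm{Var}\left(\bar{X}_n - \bar{Y}_n\right)
  &= \frac{n_1^2}{n^2}\cdot\frac{2\sigma^2}{n_1}
   + \frac{(n - n_1)^2}{n^2}\cdot\frac{2\sigma^2}{n - n_1}
   = \frac{2\sigma^2}{n} \\
\mathrm{Cov}\left(\bar{X}_{n_1} - \bar{Y}_{n_1},\ \bar{X}_n - \bar{Y}_n\right)
  &= \frac{n_1}{n}\,\mathrm{Var}\left(\bar{X}_{n_1} - \bar{Y}_{n_1}\right)
   = \frac{n_1}{n}\cdot\frac{2\sigma^2}{n_1}
   = \frac{2\sigma^2}{n}
\end{align*}
```

The covariance equals the variance of the final estimate, which is the structure the later slides restate in terms of information.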
Definitions
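Define
• $\delta_{n_1} = \bar{X}_{n_1} - \bar{Y}_{n_1}$
• $\delta_n = \bar{X}_n - \bar{Y}_n$
• $\delta = \mu_x - \mu_y$
• $I_{n_1} = \frac{n_1}{2\sigma^2}$ be the information for $\delta_{n_1}$
• $I_n = \frac{n}{2\sigma^2}$ be the information for $\delta_n$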
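Joint Distribution of $(\bar{X}_{n_1} - \bar{Y}_{n_1}, \bar{X}_n - \bar{Y}_n)$
The joint distribution of $(\bar{X}_{n_1} - \bar{Y}_{n_1}, \bar{X}_n - \bar{Y}_n)$ can be re-written as:
\[
\begin{pmatrix} \delta_{n_1} \\ \delta_n \end{pmatrix}
\sim N\left(
\begin{pmatrix} \delta \\ \delta \end{pmatrix},
\begin{pmatrix} I_{n_1}^{-1} & I_n^{-1} \\ I_n^{-1} & I_n^{-1} \end{pmatrix}
\right)
\]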
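Joint Distribution of $(\delta_{n_1}, \delta_n)$
That is, $(\delta_{n_1}, \delta_n)$ follows a bivariate normal distribution with:
• $\delta_{n_1} \sim N(\delta, I_{n_1}^{-1})$
• $\delta_n \sim N(\delta, I_n^{-1})$
• $\mathrm{Cov}(\delta_{n_1}, \delta_n) = I_n^{-1}$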
A note on the joint distribution of δn1 and δn
• Many commonly used estimators asymptotically follow a joint distribution similar to that of δn1 and δn
• This allows us to develop sequential methodology in a common framework that can be broadly applied
• We will discuss this further in the future
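Joint Distribution of $(Z_{n_1}, Z_n)$
Under our new notation:
• $Z_{n_1} = \delta_{n_1}\sqrt{I_{n_1}}$
• $Z_n = \delta_n\sqrt{I_n}$
which results in the following joint distribution for $(Z_{n_1}, Z_n)$
\[
\begin{pmatrix} Z_{n_1} \\ Z_n \end{pmatrix}
\sim N\left(
\begin{pmatrix} \delta\sqrt{I_{n_1}} \\ \delta\sqrt{I_n} \end{pmatrix},
\begin{pmatrix} 1 & \sqrt{I_{n_1}/I_n} \\ \sqrt{I_{n_1}/I_n} & 1 \end{pmatrix}
\right)
\]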
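The correlation of $\sqrt{I_{n_1}/I_n}$ between the interim and final statistics can be checked by simulation. The sketch below is an illustration only: the function names, seed, and sample sizes are hypothetical, and it draws the two independent stage-wise mean differences directly under δ = 0 rather than simulating individual subjects.

```python
import random
from math import sqrt


def simulate_z_pairs(n1, n, reps=20000, sigma=1.0, seed=1):
    """Simulate (Z_n1, Z_n) under delta = 0 from independent stage-wise differences."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(reps):
        # Difference in means over the first n1 and the remaining n - n1 subjects per group
        d_first = rng.gauss(0.0, sqrt(2 * sigma**2 / n1))
        d_rest = rng.gauss(0.0, sqrt(2 * sigma**2 / (n - n1)))
        # Difference in means over all n subjects per group
        d_all = (n1 * d_first + (n - n1) * d_rest) / n
        z_n1 = d_first * sqrt(n1 / (2 * sigma**2))
        z_n = d_all * sqrt(n / (2 * sigma**2))
        pairs.append((z_n1, z_n))
    return pairs


def sample_correlation(pairs):
    """Pearson correlation of a list of (a, b) pairs."""
    m = len(pairs)
    mean_a = sum(a for a, _ in pairs) / m
    mean_b = sum(b for _, b in pairs) / m
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    return cov / sqrt(var_a * var_b)


pairs = simulate_z_pairs(n1=25, n=100)
print(sample_correlation(pairs), sqrt(25 / 100))
```

With $n_1 = 25$ and $n = 100$ the information ratio is 0.25, so the empirical correlation should sit near 0.5.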
Finding the type-I error rate
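We can find the type-I error rate by integrating over the joint distribution of $(Z_{n_1}, Z_n)$
\[
\begin{aligned}
\text{type-I error rate} &= 1 - P\left( |Z_{n_1}| < Z_{1-\alpha/2} \ \&\ |Z_n| < Z_{1-\alpha/2} \mid \delta = 0 \right) \\
&= 1 - \int_{Z_{\alpha/2}}^{Z_{1-\alpha/2}} \int_{Z_{\alpha/2}}^{Z_{1-\alpha/2}} f(z_{n_1}, z_n \mid \delta = 0)\, dz_n\, dz_{n_1}
\end{aligned}
\]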
Two chances to make a type-1 error
Alternately (and perhaps more instructively), we can consider the two ways to make a type-1 error
• Incorrectly reject null hypothesis at interim analysis
• Incorrectly reject null hypothesis at study completion given that you did not stop at the interim analysis
• The type-1 error rate is the probability of incorrectly rejecting at the interim analysis plus the probability of continuing past the interim analysis and then incorrectly rejecting at study completion
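Probability of making a type-1 error at the interim analysis
It is straightforward to calculate the probability of making a type-1 error at the interim analysis
\[
\begin{aligned}
P\left( |Z_{n_1}| > Z_{1-\alpha/2} \right)
&= 1 - \int_{Z_{\alpha/2}}^{Z_{1-\alpha/2}} \int_{-\infty}^{\infty} f(z_{n_1}, z_n \mid \delta = 0)\, dz_n\, dz_{n_1} \\
&= 1 - \int_{Z_{\alpha/2}}^{Z_{1-\alpha/2}} f(z_{n_1} \mid \delta = 0)\, dz_{n_1} \\
&= \alpha
\end{aligned}
\]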
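Probability of making a type-1 error at study completion
We can calculate the probability of making a type-1 error at study completion by multiplying the conditional probability of a type-1 error at study completion, given that the trial reached full enrollment, by the probability of reaching full enrollment
• Let $C$ be an indicator function taking the value 1 if $|Z_{n_1}| < Z_{1-\alpha/2}$
• The integral of the conditional density over the acceptance region is the probability of not rejecting at study completion, so the probability of making a type-1 error at study completion is
\[ (1 - \alpha)\left[ 1 - \int_{Z_{\alpha/2}}^{Z_{1-\alpha/2}} f(z_n \mid C = 1, \delta = 0)\, dz_n \right] \]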
What is $f(z_n \mid C = 1, \delta = 0)$?
• Marginally, $Z_n$ follows a normal distribution: $Z_n \sim N(\delta\sqrt{I_n}, 1)$
• Allowing for early termination alters the distribution of $Z_n$ conditional on $C = 1$
• What is $f(z_n \mid C = 1, \delta = 0)$?
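Density of $Z_n$ conditional on $C = 1$
For a general δ, the conditional density is
\[
f(z_n \mid C = 1, \delta) =
\frac{
\Phi\left( \frac{\sqrt{I_n}\, Z_{1-\alpha/2} - \sqrt{I_{n_1}}\, z_n}{\sqrt{I_n - I_{n_1}}} \right)
- \Phi\left( \frac{\sqrt{I_n}\, Z_{\alpha/2} - \sqrt{I_{n_1}}\, z_n}{\sqrt{I_n - I_{n_1}}} \right)
}{
\Phi\left( Z_{1-\alpha/2} - \delta\sqrt{I_{n_1}} \right)
- \Phi\left( Z_{\alpha/2} - \delta\sqrt{I_{n_1}} \right)
}
\cdot \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}\left( z_n - \delta\sqrt{I_n} \right)^2}
\]
• $f(z_n \mid C = 1, \delta)$ is equal to $f(z_n)$ multiplied by a factor to account for the possibility of early termination; setting δ = 0 gives the density under the null
• $f(z_n \mid C = 1, \delta)$ depends on δ, α, $I_n$ and $I_{n_1}$
• The most important factor is $I_{n_1}/I_n$, the ratio of information at the interim analysis to the information at study completion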
Density of Zn conditional on C = 1
[Figure: four panels showing the density of Z_n conditional on C = 1 plotted against z_n (horizontal axis from −3 to 3, vertical axis from 0.0 to 0.4), one panel each for I_n1/I_n = 0.25, 0.50, 0.75 and 0.95]
$f(z_n \mid C = 1, \delta = 0)$ as a function of $I_{n_1}/I_n$
• $f(z_n \mid C = 1, \delta = 0)$ has lighter tails than $f(z_n)$
• The mass in the tails decreases as $I_{n_1}/I_n$ increases
• The probability of making a type-1 error at study completion decreases as $I_{n_1}/I_n$ increases
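The effect of the information ratio on the tails can be explored numerically. The sketch below evaluates the conditional density under the null and integrates it with Simpson's rule; the function names are hypothetical, and it works with the ratio $I_{n_1}/I_n$ directly (dividing numerator and denominator of the Φ arguments by $\sqrt{I_n}$).

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()


def conditional_density(z, ratio, alpha=0.05):
    """f(z_n | C = 1, delta = 0), with ratio = I_n1 / I_n strictly between 0 and 1."""
    c = STD.inv_cdf(1 - alpha / 2)
    rho, s = sqrt(ratio), sqrt(1 - ratio)
    # Probability of continuing past the interim look given Z_n = z
    continue_weight = STD.cdf((c - rho * z) / s) - STD.cdf((-c - rho * z) / s)
    return STD.pdf(z) * continue_weight / (1 - alpha)


def simpson(f, a, b, steps=2000):
    """Composite Simpson's rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def tail_mass(ratio, alpha=0.05):
    """P(|Z_n| > Z_{1-alpha/2} | C = 1, delta = 0)."""
    c = STD.inv_cdf(1 - alpha / 2)
    return 1 - simpson(lambda z: conditional_density(z, ratio, alpha), -c, c)


for ratio in (0.25, 0.50, 0.75, 0.95):
    print(ratio, tail_mass(ratio))
```

The printed tail mass should shrink as the ratio grows, matching the bullets above.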
What is the type-1 error rate?
• Type-1 error depends on $I_{n_1}/I_n$
  • $I_{n_1}/I_n = 0.25$: type-1 error equals 0.091
  • $I_{n_1}/I_n = 0.50$: type-1 error equals 0.083
  • $I_{n_1}/I_n = 0.75$: type-1 error equals 0.073
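These error rates can be approximated by integrating the bivariate normal distribution of the two test statistics under the null. The sketch below is one way to do it with the standard library: it conditions on the interim statistic and applies Simpson's rule. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()


def simpson(f, a, b, steps=2000):
    """Composite Simpson's rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def type1_error(ratio, alpha=0.05):
    """Overall type-1 error when both looks use the nominal level alpha.

    ratio = I_n1 / I_n; the correlation between the two statistics is sqrt(ratio).
    """
    c = STD.inv_cdf(1 - alpha / 2)
    rho, s = sqrt(ratio), sqrt(1 - ratio)

    def accept_both(z1):
        # Given Z_n1 = z1 under the null, Z_n ~ N(rho * z1, 1 - rho^2)
        accept_final = STD.cdf((c - rho * z1) / s) - STD.cdf((-c - rho * z1) / s)
        return STD.pdf(z1) * accept_final

    return 1 - simpson(accept_both, -c, c)


for ratio in (0.25, 0.50, 0.75):
    print(ratio, type1_error(ratio))
```

The results should be close to the values listed above, and in every case above the nominal 0.05.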
Correcting the type-1 error rate
• The type-1 error rate is inflated regardless of $I_{n_1}/I_n$
• How do we correct the type-1 error rate?
• The simplest approach is to find α∗ such that the overall type-1 error rate equals α
• These are known as Pocock stopping boundaries
Corrected stopping boundaries
• α∗ for various values of $I_{n_1}/I_n$
  • $I_{n_1}/I_n = 0.25$: α∗ = 0.027
  • $I_{n_1}/I_n = 0.50$: α∗ = 0.029
  • $I_{n_1}/I_n = 0.75$: α∗ = 0.033
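One way to find α∗ numerically is a bisection search on the nominal level used at both looks. The sketch below assumes that approach (the slides do not say how the values were computed) and uses hypothetical function names.

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()


def simpson(f, a, b, steps=1000):
    """Composite Simpson's rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def type1_error(ratio, nominal):
    """Overall type-1 error when both looks reject at the nominal two-sided level."""
    c = STD.inv_cdf(1 - nominal / 2)
    rho, s = sqrt(ratio), sqrt(1 - ratio)

    def accept_both(z1):
        accept_final = STD.cdf((c - rho * z1) / s) - STD.cdf((-c - rho * z1) / s)
        return STD.pdf(z1) * accept_final

    return 1 - simpson(accept_both, -c, c)


def alpha_star(ratio, alpha=0.05, tol=1e-6):
    """Bisection for the nominal level whose overall type-1 error equals alpha."""
    lo, hi = 1e-6, alpha  # overall error is below alpha at lo and above it at hi
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if type1_error(ratio, mid) > alpha:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


for ratio in (0.25, 0.50, 0.75):
    print(ratio, alpha_star(ratio))
```

Bisection works here because the overall error rate increases monotonically with the nominal level.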
Type-1 error summary
• Interim looks inflate the type-1 error rate
• The amount of inflation depends on the proportion of information available at the interim analysis
• We can correct the type-1 error rate by altering the stopping boundaries
Power
• Recall that for δ = 0.5σ and α = 0.05:
  • 0.70 power with n = 50
  • 0.94 power with n = 100
• How does altering the stopping boundaries impact power?
Power
• For $I_{n_1}/I_n = 0.50$, α∗ = 0.029 results in a type-1 error of 0.05
  • Power to detect δ = 0.5σ for n = 50 is 0.66
  • Power to detect δ = 0.5σ for n = 100 is 0.92
Power
• Power has decreased. Why?
  • More stringent criteria for rejecting the null hypothesis
• We have to increase the maximum sample size to assure adequate power
• Keeping $I_{n_1}/I_n = 0.50$
  • n = 56 results in power of 0.70
  • n = 110 results in power of 0.94
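The power of the two-stage design can be computed from the same bivariate normal distribution, now with non-zero means. The sketch below assumes σ = 1 and a nominal level of 0.0294 at both looks (the unrounded counterpart of α∗ = 0.029 for an information ratio of 0.5); the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()


def simpson(f, a, b, steps=2000):
    """Composite Simpson's rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def two_stage_power(n, ratio, delta, alpha_star, sigma=1.0):
    """P(reject at either look) with n subjects per group at full enrollment."""
    c = STD.inv_cdf(1 - alpha_star / 2)
    theta_1 = delta * sqrt(ratio * n / (2 * sigma**2))  # mean of Z_n1
    theta_n = delta * sqrt(n / (2 * sigma**2))          # mean of Z_n
    rho, s = sqrt(ratio), sqrt(1 - ratio)

    def accept_both(z1):
        # Given Z_n1 = z1, Z_n ~ N(theta_n + rho * (z1 - theta_1), 1 - rho^2)
        cond_mean = theta_n + rho * (z1 - theta_1)
        accept_final = STD.cdf((c - cond_mean) / s) - STD.cdf((-c - cond_mean) / s)
        return STD.pdf(z1 - theta_1) * accept_final

    return 1 - simpson(accept_both, -c, c)


for n in (50, 56, 100, 110):
    print(n, two_stage_power(n, ratio=0.5, delta=0.5, alpha_star=0.0294))
```

Setting `delta=0` in the same function returns the overall type-1 error, which is a convenient sanity check on the boundaries.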
Power as a function of $I_{n_1}/I_n$
• Power increases as $I_{n_1}/I_n$ increases
• For δ = .5 and n = 100
  • $I_{n_1}/I_n = 0.25$ results in a power of 0.91
  • $I_{n_1}/I_n = 0.75$ results in a power of 0.93
Maximum sample size as a function of $I_{n_1}/I_n$
• The sample size inflation factor is also related to $I_{n_1}/I_n$
• For δ = .5 and overall type-1 error of 0.05
  • $I_{n_1}/I_n = 0.25$ requires a maximum sample size of 113 to maintain power = 0.94
  • $I_{n_1}/I_n = 0.75$ requires a maximum sample size of 106 to maintain power = 0.94
Summary
• We have developed a two-stage design with the same type-I error and power as a fixed-sample design
  • Changed α to α∗ to maintain the type-I error rate
  • Increased the maximum sample size to maintain power
• What is the benefit?
Sample Size
• Sample size in a sequential study is a random variable
• The sample size is either $n_1$ or $n$ depending on $Z_{n_1}$
• We evaluate the benefit of a sequential design by considering the expected sample size
Expected Sample Size
The expected sample size is calculated as
\[ E(SS) = n_1 \left( 1 - P(C = 1) \right) + n\, P(C = 1) \]
where $C$ is an indicator that the trial reached full enrollment
\[ P(C = 1) = \int_{Z_{\alpha/2}}^{Z_{1-\alpha/2}} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}\left( z_{n_1} - \delta\sqrt{I_{n_1}} \right)^2} dz_{n_1} \]
Note that the expected sample size depends on δ and will be different for the null and alternative hypotheses
Expected Sample Size
• For our two-stage design, we have
  • Overall type-1 error rate equal to 0.05
  • Power of 0.94
  • Maximum sample size of 110
  • $I_{n_1}/I_n = 0.5$
• What is the expected sample size?
  • Under the null, E(SS) = 108
  • Under the alternative of δ = 0.5, E(SS) = 73
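The expected sample sizes above follow directly from the probability of continuing past the interim look. The sketch below assumes σ = 1, the corrected nominal level α∗ = 0.029 at the interim analysis, and sample sizes counted per group; the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()


def expected_sample_size(n, ratio, delta, alpha_star, sigma=1.0):
    """E(SS) = n1 * (1 - P(C = 1)) + n * P(C = 1), with n1 = ratio * n."""
    n1 = ratio * n
    c = STD.inv_cdf(1 - alpha_star / 2)
    theta_1 = delta * sqrt(n1 / (2 * sigma**2))  # mean of Z_n1
    # P(C = 1): the interim statistic falls inside the continuation region
    p_continue = STD.cdf(c - theta_1) - STD.cdf(-c - theta_1)
    return n1 * (1 - p_continue) + n * p_continue


print(expected_sample_size(110, 0.5, delta=0.0, alpha_star=0.029))
print(expected_sample_size(110, 0.5, delta=0.5, alpha_star=0.029))
```

Under the null the trial almost always continues, so the expected size stays near the maximum; under the alternative the interim look stops the trial often enough to pull the average well below it.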
Expected Sample Size as a function of $I_{n_1}/I_n$
• Expected sample size is also related to $I_{n_1}/I_n$
• Assuming overall α equal to 0.05 and power equal to 0.94
• For $I_{n_1}/I_n = 0.25$
  • Under the null, E(SS) = 110
  • Under the alternative of δ = 0.5, E(SS) = 82
• For $I_{n_1}/I_n = 0.75$
  • Under the null, E(SS) = 105
  • Under the alternative of δ = 0.5, E(SS) = 83
Comparison
• Fixed-sample design
  • α = 0.05
  • Power = 0.94
  • N = 100
• Two-stage group sequential design
  • α = 0.05
  • Power = 0.94
  • E(N) = 108 under null
  • E(N) = 73 under alternative
• Which is the better design?
  • Is the slight increase in sample size under the null (i.e. the worst case scenario) worth a substantial reduction under the alternative?
Summary
• Adding interim analyses increases the type-I error rate
  • This can be fixed by changing the stopping boundaries
• Correcting the stopping boundaries results in a decrease in power
  • Increase the maximum sample size to achieve the desired power
• Sample size is now stochastic
• Sequential designs result in dramatic reductions in the expected sample size in some cases