
CE Course on Adaptive Dose-Response Studies

2007 Joint Statistical Meetings, Salt Lake City, UT – July 29, 2007

PRESENTERS

Christopher S. Coffey, PhD, University of Alabama at Birmingham

Email: [email protected]

Brenda Gaydos, PhD, Eli Lilly and Company

Email: [email protected]

José Pinheiro, PhD, Novartis Pharmaceuticals

Email: [email protected]

LEARNING OBJECTIVES

At the end of the course, students should:

• Understand distinction between adaptive dose-response designs and other types of adaptive designs

• Understand use of adaptive designs in early and late phase drug development

• Understand advantages of adaptive dose-response designs over more traditional dose-response designs

• Understand how to implement adaptive dose-response trials

OUTLINE

I. What are Adaptive Designs? (~15 min.)

II. Summary of Development Stages (~15 min.)

III. Adaptive Dose-Response Methods for Early Exploratory Studies (~50 min.)

IV. Fixed Design Dose-Response Methods (~30 min.)

Break (~15 min.)

OUTLINE

V. Adaptive Dose-Response Methods for Late-Stage Exploratory Development (~50 min.)

VI. Simulations to Illustrate the Performance and Implementation of Adaptive Dose-Response Designs and Their Comparison to Traditional Methods (~50 min)

VII. Overall Conclusions and Recommendations (~15 min)

I. What are Adaptive Designs?

Outline:

1) Definition of an ‘adaptive design’.

2) Types of adaptive designs

3) Adaptive dose-response designs

WHAT ARE ADAPTIVE DESIGNS?

Recently, there has been considerable research on adaptive designs (also called flexible or innovative designs).

The rapid proliferation of interest in adaptive designs and inconsistent use of terminology has created confusion about similarities and differences among the various techniques.

For example, the definition of an “adaptive design” itself is a common source of confusion.


PhRMA Working Group on Adaptive Designs (2006):

“By adaptive design we refer to a clinical study design that uses accumulating data to modify aspects of the study as it continues, without undermining the validity and integrity of the trial.”

“…changes are made by design, and not on an ad hoc basis”

“…not a remedy for inadequate planning.”


Adaptive designs are NOT new.

The methodology has existed for decades.

However, because this is a rapidly expanding area of research, more practical experience is needed.

Although adaptive designs are scientifically and operationally more complex, the issues are resolvable.


Myth #1: A study with one or more protocol amendments is an adaptive design.

Protocol Amendment:

• Addresses the unanticipated

• Need may or may not be based on outcome data

• Can compromise validity of study conclusions

Adaptive by DESIGN:

• Planned flexibility to address areas of design uncertainty

• Impact of planned changes on conclusions understood

• MAY require a protocol amendment (but less likely)


Myth #2: Adaptive design protocols should be vague to allow for flexibility.

In order to enable the process to be simulated, the extent to which adaptation is planned should be described a priori in detail, if possible.

Hung et al. (2006): “At the very least, the regulatory agencies need to know every detail of how the trial proceeded during its conduct and adaptations.”


For this course, we focus on adaptive dose-response methods.

Such adaptive designs:

• Offer more efficient ways to learn about dose response

• Provide more information on dose-response profile earlier in development.

• Guide decision making on whether to continue program and, if so, which dose to select for further development

• Aim to increase probability of technical success by taking correct choice of dose forward for further study.


Infinite number of adaptive design possibilities:

• Adapting dose is only one possibility.

• Many other aspects of the study can be changed:

- sample size

- final test statistic

- primary endpoint

- inclusion/exclusion criteria

- number of treatment arms

- randomization procedure

- number of interim looks

- goal: superiority to non-inferiority

• Define objective of the adaptation and the design elements to adapt.


[Figure: taxonomy of design changes, split into planned and unplanned. Adaptive/flexible designs include adaptive dose-response, seamless Phase II/III designs, adaptive randomization, sample size re-estimation, and changes to other aspects (test statistic, primary endpoint, inclusion/exclusion criteria, dose, etc.). Sample size re-estimation is divided into internal pilots (estimated nuisance parameters), estimated treatment effect (known variance), and estimated “effect size”.]


Historically, a great deal of controversy surrounding adaptive designs has been focused around a particular type of sample size re-estimation design:

[Figure: the sample size re-estimation branch of the taxonomy, with sub-types internal pilots (estimated nuisance parameters), estimated treatment effect (known variance), and estimated “effect size”; the estimated treatment effect (known variance) box is emphasized.]


When rule for increasing sample size can be pre-specified, sample size re-estimation based on a revised estimate of treatment effect is nearly always less efficient than a group sequential approach.

- Tsiatis & Mehta (2003); Jennison & Turnbull (2003, 2006); Mehta & Patel (2006)

However, little controversy surrounds the use of internal pilot (IP) designs (re-estimating only nuisance parameters).

Since IP designs can be implemented in large clinical trials with little penalty, the use of internal pilot designs should be encouraged.
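The internal pilot idea can be sketched in a few lines. This is a minimal illustration, not the procedure of any specific trial: it assumes a two-arm comparison of means with the usual normal-approximation sample size formula, and it assumes the "restricted" rule in which the sample size is never reduced below the planned value. The function names `n_per_group` and `internal_pilot_n` are hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(delta, sigma, alpha=0.05, power=0.90):
    """Normal-approximation sample size per arm for a two-sided, two-sample test of means."""
    z = NormalDist().inv_cdf
    return math.ceil(2 * (z(1 - alpha / 2) + z(power)) ** 2 * sigma ** 2 / delta ** 2)


def internal_pilot_n(planned_n, delta, sigma_hat, alpha=0.05, power=0.90):
    """Re-estimate n from the interim SD estimate only (a nuisance parameter).

    Assumption: restricted rule, so the sample size can go up but never down.
    The treatment effect delta is kept at its planning value, not re-estimated.
    """
    return max(planned_n, n_per_group(delta, sigma_hat, alpha, power))
```

For example, planning with a difference of 5 and an SD of 10 gives 85 per arm; if the interim SD estimate is 12, the re-estimated size is 122 per arm.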


More recently, focus has shifted to the logistical barriers that need to be overcome before any adaptive design can be practically implemented.

These include:

Budget Administration

Increased communication with clinical sites

Information Technology

Protocol Issues

Shameless plug for upcoming 2007 JSM panel session:

Issues and Solutions to Planning and Implementing an Adaptive Design in Practice

Organizer: Brenda Gaydos, Eli Lilly and Company

Time: Monday, 10:30-12:20

Panelists: Michael Krams, Wyeth; Paul Gallo, Novartis Pharmaceuticals; Gernot Wassmer, The University of Cologne; Jerald S. Schindler, Cytel; Christopher S. Coffey, UAB


SUMMARY

Adaptive dose-response studies are one of many types of possible adaptive designs.

Adaptive designs are NOT always “better”.

Simulations under realistic scenarios are needed to assess how the design will perform.

Suggest routinely assessing the appropriateness of novel designs and analyses when developing clinical plans.

Adaptive by DESIGN – thorough upfront planning is required

II. Summary of Drug Development Stages

• Overview of Drug Development

• Traditional Phases of Clinical Development (I-III)

• Some Statistics

• Shift to Learn & Confirm Paradigm

OVERVIEW OF DRUG DEVELOPMENT

Drug Discovery

Preclinical Testing

File Investigational New Drug Application (IND)

Clinical Trial Development Phases

Phase I

Phase II

Phase III

File New Drug Application (NDA)

Review and Approval Process

Phase IV (Post-Marketing Studies)

PRIOR TO CLINICAL DEVELOPMENT

Drug Discovery (hypothesis generation)

Target-disease link identification & validation using biological tools

Assay development to support screening & evaluate screening hits

Molecule identification for preclinical testing

1 in 10,000 molecules synthesized will become a new medicine (approved)

Preclinical Testing

In vitro (laboratory) & in vivo (animal subjects) studies to determine preliminary efficacy and pharmacokinetic information

Determine dose range to explore in Phase I

File Investigational New Drug Application (IND)

Effective if FDA does not disapprove within 30 days

Institutional Review Boards (IRBs) where clinical studies will be conducted must review and approve the study prior to clinical study start

PHASE I

Typically in healthy volunteers (~ 20 to 100)

In cancer research, patients are typical (toxicity expected at efficacious levels)

Time in Phase I ~ 1.5 years

Objectives

Determine how the drug is absorbed, distributed, metabolized, and excreted

Identify the safe dose range for first efficacy dose (in patient studies)

– Maximum Tolerated Dose (MTD)

– No Observed Adverse Effect Level dose (NOAEL)

Assess preliminary efficacy

– If applicable animal model to relate concentration in animals to humans

– If applicable biomarker to relate biological activity to clinical efficacy

– If models available to relate concentration to marketed compound

PHASE I (cont.)

Typical Study Types

Single Dose Safety Study (SDSS)

– Also referred to as first human dose (FHD)

Multiple Dose Safety Study (MDSS)

Proof of Concept Study (PoC)

– Assess biological activity to support investment in Phase II

– MDSS study may be referred to as MDSS/PoC study if assessing preliminary efficacy & safety

– May be referred to as a Phase Ib study if in patients

Additional studies that may be run in Phase I

Biomarker studies

Methods studies

PHASE II

Clinical trials in patients (~ 100 to 500)

Time in Phase II ~ 2 years

Objectives

Determine if there is a therapeutic response

Determine the appropriate dose, dose regimen, and patient population (inclusion/exclusion criteria) for further study in Phase III

Clinical trial material is not yet commercial formulation

PHASE III

Clinical trials in patients (~ 1000 to 5000)

Time in Phase III ~ 3.5 years

Objective

Evaluate overall benefit-risk relationship of the drug

Provide adequate basis for physician labeling

Clinical trial material is the expected marketing formulation

OTHER CONCURRENT STUDIES

Toxicology studies in animals

Prior to Phase I

During Phase I and/or II to support longer term human exposure

Biopharmaceutical package (clinical pharmacology)

Typically run during Phase III

Characterize drug kinetics (e.g. drug-drug interaction studies, kinetics in special populations, bio-equivalence, QT-prolongation)

Material development

FOLLOWING CLINICAL DEVELOPMENT

NDA

Contains all scientific information gathered

Typically 100,000 pages or more

Once approved, company must continue to submit periodic reports including adverse reactions

Phase IV (Post-Marketing Studies)

Involves post-launch safety surveillance to monitor for any rare or long-term adverse effects and ongoing technical support of a drug

Studies may be mandated by regulatory authorities or undertaken by the company to gain more information

FROM WWW.PHRMA.ORG 7/07

Stage | Years | Test population | Purpose | Success rate

Discovery/Preclinical Testing | 6.5 | Laboratory and animal studies | Assess safety, biological activity, and formulations | 5,000 compounds evaluated

(File IND at FDA)

Clinical Trials, Phase I | 1.5 | 20 to 100 healthy volunteers | Determine safety and dosage | 5 enter trials

Clinical Trials, Phase II | 2 | 100 to 500 patient volunteers | Evaluate effectiveness, look for side effects |

Clinical Trials, Phase III | 3.5 | 1000 to 5000 patient volunteers | Confirm effectiveness, monitor adverse reactions from long-term use |

(File NDA at FDA)

FDA | 1.5 | | Review process / Approval | 1 approved

Phase IV | | | Additional post-marketing testing required by FDA |

Total | 15 | | |

SOME STATISTICS

Development time from lab to patient: 10-15 years *

Costs of bringing a new medicine to market: $800 million to $1.7 billion **

Success rate for novel candidate at Phase I to market: 8% **

Failure rate in Phase II: 40% *

Failure rate in Phase III: 45% ***

Development costs escalating: risen 55% in last 5 years **

* PhRMA; ** FDA Critical Path, 2004; *** Kola & Landis, 2004

R & D PRODUCTIVITY DECREASING

[Figure: industry R&D expense ($ billions) and annual NME approvals by year, 1980 to 2005; series: R&D Investment and NME Approvals.]

Source: PhRMA, FDA, Lehman Brothers

CRITICAL PATH INITIATIVE: A CALL TO ACTION

"Critical Path" Paper Calls for

Academic Researchers, Product Developers, and Patient Groups To Work With FDA To Help Identify Opportunities to Modernize Tools for Speeding Approvable, Innovative Products To Improve Public Health

www.fda.gov/oc/initiatives/criticalpath/whitepaper.html

Source: Lawrence J. Lesko, Clinical Pharmacology Subcommittee of ACPS, Nov 4, 2004

[Figure: the traditional phased approach (IND, Phase 1, Phase 2, Phase 3, NDA submission, Phase 4, with transition time between phases) contrasted with the Learn and Confirm approach (IND, Learn, transition zone, Confirm, NDA submission, Phase 4).]

SHIFT TO LEARN & CONFIRM

Source: Robert R. Ruffolo, Jr., Ph.D., Wyeth

WHY THE SHIFT?

Increased focus on the “Learn” phase

– More information on dose-response (safety/efficacy)

Reduction in transition time

– Seamless from Phase I-II

– Seamless from Phase II-III

Expected Outcome

– Reduce clinical development time & costs

– Increase information at time of NDA

OBJECTIVES OF ADAPTIVE DOSE-RESPONSE

Getting information as fast as ethically possible about key aspects of the dose-response

Increased information with improved efficiency

Explore more doses with same sample size as fixed design

More observations at doses that better inform the dose-response

More observations on the doses that are most promising

Feasible to combine PoC and Phase II dose finding with early stopping for futility

Shorten development timelines

More informative Go / No Go decisions

Improve probability of technical success, Pr(TS), in Phase III

Feasible to combine Phase II dose finding with Phase III

III. Adaptive Dose-Response Methods for Early Exploratory Studies

Outline:

• Summary of major philosophies regarding definition of maximum tolerated dose (MTD)

• Conventional 3+3 designs

• Model-based designs

• Case Studies

MTD - DEFINITION

Phase I clinical trials typically want to determine some maximum tolerated dose (MTD).

Accurate determination of the MTD is very important since the dose established as the MTD will be used for further testing in later phases.

Passing on too low of a dose may jeopardize a potentially useful drug

Passing on too high of a dose puts patients in later phase trials at risk

MTD - DEFINITION

Two major philosophies regarding MTD definition:

Dose that, if exceeded, would put patients at ‘unacceptable risk’ of toxicity.

• Treat the MTD as being observed from the data

• Vague from a statistician's point of view since ‘unacceptable risk’ may not be defined quantitatively

Specifying ‘unacceptable risk’ as a probability.

• Treat the MTD as an unknown parameter of a monotonic dose response curve.

• The MTD is estimated corresponding to a specified probability.

MTD - DEFINITION

These two definitions lead to two different approaches for designing phase I clinical trials:

1) Conventional up-and-down designs

• Such as 3+3 designs for cancer

2) Model-based designs where MTD is a quantile to be estimated

• Random walk rule

• Bayesian methods

CONVENTIONAL 3+3 DESIGNS

Conventional 3+3 methods employ an ad-hoc approach to screen dose levels and identify the MTD.

Toxicity is defined as a binary event and patients are treated in groups of three, starting with the initial dose.

Algorithm iterates moving dose up or down depending on the number of toxicities observed.

No estimation in a traditional sense is involved.

The MTD is a statistic identified from the data: the highest dose studied with fewer than, say, 1/3 toxicities (i.e., 0 or 1 dose-limiting toxicities out of six patients).

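Start at the lowest reasonable dose

Treat 3 patients at dose and count events:

• 0 events: increase dose to next level

• 1 event: treat 3 additional patients at dose and count events

– 0 events: increase dose to next level

– 1 or more events: decrease dose or stop and select lower dose

• 2 or more events: decrease dose or stop and select lower dose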

CONVENTIONAL 3+3 DESIGNS

[Figure: chance of “stepping up” (0% to 100%) versus true “p” (0.0 to 1.0).]

Even with a 30% chance of an “event” there is still a 50% chance of stepping up!
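The 50% figure can be checked directly from the 3+3 rule. The sketch below assumes that patients respond independently with a common event probability p at the current dose; the function name `prob_step_up` is illustrative only. The design escalates if 0 of the first 3 patients have an event, or if exactly 1 of 3 has an event and none of the 3 additional patients does.

```python
def prob_step_up(p):
    """Chance that a 3+3 cohort escalates when each patient has event probability p."""
    none_of_three = (1 - p) ** 3          # 0 events among 3 patients
    one_of_three = 3 * p * (1 - p) ** 2   # exactly 1 event among 3 patients
    # Escalate on 0/3, or on 1/3 followed by 0/3 in the expansion cohort.
    return none_of_three + one_of_three * none_of_three
```

With p = 0.3 this gives 0.343 + 0.441 × 0.343, or about 0.494, which is the roughly 50% chance quoted above.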


Strengths:

Simple to implement and understand

Requires no computer program

Familiar to many clinicians


Drawbacks:

Estimate of MTD has no clear relationship with any percentile of the dose toxic response distribution

Tend to treat many patients at low, ineffective doses

No satisfactory approach for obtaining CI for MTD.

Often provide poor estimates of MTD (i.e., large uncertainty) - probability of stopping at incorrect dose is generally higher than perceived.

Hence, unsafe or non-efficacious doses may be advanced to Phase III trials.

RWR DESIGNS

Random Walk Rule (RWR, biased coin) designs:

Non-parametric model-based approaches to MTD estimation.

• MTD is treated as a quantile of a dose-response distribution, but no underlying parametric distribution is assumed.

Sample dose levels in unimodal region around MTD.

Provides unified approach targeting any quantile of interest.

Generalization of conventional up-and-down methods.

RWR DESIGNS

As in conventional designs, patients are treated sequentially, and dose escalation occurs when no toxicities are observed.

However, instead of applying a deterministic rule, a “biased coin” is flipped after observing each response.

The algorithm escalates to the next dose with probability p, where p depends on the targeted level of the response.
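A minimal sketch of one such rule follows. It assumes the biased coin design usually attributed to Durham and Flournoy for a target toxicity rate of at most 0.5: step down after a toxicity; after a non-toxic response, step up with probability target / (1 - target) and otherwise stay. The slides do not give the exact coin bias, so that formula, the function name `bcd_next_level`, and the zero-based level indexing are assumptions made here for illustration.

```python
import random


def bcd_next_level(level, toxicity, target, n_levels, rng=random):
    """Next dose level (0-based) under a biased coin up-and-down rule.

    Assumes 0 < target <= 0.5. `rng` is any object with a random() method.
    """
    if not 0 < target <= 0.5:
        raise ValueError("this sketch only covers targets in (0, 0.5]")
    if toxicity:
        return max(level - 1, 0)              # step down, but not below the lowest dose
    coin_bias = target / (1 - target)         # probability of escalating after no toxicity
    if rng.random() < coin_bias:
        return min(level + 1, n_levels - 1)   # step up, but not above the highest dose
    return level                              # otherwise repeat the current dose
```

With a target of 0.2 the coin bias is 0.25, so after a non-toxic response the next patient moves up about one time in four.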

RWR DESIGNS

Strengths:

Non-parametric

Workable finite-sample distribution theory

Simple and intuitive to implement

Simple software in MATLAB has been developed which gives the finite-sample properties of the design (Durham, Flournoy, & Rosenberger, 1997)

CRM

Continual Reassessment Method (CRM):

Originated as a Bayesian method for phase I cancer trials of cytotoxic agents.

For a pre-defined set of doses and a binary response, estimates MTD as the dose level that yields a particular target proportion of responses (e.g., TD20).

Assumes a particular model (such as logistic function)

Assignment of doses converges to the MTD.

See Garrett-Mayer (2006) for an excellent tutorial.

CRM

The method assumes that the probabilities of both efficacy and toxicity increase with increasing dose.

The method also assumes that toxicity can be defined as a binary outcome.

The “acceptable” toxicity rate is explicitly defined and the MTD is the highest (most efficacious) dose with acceptable toxicity.

Similar designs can be used to explore dose-efficacy relationships (for agents that are non-cytotoxic).

CRM

The method begins with an assumed a priori dose-toxicity curve and a chosen target toxicity rate.

The first patients are assigned the dose most likely to be associated with the target toxicity level.

The estimated dose-toxicity curve is refit (i.e., the posterior distribution of the model is updated) after each patient’s outcome has been observed.

Hence, the updated curve is shifted slightly up or down depending on whether the patient experienced a dose-limiting toxicity.

CRM

The next patient is assigned the dose closest to the MTD based on the updated dose-toxicity curve (posterior distribution).

Patients continue to be treated until some pre-defined level of certainty is achieved or pre-defined stopping criteria are met.

Once the stopping criteria are met, the final dose is selected as the MTD.
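The update-and-assign cycle described above can be sketched as follows. This is an illustration under stated assumptions, not the course's implementation: it assumes a one-parameter logistic model with the intercept fixed at 3, a unit exponential prior on the slope, posterior means computed by simple grid integration, and "dose labels" back-solved from prior toxicity guesses. All function names and the prior guesses used in the usage note are hypothetical.

```python
import numpy as np

INTERCEPT = 3.0  # assumed fixed intercept of the one-parameter logistic model


def dose_labels_from_prior(prior_tox, a0=1.0):
    """Back-solve dose labels so the model reproduces the prior guesses at slope a0."""
    p = np.asarray(prior_tox, dtype=float)
    return (np.log(p / (1 - p)) - INTERCEPT) / a0


def tox_prob(d, a):
    """Model toxicity probability at dose label d for slope a."""
    return 1.0 / (1.0 + np.exp(-(INTERCEPT + a * d)))


def posterior_mean_tox(labels, given, tox, grid=np.linspace(0.001, 10, 4000)):
    """Posterior mean toxicity at each dose under an Exp(1) prior on the slope.

    `given` holds the dose index assigned to each patient, `tox` the 0/1 outcomes.
    """
    log_post = -grid  # log density of the unit exponential prior
    for idx, y in zip(given, tox):
        p = tox_prob(labels[idx], grid)
        log_post = log_post + (np.log(p) if y else np.log1p(-p))
    w = np.exp(log_post - log_post.max())
    w /= w.sum()  # normalized posterior weights on the grid
    return np.array([np.sum(w * tox_prob(d, grid)) for d in labels])


def next_dose(labels, given, tox, target):
    """Index of the dose whose estimated toxicity is closest to the target."""
    est = posterior_mean_tox(labels, given, tox)
    return int(np.argmin(np.abs(est - target)))
```

For instance, with prior guesses of 5%, 10%, 20%, and 30% and a 20% target, three toxicities in three patients at the lowest dose keep the recommendation at the lowest dose, while five non-toxic responses at the highest dose keep it at the highest dose.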

CRM

For example, consider the following curve:

[Figure: dose-toxicity curve; event rate (0 to 1) versus dose (0 to 20).]

If target level of toxicity is 10%, then dose level 5 would be the optimal starting dose.

CRM

An example of how the CRM might work:

[Figure: event rate (0 to 1) versus dose (0 to 20).]

CRM

An example of how the CRM might work:

[Figure: event rate (0 to 1) versus dose (0 to 20), with the final dose marked.]

CRM

The implementation of a CRM requires a substantial collaboration between the investigator and statistician.

This collaboration is important in order to determine:

The dose-toxicity model to use

The target rate of toxicity (or response)

Stopping rules

CRM

Several types of mathematical models for the dose-toxicity curves may be chosen:

“One parameter” logistic models fix a midpoint and use the data to estimate the slope of the curve.

“Two parameter” logistic models estimate parameters that determine both midpoint and slope of the curve.

Hyperbolic tangent models

The choice of model is an aspect of CRM design that requires a statistician's assistance.

However, the estimation of the MTD has been shown to be fairly robust to model misspecification.

CRM

Choosing the target rate of toxicity is a key component of a CRM.

This requires defining the dose for which the probability of a dose-limiting toxicity (DLT) is equal to some specified value $\theta$:

$\Pr\{\text{DLT} \mid \text{Dose} = \text{MTD}\} = \theta$

This determination should involve the opinions of several investigators and will depend on the nature of the DLT.
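As a worked illustration, suppose (as an assumption made here, not something the course specifies) a two-parameter logistic dose-toxicity curve with intercept $a$ and slope $b > 0$, and write $\theta$ for the target DLT probability. Setting the curve equal to $\theta$ and solving for the dose gives the MTD in closed form:

```latex
\Pr\{\mathrm{DLT} \mid \mathrm{Dose} = d\}
  = \frac{\exp(a + b d)}{1 + \exp(a + b d)} = \theta
\quad\Longrightarrow\quad
\mathrm{MTD} = \frac{\operatorname{logit}(\theta) - a}{b},
\qquad
\operatorname{logit}(\theta) = \log\frac{\theta}{1 - \theta}.
```

With hypothetical values $a = -3$, $b = 0.2$, and $\theta = 0.20$, $\operatorname{logit}(0.20) \approx -1.386$, so the MTD is about $(-1.386 + 3)/0.2 \approx 8.1$ dose units. A stricter target gives a smaller logit and therefore a lower MTD.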

CRM

Stopping rules for CRM designs:

Continue until a fixed number of patients is treated.

Continue until a fixed number of patients have been treated at a dose (for discrete doses).

Continue until the target dose changes by less than 10% (for continuous doses).

CRM

When publishing results from a CRM trial, it is typical to display:

The recommended dose (MTD) for a future trial, along with some estimate of the variability surrounding the MTD estimate.

A table that shows how the CRM progressed, including:

• Number of dose-limiting toxicities for each cohort

• Estimated dose at end of each cohort

CRM

Strengths:

“Learns” from information gained at early time points in the study – all patients studied contribute to the estimated dose.

Less likely to treat patients at toxic doses – tends to incur fewer dose-limiting toxicities.

More likely to treat patients at efficacious doses

Can more accurately estimate the MTD as compared to standard 3+3 designs

CRM

Drawbacks:

Mathematical and statistical complexities make it difficult for many clinical investigators to understand.

Properties must be assessed via simulation.

Early on, large dose escalations can occur based on little information which may cause more patients to be treated at unsafe doses.

Dosing first patients at level deemed appropriate by the a priori curve may be worrisome due to uncertainty surrounding this curve.

MODIFIED CRM’s

To address some of the concerns with the original CRM, several modified CRM approaches have been developed and implemented:

Always start at the lowest dose level under consideration

Enroll 2-3 patients in each cohort

Proceed as a standard 3+3 dose escalation design in the absence of dose-limiting toxicities.

Any given dose escalation cannot increase by more than one level.

MODIFIED CRM’s

Strengths:

Mathematical model is not solely responsible for determining dosage increases – restricted by design.

Starting dose can be chosen as with a traditional design – start dosing at the lowest level

MODIFIED CRM’s

Drawbacks:

Mathematical and statistical complexities make it difficult for many clinical investigators to understand.

Properties must be assessed via simulation.

OTHER BAYESIAN DESIGNS

1) Escalation with overdose control (EWOC):

Similar to CRM, but addresses ethical need to control probability of overdosing.

Designed to approach MTD as rapidly as possible subject to constraint that the predicted proportion of patients given an overdose is less than or equal to a specified feasibility bound $\alpha$.

The dose for each patient is chosen so that the predicted probability that it exceeds the MTD is $\alpha$.

Bayesian feasible – minimizes the predicted amount by which any given patient is overdosed.
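In practice the overdose-control rule amounts to dosing at a low quantile of the current posterior distribution of the MTD. The sketch below assumes that posterior draws of the MTD are already available from some fitted model (the fitting step is not shown), and writes `alpha` for the allowed overdose probability. The function name `ewoc_dose` and the optional list of allowed doses are hypothetical.

```python
import numpy as np


def ewoc_dose(mtd_samples, alpha, allowed_doses=None):
    """Dose whose posterior probability of exceeding the MTD is about alpha.

    mtd_samples   : posterior draws of the MTD (assumed to come from a fitted model)
    alpha         : allowed predicted probability of overdosing
    allowed_doses : optional discrete dose set; the highest dose not exceeding
                    the alpha-quantile is returned, or None if there is none
    """
    q = float(np.quantile(mtd_samples, alpha))  # alpha-quantile of the MTD posterior
    if allowed_doses is None:
        return q
    feasible = [d for d in sorted(allowed_doses) if d <= q]
    return feasible[-1] if feasible else None
```

Because the dose is the alpha-quantile of the MTD posterior, the predicted chance that the dose lies above the MTD is alpha; rounding down to an allowed dose can only lower that chance.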


2) Designs based on Bayesian decision theory:

Focused on efficient estimation and decision making by providing tools to achieve various goals

- Reducing sample size

- Reducing cost

- Maximizing information

- Increasing likelihood of making a ‘correct’ decision

Use gain functions (based on desired goal) that are sequentially updated after each response.

Next dose assignment is determined by maximizing the gain function.


2) Designs based on Bayesian decision theory (cont.):

Whitehead & Brunier (1995) introduced a design that incorporates elements of Bayesian decision theory:

• Set of assigned dose levels

• Priors and loss functions

At each stage, a dose is selected by minimizing the asymptotic posterior variance of the MTD estimator with respect to the possible doses to be assigned.

Specifying the loss function as minimizing distance that next assigned dose is from the target quantile will yield a procedure similar to the CRM.


3) Bayesian D-optimal designs:

Similar to decision theoretic approaches

Concerned with both efficiency of estimation and protecting patients from being assigned to highly toxic doses.

Introduces formal optimality criterion (D-optimality) minimizing the determinant of the variance-covariance matrix of the model parameter estimates.

Constraint incorporates optimal design points and ensures that probability an administered dose exceeds the maximum acceptable dose is low.
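The D-optimality criterion and the toxicity constraint can be sketched together. This is a simplified, non-Bayesian stand-in: it assumes a two-parameter logistic dose-toxicity model, plugs in fixed working values of the parameters instead of averaging over a posterior, and picks the next dose that maximizes the determinant of the Fisher information (equivalently, minimizes the determinant of the asymptotic variance-covariance matrix). All function names and the `max_tox` cap are hypothetical.

```python
import numpy as np


def tox_prob(d, a, b):
    """Two-parameter logistic dose-toxicity curve: logit p(d) = a + b * d."""
    return 1.0 / (1.0 + np.exp(-(a + b * d)))


def information_matrix(doses, a, b):
    """Fisher information for (a, b) from one binary response at each dose."""
    info = np.zeros((2, 2))
    for d in doses:
        p = tox_prob(d, a, b)
        x = np.array([1.0, d])
        info += p * (1.0 - p) * np.outer(x, x)
    return info


def next_dose_d_optimal(past_doses, candidates, a, b, max_tox=1.0):
    """Candidate dose maximizing det(information), among doses with
    working toxicity probability no greater than max_tox."""
    best_dose, best_det = None, -np.inf
    for d in candidates:
        if tox_prob(d, a, b) > max_tox:
            continue  # constraint: skip doses judged too toxic
        det = np.linalg.det(information_matrix(list(past_doses) + [d], a, b))
        if det > best_det:
            best_dose, best_det = float(d), det
    return best_dose
```

Without the cap, the rule reproduces the known two-point D-optimal design for this model, which places doses where a + b·d is about ±1.543; with a cap of 0.6 on toxicity it settles for the highest dose still under the cap.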


3) Bayesian D-optimal designs (cont.):

The optimal allocation changes with each update of the posterior distribution.

Target the overall dose-response curve rather than the MTD only; therefore, any level of response can be estimated.

Concerned mainly with collective ethics (doing what is best for future patients) as opposed to individual ethics (doing what is best for current patients)

However, computational demands are high

PENALIZED D-OPTIMAL DESIGNS

Penalized D-Optimal designs:

Non-Bayesian designs which allow simultaneous assessment of efficacy and toxicity.

Attempt to find design that maximizes the information (collective ethics) but under control of total penalty for treating patients in the trial (individual ethics).

Similar to Bayesian D-optimal designs, the D-optimality criterion is applied at each step of the sequential trial to maximize expected increment of information about efficacy and toxicity dose response.

PENALIZED D-OPTIMAL DESIGNS

Penalized D-Optimal Designs (cont.) :

Flexibility in constraint and underlying bivariate model make approach particularly useful in early phase trials.

Scope extends beyond MTD estimation and allows a number of questions involving efficacy and safety dose-response relationships to be addressed simultaneously.

Has potential to accelerate the drug development process by combining traditional Phase I and Phase II into a single trial.

CASE STUDIES

Objective: Establish the MTD of Nalmefene that spares analgesia in an acceptable number of patients receiving epidural Fentanyl and dilute Bupivacaine for postoperative pain control.

CASE STUDIES

Doses Studied: 0.25, 0.50, 0.75 or 1.00 µg/kg Nalmefene

Toxicity: Reversal of analgesia (ROA) was defined as an increase in pain score of 2 or more above baseline on a visual analog scale from 0-10 after nalmefene administration.

MTD: That dose (among the four studied) with a final mean probability of ROA closest to 20%.

The investigators utilized the modified CRM.

Patients were treated in cohorts of one, starting with the lowest dose.

CASE STUDIES

The investigators assumed a one-parameter logistic function described the risk of ROA at the ith nalmefene dose:

$$\Pr(\text{ROA} \mid d_i) = \frac{\exp(3 + \alpha d_i)}{1 + \exp(3 + \alpha d_i)}$$

In order to estimate an initial probability of ROA for each dose, a prior unit exponential distribution was chosen for $\alpha$.

The estimated curve was modified after the response for each subject was observed.

The MTD was determined after 25 patients were treated and evaluated for ROA.

CASE STUDIES

Sequence of nalmefene doses over the course of the trial is shown to the right.

The modified CRM treated the last 7 patients, and 15 of the last 17 patients, at the estimated MTD.

CASE STUDIES

After 25 patients were treated, the final estimated median posterior probabilities of ROA were:

- 11% for 0.25 dose group




CASE STUDIES

Objective: Determine the minimum effective dose regimen (MEDR) of intravenous ibuprofen required to close ductus arteriosus in infants with a postmenstrual age of 27-29 weeks at birth.

CASE STUDIES

Doses Studied: Loading doses of 5, 10, 15, or 20 mg/kg, followed by two doses (half loading dose) at 24 hr. intervals

Efficacy: Target closure

MEDR: That dose (among the four studied) with a final mean probability of target closure closest to 80%.

The investigators utilized the CRM.

Cohorts of three consecutive patients received the same dose regimen.

CASE STUDIES

Each of the four dose levels was arbitrarily associated by the investigator with the following prior guesses of success probability: 60%, 80%, 90%, and 95%.

The one-parameter logistic model (with scale parameter fixed at 3) was chosen in order to fit the dose-response curve.

A prior exponential distribution with = 0.5 was initially chosen for the model parameter.

The dose allocated to each new cohort of patients was the dose level with updated response probability closest to the target rate of 80%, unless adverse events were observed.

CASE STUDIES

The CRM continued until one of the following criteria was met:

A total of 20 subjects were studied

Estimated efficacy was too low for all levels

Suitable estimation of the MEDR was obtained – based on predictive gains of further patient inclusions on the response probability and width of the credibility interval

CASE STUDIES

Sequential posterior estimated probabilities of success of the four tested doses, updated after each new cohort, are shown to the right.

Failures were recorded in 4 patients.

CASE STUDIES

After 20 patients were treated, the final estimated mean posterior probabilities of success were:

- 56% for 5 mg/kg group




SUMMARY

Standard 3+3 designs were not designed with the intention of producing accurate estimates of a target quantile.

Rather, they are designed to screen drugs quickly and identify a dose level that does not exhibit too much toxicity.

Bayesian model-based methods (CRM, EWOC, decision theoretic approaches, etc.) provide better estimates of the MTD and dose-response curve.

SUMMARY

However, such methods are complicated to explain to non-statisticians and computationally challenging to implement.

The key to their usefulness lies in the packaging of these methods in user-friendly software that runs quickly and is well-documented.

IV. Fixed Design Dose-Response Methods

Outline:

• Traditional designs

• Multiple comparison procedures approach

• Modeling approach

• Combination methods

DIFFERENT GOALS

Establish proof-of-concept (PoC): response (typically a biomarker) changes with dose

Obtain maximum tolerated dose (MTD), or maximum safe dose (MSD) – safety driven

Estimate minimum effective dose (MED), maximum useful dose (MUD) – efficacy driven

Model dose-response relationship for efficacy, safety, or both

Fixed designs: allocation ratios are determined prior to start of trial and remain unchanged during it

TRADITIONAL DESIGNS

Choice of design will depend on study goals

Parallel groups

Patients independently randomized to dose groups, each patient receives just one dose

Inter-patient variation influences precision → larger N

Most commonly used design in dose finding (DF) studies

Cross-over designs

Each patient receives all available doses, in randomized sequence (typically chosen to minimize confounding with period, previous dose, etc; e.g., Williams design, Latin squares)

Within-patient variance determines precision → smaller N

Typically only used when endpoint is persistent (e.g., Asthma)

TRADITIONAL DESIGNS (CONT.)

Dose escalation

Cohorts of patients allocated sequentially to increasing doses

Safety is evaluated for current cohort, before new one started

Main goal is to estimate maximum tolerated dose (MTD)

Placebo and/or active control patients included for blinding

Titration designs

Patients are titrated to desired dose level – can be optional (e.g., based on efficacy) or forced

Optional titration designs can be challenging for dose response estimation (e.g., non-responder receiving higher doses)

Factorial designs (drug combinations)

Randomized concentration designs

MINIMUM EFFECTIVE DOSE – MED

MED is one of the key concepts in dose finding, often assumed the target dose

ICH-E4 (1996): Dose-response information to support drug registration: “… smallest dose with a discernible useful effect …”

Ruberg, 1995: “… smallest dose producing a clinically important response that can be declared statistically significantly different from placebo …”

General perception: too high doses are brought into Ph. III

FDA: 20% of drugs approved between 1980 and 1999 had dose changed by more than 33% after approval (80% reductions)

ANALYSIS APPROACHES IN DF

Main strategies: (i) multiple comparison procedures (MCP) based on contrast tests of doses and (ii) modeling of dose response relationship

MULTIPLE COMPARISONS PROCEDURES

Two main goals: identification of dose response signal (PoC) and selection of target dose – both implemented via hypothesis testing

Two levels of multiplicity involved:

PoC: multiple samples – adequate global test (e.g., trend test)

Dose selection: multiple testing, multiplicity adjustment (e.g., Dunnett, Hochberg)

MCP is the most common approach used in DF studies –sample size calculations are typically based on the power to establish PoC for an assumed treatment effect

Dose is treated as a categorical variable

MCP - ADVANTAGES

Easy to implement and interpret: series of individual hypothesis tests based on contrasts between doses

Does not require much prior knowledge of dose response relationship – less sensitive to assumptions

Useful with small number of doses (e.g., 2 or 3), when modeling is not feasible

Reliable, validated software available for analysis (e.g., PROC MULTTEST in SAS)

MCP - DISADVANTAGES

Not designed for estimation of target dose, such as MED: can only select one out of doses used in trial

Does not provide information about precision of selected dose – confidence intervals not available

Including clinical relevance criterion typically difficult (emphasis is on hypothesis testing)

Does not provide information on dose response (DR) profile

MODELING

Parametric model is used to represent DR profile

Requires sufficient number of doses (typically > 3) and previous knowledge of DR shape

Dose is treated as a continuous variable

Dose response models are typically non-linear and monotone; typical examples:

Linear, non-monotonic, and non-parametric (e.g., splines) can also be used in practice, typically when less is known

MODELING (CONT.)

Target dose estimation is done via inverse regression

PoC can be tested based on fitted model (e.g., likelihood ratio test vs. flat DR model)
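A minimal sketch of target dose estimation by inverse regression, assuming an Emax working model with already-fitted parameters. The function names and the numeric values are hypothetical and chosen only for illustration; the sketch also assumes a positive effect direction (Emax and the clinically relevant effect both positive).

```python
from typing import Optional


def emax_response(dose: float, e0: float, emax: float, ed50: float) -> float:
    """Mean response under an Emax model: e0 + emax * dose / (ed50 + dose)."""
    return e0 + emax * dose / (ed50 + dose)


def med_from_emax(emax: float, ed50: float, delta: float) -> Optional[float]:
    """Invert the fitted curve: dose whose effect over placebo equals delta.

    Solves emax * d / (ed50 + d) = delta for d. Returns None when the
    plateau never reaches delta, i.e. the target dose does not exist.
    """
    if emax <= delta:
        return None
    return ed50 * delta / (emax - delta)


# Hypothetical fitted values (not taken from any study).
med = med_from_emax(emax=0.7, ed50=0.2, delta=0.4)
```

Because the curve is inverted rather than tested dose by dose, the result need not coincide with a dose that was actually studied.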

MODELING – ADVANTAGES

Straightforward to estimate target doses, such as MED and MUD, which do not need to be included in study

Precision of estimated target doses can be assessed, e.g., using confidence intervals – can also be used for evaluating sample size calculations

Easy to include requirements on clinical relevance

Allows better understanding of DR, providing useful information for planning future studies (e.g., simulations)

Does not involve multiple comparisons, so multiplicity adjustment is not needed

MODELING – DISADVANTAGES

Requires prior knowledge of DR shape, if parametric model is used – more sensitive to assumptions

Difficult to use with small number of doses

Estimation and analysis are less straightforward than with MCP, especially when nonlinear models are used

Sample size calculations are more complex, generally requiring simulations

TYPICAL DOSE RESPONSE MODELS

MODEL SELECTION PROBLEM

True dose response shape is typically unknown at the time study is being planned

Choice of working model may have substantial impact on dose estimation

Current model selection approaches do not take into account statistical uncertainty associated with choice of DR model

How to combine MCP and Modeling, benefiting from the advantages of each approach?

MCP-MOD – A UNIFIED DF APPROACH

Set of candidate models

Optimal contrast coefficients

Selection of significant models while controlling FWER

Selection of a single model using max t, AIC, possibly combined with external data

Dose estimation and selection (MED, MSD,…)

MCP-MOD – OVERVIEW

DR model does not need to be specified before hand; just set of possible candidate models

Candidate models are expressed in terms of optimal contrasts (maximize power of test when model is correct)

MCP approach used to control FWER of multiple model contrasts test used to test PoC

When PoC established, select best DR model (e.g., AIC)

Selected model used to estimate target doses, taking into account clinical relevance – estimate may not exist

Precision of target dose estimates can be assessed and used for sample size calculations
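A small sketch of how a candidate shape translates into contrast coefficients, assuming equal arm sizes and a common variance, in which case the optimal contrast is proportional to the centred model means. The ED50 guess of 0.2 for the Emax shape and the helper names are assumptions made for illustration; multiplicity-adjusted critical values are not computed here.

```python
import math
from typing import List, Sequence


def optimal_contrast(means: Sequence[float]) -> List[float]:
    """Balanced-design optimal contrast: centred means scaled to unit length."""
    centre = sum(means) / len(means)
    centred = [m - centre for m in means]
    norm = math.sqrt(sum(c * c for c in centred))
    return [c / norm for c in centred]


def contrast_t(contrast: Sequence[float], arm_means: Sequence[float],
               sd: float, n_per_arm: int) -> float:
    """Contrast test statistic given observed arm means and a pooled SD."""
    estimate = sum(c * m for c, m in zip(contrast, arm_means))
    se = sd * math.sqrt(sum(c * c for c in contrast) / n_per_arm)
    return estimate / se


doses = [0.0, 0.05, 0.2, 0.6, 1.0]           # placebo plus the four active doses
linear_shape = doses                          # linear candidate, standardized
emax_shape = [d / (0.2 + d) for d in doses]   # Emax candidate, ED50 guess 0.2 (assumption)

c_lin = optimal_contrast(linear_shape)
c_emax = optimal_contrast(emax_shape)
```

The contrast depends only on the shape of the candidate model, not on its location or scale, which is why a standardized guess of the curve is enough at the design stage.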

MCP-MOD – EXAMPLE

Randomized, double-blind, parallel group DF study

Placebo and four active doses: 0.05, 0.2, 0.6, and 1

100 patients per arm

Normally distributed endpoint, constant variance

All doses well tolerated – MSD > 1

Planned PoC test: step-down hierarchical procedure; preserve 5% one-sided FWER

MCP-MOD – EXAMPLE (CONT.)

What should be the MED?

EXAMPLE – CANDIDATE MODELS

Five candidate models identified: linear, linear in log-dose, Emax, quadratic, and exponential

High correlation between some contrasts (e.g., linear and linear in log-dose) → less impact on multiplicity adjustment

EXAMPLE – MODEL CONTRASTS

EXAMPLE – RESULTS

All contrast tests highly significant – critical value, adjusting for multiplicity = 1.93, 5% one-sided FWER

Emax model selected as best, based on AIC

Clinically relevant effect: increase of Δ = 0.4 over placebo

Different MED estimates

pd is predicted DR at dose d, Ld and Ud are CI limits

EXAMPLE – MED ESTIMATES

CONCLUSIONS

Fixed dose allocation designs are still prevalent in clinical development

Multiple comparison procedures are most commonly used approach for establishing PoC and estimating dose

Model-based methods are generally advantageous (compared to MCP), but require more assumptions

Combination methods taking advantage of the better features of MCP and modeling are available – need more experience using it, including software availability

Adaptive dose allocation methods give greater flexibility and can lead to substantial gains in efficiency

1. Multiple Comparison Methods

Combination Tests

2. Model Based Methods

Bayesian Approach

Normal Dynamic Linear Model

D-Optimal Criterion

Clinical Utility Function

V. Adaptive Dose-Response Methods for Late-Stage Exploratory Development

Focus for late stage exploratory designs:

Population average dose response studies in patients

Other important considerations:

Exposure-response models

– Increase understanding of population dose response

– Adjust individual dosing in practice

– Useful to develop adaptive dose-response designs

Final conclusions on dose-response from entire database

– Not restricted to studies designed to inform about dose response

LEARNING ABOUT THE DOSE-RESPONSE

CLASSIFICATION OF ADAPTIVE METHODS

Multiple Comparison Approaches

Frequentist based

Some approaches imbed Bayesian methods within study stages

Model Based Approaches

Frequentist & Bayesian

Normal Dynamic Linear Model (NDLM, non-parametric)

D-optimal Criterion (parametric)

Advantages

Very few (or no) assumptions about dose response shape

E.g. monotonic u1 < u2 < u3 < u4

Strong control Type I error

Disadvantages

Doesn’t leverage information across doses

NO information about what is happening between doses

Some inefficiencies for fixed designs:

Typically requires high sample sizes per dose group

Feasibility limits number of doses explored

May identify if dose response exists BUT

Provides limited information on dose-response

MULTIPLE COMPARISON (MC) PROCEDURES

Objectives

Establish a dose-response relationship (trend test)

Identify dose(s) effective relative to a control (pairwise comparisons)

Approaches

Extending the classical group sequential framework

Combination function approaches (foundation in meta-analysis)

Types of adaptations:

Early termination of inferior dose(s)

Add dose(s)

Sample size reassessment for future stages

Early stopping for futility or efficacy

Seamless shift across development phases

MC FRAMEWORK FOR ADAPTIVE METHODS

Stallard & Todd (2003)

Extended classical group sequential designs to multiple treatment arms

Identify best treatment based on a maximum standardized test statistic

GROUP SEQUENTIAL DESIGNS

Approach

Trial analyzed in a series of independent stages

Very flexible

– Do NOT have to define what you will adapt in advance

– Bayesian decision theoretical approach can be used (posterior probabilities of events)

– Do have to define a-priori how you will combine the test statistics from the stages to make inference

Controls family-wise type I error rate

Adjustments are needed for inference (Posch et al. 2005)

Multiplicity adjusted p-values for dose-control comparisons

Point estimates and CI adjusted for

– Early stopping

– Treatment selection

ADAPTIVE TREATMENT SELECTION BASED ON COMBINATION TESTS

Stage wise tests

Independent observations between stages

Let p1 , p2 be p-values from stages 1 and 2 respectively

A-priori define at minimum

Combination function

Stage wise and overall alpha levels

>18 different combination functions (Becker, 1994)

Commonly used functions

Fisher’s: C(p1, p2) = p1*p2

Inverse Normal: C(p1, p2) = -w1 N^{-1}(1-p1) - w2 N^{-1}(1-p2), where N^{-1} is the inverse standard normal CDF

THEORETICAL BACKGROUND

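To test the global null hypothesis $H_0 = H_{01} \cap H_{02}$:

Under the null, $P_i \sim U(0,1)$ iid

Let $X_i = -2 \ln P_i$; then $X_i \sim \chi^2_{2}$ iid and $\sum_{i=1}^{n} X_i \sim \chi^2_{2n}$

Note: under $H_{01} \cap H_{02}$ the test is based on $\Pr(P_1 P_2 \le c)$, i.e., it rejects for a small product $p_1 p_2$

Hence compare $-2(\ln p_1 + \ln p_2)$ to the critical value $\chi^2_{4}(1-\alpha)$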

COMBINING P-VALUES: FISHER’S METHOD
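Fisher's rule for two stages can be sketched with the standard library alone, because the upper tail of a chi-square distribution with 4 degrees of freedom has the closed form exp(-x/2)(1 + x/2). The function names are illustrative, and the stage p-values are assumed independent and uniform under the null.

```python
import math


def chi2_df4_sf(x: float) -> float:
    """Upper-tail probability of a chi-square variable with 4 df."""
    return math.exp(-x / 2.0) * (1.0 + x / 2.0)


def fisher_combined_p(p1: float, p2: float) -> float:
    """Overall p-value from -2(ln p1 + ln p2) referred to chi-square(4)."""
    statistic = -2.0 * (math.log(p1) + math.log(p2))
    return chi2_df4_sf(statistic)


# Example with hypothetical stage-wise p-values.
overall_p = fisher_combined_p(0.08, 0.01)
```

Written in terms of the product, the overall p-value is p1 p2 (1 - ln(p1 p2)), which makes explicit that the test rejects for small values of p1*p2.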

Note P may only be approximately uniform [0,1] under the Null:

IF individual hypotheses are composite, or if responses are discrete

Jennison & Turnbull (2005); Robins, et al. (2000)

Two-stage procedure Bauer & Kieser (1999):

Weaker condition:

Distribution of P1 and conditional distribution of P2|P1

stochastically larger than or equal to the uniform distribution on [0,1]

ON DISTRIBUTION OF P-VALUE

Jennison & Turnbull (2005)

Inverse Normal: C(p1, p2) = -w1 N^{-1}(1-p1) - w2 N^{-1}(1-p2), where N^{-1} is the inverse standard normal CDF

Historically

Mosteller & Bush (1954): generalization based on fixed weights

– Let $(Z_1, \dots, Z_K)$, where $Z_k = N^{-1}(1 - p_k) \sim N(0,1)$

– If $\sum_{k=1}^{K} w_k^2 = 1$

– Then $\sum_{k=1}^{K} w_k Z_k \sim N(0,1)$

Interpretation concern

– Weighting patient information unequally based on stage

INVERSE NORMAL FUNCTION

Let $w_k = \sqrt{n_k / N}$, with $n_k$ the sample size of stage k and $N$ the total sample size

Then the z-statistic for pooled data equals: $Z = \sum_{k=1}^{K} w_k Z_k$

Combination test statistic equals the z-statistic for pooled data

– Invariant to partitioning of the data

– A function of the sufficient statistic (efficient)

– Sample size of stages must be fixed

Note: In general, the number of stages & weights can be adapted for K > 2

Fisher (1998) “variance spending”

– Spend $w_k^2$ of variance of Z statistic: study ends when sum is 1

INVERSE NORMAL FUNCTION (cont.)
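The claim that the weighted combination reproduces the pooled z-statistic can be checked numerically. The sketch below assumes a one-sample z-test with known standard deviation and weights equal to the square root of each stage's share of the total sample size; the stage summaries are hypothetical.

```python
import math
from typing import Sequence


def stage_z(mean: float, n: int, sigma: float) -> float:
    """z-statistic of one stage (null mean 0, known sigma)."""
    return math.sqrt(n) * mean / sigma


def inverse_normal(z_values: Sequence[float], n_values: Sequence[int]) -> float:
    """Combine stage z-statistics with weights sqrt(n_k / N)."""
    total = sum(n_values)
    return sum(math.sqrt(n / total) * z for z, n in zip(z_values, n_values))


# Hypothetical stage summaries.
n1, n2, sigma = 50, 100, 1.0
mean1, mean2 = 0.10, 0.30

combined = inverse_normal(
    [stage_z(mean1, n1, sigma), stage_z(mean2, n2, sigma)], [n1, n2]
)
pooled_mean = (n1 * mean1 + n2 * mean2) / (n1 + n2)
pooled = stage_z(pooled_mean, n1 + n2, sigma)   # matches `combined` up to rounding
```

The equality holds only when the weights match the stage sizes actually used, which is why the stage sample sizes have to be fixed in advance for this interpretation.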

Bauer & Kohne (1994)

The pre-specified combination test needs to be followed

Properties only hold if followed

e.g. Cannot decide to treat Stage 1 as internal pilot (even if no adaptation is made for Stage 2)

“Protocol has to describe which types of adaptation are

intended.”

Conclusions depend on types of adaptations

Ad-hoc adaptations (even if family-wise Type I error preserved) can make interpretation difficult

Estimates may be biased or intractable

POTENTIAL ABUSES

Combines

Closed testing procedure

A multiplicity adjustment procedure

– e.g. Bonferroni-Holm, min P, Simes

A combination test procedure

– Fisher’s, Inverse Normal

Same approach can be used for seamless designs

Stages can span Phase II/III

APPLICATION TO DOSE FINDING

Following example taken from PhRMA Adaptive Design Working Group training presentation

Titled: Adaptive Seamless Designs for Phase IIb/III Clinical Trials

Author: Jeff Maca, Ph.D., Novartis

Full set of this and other training slides can be found at the following open access WEB site:

http://biopharmnet.com/doc/doc12004.html

AN EXAMPLE

Closed test procedure

• n null hypotheses H1, …, Hn

• Closed test procedure considers all intersection hypotheses.

• Hi is rejected at global level α if all hypotheses HI formed by intersection with Hi are rejected at local level α

H1 can only be rejected at α = .05 if H12 is also rejected at α = .05

Source: Jeff Maca

CLOSED TESTING

• A typical study with 3 doses → 3 pairwise hypotheses.

• Multiplicity can be handled by adjusting p-values from each stage using Simes procedure

$q_S = \min_{i} \frac{S}{i}\, p_{(i)}$, where S is the number of elements in the hypothesis and $p_{(i)}$ are the ordered p-values

Source: Jeff Maca

CLOSED TESTING (cont.)

Stage sample sizes: n1 = 75, n2 =75

Unadjusted pairwise p-values from the first stage:

p1,1= 0.23, p1,2 = 0.18, p1,3 = 0.08

Dose 3 selected at interim

Unadjusted p-value from second stage: p2,3 = .01

Source: Jeff Maca

SCENARIO: DOSE FINDING 3 DOSES & CONTROL

q1,123 = min(3*.08, 1.5*.18, 1*.23) = .23

q2,123 = p2,3 = .01

C(q1,123, q2,123) = 2.17 P value = .015

Source: Jeff Maca

THREE-WAY TEST

q1,13 = min(2*.08, 1*.23) = .16

q1,23 = min(2*.08,1*.18) = .16

q2,13 = q2,23 = p2,3 = .01

C(q1,13, q2,13) = C(q1,23, q2,23) = 2.35 P.value = .0094

Source: Jeff Maca

TWO-WAY TEST

q1,3 = p1,3 = .08

q2,3 = p2,3 = .01

C(q1,3, q2,3) = 2.64 P.value = .0042

Conclusion: Dose 3 is effective

Source: Jeff Maca

FINAL TEST
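The three-way, two-way and final tests can be reproduced in a few lines. This sketch assumes the inverse normal combination in its reject-for-large form with equal stage weights, C = sqrt(0.5) N^{-1}(1 - q1) + sqrt(0.5) N^{-1}(1 - q2), which matches n1 = n2 = 75 and returns the C and p-values quoted on the slides. The variable names and the local level of 0.05 are choices made for this illustration.

```python
import math
from itertools import combinations
from statistics import NormalDist

STD_NORMAL = NormalDist()


def simes(p_values):
    """Simes-adjusted p-value: min over i of (S / i) * p_(i)."""
    ordered = sorted(p_values)
    size = len(ordered)
    return min(size / rank * p for rank, p in enumerate(ordered, start=1))


def inverse_normal(q1, q2, w1=0.5):
    """Return (C, p-value) for the two-stage inverse normal combination."""
    c = (math.sqrt(w1) * STD_NORMAL.inv_cdf(1 - q1)
         + math.sqrt(1 - w1) * STD_NORMAL.inv_cdf(1 - q2))
    return c, 1 - STD_NORMAL.cdf(c)


stage1 = {1: 0.23, 2: 0.18, 3: 0.08}   # unadjusted stage 1 p-values
p2_selected = 0.01                      # stage 2 p-value of the selected dose
selected = 3

# Every intersection hypothesis that contains the selected dose.
others = [d for d in stage1 if d != selected]
results = {}
for k in range(len(others) + 1):
    for extra in combinations(others, k):
        members = tuple(sorted(extra + (selected,)))
        q1 = simes([stage1[d] for d in members])
        # Only dose 3 continues, so its stage 2 p-value is used throughout.
        results[members] = inverse_normal(q1, p2_selected)

# Closed testing: reject H3 only if all intersections are rejected.
reject_dose3 = all(p <= 0.05 for _, p in results.values())
```

Because only the selected dose is carried into stage 2, the multiplicity penalty enters solely through the Simes-adjusted stage 1 p-values.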

Assess design options/power via simulations

Power is a function of unknown dose response

In two stage approach with fixed sample sizes, inverse normal combination function is efficient

Model based approaches may be more efficient (but also more complex)

Resulting estimates can be biased

Recommend assessing via simulation

Last resort, use the last stage for estimation purposes

DO NOT ABUSE

Follow required pre-specified rules

Describe possible adaptations in protocol

RECOMMENDATIONS

ASSUMES a functional relationship between the dose and response

Parametric & Non-parametric model-based approaches

Estimates, such as ED95, inferred from the model

Potential inefficiencies with fixed dose design

Provides limited information on dose-response

– Same number of patients assigned to each dose

Often high likelihood doses selected a-priori are not optimal

Unlikely to identify targets, e.g., MED, ED95, at predetermined levels of precision

MODEL BASED APPROACHES

Objectives

Estimate dose-response

Identify optimal (target) dose(s)

Modeling components

Parametric or non-parametric

Prior distributions on model parameters

Decision making components

Dose allocation

Stopping rules

Highly flexible

BAYESIAN APPROACH

Objectives:

Identify target dose (ED95)

Estimate dose response

Modeling Component:

NDLM

Decision Making Components:

Dose Allocation Rule

Model-based optimization criteria

Stopping Rules

Decision analytic

EXAMPLE: ASTIN (Krams et al. 2003)

Objective: Allocate patients to maximize information about ED95

Maximize Utility Function

Minus the variance of the predicted mean response at the ED95

Includes uncertainty in ED 95 dose & in the dose response

Function of future patient data

Determining next patient assignment

Calculate the expected utility for each possible dose assignment

– Expectation over the posterior predictive distribution for the data yet to be observed

– Ongoing patient data predicted from earlier data using a longitudinal model (that gets updated during the study)

– Assume next patient is last patient

Assign dose that is expected to result in the smallest variance

– Randomly across doses within 5% of optimal dose

DOSE ALLOCATION RULE

Function of the posterior mean and variance of ED95

Stop for efficacy:

Lower bound of 80% credibility interval that the change relative to placebo >2 points for the ED95

Minimum of 250 evaluable patients

Stop for futility:

Upper bound of 80% credibility interval that the change relative to placebo < 1 point for the ED95

Minimum of 500 evaluable patients

Maximum sample size = 1300

STOPPING RULES

West and Harrison (1997): Bayesian Forecasting and Dynamic Models

A piece-wise linear model

Smoothed transitions in the dose-response slope across the doses

Does not restrict the shape of the dose response curve

Developed for analysis and forecasting of time series data

Other non-parametric models

Splines, Kernel Methods

NORMAL DYNAMIC LINEAR MODEL

Assumptions

Response at each dose normally distributed about a mean

Change in mean between adjacent doses can be predicted by a simple linear model

Variability decomposed into two components

Observational variability for the patient response about the mean for the given dose

System variability around the linear model that relates the adjacent means

NDLM (cont.)

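Let $R_{ik}$ be the ith patient response at dose k, and $D_k$ represent the kth dose with mean $\mu_k$

Observation Equation:

$R_{ik} \mid D_k = \mu_k + \varepsilon_{ik}$, where $\varepsilon_{ik} \sim N(0, \sigma^2)$

System Equations:

$\mu_k = \mu_{k-1} + \delta_{k-1} + \omega_k$, where $\omega_k \sim N(0, H\sigma^2)$

$\delta_k = \delta_{k-1} + \nu_k$, where $\nu_k \sim N(0, H\sigma^2)$

Priors placed on:

$\mu_i$ (mean at dose i), $H$ (smoothing parameter), $\delta$ (slope parameters), $\sigma^2$ (observational variance)

NDLM (cont.)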
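To give a feel for the curves the NDLM allows, the sketch below draws one set of dose means from the system equations. It assumes a second-order form (each mean is the previous mean plus the previous slope plus noise, and slopes follow a random walk) with system variance H times the observational variance; that variance form, the function name and all inputs are assumptions made for illustration.

```python
import random
from typing import List


def simulate_ndlm_means(n_doses: int, mu0: float, delta0: float,
                        sigma: float, h: float, seed: int = 0) -> List[float]:
    """Draw dose means mu_0..mu_{K-1} from the NDLM system equations."""
    rng = random.Random(seed)
    system_sd = (h ** 0.5) * sigma      # assumed system variance: H * sigma^2
    mu, delta = mu0, delta0
    means = [mu]
    for _ in range(1, n_doses):
        mu = mu + delta + rng.gauss(0.0, system_sd)   # uses the previous slope
        delta = delta + rng.gauss(0.0, system_sd)     # slope random walk
        means.append(mu)
    return means


curve = simulate_ndlm_means(n_doses=9, mu0=0.0, delta0=0.2, sigma=1.0, h=0.05)
```

With H equal to zero the means fall on a straight line; larger values let the slope change more freely between adjacent doses, so non-monotone shapes become possible.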

Neuropathic Pain

Minimum Clinical Significance:

Average Daily Pain Score (ADPS)

Ranges (0 no pain, 10 worst pain)

1.5 difference from placebo change from baseline

Design PoC study to select future dose(s) Phase III

12 fold dose range

Dose-response unknown…may be inverted-U shaped

Positive control desirable for assay sensitivity

Too costly to explore dose-range?

CONSIDER

[Figure: response versus dose, marking Informative and Not Informative parts of the dose range]

FIXED DOSE DESIGN

• Pfizer: Smith, Jones, Morris, Grieve, Tan (2006)

Study schematic: 1 wk Lead-In, 4 wks Double Blind Treatment, 1 wk Follow-up

Arms: 7 Doses, Positive Control, Placebo

Max n=35 per arm

Type I error < 5%

Power ~ 80%

ADAPTIVE PoC CASE STUDY

Decision Making Components:

Dose Allocation Rule

– Initiate all 9 arms (equal allocation)

Stopping Rules

– Actions

– 2 interim analyses

– Drop up to 2 non-efficacious arms at each look

– Stop the study early if all doses non-efficacious

– Don’t stop early for efficacy (gather more information)

– Decision rules

– Futility at dose Pr ( Effect at dose < 1.5 ) > 0.80

– Worth continuing if Pr (Effect at dose > 1.5) > 0.80

Modeling Component:

Normal Dynamic Linear Model

ADAPTIVE FEATURES
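The interim decision rules can be sketched as follows, assuming the posterior of each dose effect (improvement over placebo) is summarized by a normal approximation. The posterior summaries, the function names and the tie-breaking by highest futility probability are assumptions for illustration, not details of the case study.

```python
from statistics import NormalDist
from typing import Dict, Tuple


def futility_prob(post_mean: float, post_sd: float, threshold: float = 1.5) -> float:
    """Pr(effect < threshold) under a normal approximation to the posterior."""
    return NormalDist(post_mean, post_sd).cdf(threshold)


def interim_decision(posteriors: Dict[int, Tuple[float, float]],
                     threshold: float = 1.5, cutoff: float = 0.80,
                     max_drop: int = 2) -> dict:
    """posteriors maps dose -> (posterior mean, posterior sd) of its effect."""
    probs = {d: futility_prob(m, s, threshold) for d, (m, s) in posteriors.items()}
    futile = sorted((d for d in probs if probs[d] > cutoff),
                    key=lambda d: probs[d], reverse=True)
    if len(futile) == len(posteriors):
        return {"stop_study": True, "drop": futile}    # all doses non-efficacious
    return {"stop_study": False, "drop": futile[:max_drop]}


# Hypothetical interim posteriors for four doses.
look = interim_decision({1: (0.2, 0.5), 2: (0.5, 0.5), 3: (1.0, 0.5), 4: (2.5, 0.5)})
```

In this hypothetical look three doses exceed the futility cutoff, but only the two with the highest futility probability are dropped and the study continues.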

Trial stopped at first interim

Flat dose-response

Approximately $2M saved due to stopping early

Rough comparison to fixed design with pairwise comparison of each dose to placebo

Approximately 3-4 times larger

– No early stopping

– Controlling for multiple comparisons

– Type I error, 1-sided, 10%

RESULTS

Targets estimation of the overall dose-response

Formal optimality criterion (D-optimal)

Minimize determinant of the variance covariance matrix of the model parameter estimates (maximizes information)

Allocates patients (sequentially or group sequentially) to provide the most information

Typically keep allocation of placebo constant

Wide class of models are applicable

e.g. Four-parameter logistic model

D-OPTIMAL
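The criterion is easiest to see on a model much simpler than the four-parameter logistic. The sketch below swaps in a straight-line model in dose with unit error variance (a deliberate simplification) and compares the determinant of the per-patient information matrix for two allocations over doses 0 to 8.

```python
from typing import Sequence


def information_det(doses: Sequence[float], weights: Sequence[float]) -> float:
    """Determinant of the information matrix for mean = a + b * dose.

    `weights` are allocation proportions; the matrix is
    [[sum w, sum w*d], [sum w*d, sum w*d^2]].
    """
    m0 = sum(weights)
    m1 = sum(w * d for w, d in zip(weights, doses))
    m2 = sum(w * d * d for w, d in zip(weights, doses))
    return m0 * m2 - m1 * m1


doses = [0, 2, 4, 6, 8]
equal = information_det(doses, [0.2, 0.2, 0.2, 0.2, 0.2])
extremes = information_det(doses, [0.5, 0.0, 0.0, 0.0, 0.5])
```

For a straight line the determinant is larger when patients sit at the two extreme doses. For a nonlinear model the information matrix depends on the unknown parameters, which is why the allocation is updated as estimates accrue.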

One approach (ASRS WG white paper)

Allocate equally across doses for first cohort

Fit model

Based on this model, determine optimal allocation ratio for next cohort of patients to maximize information

Bayesian D-Optimal

Place a prior distribution on the model parameters

After each cohort, calculate the posterior distribution

Similarly, update allocation ratio to minimize determinant of the variance-covariance matrix of model parameters

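D-OPTIMAL (cont.)

4-PARAMETER LOGISTIC MODEL

$R_i = \beta_2 + \dfrac{\beta_1 - \beta_2}{1 + (D_i/\beta_3)^{\beta_4}} + \varepsilon_i$

$i$: Patient indicator

$R_i$: Patient response

$D_i$: Level of drug

$\beta_1$: Response at 0 drug

$\beta_2$: Max. attributable effect of drug + $\beta_1$

$\beta_3$: Dose producing response half way between $\beta_1$ and $\beta_2$

$\beta_4$: Related to steepness of slope

$\varepsilon_i$: Random error for patient i {often iid N(0,1)}

1-1 Comparison to Emax Model

$R_i = E_0 + \dfrac{E_{max}\, D_i^{\gamma}}{ED_{50}^{\gamma} + D_i^{\gamma}} + \varepsilon_i$

Case $\beta_4 < 0$: $\beta_2 = E_0$; $\beta_1 - \beta_2 = E_{max}$; $\beta_3 = ED_{50}$; $-\beta_4$ = Hill Coef

Case $\beta_4 > 0$: $\beta_2 = E_0$; $\beta_1 - \beta_2 = E_{max}$; $(\beta_3)^{-1} = ED_{50}$; $\beta_4$ = Hill Coef; dose enters as $(D_i)^{-1}$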
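The correspondence with the Emax model can be checked numerically. This sketch assumes the four-parameter logistic mean is written as b2 + (b1 - b2) / (1 + (D / b3)^b4) and the sigmoid Emax mean as E0 + Emax D^h / (ED50^h + D^h); the parameter values are hypothetical, and doses are kept strictly positive because a negative exponent is undefined at zero.

```python
def four_pl(dose: float, b1: float, b2: float, b3: float, b4: float) -> float:
    """Mean of the four-parameter logistic model."""
    return b2 + (b1 - b2) / (1.0 + (dose / b3) ** b4)


def sigmoid_emax(dose: float, e0: float, emax: float, ed50: float, hill: float) -> float:
    """Mean of the sigmoid Emax model."""
    return e0 + emax * dose ** hill / (ed50 ** hill + dose ** hill)


e0, emax, ed50, hill = 1.0, 3.0, 2.0, 1.5    # hypothetical values
grid = [0.5, 1.0, 2.0, 4.0, 8.0]

# Case b4 < 0: b2 = E0, b1 - b2 = Emax, b3 = ED50, -b4 = Hill coefficient.
negative_case = [four_pl(d, e0 + emax, e0, ed50, -hill) for d in grid]

# Case b4 > 0: same b1 and b2, 1/b3 = ED50, b4 = Hill, dose entered as 1/D.
positive_case = [four_pl(1.0 / d, e0 + emax, e0, 1.0 / ed50, hill) for d in grid]

emax_curve = [sigmoid_emax(d, e0, emax, ed50, hill) for d in grid]
```

Both parameterizations trace the same curve, so priors or starting values stated for one model carry over to the other through this mapping.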

Requires monotonicity

Increasing or decreasing

Minimum of 5 doses desirable

4-parameter model

If highest dose < ED95

Estimates of Emax, ED50, and Hill Coefficient (gamma) impacted

– High coefficient of variation & bias

Fit in data range usually good

Bayesian approach

Strong priors might be assumed for Emax if highest dose thought to be less than ED95

4-PARAMETER LOGISTIC MODEL (cont.)

How many patients to assign per cohort

Doses to include

Sample size

Fixed

Information driven (select criteria for determinant)

Include stop for futility

Likelihood the trend test will not be statistically significant

Likelihood that effect of each dose is less than some threshold

DESIGN QUESTIONS TO EXPLORE THROUGH SIMULATION

Phase II dose ranging study

Schizophrenia

Objective

Confirm positive POC study

– 3 arm: High dose, Active (assay sensitivity), Placebo

Explore lower doses

Determine dose(s) Phase III

Dose range 8 fold

4 doses

Primary Measure

PANSS total score at 6 weeks

EXAMPLE: ADAPTIVE DESIGN NOT RECOMMENDED

Subjective primary measure / 20 sites

Significant effect due to site

Desirable to stratify by site

Long term outcome relative to expected enrollment rate

No biomarker

Narrow dose range well covered by 4 doses

High dose effective, but may not be near Emax

STUDY CHARACTERISTICS

Fixed design with equal allocation

Adaptive allocation

Bayesian D-Optimal Criterion (4 parameter logistic model)

– Allocation adapts to increase efficiency of estimates of model parameters

– Target is the overall dose-response curve

Stopping Rules (4 interim analyses)

Stop for Futility

– If predicted mean difference high dose vs placebo > -5, with 95% confidence

Stop for Efficacy

– If predicted mean difference low dose vs placebo < 0, with 95% confidence

DESIGNS COMPARED

No compelling advantage to adaptive randomization over fixed allocation

Adaptive randomization favored ~ equal allocation

– Slightly more on placebo, slightly less on lower doses

Fixed design with unequal allocation (2:1:1:1:1) would be slightly more powerful for pairwise comparisons & not affect dose-response estimation adversely

Perfect information was assumed in simulations for adaptive allocation

– Lag between patient outcome data & enrollment worsens performance

Use of parametric dose-response model (unknown dose-response)

Additional resources/complexity not warranted

RECOMMENDATION

Decision theoretic approach to choice of dose

Doses comparable on the utility index scale

Maximize utility

Quantify benefit risk / tradeoffs

Incorporate both efficacy & safety measures

Subjective

Requires development of subjective value functions

Requires development of subjective weights to define the importance of each measure to the decision

Functional form can be Additive or Multiplicative

Can be complex to interpret

UTILITIES

[Figure: value function for Efficacy (CGI-Severity); x-axis: Change in CGI-Severity; y-axis: Value from 0.0 to 1.0, with 0.8 marked]

EXAMPLE: VALUE FUNCTION

Attributes (weights): Health Outcome (0.25), Efficacy (0.35), Safety (0.4)

Sub-Attributes (weights within attribute): Weight Loss (1), CGI-S (1), AE (0.6), QT (0.4)

Value: Weight Loss 0.5, CGI-S 0.8, AE 0.3, QT 0.9

INDEX: .5*.25 + .8*.35 + (.3*.6 + .9*.4)*.4 = .621

Note on Multiplicative Utility:

Similar to above (can incorporate weights directly into the value function)

Define value function to go to 0 value quickly if undesirable trait (e.g. safety concern)

EXAMPLE: ADDITIVE UTILITY INDEX
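The additive index is a weighted sum at two levels: sub-attribute values are averaged within each attribute, and the attribute scores are then combined. The sketch below reproduces the slide arithmetic; the dictionary layout and key names are illustrative.

```python
# Attribute weights from the example.
attribute_weights = {"health_outcome": 0.25, "efficacy": 0.35, "safety": 0.40}

# For each attribute: sub-attribute -> (weight within attribute, value).
sub_attributes = {
    "health_outcome": {"weight_loss": (1.0, 0.5)},
    "efficacy": {"cgi_s": (1.0, 0.8)},
    "safety": {"ae": (0.6, 0.3), "qt": (0.4, 0.9)},
}


def additive_utility(weights: dict, subs: dict) -> float:
    """Sum over attributes of weight * (weighted sum of sub-attribute values)."""
    total = 0.0
    for attribute, weight in weights.items():
        score = sum(w * value for w, value in subs[attribute].values())
        total += weight * score
    return total


index = additive_utility(attribute_weights, sub_attributes)   # about 0.621
```

Computing the index per dose puts all doses on one scale, so they can be ranked on benefit and risk together.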

Consider routinely assessing appropriateness of adaptive designs in exploratory development

Assess potential gains against those of standard fixed designs

Balance complexity with potential gains

Trial simulations typically needed

Fine tune design

Assess operating characteristics

Recommended even when ONLY considering a fixed trial design

Consider Seamless PoC/Phase 2 dose-response studies

Recommend model based approaches

More informative of dose response profile than MC

Critical to assess model assumptions

Non-parametric models less restrictive

CONCLUDING REMARKS

VI. Simulations Comparing Adaptive and Non-Adaptive DF Methods

Outline:

• Evaluating statistical operational characteristics of complex DF designs and methods

• Performance metrics and graphical displays

• Comparing DF designs and methods: PhRMA’s Adaptive Dose Ranging Studies working group simulation study

• Conclusions from ADRS WG simulations

MOTIVATION

Evaluation of operational characteristics (OCs) of proposed statistical methods is a critical step in designing a clinical trial – comparison of methods

The OCs include the power to detect signals of interest, the precision of estimates for quantities of interest, expected duration, etc. – in particular, used to determine sample size and number of arms

Complexity of adaptive dose finding designs and other non-traditional dose finding methods → typically no closed form expressions for OC metrics

Simulation-based evaluation needs to be employed

KEY GOALS OF DF TRIALS

Typical goals of Phase II trials:

Determine evidence of dose response (DR) signal, i.e., if average response changes with dose level – proof-of-concept (PoC)

Select target dose(s) for confirmatory phase – typically MED; other targets also used (e.g., maximum useful dose)

Estimate DR profile – usually for efficacy, but safety of increasing interest

These goals determine the design of the study and the operational characteristics that need to be evaluated

SIMULATING DF TRIALS

Trial simulation is the main tool for evaluating study OCs; it needs to properly incorporate multiple factors in study:

Type: parallel groups, cross-over, titration, etc

Available doses, inclusion of active control(s)

Dose allocation scheme (fixed vs. adaptive)

If adaptive, frequency and timing of adaptations (and algorithm for recalculating allocation ratios)

Dose response profile(s):

more than one should be used to assess sensitivity

flat dose response should be included to assess Type I error and impact on dose selection

SIMULATING DF TRIALS (CONT.)

Response variables: type (e.g., continuous, binary, count, ordinal); distribution (e.g., normal, Poisson)

Possible covariates and their role in DR model

Longitudinal measurements per patient (when, how many)

Variance and covariance parameters (e.g., within- and between patient variances, between-site variances)

Sample size (e.g., expected, maximum)

Drop-out and missing data models (e.g., time to drop-out)

Patient accrual process (e.g., rates, uniformity over time)

Stopping rules, if any (e.g., futility, efficacy)

SIMULATING DF TRIALS (CONT.)

Number of simulations (need to take into account desired precision for OCs estimates)

Statistical analysis methods:

testing for DR signal

selecting target dose(s) – may need target clinical effect

estimating DR profile

Sensitivity analysis: impact of changes in assumed parameters/models/design on OCs (highly recommended)

Choice of software: general purpose (e.g., Trial Simulator) vs. customized (e.g., R or S-PLUS suite of functions)
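The number of simulated trials follows from the precision wanted for an estimated probability such as power. A rough sketch using the binomial standard error is given below; the function names are illustrative, and the worst case p = 0.5 can be used when the true value is unknown.

```python
import math


def mc_standard_error(p: float, n_sims: int) -> float:
    """Monte Carlo standard error of a proportion estimated from n_sims trials."""
    return math.sqrt(p * (1.0 - p) / n_sims)


def sims_needed(p: float, target_se: float) -> int:
    """Smallest number of simulated trials giving at most the target standard error."""
    return math.ceil(p * (1.0 - p) / target_se ** 2)


se_power = mc_standard_error(0.8, 1600)    # power near 80% from 1600 simulated trials
worst_case = sims_needed(0.5, 0.01)        # conservative planning value
```

Comparisons between methods are sharper when all methods are run on the same simulated data sets, because the Monte Carlo noise then largely cancels in the differences.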

PERFORMANCE METRICS

Used to quantify performance of different designs and methods with regard to key study goals

1. Detecting DR

Pr(DR) = probability of identifying DR (usual power for sample size calculations in Phase II trials)

estimate = % simul. trials for which DR was detected

2. Dose selection

Pr(dose) = probability of selecting a dose at end of trial

- dose selection also based on clinical relevance of effect

- Pr(dose) ≤ Pr(DR), typically different

estimate = % simul. trials for which dose was selected
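Estimating Pr(DR) by simulation can be sketched as below. The example assumes equally spaced doses, a known standard deviation, and a one-sided linear trend z-test for a reduction in response; these are simplifications for illustration, not any of the methods compared in this course. The variance of 4.5 and the linear profile reaching -1.65 at the top dose are hypothetical inputs.

```python
import math
import random
from statistics import NormalDist


def simulate_pr_dr(true_means, n_per_arm, sigma, n_sims=2000, alpha=0.05, seed=1):
    """Share of simulated trials in which the trend test detects a dose response."""
    rng = random.Random(seed)
    k = len(true_means)
    scores = [(k - 1) / 2 - i for i in range(k)]      # decreasing scores: reduction is positive
    norm = math.sqrt(sum(s * s for s in scores))
    contrast = [s / norm for s in scores]             # unit-length contrast
    critical = NormalDist().inv_cdf(1 - alpha)
    se_arm = sigma / math.sqrt(n_per_arm)             # SD of each arm mean
    detected = 0
    for _ in range(n_sims):
        arm_means = [rng.gauss(m, se_arm) for m in true_means]
        z = sum(c * m for c, m in zip(contrast, arm_means)) / se_arm
        detected += z > critical
    return detected / n_sims


sigma = math.sqrt(4.5)
flat = simulate_pr_dr([0.0] * 5, n_per_arm=30, sigma=sigma)
active = simulate_pr_dr([-1.65 * d / 8 for d in (0, 2, 4, 6, 8)], n_per_arm=30, sigma=sigma)
```

The flat profile estimates the Type I error rate and the active profile estimates power. Pr(dose) would be estimated the same way, by counting trials in which a dose is also selected.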

DOSE SELECTION METRICS

Let $d_{targ}$ and $\hat{d}_{targ}$ represent the target dose and its estimate

Bias = $E(\hat{d}_{targ} - d_{targ})$

pBias = % Bias = $100\, E(\hat{d}_{targ} - d_{targ}) / d_{targ}$

pError = % Error = $100\, E|\hat{d}_{targ} - d_{targ}| / d_{targ}$

Expected value E(.) estimated by the simulation averages

For methods based on hypothesis testing (e.g., Dunnett), $\hat{d}_{targ}$ is typically one of the doses in the study

For model-based methods, it takes values on a continuous scale (typically within the dose range for the trial)

Can also define Bias, pBias, and pError for target effect and the effect associated with $\hat{d}_{targ}$
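These metrics reduce to simple averages over the simulated trials. A minimal sketch follows, assuming pError uses the absolute deviation and that trials in which no dose was selected have already been removed; both are assumptions of this illustration, as are the names and example values.

```python
from typing import Dict, Sequence


def dose_selection_metrics(estimates: Sequence[float], target: float) -> Dict[str, float]:
    """Bias, % bias and % absolute error of estimated target doses."""
    n = len(estimates)
    bias = sum(estimates) / n - target
    mean_abs_error = sum(abs(e - target) for e in estimates) / n
    return {
        "bias": bias,
        "p_bias": 100.0 * bias / target,
        "p_error": 100.0 * mean_abs_error / target,
    }


# Hypothetical estimated target doses from four simulated trials, true target 4.
metrics = dose_selection_metrics([3.0, 4.0, 5.0, 6.0], target=4.0)
```

pBias can be close to zero while pError is large, because over- and under-estimates cancel in the former but not in the latter.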

TARGET DOSE INTERVAL

Doses with an effect within ± 100p% of target effect

Example for p = 0.1 (i.e., ± 10% of target effect)

target dose interval

DOSE RESPONSE METRICS

For pre-defined grid of doses d1, d2, …, dK in range of interest, let $\mu_1, \mu_2, \dots, \mu_K$ be the corresponding expected responses and $\hat{\mu}_1, \hat{\mu}_2, \dots, \hat{\mu}_K$ the estimated responses

APE: average prediction error = $E\left(\sum_{i=1}^{K} |\hat{\mu}_i - \mu_i|\right) / K$

pAPE: % APE wrt target effect = $100\,\mathrm{APE}/\Delta$

PEQ(q): prediction error quantile of order q = $Q_q(|\hat{\mu}_i - \mu_i|)$, e.g., median prediction error

expected values estimated by simulation means, quantiles estimated by simulation quantiles
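A sketch of the dose response metrics computed from simulated fits is shown below. It assumes the prediction error is the absolute difference at each grid dose, that pAPE divides by the magnitude of the target effect, and that the median error is taken over all doses and trials pooled; these conventions, the names and the example numbers are assumptions for illustration.

```python
import statistics
from typing import List, Sequence


def abs_errors(true_means: Sequence[float],
               fitted_curves: Sequence[Sequence[float]]) -> List[List[float]]:
    """Absolute prediction errors for every simulated trial and grid dose."""
    return [[abs(f - t) for f, t in zip(curve, true_means)] for curve in fitted_curves]


def ape(true_means, fitted_curves) -> float:
    """Average prediction error: mean over trials of the mean error over doses."""
    per_trial = [sum(e) / len(e) for e in abs_errors(true_means, fitted_curves)]
    return sum(per_trial) / len(per_trial)


def p_ape(true_means, fitted_curves, target_effect: float) -> float:
    """APE as a percentage of the magnitude of the target effect."""
    return 100.0 * ape(true_means, fitted_curves) / abs(target_effect)


def median_prediction_error(true_means, fitted_curves) -> float:
    """Median absolute prediction error, pooled over doses and trials."""
    pooled = [e for trial in abs_errors(true_means, fitted_curves) for e in trial]
    return statistics.median(pooled)


truth = [0.0, -1.0]                       # hypothetical true means on a 2-dose grid
fits = [[0.1, -1.2], [-0.1, -0.9]]        # hypothetical fitted curves from two trials
ape_value = ape(truth, fits)
```

Expressing the error relative to the target effect makes results comparable across scenarios with different response scales.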

GRAPHICAL DISPLAYS

importance of conveying relevant info via plots

histograms

dotplots

barplots

Sample DR curves

Trellis plots to combine information

GRAPHICAL DISPLAYS

Conveying relevant information in concise and efficient way is a critical step in simulation report

Graphical displays are well-suited for this purpose, but must be chosen appropriately

Because simulations usually include many different combinations of scenarios (e.g., sample size, number of doses) and methods, Trellis displays are particularly useful in presenting information

Will describe and illustrate various efficient graphical displays in the context of PhRMA WG simulations

PhRMA ADRS WG

Adaptive Dose Ranging Studies (ADRS) working group (WG): one of 10 Pharmaceutical Innovation Steering Committee (PISC) WGs

Formed as result of BCG survey to identify key drivers of poor performance in pharmaceutical industry → poor understanding of DR indicated as one of leading causes for high attrition in late development.

Close collaboration with Novel Adaptive Designs PISC WG

ADRS WG: TEAM MEMBERS

• Alex Dmitrienko, Eli Lilly

• Amit Roy, BMS

• Beat Neuenschwander, Novartis

• Björn Bornkamp, U. Dortmund

• Brenda Gaydos, Eli Lilly

• Chyi-Hung Hsu, Novartis

• Frank Bretz, Novartis

• Frank Shen, BMS

• Franz König, U. Vienna

• Greg Enas, Eli Lilly

• José Pinheiro, Novartis

• Michael Krams, Wyeth

• Qing Liu, J&J

• Rick Sax, AstraZeneca

• Tom Parke, Tessella

ADRS WG: GOALS AND SCOPE

Investigate and develop designs and methods for efficiently learning about efficacy and safety DR profiles → benefit/risk profile

Evaluate statistical OCs of alternative designs and methods (adaptive and fixed) to make recommendations on their use

Increase awareness about ADRS, promoting their use, when advantageous

Comprehensive simulation study comparing ADRS to other DF methods, quantifying potential gains

SUMMARY OF DESIGN AND ASSUMPTIONS

Proof-of-concept + dose-finding trial, motivated by neuropathic pain indication (conclusions and recommendations can be generalized)

Key questions: Whether there is evidence of dose response and, if so, which dose level to bring to confirmatory phase and how well dose response (DR) curve is estimated.

Primary endpoint: Change from baseline in VAS at Week 6 (continuous, normally distributed)

Dose design scenarios (parallel arms):

- 5 equally spaced dose levels: 0, 2, 4, 6, 8

- 7 unequally spaced dose levels: 0, 2, 3, 4, 5, 6, 8

- 9 equally spaced dose levels: 0, 1, …, 8

Significance level: one sided FWER = 0.05

Sample sizes: 150 and 250 patients (total)

DOSE RESPONSE PROFILES

DF METHODS USED IN SIMULATIONS

Traditional ANOVA based on pairwise comparisons and multiplicity adjustment (Dunnett)

MCP-Mod: combination of multiple comparison procedure (MCP) and modeling (Bretz, Pinheiro, and Branson, 2005)

MTT: novel method based on Multiple Trend Tests

Bayesian Model Averaging: BMA

Nonparametric local regression fitting: LOCFIT

GADA: Dynamic dose allocation based on Bayesian normal dynamic linear model (Krams, Lee, and Berry, 2005)

D-opt: adaptive dose allocation based on D-optimality criterion

TARGET DOSE INTERVALS

Target clinical effect: Δ = -1.3 units (reduction in VAS)

SELECTED SIMULATION RESULTS

More detailed results given in the ADRS WG’s White Paper, available at http://biopharmnet.com/doc/doc12005.html

[Figures not reproduced: Power to identify DR; Dose selection under flat DR; Dose selection under active DR; Correct target dose interval; Doses selected (logistic and umbrella, N=150); % average prediction error (N=150); Sample predicted DR (logistic and umbrella, N=150)]

ADRS WG CONCLUSIONS

Detecting DR is considerably easier than estimating it

Current sample sizes for DF studies, based on power to detect DR, are inappropriate for dose selection and DR estimation

None of the methods performed well in estimating a dose in the correct target interval: the maximum observed percentage of correct interval selection was 60%, so larger N is needed

Adaptive dose-ranging methods (i.e., ADRS) lead to gains in power to detect DR, precision to select target dose, and to estimate DR – greatest potential in the latter two

ADRS WG CONCLUSIONS (CONT.)

Model-based methods have superior performance compared to methods based on hypothesis testing

Number of doses larger than 5 does not seem to produce significant gains (provided overall N is fixed): there is a trade-off between more detail about DR and less precision at each dose

In practice, need to balance gains associated with adaptive dose ranging designs against greater methodological and operational complexity

VII. Overall Conclusions and Recommendations

CONCLUSIONS & RECOMMENDATIONS

Adaptive, model-based dose finding designs should be routinely considered for use in drug development (Early Development, PoC, Dose Ranging): they can lead to substantial gains in efficiency over traditional methods

Dose assignment algorithm should be prospectively and clearly specified in study protocol

Trial simulations should be used to fully evaluate operating characteristics of design prior to study start

Seamless approaches should be considered to improve efficiency, especially between PoC (Ph. I/IIa) and dose ranging (Ph. IIb)

CONCL. & RECOMMENDATIONS (CONT.)

Sample size calculations for adaptive DF designs should take into account the precision of target dose estimates and, more broadly, the accuracy of the decision(s) to be made from the study

Early stopping rules, for efficacy and safety, should be implemented, when feasible, to allow greater efficiency gains in adaptive design

Potential gains associated with adaptive approaches should always be contrasted to additional complexity and costs related to their implementation – not a panacea

CONCL. & RECOMMENDATIONS (CONT.)

Greater usage of these adaptive DF designs should be encouraged and will require:

• Good quality software packages with well documented code and examples for implementing approaches and conducting simulations needed to evaluate operating characteristics of these methods.

• A greater understanding of the strengths and weaknesses of these approaches (hopefully, this course has helped in this regard)

• More published examples of studies that have utilized these methods.

Practical Considerations

PRACTICAL CONSIDERATIONS

Assessing projects for an adaptive design:

• Rapid acquisition of data relative to enrollment rate

- Outcome is more immediately (and accurately) observable

- Trials of longer duration, with relatively slow recruitment, can be good candidates

• Existence of predictive biological models and/or prior information in patient population of interest

- Predictive models for longer term outcome


Assessing projects for an adaptive design (cont.):

• Ethical considerations as driver for adaptation

• A high exploratory aspect may indicate greater efficiency gains

- Uncertainty relative to e.g., dose, variability, effect size

- Wide dose range

• Caution: Assuming the patient population remains constant over time

- Trials of long duration

- Selection bias due to unblinding of information


Considerations for simulations:

• Leverage information from disease state and exposure-response models

- Selecting dose-response model

- Defining prior distributions for model parameters

- Development of adaptive algorithm (decision criteria)

- Trial simulations to assess design performance

• Optimize over PD response model (best guess of truth)

• Assess sensitivity to using other response models (include models different from the design dose-response model)


Considerations for simulations (cont.):

• Understand impact of enrollment rate

- Include different rates in simulations

- Consider controlling rate if simulations indicate gains

• Assess impact across different dropout models

• Include information lag (e.g. batches) in simulations

• Demonstrating control of Type I error rate

- Simulate over a grid of scenarios in the null space

- Simulate across various dropout & enrollment rate models
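The simulation checks listed above can be sketched as follows. The allocation rule, batch sizes, dropout rates, standard deviations, and the final trend test are hypothetical choices made for this illustration, not a recommended design: patients arrive in batches (the information lag), dropout is completely at random, and after each batch the best-looking active dose receives three times the allocation weight of the other arms. The code estimates the Type I error of a naive linear trend test over a small grid of flat-DR scenarios, with a Monte Carlo standard error for each estimate.

```python
import random
from itertools import product
from math import sqrt
from statistics import NormalDist

DOSES = [0, 2, 4, 6, 8]


def run_null_trial(rng, sd, dropout, batch, n_total=150):
    """One adaptive trial under a flat DR: all arms share the same true mean (0)."""
    data = {d: [] for d in DOSES}
    weights = [1.0] * len(DOSES)  # first batch: equal allocation
    enrolled = 0
    while enrolled < n_total:
        size = min(batch, n_total - enrolled)
        for d in rng.choices(DOSES, weights=weights, k=size):
            if rng.random() >= dropout:  # completers only (dropout at random)
                data[d].append(rng.gauss(0.0, sd))
        enrolled += size
        # Information lag: weights change only after the whole batch has reported.
        seen = {d: sum(v) / len(v) for d, v in data.items() if d > 0 and v}
        if seen:
            best = min(seen, key=seen.get)  # lowest mean = largest VAS reduction
            weights = [3.0 if d == best else 1.0 for d in DOSES]
    return data


def trend_statistic(data):
    """Naive linear-contrast z statistic computed on arms that have data."""
    arms = {d: v for d, v in data.items() if v}
    n_obs = sum(len(v) for v in arms.values())
    if len(arms) < 2 or n_obs <= len(arms):
        return 0.0
    centre = sum(arms) / len(arms)
    means = {d: sum(v) / len(v) for d, v in arms.items()}
    ss = sum((y - means[d]) ** 2 for d, v in arms.items() for y in v)
    pooled_var = ss / (n_obs - len(arms))
    contrast = sum((d - centre) * means[d] for d in arms)
    se = sqrt(pooled_var * sum((d - centre) ** 2 / len(v) for d, v in arms.items()))
    return contrast / se


def type1_error_grid(n_sims=500, alpha=0.05, seed=1):
    """Estimated Type I error and its Monte Carlo SE for each null scenario."""
    rng = random.Random(seed)
    critical = NormalDist().inv_cdf(alpha)  # normal approximation
    results = {}
    for sd, dropout, batch in product((1.5, 2.5), (0.0, 0.2), (15, 50)):
        rejections = sum(
            trend_statistic(run_null_trial(rng, sd, dropout, batch)) < critical
            for _ in range(n_sims)
        )
        rate = rejections / n_sims
        results[(sd, dropout, batch)] = (rate, sqrt(rate * (1 - rate) / n_sims))
    return results


if __name__ == "__main__":
    for scenario, (rate, mc_se) in type1_error_grid().items():
        print(scenario, round(rate, 3), "+/-", round(mc_se, 3))
```

In practice the grid would also cover different enrollment rates and dropout mechanisms, and the number of simulated trials would be chosen so that the Monte Carlo standard error is small relative to the nominal level.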


Practical Issues for implementation:

• Additional time to develop the design & protocol

- May need to run extensive simulations to understand operating characteristics

- Communicate with the primary investigators about the design, receive feedback, and address concerns

• Clinical trial material needs

- Dosage strengths, quantity, packaging

• Additional resources for modeling & data analysis

- Interim data preparations & analyses

- Final analysis more complex


Practical Issues for implementation (cont.):

• Increase in site communication

- Design changes / Patient treatment assignments

- Fax, Interactive Voice Response Systems, WEB interface

- Additional site training

• Determine type of committee needed to monitor trial

- Ensure protocol is followed (no programming errors)

- Unanticipated safety signals not accounted for in the adaptive algorithm

- Engage committee early in scenario simulations (prior to protocol approval)


Practical Issues for implementation (cont.):

• Determine what data will be needed

• How will data be collected?

- Electronic data capture, Expedited report forms (not monitored), Voice Response system, Excel Spreadsheet

- eDC systems not friendly for interim data extraction

• How clean do the data need to be?

- Fully verified (locked) data is not typical

- Use latest data in modeling/analysis (continually clean data)

• Document, Document, Document!!!
