PHARMACEUTICAL STATISTICS
Pharmaceut. Statist. 2005; 4: 293–296
Published online in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/pst.191
Literature Review June–September 2005
Simon Day1,*,† and Meinhard Kieser2
1Medicines and Healthcare Products Regulatory Agency, Room 13-205, Market Towers, 1 Nine Elms Lane, London SW8 5NQ, UK
2Department of Biometry, Dr Willmar Schwabe Pharmaceuticals, Karlsruhe, Germany
INTRODUCTION
This review covers the following journals received during the
period from the middle of June 2005 to the middle of September
2005:
* Applied Statistics, volume 54, part 4.
* Biometrical Journal, volume 47, parts 3, 4.
* Biometrics, volume 61, parts 2, 3.
* Biometrika, volume 92, parts 2, 3.
* Biostatistics, volume 6, part 3.
* Clinical Trials, volume 2, parts 3, 4.
* Communications in Statistics – Simulation and Computation,
volume 34, part 3.
* Communications in Statistics – Theory and Methods, volume
34, parts 6–8.
* Drug Information Journal, volume 39, part 3.
* Journal of Biopharmaceutical Statistics, volume 15, parts
4, 5.
* Journal of the American Statistical Association, volume 100,
part 3.
* Journal of the Royal Statistical Society, Series A, volume
168, parts 2, 3.
* Statistics in Medicine, volume 24, parts 14–19.
* Statistical Methods in Medical Research, volume 14, parts 3, 4.
SELECTED HIGHLIGHTS FROM THE LITERATURE
The themes of Statistical Methods in Medical Research were:
* Part 3: Multicentre trials (pp. 201–318).
* Part 4: Accounting for non-compliance in clinical trials
(pp. 325–431).
Part 4 of the Journal of Biopharmaceutical Statistics is a
special issue on the topic of adaptive designs in clinical research.
Fifteen articles discuss planning and analysis issues that occur
in adaptive designs from a regulatory, academic and industry
viewpoint.
* Journal of Biopharmaceutical Statistics, Volume 15, part 4
(pp. 535–745).
Part 4 of Clinical Trials is devoted to the proceedings of a
conference in Bethesda entitled ‘Can Bayesian approaches to
studying new treatments improve regulatory decision-making?’
It contains papers from presentations, panel discussions and
case studies.
* Clinical Trials, Volume 2, part 4 (pp. 271–378).
Ethics
There is a series of articles in issue 3, volume 2, of
Clinical Trials, beginning with an editorial by Goodman. A
paper by Fergusson et al. (with accompanying commentaries
by Chalmers and by Augoustides and Fleisher) shows a
cumulative meta-analysis of trials of aprotinin – one
really does have to question why new studies were started and
patients continued to be randomized well after the efficacy
questions had been resolved. A similar example is in the next
paper by Mann et al. The next article in the ‘set’ is an excellent
history and review of the arguments about collective and
individual ethics (terms that the authors believe are really too
broad and too vague to help address the question of which
takes priority).
* Goodman SN. Ethics and evidence in clinical trials
(editorial). Clinical Trials 2005; 2:195–196.
* Fergusson D, Cranley Glass K, Hutton B, Shapiro S.
Randomized controlled trials of aprotinin in cardiac
surgery: could clinical equipoise have stopped the bleeding?
Clinical Trials 2005; 2:218–232.
* Mann H, London AJ, Mann J. Equipoise in the Enhanced
Suppression of the Platelet IIb/IIIa Receptor with Integrilin
Trial (ESPRIT): a critical appraisal. Clinical Trials 2005;
2:233–243.
* Heilig CM, Weijer C. A critical history of individual and
collective ethics in the lineage of Lellouch and Schwartz.
Clinical Trials 2005; 2:244–253.
*Correspondence to: Simon Day, Medicines and Healthcare Products Regulatory Agency, Room 13-205, Market Towers, 1 Nine Elms Lane, London SW8 5NQ, UK.
†E-mail: [email protected]
Copyright © 2005 John Wiley & Sons, Ltd.
A further paper in this issue, whilst not in the section on
ethics, does seem to fit well in this section of this review. It is by
Cooper et al. on the use of systematic reviews when designing
studies. The authors cited above criticize the lack of use of such
reviews when planning new studies – the current authors stress
their importance and discuss how to make best use of them.
* Cooper NJ, Jones DR, Sutton AJ. The use of systematic
reviews when designing studies. Clinical Trials 2005; 2:
260–264.
Phase I
Dose–response and the search for the maximum tolerated
dose is a well-researched problem. Much is known about
efficient designs – although, perhaps, there is also much to
discover. Certainly the problem of stopping before overdose
is important, particularly in therapies that are intended to
be used at the highest dose possible. Tighiouart et al. use a
joint prior for the maximum tolerated dose and the
probability of dose-limiting toxicity. It is important to use
a joint prior because these two features are correlated – in
fact, negatively correlated.
* Tighiouart M, Rogatko A, Babb S. Flexible Bayesian
methods for cancer phase I clinical trials. Dose escalation
with overdose control. Statistics in Medicine 2005;
24:2183–2196.
Bretz et al. contrast the two common approaches to the analysis of
multiple dose studies: either fit a model (with
uncertainty about what form of model should be used), or make
pairwise comparisons (with associated control of error rates).
They combine the two approaches to control error rates
whilst exploring a variety of functional forms for a model:
* Bretz F, Pinheiro JC, Branson M. Combining multiple
comparisons and modeling techniques in dose–response
studies. Biometrics 2005; 61:738–748.
Phase II
Dose finding is usually a Phase II domain – but the following
paper might come somewhere between II and III. Commonly in
cancer treatments (but in other areas too) it is not just the dose
which needs to be found, but the dosing schedule. The paper by
Braun et al. addresses this problem by looking at time to
toxicity following repeated administration of an agent. It does
not completely solve the problem of what dosing schedule is
‘optimal’, but does help to find the maximum tolerated
schedule. When very aggressive therapies are needed, then
‘the most the patients can tolerate’ may be the desired dose.
* Braun TM, Yuan Z, Thall PF. Determining a maximum-
tolerated schedule of a cytotoxic agent. Biometrics 2005;
61:335–343.
London and Chang describe a one-sided test for response
rates in phase II oncology trials. The novel features are that
stratification is accounted for and sample size can be adjusted
to obtain desired power and significance levels (hence the
elements of ‘one-stage’ and ‘two-stage’ in the title).
* London WB, Chang MN. One- and two-stage designs for
stratified phase II clinical trials. Statistics in Medicine 2005;
24:2597–2611.
Multiplicity
Multiple multiples... This paper looks at the problem of
multiple endpoints for each of several doses of an active
treatment. Multiplicity caused by multiple doses needs a
different solution to that caused by more than one endpoint.
These authors combine the two problems by thinking of them
as a simple two-dimensional problem.
* Quan H, Luo X, Capizzi T. Multiplicity adjustment for
multiple endpoints in clinical trials with multiple doses of an
active treatment. Statistics in Medicine 2005; 24:2151–2170.
Interim analyses and data monitoring committees
Even if interim analyses are not directly about making
decisions, data monitoring committees certainly do have to
make decisions (even if only benign ones such as ‘do nothing for
now’). Decision-making should consider the consequences of
those decisions and Ashby and Tan nicely argue for doing this
in a Bayesian way. The advantage may not be in the Bayesian
approach per se (priors, likelihoods, posteriors and so on) but –
as they point out – ‘explicit consideration of utilities leads to
decision-making that is more transparent.’ Three accompanying
commentaries (by Carlin, Louis and Inoue), and an authors’
response, make for a lively and thought-provoking article:
* Ashby D, Tan S. Where’s the utility in Bayesian data-
monitoring of clinical trials? Clinical Trials 2005; 2:197–208.
As another opportunity to include Bayesian ideas in trials
with interim analyses, Cheng and Shen propose Bayesian
adaptive designs. Here the decision to terminate or continue
the trial uses a loss function that is based on the cost for each
patient and the costs of making incorrect decisions at the end of
the study. The loss function is closely related to frequentist
error rates and therefore the desired frequentist properties of
the design can be maintained.
* Cheng Y, Shen Y. Bayesian adaptive designs for clinical
trials. Biometrika 2005; 92:633–646.
Still within the general topic of interim analyses, two
interesting papers have appeared relating to flexible designs
and mid-course changes of direction. One is a two-stage
procedure for testing non-inferiority and superiority. Note that
this is, indeed, a two-stage design, not a one-stage design. The
design of the second stage (to get more data and go for
superiority) is based on the results of a non-inferiority test at
the end of the first stage. This is not the same as the cases
described in the CHMP Points to Consider document on
Switching Between Superiority and Non-inferiority.
* Koyama T, Sampson AR, Gleser LJ. A framework for two-
stage adaptive procedures to simultaneously test non-
inferiority and superiority. Statistics in Medicine 2005;
24:2439–2456.
The second example is more of a continuous adaptation.
Changing the allocation ratio towards the more successful
treatment (response-adaptive designs) is well known but this
paper looks carefully at the optimal allocation ratio (usually the
change in allocation ratio is rather arbitrary). The practicalities
of conducting studies in this way are not small, particularly in
multi-centre (even multi-regional) trials.
* Atkinson AC, Biswas A. Adaptive biased-coin designs for
skewing the allocation proportion in clinical trials with
normal responses. Statistics in Medicine 2005; 24:
2477–2492.
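As a rough illustration of skewing the allocation towards the better-performing arm with normal responses, the sketch below uses an assumed probit rule: the probability of assigning the next patient to arm A is the standard normal distribution function applied to the scaled difference in running means. This rule, the `scale` tuning constant, the alternating burn-in and all function names are illustrative assumptions; they are not taken from Atkinson and Biswas, whose designs target an optimal allocation proportion.

```python
import random
from statistics import NormalDist, fmean


def allocation_probability(mean_a, mean_b, scale=1.0):
    """Probability of assigning the next patient to arm A.

    Assumed rule (not from the cited paper): a probit link on the
    estimated mean difference, with larger responses taken as better.
    """
    return NormalDist().cdf((mean_a - mean_b) / scale)


def simulate_trial(n_patients, true_mean_a, true_mean_b,
                   sd=1.0, scale=1.0, burn_in=5, seed=0):
    """Simulate one trial and return the number of patients per arm.

    burn_in must be at least 1 so that both running means exist
    before the adaptive rule is first applied.
    """
    rng = random.Random(seed)
    responses = {"A": [], "B": []}
    for i in range(n_patients):
        if i < 2 * burn_in:
            # Alternate arms at first so both means can be estimated.
            arm = "A" if i % 2 == 0 else "B"
        else:
            p_a = allocation_probability(
                fmean(responses["A"]), fmean(responses["B"]), scale)
            arm = "A" if rng.random() < p_a else "B"
        true_mean = true_mean_a if arm == "A" else true_mean_b
        responses[arm].append(rng.gauss(true_mean, sd))
    return {arm: len(values) for arm, values in responses.items()}
```

Calling `simulate_trial(200, 2.0, 0.0)` returns the per-arm counts for one simulated trial. A smaller `scale` skews the allocation more aggressively, which is exactly the kind of arbitrary choice that an optimal-allocation argument is meant to replace.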
One possible mid-course adaptation might be to abandon a
study altogether. Lachin reviews futility analyses based on
conditional power, comparing various approaches and
considering their size and power:
* Lachin JM. A review of methods for futility stopping based
on conditional power. Statistics in Medicine 2005; 24:
2747–2764.
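For readers unfamiliar with conditional power, a minimal sketch of the standard Brownian-motion (B-value) formulation follows. It assumes a one-sided test at level `alpha` with a single final analysis, an interim Z statistic observed at information fraction t, and a drift parameter equal to the expected final Z value. The function names and the default `alpha` are illustrative assumptions, and the code does not reproduce any particular method compared in Lachin's review.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def conditional_power(z_interim, info_fraction, drift, alpha=0.025):
    """P(final Z exceeds the critical value | interim Z), given a drift.

    Uses B(t) = Z(t) * sqrt(t); the remaining increment B(1) - B(t)
    is normal with mean drift * (1 - t) and variance (1 - t).
    """
    if not 0.0 < info_fraction < 1.0:
        raise ValueError("info_fraction must lie strictly between 0 and 1")
    z_crit = _STD_NORMAL.inv_cdf(1.0 - alpha)
    b_value = z_interim * sqrt(info_fraction)
    remaining = 1.0 - info_fraction
    shortfall = z_crit - b_value - drift * remaining
    return 1.0 - _STD_NORMAL.cdf(shortfall / sqrt(remaining))


def current_trend_drift(z_interim, info_fraction):
    """Drift estimated from the interim data: B(t) / t."""
    return z_interim / sqrt(info_fraction)
```

A futility rule would stop the trial when, say, `conditional_power(z, t, drift)` falls below a chosen threshold. The choice of `drift` (the design alternative, the current trend, or the null) strongly affects the result, and that choice in turn affects the size and power of the overall procedure.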
Perhaps one of the most hopeful fields of application for
adaptive designs is the seamless transition from phase II to phase III
within a single trial (several papers in the special issue of the
Journal of Biopharmaceutical Statistics mentioned above
address this topic). Starting with a dose–response trial, the
most promising dose is selected at the interim analysis, and
the trial is continued with this treatment group and placebo – to
prove efficacy. Sampson and Sill develop a procedure that does
not allow for early stopping but, rather, one that takes into
account the selection of the best treatment when calculating the
critical boundary for the final analysis. In this way the type I
error rate of the treatment group comparison is controlled. The
article is followed by a spirited discussion of this design from
various viewpoints and a rejoinder by the authors.
* Sampson AR, Sill MW. Drop-the-losers design: normal
case. Biometrical Journal 2005; 47:257–268 (Discussion and
rejoinder: 269–281).
A very different type of problem is addressed by van
Houwelingen et al. concerning interim analyses with survival
data. Instead of having partial recruitment but with complete
follow-up for all recruited patients, in survival analyses we often
have complete recruitment but only partial follow-up. The
purpose of the interim analysis is not to stop recruitment but to
stop follow-up (or perhaps apply for a marketing authorization,
or publish results, based on partial follow-up).
* Van Houwelingen HC, van de Velde CJH, Stijnen T.
Interim analysis on survival data: its potential bias and how
to repair it. Statistics in Medicine 2005; 24:2823–2835.
Numerous papers show the opportunities of adaptive
designs. The following article illustrates that the freedom they
offer may also bear considerable risks. When applying
inefficient adaptation rules or when being fooled by the data,
adaptation may lead to a switch from a good path to a worse
alternative.
* Kieser M. A note on adaptively changing the hierarchy of
hypotheses in clinical trials with flexible design. Drug
Information Journal 2005; 39:215–222.
Data analysis issues
Another paper comparing different methods for imputing
missing values: this one compares hot-deck multiple imputation
and a model based on a multivariate normal distribution.
As some sort of baseline, they are compared with a last
observation carried forward approach and the available cases
(or ‘completers only’) subset. The criterion for comparison is
principally the coverage probabilities of confidence intervals
rather than the point estimates.
* Tang L, Song J, Belin TR, Unutzer J. A comparison of
imputation methods in a longitudinal randomized clinical
trial. Statistics in Medicine 2005; 24:2111–2128.
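The two baseline approaches mentioned above are simple enough to sketch. The code below assumes each patient's longitudinal data is a list of visit values with `None` marking a missing visit; this representation and the function names are illustrative assumptions, not the authors' implementation.

```python
from typing import List, Optional, Sequence


def locf(visits: Sequence[Optional[float]]) -> List[Optional[float]]:
    """Last observation carried forward for one patient's visits.

    Leading missing values stay missing because there is nothing
    to carry forward yet.
    """
    filled: List[Optional[float]] = []
    last: Optional[float] = None
    for value in visits:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def completers_only(
    patients: Sequence[Sequence[Optional[float]]],
) -> List[List[float]]:
    """Keep only patients with no missing visit at all."""
    return [list(p) for p in patients if all(v is not None for v in p)]
```

Neither approach reflects the uncertainty due to the missing values, which is one reason a comparison in terms of confidence interval coverage, rather than point estimates alone, is informative.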
The purpose of subgroup analyses has been much debated.
Grouin et al. give an overview of this subject by
addressing design and analysis issues and by discussing the
validity of the claims that can be inferred from the results of
subgroup analyses.
* Grouin JM, Coste M, Lewis J. Subgroup analyses in
randomized clinical trials: statistical and regulatory issues.
Journal of Biopharmaceutical Statistics 2005; 15:869–882.
The following paper is written in the context of an
observational/epidemiological study but probably has useful
implications in randomized studies too. The relationship
between baseline scores and endpoint scores, and the
relationship between baselines and changes from baseline, are
well rehearsed. This paper looks at the relationship between
baseline value and slope (which surely must be very similar to
the relationship with change from baseline). Different
approaches to assessing this problem are considered – as well as
whether or not the baseline is a useful predictor of slope.
* Byth K, Cox DR. On the relation between initial value and
slope. Biostatistics 2005; 6:395–403.
Two-stage analyses have been criticized in many cases
(testing for carry-over prior to deciding on the main effects
analysis in crossover trials is, perhaps, the most notable
example). This paper looks more generally at the value of
diagnostic checking of model assumptions. The second part of
the title (‘do they really help?’) should be a sufficient clue to the
author’s conclusion that they don’t!
* Shuster JJ. Diagnostics for assumptions in moderate to
large simple clinical trials: do they really help? Statistics in
Medicine 2005; 24:2431–2438.
Meta-analysis
The section in this review on ethics includes several papers
about meta-analysis and systematic reviews; here is a further
one looking at commonly used methods for individual patient
data. There seems to be wide variety (and perhaps arbitrary choices)
in the way such analyses are carried out – fixed and random
effects being an obvious one. The authors argue for enhanced
methods both of analysis and presentation of such meta-
analyses.
* Simmonds MC, Higgins JPT, Stewart LA, Tierney JF,
Clarke MJ, Thompson SG. Meta-analysis of individual
patient data from randomized trials: a review of methods
used in practice. Clinical Trials 2005; 2:209–217.
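The fixed- versus random-effects choice can be illustrated on trial-level summary estimates. The sketch below assumes each trial contributes an effect estimate and its variance, and it uses the DerSimonian–Laird moment estimator for the between-trial variance. It is a two-stage illustration on summary data, not a one-stage analysis of individual patient data, and the function names are illustrative assumptions.

```python
from math import sqrt


def pool(effects, variances, tau2=0.0):
    """Inverse-variance pooled estimate and its standard error.

    tau2 = 0 gives the fixed-effect analysis; tau2 > 0 adds
    between-trial variance to every trial (random effects).
    """
    weights = [1.0 / (v + tau2) for v in variances]
    total = sum(weights)
    estimate = sum(w * y for w, y in zip(weights, effects)) / total
    return estimate, sqrt(1.0 / total)


def dersimonian_laird_tau2(effects, variances):
    """Moment estimate of the between-trial variance, truncated at 0."""
    if len(effects) < 2:
        raise ValueError("at least two trials are needed")
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    fixed, _ = pool(effects, variances)
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    c = total - sum(w * w for w in weights) / total
    return max(0.0, (q - (len(effects) - 1)) / c)
```

Passing the result of `dersimonian_laird_tau2` as `tau2` to `pool` gives the random-effects estimate. When the trials disagree, its standard error is larger than the fixed-effect one, which is why the choice between the two should be stated and justified rather than made arbitrarily.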
Pharmacovigilance
This paper gives a comprehensive tutorial on design and
analysis issues for the assessment of drug-induced QT and QTc
prolongation.
* Pharmaceutical Research and Manufacturers of America
QT Statistics Expert Working Team. Investigating drug-
induced QT and QTc prolongation in the clinic: a review of
statistical design and analysis considerations: report from
the pharmaceutical research and manufacturers of America
QT statistics expert team. Drug Information Journal 2005;
39:243–266.
This paper looks more widely at adverse events and tries to
solve some of the problems of multiplicity by coming up with a
single measure of the relative safety of two agents. The methodology
is not that difficult (multivariate tests of binomial
probabilities) and, whilst the methods may have some value,
there might be a danger of missing a single signal in some
particular aspect of the adverse event profile. On the other hand,
several small increments in adverse events may be picked
up by this type of approach but missed when different types of
events are looked at in isolation.
* Agresti A, Klingenberg B. Multivariate tests comparing
binomial probabilities, with application to safety studies of
drugs. Applied Statistics 2005; 54:691–706.
Miscellaneous
The area of non-clinical and preclinical drug development
usually attracts less attention than clinical biostatistics although
– or perhaps because – it includes an even wider spectrum of
methods. The following paper gives an overview of recent
developments in statistical methodology for this field of drug
research.
* Hothorn LA. Biostatistics in nonclinical and preclinical
drug development. Biometrical Journal 2005; 47:282–285.