for clinical trials 31st annual meeting p10 p10... · sunday, may 16, 2010 1 ... contention or...

102
Society for Clinical Trials 31 st Annual Meeting Workshop P10 Considerations in the Design of Clinical Trials for Comparative Effectiveness Research Sunday, May 16, 2010 1:00 PM 5:00 PM Harborview Ballroom D

Upload: trinhbao

Post on 17-Feb-2019

214 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

 

 

 

 

 Society for Clinical Trials 31st Annual Meeting  

  

Workshop P10 Considerations in the Design of Clinical Trials 

for Comparative Effectiveness Research    

Sunday, May 16, 2010 1:00 PM ‐ 5:00 PM 

Harborview Ballroom D     

Page 2: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

 

Page 3: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

Society for Clinical Trials Pre-Conference Workshop Evaluation Baltimore, Maryland

May 16, 2010 WORKSHOP 10 – Considerations in the Design of Clinical Trials for Comparative

Effectiveness Research 1. Overall, did the subject context of this workshop meet your expectations and needs?

Yes ( ) No ( ) If yes, in what way? If no, why not? ___________________________________________ ___________________________________________________________________________

2. Was the content of this workshop of value to you personally or on the Job? Yes ( ) No ( )

3. Was the content of the workshop: New ( ) New/Review ( ) Review ( ) 4. The level and complexity of this workshop was: Too elementary ( ) Correct ( ) Too advanced ( )

5. Rate the extent to which this workshop:

a. Presented content clearly 1 2 3 4 5

b. Allowed sufficient time for discussion and audience participation 1 2 3 4 5

c. Provided useful information 1 2 3 4 5

d. Utilized appropriate teaching methods,

i.e., audiovisual, handouts, lectures 1 2 3 4 5 6. Please rate each workshop faculty member:

Name Knowledge of Subject Organization/Delivery

Huang Grant 1 2 3 4 5

1 2 3 4 5

Walter Koroshetz 1 2 3 4 5

1 2 3 4 5

Michael Lauer 1 2 3 4 5

1 2 3 4 5

Dennis Wallace 1 2 3 4 5

1 2 3 4 5

Please complete the following questions by circling the appropriate description using the rating scale listed below.

1 = excellent 2 = very good 3 = good 4 = fair 5 = poor

Page 4: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

1. Are you currently working in a clinical trial? (Yes) (No) 2. What is your job title? __________________________________________________________ 3. Do you have any suggested topics for workshops at future meetings? If so, please list below:

_____________________________________________________________________________ _____________________________________________________________________________ 4. What aspect of the workshop did you like best?

_____________________________________________________________________________

_____________________________________________________________________________ 5. What aspect of the workshop would you change if this workshop were offered again?

_____________________________________________________________________________ _____________________________________________________________________________

6. Additional Comments: _________________________________________________________ _____________________________________________________________________________

Page 5: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

May 16, 2010, 1pm Title: Considerations in the design of clinical trials for comparative effectiveness research (CER) Chair: Carmen Rosa (Center for the Clinical Trials Network, National Institute on Drug Abuse)

Agenda

Introduction: Michael Lauer, NIH (NHLBI) Key factors impacting the design of comparative effectiveness trials Clinical and Practical Considerations to Enhance Structural and Operational Efficiency

Some of the areas that can be addressed are: (1) participant eligibility (target populations); (2) intervention and control groups flexibility and expertise (using local sites staff and usual treatment vs. highly trained research staff or placebo); (3) choice of primary outcome measure; (4) economic evaluation (not usually done in traditional RCTs).

Statistical Considerations: RCT designed for Decision Makers:

This segment will address key issues that the statistician must address in working with other researchers to design and implement CER designs. The session will specifically discuss subtle ways in which research questions and outcome measures may be distinct for CER clinical trials and discuss how extensions to traditional/classic subject-randomized designs may provide advantages for CER trials. Examples of such designs that will be discussed are cluster-randomized designs, Bayesian designs and adaptive designs.

Case Studies: Small group break outs will provide participants a hands-on exercise in design.

Discussion of case studies: designs and suggestions to improve the application of the key factors

Closing remarks: what is next? Michael Lauer, NIH (NHLBI) 

Page 6: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

1

Comparative Effectiveness Research and Clinical Trials

Michael S Lauer, MD, FACC, FAHADirector, Division of Cardiovascular Sciences

NHLBI/NIHMay 16, 2010

Disclosure: None

Disclosures

My immediate family and I have NOT received anything of value related to the technologies and topics being presented.

I will present my views which are notI will present my views, which are not necessarily those of NHLBI/NIH/DHHS.

I am neither an economist nor a politician.

2

Just Over One Year Ago…

3

Page 7: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

2

ARRA and CER

Allocation: $1.1 billionNIH: $400 millionAHRQ: $300 millionHHS: $400 millionHHS: $400 million

Federal coordinating counsel for CERAdvice on CER Federal infrastructure needsFederal officials including AHRQ, NIH, VA

IOM issued report on priorities June 2009

4 http://www.opencongress.org/bill/111-h1/text

So What is CER?

C = ComparativeReal contestExisting options for clinicians or systems

Effectiveness = OutcomesEffectiveness OutcomesClinical outcomes: mortality, morbidity, major clinical events, costsSystems outcomes: adherence to guidelines

Research = ScienceObservations, experiments, syntheses

5

What to Do?

6 http://www.sonofthesouth.net/revolutionary-war/general/death-bed.jpgThanks to David Sackett for inspiring this content

Page 8: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

3

Choices

7 http://upload.wikimedia.org/wikipedia/en/8/8c/Tracheotomy1.jpghttp://upload.wikimedia.org/wikipedia/commons/0/01/Blood_letting.jpg

Debates about Comparative Effectiveness

8 Chalmers I. Int J Epidemiol 2001;30:1156-64

Come down to the contest…

“Come down to the contest ye Humorists: Let us take out of the Hospitals or the camps or elsewhere, 200, or 500 poor People, that have Fevers etc. Let us divide them in Halfes, let us cast lots that one half of them may fall to mycast lots, that one half of them may fall to my share and the other to yours; I will cure them without bloodletting…; but do you do as ye know. We shall see how many Funerals both of us shall have: But let the reward of thecontention or wager, be 300 Florens, deposited on both sides: Here your business is decided.”

9 Van Helmont JA. Oriatrike. London: Lodowick-Loyd, 1662, p.526

Page 9: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

4

The Trial is Reported…

“It had been so arranged, that this number was admitted, alternately, in such a manner thateach of us had one third of the whole. The sick were indiscriminately received, and were attended as nearly as possible with the same

10

attended as nearly as possible with the same care and accommodated with the same comforts.

Neither Mr Anderson nor I ever once employed the lancet. He lost two, I four cases; whilst out of the other third [treated with bloodletting by the third surgeon] thirty five patients died.”

Milne I and Chalmers I. J Epidemiol Community Health 2002;56:1a

Over 100 Years Later…

“During the last decades we have

t i l bl d t

11

certainly bled too little.”

William Osler, MD

David Sackett, Gairdner-Wightman Award, March 31, 2009 (www.cebm.net)

Modern Examples of Bloodletting

Thalidomide (birth defects)Hormone replacement (cancer, strokes)Oxygen for premature infants (blindness)Anti arrhythmic drugs (higher death rate)Anti-arrhythmic drugs (higher death rate)Bone marrow transplantation for breast

cancer (higher death rate)PSA for prostate cancer (over-diagnosis)

12 Thank you to Andrew Epstein

Page 10: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

5

Why Do We Need CER?

“Only a limited amount of evidence is available about which treatments work best for which patients and whether the added benefits of more-effective but more-expensive services are sufficient to warrant their added costsare sufficient to warrant their added costs—yet current practice tends to adopt more-expensive treatments even in the absence of rigorous assessments of their impacts….”

Peter Orszag

13 Congressional Budget Office 2007

Is This So?

14 Tricoci P et al. JAMA 2009;301:831-41

Nearly 50% of recommendations are based on expert opinion.Only 11% are based on multiple randomized trials.

Illustrative Story: AMI and the Occluded Artery

15http://www.circulation.or.kr/info/case/200904/fig1.gifhttp://www.indiastudychannel.com/attachments/Resources/82666-221135-Coronary%20Angiogram.jpg

Page 11: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

6

Observational Findings

16 Lamas GA et al. Circulation 1995;92:1101-9

The Real Story

When researchers tried to organize a randomized study of the benefits of angioplasty for patients who had suffered a heart attack three days or more before, they ran into a problem. Many doctors were so convinced of the value of this procedure…that they thought it would be unethical to assign any patients to the control g y pgroup, which would get all the best medicines for this condition but not the artery-reopening procedure.

But the researchers persisted, with heavy support from the National Heart, Lung, and Blood Institute. After four years of work examining 2,166 patients, they came to an unexpected conclusion….

17 Boston Globe, December 9, 2006

The Findings

18 Hochman JS. N Engl J Med 2006;355:2395-407

Page 12: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

7

CER at NIH…

19

What is Comparative Effectiveness Research?

CBO: “…a rigorous evaluation of the impact of different options that are available for treating a medical condition…

…may compare similar treatments, such as competing drugs-- or analyze different approaches…may focus on medical risks/benefits, or weigh costs.”

IOM: “The purpose is to assist consumers, clinicians, purchasers, and policy makers to make informed decisions that will improve health care at both the individual and population levels.”

20 Congressional Budget Office 2007; IOM Report 2009

Debate about Randomized Trials

“As currently designed and conducted, many RCTs are ill suited to meet the evidentiary needs implicit in the IOM definition of CER: comparison of interventions amongCER: comparison of interventions among patients in typical patient care settings with decisions tailored to individual patient needs. Without major changes…the nation risks spending large sums of money … to answer the wrong questions.”

21 Luce BR et al. Ann Intern Med 2009;151:206-209

Page 13: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

8

Pragmatic – Explanatory Continuum

Pragmatic• Broad eligibility• Flexible interventions• Typical practitioners

Explanatory• Narrow eligibility• Strict instructions• Expert practitionersyp p

• “Usual care” comparison• No follow-up visits• Objective clinical outcome• Usual compliance• Intent-to-treat

p p• Placebo/other comparator• Frequent follow-up visits• Surrogate outcomes• Close monitoring• ITT plus per protocol

22 Thorpe KE et al. J Clin Epidemiol 2009;62:464-75

PRECIS Tool

23 Thorpe KE et al. J Clin Epidemiol 2009;62:464-75

PRECIS Tool

24 Thorpe KE et al. J Clin Epidemiol 2009;62:464-75

Page 14: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

9

Final Thoughts to Initiate Our Dialogue

Why CER?True choicesVariable practices with similar outcomesAdoption without evidencepResearch to directly inform practice and policy

Practical trialsReal patientsReal settings Real discipline!Real outcomes

25

Page 15: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

4/27/2010

1

“As applied in the health care sector, an analysis of Comparative Effectiveness is simply a rigorous evaluation of the impact of different options that are available for treating a given medical condition for a particular set of patients. Such a study may compare similar treatments, such as competing drugs, or it may analyze very different approaches such as surgery and drug therapy The

Congressional Budget Office definition for CER

ASSENT 2010

different approaches, such as surgery and drug therapy. The analysis may focus only on the relative medical benefits and risks of each option, or it may also weigh both the costs and the benefits of those options. In some cases, a given treatment may prove to be more effective clinically or more cost‐effective for a broad range of patients, but frequently a key issue is determining which specific types of patients would benefit most from it.”

Why are we here? Projected Federal Spending on Medicare, as share of GDP

Can Comparative Effectiveness Research live up to it’s promise?

22

ASSENT 2010

Regional variations in spendingWhat have we learned? 

33

ASSENT 2010Peter Orszag, N Engl J Med, 2007

Page 16: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

4/27/2010

2

What do higher spending regions get?Lower Quality?

44

Resources – and Content of Care

30% more beds and MDs; 65% more 

specialists

Health Outcomes

Slightly higher mortality

Patient‐Perceived Quality

Worse access to primary care

Lower  overall rating

Physician’s Perceptions

Worse communication among physicians

Greater difficulty ensuring continuity

Trends Over Time

Regions with greater growth in resources 

have no better  gains in survival after AMI

ASSENT 2010

Worse technical quality

No more elective surgery

More hospital stays, visits, tests

No better function of medical care

Lower satisfaction with hospital care

Greater perception of scarcity

Lower satisfaction with career

(1) Fisher et al. Ann Intern Med: 2003; 138: 273-298 (2) Baicker et al. Health Affairs web exclusives, October 7, 2004(3) Fisher et al. Health Affairs, web exclusives, Nov 16, 2005(4) Skinner et al. Health Affairs web exclusives, Feb 7, 2006(5) Sirovich et al Ann Intern Med: 2006; 144: 641-649(6) Fowler et al. JAMA: 299: 2406-2412

Health Expenditures Unsustainable

Cost crisis has been inexorably linked to Comparative Effectiveness research (CER). – Assumption is that we are paying for items that are not as effective as less costly alternatives and 

f

ASSENT 2010

research will guide use of resources. – In fact the CER linked to cost is Cost‐ Effectiveness‐Research‐measuring the effect of an intervention in relation to the resources it consumes. Is it worth it? 

• Cochrane A. BMJ (1999) 319:652‐653

Efficacy and Effectiveness

• Efficacy research is generally thought of as looking at “what can work”; i.e., is a treatment safe and does it have a real effect?

• Effectiveness research, looks at what works “f h d d h t diti ”“for whom and under what conditions.”  Requires large numbers, “real world setting ”, clinically important outcomes, accounting for co‐morbidities .

• Efficacy research also generally involves comparison of an active agent to a placebo, whereas comparative effectiveness studies 

Page 17: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

4/27/2010

3

The trillion dollar questions

• How to collect evidence on value and then incorporate this evidence into decisions on coverage, reimbursement, and payment for healthcare services? 

• How to develop value‐based, cost effective 

ASSENT 2010

healthcare that is trusted and not perceived as only cost cutting for profit/balancing federal budget. 

• Lack of evidence is a real impediment to value‐based healthcare. Collecting this data will take time.  Policy decisions based on incomplete data is subject to serious negative consequences.  Half truth sometimes more dangerous than nothing. 

Major Gaps in CER

ASSENT 2010

Data Wars

Outcome DATA/Quality Cost

ASSENT 2010

Page 18: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

4/27/2010

4

Real Targets

• Overuse, Misuse and Underuse– Appropriate use requires evidence

• Danger is decision making in absence of outcome evidence.  Cost will drive decision making. Performance measure substitute for outcomes

ASSENT 2010

Performance measure substitute for outcomes.– Difficult to design a system without complete body of evidence and consensus on all of medicine.

• Difficult to accommodate a more expensive, more effective, new products in a system that is cost‐driven. True health outcome data is key (QALYs). 

National Institute for Health and Clinical Excellence (NICE)

• “An independent organization responsible for providing national guidance on promoting good health and preventing and treating ill health”

ASSENT 2010

health . 

• Secretary of State directs that NHS provide funds for medicines recommended by NICE appraisals usually within 3 months. 

• http://www.nice.org.uk/

National Institute for Health and Clinical Excellence

• Advisory to the British National Health System

• Produces “guidance” documents after consultation with practitioners and review of data primarily published data but also

ASSENT 2010

data, primarily published data but also pharma data.  Pharma submits economic evaluations.  Includes economic model‐cost/QALY

Page 19: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

4/27/2010

5

NIH Transforming medicine and NIH health through Comparative Effectiveness Research

Long and Continuing Tradition at NIH

Sample NIH CER Projects

• Drug versus drug

• Surgery versus medical

• Lifestyle versus medicaly

• Surgery versus surgery

• Screening versus usual care

• Observational analyses based on EHR

15

Page 20: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

4/27/2010

6

Types of CER investments and activities can be grouped into four major categories: 

Research (e.g., comparing medicines for a specific condition or discharge process A to discharge process B for readmissions) Human and Scientific Capital (e.g., training new researchers to conduct CER, developing CER methodology) CER Data Infrastructure (e g developing a distributedCER Data Infrastructure (e.g., developing a distributed practice‐based data network, longitudinal linked administrative or Electronic Health Record (EHR) databases, or patient registries) Dissemination and Translation of CER (e.g., building tools and methods to disseminate CER findings to clinicians and patients and translate CER into practice) 

Personalized medicine and patient subgroups

• Understanding the variability that underlies who benefits and who does not benefit f ifrom a given treatment. 

• Limit the number of persons treated with an intervention to which they don’t respond, but which best “on average”

Drug versus Drug: CATIE

Newer generation antipsychotics for schizophrenia (N=1493)

18

Lieberman JA, et al. N Engl J Med 2005;353:1209-23

Page 21: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

4/27/2010

7

NIH CER Coordinating Committee Charge

Provide advice to NIH Director on:

– Priorities for ARRA CER funds

– Implementation of CER rules and definitionsImplementation of CER rules and definitions

– Optimal collaboration with AHRQ and other agencies

– NIH CER portfolio analysis

– CER communication and dissemination

– Long‐term CER efforts19

NIH‐AHRQ CER WorkgroupCharge:

– Coordinate dialogue with AHRQ

– Report to NIH CER Coordinating Committee

AHRQ Members– Jean Slutsky

– Yen‐pin Chiang

– Lia Hotchkiss

NIH Members– Michael Lauer (NHLBI): Chair

– Richard Suzman (NIA) 

– Philip Wang (NIMH)

– Nancy Miller (OSP) 

– Lynn Hudson (OSP)

20

The NIH is Fully Committed to CER

Our goals:

• Work closely with our DHHS colleagues

• Involve our scientific community, practitioners, i d t li k IOM dconsumers, industry, policymakers, IOM and 

other stakeholders

• Demonstrate our value to the public

• Work fast while ensuring accountability, transparency, and accessibility

• Develop infrastructure platforms to enhance CER capacity21

Page 22: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

4/27/2010

8

Comparative Effectiveness Research at AHRQ

• Created in 2005, authorized by Section 1013 of the Medicare Prescription Drug, Improvement, and Modernization Act (MMA) of 2003

• AHRQ shall conduct and support research on:– “the outcomes, comparative clinical effectiveness, and appropriateness of health care items and services (including prescription drugs)”

• Goal: to provide patients, clinicians and policy makers with reliable, evidence‐based healthcare information

Center for Medicare and Medicaid Services

• Operates under regulations on coverage.– Require payment not be made for interventions that are not

“reasonable and necessary”

Default has been to provide coverage if there is no evidence of harm

ASSENT 2010

– Default has been to provide coverage if there is no evidence of harm. 

– Has  regulation to allow it in limited circumstances to cover only the “least costly alternative” for durable medical equipment and IV drugs. 

• Reimbursement based on RVU, not clinical benefit. 

• Would need new legal authority to apply CER principles to coverage decisions. 

• Fee for service model incentivizes “capacity” expansion. 

Coverage with Evidence Development

• Links payment to requirement for prospective data collection

• Intent is to guide clinical research to address

ASSENT 2010

Intent is to guide clinical research to address questions of interest to Medicare– Medicare must approve study design

• Goal to support evidence and innovation– Lower evidence threshold with commitment to generate better information later

24

Page 23: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

4/27/2010

9

Need for pragmatic trials. The Pragmatic Randomized Control Trial 

Workshop was held on March31st through April 2nd, 2008 at the University of Toronto Conference Centre.

• There is an unmet need for pragmatic clinical trials because of misaligned incentives: – Patients are interested in access, physicians do not like to admit uncertaintylike to admit uncertainty, 

– product manufacturers are only interested in meeting regulatoryrequirements, 

– not all decision makers are concerned with the same gaps in evidence. 

Pragmatic Clinical Trial Characteristics

• the design of a pragmatic trial reflects variations between patients that occur in real clinical practice and aims to inform choices between treatments. To ensure generalisability pragmatic trials should, so far as possible, represent the patients to whom the treatment will be applied. The need for purchasers and providers of health care to use evidence from trials in policy decisions has increased the focus on pragmatic trialsincreased the focus on pragmatic trials.

• Outcome measures differ between explanatory and pragmatic approaches. In explanatory trials intermediate outcomes are often used,which may relate to understanding the biological basis of theresponse to the treatment—for example, a reduction in bloodpressure. In pragmatic trials they should represent the full range of health gains—for example, a reduction in stroke and improvement in quality of life.

Pragmatic vs. Explanatory TrialsThe Pragmatic Randomized Control Trial Workshop was held on March

31st through April 2nd, 2008 at the University of Toronto Conference Centre.

• Pragmatic clinical trials are those prospective trials designed to address the evidence needs of healthcare decision makers (payers, patients clinicians and policy‐makers) At thepatients, clinicians, and policy makers). At the opposite end of the spectrum are explanatory trials which are designed to address scientific questions.

Page 24: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

4/27/2010

10

Assessing degree of “pragmaticism” Scott Tunis

Research Methods & ReportingImproving the reporting of pragmatic trials: an extension of 

the CONSORT statementBMJ 2008;337:a2390 Question Efficacy—can the intervention work? Effectiveness—does the intervention work

when used in normal practice?

Setting Well resourced, "ideal" setting Normal practice

Participants Highly selected. Poorly adherent

participants and those with conditions

which might dilute the effect are often

excluded

Little or no selection beyond the clinical

indication of interest

excluded

Intervention Strictly enforced and adherence is

monitored closely

Applied flexibly as it would be in normal

practice

Outcomes Often short term surrogates or process

measures

Directly relevant to participants, funders,

communities, and healthcare practitioners

Relevance to

practice

Indirect—little effort made to match design

of trial to decision making needs of those in

usual setting in which intervention will be

implemented

Direct—trial is designed to meet needs of

those making decisions about treatment

options in setting in which intervention will be

implemented

 

Efficacy vs. Effectiveness 

Participants Symptomatic patients stratified for carotid stenosis severity,. Exclusions included

mental incompetence and another illness likely to cause death within 5 years.

Patients also were temporarily ineligible if they had any of seven transient medical

conditions (eg, uncontrolled hypertension or diabetes)

Anyone aged 18-65 with non-specific low back pain of 4-52

weeks’ duration who were judged to be suitable by their

general practitioner. There were some exclusion criteria, eg

those with spinal disease

North American Symptomatic Carotid Endarterectomy Trial (NASCET)

Intervention Surgeons had to be approved by an expert panel, and were restricted to those who

had performed at least 50 carotid endarterectomies in the past 24 months with a

postoperative complication rate (stroke or death within 30 days) of less than 6%.

Acupuncturists determined the content and number of

treatments according to patients’ needs

Outcomes The primary outcome was time to ipsilateral stroke, the outcome most likely to be

affected by carotid endarterectomy. Secondary outcomes: all strokes, major strokes,

and mortality

Primary outcome was bodily pain as measured by SF-36.

Secondary outcomes included use of pain killers and patient

satisfaction

Relevance to

practice

Indirect—patients and clinicians are highly selected and it isn’t clear how widely

applicable the results are

Direct—general practitioners and patients can immediately

use the trial results in their decision making

 

Page 25: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

4/27/2010

11

Design of Studies of diagnostic tests.Scott, Greenberg, Poole. Int Med J. (2008) 38:120

• Estimates of precision around test accuracy are often lacking.

• The performance of test in practice may be misjudged (overestimated) by studies with too many patients  who are disease free or too many with disease. 

• Selection bias‐ if reluctance to enroll in a study due to invasive nature of the confirmatory “gold standard” test then likely will end with patientsthe confirmatory  gold standard  test, then likely will end with patients with high pre‐test probability of disease.  Overestimate accuracy of the test.  

• Verification bias‐preferential referral for verificatio of dx. Using more invasive gold std such that verificaiton is incomplete‐ lead to overestimation if those with low pretest probability are not verified. 

• Unblinded reading.

First documented CT?Palmer CR, Shahumyan H. Stat Med. 2007 Nov 30;26(27):4939‐57

• The Old Testament book of Daniel records the world’s first‐ever controlled intervention study. It had a clear purpose, a concurrent comparison group, a pre‐defined clinical endpoint and a useful conclusion, despite lacking a written protocol, treatment group blinding and having inadequate duration (10 days) and group sample sizes (four versus an unstated number):

• Then Daniel said to the steward whom the chief of the eunuchs had appointed over Daniel,Hananiah, Mishael, and Azaariah; ‘Test your servants for ten days; let us be given vegetables to eat and water to drink. Then let our appearance and the appearance of the youths who eat the kings rich food be observed by you, and according to what you see deal with your servants’. . . .At the end of ten days it was seen that they were better in appearance . . .than all the youths who ate the king’s rich food. So the steward took away their rich food and the wine they were to drink, and gave them vegetables. Daniel 1:11‐16.

First RCT in Medicine

• Medical Research Council Streptomycin in Tuberculosis Trials Committee. Streptomycin treatment for pulmonary tuberculosis. British Medical Journal 1948; ii:769–782.

• Based upon Fisher RA. The arrangement of field experiments. Journal of the Ministry of Agriculture Great BritainJournal of the Ministry of Agriculture, Great Britain 1926;33:503–513.– sample size calculation relying on specifying a supposed clinically 

relevant difference in treatment effects, and type I and type II error rates (with 0.05 set almost universally as the level of significance and 80–90 per cent as the power

Page 26: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

4/27/2010

12

The Bayesian Statistic: An approach fitted to the clinic.

Meyer, Vinzio Goichot. La Revue de Medicine Einterna; i Volume 30, Issue 3, March 2009, Pages 242‐249

• Bayes’ theorem on which this paradigm relies is frequently used by the clinicians. There is a direct link between the routine diagnostic test and the Bayesian statistic. This link is the Bayes’ theorem which allows one to compute positive and negative predictive values of a test. The principle of this theorem is extended to simple statistical situations as an introduction to Bayesian statistic. The conceptual simplicity of Bayesian statistic should make for a greater acceptance in the biomedical world.

Practical Bayesian adaptive randomisation in clinical trials.

Thall PF, Walthan JK, Eur J Cancer. 2007 Mar;43(5):859‐66. Epub 2007 Feb 16. 

• While randomisation is the established method for obtaining scientifically valid treatment comparisons in clinical trials, it sometimes is at odds with what physicians feel is good medical practice. If a physician favours one treatment over another based on personal experience or published data, it may be more appropriate ethically for that physician to use the favoured treatment, rather than enrolling patients on a randomised trial. Still, the randomised trial may later show the physician's favoured treatment to be inferior. This paper reviews a statistical method, Bayesian adaptive randomisation, that provides a practical compromise between the scientific ideal of conventional randomisation and choosing each patient's treatment based on a personal preference that may prove to be incorrect

Practical Bayesian Adaptive Randomization in Clinical Trials

Thall PF, Walthen JK. Eur J Cancer 2007 43:859

• An alternative statistical method for comparing treatments that provides a practical compromise between the scientific ideal of conventional randomization, which essentially bases treatment selection on a coin flip, and choosing the patient's treatment based on a personal preference that may turn out p p yto be wrong.  – The purpose of randomization is to ensure that whatever the unknown 

latent variables may be, on average their effects will be the same in the two treatment arms. 

– Comparing effects of two interventions tested in different trials is problematic because the inter trial differences are often larger than the treatment effect. 

Page 27: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

4/27/2010

13

Practical Bayesian Adaptive Randomization in Clinical Trials

Hall , Kyle Eur J Cancer 2007 43:859

• Frequentist – the unknown parameters are fixed 

• Bayesian‐ parameters are random

– Includes a prior probability distribution (prior(O)

– Computes a posterior probability distribution 

• Liklihood (data/Ο) x  prior(O)/ prob (data)• Uses data to convert a prior to a posterior

• Used repeatedly so that posterior at one stage used as prior for next stage

Bayesian adaptive randomizationHall , Kyle Eur J Cancer 2007 43:859

Bayesian Adaptive RandomizationHall , Kyle Eur J Cancer 2007 43:859

• Uses current data to compute randomization probability for each patient at time of enrollment.

• Helpful to simulate the trials behavior under f b lvariety of possibilities prior to start. 

• Involves a trade‐off between the variability in the estimator of the comparative treatment effect  for the ethical desirability of treating as few patients as possible with the inferior treatment.

Page 28: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

4/27/2010

14

Bayesian Adaptive Randomization

• Changing characteristics of the patients enrolled over time (drift) can be a major confounder.  Include known covariates in the modelmodel. 

Conventional vs. Bayesian Randomization.  A means to increase participation in clinical 

research?• “Ms. B, I have two possible treatments for your cancer, A and B, but I do not know 

which is better. So I would like to enroll you in a clinical trial aimed at comparing these treatments to each other. If you agree to enter the trial, your treatment will be chosen randomly by a computer, based on the data that we have so far on how well these two treatments have done with previous patients in the trial.

• Or• Or

• M s B , I have two possible treatments for your cancer, A and B, but I do not know which is better. So I would like to enroll you in a clinical trial aimed at comparing these treatments to each other. If you agree to enter the trial, your treatment will be chosen by flipping a coin.“ Hall , Kyle Eur J Cancer 2007 43:859

• The methodology is for applications having two or three treatments with binary responses, best suited to the case of outcomes occurring rapidly relative to the patient accrual rate and when the disease involved is serious or life threatening.Palmer CR, Shahumyan H. Stat Med. 2007 Nov 30;26(27):4939‐57

Implementing a decision‐theoretic design in clinical trials: why and how?

Palmer CR, Shahu myan H, Stat Med. 2007 Nov 30;26(27):4939‐57.

• Perhaps unsurprisingly, it takes fewer patients to come to a conclusion that treatment A is better than treatment B than it does to estimate the difference in their success rates with high precision. Pragmatists argue that it does not matter other things being equal if the success rate on Athat it does not matter, other things being equal, if the success rate on A outperforms B by 0.20 or 0.02, merely that B is not as effective as A.

• Objections to the subjective nature of prior distributions involved in a Bayesian framework can be alleviated, and in parallel a claim of robustness made,…by implementing a design that has desirable properties across a range of priors and that will not lead to unduly early stopping…. Appeal instead to a notion of ‘worstcase scenario’.

Page 29: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

4/27/2010

15

Implementing a decision‐theoretic design in clinical trials: why and how?

Palmer CR, Shahumyan H. Stat Med. 2007 Nov 30;26(27):4939‐57.• Clinical trial design as put into practice has not changed fundamentally 

since Fisher introduced randomization in the 1920s into agricultural field studies, nor since Bradford Hill translated this defining feature into medical research some two decades thereafter. However, a wealth of theory, much ethically motivated, has been developed subsequently, yet is not routinely implemented in today’s clinical research.y p y

• Medical progress is based on research which ultimately must rest in part on experimentation involving human subjects. . . . In medical research on human subjects, considerations related to the well‐being of the human subject should take precedence over the interests of science and society. 

Research on humans, then, is necessary, but matters of individual ethics are to be given higher priority  than matters of collective (or research) ethics,

The value of information and optimal clinical trial design 

Willan AR, Pinto EM.  Stat Med. 2005 Jun 30;24(12):1791‐806.. 

• Type I error, the probability of rejecting the null hypothesis of no difference when it is true, is most often set to 0.05, regardless of the cost of such an error. In addition, the traditional use of 0.2 for the type II error means that the money and effort spent on the trial will be wasted 20 per cent of the time, even when the true treatment difference is equal to the smallest clinically important one and, again, will not reflect the cost of y p , g ,making such an error.

• An effectiveness trial (otherwise known as a pragmatic trial or management trial) is essentially an effort to inform decision‐making, i.e. should treatment be adopted over standard and  sample size should be determined to maximize the difference between the cost of doing the trial and the value of the information gained from the results.

– Ex‐which of two wavelengths of light to treat macular degeneration vs. C section vs. vaginal delivery in breech babies.   If really both treatments the same then not much hazard in Type I error in one but big cost for the other. 

Cluster Randomized Trials:

• Examination of interventions at the level of the health service organizational unit or geography.

• Randomize groups as opposed to individuals• Intervention may be designed to be delivered to groups.g p

• Intervention requires health professionals to change behavior which would be difficult to prevent leak in a practice. 

• Intervention requires introduction of expensive equipment so economical to supply randomized sites.

• Attractive design to test effectiveness of intervention in practice setting (pragmatic)

Page 30: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

4/27/2010

16

Cluster Randomized Trials

• Randomization of individuals assumes outcomes are independent of one another.

• Cluster randomization characteristics of individuals in a group are more similar to each g pother than individuals in other groups. 

• Outcomes may differ due to environmental differences as well as the intervention.

• Need to account for clustering effects in calculating sample size and in data analysis. 

Cluster RCT

• Cluster RCT with reduced power due to increased chance of a Type 2 error.

• Intracalass coefficient (p)the proportion of the true total variation in outcome attributable to the difference between clusters. 

• Sample size  1+(m‐1)p, where m=average cluster size• Will always require larger sample size than a comparable non cluster RCT.

• Nailing down the covariates that account for between cluster variation decreases sample size. 

• Blinding may be difficult in cluster trials. 

Cluster RCT

• Ethical issues– Consent of individuals is difficult as randomizing groups, little leeway for individual choice.

– Consent obtained after cluster is randomized but post randomization selection bias an issue.

• Drop outs after randomization may be due to assignment (individuals or entire clusters)

– Decisions made by gatekeepers or caregivers.– How to ensure the interests of individual patients are met?

Page 31: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

4/27/2010

17

Shortfalls in currently published Cluster RCTs

• 42% of 149 cluster RCTs of implementation research did not account for the clustering in design.

• 42% of 54 studies of decision support systems• 42% of 54 studies of decision support systems accounted for clustering in analysis and none in their calculation of sample size. 

• Only 25% of 16 explained the rationale for choosing cluster design. 

CONSORT Recommendations for Cluster RCTsCampbell, M. K., Elbourne, D. R., & Altman, D. G. (2004). CONSORT statement: Extension to 

cluster randomised trials. British Medical Journal, 328, 702‐708. 

• Explain rationale for choosing cluster RCT• How clustering incorporated into design and sample size calculations.

• How effects of clustering were incorporated into the analysis. y– Cluster outcome vs. individual patient outcome

• Flow of clusters from randomization to analysis. – Method of randomizing the assignment

• Use of blocking, stratification, matching to minimize imbalance (potential for systematic bias)

• Was the sequence of randomizaton concealed until assignment was made?

The delayed‐start study design.D’Agostino R N Engl J Med. 2009 Sep 24;361(13):1304‐6. B 

Page 32: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

4/27/2010

1

Considerations in the Design of Clinical Trials for Comparative p

Effectiveness Research

Grant D. Huang, MPH, PhDVA Cooperative Studies Program

Society for Clinical Trials Pre-Conference WorkshopBaltimore, MDMay 16, 2010

OUTLINE

• Challenges facing clinical research

• Establishing and refining the question

• Methodological / operational considerations

The views, opinions, and content of this presentation are those of the speaker and do not necessarily reflect the views, opinions, or policies of the Department of Veterans Affairs.

Challenges in Clinical Research

• Burden of meeting regulatory requirements • Limited resources• Greater # of competing priorities • Rapidly changing environment

Scientific & technologicalScientific & technological• Contracting• Hiring trained personnel• More expectations for scientific process & product

RCTs take too long, cost too much, and don’t readily translate to the bedside.

Page 33: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

4/27/2010

2

The Problem

How does one conduct a clinical trial in comparative effectiveness research…that addresses an important clinical need

hi h ill b il d t d b ti t d…which will be easily understood by patients and clinicians

…and can be completed quickly?

Can a clinical trial on comparative effectiveness be done “simply”?

Comparative Effectiveness Research

• Several definitions for CER are available (IOM, 2009):Congressional Budget OfficeIOM Roundtable on Evidence-Based MedicineAmerican College of PhysiciansIOM Committee on Reviewing Evidence to Identify Highly Effective Clinical ServicesMedicare Payment Advisory CommissionMedicare Payment Advisory CommissionAgency for Healthcare Research and Quality

“Ultimately, CER aims to provide data that can influence clinical decisions for the better.”

- Initial National Priorities for CER – IOM, 2009

What is the Question?

• “Begin with the end in mind” Steven Covey, “7 Habits of Highly Effective People”

What is the product at the end of the study?

• Comparative vs. explanatory studyWhat is the comparison - Intervention, setting, population?“while we’re at it ”while we re at it…

• Informing clinicians/patients vs. informing policyA study aimed at changing policy may increase its complexity.Policy is often the result of several factors.

• Consider the answerThinking about what is to be disseminated may help frame the question.

Page 34: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

4/27/2010

3

Refining the Question

• IOM list of priorities provide some guidance on important areas of investigation.

• What question do clinicians wrestle with and why?

• Are there existing biases that have to be taken into account?

E.g. - cancer screening

• What are the interests of the stakeholder(s)?

Methodological Considerations

• DesignIs an RCT the best approach? Does a study have to be blinded?

• Patient population How do inclusion/exclusion criteria affectunderstanding of the comparison?Do co-morbid conditions impact intervention?Is managing risk factors important?

• “Cultural” factorsIs there clinical equipoise among clinicians?Do differences in interventions affect implementation?What are patient preferences?

• OutcomesWhat is meaningful for the end user?How often is follow-up needed? Justification?

Recruitment

Execution

Data collection

Translation

Page 35: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

4/27/2010

4

Operational Considerations

• Do protocol requirements increase time needed for contracting, hiring, getting approvals?

• Do protocol requirements potentially complicate IRB reviews, consent form language, etc.?, g g ,

E.g., variation in describing behavioral strategy vs. drug intervention vs surgical technique

• How much time will fulfilling GCP requirements, resolving data queries, adjudication, etc. add to the effort?

Conclusions

• CER has important potential to inform health care providers, recipients, and decision makers.

RCTs represent a major tool for providing strong evidence.

• Designing RCTs in CER should consider how the study will inform and be translated.

While attempting to keep things “simple”

• Methodological considerations in CER should give high priority to patients and clinician factors.

QUESTIONS?QUESTIONS?

Page 36: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

Statistical Considerations: RCT designed for Decision Makers

Overview: This segment will address key issues that the statistician must address in working with other researchers to design and implement CER designs. The session will specifically discuss subtle ways in which research questions and outcome measures may be distinct for CER clinical trials and discuss how extensions to traditional/classic subject-randomized designs may provide advantages for CER trials. Examples of such designs that will be discussed are cluster-randomized designs, Bayesian designs and adaptive designs. An outline of the presentation is provided below:

1. The statistician and CER Clinical Trials: Some Background Thoughts

a. Key Premise: The statistical aspects of the design of a CER RCT are not different than for any other SCT, but some of the issues differ subtly

b. Review the key elements of the IOM definition of CER from a statisticians perspective

c. Key elements in the statistical design of the RCT 2. What is the research question

a. Overall goal of the study b. Treatment measures (key issue for CER, fixed or variable) c. Population of interest (address the CER issue of population versus

individual risk) 3. Outcome measures (All of the normal issues plus the issue of patient-level or

population-level inference plus the IOM definition) 4. Where does SCT fit within the CER evidentiary profile? 5. Selecting the study design in the face of research question and outcomes

a. Limitations of the classic design b. Potential alternate study design approaches

6. Cluster-randomized designs a. Potential benefits for CER RCTs b. Key design issues c. Sample size calculations for cluster-randomized designs

7. Bayesian RCTs a. Potential benefits for CER RCTs b. Key design issues c. Sample size calculations for cluster-randomized designs

8. Adaptive designs for Arm trimming or addition a. Potential benefits for CER RCTs b. Key design issues c. Sample size calculations for cluster-randomized designs

9. Example study designs a. Cluster-randomized comparison of treatment for premature births b. Bayesian design for use of systematic review for study design

10. Final Comments

Page 37: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

1

Statistical Issues in the Design of RCTsfor Comparative Effectiveness Research

Dennis Wallace, PhDSociety for Clinical Trials Pre Conference Workshop

1

RTI International is a trade name of Research Triangle Institute

3040 Cornwallis Road ■ P.O. Box 12194 ■ Research Triangle Park, North Carolina, USA 27709

Society for Clinical Trials Pre-Conference WorkshopBaltimore, MDMay 16, 2010

Designing RCTs for CER: Key Premise

The statistical aspects of design of a randomized control trial (RCT) for comparative effectiveness research (CER) are fundamentally no different than those associated with any RCT, but the statistician

2

must be ready to address specific CER concerns

As always, the trial must be designed to answer the research questions of interest to the investigators within the available resource constraints and concerns for ethical treatment of research subjects.

The IOM Definition of CER from a Statistical Perspective

CER is the generation and synthesis of evidence that comparesthe benefits and harms of alternative methods to prevent,diagnose, treat, and monitor a clinical condition or to improve thedelivery of care. The purpose of CER is to assist consumers,clinicians, purchasers, and policy makers to make informeddecisions that will improve health care at both the individual and

3

decisions that will improve health care at both the individual andpopulation levels.

The key elements of this definition are the direct comparison ofeffective interventions, the study of patients in typical day-to-dayclinical care, and the aim of tailoring decisions to the needs ofindividual patients.

Page 38: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

2

Anatomy of a Study Design

Select the Study Design

Who are the Study

Define the Research Questionsand Hypotheses

4

Subjects?

What measurements do we need? OutcomesPredictors or

Confounders

How will we analyze the data?

How many subjects do we need (sample size)?

What is the Research Question?

What are the broad goals of the study in terms of clinical understanding and scientific inference?

What treatment regimens are to be evaluated and how will those treatment regimens be implemented for

5

how will those treatment regimens be implemented for this particular trial?

What clinical population is the target of the inference for this trial?

Study Inference Goals

In designing any randomized clinical trial, we must establish the framework in which the information generated from the study will be used to inform clinical understanding.

In most classical designs, the clinical understanding is typically f d t ti ti ll ti ti bl f l

6

framed statistically as an estimation problem or a formal hypothesis testing problem.

In designs for CER, clinical understanding may fall within the framework of evidence

Patient-specific decisions?

Evidence-based guidance?

Page 39: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

3

Study Inference Goals—CER Evidence

Levels of evidence

1++ High quality meta-analyses, systematic reviews of RCTs, or RCTs with a very low risk of bias

1+ Well conducted meta-analyses, systematic reviews of RCTs, or RCTs with a low risk of bias

1- Meta-analyses, systematic reviews or RCTs, or RCTs with a high risk of bias

2 Hi h lit t ti i f t l h t t di Hi h

7

2++ High quality systematic reviews of case-control or cohort studies or High quality case-control or cohort studies with a very low risk of confounding, bias, or chance and a high probability that the relationship is causal

2+ Well conducted case-control or cohort studies with a low risk of confounding, bias, or chance and a moderate probability that the relationship is causal

2- Case-control or cohort studies with a high risk of confounding, bias, or chance and a significant risk that the relationship is not causal

3 Non-analytic studies, eg case reports, case series

4 Expert opinionHarbour R, Miller J. A new system for grading recommendations in evidence based guidelines. BMJ 2001; 323: 334–36.

Treatment Regimens

“Treatment” may be defined more ambiguously in CER RCT’sClassicly the treatment regimen is very strictly defined through the protocol, with clear documentation of protocol deviations in the CRFs;For the CER goal of “study of patients in typical day to day

8

For the CER goal of study of patients in typical day-to-day clinical care” regimens may adapted or modified to address patient comorbidities or potential drug-drug interactions.Study designs should clearly specify how regimens are to be implemented or adapted and addressed in analyses.

Provisions may be needed for dropping or adding treatment regimens as evidence evolves or becomes available from external sources as the trial progresses.

Population of Inference

Because a primary goal of CER is the study of patients in typical day-to-day clinical care, the populations included in CER-driven trials are likely to be more diverse than classic trials.

Provides greater generalizabilityMay create concerns of selection bias if inference population doesn’t match enrolled population

9

match enrolled population

Careful consideration will need to be given to the potentially conflicting goals of improving health care at both the population level and tailoring decisions to the needs of individual patients

With the CER focus on decisions that meet the need of individual patients, subgroup analyses will be an issue:

Key subgroups of interest should be defined a prioriStrategies for analysis re post-hoc defined subgroups, particularly those defined by post-randomization factors should be addressed

Page 40: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

4

What are the Study Outcome Measures?

Luce in discussing pragmatic trials and CER notes: “Primary and secondary outcomes are patient-centered, chosen to reflect what matters most to patients and clinicians.”

While focus on such items is laudable, this goal coupled with the goal of studying a diverse patient population in large simple trials will yield a number of study design issues for statisticians and clinicians:

10

y gIn “patients with common comorbid conditions and diverse demographic characteristics” key patient-centered outcomes may differ across subpopulations of patients (composite outcome?)For any given individual both the patient and clinician will be concerned with multiple measures of clinical effectiveness and with safety in the face of comorbidities (multiplicity)Patient-centered outcomes are likely to involve quality of life measures, which may vary substantially with comorbidities and demography; these measures are historically difficult to standardize.

Study Design

Goals of the study design are:Yield clinically relevant estimates of treatment effects and the precision of these estimatesMinimize bias in both effect estimates and inferenceProvide a high level of reproducibility and external validity

11

“Getting the correct design for the question being posed is critical since no amount of statistical analysis can adjust for an inadequate or inappropriate design” (Cook and Demets, 2008)

Furthermore, to facilitate both design and interpretation, the trial should be designed as simply as possible given the complexities of the research question, the treatment algorithm, the outcomes of interest and their measurement, and the inferential requirements. (Adapted from Piantadosi, 1997)

Historic Limitations of RCTs for CER

Narrowly defined inclusion/exclusion criteria limit the generalizability of the results to a broad clinical population.

Outcome measures selected are not those of greatest interest to patients and clinicians

Trials tend to be expensive and long

12

Trials tend to be expensive and longFar too much data collected on parameters of limited valueNarrow criteria complicate recruitmentDealing with subgroup differences and associated multiplicity issues increases sample size requirements

Trials do not use external information efficiently for mid-trial decisions

Page 41: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

5

Possible Alternative Approaches for CER RCT Design

Cluster randomized trials

Bayesian approaches for randomized clinical trial design and analysis

13

Adaptive designs for trimming or addition of treatment regimens

Cluster-Randomized Trials: Potential Benefits for CER RCTs

Randomization at the practice-level rather than the individual level generally provides a more heterogeneous patient population

Cluster-randomized trials typically enroll larger

14

yp y gnumbers of subjects more quickly than individually randomized trials

Potential for simplified data collection in that substantial portions of data can be collected from practice-level records rather than from individual patient visits.

Cluster-Randomized Trials: Key Design Issues

Unit of InferencePopulation-average effects based on use of the cluster (practice) as the unit of analysis and inferencePatient-level inference with the individual subject as the unit of analysis and inference

15

Selection of the cluster-level randomization schemeSimple, completely randomized designStratified randomizationConstrained allocation

Accounting for within-cluster correlation of subjects in design and analysis

Page 42: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

6

Cluster-Randomized Trials: Alternative Analytic Approaches

Population-level inference with cluster as the unit of analysis

Because clusters are independent, use any standard approach like ANOVA or t-tests, possibly accounting for heterogeneity of variance across clusters

16

Permutation or randomization tests (possibly accounting for individual differences)

Patient-level inference with subject as the unit of analysis (accounting for correlation among subjects)

Linear mixed models for continuous (normal) outcomes

Random coefficient models for categorical outcomes

Cluster-Randomized Trials: What is Intraclass Correlation and How Do I Determine it?

The intraclass or intracluster correlation coefficient (ICC) is simply a measure of the “commonness” or correlation among individuals within the same cluster that makes those individuals have outcomes that are more similar than the outcomes for individuals in different clusters.

For continuous (normal) outcomes the ICC is simply the ratio of

17

For continuous (normal) outcomes, the ICC is simply the ratio of the between cluster variance to the total sum of between and within cluster variance

For binary (binomial) and count (Poisson) outcomes, the ICC can also be viewed in a similar fashion, but it can also be viewed as a function of the between cluster variability in event rates that exceeds that expected by chance due to the underlying binomial or Poisson distribution

Cluster-Randomized Trials: Sample Size

As with any clinical trial, selection of sample size is a function of the statistical properties of the trial and the resources available for analysis

For trials in which the cluster is the unit of analysis, sample size b t d i th l h f i di id ll

18

can be generated using the usual approaches for individually randomized trials, with the cluster considered to be the “individual”

For permutation or randomization-based analysis approaches, simulated power is typically the best approach

For large-sample based analytic approaches, simply use your favorite power calculation package

Page 43: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

7

Cluster-Randomized Trials: Sample Size

For trials in which the patient or subject is the unit of analysis, sample size can be generated using either simulation techniques or standard calculation formulae

Standard formulas are presented for a variety of outcome i th D d Kl f (2000) D i d

19

measures in the Donner and Klar reference (2000),: Design and Analysis of Cluster Randomization Trials in Health Research.

Generally the approaches in this reference simply adapt standard approaches to account for the intraclass correlation.

Bayesian Approaches: Why consider it for CER?

Frequentist (or classical) methods assume that unknown parameters (e.g. risk of mortality in the population) are fixed constants; hence you cannot make probability statements about parameters because they are fixed.

Bayesian methods treat parameters as random variables and define probability based on our understanding of how likely an

20

define probability based on our understanding of how likely an event is to be true; as such we can make probability statements about parameters.

By making probability statements about the current state of knowledge (prior probabilities), “Bayesian statistics” provides an approach and method of analysis which combine prior knowledge and accumulated experience with current information (likelihood) in order to make inferences about a quantity of interest (posterior probabilities).

Bayesian Approaches: What are some possible advantages for CER research?

If good prior information is available related to a question, Bayesian approaches provide a formal calculus for integrating that information with new data.

Specifying prior probability distributions requires collaborators to thi k f ll d k li it d i i b t h t i l d

21

think carefully and make explicit decisions about how to include prior information.

Bayesian approaches have some advantages for interim monitoring of clinical trials in the sense of handling accumulating information in a formal way and for dealing with subgroup analysis without having to deal with the issue of multiplicity.

Page 44: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

8

Bayesian Approaches: What issues will need to be addressed in designing CER RCTs?

Both the design of the trial and the conclusions of the trial depend upon the prior distributions. Consequently, statisticians and clinicians will need to work together closely to reach a consensus of how priors will be defined

Because the conclusions of the trial rely on prior information and

22

Because the conclusions of the trial rely on prior information and decisions about that information can be somewhat subjective, standard approaches for presenting results have not been established; during the design phase, the team will need to determine how study results are to be presented.

Software for the methods is less developed and less user friendly than for classic methods, so added expertise may be needed.

Care must be taken to not be overly optimistic during study design.

Bayesian Approaches: Steps in Designing a Trial from a Bayesian Perspective

As with the classic designDefine the key outcome measures for the studyDefine key subgroups that will be included a priori in the analyses and establish the outcome measures for each subgroup

F h k t d h b l ti f

23

For each key outcome measure and each subpopulation of interest, define the prior distribution of the parameters:

Typically for CER, prior distributions will be based on strong evidence provided via systematic reviewsDefine both optimistic and pessimistic alternatives to the prior distributions

Bayesian Approaches: Steps in Designing a Trial from a Bayesian Perspective

Establish the Bayesian evidentiary criteria that will be used to compare effectiveness of the alternatives at the end of the trial.

Utilize either simulation procedures or standard calculation methods based on conjugate priors to estimate sample sizes based on the full range of priors established above.

24

based on the full range of priors established above.

Page 45: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

9

RCTs for CER: Statistical Approaches

Questions

25

Page 46: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

Case Study #1

Compare the effectiveness of traditional risk stratification for coronary heart disease (CHD) and noninvasive imaging (using coronary artery calcium, carotid intima media thickness, and other approaches) on CHD outcomes

Coronary heart disease is the leading cause of death in the United States. In nearly half of case, the first clinical manifestation is death or a life-threatening heart attack. Decades of pathological and imaging research has revealed that the disease often goes through a prolonged asymptomatic phase, one that often last for decades.

During the past 10-15 years, it has become possible to use CT scans to take pictures of coronary arteries and look for deposits of calcium, which serve as indirect measures of disease. Several large-scale studies have shown that coronary artery calcium measures are strong predictors of future heart attacks and death in otherwise asymptomatic people. Some advocates call for routine screening of the population with coronary calcium scanning. Others, however, argue that we’ve seen previous cases where a strongly predictive test failed to improve outcomes when used for screening (e.g. PSA for prostate cancer, urinary catecholamines for neuroblastoma).

By today’s standard of care, adults at potential risk for coronary disease are assessed for risk based on “traditional risk factors,” namely age, gender, smoking, blood pressure, and cholesterol. The IOM calls for CER the compares the effectiveness of traditional risk stratification with noninvasive imaging (e.g. coronary calcium) as ways to prevent premature deaths and heart attacks.

What kind of study should be done?

A recent article illustrating the problem is attached (Bonow).

I am also attaching an article that discusses the appropriateness of randomized trials that focus on diagnostic tests (Lord).

Page 47: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

Case study #2: Compare the effectiveness of primary prevention methods, such as exercise and balance training, versus clinical treatments in preventing falls in older adults at varying degrees of risk. Background: In the U.S., over one-third of adults 65 years and older have had a fall. Falls are the leading cause of injury-related deaths, hospital admissions for trauma, and non-fatal injuries in older adults (CDC, 2010). In addition to injuries, falls are associated with disability, reduced mobility and physical fitness, and greater fears for falling. There are also significant costs associated with treating, caring for, and preventing falls and fall-related outcomes. Several risk factors for falls have been identified. Having had a previous fall, balance impairment (as determined by gait and strength), visual impairment, and taking multiple medications are among the most commonly identified risk factors. Medical conditions associated with higher risks for falls include depression, diabetes, arthritis, pain, and urinary incontinence. Women are more likely to have a non-fatal injury from a fall, while men are more likely to die from a fall. Efforts aimed at preventing falls have included both behavioral and pharmacological-based interventions. Exercise that combines gait, strength, balance, and endurance training is a commonly used approach. More recently, Tai Chi has become popular as a prevention method. It is not clear whether physical therapy or other structured therapies are beneficial. Vitamin D has been found to help with fracture prevention. However, most other clinical interventions have focused on managing medications that are associated with risk factors such as antihypertensive agents, anticoagulants, and antidepressants. A recent meta-analysis (Woolcott et al., 2009) found that sedative, hypnotics, antidepressants, and benzodiazepines were associated with a higher risk for falls among the elderly.

Page 48: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

Case Study #3: Compare the effectiveness of various screening, prophylaxis, and treatment interventions in eradicating methicillin resistant Staphylococcus aureus (MRSA) in communities, institutions, and hospitals. Since its emergence in the early sixties, Methicillin-resistant Staphylococcus aureus (MRSA) was recognized to be associated with some predisposing factors. In 2005, around 94,000 invasive MRSA infections were estimated to occur in the USA, causing about 18,500 deaths. Traditionally confined to medical institutions, a growing number of community-acquired (CA-MRSA) cases are being diagnosed. Among the different forms of pathologic presentations, MRSA sepsis and pneumonia can be lethal. Only few costly antibiotics are currently efficient in treating these infections. It remains unclear, however, whether MRSA will expand the scope of their resistance thus causing further severe infections and posing growing challenges to the medical community. Preventive measures Infection control has multiple important elements, such as early screening, identification of MRSA carries for isolation, nasal and skin decontamination, staff education, enforcement of hand hygiene and decontamination of patients' wards. Besides nasal and cutaneous swabs, throat and rectal areas are considered for routine swabbing. While these aggressive measures and strict guidelines can improve the efficacy of hospital bed usage, they clearly cannot completely eradicate this resistant organism. It seems that MRSA is slowly gaining this battle by acquiring more territories and even using medical staff and their instruments, such as faucets, computer keyboards and stethoscopes, for further expansion. What study would you propose to prevent MRSA? Please consider the elements described in the workshop.

Page 49: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

Case Study #4. : Compare the effectiveness of pharmacologic and non-pharmacologic treatments in managing behavioral disorders in people with Alzheimer’s disease and other dementias in home and institutional settings.

Alzheimers disease is a major health burden and is expected to worsen as the population ages. Behavioral disturbances are common problems that affect the management of person with Alzheimers disease with estimation that 60-80% of persons with AD will have psychological or behavioral disturbances during the course of the disease, more so in the stages of severe dementia. Aggressive behavior is especially problematic as it can lead to self harm, physical, and more often psychic harm to caregiver, transfer to a more restrictive environment, social isolation, etc. It is sometimes associated with paranoid delusions and hallucinations. The use of antipsychotics, hypnotics, and anticonvulsants to decrease aggressive behavior in AD has been studied to some limited extent but use of these agents remains controversial. Behavioral based treatment protocols are also touted as helpful in cohort studies and small RCTs at single institutions. Larger studies show some benefits of atypical narcoleptics in decreasing aggression but significant side effects – tardive dyskinesia, cardiovascular morbidity and mortality, sedation-related side effects such as aspiration, dehydration, depression, etc. Some side effects are dose related. Choices include treatment with medium to high doses of drugs at time of an aggressive event. Others include daily administration of lower doses in attempt to prevent aggressive events.

Scales available include aggression scale, BehaveAD scale, counting aggressive events, SAEs, quality of life scales for AD filled out by caregivers.

Troubles to consider

- caregiver reluctance in withholding medications in aggressive patients because of fear that things will get worse and come to crisis.

Most patients who have already demonstrated aggressive behavior are on drugs so need to consider wash out in a drug trial with its associated potential behavioral consequences.

Ethics of randomization in frail, severely demented, elderly persons who are unable to give consent.

Placebo effect which may be related to the enhanced human-interactions with patients in a study as compared to persons with same stage of disease who are not in the study.

A) You are retained by a health organization which owns and runs 100 skilled nursing facilities across the country with 100 AD patients each and their data suggests half of these have intermittent aggressive behavior. They have a $2million research budget and would like to know whether a neuroleptic (risperidone) should be prescribed for their AD patients (subgroup unspecified) alone, in place of, or in addition to, a behavioral program that they already have in place. What type of trials would you consider proposing?

Page 50: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

B) A Geriatrician consults you about an NIH grant idea. They wish to do a multi-center trial in the Alzheimers Disease Research Consortium. The Consortium has 10,000 AD patients in early and mid stages of the disease. Their data suggests that 10% of the patients have intermittent behavioral disorders. They wish to study whether daily use of an anticonvulsant, topiramate, is superior on some clinically meaningful metric, to counseling caregivers on behavior management. What type of trials would you consider proposing?

See attached Cochrane review for further background.

Page 51: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

 current as of March 4, 2010. Online article and related content 

  http://jama.ama-assn.org/cgi/content/full/290/12/1624

 . 2003;290(12):1624-1632 (doi:10.1001/jama.290.12.1624) JAMA

 Sean R. Tunis; Daniel B. Stryer; Carolyn M. Clancy  

PolicyResearch for Decision Making in Clinical and Health Practical Clinical Trials: Increasing the Value of Clinical

Correction Contact me if this article is corrected.

Citations Contact me when this article is cited. This article has been cited 361 times.

Topic collections

Contact me when new articles are published in these topic areas.Statistics and Research Methods; Randomized Controlled Trial Medical Practice; Health Policy; Quality of Care; Evidence-Based Medicine;

Related Letters

. 2004;291(4):426.JAMASean R. Tunis et al. In Reply: 

. 2004;291(4):425.JAMAJoshua J. Ofman et al. Realizing the Benefits of Practical Clinical Trials

http://pubs.ama-assn.org/misc/[email protected] 

http://jama.com/subscribeSubscribe

[email protected]/E-prints 

http://jamaarchives.com/alertsEmail Alerts

at National Institutes of Health on March 4, 2010 www.jama.comDownloaded from

Page 52: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

SPECIAL COMMUNICATION

Practical Clinical TrialsIncreasing the Value of Clinical Researchfor Decision Making in Clinical and Health PolicySean R. Tunis, MD, MScDaniel B. Stryer, MDCarolyn M. Clancy, MD

THE NEED FOR CAREFUL SCIEN-tific evaluation of clinical prac-tice became a prominent fo-cus during the second half of

the 20th century. Demonstration of per-vasive and persistent unexplained vari-ability in clinical practice1 and high ratesof inappropriate care,2 combined withincreased expenditures, have fueled asteadily increasing demand for evi-dence of clinical effectiveness. The lim-ited amount of high-quality evidence isrecognized to be partly responsible forgeographic variation, inappropriatecare, and the limited success of qual-ity improvement efforts.3,4 As a result,increased attention is being directed tothe development of methods that canprovide valid and reliable informationabout what works best in health care.

THE NEED FOR HIGH-QUALITYEVIDENCEAmong theprimaryaudiences forhigher-quality evidence are clinical and healthpolicy decision makers, including pa-tients, physicians, payers, purchasers,health care administrators, and publichealth policymakers. Patients and phy-sicians increasingly seek to combine theirpersonal beliefs about health care choiceswith attention to high-quality evidencein making individual decisions aboutcare. Medical professional societies pro-duce guidelines to assist physicians and

patients in making medical decisions.5

Health insurers and managed care orga-nizations increasingly depend on sys-tematic reviews and technology assess-ments to support quality improvementefforts and to develop coverage and pay-ment policy.6,7 Hospitals and health sys-

Author Affiliations: Centers for Medicare & Medic-aid Services, Baltimore, Md (Dr Tunis); and Agencyfor Healthcare Research and Quality, Rockville, Md(Drs Stryer and Clancy).Corresponding Author and Reprints: Sean R. Tunis,MD, MSc, Office of Clinical Standards and Quality,Centers for Medicare & Medicaid Services, 7500 Se-curity Blvd, Mailstop S3-02-01, Baltimore, MD 21244(e-mail: [email protected]).

Decision makers in health care are increasingly interested in using high-quality scientific evidence to support clinical and health policy choices; how-ever, the quality of available scientific evidence is often found to be inad-equate. Reliable evidence is essential to improve health care quality and tosupport efficient use of limited resources. The widespread gaps in evidence-based knowledge suggest that systematic flaws exist in the production ofscientific evidence, in part because there is no consistent effort to conductclinical trials designed to meet the needs of decision makers. Clinical trialsfor which the hypothesis and study design are developed specifically to an-swer the questions faced by decision makers are called pragmatic or prac-tical clinical trials (PCTs). The characteristic features of PCTs are that they(1) select clinically relevant alternative interventions to compare, (2) in-clude a diverse population of study participants, (3) recruit participants fromheterogeneous practice settings, and (4) collect data on a broad range of healthoutcomes. The supply of PCTs is limited primarily because the major fun-ders of clinical research, the National Institutes of Health and the medicalproducts industry, do not focus on supporting such trials. Increasing the sup-ply of PCTs will depend on the development of a mechanism to establishpriorities for these studies, significant expansion of an infrastructure to con-duct clinical research within the health care delivery system, more relianceon high-quality evidence by health care decision makers, and a substantialincrease in public and private funding for these studies. For these changesto occur, clinical and health policy decision makers will need to become moreinvolved in all aspects of clinical research, including priority setting, infra-structure development, and funding.JAMA. 2003;290:1624-1632 www.jama.com

1624 JAMA, September 24, 2003—Vol 290, No. 12 (Reprinted) ©2003 American Medical Association. All rights reserved.

at National Institutes of Health on March 4, 2010 www.jama.comDownloaded from

Page 53: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

tems use these documents to make de-cisions about capital investments such asequipment purchasing and construc-tion (BOX).

The current clinical research enter-prise in the United States is not con-sistently producing an adequate sup-ply of information to meet the needs ofclinical and health policy decision mak-ers.8-10 The inability to address manycommon, important clinical ques-tions, despite a significant increase inpublic and private funding for clinicalresearch, suggests a systemic problemin the production of clinical research.This article explains the impact ofknowledge gaps on health care deci-sion makers, describes the features ofclinical trials that would more reliablyanswer the practical questions they face,and discusses why the current clinicalresearch enterprise fails to address manyimportant practical questions. This ar-ticle also proposes strategies to ad-dress the current shortage of clinicaltrials to meet these needs.

PREVALENCE AND IMPACTOF KNOWLEDGE GAPSThe prevalence and significance of gapsin knowledge about clinical effective-ness are most readily appreciated by re-viewing the results of most systematicliterature reviews, technology assess-ments, and clinical practice guide-lines. These reports are generally pro-duced to provide comprehensivereliable information for decision mak-ers and usually address common con-ditions with large aggregate cost, mor-bidity, and public health importance.A consistent finding of these reviews isthat the quality of evidence available toanswer the critical questions identi-fied by experts is suboptimal. For ex-ample, a systematic review of newerpharmacologic agents for depressionconcludes that few studies provideddata on the long-term effectiveness oftreatment, the functional status of pa-tients, or the outcomes of patientstreated in typical practice settings.11

Furthermore, few studies compared theolder inexpensive agents with neweragents in terms of adverse effects and

clinical efficacy. Most well-done sys-tematic reviews and clinical guide-lines reach similar conclusions aboutthe quality of evidence associated withcommon clinical problems.

These gaps in evidence undermine ef-forts to improve the scientific basis ofhealth care decisions in several ways. Or-ganizations that develop evidence-based clinical practice guidelines maynot be able to develop clear, specific rec-ommendations.12 For example, the back-ground report for clinical guidelines onoutpatient management of exacerba-tions of chronic obstructive pulmo-nary disease found that although nu-merous industry-sponsored clinical trialsreported minor differences in the anti-microbial activity of alternative broad-spectrum antibiotics, no trials had beenperformed to determine whether any ofthe newer broad-spectrum antibioticswere better than older generic antibiot-ics or even placebo (for mild exacerba-tion).13,14 As a result, the guideline couldnot provide definitive recommenda-tions on the appropriate choice of anti-

biotics for chronic obstructive pulmo-nary disease exacerbations.

The limited quantity and quality ofavailable scientific information also im-pede the efforts of public and privatehealth insurers in developing evidence-based coverage policies for many newand existing technologies.7,15 Poor-quality studies of new technologies canlead to millions of dollars being allo-cated for new technologies for which thelong-term benefits and risks have notbeen determined.4 Minimally invasivetechnologies for treatment of benignprostatic hyperplasia (BPH) are in wide-spread use, yet no clinical trials have beenperformed to compare the risks and ben-efits of these treatments with standardsurgical interventions.16 The Medicareprogram has spent millions of dollars peryear for home use of special beds for pa-tients with pressure ulcers, despite thefact that no well-designed study dem-onstrates that they improve healing ofthese ulcers.17 The limited production ofthis body of research becomes increas-ingly problematic as major increases in

Box. Uses of Evidence in Decision Making

Physician/Patient Decision MakingOf existing diagnostic or treatment alternatives, which makes the most sense foran individual patient?

Choosing Plans or PhysiciansWhich plan or physician is likely to provide high-quality care?

Practice GuidelinesWhat is the best approach for patients with selected conditions?

Quality Measurement and ImprovementHow can evidence-based clinical performance be assessed? Do improvement pro-grams result in enhanced clinical care?

Product Purchasing and Formulary SelectionHow does this product compare with existing alternatives?

Benefit and Coverage DecisionsShould a new service be reimbursed and for which patients?

Organizational and Management DecisionsDoes a hospitalist program decrease costs and improve outcomes?

Program Financing and Priority SettingWhich services represent the best value for additional investments?

Product ApprovalShould this product be approved and, if so, for which indications?

CLINICAL RESEARCH FOR DECISION MAKING

©2003 American Medical Association. All rights reserved. (Reprinted) JAMA, September 24, 2003—Vol 290, No. 12 1625

at National Institutes of Health on March 4, 2010 www.jama.comDownloaded from

Page 54: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

public funding for basic research gener-ate an expanding range of potentiallyvaluable technologies in need of carefulobjective evaluation.

DESIGNING CLINICALRESEARCH FORDECISION MAKERSHealth services research and outcomesresearch have made important contri-butions toward the effective translationof clinical research discoveries to clini-cal practice and health policy.18-20 How-ever, observational and other nonexperi-mental methods may not providesufficiently robust information regard-ing the comparative effectiveness of al-ternative clinical interventions, primar-ily because of their high susceptibility toselection bias and confounding.21,22 Therecent findings demonstrating the ad-verse effects and lack of effectiveness ofhormone therapy despite consistent find-ings of benefit from observational stud-ies highlight the limitations of nonex-perimental studies.23

Clinical trials designed to assist healthcare decision makers, referred to as prag-matic clinical trials or practical clinicaltrials (PCTs), are defined as trials forwhich the hypothesis and study designare formulated based on informationneededtomakeadecision.24 Theyaredis-tinguishedfromexplanatoryclinical trials,for which the goal is to better under-stand how and why an interventionworks.Explanatory trials aredesigned tomaximize the chance that some biologi-cal effect of a new treatment will berevealed by the study. The PCTs addresspractical questions about the risks, ben-efits, and costs of an intervention as theywould occur in routine clinical prac-tice.25 The most distinctive features ofPCTs are that they select clinically rel-evant interventions to compare, includea diverse population of study partici-pants, recruit participants from a vari-ety of practice settings, and collect dataon a broad range of health outcomes.

Compare Clinically RelevantAlternativesOften PCTs are designed as head-to-head comparisons of viable alternative

clinical strategies. These comparativestudies have the potential to alter clini-cal decisions profoundly, because theyare derived from practical choices fac-ing patients and their physicians. In con-trast, many of the clinical studies com-paring individual agents with placebomay not provide a sound basis for choos-ing from among acceptable alternatives.

For example, a PCT conducted by theDepartment of Veterans Affairs (VA)compared the benefits of pharmaco-logic therapy with terazosin hydrochlo-ride, finasteride, or both for treatment ofsymptoms of BPH.26 Both drugs are ap-proved by the Food and Drug Adminis-tration (FDA) for this use based on trialscomparing each drug with a placebo.There is no incentive or requirement formanufacturers to initiate studies to com-pare the products. The VA study26 ran-domized 1229 men with BPH and dem-onstrated that terazosin was moreeffective than finasteride for reducingBPH symptoms at 1 year. Once initi-ated, both manufacturers contributed tothe design and funding for the study. TheNational Institutes of Health (NIH)funded a study that compared doxazo-sin mesylate, finasteride, and both drugsin 3047 patients with BPH for a mean of4.5 years. This study revealed that thecombination of the 2 drugs was signifi-cantly more effective at delaying pro-gression of symptoms than either drugalone.27

In addition, PCTs can address non-pharmacologicalternatives.Despitebackpain being among the most prevalentcomplaints in primary care, few high-quality studies have been performed tocompare results of the numerous exist-ing treatment alternatives. A PCT oftherapy for recent-onset low back painrandomized 323 patients to 1 of 3 widelyused alternative adjunct treatment strat-egies:physical therapy,chiropracticcare,and self-care (the principles of which aredescribed in an educational booklet).28

The study showed that physical therapyand chiropractic care increased patientsatisfaction and marginally reducedsymptoms compared with the self-careprinciples outlined in the booklet; how-ever, there were no differences between

the 3 study groups in function or ratesof recurrence. The educational bookletthat outlined self-care was substantiallylessexpensive.28 Astudyofsimilardesigndemonstrated that therapeutic massagewas more effective and less costly thanacupuncture in treating low back pain.29

Enroll a Diverse Study PopulationTypically, PCTs include a more diversestudy population by having broad inclu-sion criteria and fewer exclusion crite-ria when enrolling patients. The goal isto enroll patients in the trial with char-acteristics that reflect the range and dis-tribution of patients observed in clini-cal practice for a particular problem. Thisapproach addresses the common con-cern of decision makers about the ap-plicability of results from studies with re-stricted eligibility criteria.30,31 It can alsoensure that the higher-risk patients likelyto have the greatest benefit from sometreatment are not excluded from clini-cal trials.32

Most clinical trials of antidepres-sants have excluded elderly patients andfocus on patients with major depres-sion rather than patients with less se-vere depression, which is a more preva-lent problem. Williams et al33 enrolleda substantial number of elderly pa-tients in a PCT of 415 patients withminor depression or dysthymia, com-paring the effectiveness of oral antide-pressants, problem-solving treatment (aform of cognitive therapy), and pla-cebo. The study found that treatmentwith paroxetine showed greater im-provement in depressive symptoms andmental health function compared withproblem-solving treatment or placebo.Reliable and relevant information suchas this is more likely to convince phy-sicians to provide medication for theirelderly patients with this common causeof significant reversible morbidity.

Because physicians must often treatpatients based on the likely rather thanconfirmed diagnosis, studies that en-roll patients based on presenting symp-toms rather than definitive test resultsmay be of great practical value. A PCTof patients with sinusitis compared theeffectiveness of antibiotics with and

CLINICAL RESEARCH FOR DECISION MAKING

1626 JAMA, September 24, 2003—Vol 290, No. 12 (Reprinted) ©2003 American Medical Association. All rights reserved.

at National Institutes of Health on March 4, 2010 www.jama.comDownloaded from

Page 55: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

without nasal corticosteroids.34 Be-cause primary care physicians typicallyuse sinus radiographs to evaluate pa-tients with suspected sinusitis, radio-graphic evidence of sinusitis was usedas a study entry criterion (rather thansinus puncture). This approach maxi-mized the generalizability of the studyresults by enrolling a patient popula-tion that physicians would recognize assimilar to patients observed in daily prac-tice. Another PCT randomized 478 pa-tients to 2 alternative strategies for man-aging patients who presented withdyspepsia: (1) blood testing for Helico-bacter pylori with patients with posi-tive test results referred for endoscopyor (2) empirical treatment with acid-suppressive drugs.35 Neither study groupinvolved routine referral for endos-copy, replicating the standard of care forthis common primary care problem.

Recruit From a Varietyof Practice SettingsAlso, PCTs may improve external va-lidity by including a wider range of phy-sicians and settings to which the studywill be applicable. Pragmatic clinicaltrials have often been conducted incommunity-based settings, generallyclinics and physician offices at whichthe primary activity is clinical care. En-rolling patients from a more diversegroup of practice settings also allows forsome variability in how the study in-tervention and associated clinical carewill be provided.24 The ancillary carethat these patients receive is more likelyto reflect the average care patientswould receive outside the research con-text. In the study of nasal corticoste-roids for sinusitis, 18 of the 22 studysites were community-based primarycare or otolaryngology clinics. Dolor etal34 noted that this selection of study set-tings would ensure that the studysample represented the patient popu-lation typically observed in generalpractice. Outcomes of risky or techni-cally demanding procedures are par-ticularly likely to vary across sites, mak-ing it especially valuable to have resultsthat reflect outcomes across a range ofinstitutions.

The Antihypertensive and Lipid-Lowering Treatment to Prevent HeartAttack Trial (ALLHAT)compared treat-ment outcomes for various drug treat-ments for hypertension in more than42000 patients enrolled from 623 pri-mary care clinics.36 Many of the studysites had not previously been sites forclinical research. The study demon-strated that use of inexpensive diureticmedication resulted in a lower inci-dence of major cardiovascular eventscompared with the newer and moreexpensive classes of antihypertensiveagents, includingcalciumchannelblock-ers, angiotensin-converting enzymeinhibitors, and �-blockers.37,38 Physi-cians and policymakers can be confi-dent that the reported outcomes in thisstudyare likely topredict results thatwillbe observed across a wide range of prac-tice settings. This eliminates a commonobstacle to physician implementation ofclinical research findings.

Measure a Broad Range ofRelevant Health OutcomesSelection of the outcomes to be mea-sured in PCTs is based on the most im-portant anticipated effects of the inter-vention, taking into account thoseoutcomes of greatest relevance to de-cision makers. As a result, the study endpoints collected in PCTs include a broadrange of functional outcomes, includ-ing quality of life, symptom severity, sat-isfaction, and costs, as well as more tra-ditional end points, such as mortalityand major morbidity.39,40 The low backpain study by Cherkin et al28 reporteddisability days, satisfaction, and abil-ity to function. Many large trials of car-diovascular interventions now reportgeneral and disease-specific quality oflife rather than focusing only on mor-tality or major nonfatal outcomes suchas stroke or myocardial infarction.

In addition, the period of follow-upfor PCTs is often longer than in tradi-tional clinical trials to better reflect moreof the natural history of a disease. Manytherapies have different results in theshort term than in the long term, a phe-nomenon commonly observed in sur-gical trials that usually demonstrate

high short-term risks associated withsurgery.32 Two PCTs that compared theresults of immediate surgical repair withsurveillance for small abdominal aor-tic aneurysms followed up patients for4.9 years in 1 trial and 8.0 years in theother trial.41,42 Previous studies on thisquestion had been much shorter andhad left substantial uncertainty aboutthe safety of deferring surgical repair ofsmall aneurysms. Both studies demon-strated that survival was not im-proved by elective surgical repair ofthese aneurysms, a result that could leadto the avoidance of many unnecessarysurgical procedures.

Inclusion of economic outcomes inclinical trials provides information thathas a significant impact on clinical andpolicy decisions.43 A randomized studyevaluated the results of treatment withfluoxetine compared with tricyclic an-tidepressants for the initial manage-ment of mild depression.44 Primary out-comes of the study included the type andfrequency of adverse effects, the num-ber of patient visits associated with theseadverse effects, and the rate at which pa-tients were switched from their origi-nally assigned medication to anotheragent. The study concluded that, de-spite the higher cost of fluoxetine, totalcosts of care were no different betweenthe 2 groups because the incidence of ad-verse effects and frequency of physicianvisits were lower for patients initiallytreated with fluoxetine. The findings pro-vided important input for formulary de-cision makers and were also given sub-stantial weight in the development ofevidence-based clinical guidelines on themanagement of depression.45

WHY IS THE SUPPLYOF PCTs INADEQUATE?The supply of PCTs is currently inad-equate in large measure because trialsoften require substantial resources, andthe current level of public- and private-sector funding for these studies is in-adequate. Neither of the major sourcesof funding for clinical research in theUnited States—the NIH and the medi-cal products industry—has as a pri-mary mission the goal of ensuring that

CLINICAL RESEARCH FOR DECISION MAKING

©2003 American Medical Association. All rights reserved. (Reprinted) JAMA, September 24, 2003—Vol 290, No. 12 1627

at National Institutes of Health on March 4, 2010 www.jama.comDownloaded from

Page 56: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

studies are performed to address clini-cal questions important to decisionmakers. The funding entities that arefocused on supporting these studieshave relatively limited resources.

Practical clinical trials may be ex-tremely costly to conduct because theycan require large sample sizes and mayfollow up patients for a long period. The7-year, 42000-patient ALLHAT study isestimated to have cost approximately$120 million to complete (C. Furberg,MD, PhD, written communication, Janu-ary 9, 2003). The National Emphy-sema Treatment Trial enrolled 1200 pa-tients in the surgical and medical therapygroups, took approximately 6 years tocomplete, and cost more than $35 mil-lion for the research, with substantiallymore than this amount paid by Medi-care for the clinical care of patients inthe trial (G. Weinman, MD, writtencommunication, April 16, 2003). Evenstudies of relatively modest size andsimple design can have substantial costs.An NIH study of St John’s wort and cita-lopram will enroll 300 patients with milddepression and follow them up for 20weeks, with an estimated cost of $4 mil-lion over 4 years.46

Although PCTs are not a major fo-cus for the NIH, they have funded anumber of important PCTs, most ofwhich have provided crucial informa-tion for clinical and health policy de-cision makers. The NIH does not, how-ever, have any organized systematicmechanism in place to identify the high-priority questions of decision makersthat might need to be resolved througha PCT. Estimates of the distribution ofNIH extramural grants indicate that ap-proximately 70% of NIH grants are de-voted to basic biomedical research, withthe remainder going to clinical re-search of all types.47 The NIH scien-tific review committees, as currentlyconstituted, are more familiar with tra-ditional explanatory trials and have rela-tively limited experience with PCTs.Consistent with its primary mission andfunding priorities, the NIH has beensuccessful at biomedical discovery buthas been less consistent in supportingclinical research that ensures that these

discoveries will be effectively trans-lated into improved patient care andpublic health.10

Some public research funding orga-nizations have an explicit commitmentto supporting research that addressespractical questions of health care deci-sion makers. The Agency for Health-care Research and Quality (AHRQ) hasfunded several important PCTs48,49 butdevotes most of its resources to healthservices research, quality measure-ment, and patient safety. According toAHRQ (L. Levine, oral communica-tion, February 24, 2002), total spend-ing on PCTs was approximately $30 mil-lion in fiscal year 2001, and the currentagency budget of approximately $300million would not allow for a signifi-cant expansion for such studies whilemaintaining its other core activities. TheAHRQ and FDA jointly oversee the Cen-ters for Education and Research inTherapeutics, which were established forthe purpose of improving the effectiveuse of pharmaceuticals. The Centers forEducation and Research in Therapeu-tics are directed to accomplish this byconducting “essential research studieson drugs and therapies that are not beingperformed by the pharmaceutical in-dustry or the NIH” and communicat-ing these findings to practicing physi-cians.50 The annual funding by theCenters for Education and Research inTherapeutics of $7 million is adequateto identify, but not to support, impor-tant PCTs related to pharmaceuticaltherapy.

The Cooperative Studies Program ofthe VA has produced numerous studiesdesigned to support clinical and policydecisions. This focus of activity emergesdirectly from the purpose of that re-search program: to support a researchagenda designed to improve the effec-tiveness of care for veterans. This pro-gram funded some of the earliest trialsclearly demonstrating the value of coro-nary artery bypass graft surgery.51 Morerecently, the VA funded an evaluation ofarthroscopic surgery for osteoarthritis ofthe knee, which showed that the resultsof placebo surgery were no different thanresults from actual lavage and debride-

ment.52 Although the clinical implica-tions of the study are still being de-bated, the Department of Veterans Affairsissued an advisory to its physicians, rec-ommending that they not perform thissurgery in patients similar to those in theclinical trial.53 The study also promptedthe Medicare program to consider spe-cific limitations for use of this proce-dure, guided primarily by results of theVA study. With approximately $55 mil-lion to support such trials, the VA is animportant source of PCTs in the UnitedStates, but thenumberof importantques-tions is far beyond this level of funding.

Industry funding for clinical trials isseveral times greater than NIH spend-ing ($4.1 billion vs $850 million in2000)54; however, most of the clinical re-search supported by the pharmaceuti-cal and medical device industries is de-signed primarily to meet the regulatoryrequirements for marketing approval bythe FDA. More than 90% of industryspending on clinical trials supports phase1, 2, and 3 studies required for regula-tory approval, with the remaining 10%spent on phase 4 clinical trials.54 MostPCTs would be classified as phase 4 trials.

Most devices reviewed by the FDAare cleared for marketing based on thesimilarity of those devices to previ-ously approved devices, and there is norequirement for clinical studies that di-rectly evaluate the clinical effective-ness of most devices. Once FDA ap-proved, devices may be widely used offlabel for clinical purposes that havenever been carefully evaluated. Most ofthe pulmonary artery catheters in mod-ern use have been approved by the FDAwithout clinical trial data based on thesimilarity of these catheters to those inuse before 1976 (the year that the FDAbegan to regulate medical devices).Large randomized trials have recentlyshown that the pulmonary artery cath-eter does not provide clinical benefitcompared with standard care in el-derly high-risk surgical patients who re-quire intensive care.55 Most of the thera-pies in common use for treatment ofpressure ulcers and other chronicwounds, such as electrical stimulationand hyperbaric oxygen, have never been

CLINICAL RESEARCH FOR DECISION MAKING

1628 JAMA, September 24, 2003—Vol 290, No. 12 (Reprinted) ©2003 American Medical Association. All rights reserved.

at National Institutes of Health on March 4, 2010 www.jama.comDownloaded from

Page 57: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

the subject of high-quality trials or re-ceived FDA approval for treatment ofwounds.

For pharmaceuticals, the FDA re-quires placebo-controlled studies for ap-proval and generally does not requiretrials that compare a new drug with ex-isting alternatives, unless a company isspecifically seeking FDA approval tomake a claim of superiority to anothermarketed product. These comparativestudies are also unattractive to drugmanufacturers if there is any chance thatthe results might reflect poorly on theirproduct. Therefore, industry-spon-sored drug studies will usually not pro-vide the comparative information aboutrisks and benefits sought by most healthcare decision makers. Furthermore, be-cause the FDA requires informationabout efficacy rather than effective-ness, the study populations enrolled inFDA-approval studies are highly se-lected, in contrast with the more di-verse study populations that enhance thegeneralizability of results from PCTs.

Although medical product compa-nies increasingly support additional clini-cal studies after FDA approval, postmar-ket research priorities for industry areunderstandably guided by the objectiveof maximizing market share rather thanthe goal of addressing critical questionsconcerning optimal clinical use.50 For ex-ample, a longitudinal observational studywas recently conducted that recruitedmore than 200000 ambulatory, post-menopausal women older than 50 yearsfrom more than 4000 primary care prac-tices.56 The substantial cost of this studywas borne by a manufacturer of a pre-scription drug for osteoporosis and an or-ganization representing the industry thatproduces bone mineral density measure-ment devices. The main result of thestudywas thatnearlyhalf ofwomenolderthan 50 years may be at increased riskof osteoporotic fractures. These resultssuggest that physicians should muchmore aggressively screen for and treatwomen at risk for osteoporotic frac-tures, findings that are clearly alignedwith the financial interests of both ma-jor study sponsors. It would be un-likely that the same sponsors would de-

vote resources to the equally importantquestion of whether lifestyle modifica-tion was as good as drug therapy in re-ducing the risk of fractures in these at-risk women.

STRATEGIES FORIMPROVEMENTWe propose a set of strategies that webelieve would promote more consis-tent production of clinical trials thatmeet the needs of health care decisionmakers. These strategies would call forsustained effort and substantial re-sources from all affected stakeholders.

Systematic Identification andPrioritization of Knowledge GapsCurrently, there is no institution withthe responsibility for identifying and pri-oritizing important unanswered clini-cal research questions that would be use-ful to clinical and health policy decisionmakers. It is essential that this functionbe established. Ideally, the responsibleentity would have majority representa-tion from the decision makers in needof this information, including represen-tatives of payers, purchasers, medicalprofessionals, and patients.57 One pos-sibility would be for the Institute ofMedicine to serve this function, provid-ing staff and supportive infrastructurewithin a neutral scientific environ-ment. The list generated by this groupwould be available to public and pri-vate funders as they develop new initia-tives and make their funding deci-sions. The degree to which funderssupported studies from this list wouldhighlight whether research funders weremeeting important public health needs.To the extent that other health carestakeholders begin to devote addi-tional resources to clinical research, thepriority list will also serve as a guide toresearch most likely to generate reli-able usable information.

There are a number of sources fromwhich high-priority questions for PCTscould be identified. Virtually every clini-cal guideline, technologyassessment, sys-tematic review, and consensus report in-cludes a section that lists specific clinicalresearch priorities. These priorities de-

serve special attention because of the sys-tematic and comprehensive method bywhich they have been generated.

Decision Makers Insist onHigh-Quality Evidencein Making DecisionsThe production of high-quality clini-cal trials will increase significantly whenhealth care decision makers decide toconsistently base their decisions onhigh-quality evidence. Research spon-sors (public and private) will be moti-vated to provide the type of clinical re-search required by decision makers.Payers and purchasers can clearly in-dicate to the drug and device industrythat favorable coverage and paymentdecisions will be expedited by reliableevidence from PCTs. In particular,manufacturers will be motivated to per-form head-to-head comparative trials ifthese are required to justify paymentshigher than the existing less expen-sive alternatives. Physicians and medi-cal professional organizations can alsoincrease the degree to which care of in-dividual patients and professional so-ciety clinical policy are guided by at-tention to reliable evidence.

Patients and their advocacy organi-zations have a critical stake in the qual-ity of clinical research. The fundamen-tal premise of improving health carethrough informed patient choice can-not be realized if unreliable evidence isused to inform patients. Patient advo-cacy organizations should become moreinsistent on the production of high-quality evidence and work with re-search funders to increase support forPCTs. These groups could also facili-tate the research by encouraging pa-tients, physicians, and health care or-ganizations to participate in clinical trialsin those situations for which current evi-dence is uncertain. Many years and livescould have been saved had advocatesworked to ensure the rapid completionof clinical trials on bone marrow trans-plantation for breast cancer, instead ofputting pressure on state and federalpolicymakers to force payers to coverthis procedure when supportive high-quality data were lacking.58

CLINICAL RESEARCH FOR DECISION MAKING

©2003 American Medical Association. All rights reserved. (Reprinted) JAMA, September 24, 2003—Vol 290, No. 12 1629

at National Institutes of Health on March 4, 2010 www.jama.comDownloaded from

Page 58: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

Create Operational InfrastructureTo conduct PCTs efficiently, a durableinfrastructure within the primary caresetting is needed. Considerable effort andresources will be required to develop ad-ditional networks of physicians willingand able to participate in clinical re-search. The HMO Research Network, aconsortium of large health mainte-nance organizations, including KaiserNorthern California, Group Health ofPuget Sound, and Harvard VanguardHealth System, may serve as a model forits capacity to implement studies andcapture data within the context of usualcare.59 Other examples include AHRQ’s55 practice-based research networks, theVA Cooperative Studies Program, and anumberof specialty-specific researchnet-works sponsored by various NIH insti-tutes. The NIH also supports the Gen-eral Clinical Research Centers, a robustinfrastructure for clinical research withinacademic medical centers. Additional at-tention to expanding this infrastruc-ture beyond teaching hospital and clin-ics and to extending the capacity forclinical research into real-world clini-cal care settingswouldbeextremelyhelp-ful in facilitating the conduct of clinicaltrials of value to decision makers.

It will also be necessary to signifi-cantly expand the number of physi-cians capable of conducting PCTs. TheNIH and AHRQ should expand their pro-grams to provide special career awards,scholarship, and loan-repayment pro-grams aimed at expanding the clinical re-search workforce. Although some indi-viduals will need extensive training tobecome principal investigators and de-vote their professional careers to thesestudies, a minimum level of skills in clini-cal investigation should become part ofthe basic training of physicians. Ulti-mately, with the expansion of elec-tronic medical record systems, mostclinical encounters could provide use-ful data and most physicians could func-tion in part as clinical investigators.

Address Methodologicaland Ethical IssuesPractical clinical trials pose a number ofmethodological and ethical challenges

that will need to be identified and ad-dressed. Clinical trialists have devel-oped some strategies to reduce the costof conducting large lengthy trials. Largesimple trials feature simple protocols, en-rollment from a large number of re-search sites, limited patient exclusion cri-teria, and data collection limited to thesmallest possible number of elements.60

This approach can make it possible tostudy thousands of patients at relativelylow cost and was first shown to be fea-sible by Richard Peto and his col-leagues, who conducted several largelandmark trials of drug treatment foracutemyocardial infarction.61 Broaderap-plication of these strategies to PCTs willbe needed to minimize cost and partici-pant burden. Because ancillary patientmanagement in a PCT may not be stan-dardized, there are likely to be cluster-ing effects at the level of physicians andorganizations that will require special-ized methods to facilitate analysis or ran-domization at the organizational level.Such study designs pose some unusualmethodological challenges that will re-quire additional attention to refine.

For questions of comparative effec-tiveness, the expected outcome differ-ences between 2 active therapies arelikely to be considerably smaller than thedifferences between a treatment and pla-cebo. Trials designed to demonstratesuch differences will require large samplesizes and may be expensive. In some cir-cumstances, fewer patients may be nec-essary to compare 2 active treatments inan equivalence trial: a study designed todemonstrate that there are no impor-tant differences in effectiveness be-tween 2 active treatments. However,equivalence trials raise a number of chal-lenging scientific and ethical issues andare not yet widely accepted, so addi-tional conceptual development of thisapproach will be necessary.62

Several important ethical issues maybe encountered in the design and con-duct of some PCTs. Because a numberof these trials may involve comparisonof new vs old technology, expensive vsinexpensive drugs, or high-intensity vslow-intensity services, there may be con-cernsabout theeconomicmotivations for

the trials and whether some patients en-rolled in the trials are being denied op-timal care (ie, whether true clinical equi-poise exists).63 Informed consent andconfidentiality issues may be raised bymore extensive use of clinical and ad-ministrative records to identify poten-tial study participants and to augment thedata collected specifically for researchpurposes. Consent procedures that areboth adequate and efficient will need fur-ther exploration, because the number ofPCTs that can be performed may be lim-ited by the operational burden they im-pose, particularly in settings not primar-ily focused on research. In addition,efficient strategies for obtaining institu-tional review board approval from mul-tiple institutions will be necessary, per-haps through approaches that involvecoordination of central and local insti-tutional review board reviews.

Funding Options for PCTsA significant amount of funding will benecessary to support PCTs, given thelarge number and high cost of such trials.Furthermore, a substantial amount ofsupport will be needed to develop the in-frastructure necessary to efficiently con-duct these studies. Resources to fundthese trials will necessarily come eitherfrom redistribution of current researchfunds to important PCTs or by findingadditional sources of new funds. Sev-eral possible public and private sourcesof additional or redistributed fundingshould be considered.

Although increasing the portion ofNIH resources flowing to PCTs may havemerit, given the NIH mission to im-prove public health though research,there is likely to be concern that themoney is being diverted from impor-tant basic research. However, the NIHhas recently acknowledged the need forimprovements in the clinical research en-terprise64 and may be increasingly will-ing to collaborate with decision makersonce they have more clearly identifiedtheir critical clinical research priorities.Funds from the NIH are also needed tosupport further development of the PCTinfrastructure, including support for re-search networks, expansion of the Gen-

CLINICAL RESEARCH FOR DECISION MAKING

1630 JAMA, September 24, 2003—Vol 290, No. 12 (Reprinted) ©2003 American Medical Association. All rights reserved.

at National Institutes of Health on March 4, 2010 www.jama.comDownloaded from

Page 59: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

eral Clinical Research Centers beyondacademic centers, and training physician-investigators who have skills to designand conduct PCTs.

Industry funding could also supportPCTs. This is most likely to occur in re-sponse to the increased use of evidence-based policy development by decisionmakers. Industry will be particularly re-sponsive to coverage and payment policyfirmly linked to evidence of improvedhealth outcomes and increased pay-ment to evidence of substantially im-proved outcomes. The regulatory frame-work of the FDA is designed to ensurethat medical technologies approved formarketing are safe and effective for theirintended use and is not structured to en-sure that research is conducted that willinform optimal clinical use of these tech-nologies. Additional requirements forcomparative studies before FDA ap-proval could significantly increase thetime and expense of bringing new tech-nologies to market. Recent policy dis-cussions have been focused on increas-ing the speed of FDA approvals bydeveloping strategies for gathering ad-ditional data in the postmarketing pe-riod.65 Increased diffusion of health in-formation technology will facilitate morerapid and efficient conduct of requisitecomparative studies, in both the premar-ket and postmarket phases.

Because payers and purchasers in-creasingly demand high-quality evi-dence to support decisions, they are of-ten proposed as the most likely sourceof additional funding for PCTs. The ben-efit of an enhanced evidence base formaking clinical and health policy deci-sions ultimately accrues to those who usehealth care services. For that reason, fi-nancial support and priority setting forthis body of research would most sen-sibly be associated with the fundingstreams that support the provision ofhealth care services. Furthermore, theamount of funding for this body of ap-plied research should be linked to thelevel of resources being consumed insupplying health care services.

Several health care delivery systemsallocate dedicated resources for clini-cal research. The VA spends approxi-

mately $24 billion on the delivery ofhealth care services and maintains abudget of $350 million for research, in-cluding approximately $60 million forclinical trials (virtually all of which arePCTs). The National Health Services inEngland and Canada also have pro-vided funding for a large number ofPCTs. These are examples of health sys-tems where health care services are de-livered within a fixed budget and clini-cal research is understood to be a criticalmechanism to ensure optimal use of ex-isting resources. Such an approachshould be considered in the broader UShealth care systems and could providesubstantial resources for high-priorityapplied clinical research if even a smallfraction of total health care spendingwere reserved for this purpose.

This proposal, although conceptuallysimple, may be challenging to imple-ment. Most private insurance contractsexclude payment for experimental andinvestigational services, although cov-erage of clinical trials is increasinglypopular. At present, the Medicare pro-gram does not have broad legal author-ity to fund clinical research, because thestatute does not allow payment for careunlessmedicallynecessary. If payers andpurchasers were to identify funds forclinical research, it would be essential tohavea functioningsystemforpriorityset-ting free of political interference androbustmechanismsforoversightofspon-sored research. Finally, costs for sup-porting clinical research will, at least ini-tially, increase overall health carespending or replace spending nowassigned to other technologies and ser-vices. Although PCTs might be expectedto increase theefficiencyof spendingandthehealthbenefitderivedfromgivendol-lars, any increased costs for clinicalresearch will eventually be passed on tothe payers, purchaser, employers, pub-lic programs, and eventually the gen-eral public. Health care consumers willultimately need to decide whether thisis a desirable use of their money.

CONCLUSIONThe United States is now seeing the re-sult of its heavy investment in biomedi-

cal research: numerous promisingmedical products have been devel-oped and many more are on the way toinitial clinical trials. With this successcomes an equally important addi-tional need: to develop a systematic ap-proach to efficiently evaluate the risksand benefits of these new technolo-gies in the context of existing alterna-tives. Currently, decision makers inhealth care are not provided with in-formation of adequate quality to makewell-informed decisions regarding al-ternative strategies for diagnosis andtreatment of most common clinical con-ditions. Improving the quality of clini-cal research will depend on more ac-tive involvement of clinical and healthpolicy decision makers in all aspects ofclinical research, including priority set-ting, study design, study implementa-tion, and funding.66

Disclaimer: The opinions contained in this article rep-resent the views of the individual authors and do notnecessarily reflect the views or policies of the Centersfor Medicare & Medicaid Services or AHRQ.Acknowledgment: We thank Joe Selby, MD, AllanKorn, MD, and Robert Califf, MD, for comments onearlier drafts of the manuscript. Alex Omaya, PhD, andthe members of the Institute of Medicine Clinical Re-search Roundtable contributed greatly through manyinsightful discussions of these issues. We also acknowl-edge the wisdom and support of Peter Mazonson, MD,in the earliest stages of this work.

REFERENCES

1. Wennberg J, Gittelsohn A. Small area variation inhealth care delivery. Sci Am. 1973;182:1102-1108.2. Schuster MA, McGlynn EA, Brook RH. How goodis the quality of health care in the United States?Milbank Q. 1998;76:517-563.3. Eddy DM, Billings J. The quality of medical evi-dence: implications for quality of care. Health Aff (Mill-wood). 1988;7:19-32.4. McNeil BJ. Shattuck lecture—hidden barriers to im-provement in the quality of care. N Engl J Med. 2001;345:1612-1620.5. Field MJ, Lohr KN, eds. Clinical Practice Guide-lines: Directions for a New Program. Washington, DC:National Academy Press; 1990.6. Steiner CA, Powe NR, Anderson GF, Das A. Tech-nology coverage decisions by health care plans andconsiderations by medical directors. Med Care. 1997;35:472-489.7. Garber AM. Evidence-based coverage policy. HealthAff (Millwood). 2001;20:62-82.8. Antman K, Lagakos S, Drazen J. Designing andfunding clinical trials of novel therapies. N Engl J Med.2001;344:762-763.9. Altman SH, Parks-Thomas C. Controlling spend-ing for prescription drugs. N Engl J Med. 2002;346:855-856.10. Sung NS, Crowley WF Jr, Genel M, et al. Centralchallenges facing the national clinical research enter-prise. JAMA. 2003;289:1278-1287.11. Mulrow CD, Williams JW Jr, Trivedi M, et al. Treat-ment of Depression: Newer Pharmacotherapies. Rock-

CLINICAL RESEARCH FOR DECISION MAKING

©2003 American Medical Association. All rights reserved. (Reprinted) JAMA, September 24, 2003—Vol 290, No. 12 1631

at National Institutes of Health on March 4, 2010 www.jama.comDownloaded from

Page 60: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

ville, Md: Agency for Health Care Policy and Re-search; 1999. AHCPR publication 99-E014.12. Lomas J, Anderson G, Enkin M, et al. The role ofevidence in the consensus process: results from a Ca-nadian consensus exercise. JAMA. 1988;259:3001-3005.13. Bach PB, Brown C, Gelfand SE, McCrory DC. Man-agement of acute exacerbations of chronic obstructivepulmonary disease: a summary and appraisal of pub-lished evidence. Ann Intern Med. 2001;134:600-620.14. Snow V, Lascher S, Mottur-Pilson C, for the JointExpert Panel on Chronic Obstructive Pulmonary Dis-ease of the American College of Chest Physicians andthe American College of Physicians-American Societyof Internal Medicine. Evidence base for managementof acute exacerbations of chronic obstructive pulmo-nary disease. Ann Intern Med. 2001;134:595-599.15. Tunis SR, Kang JL. Improvements in Medicare cov-erage of new technology. Health Aff (Millwood). 2001;20:83-85.16. National Institutes of Health, Grants-OER HomePage. Minimally invasive surgical therapies treat-ment consortium for benign prostatic hyperplasia.Available at: http://grants.nih.gov/grants/guide/rfa-files/RFA-DK-01-024.html. Accessed July 5, 2002.17. Centers for Medicare and Medicaid Coverage. Air-fluidized beds for pressure ulcers. Available at: http://www.cms.hhs.gov/coverage/8b3-q6.asp. Ac-cessed July 5, 2002.18. Rafuse J. Education, practice reviews needed to re-duce surgical intervention. CMAJ. 1996;155:463-464.19. Connors AF, Speroff T, Dawson NV, et al, for theSUPPORT Investigators. The effectiveness of right heartcatheterization in the initial care of critically ill pa-tients. JAMA. 1996;276:889-897.20. Soumerai SB, McLaughlin TJ, Spiegelman D, etal. Adverse outcomes of underuse of �-blockers in el-derly survivors of acute myocardial infarction. JAMA.1997;277:115-121.21. Naylor CD, Guyatt GH, for the Evidence-BasedMedicine Working Group. Users’ guides to the medi-cal literature, X: how to use an article reporting varia-tion in the outcomes of health services. JAMA. 1996;275:554-558.22. Greenland S. Randomization, statistics, and causalinference. Epidemiology. 1990;1:421-429.23. Rossouw JE, Anderson GL, Prentice RL, et al. Risksand benefits of estrogen plus progestin in healthy post-menopausal women: principal results from the Wom-en’s Health Initiative randomized controlled trial. JAMA.2002;288:321-333.24. Schwartz D, Lellouch J. Explanatory and prag-matic attitudes in therapeutic trials. J Chronic Dis. 1967;20:637-648.25. Roland M, Torgerson DJ. What are pragmatictrials? BMJ. 1998;316:285.26. Lepor H, Williford WO, Barry MJ, et al. The ef-ficacy of terazosin, finasteride, or both in benign pros-tatic hyperplasia. N Engl J Med. 1996;335:533-540.27. The MTOPS Research Group. The impact of medi-cal therapy on the clinical progression of BPH: resultsof the MTOPS trial. Paper presented at: American Uro-logical Association Annual Meeting; May 28, 2002;Orlando, Fla. Available at: http://www.niddk.nih.gov/welcome/releases/05-28-02abstract.pdf. AccessedJuly 25, 2002.28. Cherkin DC, Deyo RA, Battie M, et al. A com-parison of physical therapy, chiropractic manipula-tion, and provision of an educational booklet for thetreatment of patients with low back pain. N Engl J Med.1998;339:1021-1029.29. Cherkin DC, Eisenberg D, Sherman KJ, et al. Ran-domized trial comparing traditional Chinese medicalacupuncture, therapeutic massage, and self-care edu-

cation for chronic low back pain. Arch Intern Med.2001;161:1081-1088.30. Guyatt GH, Sackett DL, Cook DJ, for the Evidence-Based Medicine Working Group. Users’ guides to themedical literature, II: how to use an article abouttherapy or prevention, B: what were the results andhow will they help me in caring for my patients? JAMA.1994;271:59-63.31. Dans AL, Dans LF, Guyatt GH, Richardson S, forthe Evidence-Based Medicine Working Group. Us-ers’ guides to the medical literature, XIV: how to de-cide on the applicability of clinical trials results to yourpatient. JAMA. 1998;279:545-549.32. Califf RM, DeMets DL. Principles from clinical trialsrelevant to clinical practice: part I. Circulation. 2002;106:1015-1021.33. Williams JW, Barrett J, Oxman T, et al. Treat-ment of dysthymia and mild depression in primary care:a randomized controlled trial in older adults. JAMA.2000;284:1519-1526.34. Dolor RJ, Witsell DL, Hellkamp AS, et al. Compari-son of cefuroxime with or without intranasal fluticasonefor treatmentofrhinosinusitis: theCAFFSTrial:a random-ized controlled trial. JAMA. 2001;286:3097-3105.35. Delaney BC, Wilson S, Roalfe A, et al. Ran-domised controlled trial of Helicobacter pylori test-ing and endoscopy for dyspepsia in primary care. BMJ.2001;322:898-901.36. Grimm RH Jr, Margolis KL, Papamertriou VV, et al.Baseline characteristics of participants in the Antihyper-tensive and Lipid Lowering Treatment to Prevent HeartAttack Trial (ALLHAT). Hypertension. 2001;37:19-27.37. The ALLHAT Officers and Coordinators for theALLHAT Collaborative Research Group. Major out-comes in high-risk patients randomized to angiotensin-converting enzyme inhibitor or calcium-channel blockervs diuretic: the Antihypertensive and Lipid-LoweringTreatment to Prevent Heart Attack Trial (ALLHAT).JAMA. 2002;288:2981-2997.38. ALLHAT Collaborative Research Group. Major car-diovascular events in hypertensive patients random-ized to doxazosin vs chlorthalidone: the Antihyperten-sive and Lipid-Lowering Treatment to Prevent HeartAttack Trial (ALLHAT). JAMA. 2000;283:1967-1975.39. Guyatt GH, Rennie D. The basics: using the medi-cal literature. In:Users’Guides to theMedicalLiterature.Chicago, Ill: AMA Press; 2002.40. BarryMJ,CockettAT,HoltgreweHL,etal.Relation-shipofsymptomsofprostatismtocommonlyusedphysi-ological and anatomical measures of the severity of be-nign prostatic hyperplasia. J Urol. 1993;150:351-358.41. Lederle FA, Wilson SE, Johnson GR, Reinke DB.Immediate repair compared with surveillance of smallabdominal aortic aneurysms. N Engl J Med. 2002;346:1437-1444.42. The United Kingdom Aneurysm Trial Partici-pants. Long-term outcomes of immediate repair com-pared with surveillance of small abdominal aortic an-eurysms. N Engl J Med. 2002;346:1445-1452.43. Rizzo JD, Powe NR. Methodological hurdles in con-ducting pharmacoeconomic studies. Pharmacoeco-nomics. 1999;15:339-355.44. Simon GE, VonKorff M, Heiligenstein JH, et al.Initial antidepressant choice in primary care: effec-tiveness and cost of fluoxetine vs tricyclic antidepres-sants. JAMA. 1996;275:1897-1902.45. Snow V, Lascher S, Mottur-Pilson C, for the Ameri-can College of Physicians-American Society of Inter-nal Medicine. Pharmacologic treatment of acute ma-jor depression and dysthymia: clinical guideline, part1. Ann Intern Med. 2000;132:738-742.46. National Institutes of Health. St John’s wort studyin mild depression to enroll 300 participants. HealthNews Daily. 2003;15:5-6.

47. Nathan DG, Varmus HE. The National Institutesof Health and clinical research: a progress report. NatMed. 2000;6:1201-1204.48. Schein OD, Katz J, Bass EB, et al. The value of rou-tine preoperative medical testing before cataract sur-gery: study of medical testing for cataract surgery.N Engl J Med. 2000;342:168-175.49. Fitzgibbons RJ. Grants On-Line Database [data-base online]. Abstract of grant HS09860: inguinal her-nia management: watchful waiting vs operation.Agency for Healthcare Research and Quality. Avail-able at: http://www.gold.ahrq.gov/index.cfm. Ac-cessed July 20, 2002.50. Woosley RL. Centers for Education and Re-search in Therapeutics. Clin Pharmacol Ther. 1994;55:249-255.51. Takaro T, Hultgren HN, Lipton MJ, Detre KM. TheVA cooperative randomized study of surgery for coro-nary arterial occlusive disease II: subgroup with sig-nificant left main lesions. Circulation. 1976;54(suppl6):107-117.52. Moseley JB, O’Malley K, Petersen NJ, et al. A con-trolled trial of arthroscopic surgery for osteoarthritisof the knee. N Engl J Med. 2002;347:81-88.53. Kolata G. VA suggests halt to kind of knee surgery.New York Times. August 24, 2002;A:9(column 4).54. Getz K, Zisson S. Clinical grants market deceler-ates. CenterWatch. 2003;10:4-5.55. Sandham JD, Hull RD, Brant RF, et al. A random-ized, controlled trial of the use of pulmonary-arterycatheters in high-risk surgical patients. N Engl J Med.2003;348:5-14.56. Siris ES, Miller PD, Barrett-Connor E, Faulkner KG.Identification and fracture outcomes of undiagnosedlow bone mineral density in postmenopausal wom-en: results from the National Osteoporosis Risk As-sessment. JAMA. 2001;286:2815-2822.57. Tunis SR, Korn A, Omaya A. The Role of Pur-chasers and Payers in the Clinical Research Enter-prise. Washington, DC: National Academy Press; 2002.58. Mello MM, Brennan TA. The controversy overhigh-dose chemotherapy with autologous bone mar-row transplant for breast cancer. Health Aff(Millwood). 2001;20:101-117.59. HMO Research Network. HMO Research Net-work Home Page. Available at: http://hmoresearch-network.org/. Accessed June 21, 2002.60. Buring JE, Jonas MA, Hennekens CH. Large andsimple randomized trials. In: US Congress, Office ofTechnology Assessment. Tools for Evaluating HealthTechnologies: Five Background Papers. Washington,DC: Office of Technology Assessment; 1995.61. Baigent C, Collins R, Appleby P, Parish S, SleightP, Peto R, for the ISIS-2 (Second International Studyof Infarct Survival) Collaborative Group. ISIS-2: 10 yearsurvival among patients with suspected acute myo-cardial infarction in randomised comparison of intra-venous streptokinase, oral aspirin, both, or neither.BMJ. 1998;316:1337-1343.62. DjulbegovicB,ClarkeM.Scientific andethical issuesin equivalence trials. JAMA. 2001;285:1206-1208.63. Liliford R. Ethics of clinical trials from a bayesianand decision analytic perspective: whose equipoise isit anyway? BMJ. 2003;326:980-981.64. DeAngelis CD. NIH Director Elias A. Zerhouni, MD,reflects on agency’s challenges, priorities. JAMA. 2003;289:1492-1493.65. US Food and Drug Administration. Improving in-novation in medical technology: beyond 2002. Avail-able at: http://www.fda.gov/bbs/topics/NEWS/2003/beyond2002/report.html. Accessed April 30, 2003.66. Califf RM, DeMets DL. Principles from clinical trialsrelevant to clinical practice: part II. Circulation. 2002;106:1172-1175.

CLINICAL RESEARCH FOR DECISION MAKING

1632 JAMA, September 24, 2003—Vol 290, No. 12 (Reprinted) ©2003 American Medical Association. All rights reserved.

at National Institutes of Health on March 4, 2010 www.jama.comDownloaded from

Page 61: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

clinical practice

T h e n e w e ngl a nd j o u r na l o f m e dic i n e

n engl j med 361;10 nejm.org september 3, 2009990

Should Coronary Calcium Screening Be Used in Cardiovascular Prevention Strategies?

Robert O. Bonow, M.D.

From the Department of Medicine, Cen-ter for Cardiovascular Quality and Out-comes Research, Northwestern University Feinberg School of Medicine; and North-western Memorial Hospital — both in Chicago. Address reprint requests to Dr. Bonow at the Northwestern University Feinberg School of Medicine, 676 N. St. Clair St., Suite 600, Chicago, IL 60611, or at [email protected].

N Engl J Med 2009;361:990-7.Copyright © 2009 Massachusetts Medical Society.

A 52-year-old man requests a coronary-artery calcium (CAC) scan for assessment of his risk of coronary events after seeing an advertisement from a local facility that of-fers the test. He has no symptoms of cardiac disease, has never smoked, and is not overweight, but he does not exercise regularly. His father, who was a heavy smoker, had a fatal myocardial infarction at 45 years of age. The patient’s blood pressure is 130/85 mm Hg. His total cholesterol level is 220 mg per deciliter (5.7 mmol per liter), and his low-density lipoprotein (LDL) and high-density lipoprotein (HDL) cholesterol levels are 160 mg per deciliter (4.1 mmol per liter) and 38 mg per deciliter (1.0 mmol per liter), respectively. His fasting blood glucose level is 92 mg per deciliter (5.1 mmol per liter). What would you advise regarding CAC scanning?

The Clinic a l Problem

During the past four decades, there has been a dramatic decline in the age-adjusted rate of death from cardiac disease in the United States and many other developed countries.1-3 This reduction is attributed in large part to primary and secondary pre-vention strategies that target modifiable risk factors.2,4 Despite these advances, car-diovascular disease remains the leading cause of death in developed countries as well as in most developing countries,1,3,5 and there is concern that the growing preva-lence of obesity and type 2 diabetes will reverse the gains of the past 40 years. Fur-thermore, risk factors for cardiovascular disease are present and poorly controlled in the majority of persons who have no symptoms.3,6

Adult Treatment Panel III (ATP III) of the National Cholesterol Education Pro-gram7 based its 2001 recommendations for the treatment of hypercholesterolemia for primary prevention of cardiovascular disease on the assessment of individual risk factors and global risk as indicated by the Framingham risk score, which is deter-mined on the basis of age, blood pressure, levels of total cholesterol and HDL cho-lesterol, and smoking status. With the use of these criteria, persons without estab-lished coronary heart disease are considered to be at high risk if they have either a condition that is equivalent to coronary heart disease with respect to the risk of a coronary event (e.g., peripheral-artery atherosclerotic disease or diabetes) or more than a 20% estimated likelihood of a coronary event over the next 10 years; they are considered to be at low risk if the estimate is less than 10%.7 Although at the time of their first acute coronary event more than 85% of patients have at least one of the major established risk factors for heart disease,8,9 the presence of only a single risk factor would place most people in the low-risk category according to the ATP III criteria — meaning that intensive measures to reduce risk factors would not be considered necessary. The 2004 ATP III update10 also defined patients at “moderate

This Journal feature begins with a case vignette highlighting a common clinical problem. Evidence supporting various strategies is then presented, followed by a review of formal guidelines,

when they exist. The article ends with the author’s clinical recommendations.

An audio version of this article is available at

NEJM.org

Copyright © 2009 Massachusetts Medical Society. All rights reserved. Downloaded from www.nejm.org at HHS LIBRARIES CONSORTIUM on April 12, 2010 .

Page 62: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

clinical pr actice

n engl j med 361;10 nejm.org september 3, 2009 991

risk” for a coronary event as those having two or more risk factors and a 10-year risk of less than 10% and patients at “moderately high” risk (pre-viously termed “intermediate risk”) as those with a 10 to 20% likelihood of having a coronary event during the next 10 years. Neither patients at mod-erate risk nor those at moderately high risk would be considered candidates for the aggressive treat-ment of risk factors that is recommended for pa-tients at high risk.

The observation that a person classified as be-ing at low, moderate, or moderately high risk for a coronary event according to the updated ATP III criteria10 may nonetheless have a coronary event has spurred interest in new markers of such risk that might identify those who would benefit from more rigorous attention to risk-factor modifica-tion.11-13 In this regard, the potential role for imag-ing of subclinical atherosclerosis — in particular, the use of electron-beam computed tomography (EBCT) or multidetector computed tomography (MDCT) to detect CAC — has generated consider-able attention and debate.

S tr ategies a nd E v idence

EBCT is distinguished from standard computed tomography (CT) by the use of a scanning electron beam that is steered by an electromagnetic deflec-tion system across a series of fixed tungsten target rings; x-rays generated by the tungsten rings fan across the patient and are captured by detectors placed opposite to the tungsten rings. In contrast, MDCT uses the traditional x-ray tube and physi-cally rotates the x-rays around the patient. The ab-sence of mechanical motion in EBCT results in rapid image acquisition, with exposure times of just 50 to 100 msec for each tomogram. MDCT is also rapid, the exposure time is more than 220 msec. Both EBCT and MDCT are sensitive meth-ods for detecting coronary-artery calcification and provide highly reproducible data.14-17 Although MDCT is more widely available than EBCT and ap-pears to yield results of similar usefulness in terms of risk assessment,18 most of the information re-lating CAC scores to cardiovascular risk comes from studies using EBCT.

With very rare exception, coronary calcification is a marker of coronary atherosclerosis. Hence, the presence of coronary calcium signifies under-lying coronary-artery disease, with essentially no false positive findings, although calcification is

not synonymous with luminal stenosis or obstruc-tion. The CAC score, developed initially with the use of EBCT to quantify the extent of coronary calcification,19 correlates closely with the volume

16p6

A

B

AUTHOR:

FIGURE:

JOB:

4-CH/T

RETAKE

SIZE

ICM

CASE

EMail LineH/TCombo

Revised

AUTHOR, PLEASE NOTE: Figure has been redrawn and type has been reset.

Please check carefully.

REG F

Enon

1st

2nd

3rd

Bonow

1a&b of 3

09-03-09

ARTIST: mst

36110 ISSUE:

Aorta

Left mainartery

Left anteriordescending

artery

Left mainartery

Left anteriordescending

artery

Aorta

Figure 1. CT Scans of the Left Coronary Artery in Two Asymptomatic Men.

Two asymptomatic men, 51 years of age and 81 years of age, underwent coronary-artery calcium (CAC) im-aging with multidetector CT. The resulting transaxial cardiac tomograms, obtained at the level of the origin of the left coronary artery, show calcification of the left main and proximal left anterior descending coronary arteries in both the younger patient (Panel A) and the older patient (Panel B). The CAC score for the younger man, although relatively low at 80.1, places him in the 85th percentile for severity of CAC for men in his age group. The older man’s CAC score is higher, at 1054, but the severity of his CAC relative to that for men in his age group is lower — in the 70th percentile. (Imag-es courtesy of James Carr, M.D., Northwestern Univer-sity, Chicago.)

Copyright © 2009 Massachusetts Medical Society. All rights reserved. Downloaded from www.nejm.org at HHS LIBRARIES CONSORTIUM on April 12, 2010 .

Page 63: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

T h e n e w e ngl a nd j o u r na l o f m e dic i n e

n engl j med 361;10 nejm.org september 3, 2009992

of coronary-artery plaque measured at autopsy20 and is considered a surrogate for the overall cor-onary-plaque burden. However, postmortem stud-ies have also shown that the total plaque volume exceeds the volume of calcified plaque by a ratio of 5 to 1.20 Thus, for every calcified plaque, there are many noncalcified plaques that are not de-tected by EBCT. It is possible for plaque erosion or rupture and myocardial infarction to occur in patients without demonstrable CAC, especially in younger adults.

Population studies indicate that CAC scores increase with advancing age, reflecting the natu-ral progression of atherosclerosis,14,15,21,22 and men generally have higher CAC scores than wom-en of similar age. Women tend to have calcium scores similar to those of men who are 15 years younger.21 CAC scores are often reported as per-centiles of calcification in a reference population according to age and sex. Younger people (men younger than 45 years and women younger than 55 years) may have underlying atherosclerosis with little or no CAC, yet any degree of CAC would place a man or woman in these age groups in a very high percentile for the risk of coronary events

as compared with the risk among controls matched for age and sex (Fig. 1). Similarly, the Framing-ham risk score does not classify most young adults as being at high risk, since in this age group, coronary events are unlikely to occur within 10 years, even in young adults with multiple risk fac-tors. Estimates of the lifetime risk of coronary events, rather than the 10-year risk, may be more appropriate for this age group.23 Distributions of CAC scores adjusted for age and sex also vary ac-cording to race and ethnic group.22,24

Since the CAC score indicates the presence or absence and measures the extent of coronary ath-erosclerosis, it is not surprising that multiple stud-ies have shown that a high CAC score is a marker for an increased risk of coronary events. Thus, a CAC score of zero is associated with a very low risk of subsequent coronary events,25,26 whereas increasing CAC scores are associated with a step-wise increase in the risk of events. Numerous stud-ies indicate that the CAC score correlates with the number and severity of conventional risk fac-tors but is also an independent marker for the risk of coronary events, after adjustment for these conventional risk factors.18,24,27-38 For example, the Multi-Ethnic Study of Atherosclerosis (MESA) (ClinicalTrials.gov number, NCT00005487), a co-hort study of 6722 initially asymptomatic adults (38.6% white, 27.6% black, 21.9% Hispanic, and 11.9% Chinese),18 has provided strong evidence of the incremental prognostic information provided by the CAC score (Fig. 2). No major differences in the predictive value of CAC scores were noted among racial or ethnic groups. Although MESA showed a clear gradient of risk as CAC scores increased, the absolute risk of events was low (roughly 1% per year, even among participants with high scores).18

The association between CAC scores and car-diovascular events, which has been well docu-mented, is not unexpected, since the presence and extent of CAC reflect the actual presence and se-verity of atherosclerosis, whereas risk factors, risk scores, and biomarkers are merely markers for the likelihood of the disease. However, it remains un-clear whether CAC scanning has a favorable ef-fect on clinical outcomes, and there are concerns regarding the associated radiation exposure.39-42 The usual radiation burden associated with CAC scanning is small but real (generally, 0.6 to 1.0 mSv for EBCT and 0.9 to 2.0 mSv for MDCT),14 and some MDCT imaging protocols are associated with estimated radiation doses higher than 10

22p3

4.0M

ajor

Cor

onar

y Ev

ents

(%)

3.0

2.0

1.0

0.00 1–100 101–300 >300

CAC Score

No. of EventsNo. at RiskHazard Ratio

(95% CI)

834091.00

2517283.89

(1.72–8.79)

247527.08

(3.05–16.47)

328336.84

(2.93–15.99)

AUTHOR:

FIGURE:

JOB:

4-CH/T

RETAKE

SIZE

ICM

CASE

EMail LineH/TCombo

Revised

AUTHOR, PLEASE NOTE: Figure has been redrawn and type has been reset.

Please check carefully.

REG F

Enon

1st

2nd3rd

Bonow

2 of 3

09-03-09

ARTIST: ts

36110 ISSUE:

Figure 2. Risk of Major Coronary Events with Increasing Coronary-Artery Calcium Score.

The risk of a major coronary event (death or nonfatal myocardial infarction) with increasing coronary-artery calcium (CAC) scores, adjusted for stan-dard risk factors, was determined in 6722 initially asymptomatic healthy adults who were followed for a median of 3.8 years. Higher CAC scores were associated with higher event rates, but even with high CAC scores, the event rate was only approximately 1% per year. A CAC score of 0 indicates no detectable calcium; higher CAC scores indicate more severe degrees of coronary calcification, with the extent of severity influenced by age, ethnici-ty, and sex. CI denotes confidence interval. Data are from Detrano et al.18

Copyright © 2009 Massachusetts Medical Society. All rights reserved. Downloaded from www.nejm.org at HHS LIBRARIES CONSORTIUM on April 12, 2010 .

Page 64: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

clinical pr actice

n engl j med 361;10 nejm.org september 3, 2009 993

mSv.39 In comparison, the standard chest radio-graph yields a radiation dose of 0.01 to 0.02 mSv.40,41 An effective dose of 2.3 mSv is estimated to result in a small measurable increase in the risk of cancer,39 and this estimate would need to be considered if CAC testing (and repeated testing) were used for widespread population screening.

There are also uncertainties with regard to how, when, and in whom the test should be performed, what CAC-score threshold should trigger more aggressive treatment of risk factors, and what the most appropriate treatment is once that threshold is crossed. In the absence of data on outcomes, the CAC score does not meet the criterion for popu-lation screening established by the U.S. Preventive Services Task Force.43 That criterion specifies that screening and treating persons for early-onset dis-ease should improve the likelihood of favorable health outcomes (e.g., reduced disease-specific morbidity or mortality), as compared with the treatment of patients when they present with signs or symptoms of the disease.

In persons categorized as having a low risk of coronary events according to the ATP III criteria, the presence of coronary calcium increases the risk of future events, but even among those in this group with a high CAC score, the likelihood of coronary events remains low (Fig. 3). Similarly, for those who are in the high-risk category accord-ing to the ATP III criteria, a low CAC score does not eliminate the risk of an unfavorable outcome associated with an ominous risk-factor profile, and aggressive attention to risk reduction is warranted regardless of the CAC score. If any group might benefit from the additive risk stratification pro-vided by CAC scoring, it might be persons initially identified by the Framingham criteria as having an intermediate risk of coronary events — that is, those with a 10-year risk of 10 to 20% (Fig. 3). An absence of coronary calcium in these patients might shift them to a low-risk category, whereas a high CAC score might indicate a higher level of risk that calls for more intensive treatment of modifiable risk factors.15,29-31,33 Furthermore, women in general tend to be categorized as hav-ing a low risk according to the ATP III criteria, and a high CAC score appears to identify a group with a substantively higher risk of coronary events than their low-risk profile in the Framingham study would suggest.29,36

The goal of CAC scanning in asymptomatic persons is to refine the risk assessment in order to determine whether preventive strategies should

be intensified, not to identify persons with asymp-tomatic coronary-artery stenoses. However, high CAC scores often lead to additional downstream testing and procedures, including stress imaging, coronary angiography, and even percutaneous cor-onary interventions. Although the highest CAC scores represent substantial plaque burdens with the potential to encroach on the coronary lumen,44 there is no evidence that interventions other than risk-factor modification will lead to a better out-come in patients without symptoms. Widespread CAC screening runs the risk of increasing the number of unnecessary tests and procedures downstream — and of escalating health care costs. A corollary to this view is that CAC assess-ment is unlikely to alter the treatment plan for patients whose blood pressure and lipid levels are already well controlled.

A r e a s of Uncerta in t y

The major unresolved question is whether routine, widespread CAC screening or determination of the CAC score in an individual patient will lead to an overall improvement in quality of care and clin-ical outcomes. The identification of patients at increased risk is useful only if it leads to a success-ful strategy that helps avert future coronary events. Since no studies have been designed to demon-

22p3

Maj

or C

oron

ary

Even

t (%

)

20

12

16

8

4

00–9 10–15 16–20 ≥21

Framingham Risk Score (%)

No. of Patients 46 19 14 19 79 97 40 41 116 126 64 77 75 79 53 84

AUTHOR:

FIGURE:

JOB:

4-CH/T

RETAKE

SIZE

ICM

CASE

EMail LineH/TCombo

Revised

AUTHOR, PLEASE NOTE: Figure has been redrawn and type has been reset.

Please check carefully.

REG F

Enon

1st2nd3rd

Bonow

3 of 3

09-03-09

ARTIST: ts

36110 ISSUE:

CAC Score

01–100101–300>300

Figure 3. The 7-Year Rate of Major Coronary Events Predicted on the Basis of the Framingham Risk Score and the Coronary-Artery Calcium Score.

Rates are based on a Cox regression analysis of data from 1029 initially asymptomatic adults who were followed for a median of 7.0 years. A coro-nary-artery calcium (CAC) score of 0 indicates no detectable calcium; high-er CAC scores indicate more severe degrees of coronary calcification, with the extent of severity influenced by age, ethnicity, and sex. A major coro-nary event was defined as death or nonfatal myocardial infarction. Data are from Greenland et al.31

Copyright © 2009 Massachusetts Medical Society. All rights reserved. Downloaded from www.nejm.org at HHS LIBRARIES CONSORTIUM on April 12, 2010 .

Page 65: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

T h e n e w e ngl a nd j o u r na l o f m e dic i n e

n engl j med 361;10 nejm.org september 3, 2009994

strate this effect, there are no convincing data to suggest that this desired result can be achieved. Studies evaluating surrogate end points related to patient and physician behavior have had mixed results. Two observational studies reported that knowledge of a positive CAC scan was associated with greater use of statin and aspirin therapy45,46 and risk-reducing changes in lifestyle,43 but these studies cannot prove cause and effect. Moreover, in a randomized trial,47 patients who were informed of the results of CAC testing did not have a favor-able improvement in their risk-factor profile after 1 year; the results for these patients were no dif-ferent from the results for patients who were not informed of the results of testing. Knowledge of the CAC score does, however, lead to more anxi-ety, more hospitalizations, and more revascular-ization procedures.45

The final quandary concerning CAC tests is typical of screening tests in general. Should the subgroup with the highest CAC scores, identified by screening, become the target population for intensive risk-reduction measures? This group with the highest relative risk will represent only a small segment of the total population. The much larger group with low CAC scores will have a much lower relative risk, but because of its sheer mag-nitude, this group is likely to include persons who are destined to have at least as many coronary events as those in the high-scoring group. This is borne out in most of the longitudinal studies examining the relation between CAC scores and outcome (Table 1). One would have to treat virtu-ally all patients with even minimal CAC to achieve the goal of screening, thereby negating the value of the screening program. Twenty years ago, my colleagues and I pointed out this inevitability of cardiovascular screening in an article that focused on stress testing,48 and it is demonstrated once again with CAC scoring.

Guidelines

Widespread CAC screening of the asymptomatic adult population is not currently recommended because of the lack of prospective data showing that such screening ultimately results in improved outcomes and reduced coronary events, as well as the inherent limitations of screening. The U.S. Preventive Services Task Force recommends that adults at low risk for coronary events not under-

Table 1. Coronary-Artery Calcium Scores and Outcomes for Asymptomatic Subjects in Eight Studies.*

Study, Outcome Reported, and CAC Score†

EventRate

Total No.of Patients

No. of Patients

with Event

%

Shaw et al.,30 death at 5 yr

CAC score

0 1.0 5,915 59

1–100 1.4 7,990 111

>1000 5.0 311 16

Budoff et al.,35 death at 10 yr

CAC score

0–100 1.1 19,643 226

>1000 12.2 964 118

Detrano et al.,18 death or MI at 3.8 yr

CAC score

0–100 0.6 5,137 33

>300 3.8 833 32

Nasir et al.,24 death at 8 yr

CAC score

0–100 2.8 10,519 296

>1000 25.9 819 212

Raggi et al.,27 death or MI at 32 mo

CAC score

0–100 2.5 511 13

>400 12.8 47 6

Greenland et al.,31 death or MI at 7 yr

CAC score

0–100 5.5 637 35

>300 15.4 221 34

Vliegenthart et al.,32 death at 3.3 yr

CAC score

0–100 3.2 905 29

>1000 12.2 196 24

Lakoski et al.,36 ‡ CHD event at 3.8 yr§

CAC score

0–99 0.6 2403 14

>300 6.7 105 7

* CAC denotes coronary-artery calcium, CHD coronary heart disease, and MI myocardial infarction.

† A CAC score of 0 indicates no detectable calcium; higher CAC scores indicate more severe degrees of coronary calcification, with the extent of severity influ-enced by age, ethnicity, and sex.

‡ The participants in this study were women with low Framingham risk scores.§ A CHD event was defined as death from coronary disease, nonfatal myocardi-

al infarction, development of angina, coronary revascularization for chest pain, or resuscitation after cardiac arrest.

Copyright © 2009 Massachusetts Medical Society. All rights reserved. Downloaded from www.nejm.org at HHS LIBRARIES CONSORTIUM on April 12, 2010 .

Page 66: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

clinical pr actice

n engl j med 361;10 nejm.org september 3, 2009 995

go routine screening and has found that there is insufficient evidence to make a recommendation for or against routine screening of those at high risk for events.49 The expert-panel statement from Prevention Conference V, sponsored by the Amer-ican Heart Association in 2000,50 does not advo-cate routine CAC assessment, although it does note that such an assessment appears to offer the great-est potential value for the group that the Framing-ham criteria place at intermediate risk. The ATP III guidelines7 that were published a year later con-cur with this recommendation against the use of indiscriminate screening, noting that CAC scor-ing might be an option for advanced risk assess-ment in appropriately selected patients with mul-tiple risk factors, provided the test is ordered by a physician who is familiar with the strengths and weaknesses of noninvasive testing. Whether up-dated recommendations will be forthcoming from the newly constituted Adult Treatment Panel IV is uncertain at this time.

The scientific statement from the American Heart Association14 and the expert-consensus doc-ument from the American College of Cardiology–American Heart Association15 conclude that it may be reasonable to consider CAC screening in asymp-tomatic persons identified as having an interme-diate risk of coronary events on the basis of an assessment of multiple risk factors; this view is based on the possibility that such patients might be reclassified in a higher risk group on the basis of the CAC score and that the management of risk factors might then be intensified. As noted previously, no studies have shown that improved outcomes are associated with this approach. The lack of supporting data is reflected in the report

on the appropriate use of cardiac CT developed by the American College of Cardiology and partner organizations,51 which concluded that CAC screen-ing is inappropriate in asymptomatic patients who are at low risk for coronary events according to the ATP III criteria; the authors of the report were uncertain about the appropriateness of screening for those at intermediate or high risk.

Conclusions a nd R ecommendations

A broad population-based strategy of CAC screen-ing does not appear to be warranted. It is not clear whether it is reasonable to consider CAC scanning in persons whose global risk assess-ment places them in the intermediate-risk cate-gory or whether the findings from such testing will lead to a beneficial increase in the intensity of treatment. This issue needs to be addressed in future trials focusing on clinical outcomes and cost-effectiveness.

The patient presented in the vignette has a low Framingham risk score (less than a 10% risk of a coronary event over the next 10 years). I would not recommend CAC scanning but would discuss with him the absence of evidence that the use of this test improves outcomes, as well as the downside of testing, including the radia-tion burden and the possibility that the results will lead to further unnecessary testing. I would start therapy with aspirin and a statin and coun-sel him to initiate a program of regular exercise several times per week.

No potential conflict of interest relevant to this article was reported.

References

National Heart, Lung, and Blood In-1. stitute. Morbidity and mortality: 2007 chartbook on cardiovascular, lung, and blood diseases. Washington, DC: Depart-ment of Health and Human Services, 2007. (Accessed July 21, 2009, at http://www.nhlbi.nih.gov/resources/docs/ 07-chtbk.pdf.)

Ford ES, Ajani UA, Croft JB, et al. Ex-2. plaining the decrease in U.S. deaths from coronary disease, 1980-2000. N Engl J Med 2007;356:2388-98.

Lloyd-Jones D, Adams R, Carnethon 3. M, et al. Heart disease and stroke statis-tics — 2009 update: a report from the American Heart Association Statistics Committee and Stroke Statistics Subcom-

mittee. Circulation 2009;119(3):e21-e181. [Erratum, Circulation 2009;119(3):e182.]

Hardoon SL, Whincup PH, Lennon 4. LT, Wannamethee SG, Capewell S, Morris RW. How much of the recent decline in the incidence of myocardial infarction in British men can be explained by changes in cardiovascular risk factors? Evidence from a prospective population-based study. Circulation 2008;117:598-604.

Murray CJL, Lopez AD. Mortality by 5. cause for eight regions of the world: Glob-al Burden of Disease Study. Lancet 1997;349:1269-76.

Stamler J, Stamler R, Neaton JD, et al. 6. Low risk-factor profile and long-term car-diovascular and noncardiovascular mor-

tality and life expectancy: findings for 5 large cohorts of young adult and middle-aged men and women. JAMA 1999;282: 2012-8.

Expert Panel on Detection, Evalua-7. tion, and Treatment of High Blood Cho-lesterol in Adults. Executive summary of the Third Report of the National Choles-terol Education Program (NCEP) Expert Panel on Detection, Evaluation, and Treat-ment of High Blood Cholesterol in Adults (Adult Treatment Panel III). JAMA 2001;285:2486-97. (Also available at http://www.nhlbi.nih.gov/guidelines/cholester-ol/atp3full.pdf.)

Greenland P, Knoll MD, Stamler J, et 8. al. Major risk factors as antecedents of

Copyright © 2009 Massachusetts Medical Society. All rights reserved. Downloaded from www.nejm.org at HHS LIBRARIES CONSORTIUM on April 12, 2010 .

Page 67: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

T h e n e w e ngl a nd j o u r na l o f m e dic i n e

n engl j med 361;10 nejm.org september 3, 2009996

fatal and nonfatal coronary heart disease events. JAMA 2003;290:891-7.

Khot UN, Khot MB, Bajzer CT, et al. 9. Prevalence of conventional risk factors in patients with coronary heart disease. JAMA 2003;290:898-904.

Grundy SM, Cleeman JI, Merz CN, et 10. al. Implications of recent clinical trials for the National Cholesterol Education Program Adult Treatment Panel III guide-lines. Circulation 2004;110:227-39. [Erra-tum, Circulation 2004;110:763.]

Greenland P, Smith SC Jr, Grundy SM. 11. Improving coronary heart disease risk as-sessment in asymptomatic people: role of traditional risk factors and noninvasive cardiovascular tests. Circulation 2001;104: 1863-7.

Smith SC Jr, Greenland P, Grundy SM. 12. Prevention Conference V: beyond second-ary prevention: identifying the high-risk patient for primary prevention: executive summary. Circulation 2000;101:111-6.

Hlatky MA, Greenland P, Arnett DK, 13. et al. Criteria for evaluation of novel markers of cardiovascular risk: a scien-tific statement from the American Heart Association. Circulation 2009;119:2408-16. [Erratum, Circulation 2009;119(25): e606.]

Budoff MJ, Achenbach S, Blumenthal 14. RS, et al. Assessment of coronary artery disease by cardiac computed tomography: a scientific statement from the American Heart Association Committee on Cardio-vascular Imaging and Intervention, Coun-cil on Cardiovascular Radiology and In-tervention, and Committee on Cardiac Imaging, Council on Clinical Cardiology. Circulation 2006;114:1761-91.

Greenland P, Bonow RO, Brundage 15. BH, et al. ACCF/AHA 2007 clinical expert consensus document on coronary artery calcium scoring by computed tomography in global cardiovascular risk assessment and in evaluation of patients with chest pain: a report of the American College of Cardiology Foundation Clinical Expert Consensus Task Force. Circulation 2007; 115:402-26.

Carr JJ, Nelson JC, Wong ND, et al. 16. Calcified coronary artery plaque measure-ment with cardiac CT in population-based studies: standardized protocol of Multi-Ethnic Study of Atherosclerosis (MESA) and Coronary Artery Risk Development in Young Adults (CARDIA) study. Radiology 2005;234:35-43.

Detrano RC, Anderson M, Nelson J, et 17. al. Coronary calcium measurements: ef-fect of CT scanner type and calcium mea-sure on rescan reproducibility — MESA study. Radiology 2005;236:477-84.

Detrano R, Guerci AD, Carr JJ, et al. 18. Coronary calcium as a predictor of coro-nary events in four racial or ethnic groups. N Engl J Med 2008;358:1336-45.

Agatston AS, Janowitz WR, Hildner 19. H, Zusmer NR, Viamonte M Jr, Detrano R. Quantification of coronary artery calcium using ultrafast computed tomography. J Am Coll Cardiol 1990;15:827-32.

Rumberger JA, Simons DB, Fitzpat-20. rick LA, Sheedy PF, Schwartz RS. Coro-nary artery calcium area by electron-beam computed tomography and coronary ath-erosclerotic plaque area: a histopatholog-ic correlative study. Circulation 1995;92: 2157-62.

Hoff JA, Chomka EV, Krainik AJ, Davi-21. glus M, Rich S, Kondos GT. Age and gen-der distributions of coronary artery calci-um detected by electron beam tomography in 35,246 adults. Am J Cardiol 2001;87: 1335-9.

McClelland RL, Chung H, Detrano R, 22. Post W, Kronmal RA. Distribution of cor-onary artery calcium by race, gender, and age: results from the Multi-Ethnic Study of Atherosclerosis (MESA). Circulation 2006;113:30-7.

Lloyd-Jones DM, Leip EP, Larson MG, 23. et al. Predictors of lifetime risk for cardio-vascular disease by risk factor burden at 50 years of age. Circulation 2006;113: 791-8.

Nasir K, Shaw LJ, Liu ST, et al. Ethnic 24. differences in the prognostic value of coronary artery calcification for all-cause mortality. J Am Coll Cardiol 2007;50:953-60.

Shareghi S, Ahmadi N, Young E, Go-25. pal A, Liu ST, Budoff MJ. Prognostic sig-nificance of zero calcium scores on car-diac computed tomography. J Cardiovasc Comput Tomogr 2007;1:155-9.

Greenland P, Bonow RO. How low-26. risk is a coronary calcium score of zero? The importance of conditional probabili-ty. Circulation 2008;117:1627-9.

Raggi P, Callister TQ, Cooil B, et al. 27. Identification of patients at increased risk of first unheralded acute myocardial in-farction by electron-beam computed to-mography. Circulation 2000;101:850-5.

Wong ND, Hsu JC, Detrano RC, Dia-28. mond G, Eisenberg H, Gardin JM. Coro-nary artery calcium evaluation by electron beam computed tomography and its rela-tion to new cardiovascular events. Am J Cardiol 2000;86:495-8.

Kondos GT, Hoff JA, Sevrukov A, et al. 29. Electron-beam tomography coronary artery calcium and cardiac events: a 37-month follow-up of 5635 initially asymptomatic low- to intermediate-risk adults. Circula-tion 2003;107:2571-6.

Shaw LJ, Raggi P, Schisterman E, Ber-30. man DS, Callister TQ. Prognostic value of cardiac risk factors and coronary artery calcium screening for all-cause mortality. Radiology 2003;228:826-33.

Greenland P, LaBree L, Azen SP, Do-31. herty TM, Detrano RC. Coronary artery

calcium score combined with Framingham score for risk prediction in asymptomatic individuals. JAMA 2004;291:210-5.

Vliegenthart R, Oudkerk M, Hofman A, 32. et al. Coronary calcification improves car-diovascular risk prediction in the elderly. Circulation 2005;112:572-7.

Arad Y, Goodman KJ, Roth M, 33. Newstein D, Guerci AD. Coronary calcifi-cation, coronary disease risk factors, C-reactive protein, and atherosclerotic cardiovascular disease events: the St. Francis Heart Study. J Am Coll Cardiol 2005;46:158-65.

Taylor AJ, Bindeman J, Feuerstein I, 34. Cao F, Brazaitis M, O’Malley PG. Coro-nary calcium independently predicts inci-dent premature coronary heart disease over measured cardiovascular risk factors: mean three-year outcomes in the Prospec-tive Army Coronary Calcium (PACC) proj-ect. J Am Coll Cardiol 2005;46:807-14.

Budoff MJ, Shaw LJ, Liu ST, et al. 35. Long-term prognosis associated with cor-onary calcification: observations from a registry of 25,253 patients. J Am Coll Car-diol 2007;49:1860-70.

Lakoski SG, Greenland P, Wong ND, 36. et al. Coronary artery calcium scores and risk for cardiovascular events in women classified as “low risk” based on Framing-ham risk score: the Multi-Ethnic Study of Atherosclerosis (MESA). Arch Intern Med 2007;167:2437-42.

Church TS, Levine BD, McGuire DK, 37. et al. Coronary artery calcium score, risk factors, and incident coronary heart dis-ease events. Atherosclerosis 2007;190: 224-31.

Becker A, Leber A, Becker C, Knez A. 38. Predictive value of coronary calcifications for future cardiac events in asymptomatic patients. Am Heart J 2008;155:154-60.

Kim KP, Einstein AJ, Berrington de 39. González A. Coronary artery calcification screening: estimated radiation dose and cancer risk. Arch Intern Med 2009;169: 1188-94.

Brenner DJ, Hall EJ. Computed to-40. mography — an increasing source of ra-diation exposure. N Engl J Med 2007; 357:2277-84.

Fazel R, Krumholz HM, Wang Y, et al. 41. Exposure to low-dose ionizing radiation from medical imaging procedures. N Engl J Med 2009;361:849-57.

Lauer MS. Elements of danger — the 42. case of medical imaging. N Engl J Med 2009;361:841-3.

Preventive Services Task Force. Guide 43. to clinical preventive services. 2nd ed. (Accessed July 21, 2009, at http://odphp.osophs.dhhs.gov/pubs/guidecps/PDF/Frontmtr.PDF.)

He ZX, Hedrick TD, Pratt CM, et al. 44. Severity of coronary artery calcification by electron beam computed tomography

Copyright © 2009 Massachusetts Medical Society. All rights reserved. Downloaded from www.nejm.org at HHS LIBRARIES CONSORTIUM on April 12, 2010 .

Page 68: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

clinical pr actice

n engl j med 361;10 nejm.org september 3, 2009 997

predicts silent myocardial ischemia. Cir-culation 2000;101:244-51.

Wong ND, Detrano RC, Diamond G, 45. et al. Does coronary artery screening by electron beam computed tomography mo-tivate potentially beneficial lifestyle behav-iors? Am J Cardiol 1996;78:1220-3.

Taylor AJ, Bindeman J, Feuerstein I, et 46. al. Community-based provision of statin and aspirin after the detection of coro-nary artery calcium within a community-based screening cohort. J Am Coll Cardiol 2008;51:1337-41.

O’Malley PG, Feuerstein IM, Taylor AJ. 47. Impact of electron beam tomography, with or without case management, on motivation, behavioral change, and car-

diovascular risk profile: a randomized con-trolled trial. JAMA 2003;289:2215-23.

Epstein SE, Quyyumi AA, Bonow RO. 48. Myocardial ischemia — silent or symp-tomatic. N Engl J Med 1988;318:1038-43.

Preventive Services Task Force. Screen-49. ing for coronary heart disease. (Accessed July 21, 2009, at http://www.ahrq.gov/clinic/3rduspstf/chd/chdrs.htm.)

Greenland P, Abrams J, Aurigemma 50. GP, et al. Prevention Conference V: be-yond secondary prevention: identifying the high-risk patient for primary preven-tion: noninvasive tests of atherosclerotic burden. Circulation 2000;101(1):e16-e22.

Hendel RC, Patel MR, Kramer CM, et 51. al. ACCF/ACR/SCCT/SCMR/ASNC/NASCI/

SCAI/SIR 2006 appropriateness criteria for cardiac computed tomography and cardiac magnetic resonance imaging: a report of the American College of Cardi-ology Foundation Quality Strategic Direc-tions Committee Appropriateness Criteria Working Group, American College of Ra-diology, Society of Cardiovascular Com-puted Tomography, Society for Cardio-vascular Magnetic Resonance, American Society of Nuclear Cardiology, North Amer-ican Society for Cardiac Imaging, Society for Cardiovascular Angiography and In-terventions, and Society of Interventional Radiology. J Am Coll Cardiol 2006;48: 1475-97.Copyright © 2009 Massachusetts Medical Society.

Copyright © 2009 Massachusetts Medical Society. All rights reserved. Downloaded from www.nejm.org at HHS LIBRARIES CONSORTIUM on April 12, 2010 .

Page 69: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

http://ctj.sagepub.com

Clinical Trials

DOI: 10.1177/1740774506073173 2006; 3; 496 Clin Trials

Robert M Califf Clinical trials bureaucracy: unintended consequences of well-intentioned policy

http://ctj.sagepub.com/cgi/content/abstract/3/6/496 The online version of this article can be found at:

Published by:

http://www.sagepublications.com

On behalf of:

The Society for Clinical Trials

can be found at:Clinical Trials Additional services and information for

http://ctj.sagepub.com/cgi/alerts Email Alerts:

http://ctj.sagepub.com/subscriptions Subscriptions:

http://www.sagepub.com/journalsReprints.navReprints:

http://www.sagepub.co.uk/journalsPermissions.navPermissions:

http://ctj.sagepub.com/cgi/content/refs/3/6/496SAGE Journals Online and HighWire Press platforms):

(this article cites 30 articles hosted on the Citations

at NIMH NIH on September 17, 2008 http://ctj.sagepub.comDownloaded from

Page 70: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

Clinical Trials 2006; 3: 496–502 ARTICLE

© Society for Clinical Trials 2006 10.1177/1740774506073173

Introduction

The tool of randomization represents a criticaladvance that has enabled modern medicine todistinguish modest but important effects, bothfavorable and detrimental, of medical technologieson patients’ well-being. Although the first modernmulticenter trial was reported in 1947 in the BritishMedical Journal [1], it was only in the 1970s that the

clinical trial became a widely-accepted tool for thebiomedical enterprise. Over the past 30 years, layerafter layer of regulation has been added to theconduct of clinical trials, but there has been littleaccountability for whether this bureaucracy ishaving the net effect of improving the human con-dition, or whether it is unnecessarily impeding theconduct of complex endeavors critical to guidingthe appropriate use of technology [2].

Clinical trials bureaucracy: unintendedconsequences of well-intentioned policy

Robert M Califf

Background As randomized controlled trials have become the ‘gold standard’ formedical research, a complex regulatory structure for the conduct of clinical trialshas emerged. However, this structure has not been adequately assessed to ensurethat regulations governing human subjects research actually produce the desiredeffects.Purpose Our purpose is to identify some of the major shortcomings in the currentregulatory system of human clinical trials oversight, and to propose some potentialsolutions to these problems.Methods We discuss the evolution of the current US regulatory environment andits application in the context of several widely-used drug therapies.Results Despite numerous randomized controlled trials, performed within a struc-ture of extensive documentation and data collection, serious shortcomings in anumber of pharmaceutical therapies were not detected until after the drugs wereapproved and widely adopted by clinicians.Conclusion The current system of regulatory bureaucracy in clinical trials has ledto an extremely expensive research paradigm that, in spite of complex systems ofoversight and exhaustive data collection, cannot be shown to adequately ensurethe integrity of the research process and the protection of human research subjects.Some parts of the system, including Research Ethics Review Boards, may not bewell-suited to carrying out their core mission of overseeing research conduct, andother aspects of clinical trials regulatory structure, such as monitoring/auditingreview and adverse event reporting, may constitute a waste of money andresources. Misdirected data collection and adverse events reporting divert valuableresources and hamper development of large, simple clinical trials powered todefinitively answer important research questions. Careful scrutiny of the utility ofcurrent or proposed regulatory schemes is required to ensure the integrity ofhuman subjects research and to enhance the effectiveness of research dollars.Clinical Trials 2006; 3: 496–502. http://ctj.sagepub.com

Duke Clinical Research Institute and the Duke University Medical Center, Durham, NC, USAAuthor for correspondence: Robert M Califf, Duke Clinical Research Institute, 2400 Pratt Street, Box 17969, Durham,NC 27715, USA. E-mail: [email protected] on a presentation at the 10th International Symposium on Long-Term Clinical Trials, Keble College, Oxford, UK,September 2005.

at NIMH NIH on September 17, 2008 http://ctj.sagepub.comDownloaded from

Page 71: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

A major milestone in the history of this scientifictool was the decision by the US Food and DrugAdministration (FDA) to review the evidence for allpharmaceuticals on the market in 1962 [3]. At thatpoint, it was determined that for most drugs on themarket, the evidence to determine the balance ofrisk and benefit was not available. Many productswere withdrawn from the market and the random-ized controlled trial (RCT) was established as thestandard for determining the efficacy of therapiesfrom the FDA perspective.

The adoption of this standard dramaticallyincreased the diligence with which pharmaceuticalcompanies pursued randomized trials but also led toan understandable desire for a regulatory structureunder which the interpretation of regulatory trialscould be accepted and interpreted. When problemswere found with the conduct of RCTs, new ruleswere developed to prevent recurrence of the issues[4], but the complex maze of rules was not reviewedto eliminate those without benefit. Indeed, thissituation led to multiple poorly-coordinated effortsto codify standard operating procedures (SOPs) forthe conduct of trials and ultimately created anentirely new industry: the contract research organi-zation (CRO). CROs have become finely attuned toproducing trials that satisfy the stated needs of FDAofficials and regulatory departments within themedical technology industry; these entities (FDA,regulatory departments and CROs) wield tremen-dous power in determining the study designs thatultimately define which products are marketed andhow those products are labeled.

A second series of demands for regulation, andtherefore creation of bureaucracy, resulted from theincreasingly widespread realization that clinical trialsare, in essence, human experiments. As recognitionof the potential for abuse of volunteers in humanstudies mounted, regulations intended to protecthuman subjects proliferated [5]. A series of controver-sial failures in the system [6,7] recently have led to aneven more stringent set of rules intended to preventinvestigators and corporations from exposing sub-jects to undue conflicts of interest or unnecessaryrisks as a result of participating in a clinical trial.

The end result of this expression of regulatoryand ethical concerns about clinical trials has beenthe development of a series of well-intentioned reg-ulations and business standards. Due at least in partto the nature of the regulations and the risk ofpenalty and public humiliation if regulators decidethat a breach of the rules has occurred, few havechallenged each new layer of bureaucracy. This situ-ation has culminated in a feeling among manyinvestigators in the United States that the regulatoryburden has become so great that the risks of partici-pating as an investigator exceed the potential bene-fits. Furthermore, the cost of regulatory compliance

has become large enough to create a major incentivefor corporations to move research to countrieswith a much lower cost structure for doing thesame work. Fundamentally, the transactional cost ofthe bureaucracy originally intended to improvethe process of clinical research threatens its viability.

Definitions

The definition of bureaucracy provides insight intoissues that need to be addressed in ‘fixing’ the per-ceived problem of excessive bureaucracy. Merriam-Webster’s dictionary defines bureaucracy as a bodyof nonelective government officials or an adminis-trative policy-making group. More specifically, abureaucracy can be described as a government (orgoverning entity) characterized by specialization offunctions, adherence to fixed rules, and a hierarchyof authority. Finally and of most concern to clinicalresearchers is the definition: ‘a system of adminis-tration marked by officialism, red tape, and proli-feration.’ A tenet of this manuscript is that while abureaucracy is needed, that bureaucracy should beone that actively avoids this latter definition.Instead, the bureaucracy should be characterized byan unceasing effort to optimize the function ofclinical research, and clinical trials as componentsof clinical research, to answer important questionsabout medical technologies and practice while alsoprotecting human subjects from harm due to poorresearch conduct [8].

Clinical trials bureaucracy: the generallandscape

The overall ‘megatrends’ in health care and bio-medical research cry out for a reassessment of thebureaucracy that regulates research efforts. The agingof the population in the economically developedworld and the rapid population growth and rapidly-emerging health systems in developing countriesare leading to significant financial pressure onhealth care systems. This societal financial pressure,combined with the rapid evolution of consumerismat the level of the individual, has produced substan-tial demand for an understanding of the relationshipsbetween value and cost in medical technologies. Theknowledge base for medical therapeutics has evolvedso that clinical trials are recognized as the mostreliable evidence upon which to base our decisionsabout appropriate medical therapies [9]. Mostgovernments and professional societies are vestedin the development of clinical practice guidelinesand technology assessments in which the random-ized clinical trial is viewed as the highest level ofevidence [10,11].

Clinical trials bureaucracy 497

http://ctj.sagepub.com Clinical Trials 2006; 3: 496–502

at NIMH NIH on September 17, 2008 http://ctj.sagepub.comDownloaded from

Page 72: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

498 Robert M Califf

Clinical Trials 2006; 3: 496–502 http://ctj.sagepub.com

The post-genomic revolution will increase thepressure on the system to produce irrefutable evi-dence about which technologies are effective andwhich are ineffective or even dangerous. Knowledgeof genes, proteins, and metabolites, along with func-tional imaging, will divide populations into multi-ple ‘phenotypes’, creating both an unprecedentedopportunity to improve clinical outcomes as well asan enormous subgroup issue that will require manymore trials in larger numbers of human subjects. Asthe market becomes flooded with inadequately-studied tests used to stratify treatment, the demandfor adequate comparative studies will understand-ably increase. With current regulations, however,the cost of the required increase in clinical trialsenrollment will be prohibitive.

A broad system for optimizing medical therapeu-tics has now been envisioned. The conduct of alarger number of pragmatic clinical trials isabsolutely essential to the generation of the level ofevidence needed to guide practice. When the righttrials are done, definitive evidence is developed thatcan be used to create clinical practice guidelines.From clinical practice guidelines, the context ofappropriate practice can be described and perform-ance metrics for practice can be generated [12]. Wenow know that these metrics can be used toimprove practice, thereby reducing death anddisability [13]. We also know that practice can beexamined to continuously improve delivery.However, this ‘cycle of quality’ is completelydependent upon the foundation of well-designedpragmatic clinical trials [14]. The combination ofincreasing societal awareness of the critical natureof clinical trials in defining appropriate uses oftechnology, coupled with the implications of thegenomic revolution, requires a radical restructuringof bureaucracy to permit many more subjects toparticipate in trials at a much lower cost.

The general problem

Given the proliferation of regulations and stan-dards governing clinical trials, one would hope thatthe effort is yielding increasingly efficient resultswith fewer problems, including reductions in viola-tions of human research subject protections [15].Instead, we have ample evidence that the researchsystem has become increasingly expensive, and thefew studies that are available indicate that theresearch dollar is producing fewer results [16]. Asclinical research expenses have skyrocketed, thenumber of new molecular entities reaching themarket has declined in recent years. Furthermore,when the recent histories of late drug failures arereviewed [17], the problem that emerged was notfailure to adhere to regulations, but expenditure of

huge amounts of money on the collection of vastamounts of irrelevant data. These failures haveoccurred in many different fields of medicine.

The failure of the Type I ‘anti-arrhythmic’ drugsprovides the prototype example. Multiple clinicaltrials were completed to demonstrate that thesedrugs reduced the frequency of ventricular prema-ture beats in patients thought to be at risk ofsudden death. None of the trials was large enoughto determine the actual effect on death. Thousandsof pages of data were collected and adverse eventswere dutifully reported to regulatory authorities.Unfortunately, these trials did not ask the funda-mentally simpler but more important question: didthese drugs reduce or increase the risk of suddendeath? Unfortunately, the answer came after thedrugs had been on the market for years: they didindeed increase rather than decrease the risk ofsudden death [18]. Thus, the bureaucracy divertedattention from the critical question and created acost structure that delayed the crucial trials for yearswhile the drugs were being used in practice in alethal manner.

The field of women’s health was rocked by thedemonstration in two consecutive trials that theuse of hormone replacement therapy (HRT) inpostmenopausal women increased rather thandecreased cardiovascular events [19,20]. At the timethese trials were performed, HRT was the most com-monly prescribed drug therapy globally. Many trialshad been done with small numbers of women withthe intention of providing mechanistic insight intothe ‘benefits’; however, little consideration wasgiven to the potential for harm. Additionally, yearsof adverse event reporting and mountains ofdetailed data collection in these small samplesfailed to uncover the problem. The cost of theWomen’s Health Initiative project (over $200million) has been used as a reason to recommenddecreasing the public funding of clinical trials [21],yet without the trial this treatment would still beleading to thousands of unnecessary deaths, strokesand myocardial infarctions in the hands of well-intentioned doctors and postmenopausal women.

The exposure of large numbers of patients toCOX-2 inhibitors occurred before their effect oncardiovascular events was discovered. Again, thepattern of failure involved multiple small clinicaltrials, with huge amounts of data collected on eachpatient at a tremendous cost. The studies wereheavily monitored: fortunes were spent on flyinghighly-paid monitors as often as once per month toinvestigational sites, where all data items werechecked to ensure that the data submitted to thecoordinating center matched the clinical sourcedocumentation. As in the cases of Type I anti-arrhythmics and HRT, one cannot help but wonderwhether spending the same amount of money on

at NIMH NIH on September 17, 2008 http://ctj.sagepub.comDownloaded from

Page 73: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

large studies with comparisons of clinical out-comes, while at the same time dispensing with allthe monitoring and source documentation, couldhave achieved a reliable estimate of the balance ofrisk and benefit.

We are currently experiencing a period in whichmultiple pragmatic questions are plaguing drugscommonly used in the treatment of mental healthproblems. Selective serotonin reuptake inhibitor(SSRI) antidepressants seem to cause an increase insuicidal ideation [22], and 10% of American boysare said to be taking drugs to treat attention deficitdisorder that are known to raise heart rate andblood pressure, and for which anecdotal cases ofsudden death have been reported [23]. Perhaps nofield other than mental health drug developmenthas conducted so many small trials with mountainsof data with so little understanding of the balanceof risks and benefits of taking the drugs.

All of these failures have in common a tremen-dous investment in detailed data collection, exten-sive auditing of the data collected, and scrutiny thatthe hypotheses tested in the studies were con-firmed. These expensive methods would not haveallowed early detection of these problems, becausemuch larger size trials would have been needed andthe costs would have been untenable. Given thepropensity among those who do not fully compre-hend the risks involved in skipping evaluative clini-cal trials to advocate the use of unproven putativesurrogate endpoints as a cost-saving measure [24], itis especially critical to eschew unnecessary costs sothat these vital pragmatic trials can be done.

Research ethics review boards,institutional review boards and ethics committees

Few would quibble with the concept that humanstudies should have oversight by an unbiased groupof people representing diverse interests. Yet thesystem of research ethics review boards (RERBs),also known as ethics committees or institutionalreview boards (IRBs), has taken on a life of its own.Bureaucracy proliferates as new rules are addedwhile outdated or even useless regulations oftenremain in place [25–27]. Indeed, almost noempirical studies have been done to determinewhich aspects of human subjects protection over-sight are effective and which are useless or evencounterproductive.

In May 2000 a multidisciplinary meeting involv-ing investigators, regulators, RERB chairs, andindustry representatives was held to address thisproblem [28]. But while progress has been madesince 2000 in some areas, it would be difficult toconclude that the bureaucracy has been ‘right-sized’

for the function of human subjects protectionoversight.

At this meeting, numerous gaps in the function-ing of the system were identified. RERBs werethought to be incapable of making safety assess-ments based on the data received from multicentertrials, except in the case of rare, idiosyncraticadverse events. An enormous duplication of effortoccurs in multicenter trials, with each RERB review-ing consent forms and looking at the same uselessreports of adverse events accruing from all over theworld. Empirical evidence indicates that tamperingwith consent forms by local RERBs often rendersthem harder to understand, and despite all of ourefforts subjects frequently cannot demonstrate thatthey truly understand the nature of the study inwhich they have agreed to participate [29]. Recentempirical data [30] indicate that the present fervorover informing subjects about details of potentialconflicts of interest may have been overblown fromthe perspective of what subjects actually want toknow.

Most of the misplaced and very expensive effortsof local RERBs would be better performed by well-functioning data monitoring committees (DMCs).Only in 2002 was the first textbook for DMCswritten, and the ‘rules of engagement’ are still inevolution [31]. However, it is obvious that in amulticenter trial, each local RERB is not in a posi-tion to interpret individual adverse event reports,particularly when the RERB is neither unblindednor aware of the denominator that forms the essen-tial context for interpreting adverse events in trialsthat are still enrolling subjects. Despite dozens ofmeetings and statements decrying this absurd gen-eration of mountains of expensive paper withalmost nothing to show for it, the policies have yetto change. Furthermore, DMCs are now under thegun of the legal system, causing an increase in theneed for expensive documentation and indemnifi-cation for those who fill this vital societal role [32].

While RERBs remain bogged down reviewingprotocols and consent forms, there is little to nooversight of the actual conduct of clinical trials.Such an effort would require real-time, sample-based observation of the functioning of theresearch team as it interacts with human subjects,together with interviews of subjects to determinethe degree to which they understood the issuesraised by participation in the trial.

Adverse events reporting and collection

When medical products (drugs, devices or biolog-ics) are made available to prevent or treat illness, asystem for reporting and tracking adverse eventsrelated to the use of the technology makes sense.

Clinical trials bureaucracy 499

http://ctj.sagepub.com Clinical Trials 2006; 3: 496–502

at NIMH NIH on September 17, 2008 http://ctj.sagepub.comDownloaded from

Page 74: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

500 Robert M Califf

Clinical Trials 2006; 3: 496–502 http://ctj.sagepub.com

Indeed, despite its shortcomings [33], the sponta-neous adverse events reporting system currently inplace has resulted in the identification of a numberof adverse effects of drugs and devices that have ledto modifications in use or withdrawal from themarket. However, partial success should not justifycontinued proliferation of useless components. Inparticular, many companies have interpreted regu-lations as requiring that all adverse events bereported to all RERBs participating in clinical trialswith the drug or device, a practice that costs hun-dreds of millions of dollars. When RERBs receivethese useless documents, they feel compelled toassign a clinical expert to assess the report. Themailing and filing costs alone are enormous, butadded to this is the incalculable cost of professionaltime, much of it provided on a voluntary basis. TheFDA recently developed a tentative guidance thatwould dramatically reduce the proliferation ofadverse event reports, but this guidance has notbeen finalized, and the practice continues.

Auditing and monitoring

Well-publicized examples of untruthful behavior byclinical investigators have fueled an understandableinterest in the creation of systems to ensure thatinvestigators are both truthful and accurate whenthey report clinical research data. The response tothis interest has been to develop an elaboratesystem of data auditing in clinical trials that aresubmitted for regulatory review. For many industry-sponsored studies, the SOPs call for every casereport form to be reviewed in person by a monitorwho physically visits the enrolling site. This is avery expensive process, as study monitors are well-paid and incur significant travel and lodgingexpenses.

This labor-intensive approach has been defendedfrom two perspectives. First, a company hoping togain regulatory approval of its product for market-ing can ill afford to fail an audit from regulators,with the result that extra up-front expense is justi-fied. Second, there is deep concern about the verac-ity of investigators as they report data in the courseof clinical trials. On-site auditing to detect fraud andcorrect sloppiness, and the extensive efforts to verifythe accuracy of the data, meet ‘common sense’views of how to produce reliable research results.This rationale is reinforced by the FDA policy of on-site auditing, often performed by individuals withlittle knowledge of the clinical material, which rele-gates them to checking process-related issues andexamining concordance of different data sources.Unfortunately errors, and even on occasion frankfraud, are found often enough to lend credence tothe concept of on-site auditing.

One effect of extensive auditing is the generationof ‘data queries’, a process by which sites aresolicited to correct or verify possibly discrepant orincorrect data. An entire industry has developedaround creating, communicating, and archiving thedocumentation necessary for verifying the natureof, and individuals responsible for, changes made tostudy records. It is estimated that the cost of gener-ating and resolving a single query is over $150. Yetonly a few empirical studies have been done toexamine the merits of this practice, and thesestudies indicate that it may be a colossal waste ofmoney.

An empirical approach is sorely needed todevelop quality standards in clinical trials that leadto reliable results at the lowest cost. This approachwill involve statistical sampling in the samemanner as utilized in other industries. By examin-ing patterns in the reported data, answers thatappear either impossible or too ‘perfect’ can bedetected. Indeed, studies of fraud indicate that sta-tistical process control will more often detect fraudthan on-site auditing, since someone clever enoughto attempt to fabricate data will be inclined toprepare for an audit [34–37].

Dollars spent

The implications of the issues enumerated are pro-found in terms of the efficiency and effectiveness ofclinical research. One view of this problem was gen-erated from an exercise recently published in theAmerican Heart Journal [38]. A group of academics,industry experts, and FDA officials convened todiscuss the cost of conducting clinical trials. Twohypothetical clinical trials were developed and agroup of companies applied their operationalmodels to determine the time frames and resourcesneeded to conduct the trials. The average total costfor a trial in acute coronary syndromes was $83million, but estimates ranged from $57 million to$158 million. A heart failure trial was priced at anaverage of $135 million, with a range of $102 to$207 million. The cost of managing research sites(monitoring and data auditing) and payments tothe investigators accounted for �65% of total costs,with the main drivers of these costs being excessivedata collection and monitoring visits, with the asso-ciated proliferation of travel and labor costs.

Possible solutions

The development of the internet has made it possi-ble to automate much of what first developed as atedious and expensive set of manual processes inthe conduct of clinical trials. The current system is

at NIMH NIH on September 17, 2008 http://ctj.sagepub.comDownloaded from

Page 75: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

driven by companies seeking guidance in regula-tory and reimbursement issues, as well as seekingmarketing advantages for their products, and alsoby officials appointed to prevent poorly-conductedresearch of the type that has led to public concerns.Investigators are ‘hired’ to conduct the researchunder a set of regulations as described above. Aparallel system, funded by government agencies toanswer questions driven by the academic scientificcommunity, exists in some countries. These paralleluniverses overlap at the level of individual researchsites, but neither functions optimally in answeringthe questions that would best inform decisionsabout medical care.

An alternative approach would be to create net-works of investigators motivated to answer ques-tions of clinical relevance for both the medicalproducts industry and the health policy communi-ties. This infrastructure could be funded by govern-ments that maintain a continuous record of thequality of site performance. Industry could usethese sanctioned networks without bearing the costof creating expensive data collection instrumentsand data auditing systems. As electronic healthrecords are phased in, this possibility will becomeeven more substantial, creating the opportunity tomaintain ‘interoperable’ networks that could eval-uate technologies and behavioral interventionsacross multiple disease categories. Statistical processcontrol could be used to prompt cause-specificaudits based on data patterns, and fees to performresearch activities could be reimbursed at a profes-sional level directly to staff actually dealing withsubjects, instead of to auditors [39]. Both the USNational Institutes of Health and the BritishMedical Research Council are making initial effortsin this direction. The networks themselves couldparticipate in the design of the studies to ensurethat they are feasible and properly directed toanswer important questions.

If these efforts succeed, the bureaucracy cur-rently devoted to processes that have not beentuned to optimize the societal value of researchcould focus on driving the systems toward answer-ing questions that would better inform medicaldecision-making. Such an approach could alsoreverse the worrisome trend of control of data bythe medical products industry instead of by theinvestigators [40].

Conclusion

While bureaucracy is necessary in providing rulesand guidance for the conduct of clinical research,the system has become unnecessarily complex andexpensive. Many well-intentioned regulations haveresulted in huge amounts of effort and paperwork

with little or no measurable positive impact on theethical conduct of clinical research. However, it isclear that these regulations and standards havedramatically increased the cost of clinical trials,thereby making the enterprise much less efficient inits primary mission of answering questions thatinform decisions for patients and consumers,doctors and other health care providers and admin-istrators. Correction of this enormous societal wasteis critical not just to the clinical trials industry orthe industries that develop and sell medical prod-ucts. Rather, since the pragmatic clinical trial is thecornerstone of evidence upon which the quality ofmedical care is built, tuning the logistics to themost efficient delivery of meaningful results iscritical to the future of medicine.

References

1. MRC Streptomycin in Tuberculosis Trials Committee.Streptomycin treatment of pulmonary tuberculosis. BrMed J 1948; ii: 769–78.

2. Yusuf S. Randomized clinical trials: slow death by athousand unnecessary policies? CMAJ 2004; 17: 889–92.

3. Temple RJ. The effect of the Drug Regulation Reform Actof 1978 on clinical research, drug availability, and thepublic health. Ann N Y Acad Sci 1981; 368: 175–85.

4. Temple R, Pledger GW. The FDA’s critique of theanturane reinfarction trial. N Engl J Med 1980; 303:1488–92.

5. Capron AM. When experiments go wrong: the U.S.perspective. J Clin Ethics 2004; 15: 22–29.

6. Smith L, Byers JF. Gene therapy in the post-Gelsingerera. JONAS Healthc Law Ethics Regul 2002; 4: 104–10.

7. Anon. OHRP orders Johns Hopkins University to suspendhuman subjects research. J Investig Med 2001; 49: 379.

8. Califf RM, Morse MA, Wittes J et al. Toward protectingthe safety of participants in clinical trials. Control ClinTrials 2003; 24: 256–71.

9. Califf RM, Peterson ED, Gibbons RJ et al. Integratingquality of care into the cycle of therapeuticdevelopment. J Am Coll Cardiol 2002; 40: 1895–901.

10. Gibbons RJ, Smith S, Antman E; American College ofCardiology; American Heart Association. AmericanCollege of Cardiology/American Heart Associationclinical practice guidelines: Part I: where do they comefrom? Circulation 2003; 107: 2979–86.

11. Gibbons RJ, Smith SC Jr, Antman E; American Collegeof Cardiology; American Heart Association. AmericanCollege of Cardiology/American Heart Associationclinical practice guidelines: Part II: evolutionary changesin a continuous quality improvement project. Circulation2003; 107: 3101–107.

12. Bufalino V, Peterson ED, Burke GL et al. Payment forquality: guiding principles and recommendations:principles and recommendations from the American HeartAssociation’s Reimbursement, Coverage, and Access PolicyDevelopment Workgroup. Circulation 2006; 113: 1151–54.

13. Peterson ED, Roe MT, Mulgund J et al. Associationbetween hospital process performance and outcomesamong patients with acute coronary syndromes. JAMA2006; 295: 1912–20.

14. DeMets DL, Califf RM. Lessons learned from recentcardiovascular clinical trials: Part I. Circulation 2002; 106:746–51.

Clinical trials bureaucracy 501

http://ctj.sagepub.com Clinical Trials 2006; 3: 496–502

at NIMH NIH on September 17, 2008 http://ctj.sagepub.comDownloaded from

Page 76: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

502 Robert M Califf

Clinical Trials 2006; 3: 496–502 http://ctj.sagepub.com

15. Pattison J, Stacey T. Research bureaucracy in the UnitedKingdom: seeking a balance: response from theDepartment of Health and COREC. Br Med J 2004; 329:622.

16. Moses H 3rd, Dorsey ER, Matheson DH, Thier SO.Financial anatomy of biomedical research. JAMA 2005;294: 1333–42.

17. Woodcock J. An audience with Janet Woodcockdiscusses the role of the FDA in improving pharmaproductivity. Nat Rev Drug Discov 2004; 3: 904.

18. Morganroth J, Bigger JT Jr. Pharmacologic manage-ment of ventricular arrhythmias after the cardiacarrhythmia suppression trial. Am J Cardiol 1990; 65:1497–503.

19. Hulley S, Grady D, Bush T et al. Randomized trial ofestrogen plus progestin for secondary prevention ofcoronary heart disease in postmenopausal women. Heartand Estrogen/progestin Replacement Study (HERS)Research Group. JAMA 1998; 280: 605–13.

20. Rossouw JE, Anderson GL, Prentice RL et al. Risks andbenefits of estrogen plus progestin in healthypostmenopausal women: principal results from theWomen’s Health Initiative randomized controlled trial.JAMA 2002; 288: 321–33.

21. Marks AR. Rescuing the NIH before it is too late. J ClinInvest 2006; 116: 844.

22. March J, Silva S, Petrycki S et al. Fluoxetine, cognitive-behavioral therapy, and their combination foradolescents with depression: treatment for AdolescentsWith Depression Study (TADS) randomized controlledtrial. JAMA 2004; 292: 807–20.

23. Nissen SE. ADHD drugs and cardiovascular risk. N Engl JMed 2006; 354: 1445–48.

24. Begg CB, Brawley O, Califf RM, Demets DL, EllenbergSS, Kaplan RS. Marketing drugs too early in testing.Science 2006; 312: 195.

25. Burman WJ, Reves RR, Cohn DL, Schooley RT.Breaking the camel’s back: multicenter clinical trials andlocal institutional review boards. Ann Intern Med 2001;134: 152–57.

26. Jamrozik K. Research ethics paperwork: what is the plotwe seem to have lost? Br Med J 2004; 329: 286–87.

27. Peto J, Fletcher O, Gilham C. Data protection, informedconsent, and research. Br Med J 2004; 328: 1029–30.

28. Morse MA, Califf RM, Sugarman J. Monitoring andensuring safety during clinical research. JAMA 2001; 285:1201–205.

29. Sugarman J, Lavori PW, Boeger M et al. Evaluating thequality of informed consent. Clin Trials 2005; 2: 34–41.

30. Weinfurt KP, Dinan MA, Allsbrook JS et al. Policies ofacademic medical centers for disclosing financialconflicts of interest to potential research participants.Acad Med 2006; 81: 113–18.

31. Ellenberg S, Fleming TR, DeMets DL. Data monitoringcommittees in clinical trials: a practical perspective (statisticsin practice). John S Wiley and Sons, 2003.

32. DeMets DL, Fleming TR, Rockhold F et al. Liabilityissues for data monitoring committee members. ClinicalTrials 2004; 1: 525–31.

33. Gross R, Strom BL. Toward improved adverse event/suspected adverse drug reaction reporting. Pharmaco-epidemiol Drug Saf 2003; 12: 89–91.

34. Chilappagari S, Kulkarni A, Bolich-Aldrich S et al.A statistical process control method to monitorcompleteness of central cancer registry reporting data.J Registry Mgt 2002; 4: 121–27.

35. Svolba G, Bauer P. Statistical quality control in clinicaltrials. Cont Clin Trials 1999; 20: 519–30

36. Nahm M. Optimizing data quality through statisticalprocess control. 15th Annual US DIA Clinical DataManagement Symposium: Innovative Technologies andStrategies for the Global Management of Clinical Data.14–17 March 2000, Philadelphia, Pennsylvania, USA.

37. Rostami R. Optimized clinical trial data quality checksusing SPC. 14th Annual European Clinical DataManagement Conference, DIA. 8–10 November 2004,Amsterdam, The Netherlands.

38. Eisenstein EL, Lemons PW 2nd, Tardiff BE, SchulmanKA, Jolly MK, Califf RM. Reducing the costs of phase IIIcardiovascular clinical trials. Am Heart J 2005; 149: 482–88.

39. Ferris L, Naylor CD. Physician remuneration in industry-sponsored clinical trials: the case for standardized clinicaltrial budgets. CMAJ 2004; 171: 883–86.

40. Schulman KA, Seils DM, Timbie JW et al. A nationalsurvey of provisions in clinical-trial agreements betweenmedical schools and industry sponsors. N Engl J Med2002; 347: 1335–41.

at NIMH NIH on September 17, 2008 http://ctj.sagepub.comDownloaded from

Page 77: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

When Is Measuring Sensitivity and Specificity Sufficient To Evaluate aDiagnostic Test, and When Do We Need Randomized Trials?Sarah J. Lord, MBBS, MS; Les Irwig, MBBCh, PhD; and R. John Simes, MBBS, MS, MD

The clinical value of using a new diagnostic test depends onwhether it improves patient outcomes beyond the outcomesachieved using an old diagnostic test. When can studies of diag-nostic test accuracy provide sufficient information to infer clinicalvalue, and when do clinicians need to wait for results from ran-domized trials? The authors argue that accuracy studies suffice if anew diagnostic test is safer or more specific than, but of similarsensitivity to, an old test. However, if a new test is more sensitivethan an old test, it leads to the detection of extra cases of disease.Results from treatment trials that enrolled only patients detected by

the old test may not apply to these extra cases. Clinicians need towait for results from randomized trials assessing treatment efficacyin cases detected by the new diagnostic test, unless they can besatisfied that the new test detects the same spectrum and subtypeof disease as the old test or that treatment response is similar acrossthe spectrum of disease.

Ann Intern Med. 2006;144:850-855. www.annals.orgFor author affiliations, see end of text.

Diagnostic tests are used to confirm, exclude, classify, ormonitor disease to guide treatment. Their clinical

value depends on whether the information they provideleads to improved patient outcomes; this can be assessed byrandomized trials that compare patient outcomes from thenew diagnostic test versus the old test strategy. However,randomized trials of test-and-treatment strategies are notroutinely performed. They are not required for marketingapproval, and they are not always feasible because theyrequire large sample sizes. As a result, new diagnostic testsfrequently enter clinical practice without evidence of im-proved patient outcomes.

Studies of diagnostic test accuracy can show how welldiagnostic strategies that include a new test identify thepresence or absence of disease compared with an old teststrategy by comparing each with the results of a referencestandard test. However, if clinicians use only informationabout test accuracy to decide whether to adopt a new di-agnostic test, they sometimes may harm patients (for ex-ample, if subsequent treatments are unsafe or ineffective).It is therefore worthwhile to investigate how best to decideif clinicians can rely on evidence about test accuracy or ifthey need to wait for patient outcome results from ran-domized trials. Fryback and Thornbury (1) described ahierarchy of 6 levels of evidence for the assessment of adiagnostic test: 1) the technical quality of test information;2) diagnostic accuracy; 3) change in the referring physi-cian’s diagnostic thinking; 4) change in the patient man-agement plan; 5) change in patient outcomes; and 6) soci-etal costs and benefits. This framework does not provideguidelines about if and when lower levels of evidence areadequate to assess a test and always requires randomizedtrials for conclusions about improved patient outcomes.Some researchers have suggested that accuracy studiesalone may sometimes suffice (2–6). We provide a practicalframework to help clinicians decide whether a new diag-nostic test can be adopted on the basis of evidence of testaccuracy alone or if they need to wait for results fromrandomized trials.

A FRAMEWORK FOR DECIDING IF EVIDENCE OF TEST

ACCURACY WILL SUFFICE

Studies of diagnostic test accuracy can suffice if clini-cians already have evidence from randomized trials show-ing that treatment of the cases detected by the diagnostictest improved patient outcomes (Figure 1). This approachmay seem straightforward; however, it requires a clear un-derstanding of the proposed use and benefits of the newtest, as well as careful consideration of whether the casesdetected are representative of the patients included in treat-ment trials.

The benefits of a new diagnostic test will vary accord-ing to how it is used. Investigators of a new diagnostic testneed to explicitly state who will be tested, where the newtest will fit in the existing diagnostic pathway, and whattests it will supplement or replace. This information willallow them to identify the expected benefits of adoptingthe new test and, therefore, the most relevant questions toask to assess its value. Test attributes generally fall into 3categories: 1) The test is safer or is less costly; 2) the test ismore specific (excludes more cases of nondisease) and thusavoids unnecessary treatment; and 3) the test is more sen-sitive (detects more cases of disease) and thus promotesmore appropriate treatment.

We propose a simple sequence of questions to guidedecisions about whether evidence of test accuracy will suf-fice for each of these 3 categories (Figure 2). The first stepin assessing a new diagnostic test is to classify it accordingto whether it is more sensitive than the old test. We de-scribe the key concepts of this framework using simpleexamples to describe situations where the new test offersbetter safety or specificity with similar sensitivity, followed by

See also:

Web-OnlyConversion of figures and table into slides

Annals of Internal Medicine Academia and Clinic

850 © 2006 American College of Physicians

Page 78: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

consideration of situations where the test is more sensitive. Wealso consider other, more complex scenarios. In each of theseexamples, we have some existing evidence of treatment ef-ficacy for cases detected by an old test, and therefore, therationale for testing has already been established.

When a New Test Has Similar Sensitivity to an Old TestIf a new diagnostic and old diagnostic test have similar

sensitivity, it is generally reasonable to assume that theywill detect the same true cases of disease. However, a newtest may offer other positive attributes, such as better safetyor more specificity than an old test. If studies of test accu-racy show that the new test offers other positive attributeswithout a loss of sensitivity, it is logical to assume that casesdetected by either test will show the same response to treat-ment. Therefore, new trials assessing treatment efficacy inthe cases detected by the new test are not needed.

When a new test and an old test show similar sensi-tivity and specificity, the value of the new test correspondsto the benefits of avoiding the adverse events or costs asso-ciated with the old test. For example, Doppler ultrasonog-raphy has been adopted as a less invasive replacement forvenography for diagnosis of deep venous thrombosis (7).Given that trials have shown that treatment of this condi-tion is effective (8), studies of test accuracy showing thatDoppler ultrasonography and venography have similar sen-sitivity and specificity suffice to determine the value of thenew test (Table, example 1).

If a new test is more specific than an old test, it avoidsthe harms of unnecessary further investigation or treat-ment. In this scenario, evidence showing that the new testhas similar sensitivity but better specificity than the old testwill suffice to determine the new test’s value. For example,new immunochemical fecal occult blood testing kits have

been proposed to offer increased specificity, without a lossof sensitivity, for detecting human blood, reducing thehigh false-positive rate of the older guaiac-based test kits(9). Given that trials have already shown the benefit ofdetecting colorectal cancer with guaiac test kits, studies oftest accuracy will suffice to determine whether the immu-nochemical tests have increased specificity without anysubstantial loss in sensitivity (Table, example 2).

Our assessment is more complex when we consider anew test that involves a tradeoff between positive and neg-ative attributes. We deal with this situation later.

When the New Test Is More Sensitive than the Old TestIf a new test is more sensitive than an old test but has

similar specificity, its value is directly related to the treat-ment response in the extra cases detected. If treatment re-sponse has already been assessed by treatment trials enroll-ing patients detected by the new test, decisions to use thetest will be based on whether these trials showed that treat-ment improves patient outcomes. In this instance, evidenceof test accuracy linked with evidence of treatment efficacyreplaces the need for new randomized trials (Figure 1).

There will also be a good match between tested andtreated populations if the results for patients identified by anew test are analyzed in a treatment trial as potential pre-dictors of treatment response. For example, trials of adju-vant tamoxifen among women with early breast cancerhave shown that the estrogen receptor status of the tumordetermines its response to treatment (10). These trialsdemonstrate the clinical value of testing for estrogen recep-tor status, as well as the effectiveness of tamoxifen therapy.However, more commonly, treatment trials have only en-

Figure 1. Trial evidence versus linked evidence of test accuracy and treatment efficacy.

Randomized Trial

Targetpopulation

New test

Casesdetected

Treatment

Patientoutcomes

Old test

Casesdetected

Treatment

Patientoutcomes

Test Accuracy Study plus Randomized Trial of Treatment Efficacy

Targetpopulation

New testaccuracy

Casesdetected

Old testaccuracy

Casesdetected

Casesdetected*

Treatment

Patientoutcomes

Notreatment

Patientoutcomes

plus

*Cases detected by the new and old test may not show similar response to treatment.

Academia and ClinicAssessment of New Diagnostic Tests

www.annals.org 6 June 2006 Annals of Internal Medicine Volume 144 • Number 11 851

Page 79: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

rolled cases detected by an old test. In these situations,clinicians need to consider whether the results apply tocases detected by the new test.

Assessing Whether the Extra Cases Detected by a New,More Sensitive Test Respond to Treatment

Clinicians first need to ask whether the extra casesdetected by a new diagnostic test represent the same spec-trum or subtypes of disease as those included in treatmenttrials. If they do, then randomized trials may not be re-quired (Figure 2). If they do not, or if clinicians cannot becertain that they do, then clinicians also need to askwhether treatment response is known to be similar acrossthe range of disease spectrum or subtype. These consider-ations are important regardless of the magnitude of thedifference in sensitivity between tests.

At one extreme, a good match between tested and

treated populations is possible if the reference standardused to determine test sensitivity and specificity is also thestarting point for trials conducted in similar populations.For example, computed tomography colonography is moresensitive for the detection of large colorectal polyps whenscanning is performed with the patient in prone and supinepositions versus supine positioning alone, without reducedspecificity (11). Treatment trials showing improved sur-vival following the early detection and treatment of colo-rectal polyps have been based on cases detected by colonos-copy, the reference standard for the computed tomographycolonography accuracy studies. All of these tests are used toidentify the same disease characteristic (adenomatous colo-rectal polyps) and spectrum of disease (classified by polypsize), so it is reasonable to conclude that computed tomog-raphy colonography with dual positioning will improve pa-

Figure 2. Assessing new tests using evidence of test accuracy, given that treatment is effective for cases detected by the old test.

RCT � randomized, controlled trial. * New test � diagnostic strategies that include the new test; old test � standard diagnostic strategies that do notinclude the new test.

Academia and Clinic Assessment of New Diagnostic Tests

852 6 June 2006 Annals of Internal Medicine Volume 144 • Number 11 www.annals.org

Page 80: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

tient outcomes compared with supine positioning alone(Table, example 3).

At the other extreme, a good match between testedand treated populations will not be possible if a new testmeasures a different biological characteristic to define dis-ease and leads to a different selection of cases for treatment.For example, a new nuclear magnetic resonance spectros-copy technique to measure lipoprotein particle size andconcentration has been proposed as a more accurate testthan plasma lipoprotein cholesterol levels to identify pa-tients who will benefit from lipid-lowering therapy (12).However, the effectiveness of treatment for the extra pa-tients detected by the new rather than the old test has notbeen assessed (Table, example 4).

Often we need to consider situations between these 2extremes, including situations where the evidence is incom-plete. Accuracy studies and treatment trials are usuallydone by different investigators at different times and indifferent settings. Variations in the clinical setting andspectrum of disease may lead to genuine differences (het-erogeneity) in test accuracy and treatment effect (13, 14),

limiting clinicians’ ability to link evidence between studiesand draw conclusions about the value of a new test. Con-sider magnetic resonance angiography versus conventionalarteriography for detecting lower-limb arterial disease. Al-though magnetic resonance angiography is a more sensitivetest for patients with claudication, the effectiveness of sur-gical and percutaneous interventions has been well estab-lished only in patients with critical limb ischemia (15).Whether this evidence can be linked to infer that magneticresonance angiography will improve patient outcomes de-pends on whether clinicians can assume that test accuracyand treatment response are similar for patients with differ-ent disease severity.

A new test may also produce a shift in the spectrum ofdisease detected. For example, breast magnetic resonanceimaging in addition to mammography is more sensitivethan mammography alone for the early detection of inva-sive breast cancer in young women at high risk. However,the extra cases detected may represent a different spectrumof disease that does not show improved treatment response.Therefore, the value of breast magnetic resonance imaging

Table. New Diagnostic Test Assessment Framework and Examples*

Test and Indication ProposedBenefits of NewDiagnostic Test

Does the Evidence of EffectiveTreatment Apply to the CasesDetected by the New Test?

Can Studies of Test Accuracy Suffice?

Example 1: Dopplerultrasonography vs.venography for detectionof deep venous thrombosis

Safer, less costly Yes; no change to the definition orspectrum of disease.

Yes; the value of the new test may beinferred from an assessment of itsrelative safety and cost.

Example 2: new vs. standardFOBT for early detectionof colorectal cancer

More specific Yes; no change to the definition orspectrum of disease.

Yes; the value of the new test may beinferred from an assessment of itsrelative safety and cost and thebenefits of avoiding a false-positiveresult.

Example 3: supine and pronepositioning for CTcolonography vs. supinepositioning alone fordetection of adenomatouscolorectal polyps

More sensitive Yes; no change to the definition orspectrum of disease because casesdetected by both tests aresubsequently confirmed bycolonoscopy.

Yes; the value of the new test may beinferred from an assessment of itsrelative sensitivity, given nosubstantial loss in specificity andexisting trial evidence that treatmentimproves patient outcomes.

Example 4: NMRspectroscopy vs. plasmalipoprotein levels for thedetection of hyperlipidemia

More sensitive Unknown; it is unknown whetherNMR detects patients who wouldbenefit more or less fromlipid-lowering agents than thosewhose disorder was detected usingthe existing test.

No; trial evidence is required todetermine the value of the new test.

Example 5: MRI vs.mammography for earlierdetection of invasivebreast cancer in women athigh risk for the disease

More sensitive No; it is uncertain if any benefitsof treatment at the earlier stage ofdisease outweigh the harm ofoverdetection of cancer that wouldnever have presented clinically.

No; trial evidence is required todetermine the impact of testing onpatient outcomes or, at least, theinterval breast cancer rate.

Example 6: PET, MRI, andEEG vs. MRI and EEG todetect an epileptogenicfocus in patients withmedically refractoryepilepsy who are beingconsidered for surgery

More sensitive Uncertain; it is uncertain whetherpatients with functional lesionsdetected by PET for whom existingstandard imaging yields negativeor inconclusive results will showthe same treatment response tosurgery as patients with structurallesions that can be detected withstandard imaging alone.

Judgment is needed about whetherclinicians require randomized trials toassess treatment response in the extracases detected by PET or whetherthey can rely on existing trialsconducted in patients with diseasedetected by MRI and EEG andobservational evidence abouttreatment response in cases detectedby PET.

* CT � computed tomography; EEG � electroencephalography; FOBT � fecal occult blood test; MRI � magnetic resonance imaging; NMR � nuclear medicineresonance; PET � positron emission tomography.

Academia and ClinicAssessment of New Diagnostic Tests

www.annals.org 6 June 2006 Annals of Internal Medicine Volume 144 • Number 11 853

Page 81: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

is uncertain without trials comparing patient outcomes af-ter early versus standard detection (16) (Table, example 5).

When the possibility of a spectrum effect and its im-pact on treatment efficacy are considered, it is useful todetermine whether a new test performs consistently in dif-ferent populations and whether treatment of the detectedcondition is effective across different patient subgroups.Previous studies that identify subgroups in which the test isless accurate or the treatment less effective may be availableand may assist in making judgments about generalizabilitybetween the cases detected by a new test and patients in-cluded in treatment trials. In many cases, generalizabilitywill depend on expert opinion. Sometimes, differences inthe spectrum of disease detected may be obvious becausethe new test detects disease of a different size, stage, grade,or severity. In other cases, this judgment will be less clear.

Consider the use of positron emission tomography asan incremental test to detect an operable epileptogenic fo-cus in patients with medically refractory epilepsy who haveno structural lesion on magnetic resonance imaging. Thereis trial evidence that seizure control improves after surgeryfor patients with structural lesions (17). However, the extracases detected by positron emission tomography may rep-resent a different spectrum of disease. Thus, clinicians’ de-cisions whether evidence of test accuracy will suffice willdepend on whether it is plausible that the response to treat-ment is similar in patients with structural lesions (thetreated population in this example) and those with func-tional lesions detected by positron emission tomography(positive results on the new test in the tested population)(Table, example 6). The willingness of clinicians to acceptassumptions about generalizability depends on the esti-mated costs of a new test and the severity of the conse-quences if the assumptions are proved false. If the costs orconsequences are substantial, clinicians should wait for di-rect evidence from trials that include the extra cases de-tected by the new test.

OTHER CONSIDERATIONS

The examples discussed here do not cover all possiblescenarios. For example, it is possible for a new test to haveoverall sensitivity and specificity that are similar to those ofan old test but to detect different cases of disease if itidentifies patients at a different disease spectrum. However,we think our approach provides a reasonable and efficientframework to deal with most situations. It helps cliniciansto quickly identify more complex situations where a newtest offers a tradeoff between positive and negative at-tributes (for example, if a new test is less invasive but alsoless specific than the existing test). Here, the benefits ofbetter safety need to be assessed against the harms arisingfrom additional false-positive results. Such tradeoffs can beassessed by using a decision analytic model to assess thebenefits and harms of new and old tests, including the ratesand the consequences of true-positive, false-positive, true-

negative, and false-negative results, as well as test compli-cations. Decision analysis also allows clinicians to comparethe effects of testing in populations with a different prev-alence of disease. If a new test offers better sensitivity buthas other negative attributes, decision analysis may not beappropriate because it relies on assumptions that evidenceof test accuracy can be linked to evidence of treatmentefficacy. If this linkage is uncertain, clinicians will need tocall for new randomized trials. In these situations, trialsinvestigating the effect of treatment in patients who havepositive results on the new test and negative results on theold test may be more efficient and more clinically relevantthan trials conducted on all patients in whom the new testyields positive results (18).

CONCLUSION

Whenever clinicians use a new diagnostic test becauseit is more sensitive than an old test, they need to be clearabout the assumptions linking this evidence to improvedpatient outcomes, such as evidence that the new test de-tects the same spectrum of disease as the old test or hassimilar treatment efficacy across the spectrum of disease.The selection of effective tests is just as essential as theselection of effective treatments; thus, whenever clinicians’assumptions about a new diagnostic test are in doubt, theyshould wait for evidence from randomized trials.

From The University of Sydney, Sydney, Australia.

Grant Support: The authors have received funding from the AustralianMedical Services Advisory Committee for the development of guidelinesfor the assessment of diagnostic technologies and National Health andMedical Research Council Program Grants (253602, 402764)

Potential Financial Conflicts of Interest: None disclosed.

Requests for Single Reprints: Sarah J. Lord, MBBS, MS, NationalHealth and Medical Research Council Clinical Trials Centre, Universityof Sydney, 88 Mallett Street, Camperdown NSW, 2050 Australia; e-mail, [email protected].

Current author addresses are available at www.annals.org.

References1. Fryback DG, Thornbury JR. The efficacy of diagnostic imaging. Med DecisMaking. 1991;11:88-94. [PMID: 1907710]2. Guyatt GH, Tugwell PX, Feeny DH, Haynes RB, Drummond M. A frame-work for clinical evaluation of diagnostic technologies. CMAJ. 1986;134:587-94.[PMID: 3512062]3. Barratt A, Irwig L, Glasziou P, Cumming RG, Raffle A, Hicks N, et al.Users’ guides to the medical literature: XVII. How to use guidelines and recom-mendations about screening. Evidence-Based Medicine Working Group. JAMA.1999;281:2029-34. [PMID: 10359392]4. Busse R, Orvain J, Velasco M, Perleth M, Drummond M, Gurtner F, et al.Best practice in undertaking and reporting health technology assessments. Work-ing group 4 report. Int J Technol Assess Health Care. 2002;18:361-422. [PMID:12053427]5. Harris RP, Helfand M, Woolf SH, Lohr KN, Mulrow CD, Teutsch SM, etal. Current methods of the US Preventive Services Task Force: a review of the

Academia and Clinic Assessment of New Diagnostic Tests

854 6 June 2006 Annals of Internal Medicine Volume 144 • Number 11 www.annals.org

Page 82: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

process. Am J Prev Med. 2001;20:21-35. [PMID: 11306229]6. Biesheuvel CJ, Grobbee DE, Moons KG. Distraction from randomization indiagnostic research. Ann Epidemiol. 2006. [PMID: 16386925]7. Dauzat M, Laroche JP, Deklunder G, Ayoub J, Quere I, Lopez FM, et al.Diagnosis of acute lower limb deep venous thrombosis with ultrasound: trendsand controversies. J Clin Ultrasound. 1997;25:343-58. [PMID: 9282799]8. Gould MK, Dembitzer AD, Doyle RL, Hastie TJ, Garber AM. Low-molec-ular-weight heparins compared with unfractionated heparin for treatment ofacute deep venous thrombosis. A meta-analysis of randomized, controlled trials.Ann Intern Med. 1999;130:800-9. [PMID: 10366369]9. Ouyang DL, Chen JJ, Getzenberg RH, Schoen RE. Noninvasive testing forcolorectal cancer: a review. Am J Gastroenterol. 2005;100:1393-403. [PMID:15929776]10. Tamoxifen for early breast cancer: an overview of the randomised trials. EarlyBreast Cancer Trialists’ Collaborative Group. Lancet. 1998;351:1451-67.[PMID: 9605801]11. Fletcher JG, Johnson CD, Welch TJ, MacCarty RL, Ahlquist DA, Reed JE,et al. Optimization of CT colonography technique: prospective trial in 180 pa-tients. Radiology. 2000;216:704-11. [PMID: 10966698]

12. Otvos JD, Jeyarajah EJ, Cromwell WC. Measurement issues related to li-poprotein.13. Bailey KR. Generalizing the results of randomized clinical trials. Control ClinTrials. 1994;15:15-23. [PMID: 8149769]14. Irwig L, Bossuyt P, Glasziou P, Gatsonis C, Lijmer J. Designing studies toensure that estimates of test accuracy are transferable. BMJ. 2002;324:669-71.[PMID: 11895830]15. Koelemay MJ, Lijmer JG, Stoker J, Legemate DA, Bossuyt PM. Magneticresonance angiography for the evaluation of lower extremity arterial disease: ameta-analysis. JAMA. 2001;285:1338-45. [PMID: 11255390]16. Irwig L, Houssami N, Armstrong B, Glasziou P. Evaluating new screeningtests for breast cancer [Editorial]. BMJ. 2006;332:678-9. [PMID: 16565097]17. Wiebe S, Blume WT, Girvin JP, Eliasziw M. A randomized, controlled trialof surgery for temporal-lobe epilepsy. N Engl J Med. 2001;345:311-8. [PMID:11484687]18. Bossuyt PM, Lijmer JG, Mol BW. Randomised comparisons of medicaltests: sometimes invalid, not always efficient. Lancet. 2000;356:1844-7. [PMID:11117930]

SERVICES FOR PDA USERS

See the “PDA Services” heading at the top of the Annals home page atwww.annals.org to learn more about the following services available toPDA users.

Annals Reader for Palm PDAEditors’ Notes, abstracts, and full texts. Images and references

are not available.Reminders from PDA to the desktop computer about articles. The

reminders include a link to the full text of the article atwww.annals.org.

Storage of multiple issues of Annals. You can control how manyissues to store on your PDA or have the reader automaticallydelete back issues.

Issues may be stored on external memory. Back issues are availablestarting with July 2003.

Storage of favorite articles (separate from storage of back issues).Storage of collections of previously published articles from Annals.

Wireless access to the current issue for an Internet-enabled PDA

Academia and ClinicAssessment of New Diagnostic Tests

www.annals.org 6 June 2006 Annals of Internal Medicine Volume 144 • Number 11 855

Page 83: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

Current Author Addresses: Drs. Lord and Simes: National Health andMedical Research Council Clinical Trials Centre, University of Sydney,88 Mallett Street, Camperdown NSW, 2050 Australia.

Dr. Irwig: School of Public Health, A27, Edward Ford Building, Uni-versity of Sydney, NSW 2006 Australia.

Annals of Internal Medicine

W-208 6 June 2006 Annals of Internal Medicine Volume 144 • Number 11 www.annals.org

Page 84: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

DO

I:10

.150

3/cm

aj.0

9052

3

CMAJ Analysis

CMAJ • MAY 12, 2009 • 180(10)© 2009 Canadian Medical Association or its licensors

E47

Randomized trials have traditionally been broadlycategorized as either an effectiveness trial or an effi-cacy trial, although we prefer the terms “pragmatic”

and “explanatory.” Schwartz and Lellouch described these 2approaches toward clinical trials in 1967.1 These authorscoined the term “pragmatic” to describe trials that help userschoose between options for care, and “explanatory” to de-scribe trials that test causal research hypotheses (i.e., that agiven intervention causes a particular benefit).

We take the view that, in general, pragmatic trials are pri-marily designed to determine the effects of an intervention un-der the usual conditions in which it will be applied, whereasexplanatory trials are primarily designed to determine the ef-fects of an intervention under ideal circumstances.2 Thus, theseterms refer to a trial’s purpose and, in turn, structure. The de-gree to which this purpose is met depends on decisions abouthow the trial is designed and, ultimately, conducted.

Very few trials are purely pragmatic or explanatory. Forexample, in an otherwise explanatory trial, there may be someaspect of the intervention that is beyond the investigator’s con-trol. Similarly, the act of conducting an otherwise pragmatictrial may impose some control resulting in the setting beingnot quite usual. For example, the very act of collecting data re-quired for a trial that would not otherwise be collected in usualpractice could be a sufficient trigger to modify participant be-haviour in unanticipated ways. Further, several aspects of atrial are relevant, relating to choices of trial participants, healthcare practitioners, interventions, adherence to protocol andanalysis. Thus, we are left with a multidimensional continuumrather than a dichotomy, and a particular trial may displayvarying levels of pragmatism across these dimensions.

In this article, we describe an effort to develop a tool to as-sess and display the position of any given trial within thepragmatic–explanatory continuum. The primary aim of thistool is to help trialists assess the degree to which design deci-sions align with the trial’s stated purpose (decision-making v.explanation). Our tool differs, therefore, from that of Gart-lehner and associates3 in that it is intended to inform trial de-sign rather than provide a method of classifying trials for thepurpose of systematic reviews. It can, however, also be used

by research funders, ethics committees, trial registers andjournal editors to make the same assessment, provided trial-ists declare their intended purpose and adequately report theirdesign decisions. Hence, reporting of pragmatic trials is ad-dressed elsewhere.4

Ten ways in which pragmatic and explanatory trials can differ

Trialists need to make design decisions in 10 domains that de-termine the extent to which a trial is pragmatic or explana-tory. Explanatory randomized trials that seek to answer thequestion “Can this intervention work under ideal conditions?”address these 10 domains with a view to maximizing what-ever favourable effects an intervention might possess.2

Table 1 illustrates how an explanatory trial, in its most ex-treme form, might approach these 10 domains.

Pragmatic randomized trials that seek to answer the ques-tion “Does this intervention work under usual conditions?”5,6

address these 10 domains in different ways when there areimportant differences between usual and ideal conditions.Table 1 illustrates the most extreme pragmatic response tothese domains.

Kevin E. Thorpe MMath, Merrick Zwarenstein MD MSc, Andrew D. Oxman MD, Shaun Treweek BSc PhD, Curt D. Furberg MD PhD, Douglas G. Altman DSc, Sean Tunis MD MSc,Eduardo Bergel PhD, Ian Harvey MB PhD, David J. Magid MD MPH, Kalipso Chalkidou MD PhD

Published at www.cmaj.ca on Apr. 16, 2009. An abridged version of this article appeared in the May 12 issue of CMAJ. This article waspublished simultaneously in the May 2009 issue of the Journal of Clinical Epidemiology (www.jclinepi.com).

A pragmatic–explanatory continuum indicator summary(PRECIS): a tool to help trial designers

From the Dalla Lana School of Public Health (Thorpe) and the Department ofHealth Policy, Management and Evaluation (Zwarenstein), University ofToronto, Toronto, Ont.; the Centre for Health Services Sciences, SunnybrookResearch Institute, Sunnybrook Health Sciences Centre, and the Institute forClinical Evaluative Sciences (Zwarenstein), Toronto, Ont.; the Preventive andInternational Health Care Unit, Norwegian Knowledge Centre for the HealthServices (Oxman, Treweek), Oslo, Norway; the Division of Clinical and Popu-lation Sciences and Education, University of Dundee (Treweek), Dundee, UK;the Division of Public Health Sciences, Wake Forest University School of Med-icine (Furberg), Winston-Salem, USA; the Centre for Statistics in Medicine,University of Oxford (Altman), Oxford, UK; the Center for Medical Technol-ogy Policy (Tunis), Baltimore, USA; the UNDP/UNFPA/WHO/World Bank Spe-cial Programme of Research, Development and Research Training in HumanReproduction, Department of Reproductive Health and Research, WorldHealth Organization (Bergel), Geneva, Switzerland; the Faculty of Health,University of East Anglia (Harvey), Norwich, UK; the Institute for Health Re-search, Kaiser Permanente Colorado, and the Departments of PreventiveMedicine and Biometrics and of Emergency Medicine, University of ColoradoHealth Sciences Center (Magid), Denver, USA; and the National Institute forHealth and Clinical Excellence (Chalkidou), London, UK

@@ See related commentaries by Zwarenstein and Treweek, page 998, and by Maclure, page 1001

Page 85: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

Analysis

CMAJ • MAY 12, 2009 • 180(10)E48

Table 1: PRECIS domains illustrating the extremes of explanatory and pragmatic approaches to each domain

Domain Pragmatic trial Explanatory trial

Participants

Participant eligibility criteria

All participants who have the condition of interest are enrolled, regardless of their anticipated risk, responsiveness, comorbidities or past compliance.

Stepwise selection criteria are applied that (a) restrict study individuals to those previously shown to be at highest risk of unfavourable outcomes, (b) further restrict these high-risk individuals to those who are thought likely to be highly responsive to the experimental intervention and (c) include just those high-risk, highly responsive study individuals who demonstrate high compliance with pretrial appointment-keeping and mock intervention.

Interventions and expertise

Experimental intervention —flexibility

Instructions on how to apply the experimental intervention are highly flexible, offering practitioners considerable leeway in deciding how to formulate and apply it.

Inflexible experimental intervention, with strict instructions for every element.

Experimental intervention —practitioner expertise

The experimental intervention typically is applied by the full range of practitioners and in the full range of clinical settings, regardless of their expertise, with only ordinary attention to dose setting and side effects.

The experimental intervention is applied only by seasoned practitioners previously documented to have applied that intervention with high rates of success and low rates of complications, and in practice settings where the care delivery system and providers are highly experienced in managing the types of patients enrolled in the trial. The intervention often is closely monitored so that its “dose” can be optimized and its side effects treated; co-interventions against other disorders often are applied.

Comparison intervention — flexibility

“Usual practice” or the best alternative management strategy available, offering practitioners considerable leeway in deciding how to apply it.

Restricted flexibility of the comparison intervention; may use a placebo rather than the best alternative management strategy as the comparator.

Comparison intervention —practitioner expertise

The comparison intervention typically is applied by the full range of practitioners and in the full range of clinical settings, regardless of their expertise, with only ordinary attention to their training, experience and performance.

Practitioner expertise in applying the comparison intervention(s) is standardized to maximize the chances of detecting whatever comparative benefits the experimental intervention might have.

Follow-up and outcomes

Follow-up intensity No formal follow-up visits of study individuals. Instead, administrative databases (e.g., mortality registries) are searched for the detection of outcomes.

Study individuals are followed with many more frequent visits and more extensive data collection than would occur in routine practice, regardless of whether patients experienced any events.

Primary trial outcome

The primary outcome is an objectively measured, clinically meaningful outcome to the study participants. The outcome does not rely on central adjudication and is one that can be assessed under usual conditions (e.g., special tests or training are not required).

The outcome is known to be a direct and immediate consequence of the intervention. The outcome is often clinically meaningful but may sometimes (e.g., early dose-finding trials) be a surrogate marker of another downstream outcome of interest. It may also require specialized training or testing not normally used to determine outcome status or central adjudication.

Compliance/adherence

Participant compliance with “prescribed” intervention

There is unobtrusive (or no) measurement of participant compliance. No special strategies to maintain or improve compliance are used.

Study participants’ compliance with the intervention is monitored closely and may be a prerequisite for study entry. Both prophylactic strategies (to maintain) and “rescue” strategies (to regain) high compliance are used.

Practitioner adherence to study protocol

There is unobtrusive (or no) measurement of practitioner adherence. No special strategies to maintain or improve adherence are used.

There is close monitoring of how well the participating clinicians and centres are adhering to even the minute details in the trial protocol and “manual of procedures.”

Analysis

Analysis of primary outcome

The analysis includes all patients regardless of compliance, eligibility, and others (intention-to-treat analysis). In other words, the analysis attempts to see if the treatment works under the usual conditions, with all the noise inherent therein.

An intention-to-treat analysis is usually performed. However, this may be supplemented by a per-protocol analysis or an analysis restricted to “compliers” or other subgroups in order to estimate maximum achievable treatment effect. Analyses are conducted that attempt to answer the narrowest, “mechanistic” question (whether biological, educational or organizational).

Note: PRECIS = pragmatic–explanatory continuum indicator summary.

Page 86: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

The design choices for a trial intended to inform a researchdecision about the benefit of a new drug are likely to be moreexplanatory (reflecting ideal conditions). Those for a latertrial of the same drug intended to inform practical decisionsby clinicians or policy-makers are likely to be more pragmatic(reflecting usual conditions). When planning their trial, trial-ists should consider whether a trial’s design matches theneeds of those who will use the results. A tool to locate trialdesign choices within the pragmatic–explanatory continuumcould facilitate these design decisions, help to ensure that thechoices that are made reflect the intended purpose of the trial,and help others to appraise the extent to which a trial is appro-priately designed for its intended purpose.

Such a tool could, for example, expose potential inconsis-tencies, such as the use of intensive adherence monitoring andintervention (explanatory tactics) in a trial being designed toanswer a more pragmatic question. Alternatively, a trial mightinclude a wide range of participants and meaningfully assessthe impact (pragmatic tactics) but evaluate an intervention thatis enforced or tightly monitored (explanatory tactics) and thusnot widely feasible. By supporting the identification of poten-tial inconsistencies such as these, a pragmatic–explanatory in-dicator could improve the extent to which trial designs are fitfor purpose by highlighting design choices that do not supportthe needs of the intended users of the trial’s results. In this arti-cle we introduce such a tool.

The pragmatic–explanatory distinction comprises a contin-uous spectrum, not an either/or dichotomy of the extremes, asillustrated in Table 1. Moreover, it is probably impossibleever to perform a “purely” explanatory or “purely” pragmatictrial. For example, no patients are perpetually compliant, andthe hand of the most skilled surgeon occasionally slips, sothere can never be a “pure” explanatory trial. Similarly, a“pure” pragmatic trial loses its purity as soon as its first eligi-ble patient refuses to be randomized.

Development of the PRECIS tool

The proposal for the pragmatic–explanatory continuum indi-cator summary (PRECIS) was developed by an internationalgroup of interested trialists at 2 meetings in Toronto (2005and 2008) and in the time between. The initiative grew fromthe Pragmatic Randomized Controlled Trials in HealthCare(Practihc) project (www.practihc.org), an initiative funded byCanada and the European Union to promote pragmatic trialsin low- and middle-income countries.

The development of the PRECIS indicator began with theidentification of key domains that distinguish pragmatic fromexplanatory trials. As illustrated in Table 1, they comprise:• The eligibility criteria for trial participants.• The flexibility with which the experimental intervention is

applied.• The degree of practitioner expertise in applying and moni-

toring the experimental intervention.• The flexibility with which the comparison intervention is

applied.• The degree of practitioner expertise in applying and moni-

toring the comparison intervention.

• The intensity of follow-up of trial participants.• The nature of the trial’s primary outcome.• The intensity of measuring participants’ compliance with

the prescribed intervention, and whether compliance-improving strategies are used.

• The intensity of measuring practitioners’ adherence to thestudy protocol, and whether adherence-improving strate-gies are used.

• The specification and scope of the analysis of the primaryoutcome.During the 2005 meeting, 8 domains emerged during a

brainstorming session. Furthermore, 5 mutually exclusive defi-nitions were used to assign the level of pragmatism in each do-main. Attempts to use the initial tool on a number of publishedtrials revealed some difficulties. The mutually exclusive cate-gories were technically difficult to understand and use, and insome cases contradictory among domains. The current ap-proach, for the most part, is to consider a number of designtactics or restrictions consistent with an explanatory trial ineach domain. The more tactics that are present, the more ex-planatory is the trial. However, these design tactics and restric-tions (see “The domains in detail” section for some examples)are not equally important, so it is not a simple matter of addingup tactics. Where exactly to place a trial on the pragmatic–explanatory continuum is, therefore, a judgment best made bytrialists discussing these issues at the design stage of their trialand reaching consensus. Initially, the domains for interventionflexibility and practitioner expertise addressed both the experi-mental and comparison interventions. Discussions at the 2008meeting led to the separation of experimental and comparisoninterventions into their own domains and the replacement of adomain regarding trial duration with the domain related to thenature of the primary outcome.

At this point, a brief explanation of our use of some termi-nology may be helpful. In this paper, we view a trial partici-pant as the recipient of the intervention. In many trials, theparticipants are patients. However, in a trial of a continuingeducation intervention, for example, the participants may bephysicians. By practitioner we mean the person deliveringthe intervention. Again, for many trials the practitioners arephysicians. For a continuing education intervention, the prac-titioners may be trained instructors.

We defined the purpose of a pragmatic trial as answeringthe question “Does an intervention work under usual condi-tions?,” where we take “usual conditions” to mean the same as,or very similar to, the usual-care setting. Characterizing thepragmatic extreme of each domain is less straight forward,since what is considered “usual care” may depend on context.For some interventions, what is usual for each domain mayvary across different settings. For example, the responsivenessand compliance of patients, adherence of practitioners to guide-lines, and the training and experience of practitioners may bedifferent in different settings. Thus, characterizing the prag-matic extreme requires specifying the settings for which a trialis intended to provide an answer. Occasionally a pragmatic trialaddresses a question in a single specific setting. For example, arandomized trial of interventions to improve the use of activesick leave was designed to answer a pragmatic question under

Analysis

CMAJ • MAY 12, 2009 • 180(10) E49

Page 87: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

usual conditions specific to the Norwegian context, where ac-tive sick leave was being promoted as a public sickness benefitscheme offered to promote early return to modified work fortemporarily disabled workers.7 More often pragmatic trials willaddress questions across specific types of settings or across awide range of settings. Examples of specific types of settingsinclude settings where chloroquine-resistant falciparum malariais endemic, where hospital facilities are in close proximity, orwhere trained specialists are available.

Conversely, we defined the purpose of an explanatory trialas answering the question “Can an intervention work underideal conditions?” Given this definition, characterizing theexplanatory extreme of each domain is relatively straight-forward and intuitive. It simply requires considering the de-sign decisions one would make in order to maximize thechances of success. Thus, for example, one would select pa-tients that are most likely to comply and respond to the inter-vention, ensure that the intervention is delivered in a way thatoptimizes its potential for beneficial effects, and ensure that itis delivered by well-trained and experienced practitioners.

Thus, we recommend that trialists or others assessingwhether design decisions are fit for purpose do this in 4 steps:1. Declare whether the purpose of the trial is pragmatic or

explanatory.2. Specify the settings or conditions for which the trial is in-

tended to be applicable.3. Specify the design options at the pragmatic and explana-

tory extremes of each domain.4. Decide how pragmatic or explanatory a trial is in relation

to those extremes for each domain.For some trials, there may not be any important difference

between the pragmatic and explanatory extremes for some di-mensions. For example, delivering an intervention, such asacetylsalicylic acid (ASA) therapy to someone with an acute

myocardial infarction, does not require practitioner expertise.As mentioned earlier, for domains where the extremes areclear, it should not be difficult to decide whether a design de-cision is at one extreme or the other. For design decisions thatare somewhere in between the extremes, it can be more chal-lenging to determine how pragmatic or explanatory a trial willbe. For this reason we recommend that all the members of thetrial design team rate each domain and compare.

To facilitate steps 3 and 4, we have identified a number ofdesign tactics that either add restrictions typical of explanatorytrials or remove restrictions in the fashion of pragmatism. Thetactics that we describe here are not intended to be prescrip-tive, exhaustive or even ordered in a particular way, but ratherillustrative. They are to aid trialists or others in assessingwhere, within the pragmatic–explanatory continuum, a domainis, allowing them to put a “tick” on a line representing the con-tinuum. To display the “results” of this assessment, the linesfor each domain are arranged like spokes of a wheel, with theexplanatory pole near the hub and the pragmatic pole on therim (Figure 1). The display is completed by joining the loca-tions of all 10 indicators as we progress around the wheel.

The proposed scales seem to make sense intuitively andcan be used without special training. Although we recognizethat alternative graphical displays are possible, we feel thatthe proposed wheel plot is an appealing summary and is in-formative in at least 3 ways.

First, it depicts whether a trial is tending to take a broadview (as in a pragmatic trial asking whether an interventiondoes work, under usual conditions) or tending to be narrowly“focused” near the hub (as for an explanatory trial askingwhether an intervention can work, under ideal conditions).

Second, the wheel plot highlights inconsistencies in howthe 10 domains will be managed in a trial. For example, if atrial is to admit all patients and practitioners (extremely prag-matic) yet will intensely monitor compliance and intervenewhen it falters (extremely explanatory), a single glance at thewheel will immediately identify this inconsistency. This al-lows the researcher to make adjustments in the design, if pos-sible and appropriate, to obtain greater consistency with theirobjective in conducting the trial.

Third, the wheel plot can help trialists better report anylimitations in interpretation or generalization resulting fromdesign inconsistencies. This could help users of the trial re-sults make better decisions.

The domains in detail

Participant eligibility criteriaThe most extremely pragmatic approach to eligibility wouldseek only to identify study participants with the condition ofinterest from as many sources (e.g., institutions) as possible.As one moves toward a more explanatory attitude, additionalrestrictions will be placed on the study population. These re-strictions include the following:• Excluding participants not known or shown to be highly

compliant to the interventions under study.• Excluding participants not known or shown to be at high

risk for the primary trial outcome.

Analysis

CMAJ • MAY 12, 2009 • 180(10)E50

Flexibility of the comparison intervention

Practitioner expertise (experimental)

Flexibility of the experimental intervention

Eligibility criteria

Primary analysis

Practitioner adherence

Participant compliance

Outcomes

Follow-up intensity

Practitioner expertise (comparison)

E

Figure 1: The blank “wheel” of the pragmatic–explanatorycontinuum indicator summary (PRECIS) tool. “E” represents the“explanatory” end of the pragmatic–explanatory continuum.

Page 88: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

• Excluding participants not expected to be highly respon-sive to the experimental intervention.

• Using a small number of sources (or even 1) for participants.The first 3 restrictions noted above are typically achieved

by applying various exclusion criteria to filter out participantsthought least likely to respond to the intervention. So, ex-planatory trials tend to have more exclusion criteria thanpragmatic trials. Exclusion criteria for known safety issueswould not necessarily count against a pragmatic trial, sincesuch individuals would not be expected to get the interventionunder usual practice.

Flexibility of experimental interventionThe pragmatic approach leaves the details of how to implementthe experimental intervention up to the practitioners. For exam-ple, the details of how to perform a surgical procedure are leftentirely to the surgeon. How to deliver an educational programis left to the discretion of the educator. In addition, the prag-matic approach would not dictate which co-interventions werepermitted or how to deliver them. Several restrictions on the in-tervention’s flexibility are possible:• Specific direction could be given for administering the in-

tervention (e.g., dose, dosing schedule, surgical tactics, ed-ucational material and delivery).

• Timing of the delivery of the intervention could be de-signed to maximize the intervention effect.

• The number and permitted types of co-interventions couldbe restricted, particularly if excluded co-interventionswould dilute any intervention effect.

• Specific direction could be given for applying permittedco-interventions.

• Specific direction could be given for managing complica-tions or side effects from the primary intervention.

Experimental intervention — practitioner expertiseA pragmatic approach would put the experimental interven-tion into the hands of all practitioners treating (educating, andothers) the study participants. The choice of practitioner canbe restricted in a number of ways:• Practitioners could be required to have some experience,

defined by length of time, in working with the participantslike the ones to be enrolled in the trial.

• Some specialty certification appropriate to the interventioncould be required.

• For an intervention that has been in use (e.g., surgery)without a trial evaluation, experience with the interventionitself could be required.

• Only practitioners who are deemed to have sufficient ex-perience in the subjective opinion of the trial investigatorwould be invited to participate.

Flexibility of the comparison interventionSpecification of the flexibility of the comparison interventioncomplements that of the flexibility of the experimental inter-vention. A pragmatic trial would typically compare an inter-vention to “usual practice” or the best alternative manage-ment strategy available, whereas an explanatory trial wouldrestrict the flexibility of the comparison intervention and

might, in the case of early-phase drug development trials, usea placebo rather than the best alternative management strat-egy as the comparator.

Comparison intervention — practitioner expertiseSimilar comments apply as for the specification of the flexi-bility of the comparison intervention. In both cases, the ex-planatory extreme would maximize the chances of detectingwhatever benefits an intervention might have, whereas thepragmatic extreme would aim to find out the benefits andharms of the intervention in comparison with usual practice inthe settings of interest.

Follow-up intensityThe pragmatic position would be not to seek follow-up con-tact with the study participants in excess of the usual practicefor the practitioner. The most extreme position is to have nocontact with study participants and instead obtain outcomedata by other means (e.g., administrative databases to deter-mine mortality). Various adjustments to follow-up intensityare possible. The extent to which these adjustments couldlead to increased compliance or improved intervention re-sponse will determine whether follow-up intensity moves to-ward the explanatory end.• Follow-up visits (timing and frequency) are prespecified in

the protocol.• Follow-up visits are more frequent than typically would

occur outside the trial (i.e., under usual care).• Unscheduled follow-up visits are triggered by a primary

outcome event.• Unscheduled follow-up visits are triggered by an interven-

ing event that is likely to lead to the primary outcomeevent.

• Participants are contacted if they fail to keep trial ap-pointments.

• More extensive data are collected, particularly intervention-related data, than would be typical outside the trial.Often the required trial outcomes may be obtained only

through contact with the participants. Even in the “no follow-up” approach, assessment of outcomes may be achieved with asingle “end of study” follow-up. The end of study would needto be defined so that there is sufficient time for the desiredstudy outcomes (see “Primary trial outcome” section) to beobserved. When the follow-up is done in this way, it is un-likely to have an impact on compliance or responsiveness.However, there may often be considerable tension betweenunobtrusive follow-up and the ability to collect the necessaryoutcomes. Often, although not always, explanatory trials areinterested in the effect of an intervention during the interven-tion period, or shortly afterward. On the other hand, pragmatictrials may follow patients well beyond the intervention periodin their quest to answer the “does this work?” question. Suchlonger term follow-up may well require more patient contactthan usual care. However, it is not necessarily inconsistentwith a pragmatic approach if it does not result in patient man-agement that differs from the usual conditions, which may inturn increase the chance of detecting an intervention effect be-yond what would be expected under usual conditions.

Analysis

CMAJ • MAY 12, 2009 • 180(10) E51

Page 89: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

Analysis

CMAJ • MAY 12, 2009 • 180(10)E52

Table 2: A PRECIS assessment of 4 trials (part 1)

Domain; trial Assessment of domain

Participant eligibility criteria

DOT8 The trial admitted all comers receiving care for newly diagnosed tuberculosis at 2 clinics. This was extremely pragmatic, but since only 2 clinics were studied and the setting of interest is (at a minimum) all of South Africa, it is not at the extreme edge.

NASCET9 NASCET enrolment was restricted to symptomatic patients stratified for carotid stenosis severity, with primary interest in a group with severe carotid stenosis (high risk) who were thought to be most likely to respond to endarterectomy, if it was efficacious. There was no prior compliance testing (other than a willingness to undergo angiography and several less invasive diagnostic tests). Exclusions included mental incompetence, another illness likely to cause death within 5 years, prior total stroke in the affected territory, a cardiac valvular or rhythm disorder (e.g., atrial fibrillation) likely to lead to embolic stroke, or prior carotid surgery on the affected artery. Patients also were temporarily ineligible if they had any of 7 transient medical conditions (e.g., uncontrolled hypertension or diabetes, recent major surgery, unstable coronary artery disease). Thus, eligibility was very near the extreme explanatory end of the scale.

CLASP10 This trial had broad inclusion criteria (12–32 weeks’ gestation at sufficient risk of pre-eclampsia or intrauterine growth retardation to consider acetylsalicylic acid [ASA] usage), few exclusion criteria and was conducted in a large (213) number of centres. This was extremely pragmatic.

Caritis et al.11 This trial recruited high-risk patients from 13 centres. Before patients were randomized, compliance was evaluated. Only patients with 70% or better compliance were randomized. This was extremely explanatory in character.

Experimental intervention — flexibility

DOT The method of self-administration was left to the individual patient, who could delegate weekly drug-collection visits to a family member. This was extremely pragmatic in character.

NASCET An endarterectomy had to be carried out (rather than stenting or some other operation), but the surgeon was given leeway in how it was performed (e.g., whether to use patches or temporary shunts). Also, simultaneous coronary artery bypass grafting was proscribed. Bilateral carotid endarterectomy could be performed provided the symptomatic side was operated on first. The same co-interventions (best medical care) were specified for both surgical and control patients. This was clearly very explanatory, but could have been more so if the intraoperative procedures had also been specified.

CLASP Patients were instructed to take 1 tablet per day unless their doctor advised otherwise. Compounds containing ASA were recommended against, and a compound for analgesia was recommended. So, some flexibility (doctor’s opinion) was permitted. However, the medication recommendations were such that they would tend to maximize the difference between treatments. Thus, this domain was not completely pragmatic.

Caritis et al. Patients were instructed to take 1 tablet per day unless they were told they had developed pre-eclampsia. They were given a list of medications to avoid and medication for analgesia. Since the criterion for stopping the study drug was specified, this tended to be more explanatory in nature, although it was by no means extreme in that regard.

Experimental intervention — practitioner expertise

DOT All clinic nurses were involved, with no particular specialization or additional training. Patients were self-treating with no special training. Thus, this was an extremely pragmatic approach.

NASCET NASCET surgeons had to be approved by an expert panel and were restricted to those who had performed at least 50 carotid endarterectomies in the last 24 months, with a postoperative complication rate (stroke or death within 30 days) of less than 6%. This was an extremely explanatory approach. All follow-up assessments were carried out by board-certified neurologists or their senior subspecialty trainees (a slightly less explanatory approach).

CLASP Patients remained under the care of their own doctors. This was a pragmatic approach.

Caritis et al. This domain was not explicitly stated in the trial report. However, we can make an educated guess. The patients were under the care of a physician at the participating centre. Since this trial was studying high-risk patients, it is reasonable to assume that the participating centres were chosen because they have a relatively high volume of high-risk cases, which in turn suggests that specialists rather than generalists were involved in patient care. This tended to be a more explanatory approach.

Comparison intervention(s) — flexibility

DOT Clinics already had the intervention (direct observation) in place, and this was not altered; extremely pragmatic.

NASCET In NASCET, antiplatelet therapy (usually 1300 mg of ASA per day) was prescribed. Also, the co-interventions applied to surgical patients were also applied to control patients (antihypertensive therapy with blood pressure targets and feedback, antilipid and antidiabetic therapy) as indicated; an explanatory approach.

CLASP Since both interventions were a simple tablet, this domain was treated the same as for the experimental arm.

Caritis et al. Since both interventions were a simple tablet, this domain was treated the same as for the experimental arm.

Page 90: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

Analysis

CMAJ • MAY 12, 2009 • 180(10) E53

Table 2: A PRECIS assessment of 4 trials (part 2)

Domain; trial Assessment of domain

Comparison intervention(s) — practitioner expertise

DOT All clinic nurses were involved, with no particular specialization or additional training, which was extremely pragmatic.

NASCET Like the surgical patients, the patients in the medical arm were managed and followed by board-certified neurologists or their senior subspecialty trainees.

CLASP Since there was no difference in care provider with respect to treatment, this domain was treated the same as for the experimental arm.

Caritis et al. Since there was no difference in care provider with respect to treatment, this domain was treated the same as for the experimental arm.

Follow-up intensity

DOT No extra clinic visits were scheduled. In fact, in the experimental arm, no visits whatsoever were required, since even the weekly drug collection could be delegated to a family member. This was the most extreme pragmatic approach.

NASCET NASCET patients had prescheduled appointments at 1, 3, 6, 9, 12, 16, 20 and 24 months (and every 4 months thereafter). Each consisted of a medical, neurologic and functional-status assessment. All blood pressure records were reviewed centrally, and elevated readings triggered reminder letters. None of the 659 patients were lost to follow-up. A highly explanatory approach was taken here.

CLASP There was a single scheduled follow-up, which happened after delivery of the infant and any of the primary study outcomes. Infant deaths up to 1 year were also recorded. This was very pragmatic.

Caritis et al. Study follow-ups were scheduled to occur with the standard patient care schedules at each centre. Usually the patients were seen every 4 weeks up to 28 weeks’ gestation, then every 2 weeks up to 36 weeks’ gestation and then weekly thereafter until delivery. Although the visit schedule was no more intense than it would have been at these centres outside the trial, there would have been trial-related data collected that may not normally have been collected, which may have altered patient management from standard care. This was very explanatory but not extreme.

Primary trial outcome

DOT The primary outcome was “successful treatment,” which included all patients who were cured and all patients who completed the treatment. All patients were followed up for a year, until they completed their treatment, died, were classified as “incompletely treated” or were lost to follow-up; very pragmatic.

NASCET The primary outcome was time to ipsilateral stroke, the clinically relevant, explanatory outcome most likely to be affected by carotid endarterectomy. Other outcomes were more pragmatic: all strokes, major strokes and mortality were secondary outcomes.

CLASP The primary outcome of pre-eclampsia was defined in a clinically relevant way that required only investigations common to standard care. Deaths up to 1 year post-delivery were recorded and adjudicated for cause. This was very pragmatic, but not the most extreme position.

Caritis et al. The primary outcome of pre-eclampsia was defined in a clinically relevant way that required only investigations common to standard care. There was blinded adjudication of the primary outcome. There were a number of other short-term outcomes. Although the primary outcome itself was consistent with a pragmatic approach, the adjudication and focus on short-term outcomes moved this some way toward an explanatory approach.

Participant compliance with “prescribed” intervention

DOT Compliance was an element of the outcomes, and so was measured for this purpose, but was not used to improve patient compliance. This was pragmatic, but not at the most extreme end.

NASCET The experimental intervention in NASCET was offering a one-time operation. Because the 50% probability of operation was clearly stated in the original consent documents, patients who did not want surgery were unlikely to enter the trial (only 0.3% of admitted patients randomized to the operation refused it). This was a prophylactic strategy for achieving compliance and was thus an explanatory approach.

CLASP Compliance was asked about at the follow-up visit. Since this was after the completion of treatment, it could in no way affect compliance in the trial. Thus, it was extremely pragmatic.

Caritis et al. Compliance was measured by pill count and direct questioning during follow-up. A research nurse periodically contacted women to “survey and reinforce compliance.” This was an extremely explanatory approach.

Practitioner adherence to study protocol

DOT There were no measurements of protocol adherence, and no adherence-improving strategies were used. This was the most pragmatic approach possible.

NASCET The completeness, timeliness and accuracy of clinical data forms generated at admission, follow-up and for events were monitored centrally. Both at regular intervals, and more frequently when they were deficient, the NASCET principal investigator made a personal visit to that centre. In addition, blood pressure reports from each visit were scrutinized centrally, with letters pestering clinical collaborators when readings were elevated. An extremely explanatory approach was evident here.

CLASP Not specified; assumed not extreme in either direction.

Caritis et al. Not specified; assumed not extreme in either direction.

Page 91: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

Primary trial outcomeFor primary trial outcome, it is more intuitive to begin fromthe explanatory pole and describe the progression to the prag-matic pole. The most explanatory approach would consider aprimary outcome (possibly surrogate, as in dose-finding trialsintended to demonstrate a biological response) that the exper-imental intervention is expected to have a direct effect on.Phase 3 and 4 trials often have patient-important outcomesand thus may be more pragmatic in this domain. There maywell be central adjudication of the outcome, or assessment ofthe outcome may require special training or tests not normallyused to apply outcome definition criteria. Two obvious relax-ations of the strict outcome assessment present in explanatorytrials are the absence of central outcome adjudication and thereliance on usual training and measurement to determine theoutcome status. For some interventions, the issue may bewhether to measure outcomes only during the intervention pe-riod or up to a “reasonable” time after the intervention iscomplete. For example, stroke could be a primary outcomefor explanatory and pragmatic trials. However, time horizonsmay vary from short term following a one-time intervention(more explanatory) to long term (more pragmatic).

Participant compliance with “prescribed” interventionThe pragmatic approach recognizes that noncompliance withany intervention is a reality in routine medical practice. Becausemeasurement of compliance may possibly alter subsequentcompliance, the pragmatic approach in a trial would be not tomeasure or use compliance information in any way. The morerigorous a trial is in measuring and responding to noncompli-ance of the study participants, the more explanatory it becomes:• Compliance is measured (indirectly) purely for descriptive

purposes at the conclusion of the trial.• Compliance data are measured and fed back to providers

or participants during follow-up.• Uniform compliance-improving strategies are applied to

all participants.

• Compliance-improving strategies are applied to partici-pants with documented poor compliance.For some trials, the goal of an intervention may be to im-

prove compliance with a treatment guideline. Provided the com-pliance measurement is not used, directly or indirectly, to influ-ence subsequent compliance, a trial could still be “verypragmatic” in this domain. On the other hand, if measuringcompliance is part of the intervention (e.g., audit and feedback),this domain would, appropriately, move toward a more explana-tory approach if audit and feedback could not be similarly ap-plied as part of the intervention under usual circumstances.

Practitioner adherence to study protocolThe pragmatic approach takes account of the fact thatproviders will vary in how they implement an intervention. Apurely pragmatic approach, therefore, would not be con-cerned with how practitioners vary or “customize” a trial pro-tocol to suit their setting. By monitoring and (especially) act-ing on protocol nonadherence, a trial shifts toward beingmore explanatory:• Adherence is measured (indirectly) purely for descriptive

purposes at the conclusion of the trial.• Adherence data are measured and fed back to practitioners.• Uniform adherence-improving strategies are applied to all

practitioners.• Adherence-improving strategies are applied to practition-

ers with documented poor adherence.

Analysis of the primary outcomeRecall that the pragmatic trial is concerned with the question“Does the intervention work under usual conditions?” Assum-ing other aspects of a trial have been treated in a pragmaticfashion, an analysis that makes no special allowance for non-compliance, nonadherence or practice variability, for example,is most appropriate for this question. So, the pragmatic ap-proach to the primary analysis would typically be an intention-to-treat analysis of an outcome of direct relevance to the study

Analysis

CMAJ • MAY 12, 2009 • 180(10)E54

Table 2: A PRECIS assessment of 4 trials (part 3)

Domain; trial Assessment of domain

Analysis of primary outcome

DOT All randomized patients were included in the primary analysis. Patients who failed to meet the criteria for “successful treatment” (including those who died, were lost to follow-up or were transferred to another clinic) were classified “failures.” This was an extremely pragmatic approach.

NASCET The primary analysis was restricted to fatal and nonfatal strokes affecting the operated side of the cerebral circulation. In addition, blind adjudicators removed 3 NASCET patients after they were randomized because a review of their data before randomization revealed that they had other explanations for their symptoms (glaucoma, symptoms not arising from a carotid territory of the brain) or were inoperable (total occlusion of their carotid artery). However, patients were not excluded if they did not have a carotid endarterectomy or had uncontrolled blood pressure. This leaned toward an explanatory approach.

CLASP An intention-to-treat analysis was conducted on patients who completed the follow-up. Some subgroups, notably high-risk subgroups, were considered a priori. This was a fairly pragmatic approach.

Caritis et al. An intention-to-treat analysis was conducted on women with outcome data. An analysis, adjusted for compliance, was also performed. A number of additional “explanatory” analyses were conducted. This was fairly explanatory in its approach.

Note: PRECIS = pragmatic–explanatory continuum indicator summary, DOT = directly observed treatment, NASCET = North American Symptomatic Carotid Endarterectomy Trial, CLASP = Collaborative Low-dose Aspirin Study in Pregnancy.

Page 92: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

participants and the population they represent. The intention-to-treat analysis is also the norm for explanatory trials, espe-cially when regulatory approval for an intervention is beingsought. However, there are various restrictions that may (addi-tionally) be used to address the explanatory question “Can thisintervention work under ideal conditions?”:• Exclude noncompliant participants.• Exclude patients found to be ineligible after randomization.• Exclude data from nonadherent practitioners.• Plan multiple subgroup analyses for groups thought to

have the largest treatment effect.For some explanatory trials (e.g,. dose-finding trials), it

may be appropriate to have primary analysis restricted in theways mentioned, otherwise such restricted analyses of the pri-mary outcome would be preplanned as secondary analyses ofthe primary outcome. Note that, if all domains of the trialwere designed in an explanatory fashion and the trial wereconducted accordingly, the above restrictions should havevery little impact. A purely pragmatic approach would notconsider these restricted analyses.

Examples

To demonstrate the use of the PRECIS tool, we applied the in-strument to 4 trials exhibiting varying degrees of pragmatic andexplanatory approaches. Table 2 describes how these trials ad-dressed the 10 domains previously described. As we havestated previously, the PRECIS tool is intended to be used at thedesign stage. We have applied it post-hoc to these examples forillustrative purposes only.

The first example uses the trial of self-supervised and di-rectly observed treatment of tuberculosis (DOT).8 The DOTtrial asked the question: Among South African adults withnewly diagnosed pulmonary tuberculosis, does direct obser-vation of pill swallowing 5 times weekly by a nurse in theclinic, compared with self-administration, increase the proba-bility that patients will take more than 80% of the doseswithin 7 months of starting treatment, with no interruptions ofmore than 2 weeks? In this example, the experimental inter-vention was self-administration and the comparison interven-tion was DOT, which was widely used (throughout SouthAfrica and elsewhere) but not adequately evaluated.

The second example uses the North American Sympto-matic Carotid Endarterectomy Trial (NASCET).9 TheNASCET trial asked the question: Among patients withsymptomatic stenosis (70%–99%) of a carotid artery (andtherefore at high risk of stroke), can the addition of carotidendarterectomy (performed by an expert vascular or neuro-surgeon with an excellent track record) to best medical ther-apy, compared with best medical therapy alone, reduce theoutcomes of major stroke or death over the next 2 years?

The third example uses the Collaborative Low-dose As-pirin Study in Pregnancy (CLASP) trial.10 The placebo-controlled trial was designed to “provide reliable evidenceabout the overall safety of low-dose aspirin use in pregnancyand to find out whether treatment really produces worthwhileeffects on morbidity and on fetal and neonatal mortality.”

The final example uses the trial by Caritis and colleagues.11

This is another placebo-controlled trial of ASA designed todetermine whether low-dose ASA therapy could reduce theincidence of pre-eclampsia among women at high risk for thiscondition.

Figure 1 shows a blank wheel plot for summarizing the 10indicators. All that is left is to mark each spoke to representthe location on the explanatory (hub) to pragmatic (rim) con-tinuum and connect the dots.

Given the tactics used in the DOT trial in each of these di-mensions, if we link each of the dots to its immediate neigh-bour, we get a visual representation of the very broad pragmaticapproach of this trial (Figure 2A). Similarly, given the tacticsused in the NASCET trial in each of these domains, Figure 2Bprovides a visual representation of the mostly narrow explana-tory approach of this trial. The final 2 examples are trials of thesame intervention for the same condition. It can be seen fromFigure 2C and Figure 2D that the CLASP trial tended to bemore pragmatic than the trial by Caritis and colleagues.

Comment

The PRECIS tool is an initial attempt to identify and quantifytrial characteristics that distinguish between pragmatic andexplanatory trials to assist researchers in designing trials. Assuch, we welcome suggestions for its further development.For example, the tool is applicable to individually random-ized trials. It would probably apply to cluster randomized tri-als as well, but we have not tested it for those designs.

It is not hard to imagine that a judgment call is required toposition the dots on the wheel diagram, especially for do-mains that are not at an extreme. Because trials are typicallydesigned by a team of researchers, the PRECIS tool should beused by all involved in the design of the trial, leading to aconsensus view on where the trial is situated within the prag-matic–explanatory continuum. The possible subjectiveness ofdot placement should help focus the researcher’s attention onthose domains that are not as pragmatic or explanatory asthey would like. Clearly, domains where consensus is diffi-cult to achieve warrant more attention.

There are other characteristics that may more often bepresent in pragmatic trials but, because they can also befound in explanatory trials, are not immediately helpful fordiscrimination. An appreciation of these characteristics helpsround out the picture somewhat and assists with the interpre-tation of a given trial. For example, in a pragmatic trial, thecomparison intervention is, by definition, standard care. So,one would be unlikely to use a placebo group in a pragmatictrial. Therefore, although the presence of a placebo groupsuggests an explanatory trial, absence of a placebo groupdoes not necessarily suggest a pragmatic trial. Another ex-ample of this is blinding, whether it be blinded interventiondelivery or outcome assessment blinded to treatment assign-ment. Blinding is desirable in all trials to the extent possible.Blinding may be less practical to achieve in some pragmatictrials, but that does not imply that blinding is inconsistentwith a pragmatic trial.

Understanding the context for the applicability of the trialresults is essential for all trials. For example, the intervention

Analysis

CMAJ • MAY 12, 2009 • 180(10) E55

Page 93: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

studied in a pragmatic trial should be one that is feasible toimplement in the “real world” after the completion of thetrial. However, feasibility is often context specific. For exam-ple, an intervention could be easy to implement in Ontario,Canada, but all but impossible to implement in a low-incomecountry because of cost, different health care delivery systemsand many other reasons.

Our initial experiences developing the PRECIS tool sug-

gest that it has the potential to be useful for trial design, al-though we anticipate that some refinement of the scales willbe required. The reporting of pragmatic trials is addressedelsewhere.4 The simple graphical summary is a particularlyappealing feature of this tool. We believe it has value for theplanning of trials and the assessment of whether the design ofa trial is fit for purpose. The tool can help ensure the right bal-ance is struck to achieve the primary purpose of a trial, which

Analysis

CMAJ • MAY 12, 2009 • 180(10)E56

A B

DC

Flexibility of the comparison intervention

Practitioner expertise (experimental)

Flexibility of the experimental intervention

Eligibility criteria

Primary analysis

Practitioner adherence

Participant compliance

Outcomes

Follow-up intensity

Practitioner expertise (comparison)

Flexibility of the comparison intervention

Practitioner expertise (experimental)

Flexibility of the experimental intervention

Eligibility criteria

Primary analysis

Practitioner adherence

Participant compliance

Outcomes

Follow-up intensity

Practitioner expertise (comparison)

Flexibility of the comparison intervention

Practitioner expertise (experimental)

Flexibility of the experimental intervention

Eligibility criteria

Primary analysis

Practitioner adherence

Participant compliance

Outcomes

Follow-up intensity

Practitioner expertise (comparison)

Flexibility of the comparison intervention

Practitioner expertise (experimental)

Flexibility of the experimental intervention

Eligibility criteria

Primary analysis

Practitioner adherence

Participant compliance

Outcomes

Follow-up intensity

Practitioner expertise (comparison)

E E

EE

Figure 2: A: PRECIS summary of a randomized controlled trial of self-supervised and directly observed treatment of tuberculosis (DOT).8

B: PRECIS summary of the North American Symptomatic Carotid Endarterectomy Trial (NASCET) of carotid endarterectomy in sympto-matic patients with high-grade carotid stenosis.9 C: PRECIS summary of a randomized trial of low-dose acetylsalicylic acid (ASA) therapyfor the prevention and treatment of pre-eclampsia (CLASP).10 D: PRECIS summary of a randomized trial of low-dose ASA for the preven-tion of pre-eclampsia in women at high risk.11 “E” represents the “explanatory” end of the pragmatic–explanatory continuum.

Page 94: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

may be to answer an “explanatory” question about whether anintervention can work under ideal conditions or to answer a“pragmatic” question about whether an intervention doeswork under usual conditions. The PRECIS tool highlights themultidimensional nature of the pragmatic–explanatory contin-uum. This multidimensional structure should be borne inmind by trial designers and end-users alike so that overly sim-plistic labelling of trials can be avoided.

We would also like to caution readers to not confound thestructure of a trial with its usefulness to potential users.Schwartz and Lellouch clearly linked the ability of a trial tomeet its purpose with decisions about how the trial is de-signed and that, taken together, these decisions affect wherethe trial is placed on the explanatory–pragmatic continuum.1

However, how useful a trial is depends not only on design buton the similarity between the user’s context and that of thetrial. Although it is unreasonable to expect the results of atrial to apply in all contexts, trials should be designed and re-ported in such a way that users of the results can make mean-ingful judgments about applicability to their own context.12

Finally, we stress that this article, building on earlier workfrom multiple investigators, describes a “work in progress.”We welcome suggestions from all who read it, especially thosewho wish to join us in its further development. The words withwhich Schwartz and Lellouch closed their 1967 paper continueto apply: “This article makes no pretention to originality, nor tothe provision of solutions; we hope we have clarified certain is-sues to the extent of encouraging further discussion.”

REFERENCES1. Schwartz D, Lellouch J. Explanatory and pragmatic attitudes in therapeutical trials.

J Chronic Dis 1967;20:637-48. [Reprinted in J Clin Epidemiol 2009;62:499-505.]2. Sackett DL. Explanatory vs. management trials. In: Haynes RB, Sackett DL, Guy-

att GH, et al., editors. Clinical epidemiology: how to do clinical practice research.Philadelphia (PA): Lippincott, Williams and Wilkins; 2006.

3. Gartlehner G, Hansen RA, Nissman D, et al. A simple and valid tool distinguishedefficacy from effectiveness studies. J Clin Epidemiol 2006;59:1040-8.

4. Zwarenstein M, Treweek S, Gagnier J, et al.; CONSORT and Pragmatic Trials inHealthcare (Practihc) groups. Improving the reporting of pragmatic trials: an exten-sion of the CONSORT statement. BMJ 2008;337:a2390.

5. Tunis SR, Stryer DB, Clancy CM. Practical clinical trials: increasing the value ofclinical research for decision making in clinical and health policy. JAMA 2003;290:1624-32.

6. Tunis SR. A clinical research strategy to support shared decision making. HealthAff 2005;24:180-4.

7. Scheel IB, Hagen KB, Herrin J, et al. Blind faith? The effects of promoting ac-tive sick leave for back pain patients. A cluster-randomized trial. Spine 2002;27:2734-40.

8. Zwarenstein M, Schoeman JH, Vundule C, et al. Randomised controlled trial ofself-supervised and directly observed treatment of tuberculosis. Lancet 1998;352:1340-3.

9. North American Symptomatic Carotid Endarterectomy Trial Collaborators. Bene-ficial effect of carotid endarterectomy in symptomatic patients with high-gradecarotid stenosis. N Engl J Med 1991;325:445-53.

10. CLASP (Collaborative Low-dose Aspirin Study in Pregnancy) CollaborativeGroup. CLASP: a randomized trial of low-dose aspirin for the prevention and treat-ment of pre-eclampsia among 9364 pregnant women. Lancet 1994;343:619-29.

11. Caritis S, Sibai B, Hauth J, et al. Low-dose aspirin to prevent preeclampsia inwomen at high risk. N Engl J Med 1998;338:701-5.

12. Rothwell PM. External validity of randomised controlled trials: “To whom do theresults of this trial apply?” Lancet 2005;365:82-93.

Analysis

CMAJ • MAY 12, 2009 • 180(10) E57

Competing interests: None declared.

Contributors: All of the authors made significant contributions to the intel-lectual content of this paper, reviewed multiple drafts for important omis-sions and have approved the final manuscript.

Acknowledgements: We are especially indebted to Dr. David L. Sackett forhis encouragement and advice during the development of the tool and prepara-tion of this manuscript. We would also like to acknowledge the contributionsmade by the numerous attendees at the Toronto workshops in 2005 and 2008.

Correspondence to: Prof. Kevin E. Thorpe, Dalla Lana Schoolof Public Health, University of Toronto, 6th floor, HealthSciences Building, 155 College St., Toronto ON M5T 3M7;fax 416 864-6057; [email protected]

The Practihc group was supported by the European Commission’s 5thFramework INCO program (contract ICA4-CT-2001-10019). The 2005Toronto meeting was supported by a Canadian Institutes for Health Researchgrant (no. FRN 63095). The 2008 Toronto meeting was supported by the UKMedical Research Council, the Centre for Health Services Sciences at Sunny-brook Health Sciences Centre, Toronto, Canada, the Center for MedicalTechnology Policy, Baltimore, USA, and the National Institute for Healthand Clinical Excellence, London, UK.

Page 95: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

Atypical antipsychotics for aggression and psychosis in

Alzheimer’s disease (Review)

Ballard CG, Waite J, Birks J

This is a reprint of a Cochrane review, prepared and maintained by The Cochrane Collaboration and published in The Cochrane Library

2008, Issue 4

http://www.thecochranelibrary.com

Atypical antipsychotics for aggression and psychosis in Alzheimer’s disease (Review)

Copyright © 2008 The Cochrane Collaboration. Published by John Wiley & Sons, Ltd.

Page 96: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

T A B L E O F C O N T E N T S

1HEADER . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

1ABSTRACT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

2PLAIN LANGUAGE SUMMARY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

2BACKGROUND . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

3OBJECTIVES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

3METHODS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

5RESULTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

11DISCUSSION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

12AUTHORS’ CONCLUSIONS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

13ACKNOWLEDGEMENTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

13REFERENCES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

21CHARACTERISTICS OF STUDIES . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

34DATA AND ANALYSES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

Analysis 1.1. Comparison 1 RISPERIDONE VS PLACEBO, Outcome 1 BEHAVE-AD or NPI TOTAL (change from

baseline at 10-13 weeks) ITT. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46

Analysis 1.2. Comparison 1 RISPERIDONE VS PLACEBO, Outcome 2 BEHAVE-AD or NPI PSYCHOSIS (change

from baseline at 8-13 weeks) ITT. . . . . . . . . . . . . . . . . . . . . . . . . . . . 47

Analysis 1.3. Comparison 1 RISPERIDONE VS PLACEBO, Outcome 3 BEHAVE-AD AGGRESSIVENESS (change

from baseline at 12-13 weeks) ITT. . . . . . . . . . . . . . . . . . . . . . . . . . . 48

Analysis 1.4. Comparison 1 RISPERIDONE VS PLACEBO, Outcome 4 CMAI TOTAL AGGRESSIVENESS (change

from baseline at 10-13 weeks) ITT. . . . . . . . . . . . . . . . . . . . . . . . . . . 49

Analysis 1.5. Comparison 1 RISPERIDONE VS PLACEBO, Outcome 5 Number of dropouts before end of treatment at

8-13 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49

Analysis 1.6. Comparison 1 RISPERIDONE VS PLACEBO, Outcome 6 Number of dropouts due to adverse events before

end of treatment at 8-13 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . . . 50

Analysis 1.7. Comparison 1 RISPERIDONE VS PLACEBO, Outcome 7 Number of adverse events before end of treatment

at 8-13 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51

Analysis 1.8. Comparison 1 RISPERIDONE VS PLACEBO, Outcome 8 Number of adverse events of somnolence before

end of treatment at 8-13 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . . . 52

Analysis 1.9. Comparison 1 RISPERIDONE VS PLACEBO, Outcome 9 Number of adverse events of injury before end of

treatment at 8-13 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53

Analysis 1.10. Comparison 1 RISPERIDONE VS PLACEBO, Outcome 10 Number of adverse events of purpura before

end of treatment at 12-13 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . . . 54

Analysis 1.11. Comparison 1 RISPERIDONE VS PLACEBO, Outcome 11 Number of adverse events of pain before end

of treatment at 12-13 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55

Analysis 1.12. Comparison 1 RISPERIDONE VS PLACEBO, Outcome 12 Number of adverse events of coughing before

end of treatment at 12-13 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . . . 56

Analysis 1.13. Comparison 1 RISPERIDONE VS PLACEBO, Outcome 13 Number of adverse events of rhinitis before

end of treatment at 12-13 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . . . 57

Analysis 1.14. Comparison 1 RISPERIDONE VS PLACEBO, Outcome 14 Number of adverse events of upper respiratory

tract infection before end of treatment at 8-13 weeks. . . . . . . . . . . . . . . . . . . . . 58

Analysis 1.15. Comparison 1 RISPERIDONE VS PLACEBO, Outcome 15 Number of adverse events of a fall before end

of treatment at 8-13 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59

Analysis 1.16. Comparison 1 RISPERIDONE VS PLACEBO, Outcome 16 Number of adverse events of extrapyramidal

disorder before end of treatment at 8-13 weeks. . . . . . . . . . . . . . . . . . . . . . . 60

Analysis 1.17. Comparison 1 RISPERIDONE VS PLACEBO, Outcome 17 Number of adverse events of urinary tract

infection before end of treatment at 12-13 weeks. . . . . . . . . . . . . . . . . . . . . . 61

Analysis 1.18. Comparison 1 RISPERIDONE VS PLACEBO, Outcome 18 Number of adverse events of peripheral

oedema before end of treatment at 12-13 weeks. . . . . . . . . . . . . . . . . . . . . . . 62

Analysis 1.19. Comparison 1 RISPERIDONE VS PLACEBO, Outcome 19 Number of adverse events of fever before end

of treatment at 12-13 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63

iAtypical antipsychotics for aggression and psychosis in Alzheimer’s disease (Review)

Copyright © 2008 The Cochrane Collaboration. Published by John Wiley & Sons, Ltd.

Page 97: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

Analysis 1.20. Comparison 1 RISPERIDONE VS PLACEBO, Outcome 20 Number of adverse events of agitation before

end of treatment at 8-13 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . . . 64

Analysis 1.21. Comparison 1 RISPERIDONE VS PLACEBO, Outcome 21 Number of adverse events of tremor before

end of treatment at 8-13 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

Analysis 1.22. Comparison 1 RISPERIDONE VS PLACEBO, Outcome 22 Number of adverse events of abnormal gait

before end of treatment at 8-13 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . 65

Analysis 1.23. Comparison 1 RISPERIDONE VS PLACEBO, Outcome 23 Number of adverse events of confusion before

end of treatment at 8-13 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . . . 66

Analysis 1.24. Comparison 1 RISPERIDONE VS PLACEBO, Outcome 24 Number of adverse events of hallucination

before end of treatment at 8-13 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . 66

Analysis 1.25. Comparison 1 RISPERIDONE VS PLACEBO, Outcome 25 Number of adverse events of urinary

incontinence before end of treatment at 8-13 weeks. . . . . . . . . . . . . . . . . . . . . 67

Analysis 1.26. Comparison 1 RISPERIDONE VS PLACEBO, Outcome 26 Number of adverse events of asthenia before

end of treatment at 8-13 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . . . 67

Analysis 1.27. Comparison 1 RISPERIDONE VS PLACEBO, Outcome 27 Number of adverse events of hostility before

end of treatment at 8-13 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . . . 68

Analysis 1.28. Comparison 1 RISPERIDONE VS PLACEBO, Outcome 28 Number of adverse events of flu syndrome

before end of treatment at 8-13 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . 68

Analysis 1.29. Comparison 1 RISPERIDONE VS PLACEBO, Outcome 29 Number of adverse events of dyspnea before

end of treatment at 8-13 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . . . 69

Analysis 1.30. Comparison 1 RISPERIDONE VS PLACEBO, Outcome 30 Number of adverse events of angina pectoris

before end of treatment at 8-13 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . 69

Analysis 1.31. Comparison 1 RISPERIDONE VS PLACEBO, Outcome 31 Number of cerebrovascular adverse events

before end of treatment at 8-13 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . 70

Analysis 1.32. Comparison 1 RISPERIDONE VS PLACEBO, Outcome 32 Number of deaths before end of treatment at

8-13 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70

Analysis 2.1. Comparison 2 RISPERIDONE 2 mg/day vs 1 mg/day, Outcome 1 BEHAVE-AD (change from baseline at

12 weeks) ITT. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71

Analysis 2.2. Comparison 2 RISPERIDONE 2 mg/day vs 1 mg/day, Outcome 2 BEHAVE-AD PSYCHOSIS (change

from baseline at 12 weeks) ITT. . . . . . . . . . . . . . . . . . . . . . . . . . . . 71

Analysis 2.3. Comparison 2 RISPERIDONE 2 mg/day vs 1 mg/day, Outcome 3 BEHAVE-AD AGGRESSION(change

from baseline at 12 weeks) ITT. . . . . . . . . . . . . . . . . . . . . . . . . . . . 72

Analysis 2.4. Comparison 2 RISPERIDONE 2 mg/day vs 1 mg/day, Outcome 4 Number of dropouts before end of

treatment at 12 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72

Analysis 2.5. Comparison 2 RISPERIDONE 2 mg/day vs 1 mg/day, Outcome 5 Number of dropouts due to adverse events

before end of treatment at 12 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . . 73

Analysis 2.6. Comparison 2 RISPERIDONE 2 mg/day vs 1 mg/day, Outcome 6 Number of adverse events of somnolence

before end of treatment at 12 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . . 73

Analysis 2.7. Comparison 2 RISPERIDONE 2 mg/day vs 1 mg/day, Outcome 7 Number of adverse events of injury before

end of treatment at 12 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74

Analysis 2.8. Comparison 2 RISPERIDONE 2 mg/day vs 1 mg/day, Outcome 8 Number of adverse events of fall before

end of treatment at 12 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74

Analysis 2.9. Comparison 2 RISPERIDONE 2 mg/day vs 1 mg/day, Outcome 9 Number of adverse events of extrapyramidal

disorder before end of treatment at 12 weeks. . . . . . . . . . . . . . . . . . . . . . . . 75

Analysis 2.10. Comparison 2 RISPERIDONE 2 mg/day vs 1 mg/day, Outcome 10 Number of adverse events of urinary

tract infection before end of treatment at 12 weeks. . . . . . . . . . . . . . . . . . . . . . 75

Analysis 2.11. Comparison 2 RISPERIDONE 2 mg/day vs 1 mg/day, Outcome 11 Number of adverse events of peripheral

oedema before end of treatment at 12 weeks. . . . . . . . . . . . . . . . . . . . . . . . 76

Analysis 2.12. Comparison 2 RISPERIDONE 2 mg/day vs 1 mg/day, Outcome 12 Number of adverse events of purpura

before end of treatment at 12 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . . 76

Analysis 2.13. Comparison 2 RISPERIDONE 2 mg/day vs 1 mg/day, Outcome 13 Number of adverse events of fever

before end of treatment at 12 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . . 77

iiAtypical antipsychotics for aggression and psychosis in Alzheimer’s disease (Review)

Copyright © 2008 The Cochrane Collaboration. Published by John Wiley & Sons, Ltd.

Page 98: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

Analysis 2.14. Comparison 2 RISPERIDONE 2 mg/day vs 1 mg/day, Outcome 14 Number of adverse events of pain

before end of treatment at 12 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . . 77

Analysis 2.15. Comparison 2 RISPERIDONE 2 mg/day vs 1 mg/day, Outcome 15 Number of adverse events of coughing

before end of treatment at 12 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . . 78

Analysis 2.16. Comparison 2 RISPERIDONE 2 mg/day vs 1 mg/day, Outcome 16 Number of adverse events of agittion

before end of treatment at 12 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . . 78

Analysis 2.17. Comparison 2 RISPERIDONE 2 mg/day vs 1 mg/day, Outcome 17 Number of adverse events of rhinitis

before end of treatment at 12 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . . 79

Analysis 2.18. Comparison 2 RISPERIDONE 2 mg/day vs 1 mg/day, Outcome 18 Number of adverse events of upper

respiratory tract infection before end of treatment at 12 weeks. . . . . . . . . . . . . . . . . . 79

Analysis 3.1. Comparison 3 OLANZAPINE VS PLACEBO, Outcome 1 NPI-NH TOTAL (change from baseline at 6-10

weeks) ITT. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80

Analysis 3.2. Comparison 3 OLANZAPINE VS PLACEBO, Outcome 2 NPI-NH PSYCHOSIS (change from baseline at

6-10 weeks) ITT. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81

Analysis 3.3. Comparison 3 OLANZAPINE VS PLACEBO, Outcome 3 NPI-NH AGITATION/AGGRESSION (change

from baseline at 6-10 weeks) ITT. . . . . . . . . . . . . . . . . . . . . . . . . . . . 82

Analysis 3.4. Comparison 3 OLANZAPINE VS PLACEBO, Outcome 4 NPI-NH ANXIETY (change from baseline at 6-

10 weeks) ITT. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83

Analysis 3.5. Comparison 3 OLANZAPINE VS PLACEBO, Outcome 5 NPI-NH APATHY/INDIFFERENCE (change

from baseline at 6-10 weeks) ITT. . . . . . . . . . . . . . . . . . . . . . . . . . . . 84

Analysis 3.6. Comparison 3 OLANZAPINE VS PLACEBO, Outcome 6 NPI-NH EUPHORIA/ELATION (change from

baseline at 6-10 weeks) ITT. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85

Analysis 3.7. Comparison 3 OLANZAPINE VS PLACEBO, Outcome 7 NPI-NH IRRITABILITY/LABILITY (change

from baseline at 6-10 weeks) ITT. . . . . . . . . . . . . . . . . . . . . . . . . . . . 86

Analysis 3.8. Comparison 3 OLANZAPINE VS PLACEBO, Outcome 8 NPI-NH ABERRANT MOTOR (change from

baseline at 6-10 weeks) ITT. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87

Analysis 3.9. Comparison 3 OLANZAPINE VS PLACEBO, Outcome 9 NPI-NH DEPRESSION (change from baseline

at 6-10 weeks) ITT. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88

Analysis 3.10. Comparison 3 OLANZAPINE VS PLACEBO, Outcome 10 NPI-NH CARER DISTRESS (change from

baseline at 6-10 weeks) ITT. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89

Analysis 3.11. Comparison 3 OLANZAPINE VS PLACEBO, Outcome 11 BPRS TOTAL (change from baseline at 6-10

weeks) ITT. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90

Analysis 3.12. Comparison 3 OLANZAPINE VS PLACEBO, Outcome 12 Number of dropouts before end of treatment

at 6-10 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91

Analysis 3.13. Comparison 3 OLANZAPINE VS PLACEBO, Outcome 13 Number of dropouts due to adverse events

before end of treatment at 6-10 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . 92

Analysis 3.14. Comparison 3 OLANZAPINE VS PLACEBO, Outcome 14 Number of deaths before end of treatment at

6-10 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93

Analysis 3.15. Comparison 3 OLANZAPINE VS PLACEBO, Outcome 15 Number of adverse events due to accidental

injury before end of treatment at 6-10 weeks. . . . . . . . . . . . . . . . . . . . . . . . 94

Analysis 3.16. Comparison 3 OLANZAPINE VS PLACEBO, Outcome 16 Number of adverse events due to somnolence

before end of treatment at 6-10 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . 95

Analysis 3.17. Comparison 3 OLANZAPINE VS PLACEBO, Outcome 17 Number of adverse events due to pain before

end of treatment at 6-10 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . . . 96

Analysis 3.18. Comparison 3 OLANZAPINE VS PLACEBO, Outcome 18 Number of adverse events due to abnormal

gait before end of treatment at 6-10 weeks. . . . . . . . . . . . . . . . . . . . . . . . . 97

Analysis 3.19. Comparison 3 OLANZAPINE VS PLACEBO, Outcome 19 Number of adverse events due to anorexia

before end of treatment at 6-10 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . 98

Analysis 3.20. Comparison 3 OLANZAPINE VS PLACEBO, Outcome 20 Number of adverse events due to ecchymosis

before end of treatment at 6-10 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . 99

Analysis 3.21. Comparison 3 OLANZAPINE VS PLACEBO, Outcome 21 Number of adverse events due to fever before

end of treatment at 6-10 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . . . 100

iiiAtypical antipsychotics for aggression and psychosis in Alzheimer’s disease (Review)

Copyright © 2008 The Cochrane Collaboration. Published by John Wiley & Sons, Ltd.

Page 99: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

Analysis 3.22. Comparison 3 OLANZAPINE VS PLACEBO, Outcome 22 Number of adverse events due to agitation

before end of treatment at 6-10 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . 101

Analysis 3.23. Comparison 3 OLANZAPINE VS PLACEBO, Outcome 23 Number of adverse events due to weight loss

before end of treatment at 6-10 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . 102

Analysis 3.24. Comparison 3 OLANZAPINE VS PLACEBO, Outcome 24 Number of adverse events due to increased

cough before end of treatment at 6-10 weeks. . . . . . . . . . . . . . . . . . . . . . . . 103

Analysis 3.25. Comparison 3 OLANZAPINE VS PLACEBO, Outcome 25 Number of adverse events due to peripheral

oedema before end of treatment at 6-10 weeks. . . . . . . . . . . . . . . . . . . . . . . 104

Analysis 3.26. Comparison 3 OLANZAPINE VS PLACEBO, Outcome 26 Number of adverse events due to nervousness

before end of treatment at 6-10 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . 105

Analysis 3.27. Comparison 3 OLANZAPINE VS PLACEBO, Outcome 27 Number of adverse events due to hostility

before end of treatment at 6-10 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . 106

Analysis 3.28. Comparison 3 OLANZAPINE VS PLACEBO, Outcome 28 Number of adverse events due to confusion

before end of treatment at 6-10 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . 107

Analysis 3.29. Comparison 3 OLANZAPINE VS PLACEBO, Outcome 29 Number of adverse events due to anxiety before

end of treatment at 6-10 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . . . 108

Analysis 3.30. Comparison 3 OLANZAPINE VS PLACEBO, Outcome 30 Number of adverse events due to dry mouth

before end of treatment at 6-10 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . 108

Analysis 3.31. Comparison 3 OLANZAPINE VS PLACEBO, Outcome 31 Number of adverse events due to akathisia

before end of treatment at 6-10 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . 109

Analysis 3.32. Comparison 3 OLANZAPINE VS PLACEBO, Outcome 32 Number of adverse events due to falls before

end of treatment at 6-10 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . . . 110

Analysis 3.33. Comparison 3 OLANZAPINE VS PLACEBO, Outcome 33 Number of adverse events due to weight

increase before end of treatment at 6-10 weeks. . . . . . . . . . . . . . . . . . . . . . . 111

Analysis 3.34. Comparison 3 OLANZAPINE VS PLACEBO, Outcome 34 Number of adverse events due to hallucination

before end of treatment at 6-10 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . 111

Analysis 3.35. Comparison 3 OLANZAPINE VS PLACEBO, Outcome 35 Number of adverse events due to urinary

incontinence before end of treatment at 6-10 weeks. . . . . . . . . . . . . . . . . . . . . 112

Analysis 3.36. Comparison 3 OLANZAPINE VS PLACEBO, Outcome 36 Number of adverse events due to asthenia

before end of treatment at 6-10 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . 112

Analysis 3.37. Comparison 3 OLANZAPINE VS PLACEBO, Outcome 37 Number of adverse events due to flu syndrome

before end of treatment at 6-10 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . 113

Analysis 3.38. Comparison 3 OLANZAPINE VS PLACEBO, Outcome 38 Number of adverse events due to dyspnea

before end of treatment at 6-10 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . 113

Analysis 3.39. Comparison 3 OLANZAPINE VS PLACEBO, Outcome 39 Number of adverse events due to angina

pectoris before end of treatment at 6-10 weeks. . . . . . . . . . . . . . . . . . . . . . . 114

Analysis 3.40. Comparison 3 OLANZAPINE VS PLACEBO, Outcome 40 Number of cerebrovascular adverse events

before end of treatment at 6-10 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . 114

Analysis 3.41. Comparison 3 OLANZAPINE VS PLACEBO, Outcome 41 CGI (change from baseline at 6-10 weeks)

ITT. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115

Analysis 3.42. Comparison 3 OLANZAPINE VS PLACEBO, Outcome 42 MMSE (change from baseline at 6-10 weeks)

ITT. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116

Analysis 3.43. Comparison 3 OLANZAPINE VS PLACEBO, Outcome 43 BEHAVE-AD TOTAL (change from baseline

at 6-10 weeks). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117

Analysis 3.44. Comparison 3 OLANZAPINE VS PLACEBO, Outcome 44 CGI-Severity (change from baseline at 6-10

weeks) ITT. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117

Analysis 4.1. Comparison 4 ARIPIPRAZOLE VS PLACEBO, Outcome 1 BEHAVE-AD PSYCHOSIS (change from

baseline at 10 weeks) ITT. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118

Analysis 4.2. Comparison 4 ARIPIPRAZOLE VS PLACEBO, Outcome 2 BPRS TOTAL (change from baseline at 10

weeks) ITT. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119

Analysis 4.3. Comparison 4 ARIPIPRAZOLE VS PLACEBO, Outcome 3 BPRS - PSYCHOSIS SUBSCALE (change

from baseline at 10 weeks) ITT. . . . . . . . . . . . . . . . . . . . . . . . . . . . 119

ivAtypical antipsychotics for aggression and psychosis in Alzheimer’s disease (Review)

Copyright © 2008 The Cochrane Collaboration. Published by John Wiley & Sons, Ltd.

Page 100: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

Analysis 4.4. Comparison 4 ARIPIPRAZOLE VS PLACEBO, Outcome 4 Number of dropouts before end of treatment at

10 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120

Analysis 4.5. Comparison 4 ARIPIPRAZOLE VS PLACEBO, Outcome 5 Number of dropouts due to adverse events

before end of treatment at 10 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . . 120

Analysis 4.6. Comparison 4 ARIPIPRAZOLE VS PLACEBO, Outcome 6 Number of deaths before end of treatment at 10

weeks. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121

Analysis 4.7. Comparison 4 ARIPIPRAZOLE VS PLACEBO, Outcome 7 Number of adverse events due to urinary tract

infection before end of treatment at 10 weeks. . . . . . . . . . . . . . . . . . . . . . . . 121

Analysis 4.8. Comparison 4 ARIPIPRAZOLE VS PLACEBO, Outcome 8 Number of adverse events due to accidental

injury before end of treatment at 10 weeks. . . . . . . . . . . . . . . . . . . . . . . . . 122

Analysis 4.9. Comparison 4 ARIPIPRAZOLE VS PLACEBO, Outcome 9 Number of adverse events due to somnolence

before end of treatment at 10 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . . 122

Analysis 4.10. Comparison 4 ARIPIPRAZOLE VS PLACEBO, Outcome 10 Number of adverse events due to bronchitis

before end of treatment at 10 weeks. . . . . . . . . . . . . . . . . . . . . . . . . . . 123

Analysis 4.11. Comparison 4 ARIPIPRAZOLE VS PLACEBO, Outcome 11 Number of adverse events due to

extrapyramidal symptoms before end of treatment at 10 weeks. . . . . . . . . . . . . . . . . . 123

Analysis 5.1. Comparison 5 QUETIAPINE VS PLACEBO, Outcome 1 CMAI TOTAL (change from baseline at 6 weeks)

ITT. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124

Analysis 5.2. Comparison 5 QUETIAPINE VS PLACEBO, Outcome 2 CMAI TOTAL (change from baseline at 26

weeks) ITT. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124

Analysis 5.3. Comparison 5 QUETIAPINE VS PLACEBO, Outcome 3 SIB (change from baseline at 6 weeks) ITT. 125

Analysis 5.4. Comparison 5 QUETIAPINE VS PLACEBO, Outcome 4 SIB (change from baseline at 26 weeks) ITT. 125

Analysis 5.5. Comparison 5 QUETIAPINE VS PLACEBO, Outcome 5 Number of dropouts by end of treatment at 26

weeks. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126

126WHAT’S NEW . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

126HISTORY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

127CONTRIBUTIONS OF AUTHORS . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

127DECLARATIONS OF INTEREST . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

127SOURCES OF SUPPORT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

127NOTES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

127INDEX TERMS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

vAtypical antipsychotics for aggression and psychosis in Alzheimer’s disease (Review)

Copyright © 2008 The Cochrane Collaboration. Published by John Wiley & Sons, Ltd.

Page 101: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

[Intervention Review]

Atypical antipsychotics for aggression and psychosis inAlzheimer’s disease

Clive G Ballard1, Jonathan Waite2, Jacqueline Birks3

1The Wolfson Centre for Age-Related Diseases , Kings College London, London, UK. 2Old Age Psychiatary, The Courtyard, Not-

tingham , UK. 3Centre for Statistics in Medicine, University of Oxford, Oxford, UK

Contact address: Clive G Ballard, The Wolfson Centre for Age-Related Diseases , Kings College London, The Wolfson Wing, Hodgkin

Building, Guy’s Campus, London, SE1 1UL, UK. [email protected].

Editorial group: Cochrane Dementia and Cognitive Improvement Group.

Publication status and date: Edited (no change to conclusions), published in Issue 4, 2008.

Review content assessed as up-to-date: 20 February 2006.

Citation: Ballard CG, Waite J, Birks J. Atypical antipsychotics for aggression and psychosis in Alzheimer’s disease. Cochrane Database

of Systematic Reviews 2006, Issue 1. Art. No.: CD003476. DOI: 10.1002/14651858.CD003476.pub2.

Copyright © 2008 The Cochrane Collaboration. Published by John Wiley & Sons, Ltd.

A B S T R A C T

Background

Aggression, agitation or psychosis occur in the majority of people with dementia at some point in the illness. There have been a number

of trials of atypical antipsychotics to treat these symptoms over the last five years, and a systematic review is needed to evaluate the

evidence in a balanced way.

Objectives

To determine whether evidence supports the use of atypical antipsychotics for the treatment of aggression, agitation and psychosis in

people with Alzheimer’s disease.

Search strategy

The trials were identified from a last updated search of the Specialized Register of the Cochrane Dementia and Cognitive Improve-

ment Group on 7 December 2004 using the terms olanzapine, quetiapine, risperidone, clozapine, amisulpride, sertindole, zotepine,

aripiprazole, ziprasidone. This Register contains articles from all major healthcare databases and many ongoing trials databases and is

updated regularly.

Selection criteria

Randomised, placebo-controlled trials, with concealed allocation, where dementia and psychosis and/or aggression were assessed.

Data collection and analysis

1. Three reviewers extracted data from included trials

2. Data were pooled where possible, and analysed using appropriate statistical methods

3. Analysis included patients treated with an atypical antipsychotic, compared with placebo

1Atypical antipsychotics for aggression and psychosis in Alzheimer’s disease (Review)

Copyright © 2008 The Cochrane Collaboration. Published by John Wiley & Sons, Ltd.

Page 102: for Clinical Trials 31st Annual Meeting P10 P10... · Sunday, May 16, 2010 1 ... contention or wager, be 300 Florens, deposited on both sides:Here your business is decided.” 9 Van

Main results

Sixteen placebo controlled trials have been completed with atypical antipsychotics although only nine had sufficient data to contribute

to a meta-analysis and only five have been published in full in peer reviewed journals. No trials of amisulpiride, sertindole or zotepine

were identified which met the criteria for inclusion.

The included trials led to the following results:

1. There was a significant improvement in aggression with risperidone and olanzapine treatment compared to placebo.

2. There was a significant improvement in psychosis amongst risperidone treated patients.

3. Risperidone and olanzapine treated patients had a significantly higher incidence of serious adverse cerebrovascular events (including

stroke), extrapyramidal side effects and other important adverse outcomes.

4. There was a significant increase in drop-outs in risperidone (2 mg) and olanzapine (5-10 mg) treated patients.

5. The data were insufficient to examine impact upon cognitive function.

Authors’ conclusions

Evidence suggests that risperidone and olanzapine are useful in reducing aggression and risperidone reduces psychosis, but both are

associated with serious adverse cerebrovascular events and extrapyramidal symptoms. Despite the modest efficacy, the significant increase

in adverse events confirms that neither risperidone nor olanzapine should be used routinely to treat dementia patients with aggression

or psychosis unless there is severe distress or risk of physical harm to those living and working with the patient. Although insufficient

data were available from the considered trials, a meta-analysis of seventeen placebo controlled trials of atypical neuroleptics for the

treatment of behavioural symptoms in people with dementia conducted by the Food and Drug Administration suggested a significant

increase in mortality (OR 1.7). A peer-reviewed meta-analysis (Schneider 2005) of 15 placebo controlled studies (nine unpublished)

found similarly increased risk in mortality (OR=1.54, 95% CI 0.004 to 0.02, p=0.01) for the atypical neuroleptics.

P L A I N L A N G U A G E S U M M A R Y

Atypical antipsychotics benefit people with dementia but the risks of adverse events may outweigh the benefits, particularly

with long term treatment

Atypical antipsychotics have become the pharmacological treatment of choice for many clinicians in the treatment of behavioural and

psychiatric symptoms in people with dementia, and the largest evidence base for double blind placebo controlled trials in this area is

for risperidone. Particularly in view of recent safety concerns, a meta-analysis of efficacy and adverse events to inform clinical practice

is timely. Modest efficacy is evident, but the elevated risk of cerebrovascular adverse events, mortality, upper respiratory infections,

oedema and extrapyramidal symptoms is a concern, particularly as selective reporting makes interpretation of other potential adverse

outcomes impossible.

B A C K G R O U N D

Five percent of people aged over 65 and 20% of those over 80 have

dementia, with 700,000 suffers in the UK alone, a number that

will continue to rise as the age of the population increases. More

than 50% of people with dementia experience behavioural and

psychological disturbances (BPSD). BPSD are distressing for the

patients (Gilley 1991), problematic for their carers (Rabins 1982)

in whom they are associated with clinically significant depression

(Ballard 1996) and are frequently the trigger for placement in

residential or nursing home care (Steele 1990).

Pharmacological treatment with antipsychotic agents (also known

as neuroleptics) is often the first line treatment for these disorders,

despite evidence of only modest efficacy. Up until the last decade

there were only a few double-blind placebo-controlled trials per-

taining to the treatment of BPSD, the majority of which were con-

ducted using typical antipsychotics such as haloperidol. However,

these agents have a particularly high risk of harmful side effects for

people with dementia, and the Chief Medical Officer for England

and Wales (CMO 2004) has recommended that they should be

2Atypical antipsychotics for aggression and psychosis in Alzheimer’s disease (Review)

Copyright © 2008 The Cochrane Collaboration. Published by John Wiley & Sons, Ltd.