a simple method of evaluating medical literature

8/12/2019 A Simple Method of Evaluating Medical Literature

1/6

May 2004 www.aafp.org/fpm F A M I L Y P R A C T I C E M A N A G E M E N T 47

Keeping up with the latest advances in diagnosis

and treatment is a challenge we all face as phy-cians. We need information that is both valid

(that is, accurate and correct) and relevant toour patients and practices. While we have many sourcesof clinical information, such as CME lectures, textbooks,pharmaceutical advertising, pharmaceutical representatives

and colleagues, we often turn to journal articles for themost current clinical information.Unfortunately, a great deal of research reported in journal

articles is poorly done, poorly analyzed or both, and thusis not valid. A great deal of research is also irrelevant to ourpatients and practices. Separating the clinical wheat fromthe chaff can take skills that many of us never were taught.

The article Making Evidence-Based Medicine Doablein Everyday Practice in the February 2004 issue of FPMdescribes several organizations that can help us. Theseorganizations, such as the Cochrane Library, Bandolierand Clinical Evidence, develop clinical questions and then

review one or more journal articles to identify the bestavailable evidence that answers the question, with a focuson the quality of the study, the validity of the results andthe relevance of the findings to everyday practice. Theseorganizations provide a very valuable service, and the num-ber of important clinical questions that they have studiedhas grown steadily over the past five years. (See Four stepsto an evidence-based answer on page 48.)

If you find a systematic review or meta-analysis doneby one of these organizations, you can feel confident thatyou have found the current best evidence. However, theseorganizations have not asked all of the common clinicalquestions yet, and you will frequently be faced with findingthe pertinent articles and determining for yourself whetherthey are valuable. This is where the PP-ICONS approachcan help.

What is PP-ICONS?When you find a systematic review, meta-analysisor randomized controlled trial while readingyour clinical journals or searching PubMed(http://www.ncbi.nlm.nih.gov/entrez/query.fcgi),you need to determine whether it is valid and

relevant. There are many different ways to analyzean abstract or journal article, some more rigorousthan others.1,2I have found a simple but effectiveway to identify a valid or relevant article withina couple of minutes, ensuring that I can useor discard the conclusions with confidence.This approach works well on articles regardingtreatment and prevention, and can also be usedwith articles on diagnosis and screening.

A Simple

Method forEvaluatingthe ClinicalLiterature

The PP-ICONS approach willhelp you separate the clinical wheatfrom the chaff in mere minutes.

Robert J. Flaherty, MD

Dr. Flaherty is a family physician with the Student Health Service

and WWAMI Medical Program, Montana State University-Bozeman,and a clinical professor in the Department of Family Medicine,

University of Washington. Conflicts of interest: none reported.ILLUSTR

ATIO

N

BY

G

REG

CLARKE

Downloaded from the Family Practice Management Web site at www.aafp.org/fpm. Copyright 2004 American Academy of Family Physicians. For the private,

noncommercial use of one individual user of the Web site. All other rights reserved. Contact [email protected] for copyright questions and /or permission requests.


2/6

48 F A M I L Y P R A C T I C E M A N A G E M E N T www.aafp.org/fpm May 2004

The most important information to lookfor when reviewing an article can be sum-marized by the acronym PP-ICONS,which stands for the following:

Problem, Patient or population, Intervention,

Comparison, Outcome, Number of subjects, Statistics.For example, imagine that you just saw

a nine-year-old patient in the office withcommon warts on her hands, an ideal candi-date for your usual cryotherapy. Her motherhad heard about treating warts with ducttape and wondered if you would recommendthis treatment. You promised to call Momback after you had a chance to investigate

this rather odd treatment.When you get a free moment, you write

down your clinical question: Is duct tapean effective treatment for warts in chil-dren? Writing down your clinical questionis useful, as it can help you clarify exactlywhat you are looking for. Use the PPICOparts of the acronym to help you write yourclinical question; this is actually how manyresearchers develop their research questions.

You search Cochrane and Bandolier with-out success, so now you search PubMed,which returns an abstract for the followingarticle: Focht DR 3rd, Spicer C, FairchokMP. The efficacy of duct tape vs cryotherapy

in the treatment of verruca vulgaris (thecommon wart).Arch Pediatr Adolesc Med.2002 Oct;156(10):971-974.

You decide to apply PP-ICONS to thisabstract (shown on page 49) to determine ifthe information is both valid and relevant.

Problem. The first P in PP-ICONS is forproblem, which refers to the clinical con-dition that was studied. From the abstract,it is clear that the researchers studied thesame problem you are interested in, whichis important since flat warts or genital wartsmay have responded differently. Obviously,if the problem studied were not sufficientlysimilar to your clinical problem, the resultswould not be relevant.

Patient or population. Next, considerthe patient or population. Is the study groupsimilar to your patient or practice? Are theyprimary care patients, for example, or arethey patients who have been referred to a

Physicians often turn

to journal articles to

find the most current

clinical informa-

tion; however, much

research is not valid

or relevant.

The PP-ICONS

approach can help

physicians quickly

determine whether a

particular study

is valuable.

PP-ICONS stands for

problem, patient or

population, interven-

tion, comparison,outcome, number of

subjects and statistics.

Writing down your

clinical question can

help you clarify exactly

what you are looking

for in the clinical

literature.

SPEEDBAR

Reading the abstract is often sufficient when eval-

uating an article using the PP-ICONS approach.

The most relevant studies will involve outcomes

that matter to patients (e.g., morbidity, mortal-

ity and cost) versus outcomes that matter to

physiologists (e.g., blood pressure, blood sugar orcholesterol levels).

Ignore the relative risk reduction, as it overstates

research findings and will mislead you.

KEY POINTS

When faced with a clinical question, follow these steps to find an evidence-based answer:

1. Search the Web site of one of the evidence review organizations, such as Cochrane (http: //www.cochrane.org/

cochrane/revabstr/mainindex.htm), Bandolier (http://www.jr2.ox.ac.uk/bandolier) or Clinical Evidence (http://

www.clinicalevidence.com), described in Making Evidence-Based Medicine Doable in Everyday Practice, FPM,

February 2004, page 51. You can also search the TRIP+ Web site (http: //www.tripdatabase.com), which simul-taneously searches the databases of many of the review organizations. If you find a systematic review or meta-

analysis by one of these organizations, you can be confident that youve found the best evidence available.

2. If you dont find the information you need through step 1, search for meta-analyses and systematic reviews

using the PubMed Web site (see the tutorial at http:/ /www.nlm.nih.gov/bsd/pubmed_tutorial/m1001.html).

Most of the recent abstracts found on PubMed provide enough information for you to determine the validity

and relevance of the findings. If needed, you can get a copy of the full article through your hospital library or the

journals Web site.

3. If you cannot find a systematic review or meta-analysis on PubMed, look for a randomized controlled trial

(RCT). The RCT is the gold standard in medical research. Case reports, cohort studies and other research meth-

ods simply are not good enough to use for making patient care decisions.

4. Once you find the article you need, use the PP-ICONS approach to evaluate its usefulness to your patient.

FOUR STEPS TO AN EVIDENCE-BASED ANSWER


3/6


E V A L U A T I N G C L I N I C A L L I T E R A T U R E

tertiary care center? Are they of a similarage and gender? In this case, the researchersstudied children and young adults inoutpatient clinics, which is similar to yourpatient population. If the patients in thestudy are not similar to your patient, forexample if they are sicker, older, a differentgender or more clinically complicated, theresults might not be relevant.

Intervention.The interventioncould be a diag-nostic test or atreatment. Make

sure the inter-vention is thesame as what youare looking for.The patients mother was asking about ducttape for warts, so this is a relevant study.

Comparison. The comparison is whatthe intervention is tested against. It could bea different diagnostic test or another thera-py, such as cryotherapy in this wart study.It could even be placebo or no treatment.Make sure the comparison fits your ques-

tion. You usually use cryotherapy for com-mon warts, so this is a relevant comparison.

Outcome. The outcome is particularlyimportant. Many outcomes are disease-oriented outcomes, which are based ondisease-oriented evidence (DOEs). DOEsusually reflect changes in physiologic param-eters, such as blood pressure, blood sugar,cholesterol, etc. We have long assumedthat improving the physiologic parametersof a disease will result in a better disease

outcome, butthat is not neces-sarily true. Forinstance, finaste-ride can improve

urinary flowrate in prostatichypertrophy, butit does not signifi-

cantly change symptom scores.3

DOEs look at the kinds of outcomes thatphysiologists care about. More relevant areoutcomes that patients care about, oftencalled patient-oriented outcomes. Theseare based on patient-oriented evidence thatmatters (POEMs) and look at outcomessuch as morbidity, mortality and cost. Thus,

when looking at a journal article, DOEsare interesting but of questionable relevance,

ABSTRACT FROM PUBMED

We have long assumed that improving

the physiologic parameters of a disease

will result in a better disease outcome,but that is not necessarily true.

Using the PP-ICONSapproach, physicians

can evaluate the

validity and relevance

of clinical articles in

minutes using only

the abstract, such as

this one, obtained free

online from PubMed,

http://www.ncbi.

nlm.nih.gov/entrez/

query.fcgi. The author

uses this abstract to

evaluate the use of

duct tape to treat

common warts.

For a study to be use-

ful, it must address the

same problem you are

interested in and your

patient must be similar

to those in the study

group.

The study interven-

tion should also be the

same as what you are

looking for.

Disease-oriented out-

comes may be interest-

ing but of questionable

relevance, whereas

patient-oriented out-

comes are interestingand relevant.

Using PubMed online,

you can find abstracts

for the clinical articles

that match your search

parameters.

SPEEDBAR


4/6


whereas POEMs are very interesting andvery relevant. In the study on the previouspage, the outcome is complete resolution ofthe wart, which is something your patient isinterested in.

Number. The number of subjects iscrucial to whether accurate statistics can be

generated from the data. Too few patients ina research study may not be enough to showthat a difference actually exists between theintervention and comparison groups (knownas the power ofa study). Manystudies are pub-lished withless than 100subjects, which isusually inadequateto provide reliable

statistics. A goodrule of thumb is400 subjects.4Fifty-one patients completedthe wart study, which is a pretty small num-ber to generate good statistics.

Statistics. The statistics you are inter-ested in are few in number and easy tounderstand. Since statistics are frequentlymisused in journal articles, it is worth afew minutes to learn which to believe andwhich to ignore.

Relative risk reduction. It is not unusu-al to find a summary statement in a journal

article similar to this one from an articletitled Long-Term Effects of MammographyScreening: Updated Overview of the Swed-ish Randomised Trials:5

There were 511 breast cancer deathsin 1,864,770 women-years in the invitedgroups and 584 breast cancer deaths in

1,688,440 women-years in the controlgroups, a significant 21 percent reductionin breast cancer mortality.

This 21-percent statistic is the relativerisk reduction(RRR), whichis the percentreduction in themeasured out-come betweenthe experimen-tal and control

groups. (SeeSome important

statistics, below, for more information oncalculating the RRR and other statistics.)The RRR is not a good way to compareoutcomes. It amplifies small differencesand makes insignificant findings appearsignificant, and it doesnt reflect the baselinerisk of the outcome event. Nevertheless, theRRR is very popular and will be reported innearly every journal article, perhaps becauseit makes weak results look good. Think ofthe RRR as the reputation reviving ratio

Too few patients in a

research study may

not be enough to show

that a difference actu-

ally exists between the

intervention and com-

parison groups.

Studies with less than

100 subjects are usu-

ally inadequate to pro-

vide reliable statistics.

The relative risk

reduction is not a

good way to compare

outcomes, as it ampli-

fies small differences

and doesnt reflect the

baseline risk of the

outcome event.

SPEEDBAR

The relative risk reduction is very

popular and will be reported in nearly

every journal article, perhaps because

it makes weak results look good.

Absolute risk reduction (ARR): The difference between the control groups event rate (CER) and the experimental groups event rate (EER).

Control event rate (CER): The proportion of patients responding to placebo or other control t reatment. For example, if 25 patients are in a control

group and the event being studied is observed in 15 of those patients, the control event rate would be 15/25 = 0.60.

Experimental event rate (EER): The proportion of patients responding to the experimental treatment or intervention. For example, if 26 patients

are in an experimental group and the event being studied is observed in 22 of those patients, the experimental event rate would be 22/26 = 0.85.

Number needed to treat (NNT) : The number of patients that must be treated to prevent one adverse outcome or for one patient to benefit. TheNNT is the inverse of the ARR; NNT = 1/ARR.

Relative risk reduction (RRR): The percent reduction in events in the treated group compared to the control group event rate.

When the experimentaltreatment reduces therisk of a bad event:

Example: Beta-blockers to preventdeaths in high-risk patients withrecent myocardial infarction:

When the experimentaltreatment increases theprobability of a good event:

Example: Duct tape to eliminatecommon warts.

Relative riskreduction (RRR)

CER-EER/CER (.66 -. 50)/.66 = .24 or 24 percent EER-CER/CER (.85-.60)/.60 = .42 or 42 percent

Absolute riskreduction (ARR):

CER-EER (.66 - .50) = .16 or 16 percent EER-CER .85-.60 = .25 or 25 percent

Number needed

to treat (NNT)1/ARR 1/.16 = 6 1/ARR 1/.25 = 4

SOME IMPORTANT STATISTICS


5/6


E V A L U A T I N G C L I N I C A L L I T E R A T U R E

or the reporters reason for rit-ing. Ignore the RRR. It willmislead you. In our wart treat-ment example, the RRR wouldbe (85 percent - 60 percent)/60percent x 100 = 42 percent. TheRRR could thus be interpreted

as showing that duct tape is42 percent more effective thancryotherapy in treating warts.

Absolute risk reduction.Abetter statistic is the absolute riskreduction (ARR), which is thedifference in the outcome eventrate between the control groupand the experimental treatedgroup. Thus, in our wart treat-ment example, the ARR is theoutcome event rate (complete

resolution of warts) for duct tape(85 percent) minus the outcomeevent rate for cryotherapy (60percent) = 25 percent. Unlikethe RRR, the ARR does notamplify small differencesbut shows the true differencebetween the experimental andcontrol interventions. Using theARR, it would be accurate tosay that duct tape is 25-percentmore effective than cryotherapyin treating warts.

Number needed to treat. The singlemost clinically useful statistic is the numberneeded to treat (NNT). The NNT is thenumber of patients who must be treatedto prevent one adverse outcome. To thinkabout it another way, the NNT is the num-ber of patients whomust be treated for onepatient to benefit. (Therest who were treated

obtained no benefit,although they still suf-fered the risks and costsof treatment.) In ourwart therapy article, the NNT would tell ushow many patients must be treated with theexperimental treatment for one to benefitmore than if he or she had been treated withthe standard treatment.

Now this is a statistic that physiciansand their patients can really appreciate!Furthermore, the NNT is easy to calculate,

as it is simply the inverse of the ARR.For our wart treatment study, the NNT is

1/25 percent =1/0.25 = 4, meaning that 4patients need to be treated with duct tapefor one to benefit more than if treatedby cryotherapy.

Wrapped up in this simple little statisticare some very important concepts. TheNNT provides you with the likelihoodthat the test or treatment will benefit any

individual patient,an impression of thebaseline risk of theadverse event, and

a sense of the costto society. Thus, itgives perspective andhints at the reason-

ableness of a treatment. The value of thisstatistic has become appreciated in the lastfive years, and more journal articles arereporting it.

What is a reasonable NNT? In a perfectworld, a treatment would have an NNT of1, meaning that every patient would benefitfrom the treatment. Real life is not so kind

(see Examples of NNTs, above). Clearly,an NNT of 1 is great and an NNT of 1,000

Now this is a statistic that

physicians and their patients

can really appreciate!

The number needed to treat (NNT) is one of the most usefulstatistics for physicians and patients. It calculates the number of

patients that must be treated to prevent one adverse event or for

one patient to benefit. Note that NNTs for preventive interventions

will usually be higher than NNTs for treatment interventions. The

lower the NNT, the better.

The following examples of NNTs are borrowed from an excellent list

available through the Bandolier Web site at http://www.jr2.ox.ac.uk/

bandolier/band50/b50-8.html.

Therapy NNT

Triple antibiotic therapy to eradicate H. pylori 1.1

Isosorbide dinitrate for prevention of

exercise-induced angina5

Short course of antibiotics for otitis media in children 7

Statins for secondary prevention of adverse

cardiovascular outcomes11

Statins for primary prevention of adverse

cardiovascular outcomes35

Finasteride to prevent one operation for benign

prostatic hyperplasia39

Misoprostol to prevent any gastrointestinal compli-

cation in nonsteroidal anti-inflammatory drug users166

EXAMPLES OF NNTS

The absolute risk

reduction shows

the true difference

between the experi-

mental and control

interventions.

The most clinically use-

ful statistic is the num-

ber needed to treat

(NNT), which tells you

the number of patients

who must be treated

to prevent one adverse

outcome or for one

patient to benefit.

The NNT is easy to cal-

culate, as it is simplythe inverse of the abso-

lute risk reduction.

In a perfect world, a

treatment would have

an NNT of 1, but real

life is not so kind.

SPEEDBAR


6/6


is terrible. Although it is hard to come upwith firm guidelines, for primary therapiesI am satisfied with an NNT of 10 orless and very pleased with an NNT lessthan 5. Our duct tape NNT of 4 is good,particularly since the treatment is cheap,easy and painless.

Note that NNTs for preventive inter-ventions (e.g., the use of aspirin to preventcardiac problems) willusually be higher thanNNTs for treatmentinterventions (e.g.,the use of duct tape tocure warts). Preventiongroups contain bothhigher-risk and lower-risk individuals, so theyproduce bigger denomi-

nators, whereas treatment groups onlycontain diseased patients. Thus, an NNTfor prevention of less than 20 might beparticularly good.

When discussing a particular therapy,I explain the NNT to my patient. Sincethis statistical concept is easy to understand,it can help the patient be a more informedpartner in therapeutic decisions.

You will soon start to see a similar sta-tistic, the number needed to screen (NNS),which is the number of patients neededto screen for a particular disease for agiven duration for one patient to benefit.6Although few NNSs have been calculated,they are likely to involve higher numbers,since the screening population consists ofpatients with and without the disease. Forexample, in the article on mammographyscreening mentioned above, the NNS was961 for 16 years. In other words, you wouldneed to screen 961 women for 16 years toprevent one breast cancer death.

The good news and the badUsing PP-ICONS to assess the wart study,the problem, the patient/population, theintervention, the comparison and the out-come are all relevant to your patient. Thenumber of subjects is on the small side,making you a little wary, but the interven-tion is cheap and low-risk. The statistics,particularly the NNT, are reasonable. Onbalance, this looks l ike a fair approach, soyou call the patients mother and discuss

it with her.The PP-ICONS approach is an easy

way to screen an article for validity andrelevance, and the abstract often containsall of the information you need. Even thestatistics can be done quickly in your head.You can apply PP-ICONS when search-ing for a particular article, when you comeacross an article in your reading, when

data are presented at lectures, when a phar-maceutical representative hands you anarticle to support hisor her pitch, and evenwhen reading newsstories describingmedical breakthroughs.

Dont be discour-aged if you find thathigh-quality articles arerare, even in the mostprestigious journals.

This seems to be changing for the better,although many careers are stil l being builton questionable research. Nevertheless,screening articles will help you find thetruth that is out there and will helpyou practice the best medicine. And aswe become more discerning end-usersof research, we might just stimulateimprovements in clinical research inthe process.

Send comments to [email protected].

1. Miser WF. Critical appraisal of the literature.

J Amer Board Fam Pract. 1999;12(4):315-333.

2. Guyatt GH, et al. Users guides to the medi-

cal literature. How to use an art icle about therapy

or prevention. Are the results of the study valid?

JAMA. 1993;270(21):2598-2601.

3. Lepor H, et al. The efficacy of terazosin, fin-

asteride or both in benign prostatic hyperplasia.

Veterans Af fairs Cooperative Studies BenignProstatic Hyperplasia Study Group. N Engl J Med.

1996;335(8):533-539.

4. Krejcie RV, Morgan DW. Determining sample

size for research activities. Educational and Psycho-

logical Measurement. 1970;30:607-610.

5. Nystrom L, et al, Long-term effects of

mammography screening: updated overview

of the Swedish randomised trials. Lancet.

2002;359(9310): 909-919.

6. Rembold CM. Number needed to screen:

development of a statistic for disease screening.BMJ. 1998;317:307-312.

You can apply PP-ICONS

even when reading news

stories describing medical

breakthroughs.

NNTs for preventive

interventions will usu-

ally be higher than

NNTs for treatment

interventions.

Explaining the NNT to

patients can help them

be more informed

partners in therapeutic

decisions.

If a studys number of

subjects is small, its

intervention may still

be worth trying if its

risks and costs are low.

The abstract of an

article often contains

all of the information

you need to screen it

for validity using the

PP-ICONS approach.

SPEEDBAR

a simple method of evaluating medical literature

Documents