acp: clinical programs - guidelines -...

http://www.acponline.org/clinical_information/guidelines/process/algorithms/

2

AlgorithmsAlgorithms are schematic models of the clinical decision pathway described in a guideline. The decision points are represented with yes/no nodes, and the clinical characteristics, test characteristics, ortreatment options are also simplified into their basic components. Because of this abbreviated format, algorithms can be very useful clinical tools but are not exhaustive. An algorithm, like the guideline itself, cannot take into account all patient-related variables such as disease state, clinical setting, or economic and insurance issues. Thus, they are to be used only as a guide and should never override the clinician's and patient's best judgment.Index:

Management of Acute Exacerbations of Chronic Obstructive Pulmonary Disease NEW

Pharmacological Treatment of Acute Major Depression and Dysthymia

We would appreciate your feedback about the algorithm as aguideline form. We are particularly interested in your answers to the following questions:

Do you prefer the algorithm format to the full text version?

What do you like and not like about it?

How would you have structured the information?

Algorithm Development Models

Hadorn, David C. Use of algorithms in clinical guideline development in Clinical Practice Guideline Development: Methodology Perspectives, pp 93-104. AHCPR Pub. No. 95-0009. Agency for Health Care Policy and Research: Rockville, MD, Jan. 1995.

Society for Medical Decision Making, Committee on Standardization of Clinical Algorithms. Proposal for Clinical Algorithm Standards in Med Decis Making 12:149-54; 1992.

Acrobat PDF format. Download Acrobat Reader softwarefor free from Adobe. Problems with PDFs?

Search PIER® - DecisionSupport

ACP Members Only. Decision support for over 460 clinical topics.

New Clinical Topics

Current Pharmacologic Treatment of DementiaAnnals of Internal Medicine, Mar 4, 2008A Clinical Practice Guideline from the American College of Physicians and the American Academy ofFamily Physicians.

Screening for Chronic Obstructive Pulmonary Disease Using SpirometryAnnals of Internal Medicine, April 1, 2008 (early release online) U.S. Preventive Services Task Force Recommendation Statement.

Systematic Review: Randomized, Controlled Trials of Nonsurgical Treatments for Urinary Incontinence in WomenAnnals of Internal Medicine, Feb 12, 2008(early release online) Systematic Review: To synthesize evidence of management of urinary incontinence in women.

Access your peers' clinical expertise in our ACP members-only Clinical Practice Discussion Group

Surf and Seek Contest

ACP is running a quick, fun contest to celebrate the launch of the redesigned ACP Online. Contest is opento ACP members.

BEGIN THE CHALLENGE NOW

Recertifying in Internal Medicine This Year?

This chapter describes how algorithms can be used toorganize and summarize the recommendations containedin clinical practice guidelines.

What Are Algorithms?The word “algorithm” is unfamiliar to most people. Yet

almost everyone is familiar with the basic concept under-lying the use of algorithms. For example, many willrecall how animals are classified using a branchingsystem of characteristics which permits distinctions to bemade between mammals (hair or fur), reptiles (smoothskin, do not breathe under water), and fish (smooth skin,breathe under water) (Figure 1). More detailed systems ofclassification permit finer distinctions to be made,enabling us to distinguish, for instance, between amuskrat (furry plus pointy tail) and a beaver (furry plusflat tail).

Several things are immediately apparent in Figure 1.First, the algorithm consists of a set of boxes containingeither “yes-or-no” questions or “answers” to the animalidentification problem. The boxes containing questionsalways have two arrows coming out of them, one corre-sponding to a “yes” answer, the other to a “no.” Arrowsare always one-way in the direction indicated by thearrowhead and are followed until an “answer box,” orterminal node, is reached. Some arrows leave thealgorithm, e.g., mammals larger or smaller than abreadbox, indicating that animals with these characteris-tics are not considered further in this particularalgorithm. The shape of the boxes follows commonlyaccepted criteria and depends on whether a question oranswer is contained within the box.

The algorithm in Figure 1 could have been constructedvery differently while maintaining its ability to distin-guish among different types of animals. Alternatedescriptors could have been used, including whether theunidentified animal was warm- or coldblooded, gavebirth to live young vs. laid eggs, was carnivorous vs.herbivorous, pair-bonded for life or “played the field,”etc.In theory, any of several branching schemes

From RAND, Santa Monica, CA.Abbreviation: AHCPR = Agency for Health Care Policy and Research.

could lead to the same result. However, the variablesactually selected in Figure 1 are more “user-friendly”than the suggested alternate descriptors. In addition, theyare efficient discriminators among different types ofanimals. An important part of the art of algorithmconstruction—whether for identifying animals or fordiagnosing illness—is the careful selection, of identifica-tion variables that possess these two properties, i.e., user-friendliness and efficiency.

As illustrated by the previous example, algorithms areflow diagrams that consist of branching-logic pathwayswhich permit the application of carefully defined criteriato the task of identifying or classifying different types ofsome entity.

Algorithms have been developed for a wide variety ofsuch tasks, including the identification of trees (e.g., bythe shape of the leaves combined with type of bark),classification of bacteria (e.g., aerobic vs. anaerobiccombined with Gram-positive vs. Gram-negative), andqualitative analysis in chemistry (e.g., precipitates aschloride but not as nitrate). Part of the beauty ofalgorithms is the diversity of their application toeveryday life.

Algorithms in Health CareAlgorithms have been used in the health care setting for

many years, often as aids to clinical diagnosis. It will benoted that diagnosis is another form of classification andidentification. Figure 2 shows a simple diagnosticalgorithm based on nurse-practitioner and physician-assistant protocols in common use today in managed-careplans and other settings.

This algorithm depicts a diagnostic strategy aimed atclassifying patients as those with either viral or strepto-coccal pharyngitis. Patients are classified as having viralpharyngitis in one of two ways:, (1) presumptively (i.e.,without a throat culture) if they are relatively afebrile(≤102°F) and have no exudate on their tonsils and haveassociated viral symptoms, or (2) if a throat culture isnegative for streptococci. Patients are classified as havingstreptococcal pharyngitis only if a throat culture ispositive for streptococci; moreover, the algorithm

M E T H O D O L O G Y

- 93 -

Use of Algorithms in Clinical Guideline DevelopmentDavid C. Hadorn, M.D., M.A.

indicates that patients should receive a throat culture only

if they have one or more of the above signs or symptoms.The variables selected for identification purposes in

this algorithm are clearly accessible and observable, buthow efficient are they at distinguishing between viral andstreptococcal pharyngitis? The answer to this and similarquestions depends in large part on the sensitivity, speci-ficity, and predictive value of the various diagnosticsigns, symptoms, and tests. We will return to this issuebelow.

Management Algorithms

Simple classification algorithms, including thosedesigned to aid clinical diagnosis, require no action onthe part of the user other than observation, including inthe case of clinical medicine, patient examination and thenoting of test results. (Ordering a test is not consideredas an action here.)

For purposes of the Agency for Health Care Policy andResearch (AHCPR) clinical guideline program, suchalgorithms are incomplete. Clinicians not only diagnose,

- 94 -

Figure 1. Approach to identifying unknown animal

Unknownanimal

Fur or hair?Yes

Mammal

About asbig as a

breadbox?

No

Feathers?Yes

BirdLives in

the woods andswims under

water?

No

Smoothskin?

NoWho

knows? Flat tail?

NoMuskrat

BeaverBreathes

underwater?

NoReptile/

amphibian

Fish

Yes

Yes

Yes

Yes

No

No

Yes

they also provide treatment. Moreover, diagnosis andtreatment often go hand in hand in what is commonlyreferred to as a management strategy. Algorithms thatinclude both diagnostic and treatment modalities arereferred to as management algorithms. As an example,Figure 3 depicts the approach to the unconscious heartattack victim.

The main structural difference between managementalgorithms and simple diagnostic algorithms is thepresence of boxes (nodes) containing instructions (e.g.,“Begin rescue breathing"). In contrast to question nodes(which always have two exit arrows—one for “yes” andone for “no”), instruction nodes have only one exit arrow.

Management algorithms appear to differ from thealgorithms discussed above in that they seem to gobeyond the classification and identification tasks associ-ated with those algorithms. This is not the case, however.Management algorithms also classify patients intoseparate subgroups (“types of patients”) according todifferences in needs (and appropriate management strate-gies). Classification occurs each time an instruction nodeor terminal node is encountered.

For example, using the heart attack algorithm in Figure3, any patient who clutches his or her chest and becomesunconscious is classified into one of three specificcategories: (1) breathing, does not need emergency inter-vention; (2) not breathing but has a pulse, needs rescuebreathing but not chest compressions; (3) not breathingand has no pulse, needs both rescue breathing and chestcompressions. Thus, patients are classified according tothe type of procedure or treatment that they require orthat is appropriate for them. Indeed, even purely diagnos-tic algorithms classify patients prior to reaching terminalnodes, e.g., into patients who should receive (or who“need”) a throat culture vs. those who should not.

The primary function of management algorithms is toidentify patients who stand to benefit (or not benefit)from a particular management strategy (or range ofstrategies). Judgments of expected benefit are tricky, ofcourse, and in effect constitute outcome predictions withtheir own degree of sensitivity, specificity, and predictivevalue. Outcome studies are essential if these judgmentsare to be made as accurately as possible.


- 95 -

No

Figure 2. Approach to patient with sore throat

Patientcomplains ofsore throat

Fever over102°F?

YesThroat culture

Exudate ontonsils?

Associatedviral symptoms?

(e.g., runnynose)?

Yes

YesPositive forstrep?

Viral pharyngitisYes

No

Strep pharyngitis

No

No

Advantages of AlgorithmsHaving defined management algorithms, we shall now

look at their potential advantages and disadvantages,particularly with respect to their use in conjunction withthe development of clinical guidelines.

• Algorithms conveniently convey the scope of aguideline, summarizing at a glance the types ofpatients covered and not covered, as well as the rangeof management decisions and strategies addressed. Inaddition, algorithms serve to organize the guideline,enabling the user to see the “big picture” with respectto how the different sections of guideline relate to eachother.

• Algorithms have been shown to result in fasterlearning, higher retention, and better compliance withestablished practice standards than standard prose text

(Grimm, Shimoni, Harlon et al., 1975; Kormaroff,Black, Flatley et al., 1974; Sox, 1979; Sox, Sox, andTompkins, 1973). Algorithms have also been usedsuccessfully for retrospective quality review activitiesin a variety of settings (Gottlieb, Margolis, andSchoenbaum, 1990; Greenfield, 1990; Greenfield,Cretin, Worthman et al., 1981; Palmer, Strain, Maureret al., 1984).

• One of the most valuable features of algorithms is thatthey identify situations in which testing is unnecessary.Too often, testing is carried out irrespective of whetherthe subsequent management strategy would change asa result of the tests. With the algorithm approach,testing is incorporated in the flow of patient evaluationonly if “downstream” management strategies dependon test results. Figure 4 illustrates this concept. In thissituation, test results determine subsequent action. Hadthis not been the case, the algorithm would have beenconstructed without this testing node.

• Properly constructed algorithms help guideline devel-opers specify appropriate indications for particularmanagement strategies. For example, the first node inFigure 4 asks whether a patient is at high risk for aparticular condition.

Importantly, this question is tantamount to asking“What are the appropriate indications for testing?”Alternate phraseology might be “Increased risk ofcondition?” or “High (or increased) likelihood ofcondition?” or “Appropriate candidate for testing?” Theselected phrase must then be specifically defined withinthe algorithm’s annotation (see below) into specificcriteria. Which patients, exactly, are at high risk? Whatare the findings (e.g., symptoms, signs) that distinguishhigh-risk patients from those at low risk (and who,therefore, should not be tested)?

The use of decision-relevant questions within thealgorithm boxes is an important feature in the construc-tion of management algorithms. Properly phrased, thesequestions compel guideline developers to define clearlythe types of patients who should or should not be consid-ered to receive particular interventions or who should orshould not be managed in a defined manner (Hadorn,McCormick, and Diokno, 1992). Thus, managementalgorithms are congruent with the Institute of Medicine’sobservation that guidelines must “use unambiguouslanguage, define terms precisely, and use logical, easy-to-follow modes of presentation” (Field and Lohr, 1990).

• Algorithms are readily translatable into computerizedformats, which permit the application of guidelinerecommendations to quality assessment and utilization

- 96 -

Figure 3. Approach to the unconscious heartattack victim

Patient collapsesafter clutching

chest

Is patientbreathing?

Monitor patientuntil help arrives

Yes

Begin rescuebreathing

Doespatient have

a pulse?

Yes

No

No

Patientbreathing onown again?

No

Yes

Begin chestcompressions

Continue rescuebreathing until

help arrives

Pulserestored?

Yes

Continue chestcompressions until

help arrives

No

review activities (two congressionally mandated usesfor AHCPR’s guidelines). Indeed, computerization ofguidelines is dependent on the use of algorithmicformats to illustrate decisionmaking and appropriatecare.

• Algorithms permit the sort of modeling and testingrequired to explore the impacts of changing assump-tions about outcomes, costs, and preferences on thestructure and content of clinical guidelines. Criticalbranches or pathways can be selected and guidelinepanels presented with a range of differing assumptionsto determine whether and to what extent the manage-ment recommendations contained in those branches orpathways change in light of changing assumptions.

Criticisms of the Use of AlgorithmsTwo major criticisms have been leveled at the use of

algorithms in clinical practice. The most commonlyheard criticism is that algorithms impose a locksteprigidity on physicians, turning them into “robots who donot think.” Patients are too variable in their presentationsand preferences, it is said, to encapsulate them withinpredefined roadmaps: Second, the use of algorithms inmedical practice has been challenged on the grounds ofclinical validity. For example, Kassirer and Kopelmanhave asked:

Are the algorithms currently being offered for use inpractice or in teaching valid? Is their basis rigorouslyrooted in actual clinical data? ... Can they deal withthe uncertainties inherent in clinical practice?(Kassirer and Kopelman, 1990, p. 24)

Too often, the answer to these questions has been “no.” Inparticular, insufficient effort has been expended to linkalgorithms to the literature in order to ensure maximumpossible clinical validity.

The algorithms developed to date by some of theAHCPR guideline panels have attempted to overcome thetwo limitations listed above, i.e., rigidity and question-able validity, through the use of algorithm features notfound in most previous algorithms. First, to address theconcern about lack of adequate flexibility, some ofAHCPR’s guideline algorithms contain special nodesshowing where major preference-dependent decisionsoccur. These counseling and decision nodes are insertedto indicate points during the evaluation and managementprocess where patients should be presented with two ormore clinically appropriate courses of action (e.g.,medical vs. surgical management). To the extent possible,the expected outcome associated with each option ispresented at each decision node, including evidencetables whenever possible.

Figure 5 shows the draft algorithm developed by theAHCPR heart failure guideline panel. Counseling anddecision nodes are present in several locations (nodes 12,14, 19), depicting decisions about whether to undergofurther testing or surgery as opposed to medicaltreatment. Such decisions are the most common use forthese nodes.

Counseling and decision nodes provide an important,indeed critical, degree of clinical flexibility to guidelinealgorithms. By presenting the range of clinically appro-priate testing or treatment options, managementalgorithms can summarize for physicians and otherproviders the conclusions reached by the guideline panelsduring their literature review and subsequent delibera-tions. At the same time, patients are able to “apply theirpreferences” to the outcome evidence presented at eachdecision node and to select the option that best suits theirvalues and preferences.

The counseling and decision nodes are not meant torepresent the only points at which patient counseling isrequired, however. Additional counseling is almostalways required, as well, within testing and treatmentnodes, e.g., regarding specific drugs or surgical proce-dures. For this reason, annotations to test and treatmentnodes should summarize the considerations bearing on


- 97 -

Figure 4. Approach to decision about whether to test

High riskfor particular

condition?

No

Yes

No testing

Test

Test resultsdiagnostic ofcondition?

Yes

No

Treat forcondition

Look elsewherefor problem

- 98 -

Rev

ascu

lari

zati

onno

t acc

epta

ble

Fig

ure

5. D

raft

of

man

agem

ent

alg

ori

thm

fo

r p

atie

nts

wit

h he

art

failu

re

Pat

ient

pre

sent

sw

ith

sym

ptom

sof

HF

Spe

cifi

cca

use

of s

xid

enti

fied

?

Yes

Init

ial

eval

uati

on

Not

cov

ered

by

this

gui

deli

ne

Req

uire

hosp

ital

man

agem

ent?

Yes

No

Cli

nica

lvo

lum

eov

erlo

ad?

Ech

o: v

alve

dise

ase?

Yes

Init

iate

diur

etic

s

No

No

Eje

ctio

nfr

acti

on>

35%

?

Yes

Not

cov

ered

by

this

gui

deli

ne

Pat

ient

and

fam

ily

coun

seli

ng

Init

ial m

edic

alm

anag

emen

t

Con

trai

n-di

cati

on to

reva

sc.? N

o

Cou

nsel

ing

and

deci

sion

:

No

sign

if.

angi

na a

nd n

oM

I

No

sign

if.

angi

na b

ut

+ M

I

Rev

ascu

lari

zati

on

ac

cept

able

+ S

igni

f.an

gina

Cou

nsel

ing

and

deci

sion

Phy

siol

ogic

alte

st:

sign

ific

ant

posi

tive

find

ings

?

Cor

onar

yan

giog

ram

:si

gnif

ican

tpo

siti

vefi

ndin

gs?

Yes

No

No

Cou

nsel

ing

and

deci

sion

Con

tinu

e m

edic

alm

anag

emen

t

Goo

dou

tcom

e?

Rev

ascu

lari

ze

Fol

low

up

Add

itio

nal m

edic

alm

anag

emen

t

No

Yes

Can

dida

tefo

r he

art

tran

spla

nt?

Ref

er f

orev

alua

tion

for

hear

ttr

ansp

lant

1

2

3 4

5

6

7 8

9 10

11

12

13

14

15

16

17

18

19

20

21

22

2324

25

Yes

No

26

Yes

Yes

No

HF

= h

eart

fai

lure

; Ech

o =

ech

ocar

diog

ram

; rev

asc

= r

evas

cula

riza

tion

; MI

= m

yoca

rdia

l inf

arct

ion.

these “second-level” choices (including additionalevidence tables, where feasible) and should refer to thesections of the guideline text where these choices arediscussed.

We turn now to the question of algorithm validity. TheAHCPR guideline panels are in an excellent position tocreate algorithms based (to the extent possible) on data.Because systematic literature reviews are an integral partof the AHCPR guideline-development process, theguideline’s algorithms can link the guideline’s manage-ment recommendations to the literature (and, asnecessary, to expert consensus).

The basic approach for accomplishing this linkagerelies on the systematic annotation of each node withinthe algorithm, i.e., wherever specific findings, character-istics, or interventions are described or prescribed. Theannotation summarizes the guideline’s more detailedtextual material, concerning that particular aspect ofpatient management, referring to the guideline text wherethe matter is discussed. Citations to literature are, in turn,provided in that text.

For example, the heart failure management algorithmdepicted in Figure 5 begins by defining the symptomswith which heart failure patients present. Accompanyingdetailed annotation (not shown) summarizes thesesymptoms, including their relative specificity for heartfailure, and refers the reader to the guideline text whereavailable literature on symptoms is discussed. Thus,careful and systematic annotation to each node providesthe explicit linkage of recommendations to the literaturerequired for an algorithm to be considered valid.

To summarize, algorithms are systems of classificationand identification which permit clinicians to approachpatients with specified presentations in an effective andefficient manner. The goal of management algorithms isto summarize the recommendations contained in theguideline by identifying succinctly the types of patientswho are predicted to receive (or not receive) significantbenefit from a particular management strategy. Variablesused to guide this identification process should be easilyaccessible and capable of discriminating efficientlyamong different types of patients. Annotated manage-ment algorithms have the following advantages forguideline development:

• Rapidly depict the scope of the guideline and organizecontent.

• Crystallize the guideline’s key management recom-mendations in a “user-friendly” format.

• Provide for an adequate level of clinical flexibility.

• Permit the precise definition of terms.

• Ensure that the algorithms are explicitly linked to theliterature.

• Provide an adequate level of clinical detail.

• Incorporate specific outcome estimates on which thealgorithms recommendations are based.

• Permit the explicit consideration of and respect forpatients’ preferences through the use of counseling anddecision nodes.

• Allow for testing the extent to which varying outcome,preference, and cost assumptions would alter theguideline’s management recommendations.

• Facilitate the translation of guidelines into computerformats.

• Identify important gaps in the literature requiringadditional research.

Methodologic ConsiderationsIn this section, we will consider certain methodologic

issues that arise in the development of annotated manage-ment algorithms. These are (1) selection of descriptorvariables, (2) the problem of incomplete data bases andconsequent uncertainties about appropriate managementstrategies, (3) how to integrate quantitative with qualita-tive information, and (4) how to represent healthoutcomes.

Selection of descriptor variables

The importance of selecting descriptor variables on thebasis of observability and discriminatory power hasalready been noted. The former attribute is usually self-evident (e.g., observation of variable should not requireexotic tests), but determining the power of a variable todistinguish among different types of patients is moredifficult.

The efficiency of a given variable to separate patientsinto clinically meaningful subgroups depends in largepart on the sensitivity, specificity, and predictive powerof that variable to determine the optimum managementstrategy (or range of strategies) for a given patient. Forexample, one of the first nodes in most algorithms is,generically, “Screening examination determines thatproblem is caused by something other than what thisalgorithm is about?” In the case of heart failure (Figure5), for example, a finding (in nodes 2-3) of significantvalvular disease would immediately classify patients intoa category other than that of heart failure due to coronaryartery disease or idiopathic dilated cardiomyopathy, themajor entities to which the algorithm is addressed. Thus,this screening node has a high sensitivity for patients who


- 99 -

do not “belong” in the algorithm and, conversely, a highspecificity for patients with the condition of interest.

Subsequent nodes in Figure 5 attempt to duplicate thisfeat by distinguishing among patients requiring differentmanagement strategies. The heart failure panel believedthat the most efficient dividing criterion (after exclusionof specific causes of symptoms) was whether patientsrequire acute inpatient management due to the severity oftheir symptoms. That is, answering the question posed innode 4, “Require hospital management?” has high“leverage” value in determining appropriate manage-ment. Similar strategies are adopted for each node until“action nodes” (e.g., nodes 9, 10, 21) are reached or thepatient leaves the algorithm.

There are two brief points worth making about thisprocess. First, guideline development panels shouldattempt to base the selection of descriptor variables ondata and evidence in the literature whenever possible.Ideally, estimates of predictive (or discriminatory) powershould come from controlled studies, although onlyrarely will this ideal be realized in practice. Morecommonly, panels will need to arrive at consensus aboutwhat they believe are the best variables for discriminatingamong patients who require different management strate-gies. Panels should provide rationales for their selectionto the extent possible. Major gaps in the literature shouldbe identified and addressed through research activities.

The second observation about this variable-selectionprocess is that it has strong conceptual and proceduralparallels in the statistical arena. Regression analysis,which isolates and quantifies the impact of a givenvariable on outcomes, is by far the most common statisti-cal method for estimating variables’ discriminatorypower. Another approach, known as recursive partition-ing (Breiman, Friedman, Olshen et al., 1984; Cook andGoldman, 1984; Marshall, 1986), separates an overallpopulation of patients into ever smaller subgroups basedon Boolean combinations of variables (e.g., more than 65years of age and hematocrit less than 30 and systolicblood pressure greater than 160 mm Hg) in an attempt toidentify a set of relatively homogeneous “types ofpatients.” Unlike regression techniques, which define theprobability of an outcome by the magnitude of the scoreobtained by adding and subtracting the weights (coeffi-cients) assigned to predictor variables, recursive parti-tioning creates a branching, algorithm-like “tree,” withthe “trunk” (i.e., the entire sample population) and largerbranches “split” into two smaller branches based on thevalue of the single variable that minimizes a measure ofwithin-group heterogeneity. The tree terminates in two ormore “nodes,” each of which defines a subgroup of

relatively similar patients with respect to the selectedoutcome. This method holds particular promise as an aidto the identification of clinically meaningful subgroupsduring the development of clinical guidelines and theirconstituent algorithms.

Incomplete literature and consequentuncertainties

As mentioned above, panels will frequently need tosupplement an incomplete or conflicting literature withconsensus concerning the range of appropriate manage-ment options for patients with defined sets of clinicalfindings (i.e., within specified nodes on the managementalgorithm). Significant gaps in the literature exist inevery area of medicine; moreover, this unfortunatesituation is likely to persist for many years.

As a general rule, the annotated algorithms shoulddepict the common thread (or lowest common denomina-tor) about which there is reasonable consensus concern-ing necessary care or recommended management strate-gies. For example, if panelists agree that all patients withsigns and symptoms of heart failure should have theirejection fraction measured but cannot agree on a particu-lar procedure, the node would read “Measure ejectionfraction.” Alternatively, if consensus is achieved about aspecific method, the node would read, for example,“Obtain echocardiogram.”

Details of agreement and disagreement about particularstrategies should be summarized in the annotation anddiscussed in the larger guideline document. Ideally, someformal process should be used to make explicit the levelof panel consensus achieved regarding particularmanagement strategies.

As yet, there is no “right” or “wrong” way to accom-plish this specification. For purposes of illustration,however, a possible process is described which embodiesthe principles common to any explicit process. Theprototype process uses a four-category system into whichpanelists can place all identified management strategies(e.g., “all patients with X, Y, and Z should receiveangiography”):

• Necessary care (practice standard, recommenda-tion). Services that have been reasonably well demon-strated to provide significant net health benefit(longevity plus quality of life).

• Appropriate but not necessary care (option).Services that have not been well demonstrated toprovide significant net benefit but for which there issome reason to believe that the benefits outweigh theharms (although probably not significantly).

- 100 -

• Equivocal care (not recommended). Services thathave no good basis for a reasonable expectation ofsignificant net benefit but that have not been demon-strated to provide net harm.

• Inappropriate care (recommendation against use ofthe procedure or treatment). Services that have beenreasonably well demonstrated to provide net harm.

For each management strategy, panelists would indicateto which category they believe the strategy should beassigned, based on the literature and on the panel’scollective experience and discussion. The annotationwould summarize the assignments made by the panel foreach management strategy. For example, consider twoalternative strategies for dealing with patients with Xfinding:

Strategy 1: All patients with X finding should receivetreatment Y.Category AssignmentsNecessary 5Appropriate 4Equivocal 5Inappropriate 2

Strategy 2: All patients with X finding should receivetreatment Y if they have a history of a previous myocar-dial infarction.Category AssignmentsNecessary 9Appropriate 4Equivocal 2Inappropriate 0

The suggested rules for determining the overall strengthof recommendation (e.g., standard vs. option) are asfollows: if a majority of panelists vote for a strategy asbeing “necessary,” as occurred with strategy 2 above, it isconsidered a standard. In general, these are the strategiesillustrated in management algorithms (but see below forfurther criteria). For example, the algorithm correspond-ing to the voting depicted above is shown in Figure 6.

Annotation would indicate how many panelists wouldprovide treatment Y for patients with X irrespective of a

history of recent myocardial infarction. Note that themiddle node would not exist if the panel had collectivelyrecommended strategy 1. Completing the envisionedtypology, strategies would be considered “options” ifthey fell short of a majority vote for “necessary” but if atleast half the panel assigned them to either “necessary”or “appropriate.” This occurred with strategy 1, above.Strategies would be categorized as “not recommended” ifthey did not meet this latter criterion and less than half ofthe panel labeled them as inappropriate. Finally, strate-gies would receive a "recommendation against” if amajority of panelists assigned them to the “inappropri-ate” category. Strategies falling into one of these threecategories should generally be discussed in the annota-tion (and guideline itself) rather than depicted as nodeswithin the algorithm itself.

Integration of qualitative and quantitativeinformation

Many times, guideline panels will be tempted to definedescriptor variables (and the patients to which thesevariables refer) in general or nonspecific terms. Forexample, the urinary incontinence panel unanimouslyagreed that different management strategies were appro-priate for patients with “increased” (vs. normal) post-void residual urine volume but was unable to agree on aprecise figure that would separate increased from normalresidual.

The reluctance to quantify descriptors and criteria isunderstandable, especially in view of the universal dearthof hard evidence to back up any particular figure.Nevertheless, qualitative descriptors like “increased” or“abnormal” are manifestly inadequate to fulfill the rolesbeing demanded of clinical guidelines. Often, problemsof specification will arise in the context of screeningtools or patient questionnaires: what cutoff point shouldbe used to determine the threshold for adopting adifferent management strategy? Several of the AHCPRguideline panels faced this difficulty.

One of the virtues of algorithms is that they highlightpanels’ attempts to “paper over” differences or lack of


- 101 -

Figure 6. Strategy for management of patients with X findings (see text)

Patient has X History ofrecent MI?

Y TreatmentYes

No

MI = myocardial infarction.

agreement about specific thresholds or values for clini-cians (or payers) to “hang their hat on” when faced with adecision about how to manage a particular patient. Hereagain, a tally of individual panelists’ recommendationsprovides useful information.

Panelists should be encouraged to provide specificquantitative information whenever possible, attempting toreplicate in general what they do in particular for theirown patients. Often, of course, threshold values dependon other factors. These factors can be either incorporatedin the algorithm (when there is consensus about theirsignificance) or discussed in the annotation.

Specification of health outcomes

One of the most critical aspects of algorithm (andguideline) development is the identification and repre-sentation of health outcomes, relevant to the guidelinetopic area. For example, the urinary incontinence panelwrestled with such potential endpoints as “percentpatients dry,” “percent patients wet,” “percent patientsimproved,” and the like. Such terms are generally toovague and imprecise to be of adequate use.

Outcomes should always be specified as precisely aspossible. For some guidelines (e.g., heart failure),mortality is a relevant and highly objective therapeuticendpoint. For others (e.g., urinary incontinence andcataracts), mortality is not an issue, and “softer,” quality-of-life endpoints are the sole consideration. (Even withlethal conditions, like congestive heart failure, quality-of-life outcomes are also often of primary importance.)These include, at a minimum, the effects of managementon (1) pain and other physical symptoms, such asshortness of breath or weakness; and (2) functionalability. Inclusion of even softer outcomes, such asoutlook on life or degree of social functioning, is morecontroversial. Specific side effects of treatments shouldalso be listed and the risk of their occurrence estimated.All of these expected outcomes should be tabulatedwithin the algorithm annotation (and larger guidelinetext) as appropriate, especially for the counseling anddecision nodes. Arguably, it is in the specification andestimation of specific health outcomes that AHCPR-supported guidelines have the greatest potential toprovide meaningful guidance to both patients and practi-tioners.

Technical SuggestionsThe creation of algorithms is an art, although, as with

medicine, experience and observation are transformingthis endeavor into a science. In this final section, we will

consider some general technical issues bearing onalgorithm construction.

• Much of the art of algorithm development lies inlooking for ways to reduce clutter. Several methodscan be useful for this purpose. First, nodes shouldselected with care. This point has been made repeated-ly in this chapter, with respect both to (1) observabilityand discriminatory power and (2) the need to restrictnodes to the “lowest common denominator” of panelagreement. In addition, nodes should be limited tothose that either represent a significant decision pointor else apply to a significant proportion of patients(e.g., 20 percent -25 percent) flowing through thatparticular pathway. Variables not meeting one of thesecriteria should be relegated to annotation. Judgment isneeded, as always, to determine when these criteria aremet.

• Another way to reduce clutter it to include withintesting nodes the principal question to be answered bythis test, rather than using two nodes for this purpose.For example, rather than creating two nodes: (1)“Obtain coronary angiogram” and (2) “Significantpositive findings?”, the instruction and question can beintegrated into a single node (e.g., nodes 16 and 18 inFigure 5). A colon separates the instruction from thequestion.

• Multiple, redundant nodes and arrows can often becollapsed into a more parsimonious structure. Forexample, the section of an early version of the heartfailure algorithm shown in Figure 7 can be collapsedinto the four nodes shown in Figure 8. Clues to theredundancy lay in the repeated, parallel use of“coronary angiography,” “inoperable lesions,” and“medical management.” Identical terms should be usedsparingly, if at all, within a given algorithm.

• As discussed above, selection of nodes and corre-sponding identification parameters should be based inlarge part on discriminatory power. In pursuit of theobjective, the yes-or-no questions should capture theessence of the clinical decisionmaking process. Forexample, nodes asking about response to therapy (e.g.,node 22 in Figure 5), although apparently simple,actually demand careful consideration of multiplepossible therapeutic endpoints and monitoring criteria.Similarly, other simple-sounding nodes (e.g., “Meetcriteria for hospitalization?” and “Contraindications torevascularization?”) demand that specific criteria bedefined,to determine the types of patients for whom a“yes” or “no” answer applies. This requirement for

- 102 -


- 103 -

Figure 7. Portion of early draft of heart failure algorithm

Test for ischemia

Small area ofischemia or

equivocal study

No ischemia Extensive ischemia

Coronaryangiography

Coronaryangiography

Inoperablelesions

Operable lesions inarea of ischemia

Inoperable lesionsNo significant

lesions in area ofischemia

RevascularizationMedical

managementMedical

management

Figure 8. Algorithm from Figure 7; collapsed into four nodes

Test forischemiapositive?

Medicalmanagement

No

Yes

Coronaryangiography:

operable lesionsin area ofischemia?

Yes

Revascularization

No

specificity is an important strength of the annotatedalgorithm approach.

• A logical flow must be maintained within thealgorithm. It is important to ensure that all events anddecisions in the algorithm depend on the answers tothe questions and observations that precede them. Onecommon consideration is the placement of “contraindi-cations to surgery” nodes. This node should be placedrelatively late in the flow of management and decision-making when the workup of patients proceeds identi-cally for surgical candidates and nonsurgical candi-dates for most of the management process. If, on theother hand, the decision about whether or not to obtaina particular test early depends on surgical candidacy,the “Contraindication to surgery” node should appearearlier. Additional useful suggestions related to theconstruction of clinical algorithms can be found in arecent report by the Society for Medical DecisionMaking’s Committee on Standardization of ClinicalAlgorithms (1992). Most of the suggestions in thischapter are compatible with those contained in theCommittee’s report.

ReferencesBreiman L, Friedman JH, Olshen RA, et al. Classification and regres-sion trees. Belmont, CA: Wadsworth International Group; 1984.

Cook EF, Goldman L. Empiric comparison of multivariate analytictechniques: Advantages and disadvantages of recursive partitioninganalysis. J Chronic Dis 37:721-31, 1984.

Field MJ, Lohr KN, eds. Clinical Practice Guidelines: Directions for anew program. Washington, DC: National Academy Press; 1990.

Gottlieb LK, Margolis CZ, Schoenbaum SC. Clinical practice guide-lines at an HMO: Development and implementation in a qualityimprovement model. Qual Rev Bull 16:80-6, 1990.

Greenfield S. Measuring the quality of office practice. In: GoldfieldN, Nash DB, editors. Providing quality care: The challenge to clini-cians. Philadelphia: American College of Physicians; 1990.

Greenfield S, Cretin S, Worthman LG, et al. Comparison of a criteriamap to a criteria list in quality of care assessment: The relation ofeach to outcome. Med Care 19:255-72, 1981.

Grimm RK, Shimoni K, Harlon W, et al. Evaluation of patient-careprotocol use by various providers. N Engl J Med 292:507-11,

1975.

Hadorn DC, McCormick K, Diokno A. An annotated algorithmapproach to clinical guideline development. JAMA 267;3311-14,1992.

Kassirer JP, Kopelman RI. Diagnosis and decisions by algorithms.Hosp Pract (Off Ed) 23-31, March 15, 1990.

Komaroff AL, Black WL, Flatley M, et al. Protocols for physicians’assistants: Management of diabetes and hypertension. N Engl J Med290:307-12, 1974.

Marshall RJ. Partitioning methods for classification and decisionmaking in medicine. Stat Med 5:517-26, 1986.

Palmer RH, Strain R, Maurer J, et al. Quality assurance in eight adultmedicine group practices. Med Care 22:632-43, 1984.

Society for Medical Decision Making, Committee on Standardizationof Clinical Algorithms. Proposal for clinical algorithm standards. MedDecis Making 12:149-54, 1992.

Sox HC. Quality of patient care by nurse practitioners and physician’sassistants: A ten year perspective. Ann Intern Med 91:459-68, 1979.

Sox HC, Sox CH, Tompkins RK. The training of physician’s assis-tants: The use of a clinical algorithm system for patient care, audit ofperformance and education. N Engl J Med 288:818-24, 1973.

- 104 -

This chapter describes how algorithms can be used toorganize and summarize the recommendations containedin clinical practice guidelines.

What Are Algorithms?The word “algorithm” is unfamiliar to most people. Yet

almost everyone is familiar with the basic concept under-lying the use of algorithms. For example, many willrecall how animals are classified using a branchingsystem of characteristics which permits distinctions to bemade between mammals (hair or fur), reptiles (smoothskin, do not breathe under water), and fish (smooth skin,breathe under water) (Figure 1). More detailed systems ofclassification permit finer distinctions to be made,enabling us to distinguish, for instance, between amuskrat (furry plus pointy tail) and a beaver (furry plusflat tail).

Several things are immediately apparent in Figure 1.First, the algorithm consists of a set of boxes containingeither “yes-or-no” questions or “answers” to the animalidentification problem. The boxes containing questionsalways have two arrows coming out of them, one corre-sponding to a “yes” answer, the other to a “no.” Arrowsare always one-way in the direction indicated by thearrowhead and are followed until an “answer box,” orterminal node, is reached. Some arrows leave thealgorithm, e.g., mammals larger or smaller than abreadbox, indicating that animals with these characteris-tics are not considered further in this particularalgorithm. The shape of the boxes follows commonlyaccepted criteria and depends on whether a question oranswer is contained within the box.

The algorithm in Figure 1 could have been constructedvery differently while maintaining its ability to distin-guish among different types of animals. Alternatedescriptors could have been used, including whether theunidentified animal was warm- or coldblooded, gavebirth to live young vs. laid eggs, was carnivorous vs.herbivorous, pair-bonded for life or “played the field,”etc.In theory, any of several branching schemes

From RAND, Santa Monica, CA.Abbreviation: AHCPR = Agency for Health Care Policy and Research.

could lead to the same result. However, the variablesactually selected in Figure 1 are more “user-friendly”than the suggested alternate descriptors. In addition, theyare efficient discriminators among different types ofanimals. An important part of the art of algorithmconstruction—whether for identifying animals or fordiagnosing illness—is the careful selection, of identifica-tion variables that possess these two properties, i.e., user-friendliness and efficiency.

As illustrated by the previous example, algorithms areflow diagrams that consist of branching-logic pathwayswhich permit the application of carefully defined criteriato the task of identifying or classifying different types ofsome entity.

Algorithms have been developed for a wide variety ofsuch tasks, including the identification of trees (e.g., bythe shape of the leaves combined with type of bark),classification of bacteria (e.g., aerobic vs. anaerobiccombined with Gram-positive vs. Gram-negative), andqualitative analysis in chemistry (e.g., precipitates aschloride but not as nitrate). Part of the beauty ofalgorithms is the diversity of their application toeveryday life.

Algorithms in Health CareAlgorithms have been used in the health care setting for

many years, often as aids to clinical diagnosis. It will benoted that diagnosis is another form of classification andidentification. Figure 2 shows a simple diagnosticalgorithm based on nurse-practitioner and physician-assistant protocols in common use today in managed-careplans and other settings.

This algorithm depicts a diagnostic strategy aimed atclassifying patients as those with either viral or strepto-coccal pharyngitis. Patients are classified as having viralpharyngitis in one of two ways:, (1) presumptively (i.e.,without a throat culture) if they are relatively afebrile(≤102°F) and have no exudate on their tonsils and haveassociated viral symptoms, or (2) if a throat culture isnegative for streptococci. Patients are classified as havingstreptococcal pharyngitis only if a throat culture ispositive for streptococci; moreover, the algorithm


- 93 -

Use of Algorithms in Clinical Guideline DevelopmentDavid C. Hadorn, M.D., M.A.