Bayes theorem: Fully informed rational estimates of diagnostic probabilities

Bayes theorem: Fully informed rational estimates of diagnostic probabilities. J Am Dent Assoc 2010;141;658-659.

The following resources related to this article are available online at jada.ada.org (this information is current as of June 1, 2010):

Updated information and services, including high-resolution figures, can be found in the online version of this article at: http://jada.ada.org/cgi/content/full/141/6/658

Information about obtaining reprints of this article or about permission to reproduce this article in whole or in part can be found at: http://www.ada.org/prof/resources/pubs/jada/permissions.asp

© 2010 American Dental Association. The sponsor and its products are not endorsed by the ADA. Downloaded from jada.ada.org on June 1, 2010.



656 JADA, Vol. 141 http://jada.ada.org June 2010

CLINICAL PRACTICE

An investigation of dentists’ and dental students’ estimates of diagnostic probabilities

David W. Chambers, EdM, MBA, PhD; Ryan Mirchel, BS; William Lundergan, DDS, MA

Dr. Chambers is a professor of dental education, Department of Dental Practice, Arthur A. Dugoni School of Dentistry, University of the Pacific, San Francisco. Contact Dr. Chambers at 19800 7th St. E., Sonoma, Calif. 95476, e-mail “[email protected]”.
Mr. Mirchel is a third-year dental student, Arthur A. Dugoni School of Dentistry, University of the Pacific, San Francisco.
Dr. Lundergan is a professor of periodontics, Arthur A. Dugoni School of Dentistry, University of the Pacific, San Francisco.

ABSTRACT

Background. Research in medicine has shown that physicians have difficulty estimating the probability that a patient has a condition on the basis of available diagnostic evidence. They consistently undervalue baseline information about the patient relative to test information and are poor intuitive calculators of probability. The authors could not locate in the literature any studies of diagnostic probability estimates from baseline information and test data for dentists.

Methods. Using two vignettes that contained different baseline information, dental students and clinical faculty members estimated the probability that the described hypothetical patient had the condition in question. Respondents also commented on the project.

Conclusions. Both groups of respondents overemphasized the importance of test evidence relative to baseline information, although experienced practitioners did so to a lesser extent than did students. Respondents, especially practitioners, expressed resistance to performing a diagnostic task that required precise estimates of probability.

Clinical Implications. Dentists appear to estimate diagnostic probabilities in an intuitive fashion, but they do so imprecisely. Clinical experience provides some protection against the bias of overestimating test evidence compared with baseline information. These findings raise questions about how practitioners use probability estimates and whether other models also may play a role. The incorporation of information from evidence-based dentistry into practice requires better understanding.

Key Words. Clinical diagnosis; Bayes theorem; evidence-based dentistry; test sensitivity.

JADA 2010;141(6):656-666.

Copyright © 2010 American Dental Association. All rights reserved. Reprinted by permission.

The probability that a diagnostic observation or test result is positive when a patient has the condition in question is not necessarily the same as the probability that the same patient has the condition if the observation or test result is positive. In the first case, the condition causes the evidence to appear; in the other case, the evidence is used to infer the existence of the underlying condition (the diagnosis). The former is a principal concern of the research community and those who develop tests; the latter is the primary concern of practitioners.

Researchers have studied extensively the way in which physicians form judgments in clinical diagnosis.1-4 This study represents one of the first attempts to understand how that process takes place in dentistry. The most straightforward view of diagnosis is that practitioners directly observe the condition that requires treatment, with no intervening process of inference connecting evidence to the background condition; practitioners simply understand what needs to be done. A more nuanced interpretation is that practitioners observe evidence directly and use the evidence as a basis for making correct diagnoses; however, the process they use in proceeding from evidence to diagnosis is subconscious and flawless. In most cases in dentistry, this straightforward representation of the diagnostic process is functionally accurate and the proportion of detected diagnostic surprises is small.1-4

However, this does not mean that practitioners actually use such a method; rather, it may be a convenient summary or gloss on the more complex, underlying process that actually occurs. Substantial literature demonstrates that decision makers are prone to making systematic errors and are poor at describing the processes they actually use. The classic reference is a 1982 collection of academic reports edited by Kahneman and colleagues5 and its sequel.6 Gladwell’s book “Blink: The Power of Thinking Without Thinking”7 is a clear presentation of the gap between deciding and knowing how one decides. In addition, books by Taleb8 and Ariely9 are readable accounts of rigorous research in this field. In their background report for the 1995 Institute of Medicine report, Bader and Shugars10 summarized relevant research showing that dentists appear to unknowingly use more than a straightforward see-the-condition approach to diagnosis. As biological technology advances, the role of assays in diagnosis will grow in importance.

The research model1-4,5,6 used by investigators to study how physicians make diagnostic probability estimates involves comparing the decisions they actually make in controlled circumstances with the decisions that perfectly rational and completely informed people would have made. Where systematic differences are observed, researchers can make inferences about practitioners’ habits of mind. Under ideal circumstances, decision making about the probability that a condition exists requires that two processes be completed satisfactorily: the relevant facts must be assessed and they must be combined appropriately. Intuitively, practitioners recognize that a positive observation regarding a common condition such as periodontitis is more actionable than is the same positive observation regarding a less common condition such as necrotizing ulcerative gingivitis.

The decision maker uses the baseline likelihood of the condition to weight the probability that a finding is a clear signal that the underlying condition is present. The second factor that must be considered is the quality of the observation. Intuitively, a poorly exposed radiograph more likely represents false-positive evidence than does a well-exposed radiograph. Researchers and clinicians must adjust the evidence, whether positive or negative, for its sensitivity (that is, the chance of a positive observation when the condition exists) and for the baseline likelihood of the condition.

The theoretically correct method for estimating diagnostic probabilities involves the use of Bayes theorem.1-3 The sidebar to this article provides an example that describes this approach. The concept of Bayes theorem is to consider all the ways in which a test or observation can produce a positive finding. An obvious example is when a true-positive result occurs because the test or observation is sensitive. Adjusting for the baseline likelihood of the condition provides the likelihood that acting on the positive test result or observation will be appropriate.

However, there is a second way in which findings can appear to indicate the presence of a condition: false-positive test results or observations can occur, and these must be adjusted for the likelihood that the condition does not exist. (By definition, false-positive findings exist only in patients who do not have the condition.) According to Bayes theorem, the chance of being correct in acting on a positive test result or observation is the ratio of the correct positive finding to the total (both correct and incorrect) positive findings.
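The ratio described above can be written as a short function. The sketch below is illustrative only: the function name `posterior_probability` and the numeric values (a sensitivity of 0.90, a false-positive rate of 0.10 and the three prevalence values) are assumptions chosen for the example and do not come from the study.

```python
def posterior_probability(prevalence, sensitivity, false_positive_rate):
    """Chance that the condition is present given a positive finding.

    Hypothetical helper: correct positives divided by all positives.
    """
    inputs = (prevalence, sensitivity, false_positive_rate)
    if not all(0.0 <= p <= 1.0 for p in inputs):
        raise ValueError("all inputs must be probabilities between 0 and 1")
    # Positives in patients who have the condition.
    true_positives = prevalence * sensitivity
    # Positives in patients who do not have the condition.
    false_positives = (1.0 - prevalence) * false_positive_rate
    total_positives = true_positives + false_positives
    if total_positives == 0.0:
        raise ValueError("a positive finding is impossible with these inputs")
    return true_positives / total_positives


# Illustrative values only (not taken from the study): the same test applied
# to a rare, an intermediate and a common condition.
for prevalence in (0.01, 0.20, 0.60):
    estimate = posterior_probability(
        prevalence, sensitivity=0.90, false_positive_rate=0.10
    )
    print(f"prevalence {prevalence:.2f} -> probability given a positive finding {estimate:.2f}")
```

With the test characteristics held fixed, the estimate rises with the baseline prevalence, which is the intuition behind treating a positive finding for a common condition as more actionable than the same finding for a rare one.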

Eddy11 and Lyman and Balducci12,13 conducted research on diagnostic decision making and reported large variability among physicians given a common set of information. According to other researchers,4,14-17 physicians’ estimates often are inaccurate when compared with fully informed estimates. Three general areas of difficulty in performing this decision-making task are described in the literature:
- difficulty in understanding the evidence provided by tests5,6,18,19;
- underuse of baseline information about the patient4,12-14,20,21;
- inappropriate integration of the available data.4,15,16,18,22-26

The American College of Dentists27 provides a detailed review of the medical literature regarding the assessment of diagnostic probabilities and an explanation of the use of Bayes theorem in combining these data.

On the basis of our assessment of the literature regarding the accuracy of diagnostic probability estimates given baseline data and test evidence, we generated the following hypotheses for testing among dental students and clinical faculty members by using vignette simulations of a clinical case:
- There will be a wide range of estimated diagnostic probabilities, even when we provide participants with common baseline information and test evidence.
- In estimating diagnostic probability, participants will place too much weight on test evidence in comparison with baseline data.
- When estimating diagnostic probabilities, experienced practitioners will be less likely to undervalue baseline patient characteristics than will novices.

PARTICIPANTS, MATERIALS AND METHODS

We presented a brief vignette (Box; page 660), described as hypothetical, to eight groups of students and five groups of faculty members at the Arthur A. Dugoni School of Dentistry, University of the Pacific, San Francisco. The vignette involved a positive test result for a periodontal condition in a patient from a population group with a given prevalence of the condition; the test involved the use of an experimental piece of equipment. We provided specific quantitative values for the sensitivity of the test, the false-positive test results and the baseline prevalence of the condition in the population. We provided participants with all of the information needed to calculate precisely the likelihood of the patient’s having the condition, and we did not provide any extraneous information in the vignette.

We presented two versions of the vignette to the groups; they differed only in the information contained regarding the baseline prevalence of the condition in the population. One version contained baseline information designed to raise the correct probability estimate above the test sensitivity value. In this version, sensitivity equals 0.70, the false-positive rate equals 0.30 and the baseline prevalence of the condition equals 0.80. Given these data, the Bayesian diagnostic probability is 0.90; this vignette is referred to as the “90 percent vignette.” The second vignette was identical, except that the baseline prevalence of the condition equals 0.10. This difference has the effect of shifting the correct diagnosis in the opposite direction of the positive test results. These data combine to produce a Bayesian diagnostic probability of 0.21; thus, the second vignette is referred to as the “21 percent vignette.”

We presented the vignettes in 2008 to eight groups of first-year dental students as part of a course on clinical dental research (total class size = 143) and to five groups of clinical faculty members as part of a quarterly in-service training program (total attendance of approximately 80 people). The study protocol fell under the exempt classification for human research subjects, and participants were anonymous. Participation was voluntary. After the participants provided their estimates, we presented a 20-minute explanation of the Bayesian approach to diagnosis, which was illustrated by the stimulus vignettes.

We shuffled the stimulus vignettes before distributing them, and each respondent had an equal chance of receiving the 90 percent or the 21 percent vignette. We asked the students and faculty members to provide a single, quantitative estimate of the likelihood that the patient had the indicated condition on the basis of the available data, as well as to briefly describe the reasoning used. Many participants also provided comments. After making their individual estimates, the students worked in groups to develop a better understanding of the role that evidence plays in diagnosis; however, we did not gather any data from these discussions and they did not affect the data reported in this study.

Using the participants’ descriptions of numerical estimates, we classified their responses according to a particular strategy or combination of strategies, as described in the Results section below. We set up the baseline values in the vignettes to be nonsymmetrical to ensure that each participant could arrive at his or her estimate only via a unique strategy. Some respondents volunteered comments on their response forms, and several expressed frustration that the vignettes contained insufficient information for them to make the requested estimate or that making any precise estimate was difficult or uncomfortable. We calculated a single proportion for the number of such comments divided by the number of responding students or clinical faculty members. One of us (D.W.C.) classified all responses according to strategy, and the other authors reviewed a sample of responses to confirm these classifications.

We tested the hypotheses by using the conventional test for differences between proportions.

SIDEBAR

Bayes theorem
Fully informed rational estimates of diagnostic probabilities

Recent discussions of changes in guidelines for prostate-specific antigen tests and mammography have focused attention on the fact that positive results from tests or from direct observation are not sufficient to establish the presence of a condition with 100 percent certainty. Two factors may temper positive diagnostic information. The first is the quality of the test or observation (its sensitivity and its false-positive rate); sometimes a false-positive result occurs. The second factor is the baseline prevalence of the condition.

A judgment that a common condition exists in a particular patient often is correct simply because the condition occurs so frequently in patients. The opposite is equally true; rare conditions remain somewhat unlikely, even when positive results from tests or direct observation occur.

The problem for clinicians is incorporating test sensitivity, the false-positive rate and baseline prevalence into a meaningful estimate of the likelihood that a patient has the condition in question. Health care professionals do this intuitively, with some degree of accuracy and possibly with some degree of bias. A precise way of accomplishing this is with Bayes theorem.1

Combining information about the sensitivity of an observation or a test result with the baseline prevalence of the condition in the population is somewhat complex. The rational approach is expressed in Bayes theorem, which states that the likelihood that evidence in favor of a condition’s existing is correct is the probability of true-positive observations in the population divided by the probability of true-positive observations plus false-positive observations. In other words, how trustworthy is the observation or test result?

What we want to determine is the probability (Pr) that a condition (C) exists given the evidence (E): Pr(C|E). What we are given is the probability that the evidence exists given the condition Pr(E|C) and the baseline prevalence of the condition Pr(C). The fundamental concept in Bayes theorem is that there are two ways to obtain a positive test result or an observation; a positive result can occur when the condition actually exists (that is, a true-positive observation weighted by the baseline prevalence), and a positive result can occur when the condition does not exist (that is, a false-positive observation weighted by the likelihood that the condition is absent). The probability that a positive observation is correct (and, by subtraction from 1, the probability of being fooled by a false-positive observation) can be calculated from the information in hand. In the vignettes used in this study, Pr(E|~C) = 1 − Pr(E|C) (an equality that holds for these values but not for tests in general) and Pr(~C) = 1 − Pr(C), and Bayes theorem states

Pr(C|E) = [Pr(C)*Pr(E|C)]/[Pr(C)*Pr(E|C) + Pr(~C)*Pr(E|~C)]

where Pr equals probability, E equals the evidence, C equals the patient’s having the condition, ~C equals the patient’s not having the condition.

Consider this example based on the vignettes used in this study. If a test with a sensitivity of 0.70 (that is, the test is positive in seven of 10 patients who have the condition) is applied to a patient with only a 10 percent chance of having the condition, the likelihood of observing a true-positive test result in this or any such patient will be 0.07 (0.70 × 0.10). The likelihood of observing a positive result in a patient who does not have the condition actually will be larger in this example. The likelihood of being misled by the test result is 0.27: the chance of a false-positive result (0.30, or 1.0 − 0.70) multiplied by the chance that the patient does not have the condition (0.90, or 1.00 − 0.10). The chance of observing a true- or false-positive test result or clinical observation is 0.07 (that is, a positive result for patients with the condition) plus 0.27 (that is, a positive result for patients without the condition), which equals 0.34. The probability that a positive observation will accurately reveal the presence of the condition in question is 0.07 (true-positive test results) divided by 0.34 (all positive test results), which equals 0.21.

Although the Bayesian calculation appears complex when explained in detail, it gives proper weight to the relevant factors in practitioners’ use of clinical judgment and test results to estimate diagnostic probabilities. It is no more complex than the calculations drivers engage in when calculating the timing and force of braking when driving. ■

ABBREVIATION KEY.
Pr(C): Probability that a condition exists.
Pr(~C): Probability that a condition does not exist.
Pr(C|E): Probability that a condition exists in a patient given evidence of its existence.
Pr(E|C): Probability of a positive test result or observation in a patient who has a condition.
Pr(E|~C): Probability of a positive test result or observation in a patient who does not have a condition.

1. Hunter D. Laws of probability, Bayes’ theorem, and the central limit theorem. 5th Penn State Astrostatistics School. “http://astrostatistics.psu.edu/su09/lecturenotes/probability.pdf”. Accessed May 3, 2010.
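The sidebar's arithmetic can be retraced by counting patients rather than multiplying probabilities. The sketch below uses the values given for the two vignettes (sensitivity 0.70, false-positive rate 0.30, baseline prevalence 0.10 or 0.80); the cohort size of 1,000 and the function name `count_positives` are assumptions made for illustration.

```python
def count_positives(cohort_size, prevalence, sensitivity, false_positive_rate):
    """Split a hypothetical cohort into true-positive and false-positive counts."""
    with_condition = round(cohort_size * prevalence)
    without_condition = cohort_size - with_condition
    # Patients with the condition whose test result is positive.
    true_positives = round(with_condition * sensitivity)
    # Patients without the condition whose test result is positive anyway.
    false_positives = round(without_condition * false_positive_rate)
    return true_positives, false_positives


# Values from the two vignettes; the cohort of 1,000 patients is hypothetical.
for label, prevalence in (("21 percent vignette", 0.10), ("90 percent vignette", 0.80)):
    tp, fp = count_positives(1000, prevalence, sensitivity=0.70, false_positive_rate=0.30)
    share_correct = tp / (tp + fp)
    print(f"{label}: {tp} true positives, {fp} false positives, "
          f"{share_correct:.2f} of positive results are correct")
```

Counting in this way makes the role of prevalence visible: when only 100 of 1,000 patients have the condition, the 270 false positives outnumber the 70 true positives even though the test itself is unchanged.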

RESULTS

Thirteen students declined to participate or were absent on the day of the activity, and three handed in estimates that were indecipherable or implausible (for example, P > 1.0). Approximately six faculty members declined to participate; five others handed in forms that contained only comments or unusable estimates. The effective sample size was 127 students (60 for the 90 percent vignette and 67 for the 21 percent vignette) and 69 faculty members (36 for the 90 percent vignette and 33 for the 21 percent vignette).

Figure 1 shows the distribution of the estimated likelihood of the patient’s having the periodontal condition for the 90 percent vignette, based on the evidence provided and baseline information. Figure 2 shows the distribution for the 21 percent vignette. For both vignettes, the estimates ranged from 0 to 100 percent. The standard deviations of the estimates are large relative to the means. Interquartile ranges for the 90 percent vignette were 0.60 to 0.70 for students and 0.60 to 0.80 for faculty members. For the 21 percent vignette, for which we expected the baseline data to lower the estimate, the interquartile range was 0.30 to 0.70 for students and 0.10 to 0.50 for faculty members. The distributions are not strictly continuous because each respondent’s chosen strategy yielded an exact probability. Nevertheless, the distributions are approximately normal.

Among faculty members, the hypothesized upward shift for the 90 percent vignette and the hypothesized downward shift for the 21 percent vignette appear in Figures 1 and 2, respectively, and in Table 1 (page 662). The greater accuracy of the estimated diagnostic probability among faculty members compared with that among students was significant at P = .048 for the 90 percent vignette and at P = .008 for the 21 percent vignette. We reran the t test comparing the results for faculty members with those for students for the 21 percent vignette to exclude all estimates lower than 10 percent; this enabled us to eliminate the effects of the pronounced left-hand tail in the distribution (Figure 2). The results under this severely handicapped test still revealed a significantly greater use of baseline information by faculty members (P = .046).
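As a rough consistency check, a t statistic can be recomputed from the summary values in Table 1. The sketch below assumes an unequal-variance (Welch) form of the test, which the article does not specify; the function name is hypothetical, and no P value is derived because that would also require an assumption about whether the test was one-sided or two-sided.

```python
import math


def welch_t_from_summary(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Welch t statistic and approximate degrees of freedom from summary data.

    Assumption: unequal-variance t test; the article does not state the variant.
    """
    var_term_a = sd_a ** 2 / n_a
    var_term_b = sd_b ** 2 / n_b
    standard_error = math.sqrt(var_term_a + var_term_b)
    t_statistic = (mean_a - mean_b) / standard_error
    # Welch-Satterthwaite approximation for the degrees of freedom.
    degrees_of_freedom = (var_term_a + var_term_b) ** 2 / (
        var_term_a ** 2 / (n_a - 1) + var_term_b ** 2 / (n_b - 1)
    )
    return t_statistic, degrees_of_freedom


# Means, SDs and group sizes as reported in Table 1 (faculty first, students second).
t_90, df_90 = welch_t_from_summary(69.90, 13.66, 36, 64.63, 16.68, 60)
t_21, df_21 = welch_t_from_summary(35.08, 24.02, 33, 47.46, 22.79, 67)
print(f"90 percent vignette: t = {t_90:.2f}, df = {df_90:.1f}")
print(f"21 percent vignette: t = {t_21:.2f}, df = {df_21:.1f}")
```

Under this assumption, the positive t for the 90 percent vignette and the negative t for the 21 percent vignette both point the same way: the faculty mean lies closer to the Bayesian value than the student mean does.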

Diagnostic strategies. By considering bothrespond ents’ numerical estimates and the reasons

660 JADA, Vol. 141 http://jada.ada.org June 2010

C L I N I C A L P R A C T I C E

BOX

Vignette used by dental students and facultymembers to estimate the diagnostic probability of a hypothetical patient’s having a specific condition.

VIGNETTEAssume that several research studies have investigated a microbiological assayfor a periodontal pathogen that affects women of Mediterranean descent inwhom periodontal disease is resistant to therapy. The condition is known toaffect an average of one in 10 women of Mediterranean descent in whomperiodontal disease is resistant to therapy. There is some consistency amongthese investigations despite their having been conducted at three research universities. In all studies to date, an assay positively identifies between 68 and 72 percent of the existing cases of the condition in the target population.Several studies also report a false-positive test result (that is, a patient in thetarget population with a certain clinical profile mistakenly appears to have thepathogen) of exactly 30 percent.

You observe a woman in your practice who is of Mediterranean ancestry and inwhom periodontal disease is highly resistant to several courses of therapy, andthe assay is positive. What is your estimate of the likelihood that this patient hasthe pathogen?

QUANTITATIVE VALUESdThe above vignette, which has a Bayesian estimate for probability of

the condition of 21 percent, contains the following information: baselinelikelihood equals 0.10, test sensitivity equals 0.70 and false-positive result for the test equals 0.30.

dAn alternative vignette also was used that was identical except that thebaseline likelihood equals 0.80, test sensitivity equals 0.70 and the false-positive result for the test equals 0.30. Given this information, the resultingBayesian estimate of diagnostic probability is 90 percent.

Copyright © 2010 American Dental Association. All rights reserved. Reprinted by permission.

on June 1, 2010 jada.ada.org

Dow

nloaded from

Page 7: Bayes theorem: Fully informed rational estimates of ......Bayes theorem: Fully informed rational jada.ada.org ( this information is current as of June 1, 2010 ): The following resources

JADA, Vol. 141 http://jada.ada.org June 2010 661

C L I N I C A L P R A C T I C E

test results and observations (100 percent) andbaseline information. All of these strategiesinvolved mismanagement in the selection or com-bination of the given data needed to estimate cor-rectly the diagnostic probability of the patient’shaving the condition; none of these strategies ledto the correct Bayesian estimate.

Table 2 (page 663) is a detailed breakdown ofthe diagnostic strategies used by students and

provided in their comments,we were able to identify adiagnostic strategy for eachrespondent. We identified fourapproaches on the basis of asingle source of information:dthe participant equatedthe likelihood of the patient’shaving the condition with thebaseline prevalence (whileignoring all other information); dthe participant set thediagnostic probability equalto the test sensitivity (whileignoring all other information);dthe participant equatedthe positive test result with a100 percent likelihood of thepatient’s having the condi-tion (while ignoring all otherinformation);dthe participant used thefalse-positive rate to deter-mine a diagnosis (whileignoring all other information).

In addition to the above, wenoted two strategies thatmade use of false-positiveadjustments to the sensitivityrate. In one case, the respond -ent subtracted the absoluterate of false-positive observa-tions from the test sensitivityrate (that is, 0.70 − 0.30 =0.40). In the second case, therespondent subtracted thefalse-positive rate from thetest sensitivity rate on a pro-portional basis (that is, 0.70 −[0.70 × 0.30] = 0.49). A relatedstrategy involved subtractingthe false-positive rate from the positive test result(that is, 1.00 − 0.30 = 0.70).

Finally, we identified three mixed strategies:one involved the use of various combinations ofbaseline and test sensitivity data; one involvedsubtracting baseline information from the testsensitivity rate or the test sensitivity rate frombaseline information; and the final strategyinvolved the use of some combination of positive

◆ ◆ ◆◆

◆ ◆

60

50

40

30

20

10

00-10 11-20 21-30 31-40 41-50 51-60 61-70 71-80 81-90 91-100

PROBABILITY ESTIMATES

PER

CEN

TA

GE O

F R

ESP

ON

DEN

TS

Students

FacultyMembers

■ ■ ■ ■

Figure 1. Distribution of the estimated probabilities that a hypothetical patient has the perio-dontal condition, given information consistent with a Bayesian estimate of 90 percent.

◆ ◆

◆◆ ◆

◆■

■■

■■

■ ■ ■

60

50

40

30

20

10

00-10 11-20 21-30 31-40 41-50 51-60 61-70 71-80 81-90 91-100

PROBABILITY ESTIMATES

PER

CEN

TA

GE O

F R

ESP

ON

DEN

TS

Students

FacultyMembers

Figure 2. Distribution of the estimated probabilities that a hypothetical patient has the perio-dontal condition, given information consistent with a Bayesian estimate of 21 percent.

Copyright © 2010 American Dental Association. All rights reserved. Reprinted by permission.

on June 1, 2010 jada.ada.org

Dow

nloaded from

Page 8: Bayes theorem: Fully informed rational estimates of ......Bayes theorem: Fully informed rational jada.ada.org ( this information is current as of June 1, 2010 ): The following resources

662 JADA, Vol. 141 http://jada.ada.org June 2010

C L I N I C A L P R A C T I C E

students and 55 percent forfaculty members) inrespondents’ dependence ontest sensitivity in somefashion (P = .024).

Baseline information.With respect to the use ofbaseline information, fac-ulty members were signifi-cantly more likely thanstudents to combine base-line information with testsensitivity data or test

results and to use baseline information in someway (36 percent for faculty members versus 21 percent for students).

Comments. The comments offered by studentsand faculty members provided insight into thestrategies they used to estimate diagnostic proba-bility and how the respondents viewed this chal-lenge. Neither group of respondents appeared tobe comfortable with the task of making a preciseestimate of diagnostic probability. Some com-ments reflect respondents’ frustration withworking out a satisfactory answer: “There are toomany things to consider.” “We have to leave roomfor error.” “I would ask the patient.” “I tend tofavor aggressive treatments.” “There are toomany variables to place a lot of faith in any con-clusion.” “There is not enough information here.”“I would need to do the tests myself in order toknow what to think.” “I need to know the riskfactor first.” “You didn’t tell me what the samplesize is.” “The sample size is too small.” “It’s goingto depend on what the outcome of treatment is.”“This case is not specific enough.” “If I had anyfaith in research, I would want to keep my possi-bilities open.” “Most of these data seem irrele-vant.” “Don’t ask me. I just sort of combinethings.”

Additional comments appeared to question themeaningfulness of making precise estimates ofdiagnostic probability. For example, “I reallydoubt that research can be truly accurate for sup-porting clinical decisions.” “We can only say thatsomething is possible or not; we cannot assignnumbers to it.” “It is foolish to extrapolate popula-tion studies to a single individual patient.” “This isan unrealistic case because I have no patients likethis in my practice.” “Legal considerations have notbeen taken into account.” “It is better to be intu-itive.” “These are hypotheses; nothing has beenproven.” “There has been insufficient research in

faculty members in the two vignettes. We listedpercentages separately for students and facultymembers for each vignette and as a combinedtotal, weighted for sample size. The most com-monly used strategy was to begin with the testsensitivity (that is, the likelihood that the testresults will be positive for patients who, in fact,have the condition) and make either absolute orproportional adjustments based on informationabout the test sensitivity, resulting in false-positive observations. These represented 20 and 18 percent of the overall strategies.

In the next most likely strategy (16 percent), participants used only the test sensitivity and made no adjustments. Ten percent of the respondents adjusted the absolute positive observation (100 percent) by the chance of a false-positive observation. Twenty-six percent of students and faculty members gave some consideration to the baseline prevalence of the condition in reaching their diagnosis. Sixty-one percent of respondents included test sensitivity as part of their strategy. Fourteen percent of respondents used the test result itself (that is, they confused a positive observation with a positive condition) as an anchor for their diagnoses.
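To make the contrast between these strategies concrete, the following is a minimal sketch that sets Bayes' theorem beside three of the shortcuts described above. The input values are hypothetical and are not the values used in the study's vignettes, and the formulas given for the "absolute" and "relative" adjustments are one assumed reading of how respondents may have combined the numbers.

```python
def bayes_posterior(prevalence, sensitivity, false_positive_rate):
    """Return P(condition | positive test); all arguments are fractions in [0, 1]."""
    true_positive = sensitivity * prevalence
    false_positive = false_positive_rate * (1.0 - prevalence)
    return true_positive / (true_positive + false_positive)


def heuristic_estimates(sensitivity, false_positive_rate):
    """Three shortcut estimates; the formulas are assumed readings of the strategies."""
    return {
        # Report the test sensitivity unchanged.
        "sensitivity only": sensitivity,
        # Subtract the false-positive rate from the sensitivity.
        "absolute adjustment": sensitivity - false_positive_rate,
        # Scale the sensitivity down in proportion to the false-positive rate.
        "relative adjustment": sensitivity * (1.0 - false_positive_rate),
    }


# Hypothetical inputs for illustration only (not taken from the study's vignettes).
prevalence, sensitivity, false_positive_rate = 0.10, 0.80, 0.10

print(f"Bayes' theorem: {bayes_posterior(prevalence, sensitivity, false_positive_rate):.2f}")
for name, value in heuristic_estimates(sensitivity, false_positive_rate).items():
    print(f"{name}: {value:.2f}")
```

With these assumed inputs, Bayes' theorem gives about 0.47, whereas all three shortcuts stay between 0.70 and 0.80 because none of them uses the baseline prevalence.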

Test sensitivity. We found differences between students and faculty members with respect to the strategies used to determine diagnostic probability. These differences, apparent in both the 90 and 21 percent vignettes, are consistent with our hypothesis of greater diagnostic accuracy for faculty members. For example, faculty members were less likely to rely entirely on test sensitivity information. They also were less likely to make a relative value adjustment to the test sensitivity information for false-positive observations but were more apt to make an absolute adjustment. These differences were statistically significant and contributed to the overall significant difference (63 percent for

TABLE 1

Students’ and faculty members’ estimates of the probability that a hypothetical patient had a specific condition.

| Vignette | Students’ mean (SD*) probability estimate | No. of students | Faculty members’ mean (SD) probability estimate | No. of faculty members | P value |
|---|---|---|---|---|---|
| 90% Vignette | 64.63 (16.68) | 60 | 69.90 (13.66) | 36 | .048 |
| 21% Vignette | 47.46 (22.79) | 67 | 35.08 (24.02) | 33 | .008 |

* SD: Standard deviation.
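As a quick arithmetic check on Table 1, the sketch below computes how far each group's mean estimate lies from the Bayesian answer, taking 90 and 21 percent (the vignette labels) as the reference values. The variable and function names are illustrative only.

```python
# Mean probability estimates (percent) copied from Table 1.
# "bayes" is the Bayesian answer that gives each vignette its name.
table1 = {
    "90% vignette": {"bayes": 90.0, "students": 64.63, "faculty": 69.90},
    "21% vignette": {"bayes": 21.0, "students": 47.46, "faculty": 35.08},
}


def gap_from_bayes(row):
    """Absolute distance (percentage points) of each group's mean from the Bayesian answer."""
    return {group: abs(row[group] - row["bayes"]) for group in ("students", "faculty")}


for vignette, row in table1.items():
    gaps = gap_from_bayes(row)
    print(f"{vignette}: students {gaps['students']:.2f}, faculty {gaps['faculty']:.2f}")
```

On these figures the faculty mean is nearer the Bayesian answer in both vignettes: 20.10 versus 25.37 percentage points for the 90 percent vignette, and 14.08 versus 26.46 for the 21 percent vignette.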


this area.” “No P values have been provided.” “I would not rely on any data/research when considering the patient’s needs and treatment.” “I have seen several of these cases firsthand.”

TABLE 2

Students’ and faculty members’ strategies for estimating the probability that a hypothetical patient had a specific condition. Entries in the vignette columns are percentages of participants.

| Strategy | Overall use of strategy (%) | 90% vignette: students (n = 54) | 90% vignette: faculty members (n = 30) | 21% vignette: students (n = 56) | 21% vignette: faculty members (n = 20) | Combined vignettes: students | Combined vignettes: faculty members | 90% vignette/21% vignette P value*† | Student/faculty P value*‡ |
|---|---|---|---|---|---|---|---|---|---|
| Individual Strategies | | | | | | | | | |
| Used baseline data only | 9 | 4 | 5 | 11 | 15 | 8 | 10 | .045 | — |
| Used test sensitivity data only | 16 | 19 | 9 | 23 | 5 | 21 | 7 | — | .007 |
| Used positive test result only (100%) | 1 | 2 | 0 | 0 | 0 | 1 | 0 | — | — |
| Used false-positive observation only | 4 | 2 | 0 | 4 | 15 | 2 | 7 | .031 | — |
| Adjusted sensitivity for absolute value of false-positive observation | 20 | 19 | 27 | 13 | 30 | 15 | 29 | — | .038 |
| Adjusted sensitivity for relative value of false-positive observation | 18 | 13 | 5 | 34 | 10 | 24 | 7 | .003 | .002 |
| Adjusted positive observation (100%) for false-positive observation | 10 | 21 | 5 | 5 | 5 | 13 | 5 | .019 | — |
| Adjusted baseline data for false-positive observation | 8 | 13 | 9 | 4 | 5 | 8 | 7 | — | — |
| Used baseline data and test sensitivity in various relationships | 6 | 6 | 18 | 0 | 5 | 3 | 12 | .009 | .045 |
| Used test sensitivity combined with baseline data | 4 | 0 | 5 | 5 | 10 | 3 | 7 | — | — |
| Used positive observation (100%) combined with baseline data | 5 | 4 | 18 | 0 | 0 | 2 | 10 | .002 | .050 |
| Combined Strategies | | | | | | | | | |
| Consideration of baseline data | 26 | 23 | 36 | 20 | 35 | 21 | 36 | — | .038 |
| Consideration of test sensitivity | 61 | 56 | 59 | 70 | 50 | 63 | 55 | — | .024 |
| Consideration of test result (100%) | 14 | 21 | 9 | 7 | 5 | 15 | 12 | — | — |
| Respondents’ Concerns | | | | | | | | | |
| Resistance to problem as given | 17 | 11 | 27 | 7 | 35 | 9 | 31 | — | .001 |

* Only P values less than .05 are reported. Others are designated by a dash.

† P values are for test of differences in proportions between the vignettes, with Bayesian analysis estimates of 90% or 21%; data are combined for students and faculty members.

‡ Student/faculty P values are for test of differences in proportions between students and faculty members across vignette type.
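Two pieces of arithmetic behind Table 2 can be sketched as follows: the sample-size-weighted combination of the two vignette percentages, and the faculty-to-student ratios in the combined-vignette columns. The function names are illustrative, and because the table reports rounded percentages, a recomputed figure can differ from the printed one by about a point.

```python
def weighted_percentage(pct_a, n_a, pct_b, n_b):
    """Combine two percentages, weighting each by its sample size."""
    return (pct_a * n_a + pct_b * n_b) / (n_a + n_b)


def faculty_to_student_ratio(faculty_pct, student_pct):
    """How many times as often faculty members fell in a category as students did."""
    return faculty_pct / student_pct


# Students, "Used test sensitivity data only": 19% of 54 and 23% of 56 (Table 2).
combined_students = weighted_percentage(19, 54, 23, 56)
print(f"Combined student percentage: {combined_students:.1f}")  # table lists 21

# Combined-vignette columns of Table 2 (faculty members versus students).
print(f"Consideration of baseline data: {faculty_to_student_ratio(36, 21):.1f}x")
print(f"Resistance to problem as given: {faculty_to_student_ratio(31, 9):.1f}x")
```

The two ratios come out at roughly 1.7 and 3.4.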


These comments expressing frustration with the challenge of making precise estimates of diagnostic probability or expressing doubt about the value of using scientific data to make diagnostic estimates were three times as likely to come from faculty members as from students (31 versus 9 percent, P < .001).

DISCUSSION

The results of this study confirmed our three hypotheses that were based on similar studies in medicine5,6,12-16,18-26:

- we observed a wide range of diagnostic probability estimates;
- when making estimates, participants gave undue weight to test evidence compared with baseline information;
- experienced practitioners were less prone to exaggerate the importance of test evidence than were students.

In addition, experienced practitioners expressed greater resistance to using structured approaches when estimating diagnostic probabilities, which suggests that the diagnostic process may involve other considerations, as noted below. Finally, these findings raise some issues regarding evidence-based dentistry.

First hypothesis. Standard deviations and interquartile ranges were large in this study, although generally smaller than those reported in the literature involving physicians’ diagnostic patterns.4,12,13,15-21,24,25 It is difficult to explain why students and practitioners, when given identical data, provided probability estimates that fluctuated so widely.

Second hypothesis. The second hypothesis (overemphasis on test outcomes) also was confirmed. For both vignettes and both groups of participants, the mean estimated diagnostic probability was between the baseline prevalence and the test sensitivity value, but it was closer to the test sensitivity value, a finding consistent with the literature.5,6,12,13,15,20,21 The fact that one-quarter of the participants gave no consideration to baseline data when calculating the diagnostic probability supports this hypothesis to an extreme degree.

Third hypothesis. Participants’ failure to make appropriate adjustments for baseline data is especially noteworthy. Faculty members were 1.7 times more likely than students to consider baseline information in their diagnoses, confirming our third hypothesis. Their mean estimated diagnostic probability was significantly closer to the Bayesian answer in both vignettes compared with that for students. This finding is consistent with those of other studies.24,25

We might speculate that clinical practice builds a repertoire of baseline experience or at least an increasing awareness of the value of patients’ background information as an element of diagnosis. We do not need to assume that experienced practitioners consciously weighted background information in their calculations; the adjustments may have been largely intuitive. The finding by Aberegg and colleagues28 that experienced practitioners opt for more conservative or middle-of-the-road diagnoses is inconsistent with our results, because the baseline values in our study were more extreme than were the test sensitivity values in both vignettes.

An unexpected result of this study was that respondents, especially the experienced practitioners, exhibited a resistance to the structure required to estimate diagnostic probabilities precisely. Not only did participants complain that the task was difficult, but they challenged the concept of estimating precise probabilities as representing a best guess for their diagnoses. It is easy to form the impression that dentists do not have formulas for estimating the likelihood that a disease is present given various data that they use regularly. Respondents’ comments in this study seem to indicate that they were being asked to perform a mental process in which they did not regularly engage.

Since the mid-1960s, a line of research in applied decision theory (called “man as intuitive statistician”)29 has shown that we are capable of and typically use approximations of the more detailed rules used by decision scientists.30-32 In the aggregate, the results of our study confirmed that students and practitioners did use approximations of the best decision rules. However, the comments volunteered by participants suggest that something more is involved in the decision-making process.

Colombotos33 advanced a theory that health care professionals prefer to make personal, subjective assessments of clinical situations rather than data-driven, well-defined assessments. If it were possible for others with the same input information to calculate accurate diagnostic assessments, what would be left to professional judgment and the special place of practitioners? Furthermore, if others knew in some precise way


what was happening or what should happen with patients, then practitioners’ freedom of choice regarding treatment options would be truncated and others would be able to evaluate whether the clinician had performed correctly in a given situation.

In the Colombotos hypothesis, there can be too much diagnostic precision, or at least a desirable range of vagueness in which professional judgment holds sway. Ambiguity controlled by the health care professional has value. Man-Son-Hing and colleagues34 found that patients were more likely than practitioners to prefer exact quantitative diagnostic information. Berner and Maisiak35 reported that physicians resist using decision support systems that require an understanding of the system’s logic. Physicians also reported greater levels of discomfort with treatment regimens that limited their range of choices.36,37 Researchers also have identified the strategic advantage of using imprecise estimates of probabilities in business settings.38-41

The research cited in this article assumes that dental care is based on diagnostic acumen and that an essential part of the diagnostic process is combining baseline information with evidence to generate an estimate of the likelihood that a patient has a specific disease or trauma condition. However, it is possible that this model is idealistic and more typical of researchers, while practicing dentists actually view the relationship between information, diagnosis and treatment paths differently.

Roswarski and Murray42 used vignettes to determine prescribing patterns among primary and emergency care physicians. Simple patient scenarios produced little decision bias; as the complexity increased, however, the amount of bias increased to the point at which some respondents deferred making choices altogether. Redelmeier and Shafir19 reported similar findings with several groups that evaluated a vignette involving osteoarthritis; they noted that complex cases led to respondents’ overreactions to the data (that is, salient evidence hypothesis) or maintenance of the status quo (that is, hypothesis of disengaging from the problem). Aberegg and colleagues28 investigated the diagnostic decision-making process of pulmonologists by using complex cases, and they found both status quo bias and a tendency of participants to undervalue information that would have led to treatment (both are forms of disengaging from the problem).

Evidence-based dentistry. The results of our research have implications for the discussion of evidence-based dentistry, which the American Dental Association defines as “an approach to oral health care that requires the judicious integration of systematic assessments of clinically relevant scientific evidence, relating to the patient’s oral and medical condition and history, with the dentist’s clinical expertise and the patient’s treatment needs and preferences.”43 This study appears to be the first to address the process of “judicious integration” in the context of dentistry. We also explored the effects of information with respect to the patient’s condition and the history of patients in the target population.

If the findings of this research are confirmed, our level of concern must be raised with regard to the possibility that salient, precise external evidence will receive undue weight in the decisions made by practitioners in individual cases. A report of Canadian primary care physicians concluded, “Clinicians strongly identified with the [evidence-based medicine] EBM model of clinical practice are less sensitive to context, which might be an obstacle to efforts to integrate patient values and clinical circumstances into patient-centered care.”44(p1106)

Webster and colleagues45 found that primary care physicians generally ignore clinical guidelines because they perceive them to undervalue specific clinical situations. In a review of 76 studies, Cabana and colleagues46 found that the potential for harming individual patients in the course of following general rules was the principal reason cited by physicians who eschewed evidence-based medicine. If, in fact, many practitioners are worried about evidence-based dentistry, not because of the poor quality of the evidence, but because of concern that evidence could overshadow the particular dentist-patient relationship, then calls for greater methodological rigor through systematic reviews and meta-analyses may not be the most useful step forward.

CONCLUSION

The results of this study raise some issues regarding both the Bayesian model of diagnosis and evidence-based dentistry as the best conceptualizations of clinical practice. More research is needed to begin to understand how practitioners actually diagnose diseases rather than confirming the fact that they poorly approximate the way researchers advise them to diagnose disease.


Disclosure. None of the authors reported any disclosures.

1. Hunink M. Decision Making in Health and Medicine: Integrating Evidence and Values. New York City: Cambridge University Press; 2001.

2. Sox HC. Medical Decision Making. Boston: Butterworths; 1988.

3. Weinstein MC, Fineberg HV. The use of diagnostic information to revise probabilities. In: Weinstein MC, ed. Clinical Decision Analysis. Philadelphia: Saunders; 1980:75-130.

4. Noguchi Y, Matsui K, Imura H, Kiyota M, Fukui T. Quantitative evaluation of the diagnostic thinking process in medical students. J Gen Intern Med 2002;17(11):839-844.

5. Kahneman D, Slovic P, Tversky A, eds. Judgment Under Uncertainty: Heuristics and Biases. Cambridge, United Kingdom: Cambridge University Press; 1982.

6. Gilovich T, Griffin D, Kahneman D, eds. Heuristics and Biases: The Psychology of Intuitive Judgment. Cambridge, United Kingdom: Cambridge University Press; 2002.

7. Gladwell M. Blink: The Power of Thinking Without Thinking. New York City: Little, Brown; 2005.

8. Taleb NN. Fooled by Randomness: The Hidden Role of Chance in Life and in the Markets. 2nd ed. New York City: Thomson/Texere; 2004.

9. Ariely D. Predictably Irrational: The Hidden Forces That Shape Our Decisions. New York City: Harper; 2009.

10. Bader JD, Shugars DA. Variation, treatment outcomes, and practice guidelines in dental practice. J Dent Educ 1995;59(1):61-95.

11. Eddy DM. Probabilistic reasoning in clinical medicine: problems and opportunities. In: Kahneman D, Slovic P, Tversky A, eds. Judgment Under Uncertainty: Heuristics and Biases. Cambridge, United Kingdom: Cambridge University Press; 1982:249-267.

12. Lyman GH, Balducci L. Overestimation of test effects in clinical judgment. J Cancer Educ 1993;8(4):297-307.

13. Lyman GH, Balducci L. The effect of changing disease risk on clinical reasoning. J Gen Intern Med 1994;9(9):488-495.

14. Morise AP. Are the American College of Cardiology/American Heart Association guidelines for exercise testing for suspected coronary artery disease correct? Chest 2000;118(2):535-541.

15. Puhan MA, Steurer J, Bachmann LM, Riet G. A randomized trial of ways to describe test accuracy: the effect on physicians’ post-test probability estimates. Ann Intern Med 2005;143(3):184-189.

16. Steurer J, Fischer JE, Bachmann LM, Koller M, Riet G. Communicating accuracy of tests to general practitioners: a controlled study (published correction appears in BMJ 2002;324[7350]:1391). BMJ 2002;324(7341):824-826.

17. Ebell MH, Bergus GR, Warbasse L, Bloomer R. The inability of physicians to predict the outcome of in-hospital resuscitation. J Gen Intern Med 1996;11(1):16-22.

18. Sox CM, Doctor JN, Koepsell TD, Christakis DA. The influence of types of decision support on physicians’ decision making. Arch Dis Child 2009;94(3):185-190.

19. Redelmeier DA, Shafir E. Medical decision making in situations that offer multiple alternatives. JAMA 1995;273(4):302-305.

20. Phelps MA, Levitt MA. Pretest probability estimates: a pitfall to the clinical utility of evidence-based medicine? Acad Emerg Med 2004;11(6):692-694.

21. Attia JR, Nair BR, Sibbritt DW, et al. Generating pre-test probabilities: a neglected area in clinical decision making. Med J Austral 2004;180(9):449-454.

22. Grether DM. Bayes’ rule as a descriptive model: the representativeness heuristic. Q J Econ 1980;95:537-557.

23. Arocha JF, Patel VL, Patel YC. Hypothesis generation and the coordination of theory and evidence in novice diagnostic reasoning. Med Decis Making 1993;13(3):198-211.

24. Randolph AG, Zollo MB, Egger MJ, Guyatt GH, Nelson RM, Stidham GL. Variability in physician opinion on limited pediatric life support. Pediatrics 1999;103(4):e46.

25. Winkenwerder W, Levy BD, Eisenberg JM, Williams SV, Young MJ, Hershey JC. Variation in physicians’ decision-making thresholds in management of a sexually transmitted disease. J Gen Intern Med 1993;8(7):367-373.

26. Cahan A, Gilon D, Manor O, Paltiel O. Clinical experience did not reduce the variance in physicians’ estimates of pretest probability in a cross-sectional survey. J Clin Epidemiol 2005;58(11):1211-1216.

27. American College of Dentists. Dental leadership version 1.1. “www.dentalleadership.org/index.shtml”. Accessed April 16, 2010.

28. Aberegg SK, Haponik EF, Terry PB. Omission bias and decision making in pulmonary and critical care medicine. Chest 2005;128(3):1497-1505.

29. Peterson CR, Beach LR. Man as an intuitive statistician. Psychol Bull 1967;68(1):29-46.

30. Alloy LB, Tabachnik N. Assessment of covariation by humans and animals: the joint influence of prior expectations and current situational information. Psych Rev 1984;91(1):112-149.

31. Cosmides L, Tooby J. Are humans good intuitive statisticians after all? Rethinking some conclusions from the literature on judgment under uncertainty. Cognition 1996;58(1):1-73.

32. Crocker J. Judgments of covariation by social perceivers. Psychol Bull 1981;90(2):272-292.

33. Colombotos JL. Responses of the health professions to information: dilemmas and contradictions. J Am Coll Dent 1989;56(3):50-54.

34. Man-Son-Hing M, O’Connor AM, Drake E, Biggs J, Hum V, Laupacis A. The effect of qualitative vs. quantitative presentation of probability estimates on patient decision-making: a randomized trial. Health Expect 2002;5(3):246-255.

35. Berner ES, Maisiak RS. Influence of case and physician characteristics on perceptions of decision support systems. J Am Med Inform Assoc 1999;6(5):428-434.

36. Pearson SD, Goldman L, Orav EJ, et al. Triage decisions for emergency department patients with chest pain: do physicians’ risk attitudes make the difference? J Gen Intern Med 1995;10(10):557-564.

37. Shen J, Andersen R, Brook R, Kominski G, Albert PS, Wenger N. The effects of payment methods on clinical decision-making: physician responses to clinical scenarios. Med Care 2004;42(3):297-302.

38. Alvarez SA, Parker SC. Emerging firms and the allocation of control rights: a Bayesian approach. Acad Mgmt Rev 2009;34(2):209-227.

39. Cyert RM, DeGroot MH. Bayesian Analysis and Uncertainty in Economic Theory. Lanham, Md.: Rowman and Littlefield; 1987.

40. Foss NJ. Firms, incomplete contracts, and organizational learning. Hum Sys Mgmt 1996;15(1):17-26.

41. Leroy S, Singell L. Knight on risk and uncertainty. J Poli Econ 1987;95(2):394-406.

42. Roswarski TE, Murray MD. Supervision of students may protect academic physicians from cognitive bias: a study of decision making and multiple treatment alternatives in medicine. Med Decis Making 2006;26(2):154-161.

43. ADA Center for Evidence-Based Dentistry. About EBD. “http://ebd.ada.org/about.aspx”. Accessed May 2, 2010.

44. Tracy CS, Dantas GC, Moineddin R, Upshur RE. Contextual factors in clinical decision making: national survey of Canadian family physicians. Can Fam Physician 2005;51:1106-1107.

45. Webster BS, Courtney TK, Huang YH, Matz S, Christiani DC. Physicians’ initial management of acute low back pain versus evidence-based guidelines. J Gen Intern Med 2005;20(12):1132-1135.

46. Cabana MD, Rand CS, Powe NR, et al. Why don’t physicians follow clinical practice guidelines? A framework for improvement. JAMA 1999;282(15):1458-1465.
