
American Journal of Obstetrics and Gynecology (2006) 194, 888–94

www.ajog.org

RESEARCH METHODS: STATE OF THE SCIENCE

Methods of clinical prediction

William A. Grobman, MD, MBA,a David M. Stamilio, MD, MSCEb,*

Department of Obstetrics and Gynecology, Northwestern University Medical School, Chicago, ILa; Department of Obstetrics and Gynecology, University of Pennsylvania Health System, Philadelphia, PAb

Received for publication May 16, 2005; revised September 4, 2005; accepted September 14, 2005

KEY WORDS: Prediction models; Statistical methods

The ability to predict clinical outcomes is of great importance to physicians and patients alike. Accordingly, multiple methods have been used in an effort to predict these outcomes accurately. These methods include the development of scoring systems based on univariable and multivariable analysis, as well as models involving the use of neural networks, nomograms, and classification and regression trees. The principles of these types of methods, as well as their advantages and disadvantages, will be presented.

© 2006 Mosby, Inc. All rights reserved.

The ability to predict clinical outcomes is of seminal importance in the physician-patient relationship. For physicians, the ability to understand the most likely end point of a patient’s clinical course may allow the modification of disease surveillance and treatment in such a way that improved outcomes can be achieved. For example, the ability to predict accurately which pregnant women will have a shoulder dystocia would allow physicians to make preparations and treatment decisions that could minimize the maternal and neonatal morbidity that are associated with this obstetric emergency. Alternatively, the prediction of their own potential outcomes allows patients to make the most educated choices from the different treatment strategies. Thus, confronted with the decision of undergoing a trial of labor after cesarean delivery, a woman would be substantially aided if she were able to understand not only summary outcomes for a large population but also the probability of the outcomes that she is most likely to experience, given her own particular characteristics.

* Reprint requests: David M. Stamilio, MD, MSCE, 2000 Courtyard Bldg, 3400 Spruce St, Philadelphia, PA 19104.

0002-9378/$ - see front matter © 2006 Mosby, Inc. All rights reserved.

doi:10.1016/j.ajog.2005.09.002

Accordingly, much of clinical research is devoted to understanding the factors that are associated with different patient outcomes. There have been a multitude of published analyses, for example, that have attempted to determine those characteristics that make a shoulder dystocia or a failed trial of labor more likely.1,2 In fact, many individual risk factors for both these adverse events have been elucidated. Nevertheless, although these factors can give both physicians and patients some helpful guidance, they ultimately are not adequate predictive tools. Even a strong association of a factor with an outcome does not guarantee that this factor predicts the outcome accurately. And, when there are multiple factors that are associated with a given outcome, associations alone do not give insight into how these factors can be combined to provide the most predictive potential.

We review the different methods that have been used in the medical literature to construct clinical predictive tools (Table) and focus on the methods that have been used already in the domain of obstetrics and gynecology. Both the theory underlying these methods and the specific examples of their use from the obstetric and gynecologic literature are supplied. We detail some predictive methods that have not yet been used widely in the obstetrics and gynecology literature but that hold promise to further aid in the prediction of clinical outcomes in our field.

Differentiation between a strong risk factor and a good predictive factor

Understanding the difference between a strong risk factor and a good predictive factor is integral to a discussion of clinical prediction methods. Ideally, a reliable predictive factor or model has a high sensitivity that detects most patients who are destined to have the outcome and a high specificity that results in a low false-positive rate and the correct identification of most patients who will not have the outcome. Investigators often proclaim that a strong risk factor is a reliable predictor of a clinical outcome without assessing anything more than the association between the factor and the outcome. Although a risk factor can be identified simply by examining the relative risk (or odds ratio) of an outcome in patients with the factor compared with those patients without the factor, whether a factor is adequately predictive depends not only on the magnitude of the relative risk but also on the prevalence of both the factor and the outcome. A risk factor that has a strong association with a clinical outcome (for example, a relative risk of 6) may not be a good predictor if its prevalence and the prevalence of the outcome are extremely low. Thus, a good clinical predictor not only is associated strongly with the outcome but also discriminates accurately between patients with and without a given outcome.
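The dependence of predictive performance on prevalence can be made concrete with a short calculation. The following Python sketch is illustrative only: it assumes a single binary risk factor, and the prevalences used (2% for the factor, 1% for the outcome) are hypothetical values chosen for the example, not figures from any cited study. The function name `predictive_performance` is likewise our own.

```python
def predictive_performance(relative_risk, factor_prevalence, outcome_prevalence):
    """Return (ppv, sensitivity, specificity) for one binary risk factor."""
    # The overall outcome rate is a prevalence-weighted mix of the risk in
    # patients with the factor and the risk in patients without it.
    risk_without = outcome_prevalence / (
        factor_prevalence * relative_risk + (1 - factor_prevalence)
    )
    risk_with = relative_risk * risk_without  # positive predictive value
    sensitivity = factor_prevalence * risk_with / outcome_prevalence
    specificity = (
        (1 - factor_prevalence) * (1 - risk_without) / (1 - outcome_prevalence)
    )
    return risk_with, sensitivity, specificity


# Hypothetical inputs: relative risk 6, factor in 2% of patients,
# outcome in 1% of patients.
ppv, sens, spec = predictive_performance(6.0, 0.02, 0.01)
print(f"PPV={ppv:.3f}  sensitivity={sens:.3f}  specificity={spec:.3f}")
```

Under these assumed prevalences, a factor with a relative risk of 6 identifies only about 1 in 9 patients who have the outcome, and only about 5% of patients with the factor go on to have it.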

A good example of this concept is the prediction of preterm delivery. An increased cervical fetal fibronectin level and a shortened cervical length (<25 mm) are associated with preterm delivery, with relative risks of 4 to 7 depending on the gestational age at which they are performed. However, the positive predictive value of these tests for preterm delivery has ranged from 20% to 40%, even when performed in a high-risk study population.3 Studies that have used multiple clinical variables to predict preterm delivery have been equally unsuccessful in achieving adequate predictive capability.4

Receiver-operating characteristic (ROC) curve analysis

In general, as the sensitivity of a predictive test increases, the specificity decreases and vice versa. Both of these test characteristics are important to consider if the overall accuracy of a predictive model is to be assessed. ROC curve analysis makes the trade-off between sensitivity and specificity explicit. Sensitivity (the true-positive rate) is plotted on the y-axis, and 1 − specificity (the false-positive rate) is plotted on the x-axis. The most accurate model or test is the one that generates the point on the ROC plot that is closest to the upper left corner of the graph (Figure 1). Conversely, a useless model or test discriminates no more accurately than chance and has an ROC curve that approximates the 45-degree diagonal.

It should be noted that the selection of the optimal discriminatory point in the clinical setting is arbitrary and does not always mandate that the upper-left-most point on the curve be chosen. Severity of the disease and risk of the test also must be considered. For example, if the outcome of disease is severe, one might favor maximizing sensitivity at the cost of a lower specificity, but if the risk or cost of the test is high, specificity might be maximized preferentially.
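The construction of an ROC curve and the selection of the point closest to the upper left corner can be sketched in a few lines of Python. The predicted scores and outcomes below are invented for illustration, and the function names are our own. The distance-to-corner rule is only one possible selection criterion, consistent with the caveat above that clinical considerations may favor a different point.

```python
def roc_points(scores, outcomes):
    """Return [(threshold, sensitivity, false_positive_rate), ...].

    A patient is classified as test-positive when score >= threshold.
    """
    positives = sum(outcomes)
    negatives = len(outcomes) - positives
    points = []
    for threshold in sorted(set(scores)):
        true_pos = sum(1 for s, y in zip(scores, outcomes) if s >= threshold and y == 1)
        false_pos = sum(1 for s, y in zip(scores, outcomes) if s >= threshold and y == 0)
        points.append((threshold, true_pos / positives, false_pos / negatives))
    return points


def closest_to_upper_left(points):
    """Pick the point nearest (false-positive rate 0, sensitivity 1)."""
    return min(points, key=lambda p: (1 - p[1]) ** 2 + p[2] ** 2)


# Hypothetical model outputs and observed outcomes (1 = outcome occurred).
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
outcomes = [0, 0, 1, 0, 0, 1, 1, 1]

best = closest_to_upper_left(roc_points(scores, outcomes))
print(best)  # (0.7, 0.75, 0.0)
```

If the outcome were severe, the same list of points could instead be searched for the highest-specificity threshold that still achieves a required minimum sensitivity.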

Development of a scoring system based on univariable analysis

One simple method for developing a predictive model uses the results that are derived from a univariable analysis. In this method, associations between categoric variables and a clinical outcome are ascertained through the use of univariable statistical tests such as the Student t test and chi-square analysis. Variables that are found to be associated significantly with the outcome are then reserved for use in the final predictive model. Essentially, a patient’s outcome is predicted by her “score,” which itself is based on the number of risk factors that she has.

Table. Techniques for the development of a clinical prediction model or rule

Technique | Advantages/strengths | Disadvantages/limitations
Univariate analysis | Simple statistical methods; easy clinical application | Reduced accuracy
Multivariable analysis (eg, logistic regression) | Improved accuracy; relative ease of clinical application | More involved statistical methods; may miss complex variable relationships
Neural network | Improved accuracy; incorporation of complex variable relationships | Difficult clinical application and dissemination; less intuitive; unknown effect of any single variable
Predictive nomogram | Improved accuracy; ease of clinical application | Advanced statistics
CART analysis | Improved accuracy; ease of clinical application; intuitive partitioning | Advanced statistics

One good example of this type of model is proposed by Troyer and Parisi5 to predict a successful trial of labor after cesarean delivery. These investigators analyzed 567 women at >36 weeks of gestation who had a history of previous low-transverse cesarean delivery in an effort to determine the variables that are associated significantly with a vaginal delivery. In univariable analysis, 4 variables were found to be associated with cesarean delivery after a trial of labor: previous dysfunctional labor, no previous vaginal delivery, nonreassuring fetal heart tracing on admission, and induction of labor in the current pregnancy. For the scoring system, each woman was accorded 1 point for each variable that was present on her admission to labor and delivery. Thus, this system had possible total scores that ranged from 0 (for women with no risk factors) to 4 (for women with all risk factors). When stratifying by these scores, the authors demonstrated that the scores were associated with significantly different vaginal birth rates. The 91.5% chance of vaginal delivery for those women with a score of 0 plummeted to a 46.1% chance of vaginal delivery for those women with a score of 4.
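A unit-weight score of this kind amounts to counting the risk factors that are present. The Python sketch below illustrates the arithmetic only. The dictionary keys are hypothetical labels for the 4 variables named above, and only the vaginal delivery rates for scores of 0 and 4 are included because those are the only values quoted here.

```python
# Hypothetical field names for the 4 risk factors described in the text.
RISK_FACTORS = (
    "previous_dysfunctional_labor",
    "no_previous_vaginal_delivery",
    "nonreassuring_fetal_heart_tracing_on_admission",
    "induction_of_labor",
)

# Only the rates for scores 0 and 4 are quoted in the text; the
# intermediate scores are deliberately left out rather than guessed.
REPORTED_VAGINAL_DELIVERY_RATE = {0: 0.915, 4: 0.461}


def unit_weight_score(patient):
    """Add 1 point for each risk factor that is present (range 0 to 4)."""
    return sum(1 for factor in RISK_FACTORS if patient.get(factor, False))


patient = {
    "previous_dysfunctional_labor": True,
    "no_previous_vaginal_delivery": True,
    "nonreassuring_fetal_heart_tracing_on_admission": False,
    "induction_of_labor": False,
}
score = unit_weight_score(patient)
print(score, REPORTED_VAGINAL_DELIVERY_RATE.get(score, "not quoted in the text"))
```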

Other well-known examples of predictive systems that are based on univariable associations are those described by Bishop6 for successful induction of labor and by Benacerraf et al7 for prenatal detection of chromosomal abnormalities. In contrast to the work of Troyer and Parisi,5 the presence of each predictive factor in these models does not result in an identical contribution to the total predictive “score.” In the work of Benacerraf et al,7 for example, findings on ultrasound that are associated with trisomy 21 are accorded points; a higher number of points corresponds to a greater probability of having a fetus with trisomy 21. Some ultrasound findings (such as an increased nuchal fold) generate 2 points, whereas others (such as a short femur) generate only 1 point. In Bishop’s model, the predictive factors are not dichotomized but have multiple categories that are associated with different point contributions.6 For example, cervical dilation may contribute from 0 to 3 points, depending on its extent.

Figure 1 Receiver-operating characteristic (ROC) curve for a multivariable predictive model of a clinical outcome.

There are several attractive aspects to this type of approach. First, the analysis can be done with statistical techniques that are readily understandable by the audience and accessible to investigators with basic statistical computer packages. Also, the scoring system is relatively easy to translate to bedside practice, because all that need be remembered is what the predictive factors are and how many points each contributes. The final score, which is obtained with simple arithmetic, has a limited range of values, and the model has a correspondingly limited range of outcomes. Yet, there are several aspects of this method that are less than ideal. When only univariable analysis is performed, factors that are not associated independently with the outcome and possibly related to one another may still be included in the final model, thereby reducing predictive accuracy. Also, points are accorded to each predictive factor with a degree of arbitrariness. In the article by Troyer and Parisi,5 for example, a previous dysfunctional labor is associated with a 26.6% chance of cesarean delivery, whereas a nonreassuring fetal heart tracing on admission is associated with a 57.3% chance of cesarean delivery. Yet, each factor makes the same contribution, namely 1 point, to the final score. Similarly, although there is some correlation between the magnitude of the association of an ultrasound finding with trisomy 21 and the number of points contributed by a given finding, the range of points is only from 1 to 2.7 This limitation may ease the clinical application of the model but impair its final accuracy. As with the scoring system of Bishop,6 the selection of each particular point contribution is not mandated by a statistical algorithm or quantitative assessment of risk magnitude to ensure the most predictive model but is chosen by an investigator without formal supporting evidence of its optimizing potential.

Predictive models that are based on logistic regression

Multivariable analysis resolves some of the limitations that are encountered when univariable analysis is used to form the basis of a predictive model. Most specifically, multivariable techniques allow for the improved assessment of the association of each independent variable with the dependent outcome. In turn, the resultant prediction models can be founded better on both the most relevant predictive factors and the magnitude of the contributions that these factors make. Although any multivariable technique (eg, continuous regression, logistic regression, Cox regression) can be used to arrive at a prediction model, much of the literature in obstetrics and gynecology makes use of logistic regression.

Logistic regression has been particularly useful for obstetric and gynecologic investigation for several reasons. Many of the outcomes of interest (such as failed trial of labor) are dichotomous and consequently are well suited to the logistic procedure. The relationship between a predictive factor and the outcome, moreover, is expressed by an odds ratio, a measure of effect that is used frequently in the medical literature and relatively easy to interpret. And, the very structure of the logistic equation itself allows the calculation of the individualized probability that each woman will experience the outcome of interest. In some cases, to establish the final regression equation, authors will use a stepwise procedure, a selection algorithm that uses predetermined probabilistic criteria to direct which factors remain in the final model. It should be noted that there are several types of stepwise procedures and that, in some cases, they may help to maximize the combination of accuracy and parsimony with respect to the number of variables in the model. That said, these procedures may not take into account special clinical knowledge that an investigator can contribute, and it is not always true that models that are built with these techniques are most useful clinically. Readers interested in exploring the details of using logistic regression for developing predictive models may refer to textbooks8 or software manuals (eg, STATA version 8 SE; Stata Corporation, College Station, TX) that are dedicated to this topic.

Pickhardt et al9 developed a predictive model for a vaginal birth after a cesarean delivery using stepwise logistic regression techniques. After reviewing the records of 495 women who underwent a trial of labor, they analyzed 19 factors for their relationship with vaginal delivery. Three factors were ultimately maintained in the logistic equation that provided the most accurate prediction. These 3 factors were the number of previous cesarean deliveries, cervical dilation on admission, and estimated gestational age at delivery. To use this equation to predict vaginal delivery, practitioners could use the regression constant (−8.6165) and β-coefficients for each factor (0.83296, −0.4803, and 0.2160, respectively) that were provided by the authors to construct the logistically derived probabilities for their own patients. Flamm and Geiger10 also used logistic regression to build a predictive model for vaginal delivery after a previous cesarean delivery, but simplified the application of this model by assigning points to each predictive factor, thereby eliminating the need for a practitioner to calculate probability with logistic methods. The number of points each factor was accorded was a whole number, the magnitude of which was based on each factor’s adjusted odds ratio from the logistic regression. For example, for the outcome of vaginal birth after cesarean delivery, “age under 40 years” had an odds ratio of 2.58, and “previous vaginal birth” had an odds ratio of 9.11; these odds ratios generated corresponding points of 2 and 4, respectively. The ultimate probability of vaginal delivery was predicated on the total points accumulated and increased as the summative score increased.
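The calculation that a practitioner would perform with a published logistic equation can be sketched as follows. This Python fragment shows only the general form of the computation. The intercept, coefficients, variable names, and patient values are hypothetical and are not those of any cited model, because the variable coding and units of the published equations are not reproduced here.

```python
import math


def logistic_probability(intercept, coefficients, values):
    """Convert a regression constant and beta-coefficients to a probability."""
    linear_predictor = intercept + sum(
        coefficients[name] * values[name] for name in coefficients
    )
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# Hypothetical model, for illustration only.
intercept = -2.0
coefficients = {"previous_vaginal_birth": 1.5, "cervical_dilation_cm": 0.4}

patient = {"previous_vaginal_birth": 1, "cervical_dilation_cm": 3}
probability = logistic_probability(intercept, coefficients, patient)
print(f"predicted probability = {probability:.3f}")

# Each coefficient corresponds to an adjusted odds ratio of exp(beta).
odds_ratio = math.exp(coefficients["previous_vaginal_birth"])
print(f"odds ratio for previous vaginal birth = {odds_ratio:.2f}")
```

The relationship between coefficient and odds ratio shown in the last lines is what makes it possible to derive a whole-number points system from a fitted equation, although the exact rounding rule is a choice made by each set of investigators.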

Other investigators have attempted to provide clinicians with predictive models with similarly simplified scoring systems that were generated by logistic multivariable techniques. Odibo et al11 used stepwise logistic regression in an effort to predict the chance of preterm birth for women who receive a cervical cerclage because of their pregnancy history. In their model, the 3 factors most associated with preterm delivery at <32 weeks of gestation were previous cone biopsy, a cervical length before cerclage of <25 mm, and emergency placement. Rather than assign scores of varying magnitude to each risk factor, the authors constructed a score that was based on the number of factors that were present for any given patient. They then examined whether a score of ≥1, ≥2, or 3 was most predictive of their outcome of interest. Their determination of the most predictive model used ROC curves. Their analysis showed that the presence of ≥2 factors was the cutoff that made the model most predictive, with a sensitivity of 80%, a specificity of 98%, and an overall accuracy of 93%.

Although logistic regression is a powerful and widely used tool in the development of prediction models, it is not without disadvantages. Independent variables may have interactions with one another, the knowledge of which would improve the accuracy of a prediction model. Although an investigator can search for these interactions, if many independent variables are being evaluated, an evaluation of all 2-way interactions is at the very least tedious, and a comprehensive evaluation of higher order interactions is all but impossible. Also, logistic regression assumes a linear relationship between each independent variable and the log odds of the outcome, but in some cases this relationship does not hold. Once again, researchers can investigate whether nonlinear relationships are present and, if so, whether transformations can resolve the nonlinearities. This work is time-consuming, and there may be instances in which complex nonlinear relationships cannot be modeled in an appropriate form. In these instances, other methods that are more capable of assessing nonlinear relationships among variables may provide more accurate predictions.
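The scale of the interaction search can be illustrated with a simple count. The following Python sketch assumes that every subset of 2 or more candidate predictors defines one possible interaction term; the choice of 19 predictors mirrors the number of candidate factors in the example of Pickhardt et al and is used here only for illustration.

```python
from math import comb


def interaction_terms(n_predictors, order):
    """Number of distinct interaction terms of a given order."""
    return comb(n_predictors, order)


def all_interaction_terms(n_predictors):
    """Number of interaction terms of every order from 2 up to n."""
    return sum(comb(n_predictors, k) for k in range(2, n_predictors + 1))


n = 19
print(interaction_terms(n, 2))    # 171 two-way interactions
print(interaction_terms(n, 3))    # 969 three-way interactions
print(all_interaction_terms(n))   # 524268 interactions of any order
```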

Predictive models that are based on neural networks

A neural network has the inherent ability to account for the complex nonlinear relationships of independent variables with one another and with the dependent variable as well. These networks have this capability because they do not depend on predefined mathematic relationships that are inherent to a logistic regression. Correspondingly, they free investigators from the tedious and sometimes unsuccessful search for these relationships and theoretically optimize the predictive accuracy of models when these relationships are present.

Conceptually, a neural network can be thought of as mimicking the flow of information within the human brain. Prediction that is based on human cerebral processes can be thought of as output (ie, prediction) based on inputs (ie, stimuli) recognized by neurons in the brain that are interconnected in complex patterns and alter information pathways in response to feedback mechanisms (ie, past experience). Similarly, computer neural networks formulate an output (ie, prediction) based on inputs (ie, predictive factors) with computer programs that allow the interactions of all the inputs with one another and provide the ability for improvement of future prediction based on feedback mechanisms (ie, accuracy of past predictions). The exact mechanism of the computer programs that create a neural network is beyond the scope of the present discussion but may be investigated further in articles that are devoted to this topic.12,13
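Although the training procedure is beyond the present scope, the way in which interconnected units capture an interaction between inputs can be shown with a very small example. In the Python sketch below, the weights are set by hand for illustration, which is an assumption of this example; a real network would arrive at its weights through feedback on the accuracy of past predictions. The 2 binary inputs are hypothetical predictive factors, and the outcome is made likely only when exactly 1 of them is present, a pattern that a single logistic equation without an interaction term cannot represent.

```python
import math


def sigmoid(z):
    """Logistic activation, the same curve used in logistic regression."""
    return 1.0 / (1.0 + math.exp(-z))


def tiny_network(x1, x2):
    """Two inputs -> two hidden units -> one output probability.

    Weights are hand-set for illustration, not learned from data.
    """
    hidden_either = sigmoid(20 * x1 + 20 * x2 - 10)     # near 1 if either input is present
    hidden_not_both = sigmoid(-20 * x1 - 20 * x2 + 30)  # near 1 unless both are present
    # The output unit fires only when both hidden units are active.
    return sigmoid(20 * hidden_either + 20 * hidden_not_both - 30)


for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, round(tiny_network(x1, x2), 3))
```

The example also shows why the contribution of any single input is hard to read from the network: neither input has one coefficient that summarizes its effect on the output.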

One situation that seems well-suited to analysis through the use of neural networks is the prediction of fetal acidemia that is based on fetal heart rate monitoring. The analysis of continuous fetal heart rate tracings requires the processing of large amounts of data that have complex temporal interconnections. Salamalekis et al14 have demonstrated how this data stream can be organized by a neural network to predict the umbilical cord arterial pH of the fetus. Using a computer to segment each 10-minute period of the fetal heart rate tracing into its constituent “wavelet coefficients” and correlating these findings with fetal oxygen saturation that was provided by a fetal pulse oximeter, the investigators were able to construct a neural network that could predict with 100% sensitivity and 83% specificity the occurrence of an umbilical cord arterial pH <7.00.

MacDowell et al15 had a similar success with a neural network in their prediction of which laboring women would ultimately have a cesarean delivery. A decision to perform a cesarean delivery is based on a multiplicity of factors, many of which are interrelated, and include maternal characteristics, fetal characteristics, and other factors that are not strictly clinical (such as time of birth and a mother’s expectations). MacDowell et al used a neural network to sift through 41 potentially useful predictor variables. Their network identified 23 variables that substantively aided in the prediction of the birthing mode, and a model was developed that correctly categorized 88.2% and 82.1% of women who ultimately were delivered by the cesarean or vaginal route, respectively.

Macones et al16 also theorized that a neural network could optimize the prediction of birth mode and developed a predictive model using this method for those women who undergo a trial of labor after cesarean delivery. Somewhat surprisingly, when compared with a predictive model that was developed with more standard logistic techniques, the neural network was found to be less accurate.

This finding of Macones et al16 only serves to emphasize that the theoretic advantages of more complex predictive methods may not always translate into tangible experimental improvements. Moreover, even if a neural network were to generate a superior predictive model, it still has some aspects that may hinder the translation of results into clinical practice. Perhaps most importantly, whereas logistic regression specifies the magnitude of the relationship that each independent variable has with the outcome, a neural network provides no such insight. The network accepts input and generates output but does not delineate clearly the relative importance of the variables with respect to the output. Correspondingly, because the predictive model is less intuitive and more removed from clinical experience, adoption by clinicians is less likely. In some sense, the theoretic advantage of the neural network method, which allows the determination of otherwise indiscernible data patterns, is a prominent disadvantage for clinical translation. Although the logistic regression equation and related scoring systems can sometimes be tedious to use, they nevertheless can be written explicitly and disseminated easily. A trained neural network, conversely, offers no such capacity for facile distribution.

Future directions

An optimal clinical prediction model is likely to have 2 main characteristics: (1) the ability to predict the outcome of interest accurately, itself dependent on the capacity to account for complex relationships among the data, and (2) the potential for ease of use in the clinical setting. As mentioned, although each of the methods that have been described so far has advantageous characteristics, none consistently fulfills both criteria. Consequently, researchers have developed other prediction methods. These methods have not been used widely in obstetrics and gynecology so far but have been used in other disciplines within medicine. Two such methods are predictive nomograms and classification and regression tree (CART) analysis.

Predictive nomograms

Nomograms allow for the accounting of the interactive effects of multiple independent variables and present these interactions in a graphic form that is accessible to clinicians. The statistics underlying the nomograms can be advanced, because multivariable techniques that do not rely on presupposed distributions of data may be used to identify factors that are most predictive of the outcome of interest. Despite the potential complexity of the underlying mathematics, the final prediction model is quite straightforward, because each independent variable is accorded a number of points, which are then summated to yield a total “score” that is predictive of outcome.

A schematic diagram of a hypothetic 2-variable nomogram is presented in Figure 2. The method for the use of the nomogram is as follows: For each clinical characteristic, find the number of points to which it corresponds on the uppermost point scale. Thus, if the continuous “variable A” had a value of “20,” 30 points would be generated. If that person also belonged within “strata 2,” 60 further points would result, which would yield a total score of 90 (30 + 60). Draw a line from the “total points” scale to the “outcome probability” graphic to predict the probability of the outcome of interest for the person under consideration. This hypothetic patient would have a 70% chance of the outcome of interest occurring.
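Reading a nomogram is equivalent to 2 table lookups and an addition, which is why it transfers easily to a computer or handheld device. The Python sketch below reproduces the worked example only. The point scale for variable A, the points for strata other than “strata 2,” and the shape of the points-to-probability curve are all assumptions of this sketch, chosen so that the 3 values stated in the text (20 → 30 points, strata 2 → 60 points, 90 points → 70%) are recovered.

```python
def points_for_variable_a(value):
    """Hypothetical linear point scale on which a value of 20 maps to 30 points."""
    return 1.5 * value


# Only the entry for strata 2 comes from the text; the others are invented.
STRATA_POINTS = {1: 0, 2: 60, 3: 100}

# (total points, outcome probability). Only 90 -> 0.70 comes from the text.
TOTAL_POINTS_TO_PROBABILITY = [(0, 0.10), (90, 0.70), (160, 0.95)]


def probability_from_total(total):
    """Read the outcome probability by linear interpolation between scale marks."""
    table = TOTAL_POINTS_TO_PROBABILITY
    if total <= table[0][0]:
        return table[0][1]
    for (x0, p0), (x1, p1) in zip(table, table[1:]):
        if total <= x1:
            return p0 + (p1 - p0) * (total - x0) / (x1 - x0)
    return table[-1][1]


total_points = points_for_variable_a(20) + STRATA_POINTS[2]
print(total_points, round(probability_from_total(total_points), 2))
```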

It should be noted that points are generated, not just on the basis of the presence or absence of an independent variable, but for a wide range of values that any independent variable may have. Also, these points are not chosen simply by the investigator but are generated by the analysis itself to optimize the prediction model. Kattan et al17,18 and Eastham and Kattan19 have demonstrated the clinical value of predictive nomograms in their evaluation of men with prostate cancer. They have developed nomograms that allow, for example, the prediction of 5-year survival after either a surgical procedure or radiotherapy. Moreover, clinical usefulness can be enhanced further because the nomogram can be incorporated into a computer or personal digital assistant, such that once the desired predictive factors are entered, the outcome probability is calculated automatically. Of note, we found no examples of the use of predictive nomograms in the obstetric or gynecologic literature.

Figure 2 Schematic representation of a predictive nomogram.

CART analysis

CART analysis is another type of predictive model that uses nonparametric techniques to evaluate data, account for complex relationships, and present the results in a clinically useful form. In this type of analysis, there is progressive splitting of the population into subgroups that are based on the predictive independent variables. The variables that are chosen, the discriminatory values of the variable, and the order in which the splitting occurs are all produced by the underlying mathematic algorithm to maximize predictive accuracy. A simplified example of a CART, with the same hypothetic variables that were used in the example of the nomogram, is presented in Figure 3. In this analysis, the clinician simply follows the paths of the tree that describe the characteristics of the patient being evaluated and arrives at the prediction of the outcome of interest for that particular patient.
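How an algorithm arrives at the discriminatory value for a variable can be sketched for a single split. The Python fragment below tries every candidate cut point for 1 continuous variable and keeps the one that leaves the 2 resulting subgroups most homogeneous with respect to the outcome. Gini impurity is used as the measure of homogeneity; this is one commonly used criterion and is an assumption of this sketch, as are the data values. A full CART analysis would repeat the search over all variables and then again within each resulting subgroup.

```python
def gini(outcomes):
    """Gini impurity of a group with a binary outcome (0 = perfectly pure)."""
    if not outcomes:
        return 0.0
    p = sum(outcomes) / len(outcomes)
    return 2 * p * (1 - p)


def best_split(values, outcomes):
    """Return (threshold, weighted_impurity) for the split value < threshold."""
    best = None
    for threshold in sorted(set(values))[1:]:
        left = [y for v, y in zip(values, outcomes) if v < threshold]
        right = [y for v, y in zip(values, outcomes) if v >= threshold]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(values)
        if best is None or impurity < best[1]:
            best = (threshold, impurity)
    return best


# Hypothetical data: "variable A" for 6 patients and whether the outcome occurred.
variable_a = [5, 10, 15, 20, 25, 30]
outcome = [0, 0, 0, 1, 1, 1]

threshold, impurity = best_split(variable_a, outcome)
print(threshold, impurity)  # 20 0.0
```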

This type of model is relatively easy to use for the clinician. First, in contrast to many logistic regression models, there are no complicated equations to remember or use. The structure of the tree is one that is appealing intuitively and congruent with methods of decision-making that a physician already uses on many occasions. For example, in trying to understand the best diagnostic test or treatment for a given patient, clinicians will use specific patient characteristics to determine progressively which modalities are most appropriate or which outcomes are most likely. The CART not only uses this type of logic but also provides a formal structure and quantitative outcome assessment that can optimize the actual clinical decision.

Figure 3 Schematic representation of a CART analysis.

CARTs have been employed usefully in the prediction of death for hospitalized patients with acquired immunodeficiency syndrome–related Pneumocystis carinii pneumonia and for those patients with acutely decompensated heart failure.20,21 Although CART analysis has not been used extensively in obstetric clinical research, some investigators have used the technique to predict ectopic pregnancy or to decide which patients with a possible ectopic pregnancy require close observation, immediate intervention, or no further evaluation.22,23 Prediction was based on symptoms and examination, medical history, and serum analyte values. Guzick et al24 also used CART analysis to distinguish men with subfertility with the use of semen analysis variables. Readers who are interested in learning more about the methods and applications of CART can refer to texts and reviews that are dedicated to the topic.25,26
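As a brief illustration of how a splitting algorithm can select a discriminatory value from the data, the sketch below searches for the threshold on a single continuous variable that minimizes the weighted Gini impurity of the two resulting subgroups. The Gini criterion is one commonly used choice and is an assumption here, because the text does not specify which criterion the algorithm optimizes. The data values are invented for illustration.

```python
def gini(labels):
    """Gini impurity of a group of binary outcomes (0 or 1)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n                      # proportion with the outcome
    return 2.0 * p * (1.0 - p)


def best_threshold(values, labels):
    """Return (threshold, weighted impurity) of the best single split."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, float("inf"))
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue                         # no split between equal values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]     # subgroup below the threshold
        right = [y for _, y in pairs[i:]]    # subgroup above the threshold
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
        if weighted < best[1]:
            best = (threshold, weighted)
    return best


# Invented data: variable A values and whether the outcome occurred.
variable_a = [5, 10, 15, 25, 30, 35]
outcome = [0, 0, 0, 1, 1, 1]
print(best_threshold(variable_a, outcome))
```

A full tree-building procedure would repeat this search over every candidate variable and then recurse within each subgroup, typically with stopping or pruning rules to limit overfitting.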

Comment

Prediction models have a long history of making useful contributions to clinicians and patients alike in the discipline of obstetrics and gynecology. Models that are based on univariable analysis have provided assistance with prediction in circumstances such as induction of labor. More recently, investigators have turned to models that rely on multivariable statistical methods in an effort to better account for complex relationships among data and thereby improve predictive accuracy. The use of multivariable analysis is particularly conducive to clinical outcomes that may depend on multiple interrelated predictive factors (such as success of a trial of labor). In some cases, however, these models may be more difficult to use in the clinical setting. Newer predictive methods hold the promise of achieving models that not only extract the most discriminatory potential from the data but also promote the use of the models in the clinical setting.

References

1. Gherman RB. Shoulder dystocia: an evidence-based evaluation of the obstetric nightmare. Clin Obstet Gynecol 2002;45:345-62.

2. Mozurkewich EL, Hutton EK. Elective repeat cesarean delivery versus trial of labor: a meta-analysis of the literature from 1989 to 1999. Am J Obstet Gynecol 2000;183:1187-97.

3. Iams J, Newman R, Thom E, Goldenberg R, Mueller-Heubach E, Moawad A. Frequency of uterine contractions and the risk of spontaneous preterm delivery. N Engl J Med 2002;346:250-5.

4. Mercer B, Goldenberg R, Moawad AH, Meis PJ, Iams JD, Das A, et al. The preterm prediction study: a clinical risk assessment system. Am J Obstet Gynecol 1996;174:1885-95.

5. Troyer L, Parisi V. Obstetric parameters affecting success in a trial of labor: designation of a scoring system. Am J Obstet Gynecol 1992;167:1099-104.

6. Bishop EH. Pelvic scoring for elective induction. Obstet Gynecol 1964;24:266-8.

7. Benacerraf B, Neuberg D, Bromley B, Frigoletto F. Sonographic scoring index for prenatal detection of chromosomal abnormalities. J Ultrasound Med 1992;11:449-58.

8. Hosmer D, Lemeshow S. Applied logistic regression. 2nd ed. New York: Wiley; 2000.

9. Pickhardt MG, Martin JN, Meydrech EF, Blake PG, Martin RW, Perry KG, et al. Vaginal birth after cesarean delivery: Are there useful and valid predictors of success or failure? Am J Obstet Gynecol 1992;166:1811-9.

10. Flamm B, Geiger A. Vaginal birth after cesarean delivery: an admission scoring system. Obstet Gynecol 1997;90:907-10.

11. Odibo AO, Farrell C, Macones GA, Berghella V. Development of a scoring system for predicting the risk of preterm birth in women receiving cervical cerclage. J Perinatol 2003;23:664-7.

12. Cheng B, Titterington D. Neural networks: a review from a statistical perspective. Stat Sci 1994;9:2-30.

13. Tu JV. Advantages and disadvantages of using artificial neural networks versus logistic regression for predicting medical outcomes. J Clin Epidemiol 1996;49:1225-31.

14. Salamalekis E, Thomopoulos P, Giannaris D, Salloum I, Vasios G, Prentza A, et al. Computerised intrapartum diagnosis of fetal hypoxia based on fetal heart rate monitoring and fetal pulse oximetry recordings utilising wavelet analysis and neural networks. BJOG 2002;109:1137-42.

15. MacDowell M, Somoza E, Rothe K, Fry R, Brady K, Bocklet A. Understanding birthing mode decision making using artificial neural networks. Med Decis Making 2001;21:433-43.

16. Macones GA, Hausman N, Edelstein R, Stamilio DM, Marder SJ. Predicting outcomes of trials of labor in women attempting vaginal birth after cesarean delivery: a comparison of multivariate methods with neural networks. Am J Obstet Gynecol 2001;184:409-13.

17. Kattan MW, Zelefsky MJ, Kupelian PA, Cho D, Scardino PT, Fuks Z, et al. Pretreatment nomogram that predicts 5-year probability of metastasis following three-dimensional conformal radiation therapy for localized prostate cancer. J Clin Oncol 2003;21:4568-71.

18. Kattan MW, Eastham JA, Wheeler TM, Maru N, Scardino PT, Erbersdobler A, et al. Counseling men with prostate cancer: a nomogram for predicting the presence of small, moderately differentiated, confined tumors. J Urol 2003;170:1792-7.

19. Eastham JA, Kattan MW, Scardino PT. Nomograms as predictive models. Semin Urol Oncol 2002;20:108-15.

20. Yarnold PR, Soltysik RC, Bennett CL. Predicting in-hospital mortality of patients with AIDS-related Pneumocystis carinii pneumonia: an example of hierarchically optimal classification tree analysis. Stat Med 1997;16:1451-63.

21. Fonarow GC, Adams KF Jr, Abraham WT, Yancy CW, Boscardin WJ. Risk stratification for in-hospital mortality in acutely decompensated heart failure: classification and regression tree analysis. JAMA 2005;293:572-80.

22. Dart RG, Kaplan B, Varaklis K. Predictive value of history and physical examination in patients with suspected ectopic pregnancy. Ann Emerg Med 1999;33:283-90.

23. Gerton GL, Fan XJ, Chittams J, Sammel M, Hummel A, Strauss JF, et al. A serum proteomics approach to the diagnosis of ectopic pregnancy. Ann N Y Acad Sci 2004;1022:306-16.

24. Guzick DS, Overstreet JW, Factor-Litvak P, Brazil CK, Nakajima ST, Coutifaris C, et al. Sperm morphology, motility, and concentration in fertile and infertile men. N Engl J Med 2001;345:1388-93.

25. Zhang H, Singer B. Recursive partitioning in the health sciences. New York: Springer-Verlag; 1999.

26. Lemon SC, Roy J, Clark MA, Friedmann PD, Rakowski W. Classification and regression tree analysis in public health: methodological review and comparison with logistic regression. Ann Behav Med 2003;26:172-81.