
Assessment of Bayesian network classifiers as tools for discriminating breast cancer pre-diagnosis based on three diagnostic methods

Ameca-Alducin María Yaneli1, Cruz-Ramírez Nicandro2, Mezura-Montes Efrén1, Martin-Del-Campo-Mena Enrique3, Pérez-Castro Nancy1, and Acosta-Mesa Héctor Gabriel2

1 Laboratorio Nacional de Informática Avanzada (LANIA) A.C., Rébsamen 80, Centro, Xalapa, Veracruz, 91000, México

2 Departamento de Inteligencia Artificial, Universidad Veracruzana, Sebastián Camacho 5, Centro, Xalapa, Veracruz, 91000, México

3 Centro Estatal de Cancerología: Miguel Dorantes Mesa, Aguascalientes 100, Progreso Macuiltepetl, Xalapa, Veracruz, 91130, México

August 25, 2012

Abstract. In recent years, a technique known as thermography has again been seriously considered as a complementary tool for the pre-diagnosis of breast cancer. In this paper, we use Bayesian networks to explore the predictive value of thermographic attributes from a database containing 98 cases of patients with suspected breast cancer. Each patient has corresponding results for different diagnostic tests: mammography, thermography and biopsy. Our results suggest that these attributes are not enough for producing good results in the pre-diagnosis of breast cancer. On the other hand, these models show unexpected interactions among the thermographic attributes, especially those directly related to the class variable.

Keywords: Thermography, Breast cancer, Bayesian networks

1 Introduction

Breast cancer is currently the leading cause of cancer death among women worldwide [1]. There are various techniques to pre-diagnose this disease, such as self-examination, mammography, ultrasound, MRI and thermography [2–5]. The most common test for this pre-diagnosis is mammography [2]; however, because the disease takes different forms [3], there are situations where this test does not provide an accurate result [6]. For instance, women younger than 40 years old have denser breast tissue, which is an identified cause for mammography not to work properly [7]. In order to overcome this limitation in the pre-diagnosis of breast cancer, a relatively new technique has been proposed as a complement: thermography [5]. This technique consists of taking infrared images of the breasts with an infrared camera [8]. Thermography is a non-invasive, painless procedure that does not expose the patient to X-ray radiation [7]. Besides, it is cheaper than other pre-diagnostic procedures. Thermography mainly provides information about the temperature of the breasts and the corresponding differences between them. It is argued that lesions in the breasts produce significantly more heat than healthy, normal breast tissue [9]. This is because these lesions (or tumors) contain more veins and have a higher metabolic rate than the surrounding tissue.

Our main contribution is the exploration of the predictive value of the attributes of three different diagnostic methods for breast cancer. With this exploration, we can more easily appreciate the performance of each method in terms of accuracy, sensitivity and specificity. Moreover, we can visually identify which thermographic variables a Bayesian network considers more important for predicting the outcome.

The rest of the paper is organized as follows. Section 2 describes the state of the art, which gives the context needed to identify our contribution. Section 3 presents the materials and methods used in our experiments. Section 4 presents the methodology followed to carry out the experiments and their respective results. Section 5 discusses these results and, finally, section 6 gives the conclusions and identifies some future work.

2 State of the Art

The state of the art of thermography includes introductory investigations, image-based works and data-based works [10, 11]. The first group focuses on explaining the technique as well as its advantages and disadvantages. A representative work is that of Foster (1998) [6], who points out that thermography may be a potential alternative diagnostic method since it does not produce radiation. The second group concentrates on image-processing techniques such as clustering or fractal analysis [12, 13]. The work of EtehadTavakol et al. (2008) [12] uses k-means and fuzzy c-means for separating lesions from non-lesions. The final group applies statistical and Artificial Intelligence techniques (such as Artificial Neural Networks) [14, 15, 7, 16]. The work of Wishart et al. (2010) [16] compares two software packages that use AI techniques to analyze data coming from thermographic images so that diagnoses can be carried out.

Our work focuses on exploring the discriminative power of thermographic attributes for the pre-diagnosis of breast cancer using Bayesian networks.

3 Materials and Methods

3.1 The Database

We used a 98-case database provided by a medical oncologist who has specialized in thermographic studies since 2008. The database consists of 77 sick patients and 21 healthy patients. Each patient (either sick or healthy) has tests for thermography, mammography and biopsy. The dataset comprises 28 variables in total: 16 belong to thermography, 8 to mammography and 3 to biopsy; the last variable is the outcome (cancer or no cancer). This last variable is confirmed by an open biopsy, which is considered the gold-standard test for diagnosing breast cancer. Table 1 presents the names and a brief description of the thermographic variables. Table 2 presents the same information for the mammographic variables and Table 3 for the biopsy variables.

Table 1. Names, definitions and types of variables of thermography

| Variable name | Definition | Variable type |
|---|---|---|
| Asymmetry | Degree difference (in Celsius) between the right and the left breasts | Nominal (range [1-3]) |
| Thermovascular network | Amount of veins with highest temperature | Nominal (range [1-3]) |
| Curve pattern | Heat area under the breast | Nominal (range [1-3]) |
| Hyperthermia | Hottest point of the breast | Binary |
| 2c | Degree difference between the hottest points of the two breasts | Nominal (range [1-4]) |
| F unique | Amount of hottest points | Nominal (range [1-4]) |
| 1c | Hottest point in only one breast | Binary |
| Furrow | Furrows under the breasts | Binary |
| Pinpoint | Veins going to the hottest points of the breasts | Binary |
| Hot center | The center of the hottest area | Binary |
| Irregular form | Geometry of the hot center | Binary |
| Histogram | Histogram in the form of an isosceles triangle | Binary |
| Armpit | Degree difference between the two armpits | Binary |
| Breast profile | Visually altered profile | Binary |
| Score | The sum of values of the previous 14 variables | Binary |
| Age | Age of patient | Nominal (range [1-3]) |
| Outcome | Cancer/no cancer | Binary |

Table 2. Names, definitions and types of variables of mammography

| Variable name | Definition | Variable type |
|---|---|---|
| BIRADS | Assigned value in a mammography to measure the degree of the lesion | Nominal (range [0-6]) |
| Clockwise | Clockwise location of the lesion | Nominal (range [1-12]) |
| Visible tumor | Whether the tumor is visible in the mammography | Binary |
| Spiculated edges | Whether the edges of the lesion are spiculated | Binary |
| Irregular edges | Whether the edges of the lesion are irregular | Binary |
| Microcalcifications | Whether microcalcifications are visible in the mammography | Binary |
| AsymmetryM | Whether the breast tissue is asymmetric | |
| Distortion | Whether the structure of the breast is distorted | Binary |

Table 3. Names, definitions and types of variables of biopsy

| Variable name | Definition | Variable type |
|---|---|---|
| sizeD | Discretized tumor size | Nominal (range [1-3]) |
| RHP | Types of cancer | Nominal (range [1-8]) |
| SBRdegree | Degree of cancer malignancy | Nominal (range [0-3]) |


3.2 Bayesian Networks

A Bayesian network (BN) [17, 18] is a graphical model that represents relationships of a probabilistic nature among variables of interest. Such networks consist of a qualitative part (structural model), which provides a visual representation of the interactions among variables, and a quantitative part (set of local probability distributions), which permits probabilistic inference and numerically measures the impact of a variable or sets of variables on others. Together, the qualitative and quantitative parts determine a unique joint probability distribution over the variables in a specific problem [17–19]. In other words, a Bayesian network is a directed acyclic graph consisting of [20]:

a) nodes (circles), which represent random variables, and arcs (arrows), which represent probabilistic relationships among these variables; and

b) for each node, a local probability distribution attached to it, which depends on the state of its parents.

Figures 1 and 2 (see section 4) show examples of a BN. One of the great advantages of this model is that it allows the representation of a joint probability distribution in a compact and economical way by making extensive use of conditional independence, as shown in equation 1:

$$P(X_1, X_2, \ldots, X_n) = \prod_{i=1}^{n} P(X_i \mid Pa(X_i)) \tag{1}$$

where $Pa(X_i)$ represents the set of parent nodes of $X_i$, i.e., nodes with arcs pointing to $X_i$. Equation 1 also shows how to recover a joint probability from a product of local conditional probability distributions.
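As an illustration of equation 1, the following minimal sketch computes a joint probability as a product of local conditional probabilities. The three-node structure, the probability values and the names `parents`, `cpt` and `joint_probability` are assumptions made for this example only; they are not estimates from the 98-case database nor the networks learned in this work.

```python
from itertools import product

# Hypothetical network: Outcome -> Furrow and Outcome -> Asymmetry.
# All probabilities are made-up illustration values.
parents = {"Outcome": (), "Furrow": ("Outcome",), "Asymmetry": ("Outcome",)}

# cpt[node][parent_values][value] = P(node = value | parents = parent_values)
cpt = {
    "Outcome": {(): {0: 0.2, 1: 0.8}},
    "Furrow": {(0,): {0: 0.9, 1: 0.1}, (1,): {0: 0.4, 1: 0.6}},
    "Asymmetry": {(0,): {0: 0.7, 1: 0.3}, (1,): {0: 0.5, 1: 0.5}},
}


def joint_probability(assignment):
    """Equation 1: multiply each node's local probability given its parents."""
    prob = 1.0
    for node, pa in parents.items():
        pa_values = tuple(assignment[p] for p in pa)
        prob *= cpt[node][pa_values][assignment[node]]
    return prob


# Summing the joint over every assignment must give 1 for a valid network.
nodes = list(parents)
total = sum(
    joint_probability(dict(zip(nodes, values)))
    for values in product([0, 1], repeat=len(nodes))
)
```

With binary variables, the full joint table of this toy network would need 7 free parameters, whereas the factorized form stores only 5; the saving grows quickly with the number of variables.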

Bayesian Network Classifiers. Classification refers to the task of assigning class labels to unlabeled instances. In such a task, given a set of unlabeled cases on the one hand, and a set of labels on the other, the problem to solve is to find a function that suitably maps each unlabeled instance to its corresponding label (class). As can be inferred, the central research interest in this specific area is the design of automatic classifiers that can estimate this function from data (in our case, we use Bayesian networks). This kind of learning is known as supervised learning [21–23]. For the sake of brevity, we do not include here the code of the procedures used in the tests carried out in this work. Instead, we only describe them briefly and refer the reader to their original sources. The procedures used in these tests are: a) the Naïve Bayes classifier, b) Hill-Climber and c) Repeated Hill-Climber [24, 25, 22].

a) The Naïve Bayes classifier (NB), whose main appeals are simplicity and accuracy, although its structure is always fixed (the class variable has an arc pointing to every attribute). In simple terms, the NB learns by maximum likelihood, from a training data sample, the conditional probability of each attribute given the class. Then, once a new case arrives, the NB uses Bayes' rule to compute the conditional probability of the class given the set of attributes, selecting the value of the class with the highest posterior probability.

b) Hill-Climber is Weka's [24] implementation of a search-and-score algorithm, which uses greedy hill climbing [26] for the search part and different metrics for the scoring part, such as BIC (Bayesian Information Criterion), BD (Bayesian Dirichlet), AIC (Akaike Information Criterion) and MDL (Minimum Description Length). For the experiments reported here, we selected the MDL metric. This procedure takes as input an empty graph and a database and applies different operators for building a Bayesian network: addition, deletion or reversal of an arc. In every search step, the MDL score of each candidate structure is calculated and Hill-Climber keeps the structure with the best (minimum) score. It stops searching when no new structure improves the MDL score of the previous network.

c) Repeated Hill-Climber is Weka's [24] implementation of a search-and-score algorithm, which uses repeated runs of greedy hill climbing [26] for the search part and different metrics for the scoring part, such as BIC, BD, AIC and MDL. For the experiments reported here, we selected the MDL metric. In contrast to the simple Hill-Climber algorithm, Repeated Hill-Climber takes as input a randomly generated graph. It also takes a database, applies the same operators (addition, deletion or reversal of an arc) and returns the best structure found over the repeated runs of the Hill-Climber procedure. With this repetition of runs, it is possible to reduce the problem of getting stuck in a local minimum [19].
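The search procedures in b) and c) can be sketched as follows. This is a minimal illustration, not Weka's implementation: the function names are hypothetical, and the MDL metric is replaced by a toy score (`toy_score`, the number of arcs that differ from an assumed `target` structure) so that the example stays self-contained. As with MDL, lower scores are better.

```python
import random
from itertools import permutations


def is_acyclic(nodes, arcs):
    """Check acyclicity by repeatedly removing nodes without incoming arcs."""
    remaining, arcs = set(nodes), set(arcs)
    while remaining:
        roots = [n for n in remaining if not any(b == n for _, b in arcs)]
        if not roots:
            return False  # every remaining node has a parent: there is a cycle
        remaining -= set(roots)
        arcs = {(a, b) for a, b in arcs if a in remaining}
    return True


def neighbors(nodes, arcs):
    """All graphs reachable by adding, deleting or reversing one arc."""
    for a, b in permutations(nodes, 2):
        if (a, b) in arcs:
            yield arcs - {(a, b)}                # deletion
            yield (arcs - {(a, b)}) | {(b, a)}   # reversal
        elif (b, a) not in arcs:
            yield arcs | {(a, b)}                # addition


def hill_climb(nodes, score, start=frozenset()):
    """Greedy search: move to the best neighbor until none improves the score."""
    current = frozenset(start)
    best = score(current)
    while True:
        candidates = [frozenset(g) for g in neighbors(nodes, current)
                      if is_acyclic(nodes, g)]
        if not candidates:
            return current, best
        challenger = min(candidates, key=score)
        if score(challenger) >= best:
            return current, best
        current, best = challenger, score(challenger)


def random_dag(nodes, rng, p=0.3):
    """Random acyclic graph: arcs only go forward in a shuffled node order."""
    order = list(nodes)
    rng.shuffle(order)
    return frozenset((a, b) for i, a in enumerate(order)
                     for b in order[i + 1:] if rng.random() < p)


def repeated_hill_climb(nodes, score, runs=10, seed=1):
    """Run hill_climb from several random graphs and keep the best result."""
    rng = random.Random(seed)
    results = [hill_climb(nodes, score, random_dag(nodes, rng))
               for _ in range(runs)]
    return min(results, key=lambda result: result[1])


# Toy stand-in for the MDL score (an assumption for illustration only).
nodes = ["Outcome", "Furrow", "Asymmetry"]
target = frozenset({("Outcome", "Furrow"), ("Outcome", "Asymmetry")})


def toy_score(arcs):
    return len(arcs ^ target)


structure, best = repeated_hill_climb(nodes, toy_score)
```

The defaults `runs=10` and `seed=1` mirror the Repeated Hill-Climber settings listed in Table 4. With a real scoring metric the search can stop in a local minimum, which is what the repeated runs are meant to mitigate; the toy score used here has no local minima other than the target.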

3.3 Evaluation Method: Stratified K-fold Cross-validation

We follow the definition of the cross-validation method given by Kohavi [23]. In k-fold cross-validation, we split the database $D$ into $k$ mutually exclusive random samples called folds, $D_1, D_2, \ldots, D_k$, of approximately equal size. For each $i \in \{1, 2, \ldots, k\}$, we train the classifier on $D \setminus D_i$ and test it on $D_i$ (the symbol $\setminus$ denotes set difference). The cross-validation accuracy estimate is the total number of correct classifications divided by the sample size (the total number of instances in $D$). Thus, the k-fold cross-validation estimate is:

$$acc_{cv} = \frac{1}{n} \sum_{(v_i, y_i) \in D} \delta\big(I(D \setminus D_{(i)}, v_i),\, y_i\big) \tag{2}$$

where $I(D \setminus D_{(i)}, v_i)$ denotes the label assigned to an unlabeled instance $v_i$ by classifier $I$ trained on $D \setminus D_{(i)}$ (that is, the dataset without the test fold that contains $v_i$), $y_i$ is the class of instance $v_i$, $n$ is the size of the complete dataset and $\delta(i, j)$ is a function where $\delta(i, j) = 1$ if $i = j$ and $0$ if $i \neq j$. In other words, if the label assigned by the inducer to the unlabeled instance $v_i$ coincides with class $y_i$, then the result is 1; otherwise, the result is 0; i.e., we consider a 0/1 loss function in our calculations of equation 2. It is important to mention that in stratified k-fold cross-validation, the folds contain approximately the same proportion of classes as the complete dataset $D$. A special case of cross-validation occurs when $k = n$ (where $n$ represents the sample size). This case is known as leave-one-out cross-validation [21, 23].

We assess the performance of the classifiers presented in section 3.2 using the following measures [27–30]:

a) Accuracy: the overall number of correct classifications divided by the size of the corresponding test set.

b) Sensitivity: the ability to correctly identify those patients who actually have the disease, i.e., the proportion of sick patients classified as sick.

c) Specificity: the ability to correctly identify those patients who do not have the disease, i.e., the proportion of healthy patients classified as healthy.
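The three measures can be computed from the counts of true/false positives and negatives, as in the minimal sketch below. The function names are hypothetical. The example evaluates an assumed classifier that labels every case as cancer, using the class balance of section 3.1 (77 sick, 21 healthy): it reaches 100% sensitivity and 0% specificity, and its pooled accuracy of 77/98 (about 78.57%) shows how far a high accuracy can be explained by class imbalance alone.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true positives, true negatives, false positives, false negatives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return tp, tn, fp, fn


def evaluate(y_true, y_pred, positive=1):
    """Accuracy, sensitivity and specificity as defined in the list above."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred, positive)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Class balance from section 3.1: 77 sick (1) and 21 healthy (0) patients.
y_true = [1] * 77 + [0] * 21
# Hypothetical classifier that labels every case as cancer.
always_positive = [1] * len(y_true)
result = evaluate(y_true, always_positive)
```

Reporting sensitivity and specificity alongside accuracy is what exposes this behavior; accuracy alone would not distinguish such a classifier from a genuinely informative one.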

4 Methodology and Experimental Results

The thermographic study begins by obtaining the thermal images. These images are taken about 1 meter away from the patient, depending on her muscle mass, in a temperature-controlled room (18–22 °C), with a FLIR A40 infrared camera. Three images are taken for each patient: one frontal and two lateral (left and right). Right after, the breasts are uniformly covered with surgical spirit (using cotton). Two minutes later, the same three images are taken again. All these images are stored using the ThermaCAM Researcher Professional 2.9 software. Once the images are taken and stored, the specialist analyzes them and fills in the database with the corresponding values for each thermographic variable. He also includes the corresponding values for the mammographic and biopsy variables.

We carried out the experiments in Weka [24], using the three Bayesian network classifiers (see their parameter settings in Table 4) as well as three other classifiers available in Weka: an Artificial Neural Network and the decision trees ID3 and C4.5 (all with default parameters). For measuring their accuracy, sensitivity and specificity, we used 10-fold cross-validation as described in section 3.3. The main objective of these experiments is to explore the diagnostic performance of the attributes of thermography, mammography and biopsy, including the unveiling of the interactions among attributes and class.
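The evaluation protocol of section 3.3 can be sketched as below. This is an illustrative, pure-Python version and not Weka's implementation; `stratified_folds`, `cross_val_accuracy` and the majority-class inducer are hypothetical names introduced for the example. Accuracy is pooled over all held-out instances, as in equation 2.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k, seed=1):
    """Return k lists of indices with roughly equal class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    position = 0
    for indices in by_class.values():
        rng.shuffle(indices)
        for idx in indices:  # deal the indices of each class round-robin
            folds[position % k].append(idx)
            position += 1
    return folds


def cross_val_accuracy(X, y, k, train, predict):
    """Equation 2: fraction of held-out instances that are labeled correctly."""
    correct = 0
    for fold in stratified_folds(y, k):
        held_out = set(fold)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        model = train([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct += sum(predict(model, X[i]) == y[i] for i in fold)
    return correct / len(y)


# Stand-in inducer (an assumption, not one of the classifiers used here):
# it always predicts the most frequent class of the training folds.
def train_majority(X_train, y_train):
    return max(set(y_train), key=y_train.count)


def predict_majority(model, x):
    return model


# Same class balance as the database: 77 sick (1) and 21 healthy (0) cases.
y = [1] * 77 + [0] * 21
X = [[0]] * len(y)  # placeholder attributes; the stand-in inducer ignores them
folds = stratified_folds(y, k=10)
acc = cross_val_accuracy(X, y, 10, train_majority, predict_majority)
```

With 98 cases and 10 folds, each test fold holds only two or three healthy cases, which helps explain why the specificity estimates reported below have wide confidence intervals.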

Table 5 shows the numerical results of thermography, mammography, biopsy, and thermography combined with mammography for Naïve Bayes, Hill-Climber and Repeated Hill-Climber. Table 6 shows the numerical results of the same four attribute sets for the Artificial Neural Network and the decision trees ID3 and C4.5. Figure 1 shows the BN obtained by Hill-Climber and Repeated Hill-Climber for biopsy, and Figures 2–7 show the BNs obtained by Hill-Climber and Repeated Hill-Climber for thermography, mammography, and thermography combined with mammography, respectively. We do not present the structure of the Naïve Bayes classifier since it is always fixed: there is an arc pointing from the class to every attribute.
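Because the Naïve Bayes structure is fixed, learning reduces to counting. The sketch below illustrates the description given in section 3.2 (maximum-likelihood estimates followed by Bayes' rule). It is not Weka's implementation, which may differ, for instance by smoothing the estimates; the function names and the toy data are assumptions made for the example.

```python
from collections import Counter, defaultdict


def train_naive_bayes(X, y):
    """Maximum-likelihood estimates of P(class) and P(attribute | class)."""
    class_counts = Counter(y)
    # value_counts[(attribute index, class)][value] = number of training cases
    value_counts = defaultdict(Counter)
    for row, label in zip(X, y):
        for j, value in enumerate(row):
            value_counts[(j, label)][value] += 1
    return class_counts, value_counts


def posterior(model, row):
    """Bayes' rule: P(class | row) is proportional to P(class) * prod_j P(x_j | class)."""
    class_counts, value_counts = model
    n = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        p = count / n
        for j, value in enumerate(row):
            p *= value_counts[(j, label)][value] / count
        scores[label] = p
    total = sum(scores.values())
    if total == 0:
        return scores  # the row was never seen with any class
    return {label: s / total for label, s in scores.items()}


def predict(model, row):
    """Select the class value with the highest posterior probability."""
    post = posterior(model, row)
    return max(post, key=post.get)


# Toy data with two binary attributes (illustrative values only).
X = [[1, 1], [1, 0], [1, 1], [0, 1], [0, 0], [0, 1]]
y = [1, 1, 1, 0, 0, 1]
model = train_naive_bayes(X, y)
post = posterior(model, [0, 0])
```

For the case `[0, 0]`, the unnormalized scores are 4/6 · 1/4 · 1/4 for class 1 and 2/6 · 1 · 1/2 for class 0, so the posterior of class 0 is 0.8 and that class is selected.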

Table 4. Parameter values used for Hill-Climber and Repeated Hill-Climber

| Parameter | Hill-Climber | Repeated Hill-Climber |
|---|---|---|
| Initial structure NB (Naïve Bayes) | False | False |
| Number of parents | 100,000 | 100,000 |
| Runs | - | 10 |
| Score type | MDL | MDL |
| Seed | - | 1 |
| Arc reversal | True | True |

Table 5. Accuracy, sensitivity and specificity of Naïve Bayes, Hill-Climber and Repeated Hill-Climber for different methods of pre-diagnosis of breast cancer. For the accuracy test, the standard deviation is shown next to the accuracy result. For the remaining tests, their respective 95% confidence intervals (CI) are shown in parentheses. NB = Naïve Bayes, HC = Hill-Climber, RHC = Repeated Hill-Climber.

| Method | NB accuracy | NB sensitivity | NB specificity | HC accuracy | HC sensitivity | HC specificity | RHC accuracy | RHC sensitivity | RHC specificity |
|---|---|---|---|---|---|---|---|---|---|
| Thermography | 68.18% (±12.15) | 79% (70–88) | 24% (6–42) | 78.56% (±3.14) | 100% (100–100) | 0% (0–0) | 78.56% (±3.14) | 100% (100–100) | 0% (0–0) |
| Mammography | 80.67% (±10.95) | 87% (0–0) | 52.4% (0–0) | 72.56% (±10.30) | 84% (76–93) | 71% (52–91) | 74.56% (±8.26) | 87% (80–95) | 71% (52–91) |
| Biopsy | 99% (±3.16) | 99% (80–95) | 100% (100–100) | 99% (±3.16) | 84% (76–93) | 100% (100–100) | 99% (±3.16) | 87% (80–95) | 100% (100–100) |
| Thermography and mammography | 77.22% (±14.13) | 84% (76–93) | 52% (31–74) | 69.44% (±10.37) | 83% (75–91) | 19% (2–36) | 70.56% (±11.85) | 84% (76–93) | 19% (2–36) |
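The source does not state how the 95% confidence intervals were computed. As an assumption, the sketch below uses the normal-approximation (Wald) interval for a proportion, with the number of sick patients (77) as the sample size for sensitivity and the number of healthy patients (21) for specificity. Under that assumption it reproduces the thermography intervals reported for Naïve Bayes in Table 5; the function name `wald_interval` is hypothetical.

```python
import math


def wald_interval(p, n, z=1.96):
    """Normal-approximation interval p ± z * sqrt(p * (1 - p) / n)."""
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


# Thermography with Naïve Bayes (Table 5): sensitivity 79% over the 77 sick
# patients and specificity 24% over the 21 healthy patients.
sens_lo, sens_hi = wald_interval(0.79, 77)
spec_lo, spec_hi = wald_interval(0.24, 21)

# One healthy patient out of 21 classified correctly (about 5% specificity).
low_lo, low_hi = wald_interval(1 / 21, 21)
```

This approximation is not bounded to the range 0–100%, so for proportions near 0 or 1 with few cases it can yield a negative lower limit or an upper limit above 100%, as in the last example.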

Fig. 1. Bayesian network resulting from running Hill-Climber and Repeated Hill-Climber with the biopsy 98-case database


Table 6. Accuracy, sensitivity and specificity of Artificial Neural Network, decision tree ID3 and C4.5 for different methods of pre-diagnosis of breast cancer. For the accuracy test, the standard deviation is shown next to the accuracy result. For the remaining tests, their respective 95% confidence intervals (CI) are shown in parentheses. ANN = Artificial Neural Network.

| Method | ANN accuracy | ANN sensitivity | ANN specificity | ID3 accuracy | ID3 sensitivity | ID3 specificity | C4.5 accuracy | C4.5 sensitivity | C4.5 specificity |
|---|---|---|---|---|---|---|---|---|---|
| Thermography | 70.19% (±11.43) | 78% (69–87) | 33% (13–53) | 74.87% (±12.15) | 89% (82–96) | 52% (31–74) | 75.58% (±6.82) | 94% (88–99) | 5% (-4–14) |
| Mammography | 85.30% (±9.55) | 92% (86–98) | 67% (47–87) | *73.79% (±12.79) | 94% (88–100) | 61% (39–84) | *77.96% (±4.36) | 97% (94–101) | 0% (0–0) |
| Biopsy | 100% (±0) | 100% (100–100) | 100% (100–100) | 97.33% (±5.75) | 100% (100–100) | 100% (100–100) | 100% (±0) | 100% (100–100) | 100% (100–100) |
| Thermography and mammography | 76.11% (±12.91) | 87% (80–95) | 48% (26–69) | 68.67% (±12.56) | 76% (67–86) | 41% (18–65) | 74.36% (±8.70) | 94% (88–99) | 5% (-4–14) |

Fig. 2. Bayesian network resulting from running Hill-Climber with the thermographic 98-case database

Fig. 3. Bayesian network resulting from running Repeated Hill-Climber with the thermographic 98-case database

Fig. 4. Bayesian network resulting from running Hill-Climber with the mammography 98-case database

Fig. 5. Bayesian network resulting from running Repeated Hill-Climber with the mammography 98-case database

Fig. 6. Bayesian network resulting from running Hill-Climber with the thermographic and mammography 98-case database

Fig. 7. Bayesian network resulting from running Repeated Hill-Climber with the thermographic and mammography 98-case database

5 Discussion

Our main objective was to explore the predictive value of thermographic attributes for the pre-diagnosis of breast cancer. We decided to use the framework of Bayesian networks because of its power to visually unveil the relationships among the attributes themselves and between the attributes and the class. Furthermore, this model allows one to represent the uncertainty usually present in the medical domain. First of all, let us check the accuracy, sensitivity and specificity of the Bayesian network classifiers on the thermographic attributes (see Table 5). The results of the Bayesian networks for thermography and mammography are almost comparable, whereas for the neural network (see Table 6) mammography reaches an accuracy of 85.30%. In fact, thermography is excellent at identifying cases with the disease (100% sensitivity) but performs very poorly at detecting healthy cases (0% specificity) for both the Hill-Climber and Repeated Hill-Climber classifiers. Compared with the thermography results in Table 6 for the neural network, ID3 and C4.5, these Bayesian classifiers obtain a higher accuracy (78.56%; see Table 5).

The change in the performance of these two pre-diagnosis techniques when all their respective variables are included (Naïve Bayes classifier) is remarkable: specificity is 24% for thermography and 52.4% for mammography (see Table 5). It seems that this inclusion, far from improving the performance, makes it worse. Coming back to the sensitivity and specificity values, it can be argued that, in a certain sense, thermography can indeed be useful as a complementary tool for the pre-diagnosis of breast cancer. To see the picture more clearly, imagine a patient with a positive mammographic result for cancer. In order to be more certain of this result, a thermography can be taken to confirm it, since thermography seems to identify a sick patient without much trouble.

Biopsy (see the results in Tables 5 and 6) is indeed the gold-standard method to diagnose breast cancer. One question that immediately arises is: why not always use biopsy to diagnose breast cancer? The answer is that it implies a surgical procedure that involves known risks, mainly those related to anesthesia, apart from the economic costs. Methods such as mammography try to minimize the number of patients undergoing surgery. In other words, if all non-surgical procedures fail in the diagnosis, biopsy is the last resort.

Regarding the unveiling of the relationships among the attributes and between the attributes and the outcome, we can detect various interesting issues. For the case of thermography, contrary to what was expected, there is only one variable directly responsible for explaining the behavior of the outcome: the variable furrow (figures 2 and 3). It seems that this variable alone is enough to obtain the maximum classification percentage (78.56%) for identifying patients with cancer. For the case of mammography, the variables distortion and BIRADS are the only ones responsible for detecting abnormal as well as normal cases (figures 4 and 5). This may mean that radiologists could just observe these two variables to diagnose the presence or absence of the disease. For the case of biopsy, we have more arguments to trust this technique, in spite of its related and well-known risks. Finally, Table 5 suggests analyzing the thermographic and mammographic variables separately rather than together: combining them decreases accuracy, sensitivity and specificity.

6 Conclusions and Future Work

Our results suggest that thermography may have potential as a tool for pre-diagnosing breast cancer. However, more studies and tests are needed. Its overall accuracy and sensitivity values are encouraging; its specificity values, on the other hand, are disappointing. The Bayesian networks resulting from running three algorithms on the thermography database give us a good clue about this behavior: it seems that most of the thermographic variables are rather subjective, making it difficult to avoid the usual noise in this kind of variable. The present study thus suggests revisiting these variables and the way they are being measured. Such subjectivity does not belong only to thermography but also to mammography: according to the Bayesian network results, just two variables are responsible for explaining the outcome. Indeed, as can be noted from these results, when all mammographic variables are included (Naïve Bayes classifier) the specificity values drop significantly with respect to those obtained when only a subset of such attributes is considered. Moreover, if there were no subjectivity regarding specificity, then Naïve Bayes would perform better. Thus, it seems that there exists an overspecialization of the expert radiologist in the sense of considering all variables for diagnosing patients with the disease but an underspecialization for diagnosing the absence of the disease. It is important to mention that our database is unbalanced: we need to get more data (healthy and sick cases) so that our conclusions are more certain.

For future work, we firstly propose to add more cases to our databases. Secondly, it would be desirable to have roughly the same number of positive and negative cases. Thirdly, we propose balancing the classes. Finally, we recommend a revision of how the thermographic variables are being measured.

The first, third, and fifth authors acknowledge support from CONACyT through project No. 79809. The authors also thank the fourth author for providing the database.

References

1. Jemal A., Bray F., Center M., Ferlay J., Ward E., and Forman D. Global cancer statistics. CA: A Cancer Journal for Clinicians, 61:69–90, 2011.

2. Geller B.M., Kerlikowske K.C., Carney P.A., Abraham L.A., Yankaskas B.C., Taplin S.H., Ballard-Barbash R., Dignan M.B., Rosenberg R., Urban N., and Barlow W.E. Mammography surveillance following breast cancer. Breast Cancer Research and Treatment, 81:107–115, 2003.

3. Bonnema J., Van Geel A.N., Van Ooijen B., Mali S.P.M., Tjiam S.L., Henzen-Logmans S.C., Schmitz P.I.M., and Wiggers T. Ultrasound-guided aspiration biopsy for detection of nonpalpable axillary node metastases in breast cancer patients: New diagnostic method. World Journal of Surgery, 21:270–274, 1997.

4. Schnall M.D., Blume J., Bluemke D.A., DeAngelis G.A., DeBruhl N., Harms S., Heywang-Kobrunner S.H., Hylton N., Kuhl C., Pisano E.D., Causer P., Schnitt S.J., Smazal S.F., Stelling C.B., Lehman C., Weatherall P.T., and Gatsonis C.A. MRI detection of distinct incidental cancer in women with primary breast cancer studied in IBMC 6883. Journal of Surgical Oncology, 92:32–38, 2005.

5. Ng E.Y.K. A review of thermography as promising non-invasive detection modality for breast tumor. International Journal of Thermal Sciences, 48:849–859, 2009.

6. Foster K.R. Thermographic detection of breast cancer. IEEE Engineering in Medicine and Biology Magazine, 17:10–14, 1998.

7. Arora N., Martins D., Ruggerio D., Tousimis E., Swistel A.J., Osborne M.P., and Simmons R.M. Effectiveness of a noninvasive digital infrared thermal imaging system in the detection of breast cancer. The American Journal of Surgery, 196:523–526, 2008.

8. Hairong Q., Phani T.K., and Zhongqi L. Early detection of breast cancer using thermal texture maps. In Biomedical Imaging, 2002. Proceedings. 2002 IEEE International Symposium on, pages 309–312, 2002.

9. Wang J., Chang K.J., Chen C.Y., Chien K.L., Tsai Y.S., Wu Y.M., Teng Y.C., and Shih T.T. Evaluation of the diagnostic performance of infrared imaging of the breast: a preliminary study. BioMedical Engineering OnLine, 9:1–14, 2010.

10. Gutierrez F., Vazquez J., Venegas L., Terrazas S., Marcial S., Guzman C., Perez J., and Saldana M. Feasibility of thermal infrared imaging screening for breast cancer in rural communities of southern Mexico: The experience of the Centro de Estudios y Prevencion del Cancer (CEPREC). In 2009 ASCO Annual Meeting, page 1521. American Society of Clinical Oncology, 2009.

11. Ng E.Y.K., Chen Y., and Ung L.N. Computerized breast thermography: study of image segmentation and temperature cyclic variations. Journal of Medical Engineering & Technology, 25:12–16, 2001.

12. EtehadTavakol M., Sadri S., and Ng E.Y.K. Application of k- and fuzzy c-means for color segmentation of thermal infrared breast images. Journal of Medical Systems, 34:35–42, 2010.

13. EtehadTavakol M., Lucas C., Sadri S., and Ng E.Y.K. Analysis of breast thermography using fractal dimension to establish possible difference between malignant and benign patterns. Journal of Healthcare Engineering, 1:27–44, 2010.

14. Ng E.Y.K., Fok S.-C., Peh Y.C., Ng F.C., and Sim L.S.J. Computerized detection of breast cancer with artificial intelligence and thermograms. Journal of Medical Engineering & Technology, 26:152–157, 2002.

15. Ng E.Y.K. and Fok S.-C. A framework for early discovery of breast tumor using thermography with artificial neural network. The Breast Journal, 9:341–343, 2003.

16. Wishart G.C., Campisi M., Boswell M., Chapman D., Shackleton V., Iddles S., Hallett A., and Britton P.D. The accuracy of digital infrared imaging for breast cancer detection in women undergoing breast biopsy. European Journal of Surgical Oncology (EJSO), 36:535–540, 2010.

17. Pearl J. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann Series in Representation and Reasoning. Morgan Kaufmann Publishers, 1988.

18. Neuberg L.G. Causality: Models, Reasoning, and Inference, by Judea Pearl, Cambridge University Press, 2000. Econometric Theory, 19:675–685, 2003.

19. Friedman N. and Goldszmidt M. Learning Bayesian networks from data. University of California, Berkeley and Stanford Research Institute, page 117, 1998.

20. Cooper G. An overview of the representation and discovery of causal relationships using Bayesian networks. Computation, Causation, and Discovery, page 362.

21. Han J. and Kamber M. Data Mining: Concepts and Techniques. The Morgan Kaufmann Series in Data Management Systems. Elsevier, 2006.

22. Friedman N., Geiger D., and Goldszmidt M. Bayesian network classifiers. Machine Learning, 29:131–163, 1997.

23. Kohavi R. A study of cross-validation and bootstrap for accuracy estimation and model selection. pages 1137–1143. Morgan Kaufmann, 1995.

24. Witten I.H. and Frank E. Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann Series in Data Management Sys. Morgan Kaufmann, second edition, 2005.

25. Duda R.O., Hart P.E., and Stork D.G. Pattern Classification. John Wiley & Sons, 2nd edition, 2001.

26. Russell S.J. and Norvig P. Artificial Intelligence: A Modern Approach. Prentice Hall, 3rd edition, 2009.

27. Lavrac N. Selected techniques for data mining in medicine. Artificial Intelligence in Medicine, 16:3–23, 1999.

28. Cross S.S., Dube A.K., Johnson J.S., McCulloch T.A., Quincey C., Harrison R.F., and Ma Z. Evaluation of a statistically derived decision tree for the cytodiagnosis of fine needle aspirates of the breast (FNAB). Cytopathology, 9:178–187, 1998.

29. Cross S.S., Stephenson T.J., and Harrison R.F. Validation of a decision support system for the cytodiagnosis of fine needle aspirates of the breast using a prospectively collected dataset from multiple observers in a working clinical environment. Cytopathology, 11:503–512, 2000.

30. Cross S.S., Downs J., Drezet P., Ma Z., and Harrison R.F. Which decision support technologies are appropriate for the cytodiagnosis of breast cancer?, pages 265–295. World Scientific, 2000.