survey on data mining techniques in heart disease prediction

INTERNATIONAL CONFERENCE ON INFORMATION COMMUNICATION AND EMBEDDED SYSTEMS ICICES 2013

978-1-4673-5788-3/13/$31.00©2013IEEE


An Empirical Study on applying Data Mining Techniques for the Analysis and Prediction of Heart Disease

Sivagowry. S#1, Dr. Durairaj. M*2 and Persia. A#3

#Research Scholar, School of Computer Science and Engg., Bharathidasan University, Trichy, Tamilnadu, India.

*Assistant Professor, School of Computer Science and Engg., Bharathidasan University, Trichy, Tamilnadu, India.

Abstract— The health care environment is rich in information but poor in extracting knowledge from that information, because it lacks effective analysis tools for discovering hidden relationships and trends in the data. By applying data mining techniques, valuable knowledge can be extracted from the health care system. Heart disease is a group of conditions affecting the structure and functions of the heart and has many root causes. Heart disease has been the leading cause of death in the world over the past ten years. Researchers have applied many hybrid techniques to diagnosing heart disease. This paper presents an overall review of the application of data mining in heart disease prediction.

Index Terms— Data Mining, Decision Tree, Naïve Bayes, Classification, Clustering.

I. INTRODUCTION

Data mining is the exploration of large datasets to extract hidden and previously unknown patterns, relationships and knowledge that are difficult to detect with traditional statistics [17]. Data mining techniques are the result of a long process of research and product development. Data mining tasks fall into two groups: predictive tasks and descriptive tasks. Predictive tasks predict the value of a specific attribute based on other attributes; Classification, Regression and Deviation Detection are predictive tasks. Descriptive tasks derive patterns that summarize the relationships in the data; Clustering, Association Rule Mining and Sequential Pattern Discovery are descriptive tasks.

Data mining involves several steps, from raw data collection to some form of new knowledge. The iterative process consists of the following steps: Data Cleaning, Data Integration, Data Selection, Data Transformation, Data Mining, Pattern Evaluation and Knowledge Presentation. Data mining is the core of the Knowledge Discovery Process.

Figure 1: Data Mining is the core of Knowledge Discovery Process

Medical data mining is a challenging domain that involves a lot of imprecision and uncertainty. Providing quality services at affordable cost is the major challenge faced by health care organizations. Poor clinical decisions may lead to disastrous consequences. Health care data is massive, yet clinical decisions are often made based on the doctor’s experience rather than on the knowledge-rich data hidden in the database. In some cases this results in errors and excessive medical costs, which affect the quality of service provided to patients [33]. Medical history data comprises a number of tests essential to diagnosing a particular disease. The advantages of data mining can be gained in health care by employing it as an intelligent diagnostic tool. Researchers in the medical field identify and predict disease with the aid of data mining techniques [29].

II. HEART DISEASE

The initial diagnosis of a heart attack is made from a combination of clinical symptoms and characteristic electrocardiogram (ECG) changes. An ECG is a recording of the electrical activity of the heart. Confirmation of a heart attack can only be made hours later, through detection of elevated creatine phosphokinase (CPK) in the blood. CPK is a muscle protein enzyme that is released into the blood circulation by dying heart muscle cells when their surroundings dissolve [3].

The World Health Organization reported in 2003 that 29.2% of total global deaths are due to Cardio Vascular Disease (CVD). By the end of this year, CVD is expected to be the leading cause of death in developing countries, owing to changes in lifestyle, work culture and food habits. Hence, more careful and efficient methods of detecting cardiac disease, together with periodic examination, are of high importance [1].

III. DATA SETS

The data set is taken from the data mining repository of the University of California, Irvine (UCI). The Cleveland, Hungary, Switzerland, VA Long Beach and Statlog data sets are collected. The Cleveland, Hungary, Switzerland and VA Long Beach data sets contain 76 attributes in all, but only 14 attributes are used. Among these, the Cleveland and Statlog data sets are the most commonly used, because all the others have missing values. Figure 2 shows a sample of the data set attributes collected from the UCI repository.

Figure 2: Sample Data Set Attributes Used

No  Name      Description
1   Age       Age in years
2   Sex       1 = male, 0 = female
3   cp        Chest pain type (1 = typical angina, 2 = atypical angina, 3 = non-anginal pain, 4 = asymptomatic)
4   trestbps  Resting blood pressure (in mm Hg on admission to hospital)
5   chol      Serum cholesterol in mg/dl
6   fbs       Fasting blood sugar > 120 mg/dl (1 = true, 0 = false)
7   restecg   Resting electrocardiographic results (0 = normal, 1 = having ST-T wave abnormality, 2 = left ventricular hypertrophy)
8   thalach   Maximum heart rate
9   exang     Exercise induced angina
10  oldpeak   ST depression induced by exercise relative to rest
11  slope     Slope of the peak exercise ST segment (1 = upsloping, 2 = flat, 3 = downsloping)
12  Ca        Number of major vessels colored by fluoroscopy
13  thal      3 = normal, 6 = fixed defect, 7 = reversible defect
14  num       Class (0 = healthy, 1 = have heart disease)
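The 14 attributes listed above can be read into a simple record structure before any mining step. The following is a minimal sketch, assuming each record is one comma-separated line with the attributes in the order of Figure 2 and with missing values written as `?`; the function names and the two sample rows are hypothetical and are not taken from the surveyed papers.

```python
# Attribute names in the order listed in Figure 2.
FIELDS = ["age", "sex", "cp", "trestbps", "chol", "fbs", "restecg",
          "thalach", "exang", "oldpeak", "slope", "ca", "thal", "num"]


def parse_record(line):
    """Parse one comma-separated record into a dict of floats.

    Assumption: a missing value is written as '?' and is stored as None.
    """
    values = [v.strip() for v in line.split(",")]
    if len(values) != len(FIELDS):
        raise ValueError("expected %d values, got %d" % (len(FIELDS), len(values)))
    return {name: (None if raw == "?" else float(raw))
            for name, raw in zip(FIELDS, values)}


def is_complete(record):
    """True when no attribute of the record is missing."""
    return all(value is not None for value in record.values())


def has_disease(record):
    """Binary class label. Assumption: any non-zero 'num' means heart disease."""
    return record["num"] > 0


# Hypothetical sample rows, for illustration only.
row_a = parse_record("63,1,1,145,233,1,2,150,0,2.3,3,0,6,0")
row_b = parse_record("58,0,4,130,197,0,0,131,0,0.6,2,?,3,2")
```

Records for which `is_complete` returns False correspond to the missing-value cases that make the other data sets less commonly used.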

IV. DATA MINING TECHNIQUES USED IN HEART DISEASE PREDICTION

Data mining techniques such as Clustering, Classification, Regression and Association Rule mining are used to extract knowledge from databases. Medical data are mined using these techniques and the diagnosis of disease is carried out. The application of data mining techniques to medical data is explained below.

A. Data Mining and Association Rules

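An Association Rule is an implication expression of the form $X \rightarrow Y$, where X and Y are disjoint item sets, i.e. $X \cap Y = \emptyset$. The strength of an Association Rule can be measured in terms of support and confidence. Support determines how often a rule is applicable to a given data set. Confidence determines how frequently the items in Y appear in transactions that contain X.

$$\text{Support}(X \rightarrow Y) = P(X \cup Y)$$

$$\text{Confidence}(X \rightarrow Y) = \frac{P(X \cup Y)}{P(X)}$$

Here $P(X \cup Y)$ denotes the fraction of transactions that contain every item of both X and Y.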

Minimum support and minimum confidence are the two thresholds in Association Rule mining. Association Rule mining is divided into two tasks: frequent item set generation and rule generation.
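The support and confidence measures can be illustrated with a short calculation. This is a minimal sketch, assuming each patient record has already been mapped to a set of items; the item names, the five transactions and the threshold values are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(transactions, x, y):
    """Support of X and Y together divided by the support of X."""
    support_x = support(transactions, x)
    if support_x == 0:
        raise ValueError("X never occurs, confidence is undefined")
    return support(transactions, set(x) | set(y)) / support_x


def is_strong(transactions, x, y, min_support, min_confidence):
    """A rule is kept only if it passes both thresholds."""
    both = set(x) | set(y)
    return (support(transactions, both) >= min_support
            and confidence(transactions, x, y) >= min_confidence)


# Hypothetical transactions: each set holds the items of one patient record.
transactions = [
    frozenset({"chol_high", "exang", "disease"}),
    frozenset({"chol_high", "disease"}),
    frozenset({"chol_high"}),
    frozenset({"exang", "disease"}),
    frozenset({"normal_ecg"}),
]

rule_support = support(transactions, {"chol_high", "disease"})          # 2 of 5
rule_confidence = confidence(transactions, {"chol_high"}, {"disease"})  # 2 of 3
```

With thresholds of 0.3 for support and 0.6 for confidence (again hypothetical), the rule chol_high → disease would be retained; raising the minimum confidence to 0.7 would discard it.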

Carlos Ordonez et al. [7] used association rules to predict heart disease. They used a simple mapping algorithm, which treats attributes uniformly as numerical or categorical, to transform medical records into a transaction format. A mapping table is constructed and attribute values are mapped to items. An improved algorithm is then used to mine the constrained association rules. Decision trees were found unsuitable for mining this data because they split numerical values automatically [7], whereas the medical field has standard numerical cutoffs; the split points chosen by a decision tree are therefore of little use, and interpreting the experimental results from a decision tree is very difficult. Clustering was useful for gaining a global understanding of the data. A constrained version of clustering focusing on projections of the data could be useful, but it deserves further research. All these factors justify the use of association rules in mining medical data.

Deepika [11] used the Pruning-Classification Association Rule (PCAR) for mining association rules.

PCAR is derived from an analysis of the Apriori algorithm. It removes the items with minimum frequency from the item sets, deleting infrequent items, then classifies the item sets by their frequency and discovers the frequent item sets. Figure 3 shows how the PCAR algorithm works [11].



Figure 3: Procedure of PCAR Algorithm

B. Data Mining and Classification

Usha Rani [38] implemented an artificial neural network on the heart disease database using the feed-forward and back propagation algorithms. The experiment considers both single-layer and multilayer neural network models. Parallelism is implemented at each neuron in all hidden and output layers to speed up the learning process. The input layer has 13 neurons, one for each attribute. The neural network provides satisfactory results for the classification task.

Training  Test     Classification Efficiency
Samples   Samples  Single Layer  Multi Layer
100       300      76%           82%
150       200      79.4%         83%
250       150      86.2%         89.3%
350       100      90.6%         94%

Table 1: Experimental Results of Heart Disease Data set [37]
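The feed-forward step of a multilayer network with 13 input neurons can be sketched as follows. This is only an illustration of the forward pass, assuming sigmoid activations, one hidden layer and a single output neuron; the hidden layer size of 4 is a hypothetical choice, and back propagation training is not shown.

```python
import math

N_INPUTS = 13   # one input neuron per attribute, as described in the text
N_HIDDEN = 4    # hypothetical hidden layer size, chosen for illustration


def sigmoid(z):
    """Logistic activation, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def layer(inputs, weights, biases):
    """Fully connected layer: each row of `weights` feeds one neuron."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def forward(inputs, hidden_w, hidden_b, out_w, out_b):
    """Feed the inputs through the hidden layer, then the output neuron."""
    hidden = layer(inputs, hidden_w, hidden_b)
    return layer(hidden, out_w, out_b)[0]


# With all weights and biases at zero every neuron outputs sigmoid(0) = 0.5.
zero_hidden_w = [[0.0] * N_INPUTS for _ in range(N_HIDDEN)]
zero_hidden_b = [0.0] * N_HIDDEN
zero_out_w = [[0.0] * N_HIDDEN]
zero_out_b = [0.0]

untrained_output = forward([1.0] * N_INPUTS, zero_hidden_w, zero_hidden_b,
                           zero_out_w, zero_out_b)
```

A single-layer model in the sense of Table 1 would correspond to calling `layer` once on the 13 inputs; training would adjust the weights so that the output approaches the class label.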

In Asha Rajkumar [3], data mining classification is based on supervised machine learning algorithms. The Tanagra tool is used to classify the data, and the results are evaluated using 10-fold cross validation. The performance of the Naïve Bayes, K-NN and Decision List algorithms is analyzed in terms of accuracy and the time taken to build the model. The Naïve Bayes algorithm is considered the better choice, since it takes less time than the other algorithms and also results in a lower error rate. The Naïve Bayes algorithm gives 52.33% accuracy. Table 2 shows the performance study of the algorithms.

Algorithm used  Accuracy  Time taken
Naïve Bayes     52.33%    609 ms
Decision List   52%       719 ms
KNN             45.67%    1000 ms

Table 2: Performance study of the Algorithm [3]

In [24], R. Setthukkarasi proposed a novel neuro-fuzzy technique to diagnose the severity of the disease from a set of patient records. A generalized database is constructed for decision making from the reduced attribute set obtained through a genetic algorithm. A four-layered fuzzy neural network is used for efficient modeling and reasoning with temporal dependencies under uncertainty. A Decision Support Subsystem is constructed by extracting rules from the temporal patterns retrieved from the relationships in the trained data set. A Radial Basis Function (RBF) neural network is constructed with five input nodes, training and normalization in the hidden layer, and an output layer with one output node. RBF is used to train on a large amount of data with very few inputs. The steps followed for predicting heart disease are data preprocessing by genetic algorithm, generalized database generation, dataset training by the RBF fuzzy neural network, rule generation, query processing and severity-based diagnosis. Figure 4 shows the logical view of the rule generation process [24].

Figure 4: Logical view of Rule Generation Process

Rafiah Awang and Sellapan Palaniappan [25, 26] developed a prototype called the Intelligent Heart Disease Prediction System (IHDPS) using the data mining techniques Decision Trees, Naïve Bayes and Neural Network. Each technique has its own strength in realizing the objective of data mining. Data Mining Extension (DMX), an SQL-based query language, is used for building and accessing the models’ contents. IHDPS answers complex “what if” queries that a conventional decision support system cannot. The most common modeling objectives in data mining are Classification and Prediction. Decision Tree and Neural Network come under Classification, while Regression, Association Rules and Clustering [26] come under Prediction. Naïve Bayes is the most effective in heart disease prediction, as it has the highest percentage of correct predictions (86.53%). Five data mining rules are defined and evaluated using the three techniques.

Anbarasi M. et al. [1] used a Genetic Algorithm (GA) to determine the attributes for the diagnosis of heart disease. Feature extraction is done using the GA, and the number of attributes is reduced to 6. Table 3 shows the reduced attribute list obtained using the GA [1].



Predictable attribute: Diagnosis
  Value Healthy: No heart disease
  Value Sick: Has heart disease

Reduced input attributes:
  1. Type - Chest pain type
  2. Rbp - Resting blood pressure
  3. Eia - Exercise induced angina
  4. Oldpk - Old peak
  5. Vsl - No. of vessels colored
  6. Thal - Maximum heart rate achieved

Table 3: Reduced Attributes list using GA [1]

Three classifiers are compared for their performance in predicting heart disease: Naïve Bayes, Classification via Clustering and Decision Tree. The Decision Tree outperforms the others but takes more time to build the model. Naïve Bayes performs consistently before and after the reduction of attributes. Classification via Clustering performs very poorly. The Weka tool is used to evaluate the performance.

Shantakumar B. Patil et al. [30, 31] clustered the data warehouse using the K-means clustering algorithm after preprocessing the data. The Maximal Frequent Item Set Algorithm (MAFIA) is employed to extract association rules from the clustered dataset, and it performs efficiently especially when the database contains very long item sets. A Multilayer Perceptron Neural Network (MLPNN) is used, with back propagation as the training algorithm.

Pseudo code for MAFIA [29]:

MAFIA(C, MFI, Boolean IsHUT) {
    name HUT = C.head ∪ C.tail;
    if HUT is in MFI
        stop generation of children and return
    Count all children, use PEP to trim the tail, and reorder by increasing support
    For each item i in C.trimmed_tail {
        IsHUT = whether i is the first item in the tail
        newNode = C ∪ i
        MAFIA(newNode, MFI, IsHUT)
    }
    if (IsHUT and all extensions are frequent)
        Stop search and go back up subtree
    If (C is a leaf and C.head is not in MFI)
        Add C.head to MFI
}

In Application of Data Mining Techniques in Health Care and Prediction of Heart Attack [34], the One Dependency Augmented Naïve Bayes classifier (ODANB) [17] and the Naïve Credal Classifier 2 (NCC2) are used. The objective of this work is to examine the potential use of classification-based data mining techniques such as Naïve Bayes, Rule based, Decision Tree and Artificial Neural Network. For the prediction of heart disease, Naïve Bayes is found to be the best when compared with ODANB. Text mining can also be used in future to mine the vast amount of unstructured data available in health care databases.

In [35], Subbulakshmi et al. used Naïve Bayes to build a Decision Support in Heart Disease Prediction System (DSHDPS). It can serve as a training tool for nurses and medical students in diagnosing heart disease. Naïve Bayes is found to be the best in heart disease prediction. It is used to create models with predictive capability, provides new ways of exploring and understanding the data, and is used to get efficient output.
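Since Naïve Bayes recurs throughout the surveyed work, a small worked example may help. The sketch below is a generic categorical Naïve Bayes with add-one (Laplace) smoothing, not the implementation of [35] or of any other cited system; the two attributes, the six training records and the function names are hypothetical.

```python
from collections import Counter, defaultdict


def train(records, labels):
    """Count class frequencies and per-class attribute value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)   # (attribute index, label) -> value counts
    domain = defaultdict(set)             # attribute index -> values seen
    for record, label in zip(records, labels):
        for i, value in enumerate(record):
            value_counts[(i, label)][value] += 1
            domain[i].add(value)
    return class_counts, value_counts, domain


def predict_proba(model, record):
    """Posterior probability of each class for one record."""
    class_counts, value_counts, domain = model
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        p = n / total                      # class prior
        for i, value in enumerate(record):
            # Add-one smoothing avoids zero probabilities for unseen values.
            p *= (value_counts[(i, label)][value] + 1) / (n + len(domain[i]))
        scores[label] = p
    norm = sum(scores.values())
    return {label: s / norm for label, s in scores.items()}


# Hypothetical records: (chest pain type, exercise induced angina).
records = [("asymptomatic", "yes"), ("asymptomatic", "yes"), ("asymptomatic", "no"),
           ("typical", "no"), ("typical", "no"), ("typical", "yes")]
labels = ["sick", "sick", "sick", "healthy", "healthy", "healthy"]

model = train(records, labels)
posterior = predict_proba(model, ("asymptomatic", "yes"))
```

For the query record the unnormalized scores are 0.5 × 0.8 × 0.6 = 0.24 for "sick" and 0.5 × 0.2 × 0.4 = 0.04 for "healthy", so the posterior probability of "sick" is 0.24 / 0.28, roughly 0.857.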

[Fig 5 flowchart boxes: Enter Patient Record; Naïve Bayes; Data set; Calculates probability of each; Calculates yes or no probability; Tells about risk]

Fig 5: Implementation of Naïve Bayes Algorithm on the patient data [35]

In Naïve Bayesian Classification Approach in Health Care Application, Bhuvaneswari R. [6] proposed that Naïve Bayes classification can be used as the best decision support system. In [15], Kavitha K.S. used hybridization to train a neural network using a Genetic Algorithm; a feed-forward neural network is used, with back propagation as the learning algorithm. Chen A.H. [10] used an Artificial Neural Network (ANN) algorithm for classifying heart disease based on the input. Learning Vector Quantization (LVQ) is a prototype-based supervised classification algorithm. The accuracy is 80%, the sensitivity 85% and the specificity 70%. An ROC curve is also displayed to demonstrate the goodness of the model. Figure 6 shows the computational steps of the LVQ model.

Figure 6: The Computational Steps of LVQ Model

In [9], Chaitrali S. Dangare and Sulabha added two more attributes, obesity and smoking, to the existing 13 attributes. Decision Tree, Naïve Bayes and Artificial Neural Network were tested with both 13 and 15 attributes. In both cases, the neural network showed better performance. Figure 7 shows the graphical representation of accuracy for each method.

Figure 7: Graphical Representation of Accuracy for each method [9]
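The studies in this section compare classifiers by measures such as accuracy, sensitivity and specificity. As an illustration, these measures can be computed from the four cells of a confusion matrix. The counts below are hypothetical: they were chosen only so that the ratios match the 80% accuracy, 85% sensitivity and 70% specificity quoted for [10], and they are not taken from that paper.

```python
def evaluate(tp, fp, tn, fn):
    """Standard measures derived from a binary confusion matrix.

    tp/fn: sick patients classified correctly/incorrectly,
    tn/fp: healthy patients classified correctly/incorrectly.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    return {
        "accuracy": accuracy,
        "error_rate": 1.0 - accuracy,
        "sensitivity": tp / (tp + fn),          # also the true positive rate
        "specificity": tn / (tn + fp),
        "false_positive_rate": fp / (fp + tn),  # equals 1 - specificity
    }


# Hypothetical counts: 200 sick and 100 healthy patients.
metrics = evaluate(tp=170, fp=30, tn=70, fn=30)
```

Because sensitivity and specificity are computed per class, a classifier can reach a high accuracy on an unbalanced data set while still missing many sick patients, which is why the comparisons report more than one measure.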

Clustering, Time Series and Association Rules can also be used for data mining classification. Text mining can be used to mine the huge amount of unstructured data in future.

In [20], Milan Kumari compares RIPPER, SVM, Decision Tree and Artificial Neural Network on the basis of Sensitivity, Specificity, Accuracy, Error Rate, True Positive Rate and False Positive Rate. SVM predicts with the lowest error rate and the highest accuracy compared with the other techniques [19]. Future work is expected to be done using meta models.

In [23], a feed-forward back propagation neural network is used as a classifier to distinguish between infected and non-infected persons in both cases. The neural network toolbox of Matlab 7.9 is used to evaluate the performance of the proposed network.

The Levenberg-Marquardt back propagation algorithm is used to train the network. The neural network can be used for identifying patients suffering from heart disease.

A Novel Approach for Heart Disease Diagnosis using Data Mining and Fuzzy Logic [21] reduces the number of attributes in order to reduce the number of tests required of patients. Decision Tree and Naïve Bayes combined with fuzzy logic outperformed other data mining techniques in predicting heart disease.

In [32], Shouman et al. used K-Nearest Neighbor (K-NN) for the classification problem. Integrating voting with K-NN does not enhance its accuracy in diagnosing heart disease. When voting is applied to a Decision Tree, different decision rules are extracted from each subset, and this new knowledge leads to increased accuracy for the Decision Tree. In K-NN, however, no new knowledge is extracted; only the distance between clusters is measured. K-NN achieved an accuracy of 97.4% without voting, and applying voting to K-NN decreases the accuracy.

C. Data Mining and Clustering

In [4], Bala Sundar et al. used the K-means clustering algorithm for the prediction of heart disease. K-means is a method of cluster analysis that aims to partition n observations into k clusters. The Euclidean distance formula is used to minimize the sum of squared distances between the data points and their cluster centres. When compared in terms of time and accuracy, Naïve Bayes is found to be slow, and the Neural Network takes a number of iterations to predict the disease.
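The partitioning described above can be sketched in a few lines. This is a minimal K-means illustration, assuming numeric feature vectors and taking the first k points as the initial centres (a simplification made here for reproducibility; it is not the initialisation used in [4]). The six two-dimensional points are hypothetical.

```python
import math


def kmeans(points, k, max_iterations=100):
    """Alternate between assigning points to the nearest centre (Euclidean
    distance) and moving each centre to the mean of its assigned points."""
    centroids = [tuple(p) for p in points[:k]]   # assumption: first k points
    assignment = None
    for _ in range(max_iterations):
        new_assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_assignment == assignment:         # converged: nothing moved
            break
        assignment = new_assignment
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:                          # keep old centre if cluster is empty
                centroids[c] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
    return centroids, assignment


# Hypothetical, already scaled two-attribute observations.
points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (2.0, 1.0), (8.0, 9.0), (9.0, 8.0)]
centroids, assignment = kmeans(points, k=2)
```

On these points the algorithm converges after one update, with one cluster around (1.33, 1.33) and the other around (8.67, 8.67). To use the clusters for prediction, each cluster would still have to be mapped to a class label, for example by the majority class of its members.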

Figure 8: Exponential accuracy of k-means with existing techniques

Figure 8 shows the exponential accuracy of k-means compared with existing techniques [4]. Santhi et al. [28] evaluate the performance of clustering and classification algorithms using heart disease data. The performance of the Bayes, Functions, Lazy, Meta (MultiBoostAB), Rules and Trees classifiers, among others, is calculated. Classification performance is measured using the cross validation test mode, and clusters are evaluated using the classes-to-clusters mode. Data preprocessing is done by an attribute subset selection algorithm. The final result showed that Naïve Bayes trees have higher prediction accuracy than the clustering algorithms.

V. CONCLUSION

Heart disease prediction is a major challenge in the health care industry. Predicting heart disease with a smaller number of attributes, instead of requiring a large number of tests, is a challenging task in data mining. The existing literature shows that classification plays a more vital role in heart disease prediction than clustering, association rules and regression. Within classification, the Decision Tree outperforms in some cases, while the Neural Network and Naïve Bayes outperform in others. Each technique has its own merits and demerits. When the techniques are combined with each other or with fuzzy logic, heart disease prediction with data mining is expected to be most successful with a smaller number of attributes. Text mining of medical data is another extension for prediction from health care data.

REFERENCES

[1] Anbarasi. M, Anupriya and Iyengar, “Enhanced Prediction of Heart Disease with Feature Subset Selection using Genetic Algorithm”, International Journal of Engineering and Technology, Vol 2(10), 2010, pp 5370-5376.

[2] Annoj P.K., “Clinical decision support system: Risk level prediction of heart disease using Data Mining Algorithms”, Journal of King Saud University - Computer and Information Sciences, 2012, pp 27-40.

[3] Asha Rajkumar and Mrs. Sophia Reena, “Diagnosis of Heart Disease using Data Mining Algorithms”, Global Journal of Computer Science and Technology, Vol 10(10), 2010, pp 38-43.

[4] Bala Sundar V, “Development of Data Clustering Algorithm for predicting Heart”, IJCA, Vol 48(7), June 2012, pp 8-13.

[5] Bhagyashree Ambulkar and Vaishali Borkar, “Data Mining in Cloud Computing”, MPGINMC, Recent Trends in Computing, ISSN 0975-8887, 2012, pp 23-26.

[6] Bhuvaneswari. R, “Naïve Bayesian Classification Approach in Health Care Application”, International Journal of Computer Science and Telecommunication, Vol 3(1), Jan 2012, pp 106-112.

[7] Carlos Ordonez, Edward Omincenski and Levien de Braal, “Mining Constraint Association Rules to Predict Heart Disease”, Proceedings of the 2001 IEEE International Conference on Data Mining, IEEE Computer Society, ISBN 0-7695-1119-8, 2001, pp 433-440.

[8] Cengiz colak. M, Cemiz colak and Hasan Kocatruk, “Predicting coronary artery disease using different artificial neural network models”, CAD and Artificial neural network, pp 249-254, 2008.

[9] Chaitrali S. Dangare and Sulabha, “Improved Study of Heart Disease Prediction System using Data Mining Classification Techniques”, IJCA, Vol 47(10), pp 44-48, June 2012.

[10] Chen A.H., “HDPS: Heart Disease Prediction System”, Computing in Cardiology, ISSN 0276-6574, pp 557-560, 2011.

[11] Deepika. N, “Association Rule for Classification of Heart Attack patients”, IJAEST, Vol 11(2), pp 253-257, 2011.

[12] Jabbar M.A., “Knowledge discovery from mining association rules for Heart disease Prediction”, JATIT, Vol 41(2), pp 166-174, 2012.

[13] Jyothi Soni, Uzma ansari and Dipesh Ansari, “Intelligent and Effective Heart Disease Prediction System using Weighted Associate Classifier”, IJCSE, Vol 3(6), pp 2385-2392, June 2011.

[14] K. Rajeswari, “Prediction of Risk Score for Heart Disease in India using Machine Intelligence”, IPCSIT, Vol 4, 2011.

[15] Kavitha K.S, “Modeling and designing of evolutionary neural network for heart disease prediction”, IJCSI, Vol 7(5), pp 272-283, September 2010.

[16] Latha Parthiban and R. Subramanian, “Intelligent Heart Disease Prediction System using CANFIS and Genetic Algorithm”, International Journal of Biological and Life Sciences, Vol 3(3), pp 157-160, 2007.

[17] Liangxiao. J, Harry. Z, Zhihua. C and Jiang. S, “One Dependency Augmented Naïve Bayes”, ADMA, pp 186-194, 2005.

[18] Mia Shouman, “Using data mining techniques in heart disease diagnosis and treatment”, 978-1-4673-0483-2, Japan-Egypt Conference on Electronics, Communications and Computers, pp 189-193, 2012.

[19] Milan Kumari and Sunila Godara, “Review of Data Mining Classification Model in Cardio Vascular Disease diagnosis”, IJCA, 2011.

[20] Milan Kumari and Sunila Godara, “Comparative Study of Data Mining Classification Methods in Cardio-Vascular Disease Prediction”, IJCST, Vol 2(2), June 2011.

[21] Nidhi Bhatia and Kiran Jyothi, “A Novel Approach for heart disease diagnosis using Data Mining and Fuzzy logic”, IJCA, Vol 54(17), pp 16-21, September 2012.

[22] Nithya N.S, Sarumathi. S and Dr. Duraisamy. K, “Assessment of the risk factors of Heart Attack using frequent feature Selection Method”, International Journal of Communications and Engineering, Vol 1(1), ISSN 0988-0382, pp 127-133, March 2012.

[23] Qeethara Kadhim Al. Shayea, “Artificial neural network in Medical Diagnosis”, IJCSI, Vol 3(2), March 2011.

[24] R. Setthukkarase and Kannan, “An Intelligent System for mining Temporal rules in Clinical database using Fuzzy neural network”, European Journal of Scientific Research, ISSN 1450-216, Vol 70(3), pp 386-395, 2012.

[25] Rafiah Awang and Palaniappan. S, “Intelligent Heart Disease Prediction System Using Data Mining techniques”, IJCSNS, Vol 8(8), pp 343-350, Aug 2008.

[26] Rafiah Awang and Palaniappan. S, “Web based Heart Disease Decision Support System using Data Mining Classification Modeling techniques”, Proceedings of iiWAS, pp 177-187, 2007.

[27] Raghu. D. Dr, “Probability Based Heart Disease Prediction using Data Mining Techniques”, IJCST, Vol 2(4), pp 66-68, Dec 2011.

[28] Santhi. P, “Improving the performance of Data Mining Algorithm in Health Care data”, IJCST, Vol 2(3), 2011.

[29] Setiawan N.A, “Rule Selection for Coronary Artery Disease Diagnosis Based on Rough Set”, International Journal of Recent Trends in Engineering, Vol 2(5), pp 198-202, Dec 2009.

[30] Shantakumar B. Patil, “Intelligent and Effective Heart Attack Prediction System using Data Mining and Artificial Neural Network”, European Journal of Scientific Research, Vol 31(4), pp 642-656, 2009.

[31] Shanthakumar B. Patil, “Extraction of Significant patterns from Heart Disease Ware Houses for Heart Attack Prediction”, IJCSNS, Vol 9(2), pp 228-235, Feb 2009.

[32] Shouman. M, Turner. T and Stocker. R, “Applying K-Nearest Neighbour in diagnosing Heart Disease Patients”, International Journal of Information and Education Technology, Vol 2(3), June 2012.

[33] Siri Krishnan Wasan, Vasutha Bhatnagar and Harleen Kaur, “The Impact of Data Mining techniques on medical diagnostics”, Data Science Journal, Vol 5(19), pp 119-126, October 2006.

[34] Srinivas, Kavitha Rani and Dr. Govarthan, “Application of Data Mining Techniques in Health Care and Prediction of Heart Attack”, IJCSE, Vol 2(2), pp 250-255, 2010.

[35] Subbulakshmi, Ramesh and Chinna Rao, “Decision Support in Heart Disease Prediction System using Naïve Bayes”, IJCSE, ISSN 0976-5166, Vol 2(2), May 2011.

[36] Sudha. A, Gayathri. P and Jaishankar. N, “Utilization of Data Mining Approaches for prediction of life Threatening Disease Survivability”, IJAC (0975-8887), Vol 14(17), March 2012.

[37] Jyothi. S, Ujma. A, Dipesh. S and Sunita. S, “Predictive Data Mining for Medical Diagnosis: An Overview of Heart Disease Prediction”, IJCA, Vol 17(8), pp 43-48, March 2011.

[38] Usha. K Dr, “Analysis of Heart Disease Dataset using Neural network approach”, IJDKP, Vol 1(5), Sep 2011.