Intelligent Prognosis of Coronary Artery Disease Excluding Angiogram in Patient with Stable Angina
Total Page:16
File Type:pdf, Size:1020Kb
International Journal of Innovative Technology and Exploring Engineering (IJITEE) ISSN: 2278-3075, Volume-9 Issue-5, March 2020 IProCAD: Intelligent Prognosis of Coronary Artery Disease Excluding Angiogram in Patient with Stable Angina Md. Shah Jalal Jamil, A. K. M. Muzahidul Islam, Bulbul Ahamed, Mohammad Nurul Huda Abstract: Cardiovascular diseases are one of the main causes of I. INTRODUCTION mortality in the world. A proper prediction mechanism system with reasonable cost can significantly reduce this death toll in the low-income countries like Bangladesh. For those countries we At present, more than seventeen (17) million people die propose machine learning backed embedded system that can globally every year by cardiovascular diseases, out of them predict possible cardiac attack effectively by excluding the high 30% die due to the cardiac attack and 80% of them happen in cost angiogram and incorporating only twelve (12) low cost the under developing or low-income countries [1]. features which are age, sex, chest pain, blood pressure, Bangladesh is one of them despite the exact number of cholesterol, blood sugar, ECG results, heart rate, exercise induced angina, old peak, slope, and history of heart disease. Here, two incidences of the disease in Bangladesh is not known. But the heart disease datasets of own built NICVD (National Institute of identification of heart disease depends on costly and Cardiovascular Disease, Bangladesh) patients’, and UCI composite of biomedical, pathological and clinical data [2]. (University of California Irvin) are used. The overall process In the low-income countries, most of the people avoid comprises into four phases: Comprehensive literature review, medical checkup because of ignorance and awareness due to collection of stable angina patients’ data through survey unaffordability of costly diagnosis for example, angiogram. questionnaires from NICVD, feature vector dimensionality is Besides, heart disease or coronary artery disease is most reduced manually (from 14 to 12 dimensions), and the reduced common and hazardous among all the diseases universally feature vector is fed to machine learning based classifiers to obtain a prediction model for the heart disease. From the because of blockage of blood stream to the brain. Moreover, experiments, it is observed that the proposed investigation using the angiogram is costly, complex, time consuming, and also NICVD patient’s data with 12 features without incorporating not available in rural area especially in Bangladesh. angiographic disease status to Artificial Neural Network (ANN) Therefore, an intelligent system that predicts heart status by shows better classification accuracy of 92.80% compared to the measuring simple information of low cost is inevitable, which other classifiers Decision Tree (82.50%), Naïve Bayes (85%), assists medical practitioners or non-cardiologists. Support Vector Machine (SVM) (75%), Logistic Regression For intelligent prediction system it is needed to integrate (77.50%), and Random Forest (75%) using the 10-fold cross supervised (Artificial Neural Network, Support Vector validation. To accommodate small scale training and test data in our experimental environment we have observed the accuracy of Machine, Naïve Bayesian Classifier, Decision Tree, Logistic ANN, Decision Tree, Naïve Bayes, SVM, Logistic Regression and regression, Random Forest) or unsupervised machine Random Forest using Jackknife method, which are 84.80%, 71%, learning (ML) algorithms may be used to detect the status of 75.10%, 75%, 75.33% and 71.42% respectively. On the other cardiac disease based on the feature of the clinical data [3], hand, the classification accuracies of the corresponding [4], [5]. classifiers are 91.7%, 76.90%, 86.50%, 76.3%, 67.0% and 67.3%, respectively for the UCI dataset with 12 attributes. Whereas the same dataset with 14 attributes including angiographic status shows the accuracies 93.5%, 76.7%, 86.50%, 76.8%, 67.7% and 69.6% for the respective classifiers. Keywords: Machine learning, Data mining, Coronary artery disease, Artificial neural network, Heart disease. Revised Manuscript Received on March 05, 2020. * Correspondence Author Md. Shah Jalal Jamil*, Department of Computer Science and Engineering, United International University, Dhaka, Bangladesh. E-mail: [email protected], Mob: +880 1718287450. A.K.M. Muzahidul Islam, Department of Computer Science and Engineering, United International University, Dhaka, Bangladesh. E-mail: [email protected] Fig. 1. Overview of this research. Bulbul Ahamed, Department of Computer Science and Engineering, Sonargaon University, Dhaka, Bangladesh. E-mail: [email protected]. Mohammad Nurul Huda, Department of Computer Science and Engineering, United International University, Dhaka, Bangladesh. E-mail: [email protected]. Published By: Retrieval Number: E3101039520/2020©BEIESP Blue Eyes Intelligence Engineering DOI: 10.35940/ijitee.E3101.039520 2032 & Sciences Publication IProCAD: Intelligent Prognosis of Coronary Artery Disease Excluding Angiogram in Patient with Stable Angina In this study the proposed system suggests a low-cost Sellappan ANN, 15 A prototype-based prediction of cardiac status that comprises three stages where Palaniappan Naïve system, where Neural and Bayes and Network is a lower we collect dataset through survey questionnaires from RafiahAwang DT performer than Naïve National Institute of Cardiovascular Disease (NICVD), [12] Bayes and Decision Bangladesh [6] to construct corpus at first stage. At second Trees. stage we have reduced the noisy data that are integrated in the K. Srinivas, Decision 15 ANN is a low performer survey and the dimension of the data from standard 14 Dr.G.Raghaven Trees, than Decision Tree and draRao, and Dr. Naïve Naive Bayes. features to 12 features by excluding the angiographic status. A.Govardhan Bayes and Finally, we have injected 12-dimensional features vector (age, [13] Neural sex, chest pain, blood pressure, cholesterol, blood sugar, ECG Network results, heart rate, exercise induced angina, old peak, slope, Yanwei Xing, SVM, 11 They have used survival JieWang, ANN, and data for 11 attributes and history of heart disease or Stenosis) to machine learning Zhihong Zhao, Decision (e.g. TNF, IL6, IL8, tools to predict the cardiac status. The overall procedure of and Tree HICRP, MPO1, TNI2, our research is depicted in the “Fig. 1”. YonghongGao Sex, Age, Smoke, The originalities of the research are to construct the corpus [14] Hypertension, and Diabetes) as any using the Bangladeshi patients’ data from survey and to occurrence of CHD. reduce the dimensionality of features vector by excluding the HninWintKhain K-means 14 The author has proposed high cost angiographic status. Although the data collection g [16] clustering clustering based MAFIA process is inconvenient and risky, it seems that we are the first (Maximal Frequent Itemset to collect heart disease data in patient with stable angina in Algorithm) approach, Bangladesh so far. but more efficient The paper is organized as follows. Sections II explains prediction system has not literature review. The methodology of this research has been considered. explicated in Section III with diagrams, whereas Sections IV Atul Kumar Decision 14 In this model have used Pandey, Tree different approaches of and V analyze the experiments with corpus collection Prabhat (pruned, Decision Tree, where procedure and set up, and results. Finally, Sections VI Pandey,K.L un-pruned, fasting blood sugar is not concludes the paper, and Section VII highlights the Jaiswal, and and provided good accuracy. limitations with future remarks. Ashish Kumar reduced Sen [19] error pruning II. LITERATURE REVIEW approach) Randa El-Bialy, Fast 14 The authors have In this section, we have reviewed some related works given Mostafa A. decision observed different in the Table-I with architectures (authors name with Salamay, Omar tree and datasets with 14 reference, classifiers name, number of features in input H. Karam, and pruned attributes. vector) and comments, where most of the authors used UCI M Essam C4.5 tree Khalifa, [24] dataset [7] with 14 features by incorporating Fuzzy Logic, Mai Shouman, K-Nearest 13 The authors have Genetic Algorithms and various ML algorithms. Tim Turner, and Neighborh implemented only KNN Table- I: Reviewed some related works with analysis. Rob Stocker ood (KNN) algorithm with voting [26] and without voting Authors Name Classifiers Input Comments approach, but have not & Reference Vector considered larger dataset No. and efficiency. E. P. Ephzibah Genetic 6 The system can only Markos G. Decision 19 This study has and Dr. V. Algorithm implement the rules and Tsipouras, Tree and implemented for Sundarapandia and Fuzzy cannot learn as it goes Dimitrios I. Fuzzy significant or high-risk n [8] Expert along. They have not Fotiadis, Model CAD patients, not for all. System considered important Katerina K. attributes. Naka, and MoloudAbdar, C5.0, 13 They have observed that Lampros K. Sharareh R. Neural decision tree is an Michalis [27] NiakanKalhori, Network, outperformer, but the ToleSutikno, SVM, and accuracy result of ANN Imam Much K-Nearest is very low. Currently, ML is rapidly growing attention to the research IbnuSubroto, Neighborh community, where ML based method can produce reliable and GoliArji [9] ood (KNN) results, learn from earlier pattern and predict coronary heart Asha Rajkumar, KNN, 14 The accuracy (below disease status. We have mentioned various related ML and G. Sophia