Analysis of Heart Disease Using in Data Mining Tools Orange and Weka by Sarangam Kodati & Dr. R. Vivekanandam
Global Journal of Computer Science and Technology: C Software & Data Engineering
Volume 18 Issue 1 Version 1.0 Year 2018
Type: Double Blind Peer Reviewed International Research Journal
Publisher: Global Journals
Online ISSN: 0975-4172 & Print ISSN: 0975-4350

Analysis of Heart Disease using in Data Mining Tools Orange and Weka
By Sarangam Kodati & Dr. R. Vivekanandam
Sri Satya Sai University

Abstract- Health care is an inevitable part of human life, and the health care business has become a notable field within the wide area of medical science. The health care industry holds a large amount of data and hidden information, and effective decisions can be made from this hidden information. Detecting heart disease requires tests on the patient; however, with data mining these tests could be reduced. There is a lack of analysis tools that provide effective test outcomes together with the hidden information, so a system is developed using data mining algorithms to classify the data and detect heart disease. Data mining acts as a solution to many healthcare problems. Naïve Bayes, SVM, Random Forest, and KNN are data mining methods that assist in the diagnosis of heart disease patients. This paper analyzes a few parameters and predicts heart disease, and thereby suggests a heart disease prediction system (HDPS) based entirely on data mining approaches.

Keywords: data mining, weka, orange, heart disease, data mining classification techniques.
GJCST-C Classification: J.3

© 2018. Sarangam Kodati & Dr. R. Vivekanandam.
This is a research/review paper, distributed under the terms of the Creative Commons Attribution-Noncommercial 3.0 Unported License (http://creativecommons.org/licenses/by-nc/3.0/), permitting all non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Sarangam Kodati α & Dr. R. Vivekanandam σ

Author α: Research Scholar, Department of Computer Science and Engineering, Sri Satya Sai University of Technology and Medical Science, Sehore, Bhopal, Madhya Pradesh, India.
Author σ: Professor, Director in Muthayammal Engineering College, Namakkal, India.

I. Data Mining

Data mining is concerned with computationally extracting unknown knowledge from vast sets of data. Extracting useful knowledge from enormous data sets and providing decision-making results for the diagnosis or remedy of diseases is very important. Data mining can be used to extract knowledge by analyzing and predicting some diseases. Health care data mining has a large potential to discover the hidden patterns in the data sets of the medical domain. Various data mining methods are available, and their suitability depends on the healthcare data. Data mining applications in health care can have great potential and effectiveness, because data mining automates the process of finding predictive information in large databases.

Disease prediction plays an important role in data mining. Detecting heart disease requires performing some tests on the patient. However, the use of data mining techniques can reduce the number of tests, and this reduced test set plays a significant role in performance and time. Health care data mining is an important task because it allows doctors to see which attributes are more important for diagnosis, such as age, weight, symptoms, etc. This helps doctors diagnose the disease more efficiently.

Knowledge discovery in databases (KDD) is the method of finding useful information and patterns in data. Knowledge discovery in databases can be done using data mining, which uses algorithms to extract the information and patterns derived by the knowledge discovery in databases process. The stages of the knowledge discovery in databases process are highlighted in Fig. 1.

Fig. 1: KDD Process

The stages of the knowledge discovery in databases process are described as follows. In the selection stage, data is obtained from the different data resources. In the preprocessing stage, unwanted, missing, and noisy data is removed and clean data is provided, which is converted to a common format in the transformation stage. Data mining techniques are then applied to get the desired output. Finally, in the interpretation stage, the result is presented to the end user in a meaningful manner.

II. Data Mining Techniques

The most frequently used data mining techniques are specified below:
a) Classification learning: The learning algorithm takes a set of classified examples (the training set) and uses it for training. With the trained algorithm, the test data is classified based on the patterns and rules extracted from the training set.
b) Numeric prediction: This is a variant of classification learning, with the exception that the outcome is a numeric value instead of a discrete class.
c) Association rule mining: The associations and patterns between attributes are extracted, and rules are created from them. The rules and patterns are used to predict the categories or classification of the test data.
d) Clustering: Similar instances are grouped into clusters. The drawback of this type of machine learning is that we have to first identify the clusters and then assign a new instance to these clusters [8].

Out of these four types of learning methods, we need to identify the algorithm that performs better. The application of data mining methods depends on the types of data that are fit to be used with the techniques; that is, solving data mining problems depends on the types of data to be used and on selecting the data mining technique that is most suitable for that data.

III. Machine Learning

Machine learning (ML), employed as a method in data science, is the process of programming computers to learn from past experience (Mitchell, 1997). Machine learning seeks to develop algorithms that learn from data directly with little or no human intervention. Machine learning algorithms perform a range of tasks such as prediction, classification, and decision making. Machine learning stems from artificial intelligence research and has become an essential aspect of data science. Machine learning begins with a training data set as input. In this phase, the machine learning algorithm [...]

[...] available for modification. Two examples of open source licenses are the GPL, or General Public License (GNU.org, 2015a), and GNU (GNU.org, 2015b). Anyone can develop extensions or customizations of open source software; however, charging a fee for such activities is typically prohibited by a public license agreement whereby any modifications to the source code automatically become public domain. Communities emerge around software, with developers worldwide extending open source software.

V. Heart Diseases

The highest mortality in both India and abroad is due to heart disease, so it is vital to check this death toll by correctly identifying the disease at an initial stage. The matter is a concern for all medical doctors both in India and abroad. Nowadays doctors are adopting many scientific technologies and methodologies for identifying and diagnosing not only common diseases but also many fatal diseases. Successful treatment is always attributed to right and accurate diagnosis. Doctors may sometimes fail to take accurate decisions while diagnosing the heart disease of a patient; therefore, heart disease prediction systems that use machine learning algorithms assist in such cases to get accurate results [1].

VI. Heart Disease Dataset

The dataset used for this work is the Cleveland heart disease dataset from the UCI Machine Learning repository. The dataset has 303 instances and 76 attributes. However, only 14 attributes