Machine Intelligence in Healthcare and Medical Cyber Physical Systems: a Survey
Total Page:16
File Type:pdf, Size:1020Kb
Received June 24, 2018, accepted August 10, 2018, date of publication August 20, 2018, date of current version September 7, 2018. Digital Object Identifier 10.1109/ACCESS.2018.2866049 Machine Intelligence in Healthcare and Medical Cyber Physical Systems: A Survey OMID RAJABI SHISHVAN , (Student Member, IEEE), DAPHNEY-STAVROULA ZOIS , (Member, IEEE), AND TOLGA SOYATA , (Senior Member, IEEE) The authors are with the Department of Electrical and Computer Engineering, SUNY Albany, Albany, NY 12203, USA Corresponding author: Tolga Soyata ([email protected]) ABSTRACT Today, the US healthcare industry alone can save $300 B per year by using machine intelligence to analyze a rich set of existing medical data; results from these analyses can lead to breakthroughs such as more accurate medical diagnoses, discovery of new cures for diseases, and cost savings in the patient admission process at healthcare organizations. Because healthcare applications intrinsically imply a vast amount of data, the execution of any algorithm on medical data is computationally intensive. Significant advancements made in computational power in the past decade have provided the opportunity for many researchers to successfully implement various machine intelligence-based healthcare applications, which didn't run efficiently on earlier computational platforms. In this paper, we provide a survey of machine intelligence algorithms within the context of healthcare applications; our survey includes a comprehensive list of the most commonly used computational models and algorithms. We view the application of these algorithms in multiple steps, namely, data acquisition, feature extraction, and aggregation, modeling, algorithm training, and algorithm execution and provide details—as well as representative case studies— for each step. We provide a set of metrics that are used to evaluate modeling and algorithmic performance, which facilitate the comparison of the presented models and algorithms. Medical cyber-physical systems are presented as an emerging application case study of machine intelligence in healthcare. We conclude our paper by providing a list of opportunities and challenges for incorporating machine intelligence in healthcare applications and provide an extensive list of tools and databases to help other researchers. INDEX TERMS Healthcare applications, medical decision support, machine intelligence, statistical signal processing, machine learning, data mining, feature selection. I. INTRODUCTION ability to ``learn'' input-output correlations and create an Improving healthcare by taking advantage of today's technol- approximation to describe their relationship; they work on ogy is a global interest; based on a 2011 report by McKinsey existing data and can find either patterns in the data or ana- Global Institute [1], it is estimated that the US healthcare lytical relations between the input and output data. alone can save 300 billion dollars annually by analyzing the Successful application of a variety of machine intelligence large corpus of healthcare data that has been accumulating for algorithms to medical data can revolutionize a wide range decades. This corpus of data includes patient health records of medical applications by improving diagnosis accuracy, (either in written or digital format) and past diagnoses and determining its cause and course of disease, and formulating outcomes [2], [3]. Despite the availability of this large vol- an effective treatment for diseases, among others. For exam- ume of data, the amount of time it takes to process it —to ple, by training an algorithm on multiple images of a certain make practical inferences— is not a trivial task either for medical condition from a set of prior patients, the existence of computers or healthcare professionals [4]. A set of algorithms that same medical condition can be detected in a new patient designed to find statistical connections among events and by using the new patient's image as the input to the algorithm. results efficiently has been the focus of research in both A remote health monitoring system consisting of body-worn electrical engineering and computer science disciplines for sensors [5] that acquire data from a patient, transmit to the decades. These algorithms are mathematical tools with the cloud, and process it using machine intelligence can improve 2169-3536 2018 IEEE. Translations and content mining are permitted for academic research only. VOLUME 6, 2018 Personal use is also permitted, but republication/redistribution requires IEEE permission. 46419 See http://www.ieee.org/publications_standards/publications/rights/index.html for more information. O. R. Shishvan et al.: Machine Intelligence in Healthcare and MCPSs: Survey diagnostic accuracy, reduce healthcare costs due to the reduc- We provide a list of challenges and opportunities in tion in the time for in-patient care [6], and introduce new Section XIII to highlight the potential future applications of business opportunities [7]. machine intelligence algorithms in the healthcare domain. For a majority of the algorithms, the prediction accuracy Our concluding remarks are provided in Section XIV. depends on an initial ``training'' phase, where the algorithm In Appendixes A and B, we categorically enumerate a list of learns how to predict the output based on a set of training data, publicly-available medical databases and tools, respectively, which includes known input-output pairs. This is in contrast to aid readers of this paper in finding readily-accessible data to sets of algorithms that do not need any training, because and tools for use in their research. their intended purpose is the discovery of patterns within the data, without designating any of the data as ``input'' or ``out- II. USING MACHINE INTELLIGENCE IN HEALTHCARE put'' to the algorithm [8]. The concepts of ``training'' and APPLICATIONS ``pattern recognition'' provide two counteracting forces in In this section, we will discuss six specific healthcare applica- using machine intelligence in healthcare; while an increase tion categories in Sections II-A through II-F, which represent in the amount of available training data increases the demand the majority of existing applications that can benefit from the for computational resources, a decrease causes a reduction usage of machine intelligence. in algorithmic accuracy, despite a reduction in demand for A. CLINICAL DIAGNOSIS computational resources. Because the priority for medical In clinical diagnosis, machine intelligence algorithms recog- applications is typically algorithmic accuracy, it is reasonable nize the existence of a symptom or a specific health con- to expect that future healthcare applications that incorporate dition in a patient, which enables real-time monitoring of machine intelligence will utilize large datacenters, potentially patients via the analysis of large datasets that may include rented from cloud operators such as Amazon EC2 [9]. 3D images and long-term signal recordings. Following stud- Before selecting a machine intelligence algorithm, the first ies apply machine intelligence to Electrocardiogram (ECG) step is generally the design of a model in which the form of recordings of patients to detect the existence health hazards: the input and output values are specified, as well as a tolerable Hijazi et al. [10] determine their likelihood of having two accuracy. The choice of algorithm depends largely on the types of arrhythmias (LQT1 and LQT2) and achieve a ≥ type of data, complexity of the model, and the goal. Since 70% accuracy. Thakor and Zhu [11] take a different approach checking and understanding the decision process of the model for detecting multiple types of arrhythmias by first remov- is sometimes of interest, interpretability of the model for ing the noise in ECG recordings and using adaptive filters humans is another issue. Hence, the models should show the and the least mean square algorithm. Bsoul et al. [12] use relation between the variables and how each variable affect single channel ECG recordings to detect obstructive sleep the decision process with ease. apnea (OSA), thereby eliminating the need for a full sleep The remainder of this paper is organized as follows: study and achieve accuracies as high as ≈90% in some cases. In Section II, we categorically list a set of healthcare applica- The following two studies demonstrate the diagnosis of tions that can benefit from machine intelligence and provide cancer using machine intelligence. Mousavi et al. [13] iden- an overview of the algorithmic flow in these applications. tify glioma (a type of cancer that starts in the brain) in In Section III, we provide a conceptual architecture of how histopathological images of patients and classify the exis- machine intelligence algorithms are used in these applica- tence of glioma into one of two known categories: low-grade tions. We study the challenges related to the acquisition of glioma and high-grade glioma. They achieve up to an 88% different types of medical data in Section IV and describe how in recognition rate. The study presented in [14] investigates raw data can be turned into feature vectors and algorithms detecting the existence of melanoma using features such as to select the most useful set of these features in Section V. texture and color identify normal/abnormal skin regions. In Section VI, we study the metrics that are used to com- The following studies detect the existence of mental disor- pare the performance of different models and algorithms. ders by using