DEVELOPING A MACHINE LEARNING FRAMEWORK FOR 24-HOUR DATA ANALYSIS AIMED AT EARLY DETECTION OF CARDIAC ARRHYTHMIAS AS A GUIDING TOOL FOR PHYSICIANS

by A. YASHAR TASHAKKOR

M.D., University of British Columbia, 2010

Thesis Submitted in Partial Fulfillment of the Requirements for the Degree of Master of Applied Science

in the School of Engineering Science Faculty of Applied Sciences

A. YASHAR TASHAKKOR 2019 SIMON FRASER UNIVERSITY Summer 2019

Copyright in this work rests with the author. Please ensure that any reproduction or re-use is done in accordance with the relevant national copyright legislation.

Approval

Name: A. YASHAR TASHAKKOR

Degree: Master of Applied Science

Title: DEVELOPING A MACHINE LEARNING FRAMEWORK FOR 24-HOUR DATA ANALYSIS AIMED AT EARLY DETECTION OF CARDIAC ARRHYTHMIAS AS A GUIDING TOOL FOR PHYSICIANS

Examining Committee:

Chair: Michael Sjoersdma Senior Lecturer

Andrew Rawicz Senior Supervisor Professor

Craig Scratchley Supervisor Senior Lecturer

Ash Parameswaran Internal Examiner Professor

Date Defended: May 20th, 2019

Abstract

Cardiovascular diseases (CVD), defined as a spectrum of disorders primarily impacting the heart and the circulatory system, account for a substantial fraction of worldwide morbidity and mortality. Electrocardiograms (ECGs) are routinely implemented in a patient’s diagnosis, both in hospitals and outpatient settings. They serve as one of the primary diagnostic tools as patients encounter medical personnel, particularly in suspected CVD. A cardiac holter monitor is a medical diagnostic device, connected to the patient via several conductive leads placed across the chest, and "worn" on a strap across the shoulder. A holter is applied to record continuous ECG data (typically 24 hours).

With recently emerging applications of Machine Learning (ML) in data analysis techniques, the need for human expertise and the potential for human error could be minimized, and prediction accuracy optimized considerably. Hence, the objective of this research is to develop a machine learning framework to eventually aid physicians with their decisions as a powerful guiding/assisting tool to analyse the ECG information reported by holter monitors. Furthermore, we aim to develop a computer-aided diagnostic system that can assist expert cardiologists by providing intelligent, cost-effective, and time-saving diagnosis.

In this thesis, we implement a deep learning-based solution to analyse readings of the holter monitors. In our proposed solution, we train neural networks to extract high-level features from temporal signal recordings of holter monitors. We present a supervised neural network framework to predict the physician’s final interpretations based on the holter recorded signals. The outputs of the network contain the likelihood of four possible scenarios: Normal and three types of arrhythmia. The high classification performance of the proposed methodology emphasizes the capability of this framework to be used as an assisting tool alongside physicians to interpret holter reports.

Keywords: Holter monitor, Machine learning, ECG, Neural network

Dedication

This humble work is dedicated to my family, most especially my brother, who inspired this academic endeavour. It is also dedicated to Professor Andrew Rawicz, a gentleman and a scholar. I first met Professor Rawicz years ago, quite literally the minute I became a medical doctor; it seems whenever he is around, I somehow inherit extra abbreviations. I sincerely thank you Andrew.

Acknowledgements

It is with immense gratitude that I acknowledge the support and inestimable guidance of Professor Andrew Rawicz on this work. I further owe my deepest gratitude to my committee members, reviewers, colleagues, mentors, and friends, all of whom dedicated valuable hours towards this dissertation.

Table of Contents

Approval
Abstract
Dedication
Acknowledgements
Table of Contents
List of Tables
List of Figures
Acronyms

1 Introduction
1.1 Motivation
1.2 Objectives
1.3 Proposed Solution
1.4 Contributions
1.5 Thesis Outline

2 Background
2.1 Introduction
2.2 Electrocardiogram
2.3 Arrhythmia types
2.4 Heart Rate Variability
2.5 Holter Monitors
2.5.1 Holter System Description

3 Machine Learning Approaches
3.1 Introduction
3.2 Artificial Intelligence
3.3 Supervised Machine Learning
3.4 Deep Learning (DL) and Related Work

4 Methodology
4.1 Ethics
4.2 Materials
4.2.1 Block Diagram
4.3 Data
4.3.1 Data Acquisition
4.3.2 Data Description
4.3.3 Data (ROI) Selection
4.3.4 Labeling
4.4 Preprocessing
4.4.1 Data Augmentation
4.5 Network Architecture
4.5.1 Activation Function
4.5.2 Automatic Feature Learning
4.5.3 Classification Prediction (Labeling)
4.5.4 Classification Framework
4.5.5 Fine Tuning
4.5.6 Optimization
4.5.7 Implementation

5 Results and Verification
5.1 Network Analysis
5.1.1 Error Analysis
5.1.2 Network Accuracy
5.2 Classification Performance
5.3 Test phase
5.3.1 Classification Verification
5.4 Discussion

6 Conclusion and Future work
6.1 Summary of Contributions
6.2 Challenges and Limitations
6.3 Future Work

Bibliography

Appendix

List of Tables

Table 4.1 Network Input: Features derived from holter’s generated report
Table 4.2 Output labels: the predictable Normal condition and arrhythmia types
Table 5.1 Evaluation of network
Table 5.2 Evaluation of network on 47 new patients

List of Figures

Figure 2.1 Schematic representation of an ECG signal. ECG wave forms P, QRS and T waves and standard features extracted from a single cardiac beat [45].
Figure 2.2 Definitions of ECG components and their "normal" duration [48].
Figure 2.3 Atrial Fibrillation pattern on ECG. Note absence of characteristic P, QRS, T pattern seen on a normal ECG. Also note inconsistency in the R-R duration (An ECG representing an Atrial fibrillation waveform.)
Figure 2.4 An ECG illustrating First Degree AV Block. Note prolonged duration of the PR interval compared to a normal ECG.
Figure 2.5 A vector view of the standard 12 Lead ECG. The frontal leads are light blue and the pre-cordial leads are dark blue [73].
Figure 2.6 a) Einthoven’s triangle, formed by the right arm, left arm and left leg electrodes. These three electrodes form the basis for the frontal axis [66]. b) The frontal axis of the ECG consists of six vectors derived from three electrodes: left arm (L), right arm (R), and left leg (F) [66]. c) The electrodes V1-V6 are placed on the chest roughly in a semilunar line [66].
Figure 2.7 Simplified functional block diagram of a holter monitor [108].
Figure 2.8 Block diagram of a sample holter recorder system [52].
Figure 2.9 a) Circuits of ECG amplifier, high- and low-pass filters [52]. b) Suggested R-peak detection circuit [52].

Figure 3.1 Main machine learning methods used for ECG classification. (a) Support vector machine (b) Random forest classification using n decision trees (c) Hidden Markov model (d) Neural network with two hidden layers [65].
Figure 3.2 A typical neural network with the working of a single neuron explained separately [17].

Figure 4.1 An illustration of the proposed methodology for automatic arrhythmia classification of holter reports. Main tasks include: a) data collection, b) preprocessing of data and machine learning framework, c) predicting results.
Figure 4.2 Distribution of holter monitor data: a) Gender distribution, b) Age range.
Figure 4.3 ScotCare holter used for data collection throughout the study. a) Chroma2 version with 5 leads, b) Close view of holter monitor [25].
Figure 4.4 Representation of a sample report generated by ScotCare built-in software, holterCareTM.
Figure 4.5 Samples of the physician’s comments on holter reports, representing the diagnosis based on ECG readings; the key words from the selected red boxes are extracted as labels: a) Sample of a Normal reading, b) Sample of Benign, c) Sample of AV nodal block reading, d) Sample of AFB reading.
Figure 4.6 Schematic of the proposed network, nodes and layers connectivity. The number of inputs is chosen based on the available information from the reported holter readings; therefore the input layer contains 10 nodes. The output layer of the network includes four classes to predict: Normal, Benign, AV block, AFB.
Figure 4.7 Network structure representation. The number of inputs and parameters for each layer as well as the type of each layer.
Figure 4.8 Distributions of data set used for training and testing: a) suggested methods for cross validation and testing [56], b) the 5-fold methodology used for testing and training of data.

Figure 5.1 Learning curve: Training and validation losses during training and test phase
Figure 5.2 Learning curve graph: Training and validation losses during training and test phase
Figure 5.3 Accuracy graph indicating categorical accuracy and validation categorical accuracy
Figure 5.4 Categorical confusion matrix: original data set
Figure 5.5 Categorical confusion matrix: Test phase, 47 new patients

Acronyms

CVD Cardiovascular Diseases

WHO World Health Organization

ECG Electrocardiogram

HMM Hidden Markov Models

ML Machine Learning

NSR Normal Sinus Rhythm

AF Atrial Fibrillation

HRV Heart rate variability

AI Artificial Intelligence

CNN Convolutional Neural Networks

AAMI Association for the Advancement of Medical Instrumentation

PVC Premature Ventricular Contraction

ANNs Artificial Neural Networks

CHD Coronary Heart Disease

DCNNs Deep Convolutional Neural Networks

VPB Ventricular Premature Beat or Premature Ventricular Complex

SVPB Supraventricular Premature Beats

PAF Paroxysmal Atrial Fibrillation

Chapter 1

Introduction

1.1 Motivation

Cardiovascular Diseases (CVD), defined as a spectrum of disorders primarily impacting the heart and the circulatory system, account for a substantial fraction of worldwide morbidity and mortality. The World Health Organization (WHO) consistently reports CVD as the leading cause of death globally; an estimated 17.7 million people died from complications of CVD in 2015, representing 31% of deaths worldwide. Of these deaths, an estimated 7.4 million were due to coronary heart disease and 6.7 million were due to stroke [1]. The total number of inpatient cardiovascular operations and procedures increased 28% between 2000 and 2010 in the US, signalling an epidemic [75]. In Canada, heart disease is the second leading cause of death and accounted for over 51,500 deaths in 2015 [19]. Accordingly, early detection of symptoms pertaining to cardiovascular disease is a critical step in optimizing patient outcome.

An Electrocardiogram (ECG) is a graphical representation of the cardiac electrical impulses which propagate as action potentials across the myocardium, leading to contraction and subsequent generation of cardiac output [32]. An ECG captures these electrical signals as relayed on the body surface via 12 conductive electrode leads. ECGs are routinely implemented in a patient’s diagnostic management by clinicians, both in hospitals and outpatient settings. They serve as one of the primary diagnostic tools as patients encounter medical personnel, particularly in suspected CVD. Furthermore, many cardiac structural and electrophysiological pathologies have a universally recognized "signature" on the ECG, and their clinical identification is necessary for diagnosing cardiac disorders. Hence, early detection of the patients at risk and a better understanding of these disease mechanisms are arguably crucial in enhancing patient diagnosis and treatment.

A popular and practical methodology in the study of diseases and the field of bioinformatics, particularly for detection of cardiac arrhythmias, is classification [87]. Cardiac arrhythmias, broadly defined, represent an abnormal pattern of cardiac tracing as detected on ECGs, which in turn signifies an abnormality in the cardiac electrical conduction system. This term encompasses a wide array of pathology, ranging from benign to immediately life threatening. When patients experience a cardiac arrhythmia, they may experience chest pain, shortness of breath, or even no symptoms at all. Certain occasionally silent arrhythmias such as Atrial Fibrillation (AF) remain a major cause of cerebrovascular accidents worldwide. Therefore, early detection of these arrhythmias is an inherent diagnostic necessity.

A holter monitor is a commonly implemented cardiac diagnostic device. Holter monitors are connected to the patient via conductive leads across the chest, and "worn" on a strap across the shoulder. A holter is applied to record ECG data (typically 24 hours), whereas a standard 12-lead ECG records cardiac tracings over ten seconds. Effectively, a holter obtains ECG tracings from a patient over a longer period of time to allow for more sensitive detection of arrhythmias that are frequently missed by an ECG [109]. It further informs physicians of the clinical impact of each arrhythmia on the patient’s functional capacity, as patients are typically instructed to partake in their routine activities while the monitor is connected, which yields more practical data to guide patient management. As an example, a patient might have a completely normal ECG at rest, but might experience detectable arrhythmias while climbing stairs or during sleep.

Modern holter systems allow for documenting beat-to-beat cardiac contractions and rhythmic deviations from the norm. To capture these often infrequent events, classification of ECG signals using machine learning techniques can provide substantial input to interpreting physicians formally confirming the diagnosis. Manual analysis of large volumes of ECG data remains tedious and time-consuming. Classification and detection of arrhythmia types can be a great asset in identifying the abnormality present in the ECG signal of a patient. Identification of the abnormality leads to subsequent diagnosis and treatment initiation, improving overall patient outcome. Therefore, accurate computational methods can maximize and expedite data extraction from a comprehensive ECG data set. Further supporting the need for these computational methods is the variety in ECG formats and the broad clinical applications of a computational technique.

A longitudinal statistical analysis of ECG parameters can yield accurate and prompt solutions for the treatment and diagnosis of observed phenomena, such as T-Wave Alternans [44], Atrial Fibrillation and QTc-prolongation [23]. In addition, proper delineation of ECG waveforms can help to achieve more accurate results in applications such as pattern recognition or arrhythmia clustering and classification [9, 61]. Numerous approaches have been developed with the aim of detecting ECG events, including mathematical models [91], the Hilbert transform and the first derivative [10, 14], multiple higher order moments [36], the second order derivative [72], wavelet transform and filter banks [68], soft computing (neural networks, neuro-fuzzy, genetic algorithm) [53], and Hidden Markov Models (HMM) [28].

Machine learning methods have been utilized to automatically learn the dataset structure in order to make predictions. A survey of classification of ECG signals using machine learning techniques is presented in Jambukia et al. [48]. Different machine learning algorithms have been used for classification of ECG signals in the past [88, 90, 84]. In another work on ECG classification, Pyakillya et al. investigated the task of arrhythmia detection from a single short ECG lead using a deep learning architecture with 1D convolutional layers and FCN layers [82].

1.2 Objectives

Cardiac arrhythmias are remarkably common and routinely go undiagnosed because they are often transient and asymptomatic. Effective diagnosis and treatment can substantially reduce the morbidity and mortality associated with cardiac arrhythmias [12]. Some community public health agencies, for instance municipal public health centers or remote hospitals, are starved for ECG recorders and expert physicians. The holter is the most common diagnostic tool to monitor a patient’s ECG while performing daily activities [37]. Although holter technology has evolved over recent decades, the viability of this test remains of the essence for pertinent medical professionals. Most clinicians agree that there is still no replacement for holter monitoring.

To identify an arrhythmia using ECG data, a physician typically starts by looking at heart rate (beats per minute) and at the duration and pattern of each segment of the wave, as defined by features of wave components such as the width of the QRS, the height of the P wave etc. Hence, our goal here is to mimic a similar procedure with the proposed algorithm.

The emerging use of Machine Learning (ML) data analysis techniques can alleviate the need for human expertise and the possibility of human error while increasing prediction accuracy [11]. Hence, the objective of this research is to develop a machine learning framework to eventually aid physicians with their decisions as a powerful guiding/assisting tool to analyse the ECG information reported by holter monitors. The aim of this study is to develop such a computer-aided diagnostic system, which assists expert cardiologists by providing intelligent, cost-effective and time-saving diagnosis.

1.3 Proposed Solution

Herein, we present an automated method for ECG holter classification which is intended to work with regular, commonly implemented ECG holter recordings (i.e. containing noise and movement artifacts). Machine learning allows us to build systems which learn automatically from data. Traditionally, machine learning approaches to arrhythmia detection have relied on feature engineering: rather than using the raw ECG signal, they apply transformations (such as wavelet transformations) to the signal and then train shallow models on the transformed features.
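To make the traditional feature-engineering route concrete, the following is a minimal sketch and not the method used in this thesis. It applies a single-level Haar wavelet transform to a short signal and summarizes each sub-band by its energy, which is the kind of hand-crafted feature a shallow classifier would then consume. The function names, the choice of the Haar wavelet, the energy features and the toy samples are all assumptions made for illustration.

```python
import math


def haar_dwt(signal):
    """Single-level Haar wavelet transform (illustrative choice of wavelet).

    Returns (approximation, detail) coefficients; the length must be even.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    scale = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    return approx, detail


def subband_energies(signal):
    """Hand-crafted features: energy of the approximation and detail sub-bands."""
    approx, detail = haar_dwt(signal)
    return [sum(a * a for a in approx), sum(d * d for d in detail)]


# Hypothetical toy samples in arbitrary units, not real ECG data.
toy_signal = [0.0, 0.1, 1.0, 0.2, 0.0, -0.1, 0.0, 0.1]
features = subband_energies(toy_signal)
print(features)
```

Because the Haar transform is orthonormal, the two energies add up to the energy of the original samples, which gives a quick sanity check on the transform.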

Here, we present a supervised neural network framework to predict the physician’s final interpretations based on the holter reported values (output). For this purpose we use a Deep Neural Network (DNN). These networks have shown powerful machine learning capabilities, specifically due to their strong accuracy when trained with large amounts of data. The main advantage of using DNN algorithms is their ability to learn high-level features from data in an incremental manner. This reduces the need for domain expertise and manual feature extraction.

Automation of arrhythmia detection opens an array of possibilities: with hundreds of millions of ECGs recorded annually, a high accuracy automated diagnostic tool can save physicians considerable time. Coupled with low-cost ECG devices such as holter monitors, automated detection can provide diagnostic tools in parts of the world where access to skilled physicians is limited, substantially improving scalability of automated ECG analysis.

For this project, we have used 24-hour holter ECG recording data from a large and demographically diverse patient set, attending various medical clinics across Metro Vancouver and the Lower Mainland, to demonstrate the effectiveness of the proposed framework. We propose machine learning algorithms that perform classification tasks solely based on diagnostic features which are known to be of predictive clinical value for patient management.
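The shape of such a classifier can be sketched as a small feed-forward network that maps report-derived features to class probabilities. The sketch below is illustrative only: the 10 inputs and the four output labels (Normal, Benign, AV block, AFB) follow the network description given in the List of Figures, while the single hidden layer, its width, the ReLU activation, the random untrained weights and the example feature vector are assumptions, not the architecture trained in this thesis.

```python
import math
import random

CLASSES = ["Normal", "Benign", "AV block", "AFB"]  # output labels named in the thesis
N_INPUTS = 10  # report-derived input features, as stated in the thesis
N_HIDDEN = 8   # assumed hidden width, for illustration only


def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum per output node."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    return [max(0.0, v) for v in values]


def softmax(values):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_out, n_in, rng):
    """Small random weights; a trained network would learn these from labeled reports."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def predict(features, hidden_layer, output_layer):
    hidden = relu(dense(features, *hidden_layer))
    return softmax(dense(hidden, *output_layer))


rng = random.Random(0)
hidden_layer = init_layer(N_HIDDEN, N_INPUTS, rng)
output_layer = init_layer(len(CLASSES), N_HIDDEN, rng)

# Hypothetical, already-normalized feature vector (not patient data).
example_features = [0.2, 0.5, 0.1, 0.9, 0.3, 0.4, 0.7, 0.6, 0.8, 0.0]
probabilities = predict(example_features, hidden_layer, output_layer)
for label, p in zip(CLASSES, probabilities):
    print(f"{label}: {p:.3f}")
```

The softmax output is what allows the network to report a likelihood for each of the four scenarios rather than a single hard label.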

1.4 Contributions

The work described in this thesis resulted in the following main contributions:

1. In the proposed algorithm, the "input" is a test-specific, software-generated report produced by the holter devices. Therefore, there is no need to have access to raw ECG signal data in the platform. This feature becomes more pragmatic in cases where there is limited or no access to the manufacturer’s raw data from each test. Further, our proposed algorithm can be utilized with essentially any manufactured device, since most software-generated reports by different manufacturers contain the same essential input data required for our algorithm.

2. We developed a specialized multi-layer deep neural network architecture for detecting normal versus three classes of arrhythmia, based on the evaluation and tuning of hyperparameters, that achieves high accuracy. The training accuracy was 93.39% and test accuracy was 91.49%.

3. Our proposed algorithm creates a method with low computational complexity that can be used on mobile devices and cloud computing for tele-, e.g. patient self-monitoring and preventive health.

1.5 Thesis Outline

The thesis is organized as follows:

Chapter 2 reviews the background related to this work: ECG acquisition and holter devices, along with a short introduction to the operation of ECG recording circuits. Common arrhythmias are also described.

Chapter 3 reviews the machine learning background. This chapter starts with the basics of machine learning. We then describe the most common machine learning and deep learning approaches in ECG analysis for diagnosing different CVD.

Chapter 4 describes the details of the proposed methodology. We first discuss issues surrounding ethics and how its requirements are met throughout the data collection. We then describe the data collection process and the nature of the data, and explain the preprocessing steps applied to the medical data. The proposed machine learning algorithm is described next, followed by the explication of the proposed network architecture.

Chapter 5 presents the results of the proposed network, where we illustrate different graphs and tables that demonstrate the performance of the network. In addition, we demonstrate the application of the current network on new patients to evaluate the network’s performance on a previously unseen dataset.

Chapter 6 summarizes and concludes the thesis and outlines potential future work.

Chapter 2

Background

2.1 Introduction

In this chapter we present some background on ECG acquisition and related devices, along with a short introduction to how these systems work. Some common arrhythmias are also described in this chapter.

2.2 Electrocardiogram

An electrocardiogram (ECG) measures the electric activity of the heart and is an essential tool for detecting heart diseases, in part due to its simplicity and non-invasive nature [64]. An ECG rhythm segment comprises universally defined P, QRS and T waveforms demonstrating the electrical conduction activity of the heart [85]. It has been widely used in various applications as a non-invasive and well established diagnostic technique [20, 71]. It represents the changes of the electrical activity of the heart over time and allows for interpreting essential cardiac physiology [86]. Figure 2.1 depicts a standard 12-lead ECG signal and its normal respective components [22]. The RR interval is measured as the peak-to-peak interval between two consecutive QRS complexes, the PR interval is defined as the duration from the beginning of the P wave to the beginning of the QRS complex, the QRS duration (or width) is the duration between the start and the end of the QRS complex, the QRS amplitude is defined as the absolute value of the difference between the minimum and the maximum of the QRS complex, and the QT interval is measured as the time between the beginning of the Q wave and the end of the T wave [65]. ECG components and their "normal" duration are described in Figure 2.2; depending on the clinical application, there exist various interpretations of these components for the purpose of diagnosis [48].
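The interval definitions above translate directly into arithmetic on the fiducial points of each beat. The following is a minimal sketch under the assumption that the onsets, peaks and offsets have already been located by some delineation step; the `Beat` structure, its field names and the example timings are hypothetical and are not taken from the holter software.

```python
from dataclasses import dataclass


@dataclass
class Beat:
    """Fiducial points of one beat (hypothetical annotation format)."""
    p_onset: float     # seconds
    qrs_onset: float   # seconds
    r_peak: float      # seconds
    qrs_end: float     # seconds
    t_end: float       # seconds
    qrs_min_mv: float  # lowest sample within the QRS complex, in mV
    qrs_max_mv: float  # highest sample within the QRS complex, in mV


def beat_intervals(beat, next_beat):
    """Apply the interval definitions from the text to one annotated beat."""
    return {
        "RR": next_beat.r_peak - beat.r_peak,            # peak to peak of consecutive QRS complexes
        "PR": beat.qrs_onset - beat.p_onset,             # start of P wave to start of QRS
        "QRS_duration": beat.qrs_end - beat.qrs_onset,   # start to end of QRS
        "QRS_amplitude": abs(beat.qrs_max_mv - beat.qrs_min_mv),
        "QT": beat.t_end - beat.qrs_onset,               # start of Q wave to end of T wave
    }


# Illustrative timings only, not measurements from a patient.
first = Beat(0.10, 0.26, 0.30, 0.35, 0.64, -0.2, 1.1)
second = Beat(0.90, 1.06, 1.10, 1.15, 1.44, -0.2, 1.1)
intervals = beat_intervals(first, second)
heart_rate_bpm = 60.0 / intervals["RR"]
print(intervals, heart_rate_bpm)
```

Dividing 60 by the RR interval in seconds gives the instantaneous heart rate in beats per minute, which links these per-beat measurements to the rate-based definitions used later in this chapter.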

Figure 2.1: Schematic representation of an ECG signal. ECG wave forms P, QRS and T waves and standard features extracted from a single cardiac beat [45].

2.3 Arrhythmia types

An arrhythmia is a type of electrical pattern abnormality which occurs as a result of cardiac conduction disease. Broadly speaking, common arrhythmias may cause the heart to beat too slowly (bradycardia, less than 60 beats per minute) or too fast (tachycardia, more than 100 beats per minute), or cause uncoordinated cardiac contractions (Atrial Fibrillation) [41]. In medical practice, careful analysis of the ECG by a qualified physician is necessary for the diagnosis of cardiac arrhythmias. Although most arrhythmias are considered harmless, there are several cases which could be life-threatening. Hence, automatic classification of cardiac arrhythmias can both provide objective diagnostic results and save time for the physician [47]. During an arrhythmia, the heart may not be able to pump adequate blood to the body, leading to distal organ malperfusion. As explained earlier, herein, arrhythmias are defined by the speed of the heartbeats: slow and fast. They include bradycardia and tachycardia, with a variety of conditions classified under those two categories.

• Normal Sinus Rhythm (NSR): Sinus Rhythm consists of an electrical impulse originating in the Sino-atrial (SA) node and radiating through both atria, traveling through the Atrio-ventricular (AV) node, continuing through the Bundle of His, both the left and right bundle branches, the Purkinje fibers and finally depolarizing the ventricular myocardium. When the SA node paces the heart at a rate between 60 and 100 BPM, the rhythm is called normal sinus rhythm; at a rate of 100 BPM or more, sinus tachycardia; and at a rate below 60 BPM, sinus bradycardia [34].

Figure 2.2: Definitions of ECG components and their "normal" duration [48].

• Atrial Fibrillation (AF): AF is the most common of clinically significant cardiac arrhythmias, and its incidence increases with age [74]. It affects about 1 percent of patients younger than 60 years of age and about 8 percent of patients older than 80 years [94]. Different types of heart conditions such as disease, myocardial ischemia, and decompensated heart failure can lead to atrial fibrillation. Further, non-cardiac etiologies such as thyroid disease and obstructive sleep apnea can lead to AF. Various infections that cause inflammation of the heart muscle or the outer layer of the heart may also lead to AF. Lastly, certain patients who are born with congenital structural heart disease may develop atrial fibrillation later on in life [70]. Atrial Fibrillation, as shown in Figure 2.3, is defined as a supraventricular tachyarrhythmia characterized by uncoordinated atrial activation with consequent deterioration of mechanical atrial and potentially ventricular function [49]. Atrial fibrillation is a source of significant morbidity and mortality because it impairs cardiac function and increases the risk of stroke [39]. Detection of the actual rhythm strip consistent with AF is paramount to the diagnosis given the clinical implications. The majority of patients with even one ECG strip demonstrating AF will be placed on blood thinning medications for life. In many instances patients have AF on only one ECG recording while all others may represent normal findings, namely in cases of Paroxysmal AF. As such, the importance of accurate detection of even one ECG illustrating this arrhythmia can be appreciated, as it will lead to life changing medical management for the patient. Depending on patient risk factors, Atrial Fibrillation leads to an annual stroke risk of 2-18 percent. Moreover, strokes subsequent to AF are notoriously more lethal and/or disabling than strokes caused by alternative etiologies. Secondary AF is caused by an underlying condition and is reversible if that condition is treated. Atrial fibrillation may occur immediately after cardiac and thoracic surgery. It is usually self-limited, but should be treated aggressively if it persists because of the increased risk of stroke [39].

Figure 2.3: Atrial Fibrillation pattern on ECG. Note absence of characteristic P, QRS, T pattern seen on a normal ECG. Also note inconsistency in the R-R duration (An ECG representing an Atrial fibrillation waveform.)

• First Degree AV Block: If a block exists in the AV node so that the electrical impulse is held for a longer than normal period of time, the rhythm is called First degree AV block. This rhythm is characterized by a PR interval prolonged to greater than 0.20 seconds [38]. Figure 2.4 represents a demonstration of First Degree AV Block with prolonged duration of PR interval compared to normal ECG.
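The numeric criteria quoted in the list above (60 and 100 BPM for sinus rates, a PR interval above 0.20 seconds for First Degree AV Block, and R-R inconsistency in AF) can be written as simple rules. The sketch below is only an illustration of those thresholds, not a diagnostic algorithm and not the classifier developed in this thesis. The function names are hypothetical, the rate label assumes the rhythm is already known to be of sinus origin, and the coefficient of variation is an assumed way of quantifying R-R irregularity for which no cut-off is implied.

```python
import statistics


def mean_heart_rate(rr_intervals_s):
    """Average heart rate in beats per minute from R-R intervals in seconds."""
    return 60.0 / statistics.mean(rr_intervals_s)


def label_sinus_rate(bpm):
    """Rate label using the thresholds from the text; assumes sinus origin."""
    if bpm < 60:
        return "sinus bradycardia"
    if bpm >= 100:
        return "sinus tachycardia"
    return "normal sinus rhythm"


def has_prolonged_pr(pr_interval_s):
    """First Degree AV Block criterion from the text: PR longer than 0.20 s."""
    return pr_interval_s > 0.20


def rr_variation(rr_intervals_s):
    """Coefficient of variation of R-R intervals: a crude irregularity measure."""
    return statistics.pstdev(rr_intervals_s) / statistics.mean(rr_intervals_s)


# Illustrative values only, not patient data.
regular_rr = [0.80, 0.80, 0.80, 0.80]
irregular_rr = [0.55, 0.92, 0.61, 1.10]
print(label_sinus_rate(mean_heart_rate(regular_rr)))
print(has_prolonged_pr(0.24))
print(rr_variation(regular_rr), rr_variation(irregular_rr))
```

A physician weighs these quantities together with waveform morphology, which is why such rules serve only as a guide to the features a learning algorithm may draw on.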

2.4 Heart Rate Variability

Heart rate variability (HRV) is a physiological phenomenon of the variation in the time interval between consecutive heartbeats in milliseconds [26]. HRV has become the conventionally accepted term to describe variations of both instantaneous heart rate and RR intervals. There are many commercial holter devices which provide an automated measurement of HRV [31], and cardiologists have been provided with a seemingly simple tool for both research and clinical studies. However, the clinical significance and management implications of the many different measures of HRV are more complex than generally appreciated, and there is a potential for incorrect conclusions and excessive or unfounded extrapolations. Different measurement techniques of HRV include: time domain methods, statistical methods, and geometric methods [89].

Figure 2.4: An ECG illustrating First Degree AV Block. Note prolonged duration of the PR interval compared to a normal ECG.

A holter monitor allows a 24-hour, multichannel ECG recording of HRV with the potential to provide additional valuable insight into physiological norms and pathological conditions to enhance patient risk stratification [18].
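As an illustration of the time domain and statistical measures mentioned above, the following sketch computes three statistics that are commonly used for this purpose: SDNN (standard deviation of the intervals), RMSSD (root mean square of successive differences) and pNN50 (percentage of successive differences larger than 50 ms). The text does not name specific measures, so their selection here is an assumption, and the R-R values are made up for the example.

```python
import math
import statistics


def sdnn(rr_ms):
    """Sample standard deviation of the R-R intervals, in ms."""
    return statistics.stdev(rr_ms)


def rmssd(rr_ms):
    """Root mean square of successive R-R differences, in ms."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def pnn50(rr_ms):
    """Percentage of successive R-R differences whose magnitude exceeds 50 ms."""
    diffs = [abs(b - a) for a, b in zip(rr_ms, rr_ms[1:])]
    return 100.0 * sum(d > 50 for d in diffs) / len(diffs)


# Hypothetical R-R intervals in milliseconds, not patient data.
rr_ms = [800, 810, 790, 860, 800]
print(sdnn(rr_ms), rmssd(rr_ms), pnn50(rr_ms))
```

Over a 24-hour recording these statistics would normally be computed on much longer interval series, often after removing ectopic beats and artifacts, a step that is omitted here.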

2.5 Holter Monitors

A physician may request a 24-hour holter recording if a patient has symptoms or indications including, but not limited to, the following [50]:

• A fast or slow heartbeat detected on physical exam

• Dizziness or fatigue

• Shortness of breath

• Chest pain

• Chest fluttering sensation

• Pulse irregularity on palpation

• Assessment of average heart rate to assess adequate medication dosing in patients with previously identified arrhythmias

Holter monitors take a differential measurement of the electrical potential on the body surface, via electrodes placed at predefined locations, generating different ECG vectors. A standard 12-lead ECG consists of 12 different vectors known as "leads". Six of these leads lie in a plane parallel to the body and are known as "frontal" ECG leads. The remaining six ECG leads are "views" of the heart in a plane perpendicular to the body and are referred to as "pre-cordial" leads [73]. Depending on the type of the holter device, different QRS detection algorithms are used, although the physiologic principles remain the same [110]. Placement of all 12 ECG leads is illustrated in Figure 2.5.

Figure 2.5: A vector view of the standard 12 Lead ECG. The frontal leads are light blue and the pre-cordial leads are dark blue [73].

The frontal ECG leads are formed from three different electrodes placed on the body, with an optional fourth electrode used as a reference [29]. These electrodes are placed on the right arm (RA), left arm (LA), and left leg (LL). The optional reference electrode is usually placed on the right leg (RL). The three sensing electrodes (RA, LA, and LL) form what is known as Einthoven’s triangle (Figure 2.6(a)). The six frontal leads are derived from this triangle. Leads I, II and III are formed directly from the triangle itself. An additional three leads, known as the augmented limb leads (augmented foot, augmented right, augmented left), are formed by subtracting the average of two electrode potentials from the value of the third. These leads are known as augmented leads because their amplitude is 1.5 times that of the corresponding unaugmented limb leads. Each lead is derived from the three sensing electrodes using Equations 2.1 to 2.6. The vector representation of each frontal lead is shown in Figure 2.6(b).

\begin{align}
I &= LA - RA \tag{2.1} \\
II &= LL - RA \tag{2.2} \\
III &= LL - LA \tag{2.3} \\
aVF &= LL - \tfrac{1}{2}(LA + RA) \tag{2.4} \\
aVR &= RA - \tfrac{1}{2}(LA + LL) \tag{2.5} \\
aVL &= LA - \tfrac{1}{2}(RA + LL) \tag{2.6}
\end{align}

Figure 2.6: a) Einthoven's triangle, formed by the right arm, left arm and left leg electrodes. These three electrodes form the basis for the frontal axis [66]. b) The frontal axis of the ECG consists of six vectors derived from three electrodes: left arm (L), right arm (R), and left leg (F) [66]. c) The electrodes V1-V6 are placed on the chest roughly in a semilunar line [66].

Figure 2.7: Simplified functional block diagram of a holter monitor [108].

The remaining six leads of the 12-lead system are the pre-cordial leads. The electrodes V1-V6 are placed on the chest as shown in Figure 2.6(c). Each pre-cordial lead is the difference between the potential at the electrode site and the average potential of RA, LA and LL. This can be thought of as the difference between the center of the body and the pre-cordial electrode, which creates a vector perpendicular to the body [29].
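As a minimal sketch of the lead derivation described above, the following Python snippet computes the six frontal leads from the three limb electrode potentials (Equations 2.1 to 2.6, with the third augmented lead taken as aVL) and a pre-cordial lead as the chest electrode potential minus the mean limb potential. The function names and the example potentials are hypothetical and chosen only for illustration; they are not taken from any holter device.

```python
from statistics import mean

def frontal_leads(ra, la, ll):
    """Six frontal leads from the limb electrode potentials (same unit, e.g. mV)."""
    return {
        "I": la - ra,
        "II": ll - ra,
        "III": ll - la,
        "aVF": ll - 0.5 * (la + ra),
        "aVR": ra - 0.5 * (la + ll),
        "aVL": la - 0.5 * (ra + ll),
    }

def precordial_lead(v, ra, la, ll):
    """Chest electrode potential minus the average of RA, LA and LL."""
    return v - mean([ra, la, ll])

# Illustrative potentials only (not measured data).
leads = frontal_leads(ra=-0.2, la=0.3, ll=0.7)
```

A useful consistency check follows directly from Equations 2.1 to 2.3: lead I plus lead III equals lead II, and the three augmented leads sum to zero.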

2.5.1 Holter System Description

Here we give a conceptual overview of holter monitors. An ECG holter monitor contains various mechanical and electrical parts [104]. Broadly, the main task of a holter monitor is to record ECG data of sufficient quality for clinical interpretation. The ECG signal voltage is as low as 0.5 to 5 mV and is therefore susceptible to artifacts with higher voltages. It is very important to filter out noise to ensure correct clinical interpretation [108]. Figure 2.8 shows a block diagram of a sample holter system and how its components are interrelated [52]. This system is designed so that it can transmit ECG signals in real time using the modem of a mobile phone and store all ECG data on an SD card. The main components of a general holter system include, but are not limited to, the following:

Figure 2.8: Block diagram of a sample holter recorder system [52].

• Electrodes: for a stable and low-noise recording, these include clip-on wires, conductive gel, and an adhesive area for proper attachment to the skin.

• ECG amplifiers: various amplifiers, including differential amplification stages, are required for each lead. Figure 2.9 depicts a suggested circuit diagram.

• Microcontrollers

• A/D converter

• Various filters including bandpass filters

Since a holter monitor is a wearable device that records for an extended period of time (some existing products can operate for as long as 96 hours on an AA battery), low power consumption is a critical design objective [108].

Figure 2.9: a) Circuits of the ECG amplifier and high- and low-pass filters [52]. b) Suggested R-peak detection circuit [52].

Chapter 3

Machine Learning Approaches

3.1 Introduction

In this chapter we briefly introduce machine learning techniques and their applications to CVD monitoring, and review various Machine Learning (ML) approaches in this field.

3.2 Artificial Intelligence

Artificial Intelligence (AI) is defined as the capacity of a computer system to perform tasks that are commonly associated with human intelligence, such as learning, problem solving, and pattern recognition, and that would usually require human levels of intelligence, often in conjunction with considerable amounts of user time [69]. AI has transformed key aspects of human life in different areas. A sub-field of AI is ML, which is used to analyse large quantities of data in a rapid, accurate, and efficient manner through the use of complex computing and statistical algorithms. Most existing intelligent systems that use machine learning, pattern recognition, data mining or natural language processing are examples of AI [78]. AI algorithms are trained to learn relationships from existing data sets and make predictions on unseen data. The application of ML is becoming increasingly broad within the medical community, and specifically within the domain of cardiovascular diseases. Feature extraction and segmentation in ECG play a significant role in diagnosing most electrical cardiac diseases [85]. In a review paper, Subhi et al. present a brief overview of ML methodologies used for the construction of inferential and predictive data-driven models and highlight several domains of ML application [6]. ML is classified into three categories: supervised, unsupervised and reinforcement learning [92]. In supervised learning, the machine learns from labeled data with the goal of approximating the mapping function so well that the output for new input data can be predicted. Examples of supervised learning methods include random forests, support vector machines and artificial neural networks [57].

3.3 Supervised Machine Learning

Supervised ML algorithms are deployed in both classification and regression problems. The goal in the former is to correctly assign a binary or multi-class label, while in the latter it is to correctly predict a real-valued output [6]. The flexibility of these algorithms allows analysis of both classification and regression problems with only minor adaptations; however, constraints such as interpretability, computational cost, and the type of available data need to be considered when choosing an algorithm. Figure 3.1 shows different ML algorithms used for ECG classification, from a survey performed by Lyon et al. [65].

Among different machine learning techniques, Convolutional Neural Networks (CNNs) were first developed by Fukushima in 1980 and were improved in later years [30]. Convolutional networks are trainable architectures composed of multiple stages [67]. CNNs are regarded as deep architectures as they involve a hierarchy of layers, such that the outputs of a layer are connected to the next layer's inputs [100]. The input and output of each stage are sets of arrays called feature maps [59]. In a survey, Roopa et al. presented the most common computational methods used for ECG analysis, with a focus on machine learning and 3D computer simulations, as well as their accuracy, clinical implications and contributions to medical advances. The methods covered include support vector machines, random forest classification, hidden Markov models and neural networks [85].

CNNs take advantage of characteristics of image data to improve tractability and regularize the network [43, 103]. CNNs are multi-layer feed-forward neural networks widely studied to extract appropriate hierarchical feature representations for image recognition tasks [42]. Local feature detectors are employed to reduce the number of network parameters, and weight sharing is introduced to regularize the network and to improve the robustness of CNNs to the variance occurring in images [60]. A CNN is a form of deep neural network (DNN) comprising one or more convolutional layers followed by one or more fully connected layers, as in a standard multilayer neural network.

Using a CNN, Rajpurkar et al. developed an algorithm which exceeds the performance of board-certified cardiologists in detecting a wide range of heart arrhythmias from electrocardiograms recorded with a single-lead wearable monitor. They trained a multi-layer convolutional neural network which maps a sequence of ECG samples to a sequence of rhythm classes. A group of board-certified cardiologists annotated a gold-standard test set, on which the performance of the model was compared to that of 6 other individual cardiologists [83].

Figure 3.1: Main machine learning methods used for ECG classification. (a) Support vector machine. (b) Random forest classification using n decision trees. (c) Hidden Markov model. (d) Neural network with two hidden layers [65].

The main advantages of CNNs are that they are easier to train and have fewer parameters than fully connected networks with the same number of hidden layers [111]. CNNs are self-learned and self-organized networks, which eliminates the requirement of supervision. Nowadays, important applications of CNNs include image classification, object recognition, and handwriting recognition. In addition, they play an important role in the medical field for automated disease diagnosis [15].

In another recent study, Chandra et al. used CNNs to learn fused features from multiple physiological signals. Their goal is to improve the robustness and accuracy of detection, especially in certain critical care scenarios [21]. Plesinger et al. also used CNNs for automatic detection of atrial fibrillation and other arrhythmias in holter ECG [79]. In a similar work, Plesinger et al. proposed a method for automated classification of 1-lead holter ECG recordings. Their method classifies a tested record into one of four classes: 'normal', 'atrial fibrillation', 'other arrhythmia' or 'too noisy' [80].

Figure 3.2: A typical neural network, with the working of a single neuron explained separately [17].

3.4 Deep Learning (DL) and Related Work

Deep learning methods are increasingly popular due to their ability to learn features automatically. A growing number of research papers use artificial neural networks to improve the diagnosis of different heart diseases in different modalities. The application of deep learning methods even to the most complex medical pattern recognition tasks has proved very promising, obtaining state-of-the-art results [46]. Figure 3.2 shows a typical neural network. The inputs to each neuron are like dendrites. Just as in the human nervous system, an (artificial) neuron collates all its inputs and performs an operation on them. Lastly, it transmits the output to all the neurons of the next layer to which it is connected. A neural network is divided into layers of 3 types [17]:

1. Input Layer: The training observations are fed through these neurons.

2. Hidden Layers: These are the intermediate layers between input and output which help the neural network learn the complicated relationships involved in the data.

3. Output Layer: The final output is extracted from the previous two layers. For example, in a classification problem with 5 classes, the output layer will have 5 neurons.
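The behaviour of a single neuron and of a layer can be sketched in a few lines of Python. This is only an illustration under the usual assumption that the "operation" a neuron performs is a weighted sum of its inputs plus a bias, followed by an activation function. The function names, weights and inputs below are hypothetical.

```python
def relu(z):
    """Rectified linear unit, used here as an example activation."""
    return max(0.0, z)

def neuron(inputs, weights, bias, activation):
    """Collate the inputs as a weighted sum, then apply the activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)

def layer(inputs, weight_rows, biases, activation):
    """A layer is a set of neurons that all read the same inputs."""
    return [neuron(inputs, w, b, activation) for w, b in zip(weight_rows, biases)]

# Two inputs feeding a hidden layer of two neurons (illustrative numbers).
hidden = layer([1.0, -2.0], [[0.5, -0.25], [-1.0, 1.0]], [0.0, 0.5], relu)
```

The output of one layer is then passed as the input of the next, which is how the input, hidden and output layers listed above are chained together.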

A survey of ECG-based heartbeat classification for arrhythmia detection is presented by Luz et al. [64]. In this work, they survey the current state-of-the-art methods of ECG-based automated abnormal heartbeat classification by presenting the ECG signal preprocessing, the heartbeat segmentation techniques, the feature description methods and the learning algorithms used. They describe some of the databases used for the evaluation of methods, as indicated by a well-known standard developed by the Association for the Advancement of Medical Instrumentation (AAMI). They also propose an evaluation process workflow to guide authors in future works. Another recent study classifies short segments of ECG into four classes (AF, normal, other rhythms or noise) and compares a state-of-the-art feature-based classifier with a convolutional neural network approach [8]. Recently, Hannun et al. developed a deep neural network to classify 12 rhythm classes using single-lead ECGs from patients who used a single-lead ambulatory ECG monitoring device; its performance exceeded that of average cardiologists [40]. In another work, Acharya et al. used a convolutional neural network (CNN) technique to automatically detect different ECG segments. They used ECG signals of two-second and five-second durations without QRS detection. Their algorithm consists of a multi-layer deep CNN with an output layer of four neurons, representing normal sinus rhythm, atrial fibrillation, atrial flutter, and ventricular fibrillation [3]. Rahhal et al. used a deep learning approach for active classification of electrocardiogram signals from raw ECG [5]. In a study by Ozal et al., the authors propose a new, efficient and fast 1D CNN model (1D-CNN) for the automatic classification of cardiac arrhythmia based on 10-second fragments of ECG signals [106].

The potential of data analysis, and in particular of the design and use of deep neural networks (DNNs), for detecting heart disease based on routine clinical data is investigated in [101], where the authors design, evaluate, and optimize DNN architectures of increasing depth for heart disease diagnosis. Kumar et al. used a recurrent neural network to detect atrial fibrillation from a three-lead ECG [93].

In another classification study, Sabri et al. proposed a multistage classification approach using K-Nearest Neighbor and decision tree classifiers on the 3 segments of the ECG cycle to detect arrhythmic heartbeats from the early minute of ECG data. In their research, specific attributes based on feature extraction in each heartbeat are used to classify normal sinus rhythm and arrhythmia [16]. Another group of researchers who worked on the classification of ECG is Jun et al. [51]. They proposed a deep neural network to recognize premature ventricular contraction (PVC) beats from ECG recordings. Their deep neural network is composed of six hidden layers and is trained on six different features extracted from ECGs to classify beats as normal or PVC.

In a study by Islin et al., deep learning was used for cardiac arrhythmia detection. They use three different conditions of ECG waveforms selected from the MIT-BIH arrhythmia database to evaluate their proposed framework. The main focus of their study is to implement a simple, reliable and easily applicable deep learning technique for the classification of the three selected cardiac conditions [47].

In another study, Yu et al. developed an artificial neural network (ANN)-based diagnostic model for coronary heart disease (CHD) using a combination of traditional and genetic factors of this disease [11]. Kim et al. address the problem of overfitting by ranking features, training the neural network with each feature ranking, and then training the neural network to output a potential diagnosis [54]. Loh et al. demonstrate the accuracy of deep neural networks by showing their ability to learn nonlinear relationships in data [63]. To address the deficiencies and drawbacks of existing AF detection algorithms, Xia et al. proposed a novel method for automatic AF detection based on deep convolutional neural networks (DCNNs) [105]. Their method consists of multiple processing layers which can learn abstract representations (called feature maps) of data. Pourbabaee et al. (2016) trained a deep convolutional neural network to extract features directly from raw ECG signals and to classify two cardiac conditions, namely normal beats and paroxysmal atrial fibrillation (PAF) [81].

Chapter 4

Methodology

4.1 Ethics

To comply with the ethics rules and regulations defined by Health Canada, all patients were asked to sign a consent form prior to the study. The consent form permits us to use their ECG data collected from the holters for research purposes. Accordingly, prior to holter installation, patients gave informed consent to participate in this study. Each patient voluntarily gave consent by signing the form in front of a technician. Under the consent form, we are allowed to use the 24-hour recorded ECG data of the patients for research purposes.

4.2 Materials

The details of the proposed method, data collection and preprocessing, and the proposed network structure, along with the analysis pipeline, are explained in this chapter.

4.2.1 Block Diagram

A schematic of the proposed methodology is shown in Figure 4.1. It shows the three main tasks: a) data collection, b) preprocessing of data and the machine learning framework, and c) predicting results, where we use a supervised machine learning algorithm for holter output classification. The sub-tasks are defined throughout this chapter. As Figure 4.1 depicts, once the data is captured by the holter's sensors, the holter's built-in software generates a report. This report, which is several pages long, summarizes the important instances in a 24-hour window of heartbeat recordings. The next step is therefore to select, from the long report, the sections of interest for this project. These sections, which hold the information required for network training, are selected from the whole report and imported so that only the specific areas containing relevant features are extracted. This step prepares the data to be fed into our machine learning framework. The selected data contain the input features of the machine learning system. The details of the network and the analysis procedure are explained in the following sections. In short, the network learns from existing data and is then able to predict outputs on unseen (new) data.

Figure 4.1: An illustration of the proposed methodology for automatic arrhythmia classification of holter reports. Main tasks include: a) data collection, b) preprocessing of data and the machine learning framework, c) predicting results.

4.3 Data

In this section we describe the data acquisition, labeling and other preparations required for the purpose of training the network as follows:

4.3.1 Data Acquisition

Currently, "ScottCare" holters are used in various clinics in Metro Vancouver and the Lower Mainland. These lightweight holters provide accurate pace detection with adjustable sensitivity. Figure 4.3 shows the "Chroma2" version of these holters. They record cardiac activity over a 24-hour period, providing us with a realistic time series of cardiac activity. Each of the 5 leads is attached to the patient's chest with one adhesive patch, providing 24 hours of continuous ECG recording. One remarkable feature is the speed of reading the data: ECG data can be downloaded in less than 15 seconds, with automated analysis in 60 seconds. This speeds up the process for the clinicians, allowing them to provide us with more clinical data in a negligible amount of time. We collected a data set of 187 ECG records. The age of the patients, both men and women, varies from 18 to 90. Figure 4.2 gives an overview of the distribution of the data by age and gender.

Figure 4.2: Distribution of holter monitor data: a) Gender distribution, b) Age range.

ScottCare's holter monitoring analysis software is called holterCare™. It has many features, such as an advanced analysis technique covering all arrhythmic and ischemic episodes, which can be edited using multi-level template editing and interactive trend editing, as well as superimposition and page mode review [62]. The report generated by the built-in ScottCare holter software includes different sections showing various graphs and tables. These graphs summarize the important instances of a 24-hour heartbeat recording. The next step is the verification of each patient's holter readings by the physician. The confirmed data are then used as the input of our network. Each holter report is composed of different sections, including various graphs representing selected intervals of the QRS wave and other information. It also includes a table of selected features from the ECG analysis by holterCare™. Figure 4.4 depicts a general view of the table with selected ECG features, as in a typical printed report. Consequently, the input for the ScottCare software is the ECG signal recorded by the device, and its outputs, for the purpose of this project, are the reported values shown in Figure 4.4.

4.3.2 Data Description

As Figure 4.4 shows, the input data contain various features. The time stretch for each recorded row is one hour. The details of this information are explained below.

1. Hr Min: Is the minimum recorded heart-rate in each time stretch.

2. Hr Avg: Is the average recorded heart-rate in each time stretch.

3. Hr Max: Is the maximum recorded heart-rate in each time stretch.

4. Tot.Bts: Is the total heart-beats recorded in each time stretch.

5. Ventricular Premature Beat or Premature Ventricular Complex (VPB): Is characterized by the absence of a P-wave and a QRS that is wide and bizarre in morphology, and is typically followed by a full compensatory pause (it does not reset the SA node). These beats can occur as single isolated beats, in couplets (or pairs), or in runs of 3 or more. An R-on-T is a very early VPB that occurs on, or near, the peak of the T-wave of the previous beat, a very vulnerable period during repolarization, and could initiate ventricular tachycardia and other lethal arrhythmias. Clinicians may quantify VPB frequency on a per-hour basis or per 1,000 beats (average number of VPBs per 1,000 normal beats) to classify occurrence as rare, occasional or frequent during the holter monitoring period. When a VPB occurs early enough that the normally-timed impulse from the SA node conducts through to the ventricles immediately following the VPB, the VPB is called "interpolated". Rather than replacing the normal QRS, the VPB occurs between two normally-timed sinus beats and is not followed by a compensatory pause [7].

Figure 4.3: ScottCare holter used for data collection throughout the study. a) Chroma2 version with 5 leads, b) Close view of the holter monitor [25].

Figure 4.4: A sample report generated by the ScottCare built-in software, holterCare™.

6. V-Pair: Represents ventricular episodes, including APCs (atrial premature contractions) and nodal escape beats, that occur in pairs.

7. V-Run: Represents ventricular episodes, including APCs and nodal escape beats, that occur in runs.

8. Supraventricular Premature Beats (SVPB): Is a beat initiated by an irritable focus in one of the atria. The beat looks normal in morphology with a slight difference in the P-wave morphology, occurs early compared to the normal sinus rhythm preceding it, and resets the SA node so that the next beat is slightly later than the normal sinus rate [7].

9. SV-Pair: Represents supraventricular episodes, including APCs and nodal escape beats, that occur in pairs.

10. SV-Run: Represents supraventricular episodes, including APCs (atrial premature contractions) and nodal escape beats, that occur in runs.

11. Pause: Is the number of sinus pauses recorded in each hour.

In addition to the above information, there are two other features recorded in each table, named "Start" and "Analyse". These two features represent the start time of the recording and the duration of analysis for each line of the report. We have not used these features directly in the current setup of the network; however, the assumption is that all the readings of each line are for a window of approximately 50 minutes or more. The details of the feature extraction are explained in the following sections.
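To make the shape of the data concrete, the sketch below represents one hourly line of the report as a record with the 11 features listed above and concatenates 24 such lines into a single vector of 11 x 24 = 264 values. The field names, the hour-by-hour ordering of the vector and the example numbers are assumptions made for illustration; "Start" and "Analyse" are left out, since they are not used as inputs.

```python
from dataclasses import dataclass, astuple

@dataclass
class HourlyRow:
    """One hourly line of the report table (hypothetical field names)."""
    hr_min: int
    hr_avg: int
    hr_max: int
    tot_bts: int
    vpb: int
    v_pair: int
    v_run: int
    svpb: int
    sv_pair: int
    sv_run: int
    pause: int

def flatten(rows):
    """Concatenate 24 hourly rows into one vector of 11 x 24 = 264 values."""
    if len(rows) != 24:
        raise ValueError("expected 24 hourly rows")
    return [float(v) for row in rows for v in astuple(row)]

# Made-up numbers, repeated for every hour, only to show the shapes.
example = [HourlyRow(52, 71, 118, 4260, 3, 0, 0, 1, 0, 0, 0)] * 24
vector = flatten(example)
```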

4.3.3 Data (ROI) Selection

For the purpose of data analysis, we choose the reported metrics described in Section 4.3.2. In keeping with the main goal of this thesis, which is easy and accessible holter reading and interpretation, we intentionally choose these metrics as high-level features rather than the direct ECG-related features used conventionally, since this removes the need to access the raw data in order to read the ECG signal. The selected areas of the report's image form our regions of interest (ROIs). Using Matlab, we designed an automated process to read the whole report generated by holterCare™ for each patient. These reports include a collection of information as explained earlier.

Our interest is to select the regions containing the specified features, as shown in Figure 4.4. These tables report information such as minimum heart rate, maximum heart rate, number of beats, pauses, etc. for each patient over the duration of the heartbeat monitoring, at intervals of 1 hour. A Matlab script was therefore written to automatically select these regions from each report and save them as ".tiff" files, so that the saved files contain only the selected information. To read the text directly from the saved images, another Matlab script was written. Here the goal is to read the text in each column of the image and feed it to the network as an input feature. For this purpose we mainly use the text recognition function "ocr" from the Computer Vision System Toolbox™ to perform optical character recognition. This function returns an ocrText object containing optical character recognition information from the input image. The object contains the recognized text, the text location, and a metric indicating the confidence of the recognition result. The "ocr" function provides us with a means to add text recognition functionality to our framework. Using this feature, we can later expand the functionality of the network to include other parts of the reports, as may be required in the future.

4.3.4 Labeling

The gold standard is that the physician reads the holter's generated report and writes his or her comments in a specific section of the report. This section is referred to as the physician's diagnosis for each patient. For the purpose of labeling the data, we mimic the same procedure. Another Matlab script was therefore written to extract the physician's comment section and find the keywords that describe the diagnosis. The keywords are "Normal", "Benign", "AF" and "Av block". Table 4.2 lists the labels of the data. For this purpose we perform optical character recognition using the "ocr" function, similarly to Section 4.3.3; however, for the physician's comment section of the report we require new settings to identify the text boundaries. Using this approach we can extract the text and its keywords, which we later use as the assigned labels. The four extracted labels are:

1. Normal Sinus Rhythm: Sinus rhythm consists of an electrical impulse originating in the Sino-atrial (SA) node and radiating through both atria, traveling through the Atrio-ventricular (AV) node, continuing through the Bundle of His, both the left and right bundle branches, and the Purkinje fibers, and finally depolarizing the ventricular myocardium. When the SA node paces the heart at a rate between 60 and 100 BPM, the rhythm is called Normal Sinus Rhythm (NSR). NSR is the rhythm that originates from the sinus node and describes the characteristic rhythm of the healthy human heart [107, 35].

Figure 4.5: Samples of the physician's comments on holter reports, representing the diagnosis based on ECG readings. The keywords from the selected red boxes are extracted as labels: a) sample of a Normal reading, b) sample of a Benign reading, c) sample of an AV nodal block reading, d) sample of an AFB reading.

2. Abnormal but benign ECG: A benign abnormal ECG consists of a rhythm that is not a classic normal sinus rhythm and may fall outside of the normal heart rate ranges described, but does not necessarily have clinical implications for patient management. An example is sinus arrhythmia, where the electrical impulse originates outside of the SA node, but the cardiac output is not affected and the patient is not at a higher risk of cardiac dysfunction or stroke. The presence of ectopic beats is another example of a technically abnormal but clinically benign pattern on ECG.

3. First Degree AV Block: A block exists in the AV node, so that the electrical impulse is held for a longer than normal period of time. This rhythm is characterized by a PR interval prolonged to greater than 0.20 seconds [38].

4. Atrial Fibrillation: When the electrical activity in the atria is chaotic and many ectopic foci are firing erratically, the atria are said to be "fibrillating". Some impulses conduct through the AV node and stimulate the ventricles. This rhythm is characterized by a chaotic baseline with no distinct P-wave and highly irregular RR intervals [24].
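The keyword search used for labeling can be sketched as follows. This is a Python illustration rather than the Matlab script used in this work. The class numbers follow the order of the four labels above, while the regular expressions, the rule of checking specific findings before "Normal", and the sample comments in the usage note are assumptions made for the example.

```python
import re

# (pattern, class label); more specific findings are checked first (assumption).
KEYWORDS = [
    (re.compile(r"\bav\s*block\b", re.IGNORECASE), 3),
    (re.compile(r"\b(af|afb)\b", re.IGNORECASE), 4),
    (re.compile(r"\bbenign\b", re.IGNORECASE), 2),
    (re.compile(r"\bnormal\b", re.IGNORECASE), 1),
]

def label_from_comment(text):
    """Return the class label for an OCR'd physician comment, or None."""
    for pattern, label in KEYWORDS:
        if pattern.search(text):
            return label
    return None
```

For instance, a hypothetical comment such as "Abnormal but benign ectopy" maps to class 2. The word boundary in the last pattern keeps "Abnormal" from being read as "Normal".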

4.4 Preprocessing

Once the intended ROIs are found and read from the reports, we visualize the data and examine the initial correlations between its different components. This step gives us a better intuition for choosing the features. The biggest issue to be resolved in the preprocessing step is missing data. Missing data occur when the connections are loose for some time. Our strategy is as follows: if the missing data amount to less than 2% of the whole record, meaning that there are only 4 occurrences of missing values, we approximate these values; otherwise the record is not used and is discarded due to its uncertainty. There are various methods to deal with missing data. Depending on the type of data or feature that is missing, a different imputation policy should be followed. After trying different methods, we found that this is best done by considering a series of factors. These factors include:

• The context of data

• Range of data

• Temporal dependencies

As an example, we may impute a missing heart rate reading for a given patient with the average of the heart rate readings immediately before and after it, depending on when the missing data occurred.
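A minimal sketch of this policy is given below, assuming that a record is stored as 24 hourly rows of 11 values, that missing entries are marked with None, and that the 2% threshold is applied to the whole record. The function name and these conventions are hypothetical. When a gap falls in the first or last hour, only the single available neighbour is used.

```python
def impute_record(rows, max_missing_ratio=0.02):
    """Fill None entries with the mean of the nearest valid readings of the
    same feature before and after the gap; return None to discard the record."""
    total = sum(len(r) for r in rows)
    missing = sum(v is None for r in rows for v in r)
    if missing / total >= max_missing_ratio:
        return None  # too much missing data: the record is not used
    filled = [list(r) for r in rows]
    for col in range(len(rows[0])):
        series = [r[col] for r in rows]  # one feature over time
        for t, value in enumerate(series):
            if value is None:
                before = next((x for x in reversed(series[:t]) if x is not None), None)
                after = next((x for x in series[t + 1:] if x is not None), None)
                known = [x for x in (before, after) if x is not None]
                if not known:
                    raise ValueError("feature has no valid readings")
                filled[t][col] = sum(known) / len(known)
    return filled
```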

4.4.1 Data Augmentation

In principle, the more data are available, the better our ML models will be. However, every data collection process is associated with costs and limitations. Data augmentation is a common way of creating new "data" with different orientations [77]. The benefits are twofold: first, it generates "more data" from limited data, and second, it helps prevent overfitting. Below we describe the algorithm we used for data augmentation. For each patient in the rare classes (classes 2, 3 and 4), we generated as many rows as needed to balance the classes and added them to the data set. In our case, we have fewer than 4 patients for each of classes 2, 3 and 4. Consequently, 1500 instances were added to each of these classes to match class 1 (Normal). The total number of data points thus reached 6000 instances.
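The text does not specify how the additional rows are generated, so the following is only one plausible sketch of class balancing and may differ from the procedure actually used. It assumes random resampling with replacement from the existing records of each class, with optional Gaussian jitter. The function name, the jitter parameter and the seed are all hypothetical.

```python
import random

def balance_classes(samples, target_per_class, jitter=0.0, seed=0):
    """samples maps a class label to a list of feature vectors. Classes with
    fewer than target_per_class vectors are topped up by resampling."""
    rng = random.Random(seed)
    balanced = {}
    for label, vectors in samples.items():
        out = [list(v) for v in vectors]
        while len(out) < target_per_class:
            base = rng.choice(vectors)  # pick an existing record at random
            out.append([x + rng.gauss(0.0, jitter) for x in base])
        balanced[label] = out
    return balanced
```

With four classes and a target of 1500 instances per class, the balanced set contains 4 x 1500 = 6000 instances, which is the total quoted above.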

4.5 Network Architecture

Figure 4.6 shows the framework design and notation. As shown, the neural network is organized into $L$ fully-connected layers ($i = 1, \dots, L$), with $n_i$ nodes (or artificial neurons) in layer $i$, that function together to make a prediction. The connections between layers $i-1$ and $i$ are defined by numerical weights, stored in a matrix $W_i$ of size $n_i \times n_{i-1}$, and a vector $b_i$ of length $n_i$. Thus, if the input values for layer $i$, given by the values at the $n_{i-1}$ nodes of layer $i-1$, are represented as a vector $a_{i-1}$ of size $n_{i-1}$, the output of layer $i$ is a vector of size $n_i$, given by the matrix-vector product $W_i a_{i-1} + b_i$.

The training stage is performed in parallel for a batch of $n_b$ vectors, so the inputs $a_{i-1}$ become matrices $A_{i-1}$ of size $n_{i-1} \times n_b$ and the outputs are given by the matrix products $Z_i = W_i A_{i-1} + b_i$. During training, the loss function is minimized through a gradient-based optimization algorithm such as stochastic gradient descent, root mean square propagation (RMSprop) or adaptive moment estimation (Adam) [55]. The structure of the network is as follows. The network has 4 layers:

• Layer 1 is the input layer. The number of units is 11 × 24 = 264, which represents the 11 input types (Table 4.1) times the number of measurements for each input (24-hour recordings).

• Both the input layer and the first hidden layer have as many units as the number of features in our dataset. The number of nodes in the first hidden layer is 11 × 24 = 264, and the number of nodes in the second hidden layer is 2 × 11 × 24 = 528.

• The number of units in the output layer is equal to 4. This represents the number of classes to predict: Normal, Benign, AV block and AFB, as explained in Table 4.2.

• The total number of network parameters is 211,996. The network structure, the number of layers and the type of each layer are depicted in Figure 4.7.
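The stated parameter total can be checked with a short calculation. The sketch assumes that every layer is fully connected with one bias per unit, so that layer $i$ contributes $n_i \times n_{i-1}$ weights plus $n_i$ biases; the helper name `count_dense_parameters` is illustrative.

```python
def count_dense_parameters(layer_sizes):
    """Count weights and biases of a fully connected network.

    Each layer i contributes n_i * n_{i-1} weights plus n_i biases.
    """
    return sum(n_out * n_in + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


n_features, n_hours, n_classes = 11, 24, 4
n_inputs = n_features * n_hours          # 264
layer_sizes = [n_inputs, n_inputs, 2 * n_inputs, n_classes]
total = count_dense_parameters(layer_sizes)
print(total)  # 211996
```

Under these assumptions the layers contribute 69,960, 139,920 and 2,116 parameters respectively, which sums to the reported 211,996.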

4.5.1 Activation Function

Every activation function (or non-linearity) takes a single number and performs a mathematical operation on it. Several activation functions may be used in different applications of neural networks, including CNNs. The most widely applicable ones are sigmoid, tanh and the rectified linear unit (ReLU) [4]. We used the following for each layer:

• Activation function for layer 1: ReLU

• Activation function for layer 2: ReLU

• Activation function for the output layer: Softmax

A brief mathematical description of the activation functions is given below:

1. ReLU: This activation function computes $\sigma(x) = \max(0, x)$. It simply sets all negative values to zero, which makes it applicable to the first and middle layers [99]. One way ReLUs improve neural networks is by speeding up training: the gradient computation is very simple (either 0 or 1, depending on the sign of $x$). The computational step of a ReLU is also cheap: any negative elements are set to 0.0, so no exponential, multiplication or division operations are required.

2. Sigmoid: This non-linearity has the mathematical form $\sigma(x) = \frac{1}{1 + e^{-x}}$. It takes a real-valued number and squashes it into the range between 0 and 1 [99]. Because this is a classification problem with four classes, the output layer uses softmax, the multi-class generalization of the sigmoid, so that the four outputs can be read as the likelihoods of the classes.
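The activation functions named in this section can be written in a few lines. This is a minimal scalar sketch using only the standard library, not the Keras implementation; the max-shift inside `softmax` is a common numerical-stability device and is an addition of this sketch.

```python
import math


def relu(x):
    """Rectified linear unit: negative inputs become 0."""
    return max(0.0, x)


def sigmoid(x):
    """Logistic sigmoid, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def softmax(z):
    """Numerically stable softmax over a list of scores."""
    shift = max(z)
    exps = [math.exp(v - shift) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


print(relu(-2.5), relu(3.0))   # 0.0 3.0
print(sigmoid(0.0))            # 0.5
probs = softmax([2.0, 1.0, 0.1, -1.0])
print(round(sum(probs), 6))    # 1.0
```

Softmax outputs are non-negative and sum to one, which is what allows the four network outputs to be interpreted as class likelihoods.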

To construct the DNN architectures for heart disease diagnosis, one can use one of the many currently available frameworks, including TensorFlow [1], Keras and Caffe [76].

4.5.2 Automatic Feature Learning

Once the features have been selected, extracted and transformed from the original ones, the classifier is designed. Typically, the available data set is divided into a training set and a test set, and the classifier is designed using the training set. The machine learning framework for the proposed study uses neural networks. While some machine learning algorithms require preprocessing of the data set and separate feature extraction techniques, CNNs do not have these requirements. This makes CNNs advantageous and reduces the burden of choosing the best feature extraction procedure for the automatic detection of arrhythmias [21, 22].

Since there is no generic way to determine a priori the best number of neurons and layers for a neural network from the problem description alone, the idea is to start with a rough guess based on prior experience with networks used on similar problems. We start simple and build up complexity as we proceed. In the deep learning method, we set the number of epochs to 200 and monitored the loss, selecting the model with the smallest validation loss. The optimal number of hidden units could easily be smaller than the number of inputs. We chose a deep learning approach with a fully connected 3-layer neural network as the baseline method. We use this network structure because the data set includes both temporal and non-temporal information; hence, as an initial attempt, we do not employ time-series analysis. The network structure is depicted in Figure 4.8.

There are different ways to choose the size of the hidden layers. Common rules of thumb are that the size of a hidden layer is normally between the sizes of the input and output layers, that the number of hidden neurons should be less than twice the size of the input layer, and that it should be 2/3 the size of the input layer plus the size of the output layer.

In our experiments, we used k-fold cross-validation with k = 5. Therefore, 80% of the data (chosen uniformly at random among all instances) was used for training and 20% to test the network. We then used 30% of the training data to validate the results. The training and test data sets were randomly shuffled. As an alternative, cross-validation can be used to assess the accuracy on the test set (Figure 4.8b).
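The splitting scheme can be sketched as follows, assuming that each of the 5 folds serves once as the 20% test set and that 30% of the remaining training indices are held out for validation. The function name `five_fold_splits`, the seed and the way the validation subset is drawn are assumptions of this sketch.

```python
import random


def five_fold_splits(n_samples, k=5, val_fraction=0.3, seed=0):
    """Yield (train, validation, test) index lists for each fold."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        rest = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        n_val = int(round(val_fraction * len(rest)))
        yield rest[n_val:], rest[:n_val], test


splits = list(five_fold_splits(6000))
train, val, test = splits[0]
print(len(train), len(val), len(test))  # 3360 1440 1200
```

With 6000 instances, each fold therefore leaves 1200 instances for testing, 1440 for validation and 3360 for fitting the weights.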

4.5.3 Classification Prediction (Labeling)

The gold standard is that all printed reports are reviewed by a qualified physician. The physician's diagnosis is then added to the report as comments representing the result for each patient. Similarly, the outputs of the network are designed to be analogous to the physician's report for each patient. A typical physician's report may distinguish a "Normal" from an "Abnormal" condition, and the physician may also be able to identify the form of abnormality from the current reading of the data. Therefore, within the "Abnormal" category there may be different arrhythmias depending on the condition, such as first-degree AV block or atrial fibrillation (AF), in which the heart beats in an irregular fashion. Accordingly, the suggested labeling of the output is shown in Table 4.2.

4.5.4 Classification Framework

Here, the objective is to develop a deep learning model that classifies the outputs of the ECG holters into four categories: 1: Normal, 2: Benign, 3: AV block, 4: AFB. The details of each category will be explained in the following sections. Let $L = \{(f^{(i)}, y^{(i)})\}$, $i = 1, 2, \dots$, represent the collection of all labeled inputs (holter reports), where $f^{(i)}$ is the $i$th feature vector and $y^{(i)}$ indicates the corresponding label. The goal is to learn a mapping from $f^{(i)}$ to $y^{(i)}$ in a supervised framework by using CNNs. Figure 4.1 shows a schematic diagram of our method. Our framework consists of two parts. First, we use a multi-layer CNN to automatically learn a high-level feature representation of the temporal ECG data. We can then predict and classify the diagnosis based on the learned features on new and unseen data.
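For a categorical loss, integer labels of this kind are typically converted to one-hot vectors, and a predicted class is read off as the index of the largest network output. The sketch below assumes the 1-to-4 numbering of the four categories; the helper names `one_hot` and `predict_label` are illustrative.

```python
CLASS_NAMES = ["Normal", "Benign", "AV block", "AFB"]  # labels 1..4


def one_hot(label, n_classes=4):
    """Encode a label in 1..n_classes as a one-hot vector."""
    vec = [0.0] * n_classes
    vec[label - 1] = 1.0
    return vec


def predict_label(probabilities):
    """Return the 1-based label of the most likely class."""
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return best + 1


print(one_hot(3))                              # [0.0, 0.0, 1.0, 0.0]
label = predict_label([0.1, 0.2, 0.6, 0.1])
print(label, CLASS_NAMES[label - 1])           # 3 AV block
```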

4.5.5 Fine Tuning

Here, we need to tune a group of parameters to achieve better accuracy and network efficiency. To set these hyper-parameters, we heuristically tried different network structures so that the lowest reconstruction error on the training data is obtained with the default library values for all other hyper-parameters [98]. The parameters/hyper-parameters to be experimented with are:

• Batch size: The batch size of 200 is used.

• Choice of optimization algorithm: The categorical cross-entropy loss was optimized with the Adam optimizer (adaptive moment estimation) in Keras. Adam is based on adaptive estimates of lower-order moments. The method is straightforward to implement, is computationally efficient, has low memory requirements, is invariant to diagonal rescaling of the gradients, and is well suited for problems that are large in terms of data and/or parameters [55].

Table 4.1: Network input: features derived from the holter's generated report

Feature number   Feature
1                Minimum heart rate
2                Maximum heart rate
3                Average heart rate
4                Total beats
5                Ventricular premature beats
6                Ventricular pairs
7                Ventricular runs
8                Supraventricular premature beats
9                Supraventricular pairs
10               Supraventricular runs
11               Pause

Table 4.2: Output labels: the predictable Normal condition and arrhythmia types.

Output number   Label
1               Normal
2               Benign
3               AV Block
4               AFB

• Number of epochs: After trying different numbers of epochs and evaluating the network performance, we settled on 200 epochs.

• Activation functions, which are explained in detail in section 4.5.1.

• The structure of the network (number of layers, number of units in each layer, whether we have dropout or not i.e. whether we are using a fully connected network, etc.)

Since there is no mathematically rigorous formula for optimizing these parameters and hyper-parameters, we went through rounds of experimental testing for each of the above parameters.
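To make the Adam update concrete, the following is a minimal sketch of the standard Adam rule (exponential moving averages of the gradient and its square, with bias correction) applied to a toy quadratic. It is not the Keras implementation; the learning rate, step count and objective are hypothetical choices for illustration.

```python
import math


def adam_minimize(grad, theta, steps=2000, lr=0.01,
                  beta1=0.9, beta2=0.999, eps=1e-8):
    """Run Adam for a fixed number of steps and return the parameters."""
    m = [0.0] * len(theta)   # first-moment (mean) estimates
    v = [0.0] * len(theta)   # second-moment (uncentered variance) estimates
    theta = list(theta)
    for t in range(1, steps + 1):
        g = grad(theta)
        for j in range(len(theta)):
            m[j] = beta1 * m[j] + (1 - beta1) * g[j]
            v[j] = beta2 * v[j] + (1 - beta2) * g[j] ** 2
            m_hat = m[j] / (1 - beta1 ** t)   # bias-corrected moments
            v_hat = v[j] / (1 - beta2 ** t)
            theta[j] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta


# Toy objective f(theta) = (theta_0 - 3)^2 + (theta_1 + 1)^2.
def toy_grad(th):
    return [2 * (th[0] - 3.0), 2 * (th[1] + 1.0)]


theta = adam_minimize(toy_grad, [0.0, 0.0])
print([round(x, 2) for x in theta])
```

Because each step is normalized by the running gradient magnitude, the update size is largely insensitive to a rescaling of the gradients, which is the invariance property mentioned in the list above.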

4.5.6 Optimization

The learning process proceeds by optimizing the weights of each layer through forward and backward propagation. In forward propagation, each layer transforms the output of the previous layer according to its functionality. The output of the last layer is compared with the label values and the total error is computed. In backward propagation, the corresponding transformation is applied to the derivatives of the obtained error with respect to the outputs and weights of each layer. After backward propagation finishes, the weights are changed in the direction that decreases the total error. This procedure is repeated for all samples in the training set [60]. The training data set contains $m$ samples, $(x^{(1)}, y^{(1)}), (x^{(2)}, y^{(2)}), \dots, (x^{(m)}, y^{(m)})$, where $x^{(i)}$ is a sample and $y^{(i)}$ is its corresponding label. The aim of the learning process is to use stochastic gradient descent (SGD) to optimize the weights of the CNN models so as to minimize the logarithmic softmax function [102]:

$$h_\theta(x) = -\log\left(\frac{e^{\theta^T x_i}}{\sum_{n=1}^{m} e^{\theta^T x_n}}\right) \qquad (4.1)$$

where $\theta$ is the vector of parameters. Using this function, the outputs of the CNN are the log likelihoods of class membership. We need to fit the parameters of the loss function using the cost function, calculated as [102]

$$J_{train}(\theta) = \frac{1}{m} \sum_{i=1}^{m} cost(\theta, (x^{(i)}, y^{(i)})) \qquad (4.2)$$

and the standard cost function is defined as [102]

$$cost(\theta, (x^{(i)}, y^{(i)})) = \frac{1}{2}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2 \qquad (4.3)$$
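The two quantities in Equations 4.1 to 4.3 can be evaluated numerically as in the sketch below. It treats the scores $\theta^T x_n$ as already computed and uses a log-sum-exp shift for numerical stability; both helper names are illustrative, and the sketch only mirrors the formulas as written rather than the training code.

```python
import math


def neg_log_softmax(scores, i):
    """-log(exp(scores[i]) / sum_n exp(scores[n])), computed stably."""
    shift = max(scores)
    log_sum = shift + math.log(sum(math.exp(s - shift) for s in scores))
    return log_sum - scores[i]


def j_train(h_values, y_values):
    """Mean over m samples of the per-sample cost 0.5 * (h - y)^2."""
    m = len(h_values)
    return sum(0.5 * (h - y) ** 2 for h, y in zip(h_values, y_values)) / m


print(round(neg_log_softmax([0.0, 0.0, 0.0, 0.0], 1), 4))  # 1.3863
print(j_train([1.0, 0.0], [0.0, 0.0]))                     # 0.25
```

With four equal scores the softmax probability is 1/4, so the negative log-softmax equals $\log 4 \approx 1.3863$.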

4.5.7 Implementation

The implementation is in Python 3.7.1, executed on a CPU (and compatible with any Python 3.x version). We implement the DNNs in Keras [76] using the TensorFlow [2] back-end. The proposed algorithm for training the network is based on mini-batch gradient descent with batch size 50 using the Adam optimizer. Training and testing are performed on a computer with 16 GB of memory, equipped with an Intel Core i7 CPU running at 2.11 GHz (8 cores). For the current data set, training the network for 100 epochs takes less than a minute on this machine. Early stopping and the calculation of additional performance metrics are implemented using Keras callbacks and the TensorFlow back-end to evaluate internal states and statistics of the model during training.

The loss function for the network is the categorical cross-entropy loss, and the metric used is categorical accuracy, which is chosen to interpret the values during training. The detailed results and an analysis of the network behavior are presented in the next chapter.

Figure 4.6: Schematic of the proposed network, showing nodes and layer connectivity. The number of inputs is chosen based on the information available from the reported holter readings; therefore, the input layer contains 10 nodes. The output layer of the network includes four classes to predict: Normal, Benign, AV block, AFB.

Figure 4.7: Network structure representation. The number of inputs and parameters for each layer as well as the type of each layer.

Figure 4.8: Distributions of the data set used for training and testing: (a) suggested methods for cross-validation and testing [56]; (b) the 5-fold methodology used for training and testing the data.

Chapter 5

Results and Verification

5.1 Network Analysis

To analyze the network behavior and the layers that contribute most to distinguishing between Normal, Benign and the two classes of abnormality, we examine the final high-level feature representation for each class. As explained in section 4.5.2, model parameters are learned in the feed-forward step using the training set. Then, in the back-propagation step, the training error is evaluated and the parameters are adjusted so that this error is minimized [58]. In the next step, the labels are predicted for the test set. These labels are used to evaluate the classification performance [58]. We use learning curves to determine whether our algorithm has high variance (over-fitting) or high bias (under-fitting). This technique is based on the variation of the training error as training progresses (over epochs). Over-fitting happens when a model memorizes the training data rather than learning the trend of the data. In that case, the test error starts increasing at some point while the training error steadily decreases, so a gap opens between the training and test errors as training progresses [58]. Under-fitting, on the other hand, happens when both the training and test errors remain large as training proceeds [58].
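The reading of a learning curve described here can be turned into a simple check. The sketch below selects the epoch with the smallest validation loss and flags possible over-fitting when the training loss kept falling after that epoch while the validation loss rose again; this rule and the loss values are hypothetical and only illustrate the idea.

```python
def summarize_learning_curve(train_losses, val_losses):
    """Return the best epoch (0-based) and a crude over-fitting flag."""
    best = min(range(len(val_losses)), key=val_losses.__getitem__)
    train_still_falling = train_losses[-1] < train_losses[best]
    val_rose_after_best = val_losses[-1] > val_losses[best]
    return {
        "best_epoch": best,
        "overfitting_suspected": train_still_falling and val_rose_after_best,
    }


# Hypothetical per-epoch losses.
train_losses = [0.9, 0.6, 0.4, 0.3, 0.2]
val_losses = [1.0, 0.7, 0.5, 0.6, 0.8]
summary = summarize_learning_curve(train_losses, val_losses)
print(summary)  # {'best_epoch': 2, 'overfitting_suspected': True}
```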

5.1.1 Error Analysis

Cross entropy indicates the distance between what the model believes the output distribution should be and what the original distribution really is. It is defined as $H(y, p) = -\sum_i y_i \log(p_i)$, where $y_i$ is the label for class $i$ and $p_i$ is the predicted probability of class $i$. The cross-entropy measure is a widely used alternative to squared error. It is used when node activations can be understood as representing the probability that each hypothesis might be true, i.e. when the output is a probability distribution. Thus, it is used as a loss function in neural networks that have softmax activations in the output layer [27].
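A minimal sketch of the categorical cross-entropy for one sample follows, assuming a one-hot label vector and a predicted probability vector. The small `eps` clamp, which avoids taking the logarithm of zero, is an addition of this sketch.

```python
import math


def categorical_cross_entropy(y_true, p_pred, eps=1e-12):
    """H(y, p) = -sum_i y_i * log(p_i) for a single sample."""
    return -sum(y * math.log(max(p, eps)) for y, p in zip(y_true, p_pred))


y_true = [0, 1, 0, 0]              # one-hot label for the second class
p_pred = [0.1, 0.7, 0.1, 0.1]      # hypothetical softmax output
loss = categorical_cross_entropy(y_true, p_pred)
print(round(loss, 4))  # 0.3567
```

For a one-hot label the sum reduces to the negative log of the probability assigned to the true class, so the loss approaches zero as that probability approaches one.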

Figure 5.1: Learning curve: Training and validation losses during training and test phase

In short, the entropy tells us the theoretical minimum average encoding size for events that follow a particular probability distribution. Therefore, we calculate the expectation with respect to the probability distribution P. Figure 5.1 shows the loss graph, i.e. the learning curve, of the proposed algorithm. As depicted in this figure, both the training and test error curves gradually decrease towards zero. The cross-entropy loss decreases as the predicted probability converges to the actual label. The categorical error is illustrated in Figure 5.2. Here again, the categorical error gradually decreases and tends towards a constant value as the number of epochs increases. This also indicates that there is no over- or under-fitting in the training and test phases.

5.1.2 Network Accuracy

Figure 5.2: Learning curve graph: Training and validation losses during training and test phase

Figure 5.3 depicts the categorical accuracy as well as the validation categorical accuracy. Categorical accuracy checks whether the index of the maximal true value is equal to the index of the maximal predicted value. As Figure 5.3 shows, the categorical accuracy increases with the number of epochs, starting from 80% and reaching 93% by the end of training. In addition, the validation accuracy reaches 91.49%. Therefore, the training and test accuracies are 93.39% and 91.49%, respectively.
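Categorical accuracy, as defined above, compares the position of the largest entry in each one-hot label with the position of the largest predicted value. The sketch below is a plain-Python illustration with hypothetical predictions, not the Keras metric itself.

```python
def categorical_accuracy(y_true, y_pred):
    """Fraction of samples whose argmax matches between label and prediction."""
    def argmax(values):
        return max(range(len(values)), key=values.__getitem__)

    hits = sum(argmax(t) == argmax(p) for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


y_true = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
y_pred = [[0.7, 0.1, 0.1, 0.1],
          [0.6, 0.2, 0.1, 0.1],   # predicted class 1, true class 2
          [0.1, 0.1, 0.7, 0.1],
          [0.1, 0.2, 0.3, 0.4]]
acc = categorical_accuracy(y_true, y_pred)
print(acc)  # 0.75
```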

5.2 Classification Performance

To represent the accuracy of classification, we compute the confusion matrix. A confusion matrix is a table that summarizes the performance of a classification model (or "classifier") on a set of test data for which the true values are known, and it allows the performance of an algorithm to be visualized [97]. By definition, a confusion matrix $C$ is such that $C_{i,j}$ is equal to the number of observations known to be in group $i$ but predicted to be in group $j$ [33]. We have a multi-class classification problem with four classes; hence, we use a confusion matrix to assess the prediction performance of the network. The following terminology is used:

Figure 5.3: Accuracy graph indicating categorical accuracy and validation categorical accuracy

• True positive: a sample that is predicted positive and is actually positive. Hence, the true positives ($TP_A$) are the true positive samples in class A (Normal), i.e., the samples from class A that are correctly classified as class A.

• True negative: a sample that is predicted negative and is actually negative. Hence, the true negatives ($TN_A$) are the true negative samples for class A (Normal), i.e., the samples that are correctly not classified as class A.

• False negative: $FN_A$ counts the samples from class A that were misclassified as another class. Let $E_{AB}$ be the number of samples from class A that were incorrectly classified as class B. Then the false negatives of class A are $FN_A = E_{AB} + E_{AC} + E_{AD}$, the sum of all class A samples that were incorrectly classified as class B, C or D. With rows indexing the true class and columns the predicted class, this is the sum of the off-diagonal entries in the row of class A.

• False positive: $FP_A$ counts the samples from other classes that were incorrectly classified as class A: $FP_A = E_{BA} + E_{CA} + E_{DA}$. With rows indexing the true class and columns the predicted class, this is the sum of the off-diagonal entries in the column of class A.
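The four per-class counts can be read off a confusion matrix as sketched below, assuming that rows index the true class and columns the predicted class. The matrix entries are hypothetical illustrative counts, not values taken from the figures of this chapter.

```python
def per_class_counts(cm):
    """cm[i][j] = number of samples of true class i predicted as class j."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    counts = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                          # rest of row k
        fp = sum(cm[i][k] for i in range(n)) - tp     # rest of column k
        tn = total - tp - fn - fp
        counts.append({"TP": tp, "FP": fp, "FN": fn, "TN": tn})
    return counts


# Hypothetical 4-class matrix (order: Normal, Benign, AV block, AF).
cm = [[42, 1, 0, 0],
      [1, 0, 0, 0],
      [0, 0, 1, 0],
      [1, 1, 0, 0]]
counts = per_class_counts(cm)
print(counts[0])  # {'TP': 42, 'FP': 2, 'FN': 1, 'TN': 2}
```

For every class the four counts add up to the total number of samples, which is a quick sanity check on any hand-built confusion matrix.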

In an $m \times m$ confusion matrix, there are $m$ correct classifications and $m^2 - m$ possible errors [96]. Figure 5.4 depicts the confusion matrix for the whole data set of all cases. As seen, the diagonal elements represent the number of points for which the predicted label is equal to the true label, while the off-diagonal elements are those that are mislabeled by the classifier. The higher the diagonal values of the confusion matrix the better, indicating many correct predictions. Here, the entries of the confusion matrix depicted in Figure 5.4 are the numbers of occurrences of each class for the data set being analysed. The total of the diagonal elements allows us to obtain the categorical accuracy of 1 or 100% from the confusion matrix. Table 5.1 presents the evaluation metrics for the proposed network. The corresponding formulations of the metrics are described below.

Figure 5.4: Categorical confusion matrix: original data set

Table 5.1: Evaluation of network

Measure    Normal   Benign   AV Block   AF
F1 score   0.81     0.89     1.0        0.85
AUC        0.69     0.81     1.0        0.74

• Precision is calculated as the number of correct positive predictions (TP) divided by the total number of positive predictions. It is also known as the positive predictive value (PPV) [13]:

$PRES = \frac{TP}{TP + FP}$

• Recall, or sensitivity, is the true positive rate (TPR) [13]:

$REC = \frac{TP}{TP + FN}$

• Accuracy is calculated as the total number of correct predictions (TP + TN) divided by the total number of samples in the data set (P + N) [13]:

$ACC = \frac{TP + TN}{TP + TN + FP + FN}$

• F1-score is the harmonic mean of precision and recall. This measure describes the relation between the data's positive labels and those given by a classifier [95]:

$F1 = \frac{2 \cdot PRES \cdot REC}{PRES + REC}$

• Specificity, or true negative rate (TNR), measures how effectively a classifier identifies negative labels [95]:

$SP = \frac{TN}{FP + TN}$

• AUC, the area under the ROC curve, measures the classifier's ability to avoid false classification [95]:

$AUC = \frac{1}{2}\left(\frac{TP}{TP + FN} + \frac{TN}{TN + FP}\right)$
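The measures listed above can be computed from the four counts of a single class as in the following sketch. Returning 0.0 when a denominator is zero is a convention chosen for this sketch (it arises, for example, when a class is never predicted), and the input counts are hypothetical.

```python
def safe_div(num, den):
    """Divide, returning 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def class_metrics(tp, fp, fn, tn):
    """Precision, recall, specificity, accuracy, F1 and AUC for one class."""
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    specificity = safe_div(tn, tn + fp)
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "accuracy": safe_div(tp + tn, tp + tn + fp + fn),
        "f1": safe_div(2 * precision * recall, precision + recall),
        "auc": 0.5 * (recall + specificity),
    }


# Hypothetical counts for one class.
metrics = class_metrics(tp=42, fp=2, fn=1, tn=2)
print(round(metrics["f1"], 4), round(metrics["auc"], 4))  # 0.9655 0.7384
```

The example shows how a class can combine a high F1-score with a modest AUC: recall is high, but specificity is only 0.5 because half of the few negative samples are misclassified.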

5.3 Test phase

To further verify the network efficiency, we evaluated the performance of our network on a new data set of 47 patients. This is considered the test phase of this project. This new data set has not previously been seen by the network. In addition, we did not perform any data augmentation on this data set. In other words, the pre-trained network is applied to this new data set without any tuning.

5.3.1 Classification Verification

To further test the performance of the network, we expose it to a totally new data set without any tuning. Figure 5.5 depicts the confusion matrix for this data set, and Table 5.2 reports the accuracy of the network on it. The classification results indicate that 93.62% of the predicted outputs were correctly classified. Illustrating this information as a confusion matrix (Figure 5.5) gives a better picture of the performance of the algorithm. Figure 5.5 is summarized below; out of the 47 patients:

Figure 5.5: Categorical confusion matrix: Test phase-47 new patients

• There were 43 actual instances of "Normal" (Class:1), and the classifier predicted all of them correctly.

• There is 1 instance of "Normal" (Class 1, first row), which the classifier predicted as "Benign".

• There is 1 instance of "AV" (class 3), and the classifier predicted as "AV".

• There is 1 instance of "Benign" (Class:2, second row), which the classifier predicted "Normal".

• There are 2 instances of "AF" which the classifier predicted as "Normal" and "Benign".

Table 5.2: Evaluation of network on 47 new patients

Measure    Normal   Benign   AV Block   AF
F1 score   0.9655   0.0      1.0        0.0
AUC        0.7383   0.4782   1.0        0.5

5.4 Discussion

In this project, we had access to a limited data set in which the associated labels are highly unbalanced. The majority of the data are labeled "Normal", whereas in total less than 5% are labeled as the other three categories. We overcame the issue of unequal distribution through data augmentation; the methods were described in 4.4.1. As Table 5.1 shows, even with unbalanced data we reached AUCs of 0.69, 0.81, 1 and 0.74 for Normal, Benign, AV Block and AF classification, respectively. In the test phase, we presented an extra step to further demonstrate the capabilities of the proposed network. The metrics reported in Table 5.2 were obtained without any data augmentation or tuning. Consequently, in the study of 47 patients, since there is only 1 case in the "Benign" category, once the network fails to detect this case as Benign, the AUC drops to 0.5. If the number of cases in these classes were higher, or if data augmentation had been used, the accuracy would be expected to increase with the same network and pipeline.

Chapter 6

Conclusion and Future work

Holter monitoring is the most common diagnostic tool for monitoring a patient's ECG during daily activities. Although the technology and its use have evolved over recent decades, the study remains a foundation for medical professionals worldwide. Most clinicians agree that there is still no replacement for holter monitoring. Among the various holters available on the market, we chose to perform this study on the thin and lightweight Chroma2 holter recorder, which provides accurate pacer detection with adjustable sensitivity for the latest pacemaker technology [25]. In this work, we developed an automated technique based on machine learning for the classification of holter monitor ECG data recorded over a 24-hour period using a ScotCare holter device. The classification task is to predict normal and abnormal categories; in addition, three classes of abnormality were defined within the abnormal category. Our proposed network structure could have various applications in patient management, because early detection of symptoms of cardiac disease, or of clues to life-threatening cardiac arrhythmias, could potentially save many lives, both in patients with as yet undiagnosed cardiac disease and in those with already known cardiac disease. To approach this goal, we used convolutional neural networks that can learn from existing data for future prediction. The current standard procedure for interpreting holter recordings is that a physician reads the summary of the signals and, according to QRS peak detection and other factors of the ECG signal, decides on the normality or abnormality of the patient's condition and, if abnormal, on the type of abnormality. This approach is intuitive and relies heavily on the physician's previous experience in reading the signals. We used the physician's diagnosis to label the holter data. We then used a deep neural network and trained it using features that were derived directly from the holter reports, not the raw ECG data.

6.1 Summary of Contributions

Accurate classification of arrhythmia type provides sufficient information to detect some heart diseases and, given the relatively large patient data set fed into the algorithm, helps physicians find the best treatment for patients. With automatic, high-accuracy diagnosis, the objective is to save doctors considerable time. Moreover, coupled with low-cost ECG devices such as holter monitors, an automated detection system can provide diagnostic tools in remote areas with limited access to skilled physicians. The performance of our proposed method was verified through a series of metrics: the training and test error graphs were studied, and the categorical and validation categorical accuracies were investigated. The training and test accuracies are 93.39% and 91.49%, respectively. Besides the routine performance metrics, we also tested our proposed platform on a data set of 47 new patients and achieved AUCs of 0.7383 for the Normal class, 0.5 for the Benign class, 1.0 for AV Block and 0.5 for the AF class. Consistent with our findings, in general, the information provided by holter monitors is an obvious advantage for both automatic algorithm analysis and physician interpretation.

6.2 Challenges and Limitations

Analysing the ECG signals generated by holter monitors with machine learning methods is a promising approach, but dealing with medical data for clinical applications raises additional challenges, including limited access to large amounts of data. In addition, in holter monitor readings specifically, not all of the data may be usable by the network because of high noise levels or because the holter recording was interrupted before the 24 hours were complete. Detecting arrhythmias from ECG records has traditionally been challenging for computer systems; thus, arrhythmia detection is usually performed by expert technicians and cardiologists. Besides these general challenges, there are other specific challenges, listed below. The details of how to overcome them have been discussed previously.

1. The dataset is sparse, namely it contains many entries that are zero. Therefore, a suitable strategy should be considered for normalizing the data such that its sparsity is taken into account. One promising solution is to perform normalization on the selected features.

2. The dataset contains missing values, i.e. there are cases where the sensor was disconnected from the skin or where values are missing due to other factors. Thus, one needs to impute these missing values by taking into consideration the nature of the deterministic or stochastic phenomena that caused them to be missing.

3. The dataset includes temporal (sequential) features as well as non-temporal features, which makes the modeling difficult.

4. The labels associated with the samples are highly unbalanced: a large proportion of them are of the normal type and only a few of the samples are of abnormal types. This issue can have a negative impact on generalization guarantees and renders most learning algorithms ineffective. As a result, it is critical to enrich the dataset by employing domain knowledge in order to synthesize/simulate valid dummy samples from abnormal classes.

6.3 Future Work

For the purpose of this project, all the holter readings were verified by one physician. However, to define a gold standard for verifying the accuracy of the proposed network, since there are currently no "simulators" for this purpose, one could hypothetically have 5 physicians interpret the holter readings and use their interpretations as the gold standard. Considering the wide range of holter monitor applications, many interesting areas of research can be suggested to further expand and improve on the proposed solutions. These may include:

1. Implementation of the proposed platform on non-invasive cardiovascular diagnostic devices and software that perform 24/7 clinical monitoring, to provide comprehensive diagnosis and management systems for physicians in:

• Hospitals
• Practices
• Implantable device clinics
• Cardiopulmonary rehab clinics
• Independent diagnostic testing facilities

2. The application of our ML framework can be expanded by defining extra features that can be derived from holter monitors, as well as additional labels covering a larger variety of arrhythmias. This would yield a more powerful network by providing extra influential features. Additional features may include:

• Time: The ECG recording time could be added as a feature; this may reveal some disorders depending on the time of their occurrence.
• Gender: Adding gender as an extra feature could potentially improve the network performance.
• Other directly ECG-relevant features could be derived from the raw data, such as the width of pulses.

3. This research is far from concluded, owing to the lack of abnormal data. To reduce the sparsity of the data, we suggest increasing the data size as well as improving the data augmentation techniques. This could be done by embedding the science of each arrhythmia type in the data augmentation process as a guide.

4. Application generation: with a larger data set, there is a great opportunity to create cellphone applications based on the current ML platform that could easily be installed on cellphones. These applications could then be used as an interpretation guide by those in need of continuous in-home wearing of ECG recording devices, as well as by caregivers using holters, as a tool to help read and analyse the outputs of holter monitors.

Bibliography

[1] Martín Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, et al. Tensorflow: A system for large-scale machine learning. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), pages 265–283, 2016.

[2] Martín Abadi, A Agarwal, P Barham, et al. TensorFlow: Large-scale machine learning on heterogeneous distributed systems. In Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI'16), Savannah, GA, USA, pages 265–283, 2016.

[3] U Rajendra Acharya, Hamido Fujita, Oh Shu Lih, Yuki Hagiwara, Jen Hong Tan, and Muhammad Adam. Automated detection of arrhythmias using different intervals of tachycardia ecg segments with convolutional neural network. Information sciences, 405:81–90, 2017.

[4] Abien Fred Agarap. Deep learning using rectified linear units (relu). arXiv preprint arXiv:1803.08375, 2018.

[5] Mohamad Mahmoud Al Rahhal, Yakoub Bazi, Haikel AlHichri, Naif Alajlan, Farid Melgani, and Ronald R Yager. Deep learning approach for active classification of electrocardiogram signals. Information Sciences, 345:340–354, 2016.

[6] Subhi J Al’Aref, Khalil Anchouche, Gurpreet Singh, Piotr J Slomka, Kranthi K Kolli, Amit Kumar, Mohit Pandey, Gabriel Maliakal, Alexander R van Rosendael, Ashley N Beecy, et al. Clinical applications of machine learning in cardiovascular disease and its relevance to . European heart journal, 2018.

[7] QUICK REFERENCE FOR THE HOLTER ANALYST. 2018 welch allyn, 2018.

[8] Fernando Andreotti, Oliver Carr, Marco AF Pimentel, Adam Mahdi, and Maarten De Vos. Comparing feature-based classifiers and convolutional neural networks to detect arrhythmia from short segments of ecg. In 2017 Computing in Cardiology (CinC), pages 1–4. IEEE, 2017.

[9] Muhammad Arif et al. Robust electrocardiogram (ecg) beat classification using discrete wavelet transform. Physiological measurement, 29(5):555, 2008.

[10] Natalia M Arzeno, Zhi-De Deng, and Chi-Sang Poon. Analysis of first-derivative based qrs detection algorithms. IEEE Transactions on Biomedical Engineering, 55(2):478– 484, 2008.

[11] Oleg Yu Atkov, Svetlana G Gorokhova, Alexandr G Sboev, Eduard V Generozov, Elena V Muraseyeva, Svetlana Y Moroshkina, and Nadezhda N Cherniy. Coronary heart disease diagnosis by artificial neural networks including genetic polymorphisms and clinical parameters. Journal of cardiology, 59(2):190–194, 2012.

[12] Paddy M Barrett, Ravi Komatireddy, Sharon Haaser, Sarah Topol, Judith Sheard, Jackie Encinas, Angela J Fought, and Eric J Topol. Comparison of 24-hour holter monitoring with 14-day novel adhesive patch electrocardiographic monitoring. The American journal of medicine, 127(1):95–e11, 2014.

[13] Mohamed Bekkar, Hassiba Kheliouane Djemaa, and Taklit Akrouf Alitouche. Evaluation measures for models assessment over imbalanced data sets. J Inf Eng Appl, 3(10), 2013.

[14] D Benitez, PA Gaydecki, A Zaidi, and AP Fitzpatrick. The use of the hilbert transform in ecg signal analysis. Computers in biology and medicine, 31(5):399–406, 2001.

[15] Ashwin Bhandare, Maithili Bhide, Pranav Gokhale, and Rohan Chandavarkar. Applications of convolutional neural networks. International Journal of Computer Science and Information Technologies, 7(5):2206–2215, 2016.

[16] Mohamad Sabri bin Sinal and Eiji Kamioka. Early abnormal heartbeat multistage classification by using decision tree and k-nearest neighbor. In Proceedings of the 2018 Artificial Intelligence and Cloud Computing Conference, pages 29–34. ACM, 2018.

[17] Nikhil Buduma and Nicholas Locascio. Fundamentals of deep learning: Designing next-generation machine intelligence algorithms. O’Reilly Media, Inc., 2017.

[18] A John Camm, Marek Malik, J Thomas Bigger, Günter Breithardt, Sergio Cerutti, Richard J Cohen, Philippe Coumel, Ernest L Fallen, Harold L Kennedy, RE Kleiger, et al. Heart rate variability: standards of measurement, physiological interpretation and clinical use. task force of the european society of cardiology and the north american society of pacing and . 1996.

[19] Statistics Canada. Table 102-0561-leading causes of death, total population, by age group and sex, canada, annual, cansim (database), 2018.

[20] Lorenzo Carnevale, Antonio Celesti, Maria Fazio, Placido Bramanti, and Massimo Villari. Heart disorder detection with menard algorithm on apache spark. In European Conference on Service-Oriented and Cloud Computing, pages 229–237. Springer, 2017.

[21] BS Chandra, Challa Subrahmanya Sastry, and Soumya Jana. Robust heartbeat detection from multimodal data via cnn-based generalizable information fusion. IEEE Transactions on Biomedical Engineering, 66(3):710–717, 2019.

[22] Shanti Chandra, Ambalika Sharma, and Girish Kumar Singh. Feature extraction of ecg signal. Journal of medical engineering & technology, 42(4):306–316, 2018.

[23] I Christov and I Simova. Q-onset and t-end delineation: Assessment of the performance of an automated method with the use of a reference database. Physiological Measurement, 28(2):213, 2007.

[24] Vivian L. Clark and James A. Kruse. Clinical Methods: The History, Physical, and Laboratory Examinations. JAMA, 264(21):2808–2809, 12 1990.

[25] The ScottCare Corporation. Scotcare cardiovascular solutions, 2016.

[26] John Creason, Lucas Neas, Debra Walsh, Ron Williams, Linda Sheldon, Duanping Liao, and Carl Shy. Particulate matter and heart rate variability among elderly retirees: the baltimore 1998 pm study. Journal of Exposure Science and Environmental Epidemiology, 11(2):116, 2001.

[27] P Dahal. Classification and loss evaluation - softmax and cross entropy loss, 2016.

[28] Gaël de Lannoy, Benoît Frénay, Michel Verleysen, and Jean Delbeke. Supervised ecg delineation using the wavelet transform and hidden markov models. In 4th European Conference of the International Federation for Medical and Biological Engineering, pages 22–25. Springer, 2009.

[29] Margaret K Delano. A long term wearable electrocardiogram (ECG) measurement system. PhD thesis, Massachusetts Institute of Technology, 2012.

[30] Usha Desai, Roshan Joy Martis, U Rajendra Acharya, C Gurudas Nayak, G Seshikala, and Ranjan Shetty K. Diagnosis of multiclass tachycardia beats using recurrence quantification analysis and ensemble classifiers. Journal of Mechanics in Medicine and Biology, 16(01):1640005, 2016.

[31] Leonard S Dreifus, Jai B Agarwal, Elias H Botvinick, Keith C Ferdinand, Charles Fisch, John D Fisher, J Ward Kennedy, Richard E Kerber, Charles R Lambert, Okike N Okike, et al. Heart rate variability for risk stratification of life-threatening arrhythmias. Journal of the American College of Cardiology, 22(3):948–950, 1993.

[32] Anthony Dupre, Sarah Vincent, and Paul A Iaizzo. Basic ecg theory, recordings, and interpretation. In Handbook of cardiac anatomy, physiology, and devices, pages 191–201. Springer, 2005.

[33] Tom Fawcett. An introduction to roc analysis. In Proc. Natl. Acad, pages 10–1016, 2006.

[34] National Collaborating Centre for Women’s and UK Children’s Health. Clinical presentation, diagnosis and management. 2008.

[35] Kim Fox, Jeffrey S. Borer, A. John Camm, Nicolas Danchin, Roberto Ferrari, Jose L. Lopez Sendon, Philippe Gabriel Steg, Jean-Claude Tardif, Luigi Tavazzi, Michal Tendera, and . Resting heart rate in cardiovascular disease. Journal of the American College of Cardiology, 50(9):823–830, 2007.

[36] A Ghaffari, MR Homaeinezhad, M Khazraee, and MM Daevaeiha. Segmentation of holter ecg waves via analysis of a discrete wavelet-derived multiple skewness–kurtosis based metric. Annals of biomedical engineering, 38(4):1497–1510, 2010.

[37] C Michael Gibson, Lauren N Ciaglo, Matthew C Southard, Shaun Takao, Caitlin Harrigan, Jason Lewis, Jason Filopei, Michelle Lew, Sabina A Murphy, and Jacqueline Buros. Diagnostic and prognostic value of ambulatory ecg (holter) monitoring in patients with coronary heart disease: a review. Journal of thrombosis and thrombolysis, 23(2):135–145, 2007.

[38] Anton P. M. Gorgels, Frits W. Bär, Karel Den Dulk, and Hein J. J. Wellens. Atrioventricular Dissociation, pages 1259–1290. Springer London, London, 2010.

[39] Cecilia Gutierrez and Daniel G Blanchard. Atrial fibrillation: diagnosis and treatment. Am Fam Physician, 83(1):61–68, 2011.

[40] Awni Y Hannun, Pranav Rajpurkar, Masoumeh Haghpanahi, Geoffrey H Tison, Codie Bourn, Mintu P Turakhia, and Andrew Y Ng. Cardiologist-level arrhythmia detection and classification in ambulatory electrocardiograms using a deep neural network. Nature medicine, 25(1):65, 2019.

[41] Heart and Stroke Foundation of Canada. Heart arrhythmia, 2018.

[42] Mikael Henaff, Joan Bruna, and Yann LeCun. Deep convolutional networks on graph-structured data. arXiv preprint arXiv:1506.05163, 2015.

[43] Geoffrey E Hinton, Nitish Srivastava, Alex Krizhevsky, Ilya Sutskever, and Ruslan R Salakhutdinov. Improving neural networks by preventing co-adaptation of feature detectors. arXiv preprint arXiv:1207.0580, 2012.

[44] MR Homaeinezhad, A Ghaffari, H Najjaran Toosi, R Rahmani, M Tahmasebi, and MM Daevaeiha. Ambulatory holter ecg individual events delineation via segmentation of a wavelet-based information-optimized 1-d feature. Scientia Iranica, 18(1):86–104, 2011.

[45] Data Sciences inc. Ecg research, 2019.

[46] Ali Işın, Cem Direkoğlu, and Melike Şah. Review of mri-based brain tumor image segmentation using deep learning methods. Procedia Computer Science, 102:317–324, 2016.

[47] Ali Isin and Selen Ozdalili. Cardiac arrhythmia detection using deep learning. Procedia computer science, 120:268–275, 2017.

[48] Shweta H Jambukia, Vipul K Dabhi, and Harshadkumar B Prajapati. Classification of ecg signals using machine learning techniques: A survey. In 2015 International Conference on Advances in Computer Engineering and Applications, pages 714–721. IEEE, 2015.

[49] Craig T January, L Samuel Wann, Joseph S Alpert, Hugh Calkins, Joaquin E Cigarroa, Joseph C Cleveland, Jamie B Conti, Patrick T Ellinor, Michael D Ezekowitz, Michael E Field, et al. 2014 aha/acc/hrs guideline for the management of patients with atrial fibrillation: executive summary: a report of the american college of cardiology/american heart association task force on practice guidelines and the heart rhythm society. Journal of the American College of Cardiology, 64(21):2246–2280, 2014.

[50] Jon Johnso. 24-hour holter monitoring: What to know, 2019.

[51] Tae Joon Jun, Hyun Ji Park, Nguyen Hoang Minh, Daeyoung Kim, and Young-Hak Kim. Premature ventricular contraction beat detection with deep neural networks. In 2016 15th IEEE International Conference on Machine Learning and Applications (ICMLA), pages 859–864. IEEE, 2016.

[52] Piroon Kaewfoongrungsie, Nipon Theera-Umpon, and Sansanee Auephanwiriyakul. Ecg holter recorder via mobile phone. 03 2019.

[53] N Kannathal, CM Lim, U Rajendra Acharya, and PK Sadasivan. Cardiac state diagnosis using adaptive neuro-fuzzy technique. Medical Engineering & Physics, 28(8):809–815, 2006.

[54] Jae Kwon Kim and Sanggil Kang. Neural network-based coronary heart disease risk prediction using feature correlation analysis. Journal of healthcare engineering, 2017, 2017.

[55] Diederik P Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.

[56] Kristjan Korjus, Martin N Hebart, and Raul Vicente. An efficient data partitioning to improve classification performance while keeping parameters interpretable. PloS one, 11(8):e0161788, 2016.

[57] Chayakrit Krittanawong, HongJu Zhang, Zhen Wang, Mehmet Aydar, and Takeshi Kitai. Artificial intelligence in precision cardiovascular medicine. Journal of the American College of Cardiology, 69(21):2657–2664, 2017.

[58] Steve Lawrence and C Lee Giles. Overfitting and neural networks: conjugate gradient and backpropagation. In Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks. IJCNN 2000. Neural Computing: New Challenges and Perspectives for the New Millennium, volume 1, pages 114–119. IEEE, 2000.

[59] Yann LeCun, Bernhard E Boser, John S Denker, Donnie Henderson, Richard E Howard, Wayne E Hubbard, and Lawrence D Jackel. Handwritten digit recognition with a back-propagation network. In Advances in neural information processing systems, pages 396–404, 1990.

[60] Yann LeCun, Léon Bottou, Yoshua Bengio, Patrick Haffner, et al. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998.

[61] Chia-Hung Lin, Yi-Chun Du, and Tainsong Chen. Adaptive wavelet network for multiple cardiac arrhythmias recognition. Expert Systems with Applications, 34(4):2601–2611, 2008.

[62] MultiMedia LLC. 2019 cardiacmonitoring.com, 2019.

[63] Brian CS Loh and Patrick HH Then. Deep learning for cardiac computer-aided diagnosis: benefits, issues & solutions. Mhealth, 3, 2017.

[64] Eduardo José da S Luz, William Robson Schwartz, Guillermo Cámara-Chávez, and David Menotti. Ecg-based heartbeat classification for arrhythmia detection: A survey. Computer methods and programs in biomedicine, 127:144–164, 2016.

[65] Aurore Lyon, Ana Mincholé, Juan Pablo Martínez, Pablo Laguna, and Blanca Rodriguez. Computational techniques for ecg analysis and interpretation in light of their contribution to medical advances. Journal of The Royal Society Interface, 15(138):20170821, 2018.

[66] Peter W Macfarlane, Adriaan Van Oosterom, Olle Pahlm, Paul Kligfield, Michiel Janse, and John Camm. Comprehensive electrocardiology. Springer Science & Business Media, 2010.

[67] Arnaz Malhi and Robert X Gao. Pca-based feature selection scheme for machine defect classification. IEEE Transactions on Instrumentation and Measurement, 53(6):1517–1525, 2004.

[68] Juan Pablo Martínez, Rute Almeida, Salvador Olmos, Ana Paula Rocha, and Pablo Laguna. A wavelet-based ecg delineator: evaluation on standard databases. IEEE Transactions on biomedical engineering, 51(4):570–581, 2004.

[69] Andreas Mayr, Harald Binder, Olaf Gefeller, and Matthias Schmid. The evolution of boosting algorithms. Methods of information in medicine, 53(06):419–427, 2014.

[70] Pamela J McCabe, Karen Schumacher, and Susan A Barnason. Living with atrial fibrillation: a qualitative study. Journal of Cardiovascular Nursing, 26(4):336–344, 2011.

[71] Paolo Melillo, Rossana Castaldo, Giovanna Sannino, Ada Orrico, Giuseppe De Pietro, and Leandro Pecchia. Wearable technology and ecg processing for fall risk assessment, prevention and detection. In 2015 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pages 7740–7743. IEEE, 2015.

[72] M Mitra and S Mitra. A software based approach for detection of qrs vector of ecg signal. In 3rd Kuala Lumpur International Conference on Biomedical Engineering 2006, pages 348–351. Springer, 2007.

[73] Patricia Gonce Morton, Dorrie Fontaine, CM Hudak, and BM Gallo. Critical care nursing: A holistic approach, volume 1. Lippincott Williams & Wilkins Philadelphia, 2005.

[74] Dariush Mozaffarian, Emelia J Benjamin, Alan S Go, Donna K Arnett, Michael J Blaha, Mary Cushman, Sandeep R Das, Sarah de Ferranti, Jean Pierre Després, Heather J Fullerton, et al. Heart disease and stroke statistics-2016 update a report from the american heart association. Circulation, 133(4):e38–e48, 2016.

[75] Dariush Mozaffarian, Emelia J Benjamin, Alan S Go, Donna K Arnett, Michael J Blaha, Mary Cushman, Sarah De Ferranti, Jean-Pierre Després, Heather J Fullerton, Virginia J Howard, et al. Executive summary: heart disease and stroke statistics-2015 update: a report from the american heart association. Circulation, 131(4):434–441, 2015.

[76] Lucien Ng, Kwai Wong, Azzam Haidar, Stanimire Tomov, and Jack Dongarra. Magmadnn high-performance data analytics for manycore gpus and cpus. In magma-DNN, 2017 Summer Research Experiences for Undergraduate (REU). 2017.

[77] Jeffrey J Nirschl, Andrew Janowczyk, Eliot G Peyster, Renee Frank, Kenneth B Margulies, Michael D Feldman, and Anant Madabhushi. A deep-learning classifier identifies patients with clinical heart failure using whole-slide images of h&e tissue. PloS one, 13(4):e0192726, 2018.

[78] Javier Andreu Perez, Fani Deligianni, Daniele Ravi, and Guang-Zhong Yang. Artificial intelligence and robotics. arXiv preprint arXiv:1803.10813, 2018.

[79] Filip Plesinger, Petr Nejedly, Ivo Viscor, Josef Halamek, and Pavel Jurak. Automatic detection of atrial fibrillation and other arrhythmias in holter ecg recordings using rhythm features and neural networks. In 2017 Computing in Cardiology (CinC), pages 1–4. IEEE, 2017.

[80] Filip Plesinger, Petr Nejedly, Ivo Viscor, Josef Halamek, and Pavel Jurak. Parallel use of a convolutional neural network and bagged tree ensemble for the classification of holter ecg. Physiological measurement, 39(9):094002, 2018.

[81] Bahareh Pourbabaee, Mehrsan Javan Roshtkhari, and Khashayar Khorasani. Feature leaning with deep convolutional neural networks for screening patients with paroxysmal atrial fibrillation. In 2016 International Joint Conference on Neural Networks (IJCNN), pages 5057–5064. IEEE, 2016.

[82] Boris Pyakillya, N Kazachenko, and N Mikhailovsky. Deep learning for ecg classification. Journal of Physics: Conference Series, 913:012004, 10 2017.

[83] Pranav Rajpurkar, Awni Y Hannun, Masoumeh Haghpanahi, Codie Bourn, and Andrew Y Ng. Cardiologist-level arrhythmia detection with convolutional neural networks. arXiv preprint arXiv:1707.01836, 2017.

[84] NNSV Rama Raju, V Malleswara Rao, and BN Jagadesh. Identification and classification of cardiac arrhythmia using neural network. HELIX, 7(5):2041–2046, 2017.

[85] CK Roopa and BS Harish. A survey on various machine learning approaches for ecg analysis. International Journal of Computer Applications, 163(9):25–33, 2017.

[86] Giovanna Sannino and Giuseppe De Pietro. An evolved ehealth monitoring system for a nuclear medicine department. In 2011 Developments in E-systems Engineering, pages 3–6. IEEE, 2011.

[87] Giovanna Sannino and Giuseppe De Pietro. A deep learning approach for ecg-based heartbeat classification for arrhythmia detection. Future Generation Computer Systems, 86:446–455, 2018.

[88] Mario Sansone, Roberta Fusco, Alessandro Pepino, and Carlo Sansone. Electrocardiogram pattern recognition and analysis based on artificial neural networks and support vector machines: a review. Journal of healthcare engineering, 4(4):465–504, 2013.

[89] J Philip Saul, Paul Albrecht, Ronald D Berger, and Richard J Cohen. Analysis of long term heart rate variability: methods, 1/f scaling and implications. Computers in cardiology, 14:419–422, 1988.

[90] Shalin Savalia and Vahid Emamian. Cardiac arrhythmia classification by multi-layer perceptron and convolution neural networks. Bioengineering, 5(2):35, 2018.

[91] Omid Sayadi and Mohammad B Shamsollahi. A model-based bayesian framework for ecg beat segmentation. Physiological measurement, 30(3):335, 2009.

[92] Khader Shameer, Kipp W Johnson, Benjamin S Glicksberg, Joel T Dudley, and Partho P Sengupta. Machine learning in cardiovascular medicine: are we there yet? Heart, 104(14):1156–1164, 2018.

[93] Supreeth P Shashikumar, Amit J Shah, Gari D Clifford, and Shamim Nemati. Detection of paroxysmal atrial fibrillation using attention-based bidirectional recurrent neural networks. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pages 715–723. ACM, 2018.

[94] Daniel E Singer, Gregory W Albers, James E Dalen, Alan S Go, Jonathan L Halperin, and Warren J Manning. Antithrombotic therapy in atrial fibrillation: the seventh accp conference on antithrombotic and thrombolytic therapy. Chest, 126(3):429S–456S, 2004.

[95] Marina Sokolova and Guy Lapalme. A systematic analysis of performance measures for classification tasks. Information Processing & Management, 45(4):427–437, 2009.

[96] Ashwin Srinivasan. Note on the location of optimal classifiers in n-dimensional roc space. 1999.

[97] Justin Talbot, Bongshin Lee, Ashish Kapoor, and Desney S Tan. Ensemblematrix: interactive visualization to support machine learning with multiple classifiers. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pages 1283–1292. ACM, 2009.

[98] Masayuki Tanaka and Masatoshi Okutomi. A novel inference of a restricted boltzmann machine. In 2014 22nd International Conference on Pattern Recognition, pages 1526–1531. IEEE, 2014.

[99] Sebastian Thrun, Lawrence K Saul, and Bernhard Schölkopf. Advances in Neural Information Processing Systems 16: Proceedings of the 2003 Conference, volume 16. MIT press, 2004.

[100] Luis Tobias, Aurelien Ducournau, Francois Rousseau, Gregoire Mercier, and Ronan Fablet. Convolutional neural networks for object recognition on mobile devices: A case study. In 2016 23rd International Conference on Pattern Recognition (ICPR), pages 3530–3535. IEEE, 2016.

[101] Nathalie-Sofia Tomov and Stanimire Tomov. On deep neural networks for detecting heart disease. arXiv preprint arXiv:1808.07168, 2018.

[102] Andrea Vedaldi and Karel Lenc. Matconvnet: Convolutional neural networks for matlab. In Proceedings of the 23rd ACM international conference on Multimedia, pages 689–692. ACM, 2015.

[103] Li Wan, Matthew Zeiler, Sixin Zhang, Yann Le Cun, and Rob Fergus. Regularization of neural networks using dropconnect. In International conference on machine learning, pages 1058–1066, 2013.

[104] Eric S Winokur, Maggie K Delano, and Charles G Sodini. A wearable cardiac monitor for long-term data acquisition and analysis. IEEE Transactions on Biomedical Engineering, 60(1):189–192, 2013.

[105] Yong Xia, Naren Wulan, Kuanquan Wang, and Henggui Zhang. Detecting atrial fibrillation by deep convolutional neural networks. Computers in biology and medicine, 93:84–92, 2018.

[106] Ozal Yildirim, Ru San Tan, and U Rajendra Acharya. An efficient compression of ecg signals using deep convolutional autoencoders. Cognitive Systems Research, 52:198–211, 2018.

[107] Shamil Yusuf and A John Camm. The sinus tachycardias. Nature Reviews Cardiology, 2(1):44, 2005.

[108] Zhaohong Zhang. Smart sensing with ultra-low-power mcus – part 4: Holter monitor, 2017.

[109] Peter Zimetbaum and Alena Goldman. Ambulatory arrhythmia monitoring: choosing the right device. Circulation, 122(16):1629–1636, 2010.

[110] W Zong, GB Moody, and D Jiang. A robust open-source algorithm to detect onset and duration of qrs complexes. In Computers in Cardiology, 2003, pages 737–740. IEEE, 2003.

[111] Muhammad Zubair, Jinsul Kim, and Changwoo Yoon. An automated ecg beat classification system using convolutional neural networks. In 2016 6th International Conference on IT Convergence and Security (ICITCS), pages 1–5. IEEE, 2016.

Appendix

Supplementary Data File

Graphic representation of accuracy learning curve, error learning curve, and results.

File Name: etd20310-amir-tashakkor-Appendix.zip
