GAWDE, PURVA R., Ph.D., December 2018 COMPUTER SCIENCE

INTEGRATED ANALYSIS OF TEMPORAL AND MORPHOLOGICAL FEATURES

USING MACHINE LEARNING TECHNIQUES FOR REAL TIME DIAGNOSIS OF

ARRHYTHMIA AND IRREGULAR BEATS

Dissertation Advisor: Arvind Bansal

Heart diseases are major causes of morbidity and fatality in the senior age group and significantly affect productivity and lifestyle. Electrocardiography (ECG) is a noninvasive means of monitoring heart health. One of the major abnormalities of the heart is arrhythmia, which consists of irregular heartbeats due to ectopic nodes. Currently available systems lack sufficient accuracy and finer real-time classification, which affects treatment.

In this research, machine learning and parallelization techniques have been developed for the real-time analysis of ECG for diagnosing the finer classes of arrhythmia and irregular heartbeats. An integrated approach combining a Markov model and a bivariate Gaussian distribution has been proposed for an integrated analysis of the temporal and morphological features. Area subtraction techniques have been proposed for detecting the embedded waveforms. The analysis has been extended with a look-ahead pattern analysis algorithm for identifying different classes of irregular beats. The execution efficiency has been further improved to accommodate diagnosis of other heart diseases in real time by exploiting GPU-based SIMT parallelism that performs beat-level analysis concurrently.

The implementation results show very high accuracy.

INTEGRATED ANALYSIS OF TEMPORAL AND MORPHOLOGICAL FEATURES

USING MACHINE LEARNING TECHNIQUES FOR REAL TIME DIAGNOSIS OF

ARRHYTHMIA AND IRREGULAR BEATS

A dissertation submitted

to Kent State University in partial

fulfillment of the requirements for the

degree of Doctor of Philosophy

by

Purva R. Gawde

December 2018

© Copyright

All rights reserved

Dissertation written by

Purva Rajendra Gawde

B.S., Mumbai University, India, 2010

M.S., Kent State University, USA, 2013

Ph.D., Kent State University, 2018

Approved by

Dr. Arvind K. Bansal, Chair, Doctoral Dissertation Committee

Dr. Javed I. Khan, Members, Doctoral Dissertation Committee

Dr. Cheng Chang Lu,

Dr. Jeffery A. Nielson

Dr. Gokarna Sharma

Accepted by

Dr. Javed I. Khan, Chair, Department of Computer Science

Dr. James L. Blank, Dean, College of Arts and Sciences

TABLE OF CONTENTS

TABLE OF CONTENTS ...... IV

LIST OF FIGURES ...... XIV

LIST OF TABLES ...... XX

GLOSSARY...... XXII

DEDICATION...... XXIV

ACKNOWLEDGEMENTS ...... XXV

CHAPTER 1 INTRODUCTION ...... 1

1.1 Motivation ...... 1

1.2 Heart Structure and electrophysiology ...... 5

Anatomy of heart ...... 5

Electrical behavior of heart ...... 7

Electrophysiology...... 8

Heart and ECG ...... 12

1.3 Arrhythmia and Irregular Beats ...... 13

Heart Arrhythmia ...... 13

Premature Beats and Beat-patterns ...... 14

1.4 Problem Definition ...... 15

1.5 Previous Work and Limitations ...... 16

1.6 This Research ...... 17

Contributions ...... 20

Applications ...... 21

1.7 Roadmap ...... 22

CHAPTER 2 BACKGROUND ...... 24

2.1 ECG Signal Representation ...... 24

2.1.1 Components of ECG-complex ...... 25

2.1.2 Components of ECG-complex ...... 26

2.2 Signal Processing Techniques ...... 27

2.2.1 Convolution ...... 28

2.2.2 Fast Fourier Transform (FFT) ...... 28

2.2.3 Wavelet Transform ...... 29

2.3 ECG Signal Classification ...... 31

2.3.1 Denoising ...... 32

2.3.2 Feature-extraction...... 34

2.4 Arrhythmias ...... 38

2.4.1 Supraventricular arrhythmias ...... 38

2.4.2 Ventricular arrhythmia ...... 41

2.5 Premature Beats and Patterns ...... 42

2.5.1 Premature atrial contractions (PAC) ...... 43

2.5.2 Premature ventricular contractions (PVC) ...... 43

2.5.3 Premature junctional contractions (PJC) ...... 44

2.5.4 Classification of irregular beat-patterns ...... 44

2.6 ECG Variations in Arrhythmias and Irregular Beats ...... 45

2.7 Mathematical Concepts ...... 47

2.7.1 Probability ...... 47

2.7.2 Gaussian distribution ...... 47

2.7.3 Gaussian mixture model (GMM) ...... 48

2.7.4 Expectation maximization (EM) ...... 49

2.8 Machine Learning Concepts ...... 49

2.8.1 Supervised learning ...... 49

2.8.2 Rule-based system ...... 50

2.8.3 Decision tree ...... 50

2.8.4 Naive Bayes classifier ...... 51

2.8.5 Support vector machine ...... 52

2.8.6 Clustering ...... 53

2.8.7 Genetic algorithm ...... 54

2.8.8 Principal component analysis (PCA) ...... 54

2.8.9 Linear Discriminant Analysis...... 56

2.8.10 Markov model ...... 56

2.8.11 Hidden Markov Model ...... 58

2.8.12 Mixing Markov model and Gaussian distribution...... 59

2.8.13 Forward algorithm ...... 60

2.8.14 Artificial neural network (ANN) ...... 61

2.9 Parallelism ...... 63

2.9.1 Dependency Analysis ...... 64

2.10 Graphics Processing Unit (GPU) ...... 65

2.10.1 Compute unified device architecture (CUDA)...... 66

2.10.2 CUDA programming model ...... 68

2.11 Accuracy Metrics ...... 68

2.11.1 Sensitivity and specificity analysis ...... 68

2.11.2 Receiver Operating Characteristics (ROC) Curve ...... 69

CHAPTER 3 PREVIOUS WORKS ...... 71

3.1 Rule based systems ...... 72

3.2 HMM-based techniques ...... 74

3.3 SVM-based techniques ...... 76

3.4 ANN-based techniques ...... 78

3.5 Clustering-based techniques ...... 80

3.6 Other methods ...... 82

3.7 Limitations ...... 84

3.8 Discussion ...... 87

CHAPTER 4 IDENTIFICATION AND CLASSIFICATION OF ARRHYTHMIA ...... 88

4.1 Issues in arrhythmia ...... 89

4.1.1 P-wave in supraventricular arrhythmia due to ...... 90

4.1.2 Embedded waveforms ...... 93

4.2 Markov model approach for classification ...... 94

4.2.1 Integrating Markov model with bivariate Gaussian distribution ...... 95

4.2.2 Bivariate Gaussian distribution Markov model (BGMM) ...... 97

4.3 Solutions for arrhythmia issues ...... 98

4.3.1 Resolution of P-waves using BGMM ...... 98

4.3.2 Identification of embedded waveform using area analysis ...... 99

4.3.3 Identification of embedded wave using Simpson’s rule ...... 100

4.4 BGMM for arrhythmia ...... 102

4.4.1 BGMM for supraventricular arrhythmia ...... 102

4.4.2 BGMM for ventricular arrhythmia ...... 105

4.5 Bivariate Gaussian Probabilistic Transition Graph (BGTG) ...... 107

4.5.1 Example of BGTG for EAT ...... 107

4.5.2 Example of BGTG for VTach ...... 108

4.6 BGMM and BGTG matching ...... 108

4.7 Overall classification approach ...... 109

4.7.1 Training phase ...... 109

4.7.2 Dynamic analysis phase ...... 110

4.7.3 Feature-extraction ...... 112

4.7.4 Clustering for Level-1 Arrhythmia Classification ...... 113

4.8 Algorithms ...... 114

4.8.1 Identification of embedded P-waves ...... 114

4.8.2 Expectation maximization (EM) ...... 115

4.8.3 Algorithm for BGMM and BGTG matching ...... 116

4.9 Discussion ...... 118

CHAPTER 5 IDENTIFICATION OF PREMATURE BEATS AND IRREGULAR BEAT-PATTERNS ...... 120

5.1 Challenges in premature beat analysis ...... 121

5.1.1 Embedded waveforms ...... 121

5.1.2 Irregular beat-pattern identification ...... 123

5.2 Premature beats detection ...... 125

5.2.1 Integration of rule-based analysis and embedded waveform analysis ...... 126

5.3 Markov model approach for finer classification ...... 130

5.3.1 Example of BGMM ...... 130

5.3.2 Example of BGTG ...... 132

5.3.3 BGTG and BGMM graph matching ...... 133

5.4 Identifying beat-pattern ...... 134

5.4.1 Nondeterministic automata ...... 136

5.5 Overall approach ...... 137

5.5.1 Training phase ...... 137

5.5.2 Dynamic phase ...... 137

5.6 Premature beat detection algorithms ...... 139

5.6.1 Rule-based beat detection algorithm ...... 139

5.6.2 Algorithm to identify beat-pattern ...... 141

5.7 Discussion ...... 143

CHAPTER 6 PARALLEL PROCESSING USING CUDA ENABLED GPU ...... 144

6.1 GPU and CUDA architecture ...... 145

6.2 Parallelization of Arrhythmia Subclasses using CUDA enabled GPU ...... 148

6.2.1 Dependency analysis for arrhythmia classification...... 148

6.2.2 Types of SIMT parallelism in ECG analysis ...... 149

6.2.3 Feature-extraction ...... 149

6.2.4 Embedded waveform detection ...... 151

6.2.5 GMM-based classification ...... 151

6.2.6 BGTG formation ...... 153

6.2.7 BGMM and BGTG graph matching ...... 154

6.2.8 Parallelization of arrhythmia subclassification ...... 155

6.3 Parallelization of Arrhythmia Algorithms ...... 158

6.3.1 Parallel algorithm for feature-extraction ...... 158

6.3.2 Algorithm to detect embedded waveform ...... 160

6.3.3 Parallel algorithm for GMM-based classification ...... 160

6.3.4 Parallel algorithm for BGTG construction ...... 164

6.3.5 Parallel algorithm for BGTG-BGMM graph matching ...... 165

6.4 Parallelization of premature beats using CUDA enabled GPU ...... 168

6.4.1 Dependency analysis ...... 169

6.4.2 Overall approach for parallelization of irregular beat-pattern classification ...... 170

6.4.3 Parallel premature beat detection ...... 170

6.4.4 Parallelization of rule-based analysis ...... 171

6.4.5 Parallelization of graph matching for beat-labeling ...... 172

6.5 Chapter Summary and Discussion ...... 174

CHAPTER 7 IMPLEMENTATION ...... 175

7.1 ECG databases ...... 175

7.2 Software and libraries ...... 177

7.2.1 WAVE software and LightWAVE editor ...... 177

7.2.2 WFDB toolbox for MATLAB ...... 178

7.3 NVIDIA CUDA ...... 178

7.4 Machine configuration ...... 179

7.5 Preprocessing waveforms ...... 180

7.5.1 ECG signal denoising ...... 181

7.5.2 Feature-extraction ...... 181

7.5.3 Embedded waveform analysis ...... 182

7.6 Bivariate Distribution Markov Model (BGMM) ...... 183

7.7 Implementing arrhythmia classification ...... 184

7.7.1 GMM parameter using EM ...... 184

7.7.2 Bivariate Gaussian Distribution Transition Graph (BGTG) ...... 185

7.7.3 Graph matching for labeling ...... 186

7.8 Implementing premature beat and irregular beat-pattern classification ...... 187

7.8.1 Integrated rule-based and embedded waveform analysis ...... 188

7.8.2 Irregular beat-pattern analysis ...... 189

7.9 Implementation of GPU accelerated algorithms ...... 190

7.10 Real-time simulation ...... 191

7.11 Chapter Summary ...... 194

CHAPTER 8 PERFORMANCE EVALUATIONS ...... 195

8.1 Arrhythmia classification ...... 195

8.1.1 Level 1 classification using GMM ...... 196

8.1.2 Level 2 classification using Markov model approach...... 197

8.2 Premature beat and irregular beat-pattern classification ...... 199

8.2.1 Irregular beat-pattern classification ...... 201

8.3 GPU-based acceleration ...... 202

8.3.1 Arrhythmia ...... 203

8.3.2 Premature beats ...... 205

8.3.3 Speedup comparison to obtain optimum number of BGTGs ...... 208

8.4 Chapter summary and discussion ...... 215

CHAPTER 9 RELATED WORKS AND LIMITATIONS ...... 216

9.1 Arrhythmia Classification ...... 216

9.2 Premature Beat and Irregular Heartbeat Patterns Classification ...... 221

9.3 GPU-based classification ...... 224

9.4 Limitations ...... 226

9.5 This research ...... 227

CHAPTER 10 CONCLUSION ...... 229

10.1 Conclusion ...... 229

10.2 Limitations ...... 233

10.3 Future Work ...... 234

REFERENCES ...... 235

LIST OF FIGURES

Figure 1.1 Cross Sectional View of the Heart ...... 6

Figure 1.2 The Electrical Conduction of the Heart ...... 7

Figure 1.3 Myocardial cell (myocyte) ...... 9

Figure 1.4 Solutions inside and outside the cell are different. The pump (dark-blue dots) maintains the right number of ions on both sides of the walls ...... 10

Figure 1.5 Phases of myocyte (cardiac cell) stimulation ...... 11

Figure 1.6 a) The morphology and timing of the action potentials from different regions of the heart; and b) The related cardiac cycle of the ECG as measured on the body surface ...... 13

Figure 1.7 Arrhythmia and Premature Beat Finer Subclasses ...... 14

Figure 2.1 Components of ECG Complex ...... 24

Figure 2.2 12-leads placement ...... 27

Figure 2.3 Convolution ...... 28

Figure 2.4 DWT decomposition ...... 31

Figure 2.5 Daubechies Db6 wavelet and Q-wave match ...... 31

Figure 2.6 Block Diagram of ECG Signal Classification ...... 31

Figure 2.7 Result for QRS detection ...... 35

Figure 2.8 Result for P and detection ...... 37

Figure 2.9 Atrial and ...... 39

Figure 2.10 Atrio-Ventricular reentrant and ectopic ...... 40

Figure 2.11 ...... 41

Figure 2.12 Premature atrial contraction and blocked-premature atrial contraction ...... 43

Figure 2.13 Premature ventricular contraction and premature junctional contraction ..... 44

Figure 2.14 Beat variations of ventricular and supraventricular arrhythmia ...... 45

Figure 2.15 Beat variations for premature beats ...... 46

Figure 2.16 Normal distribution ...... 48

Figure 2.17 SVM with two hyperplanes for two dimensional data ...... 53

Figure 2.18 An illustration of principal component analysis ...... 55

Figure 2.19 An illustration of transitions in first-order Markov model ...... 57

Figure 2.20 Hidden Markov Model Representation ...... 59

Figure 2.21 Example of program dependency graph ...... 65

Figure 2.22 Architecture of GPU and CPU ...... 66

Figure 2.23 A schematic of CUDA memory architecture ...... 67

Figure 2.24 ROC curve ...... 69

Figure 4.1 Two level classification of arrhythmia into finer subclasses ...... 88

Figure 4.2 P-wave Morphology ...... 91

Figure 4.3 P-wave in LAE (left) and RAE (right) when impulse starts in right atria ...... 91

Figure 4.4 P-wave in LAE (left) and RAE (right) when impulse starts in left atria ...... 92

Figure 4.5 AFib with left atrium enlargement ...... 92

Figure 4.6 AFlu with right atrium enlargement ...... 93

Figure 4.7 EAT with right atrium enlargement ...... 93

Figure 4.8 P-wave embedded in a QRS-complex ...... 94

Figure 4.9 ECG modeling using a Markov model ...... 95

Figure 4.10 Bivariate distribution of P-wave amplitude and duration...... 97

Figure 4.11 P-wave morphology for the atrial enlargement ...... 98

Figure 4.12 P-wave embedded in QRS-complex for VTach (left) and missing P-wave for JTachy (right) ...... 99

Figure 4.13 Simpson's rule for area calculation ...... 100

Figure 4.14 ROC curve for threshold selection ...... 101

Figure 4.15 BGMM for EAT ...... 102

Figure 4.16 BGMM for VTach ...... 105

Figure 4.17 A sample of BGTG for 20-beat window ...... 107

Figure 4.18 A sample of BGTG for a 20-beat window ...... 108

Figure 4.19 An overall approach of arrhythmia classification ...... 111

Figure 4.20 An algorithm for embedded P-waves ...... 114

Figure 4.21 GMM based classification ...... 116

Figure 4.22 An algorithm for transition graph matching ...... 117

Figure 5.1 P-on-T phenomenon in B-PAC ...... 122

Figure 5.2 R-on-T phenomenon in PVC ...... 123

Figure 5.3 RR-interval variability due to premature beats ...... 126

Figure 5.4 RR-interval analysis for B-PAC with P-on-T ...... 128

Figure 5.5 RR-interval analysis for PVC with R-on-T ...... 129

Figure 5.6 BGMM for PAC ...... 131

Figure 5.7 BGTG constructed for a patient ...... 133

Figure 5.8 Nondeterministic automata for beat-pattern analysis ...... 136

Figure 5.9 Indexed sequence of beat-pattern ...... 137

Figure 5.10 Overall approach ...... 139

Figure 5.11 Algorithm to classify premature beat ...... 140

Figure 5.12 Algorithm to identify beat-pattern ...... 142

Figure 6.1 A CUDA architecture ...... 146

Figure 6.2 Timing analysis for arrhythmia classification ...... 147

Figure 6.3 Dependency graph for feature-extraction ...... 150

Figure 6.4 Dependency graph for embedded waveform detection ...... 151

Figure 6.5 Dependency graph for GMM parameter estimation ...... 152

Figure 6.6 Concurrent thread execution approach for GMM-based classification ...... 153

Figure 6.7 Dependency graph for BGTG formation ...... 154

Figure 6.8 Concurrent graph matching tasks ...... 155

Figure 6.9 Overall parallel approach for arrhythmia subclassification...... 156

Figure 6.10 Algorithm for concurrent denoising and feature-extraction ...... 159

Figure 6.11 Algorithm for concurrent embedded waveform detection ...... 160

Figure 6.12 Algorithm for concurrent GMM based classification ...... 161

Figure 6.13 Construction of Posterior Matrix and mixture coefficients ...... 162

Figure 6.14 Estimate Mean ...... 163

Figure 6.15 Estimate Variance ...... 163

Figure 6.16 Estimate Covariance ...... 164

Figure 6.17 Algorithm for concurrent BGTG construction ...... 165

Figure 6.18 Algorithm for concurrent denoising and pruning ...... 166

Figure 6.19 Algorithm for concurrent MPP ...... 166

Figure 6.20 Algorithm for concurrent maximum probability ...... 168

Figure 6.21 Parallel approach for premature beat and beat pattern classification ...... 170

Figure 6.22 Dependency graph for premature beat detection ...... 172

Figure 6.23 Algorithm for concurrent rule based analysis ...... 173

Figure 7.1 ECG preprocessing step ...... 180

Figure 7.2 ECG denoising module...... 181

Figure 7.3 Feature-extraction module ...... 182

Figure 7.4 Embedded waveform detection module ...... 183

Figure 7.5 BGMM construction module ...... 184

Figure 7.6 Preprocessing stage for updated feature-extraction ...... 184

Figure 7.7 GMM based classification module ...... 185

Figure 7.8 BGTG construction module ...... 186

Figure 7.9 Denoising and BGMM pruning module ...... 186

Figure 7.10 Most probable path module ...... 187

Figure 7.11 Maximum probability module ...... 187

Figure 7.12 Premature beat and beat pattern classification modules ...... 188

Figure 7.13 Integrated rule based and embedded waveform analysis module ...... 189

Figure 7.14 Irregular beat pattern analysis module ...... 190

Figure 7.15 Real time simulation of ECG in LightWAVE ...... 192

Figure 7.16 Real time simulation timeline ...... 193

Figure 8.1 Result of GMM-based clustering ...... 197

Figure 8.2 Effect of memory optimization on speedup for arrhythmia ...... 204

Figure 8.3 Effect of memory optimization on speedup for premature beats ...... 207

Figure 8.4 Speedup comparison of ECG denoising and preprocessing ...... 210

Figure 8.5 Speedup comparison of embedded waveform detection ...... 211

Figure 8.6 Speedup comparison for BGTG construction ...... 213

Figure 8.7 Speedup comparison for BGTG-BGMM matching ...... 214

LIST OF TABLES

Table 2.1 Probability matrices in Markov model ...... 58

Table 4.1 Transition probability matrix ...... 103

Table 4.2 Bivariate Gaussian distribution ...... 104

Table 4.3 Transition probability matrix ...... 106

Table 4.4 Bivariate Gaussian distribution ...... 106

Table 5.1 Patterns of Irregular Beats ...... 124

Table 5.2 Transition probability matrix ...... 131

Table 5.3 Bivariate distribution for each state in BGMM of PAC ...... 131

Table 5.4 Bivariate distribution for each state of BGTG ...... 133

Table 6.1 Average processing time for modules ...... 148

Table 6.2 Average processing time for modules ...... 169

Table 8.1 Result of GMM-based classification using EM clustering ...... 196

Table 8.2 Result for supraventricular classification ...... 198

Table 8.3 Result for ventricular classification ...... 199

Table 8.4 Result for embedded wave detection ...... 199

Table 8.5 Result for premature beat classification...... 200

Table 8.6 Result for classification accuracy of premature beats ...... 200

Table 8.7 Result for irregular beat-pattern classification...... 201

Table 8.8 Speedup comparison for arrhythmia classification ...... 203

Table 8.9 Accuracy of arrhythmia subclassification...... 205

Table 8.10 Speedup comparison for premature beat classification ...... 206

Table 8.11 Accuracy of premature beat and beat-pattern classification ...... 208

Table 8.12 Execution time for denoising and preprocessing module ...... 210

Table 8.13 Execution time required for embedded wave detection module ...... 212

Table 8.14 Execution time for BGTG construction module ...... 213

Table 8.15 Execution time for graph matching module ...... 214

Table 9.1 Comparison of classification accuracy ...... 220

GLOSSARY

Notation Meaning

BGMM Bivariate Gaussian distribution Markov model

BGTG Bivariate Gaussian distribution transition graph

M BGMM

G BGTG

VG Set of vertices in BGTG G

SM Set of BGMMs

GAFib BGMM for Atrial Fibrillation

GAFlu BGMM for Atrial Flutter

GAVNRT BGMM for Atrio-ventricular Nodal Reentry

GEAT BGMM for Ectopic Atrial Tachycardia

GJTachy BGMM for Junctional Tachycardia

GVTach BGMM for Ventricular Tachycardia

GVFib BGMM for Ventricular Fibrillation

GVFlu BGMM for Ventricular Flutter

GPAC BGMM for Premature Atrial Contraction

GB-PAC BGMM for Blocked-Premature Atrial Contraction

GPJC BGMM for Premature Junctional Contraction

GPVC BGMM for Premature Ventricular Contraction

Pamplitude Amplitude of P-wave

Pduration Duration of P-wave

Paverage-amplitude Average amplitude of P-wave

Paverage-duration Average duration of P-wave

Qamplitude Amplitude of Q-wave

Qduration Duration of Q-wave

Qaverage-amplitude Average amplitude of Q-wave

QRSarea Area of QRS-complex

QRSaverage-area Average area of QRS-complex

Tarea Area of T-wave

Tarea-average Average area of T-wave

RRaverage Average RR-interval

S = {b1^S, b2^S, b3^S, b4^S, …} Set of labeled beats

EG Set of weighted edges in G

ei ith edge in the set EG

Wij Weight of edge that denotes transition probability for edge ei to ej

StG State in BGTG G

bd(StG) Bivariate distribution of state in BGTG

Tr(StG) Transition probability of state in G

StM State in BGMM M

bd(StM) Bivariate distribution of state in BGMM

Tr(StM) Transition probability from state in BGMM

LG-M Probability of BGTG G belonging to BGMM M

quadBool First occurrence of quadrigeminy

triBool First occurrence of trigeminy

BiBool First occurrence of bigeminy

PValue Type of premature beats

Cobegin Begin concurrent execution

Coend End concurrent execution

Ti ith thread

bi ith block

DEDICATION

I dedicate this dissertation to my dearest family members: my parents, Rajashri and Rajendra Gawde, my husband, Sabyasachi, and the rest of my family for their love, support, and understanding throughout my research.

ACKNOWLEDGEMENTS

First and foremost, I want to thank my advisor, Dr. Arvind K. Bansal. It has been an honor to work with him. I would like to thank him for his help, effort, support, patience, and motivation throughout my years at Kent State University. I would also like to thank Dr. Javed Khan, Dr. Cheng Chang Lu, Dr. Jeffery Nielson, and Dr. Gokarna Sharma for agreeing to be on my committee and carefully reviewing my dissertation.

I would also like to thank my family, especially my parents and my husband, for bearing with me during my research and study. Without my family’s support, help, understanding, and motivation, I would not have been able to complete my dissertation.

Special gratitude and love go to the rest of my family members, who kept me motivated all the way from India.

Purva R. Gawde.

December 2018, Kent, Ohio.

Introduction

1.1 Motivation

The heart is a complex muscle system consisting of four chambers and four valves, along with a complex electrical conduction system. It continuously pumps oxygenated blood to supply oxygen, biomolecules for repair of blood vessels, and other nutrients throughout the body [1]. The pumping is based upon the heart’s incessant contraction and relaxation cycle, caused by electrical depolarization and repolarization cycles of heart cells. Each cycle is started by an electrical impulse at the sinoatrial (SA) node in the right atrium, which drives the regular depolarization and repolarization cycle [1, 2]. A lack of blood flow or a lack of oxygen in the blood starves the human body of the required nutrients and energy, leading to malfunction and death.

Cardiac (heart) diseases are the leading cause of death and disability among men and women in the United States [3]. About 610,000 people die of heart diseases in the United States every year, which is one in every four deaths [3, 4]. Sudden cardiac death (SCD) is a leading cause of death in the United States, killing more than 325,000 people each year [3]. The risk of SCD increases in the presence of an underlying heart disease [3, 4].

Cardiac diseases are caused by various abnormalities in the structure and function of the heart, such as lack of oxygen supply [1], blood-clot formation due to multiple reasons, blockage of blood vessels due to plaque formation, irregular rhythm that adversely affects blood pumping, defects in blood flow due to defective or damaged valves, hypertensive heart, thickening of muscles that adversely affects the blood-pumping capability, or improper post-surgical immune system response [1, 2]. Clot formations are also a cause of ischemia (lack of oxygen) in different parts of the body, including brain stroke, leading to sudden death and paralysis. Congenital malformations of the heart structure existing at birth can be fatal if not treated and/or managed life-long.

The regular rhythmic beating of the heart depends on an electrical impulse being conducted throughout the heart. Abnormal conduction or origination of the electrical impulse causes irregular or abnormal heartbeats [2]. When the electrical impulse conduction within the heart is interrupted or disturbed and causes irregular heartbeats, the condition is called arrhythmia [2]. The resulting impulses may occur too fast, too slow, or erratically, causing the heart to beat too fast, too slow, or erratically. Irregular heartbeats disturb the underlying blood flow in the body. They occur most frequently in adults after their mid-30s, and the risk increases with age from there [3, 4].
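The distinction between a heart that beats too fast, too slow, or erratically can be illustrated with a minimal Python sketch that works only on RR-intervals (the time between successive beats). This is an illustration under stated assumptions, not the classification method developed in this dissertation: the function name `classify_rhythm` and the variability threshold `cv_threshold` are hypothetical, and the 60 and 100 beats-per-minute limits are the conventional resting-rate bounds for adults.

```python
from statistics import mean, pstdev


def classify_rhythm(rr_intervals_s, cv_threshold=0.15):
    """Label a window of RR-intervals (in seconds) by rate and regularity.

    cv_threshold is a hypothetical cut-off chosen for illustration only.
    """
    if len(rr_intervals_s) < 2:
        raise ValueError("need at least two RR-intervals")

    avg_rr = mean(rr_intervals_s)
    heart_rate_bpm = 60.0 / avg_rr

    # Coefficient of variation: spread of the RR-intervals relative to their mean.
    cv = pstdev(rr_intervals_s) / avg_rr

    if cv > cv_threshold:
        label = "erratic"
    elif heart_rate_bpm > 100:   # conventional adult tachycardia bound
        label = "too fast"
    elif heart_rate_bpm < 60:    # conventional adult bradycardia bound
        label = "too slow"
    else:
        label = "normal rate"
    return heart_rate_bpm, label


if __name__ == "__main__":
    print(classify_rhythm([0.5, 0.5, 0.5, 0.5, 0.5]))
    print(classify_rhythm([0.4, 1.2, 0.5, 1.1, 0.6]))
```

A rate-and-variability check of this kind only separates coarse categories; it cannot distinguish the finer subclasses discussed in this chapter, which is why morphological features are also needed.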

One class of irregular heartbeats, called atrial fibrillation, originates in the upper chambers of the heart. It causes a chaotic heart rate and increases the risk of clot formation, leading to stroke, ischemia of various parts of the body, and other heart-related complications [3]. Another class, called ventricular fibrillation, originates in the lower chambers of the heart. It causes erratic blood flow and leads to sudden cardiac death (SCD).

When an electrical impulse starts too early, it gives rise to an irregular beat called a premature beat. Irregular beats can be classified into multiple categories based on the origination and conduction of the electrical impulse. Some types of arrhythmia and premature beats are benign initially [4]. However, they degrade into life-threatening conditions if not treated in time. Classification into the respective subclasses is necessary for several reasons:

1) Certain initially benign subclasses of arrhythmia, such as ventricular tachycardia, and subclasses of premature beats, such as premature ventricular complex, can degrade to life-threatening conditions such as ventricular fibrillation, which causes SCD [2].

2) Atrial fibrillation, a subclass of arrhythmia originating in the upper chambers of the heart, causes blood to pool in the atria, which can lead to blood clot formation [4]. Unless classified and treated in time, it can cause stroke [2].

3) The medical treatment for various subclasses of arrhythmia is different [1, 2]. If these subclasses are detected, classified, and treated in early stages, further degradation of the condition can be prevented or the condition effectively managed [1, 2, 3]. For instance, premature ventricular contractions can be treated with beta-blockers, while sustained ventricular tachycardia has to be treated with a defibrillator to restore the normal heart rate [2].

4) Premature beats and arrhythmia episodes are transitory and not always reproducible [2, 5]. Thus, monitoring the heart for arrhythmia classification becomes necessary, especially for elderly patients.
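Because a premature beat arises from an impulse that starts too early, it shows up in a recording as an RR-interval that is noticeably shorter than the recent average. The sketch below illustrates that idea in Python. It is a simplified assumption-laden example, not the rule-based detection algorithm presented later in this dissertation: the function name `flag_premature_beats`, the `ratio` of 0.8, and the `window` of 8 beats are all hypothetical choices made for illustration.

```python
def flag_premature_beats(rr_intervals_s, ratio=0.8, window=8):
    """Return indices of RR-intervals that are short relative to a running average.

    ratio and window are hypothetical parameters chosen for illustration.
    """
    flagged = []
    recent = []  # most recent intervals that were not flagged as premature

    for i, rr in enumerate(rr_intervals_s):
        if recent:
            rr_average = sum(recent) / len(recent)
            if rr < ratio * rr_average:
                flagged.append(i)
                continue  # keep early beats out of the running average

        recent.append(rr)
        if len(recent) > window:
            recent.pop(0)  # drop the oldest interval

    return flagged


if __name__ == "__main__":
    # The fourth interval is early and is followed by a longer pause.
    print(flag_premature_beats([0.8, 0.8, 0.8, 0.5, 1.1, 0.8, 0.8]))
```

A timing rule like this can indicate that a beat is early, but it does not say where the impulse originated; separating atrial, junctional, and ventricular premature beats requires the waveform morphology as well.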

1.1.1 Electrocardiograms (ECG) and Analysis

The study of heart irregularities by means of computerized analysis of the electrocardiogram (ECG) signal is a cost-effective, non-invasive, and well-established tool to detect heart abnormalities. By analyzing the structure and cycle pattern of the ECG, various cardiac abnormalities can be detected. A trained team of physicians can recognize many heart diseases, including arrhythmia, by analyzing the heartbeats.

1.1.2 Automated Intelligent Analysis of Irregular Beats

At present, machine learning techniques and software have been developed to identify tachycardia (fast heartbeat), bradycardia (slow heartbeat), ventricular arrhythmia [6, 8, 9, 11, 12], atrial fibrillation [7], and, to a limited extent, myopathy (thickening of heart muscles), ischemia (cell death due to lack of oxygen), premature ventricular contraction (irregular beat originating in the lower chambers of the heart) [6], and myocardial infarction (heart attack due to ischemia). However, there are currently no software tools or algorithms for the various finer-level classifications of irregular beat-patterns and arrhythmia. Moreover, the current techniques lack sufficient accuracy to be used reliably by wearable devices.

1.1.3 Mobile Lifestyle and Wearable Devices

In this age of mobile patients, we need portable and wearable monitoring devices to detect transitory cases of arrhythmic and premature beats. Wearable smart personal monitoring devices are gaining popularity due to the availability of cost-effective miniaturization of monitoring devices [8, 9] that can be linked to mobile phones for information transmission. These devices are equipped to detect, with limited accuracy, many abnormalities like stress, oxygen saturation level, ischemia and arrhythmia [9].

Wearable devices have to concurrently identify many life-threatening heart conditions for elderly patients, such as the possibility of myocardial infarction due to prolonged ischemia or stroke due to sudden traversal of a blood clot. This imposes an additional time constraint on ECG signal analysis: the finer subclass of arrhythmia must be diagnosed accurately, along with other abnormalities, within the same time window for real-time analysis.

In this research, we focus on two major problems:

1) Identification of subclasses of arrhythmia in real time;

2) Identification of different classes of premature beats in real time.

This dissertation focuses on single-lead detection since it is common in emergency-room scenarios as well as in wearable monitoring devices, and sufficient data are available for analysis [2].

1.2 Heart Structure and electrophysiology

The heart is a muscular organ roughly the size of a closed fist. It is located between the lungs in the middle of the chest, behind and slightly to the left of the breastbone (sternum). During each cycle of contraction and relaxation, it receives deoxygenated blood from the body and pumps oxygenated blood throughout the body using a network of blood vessels called arteries. It sends deoxygenated blood to the lungs, where the blood absorbs oxygen and releases carbon dioxide, a waste product of metabolism. This section briefly discusses the anatomy, electrical behavior, and electrophysiology of the heart, as well as the ECG.

Anatomy of heart

Overall, there are four chambers in the heart. The two upper chambers are called atria (left atrium and right atrium), and the two lower chambers are called ventricles (left ventricle and right ventricle). The left atrium empties blood into the left ventricle through the mitral valve during atrial compression (Fig. 1.1). The compression of the left ventricle empties blood into the peripheral circulation, and the compression of the right ventricle empties blood into the pulmonary system. A natural electrical system causes the heart muscle to contract and pump blood through the heart to the lungs and the rest of the body [3].

Figure 1.1 Cross Sectional View of the Heart

There are four valves that enforce the direction of blood flow, as shown in Fig. 1.1. The tricuspid valve regulates blood flow between the right atrium and the right ventricle. The pulmonary valve controls blood flow from the right ventricle into the pulmonary arteries, which carry blood to the lungs to absorb oxygen. The mitral valve lets oxygen-rich blood from the lungs pass from the left atrium into the left ventricle. The aortic valve opens the way for oxygen-rich blood to pass from the left ventricle into the aorta, which is the body’s largest artery.

6

Figure 1.1 shows the cross-sectional view of the heart.

1.2.2 Electrical behavior of heart

An electrical impulse is responsible for the mechanical activation of the heart muscle. Each cycle is initiated by the spontaneous generation of an action potential in the sinoatrial node (SA node). This node is located in the superior lateral wall of the right atrium near the opening of the superior vena cava (see Fig. 1.2). The impulse travels through both atria and reaches the atrio-ventricular (AV) node, where it is delayed by about 0.1 second. The AV node is the junction point between the atria and the ventricles. The delay allows the atria to pump blood into the ventricles. After this, the ventricles are filled and ready to be activated. The impulse then propagates rapidly from the AV node through the whole ventricular muscle, allowing synchronized activation and, consequently, effective pumping of blood.

Figure 1.2 The Electrical Conduction of the Heart


This signal creates an electrical potential at the skin surface, which is recorded using one or more leads. This recording is called the electrocardiogram (ECG). ECG is further described in subsection 1.2.4. Fig. 1.2 shows the electrical conduction of the heart, where RA and LA denote the right atrium and left atrium, respectively, and RV and LV denote the right and left ventricle, respectively.

1.2.3 Electrophysiology

To understand why ECG is altered by electrolyte abnormality, it is necessary to understand how (myocardial) cells become polarized and depolarized, and the biochemical mechanism that allows the cells to contract.

1.2.3.1 Mechanics of contraction

The heart comprises a series of small cells. Each cell comprises two sliding halves held together by an interlocking mechanism using two proteins, actin and myosin, as shown in Fig. 1.3. The outsides of the cells are fused together to form long bands called myofibrils. These bands are held together side by side by connective tissue to form sheets covered with extracellular fluid. The main function of a band is to contract and expand. During depolarization, each cell contracts by a small amount due to a change in the ion potential. The cumulative contraction manifests as muscle-sheet contraction. The sheets are arranged to form the four sacs that constitute the four chambers of the heart.


1.2.3.2 Ion movement and polarity

The fluid inside and outside of the cells contains water, salt and proteins. In liquids, salts break down into positively and negatively charged particles known as ions. In the body, the main positively charged ions are sodium (Na+), potassium (K+), and calcium (Ca++). Chloride (Cl-) is the main negatively charged ion [3]. A live cell maintains differences in these concentrations across the cell membrane, as shown in Fig. 1.4. The inside of the cell has a higher K+ concentration, whereas the outside has a higher concentration of Na+ and Ca++. The higher positive charge outside the cell causes a relatively more negative charge inside the cell. This difference between the charges outside and inside the cell wall is known as the electric potential.

Figure 1.3 Myocardial cell (myocyte)

The cell membrane is semi-permeable, allowing ions to flow across it. Two main forces drive ions across the cell membrane: chemical potential, which forces an ion to move down its concentration gradient [1], and electric potential, which forces an ion to move away from ions with like charge. Transmembrane potential (TMP) is the electric potential difference (voltage) between the inside and the outside of the cell. Cardiac ion channels have the following properties:

Figure 1.4 Solutions inside and outside the cell are different. The pump (dark-blue dots) maintains the right number of ions on both sides of the walls

Selectivity: Channels are permeable to a single ion type based on their physical configuration.

Voltage-sensitive gating: A specific TMP range is required for a particular channel to open.

Time dependency: Some ion channels (fast Na+ channels) are configured to close a fraction of a second after opening and cannot be opened again until TMP is back at the resting level.

The action potential in a cardiac cell comprises five phases (Fig. 1.5) that are responsible for depolarization and repolarization.

The resting phase: Na+ has a natural tendency to enter the cell and K+ to exit the cell through the inward rectifier channels [1, 2] because of the chemical concentration differential. The electric potential of the resting cell is approximately -70 to -90 mV. Na+ and Ca++ channels are closed at resting TMP.

Depolarization: An action potential triggered in a neighboring cell causes TMP to rise above the resting potential. Fast Na+ channels open up and Na+ leaks into the cell, raising TMP. After TMP reaches the threshold potential, the Na+ current rapidly depolarizes TMP to slightly above 0 mV for a transient time, and the fast Na+ channels close due to their time dependency property. L-type (long opening) Ca++ channels open when TMP is greater than -40 mV and cause a steady influx of Ca++ down its concentration gradient.

Figure 1.5 Phases of myocyte (cardiac cell) stimulation

Early repolarization: At this stage, TMP is positive. Some K+ channels open up briefly and outward flow of K+ returns TMP to approximately 0mV.

The plateau phase: Ca++ channels are still open and there is a constant inward current of Ca++. K+ leaks out through delayed rectifier K+ channels due to the chemical concentration gradient. These countercurrents are electrically balanced, and TMP is maintained at the plateau.

Repolarization: Ca++ channels are gradually inactivated. The persistent outflow of K+ brings TMP back towards the resting potential of -70 to -90 mV to prepare the cell for a new cycle of depolarization. Normal transmembrane ionic concentration gradients are restored by returning Na+ and Ca++ ions to the extracellular environment using the Na+-Ca++ exchanger pump, and K+ ions to the cell interior using the Na+-K+ ATPase pump.

1.2.4 Heart and ECG

As the heart function produces an electrical field, the voltage generated can be recorded by the electrocardiograph from the surface of the body. The ECG is characterized by five consecutive waveforms, denoted as P-wave, Q-wave, R-wave, S-wave, and T-wave, and three baselines between these waveforms, denoted as TP, PQ and ST segments. The Q, R and S-waves together are called the QRS-complex, which represents ventricular depolarization.

The TP segment refers to the resting action potential before atrial depolarization begins. The P-wave is caused by the spread of depolarization through the atria. The PQ segment refers to the time taken by the impulse to spread throughout the atria and reach the AV-node. About 0.16 seconds after the onset of the P-wave, the QRS-complex appears because of electrical depolarization of the ventricles. At this point, the atria start repolarizing. This event is not seen on the ECG, since it occurs concurrently with the formation of the QRS-complex and is buried in the complex. The Q-wave indicates the beginning of ventricular depolarization. The R-wave represents the electric impulse depolarizing the ventricles as it travels down the Purkinje fibers, a three-dimensional electrical-impulse conduction mesh in the ventricles. The S-wave indicates complete depolarization of the ventricles. The ST segment refers to the beginning of ventricular repolarization. The T-wave represents the stage of ventricular repolarization when the ventricular muscle fibers relax. The morphology and timing of action potentials from different regions of the heart and the cardiac cycle of the ECG are shown in Fig. 1.6.

Figure 1.6 a) The morphology and timing of the action potentials from different regions of the heart; and b) The related cardiac cycle of the ECG as measured on the body surface.

1.3 Arrhythmia and Irregular Beats

1.3.1 Heart Arrhythmia

Heart arrhythmias occur when the electrical impulses that coordinate the heartbeats are irregular or missing, allowing ectopic nodes to take over. Ectopic nodes are groups of fibrous cells that fire randomly as the action potential builds up [1, 2]. The movement of these electric pulses causes irregular beats, partly due to the irregular movement of pulses within the atria and ventricles. Arrhythmias are characterized by the speed of the heartbeats. A very slow heart rate, called bradycardia, means the heart rate is less than 60 beats per minute. Tachycardia is a very fast heart rate, meaning the heart beats faster than 100 beats per minute.
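The rate thresholds above can be illustrated with a minimal Python sketch. It assumes that the heart rate is estimated from the mean RR-interval; the function names and sample values are hypothetical and are not taken from the implementation described in this dissertation.

```python
def heart_rate_bpm(rr_intervals_s):
    """Mean heart rate in beats per minute from RR-intervals given in seconds."""
    if not rr_intervals_s:
        raise ValueError("at least one RR-interval is required")
    mean_rr = sum(rr_intervals_s) / len(rr_intervals_s)
    return 60.0 / mean_rr


def classify_rate(bpm):
    """Label a heart rate using the thresholds stated in the text (60 and 100 bpm)."""
    if bpm < 60:
        return "bradycardia"
    if bpm > 100:
        return "tachycardia"
    return "normal rate"


rr = [0.50, 0.52, 0.48, 0.50]  # hypothetical RR-intervals in seconds
print(classify_rate(heart_rate_bpm(rr)))
```

A rate label alone does not identify the finer subclasses discussed next; it only separates slow, normal and fast rhythms.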

Arrhythmias are classified by their origin of occurrence and the circulation pattern of electric current within the heart. Arrhythmias that originate in the atria (upper chambers) or the AV-junction are called supraventricular arrhythmias. Ventricular arrhythmias originate in the ventricles. Based on the specific ectopic location of the electrical impulse, supraventricular and ventricular arrhythmias are classified into finer subclasses. We consider four major subclasses of supraventricular arrhythmia: Atrial Fibrillation (AFib), Atrial Flutter (AFlu), Atrioventricular Nodal Reentrant Tachycardia (AVNRT) and Ectopic Atrial Tachycardia (EAT). The three major subclasses of ventricular arrhythmia are: Ventricular Tachycardia (VTach), Ventricular Flutter (VFlu), and Ventricular Fibrillation (VFib). The subclassifications are illustrated in Fig. 1.7.

Supraventricular Arrhythmia: AFib, AFlu, AVNRT, EAT

Ventricular Arrhythmia: VTach, VFlu, VFib

Premature Beats: PAC, B-PAC, PJC, PVC

Figure 1.7 Arrhythmia and Premature Beat Finer Subclasses

1.3.2 Premature Beats and Beat-patterns

Premature beats or contractions originate in the atria, ventricles, or AV-junction. Depending on the location of the ectopic foci, premature beats are classified as: premature ventricular contractions (PVC), originating in the ventricles; premature atrial contractions (PAC), originating in the atria; or premature junctional contractions (PJC), originating at the AV-node. Classifications of arrhythmias and premature beats based on ectopic location are shown in Fig. 1.7.

Based upon the patterns of premature and regular sinus beats, ECG rhythms with premature contractions are further subclassified as: 1) bigeminy, in which one regular sinus beat is followed by one premature ectopic beat; 2) trigeminy, in which two regular sinus beats are followed by one premature ectopic beat; and 3) quadrigeminy, in which three regular sinus beats are followed by one premature ectopic beat.

1.4 Problem Definition

Algorithms that analyze the ECG signal to classify heart diseases must deliver both high accuracy and a quick response time. The technique should be able to catch every event in a single heartbeat cycle and analyze its characteristics correctly in order to classify the heart disease accurately.

Choosing an algorithm that can classify all heart diseases by analyzing the ECG signal is challenging because it requires: 1) capturing the finer characteristics and embedded waveforms suggestive of a subclass of arrhythmia or irregular beat; 2) capturing the temporal domain properties of arrhythmia; and 3) classifying the signal in real time to provide a quick response time for physicians.

These requirements can contradict each other. Achieving the first and second requirements increases the complexity of the algorithms, which increases the processing time and requires a larger sample size. The third requirement limits the sample size for real-time processing. If the complexity of the algorithms is reduced, the required features cannot be retrieved from the ECG and included in complex classification techniques, which might cause a loss of classification accuracy.

1.5 Previous Work and Limitations

The techniques for ECG signal analysis to obtain classification of heart diseases include heuristic and statistical techniques and testing the methods experimentally. Early work on automated classification of heart diseases includes heart rate variability study [4], hidden Markov Models (HMM) [10, 11], artificial neural networks (NN) [12, 13], support vector machines (SVM) [14], and cluster based techniques [15, 16].

Previous approaches for automated detection of subclasses of arrhythmias have met with limited success due to the lack of electric pulse synchronization, which causes inconsistent morphological changes in P-QRS-T waveforms and in their occurrence sequence due to irregular beats. The focus of previous research was mainly on classifiers that include inter-beat temporal characteristics such as the RR-interval. However, intra-beat temporal features, such as transitions between waveforms, integrated with morphological features were not studied. This limited previous research to detecting and classifying normal sinus beats, ventricular beats, and supraventricular beats.

When a large acquired dataset is analyzed, classification techniques use dimension reduction techniques such as Support Vector Machine and Principal Component Analysis to reduce the computational overhead. ECG signal classification depends on the small intervals and durations of morphological features, which are in the range of milliseconds. The features that are reduced away often play a major role in the finer classification of irregular beats, and are lost after dimension reduction.

1.6 This Research

This research has concentrated on the tasks of: 1) classifying ventricular and supraventricular arrhythmia into further subclasses; 2) classifying premature beats based on the origin of the ectopic location; 3) classifying patterns of premature beats with sinus beats by targeting the main challenges faced by ECG analysis; and 4) developing accelerated algorithms, implemented on GPUs, to classify arrhythmia and premature beats.

To develop a solution for the first task, a novel classification technique has been developed, which integrates a Markov model and a bivariate Gaussian distribution. The bivariate Gaussian distribution is used to capture the finer features of ECG waveforms. Markov models are used to capture inter-beat and intra-beat temporal-patterns of the ECG, including embedded waveform calculations [17, 19, 20].

Markov models are developed for each finer subclass of arrhythmia using training data available in annotated databases [21]. For an observed patient's ECG, transition graphs are developed for a group of beats in real time from a limited sample size. Transition graphs are essentially Markov models with Gaussian distribution, built, however, from a smaller subset of the available patient's data [21]. To classify a patient's ECG, a graph matching technique has been developed, which matches a transition graph with the Markov models developed for each subclass of arrhythmia. The transition graph is assigned to the Markov model with the highest probability given by the graph matching algorithm.
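The idea of combining waveform transitions with a bivariate Gaussian over morphological features can be sketched as follows. This is a simplified illustration and not the graph matching algorithm of this dissertation: it assumes that each waveform is described by an (amplitude, duration) pair, that each class model stores transition probabilities and one bivariate Gaussian per waveform, and that the best class is the one with the highest log score. The class names, transition probabilities and Gaussian parameters are hypothetical.

```python
import math


def log_bivariate_gaussian(x, y, mx, my, sx, sy, rho):
    """Log-density of a bivariate Gaussian with means, standard deviations and correlation rho."""
    zx = (x - mx) / sx
    zy = (y - my) / sy
    one_minus_r2 = 1.0 - rho * rho
    quad = (zx * zx - 2.0 * rho * zx * zy + zy * zy) / (2.0 * one_minus_r2)
    return -math.log(2.0 * math.pi * sx * sy * math.sqrt(one_minus_r2)) - quad


def sequence_log_score(model, observations):
    """Sum of log transition probabilities and log emission densities.

    observations: list of (waveform_state, amplitude_mV, duration_ms).
    """
    score, previous = 0.0, None
    for state, amplitude, duration in observations:
        if previous is not None:
            # Small floor so that unseen transitions do not give log(0).
            score += math.log(model["transitions"].get((previous, state), 1e-6))
        score += log_bivariate_gaussian(amplitude, duration, *model["emissions"][state])
        previous = state
    return score


def classify(models, observations):
    """Return the name of the class model with the highest log score."""
    return max(models, key=lambda name: sequence_log_score(models[name], observations))


# Hypothetical (mean_amp, mean_dur, std_amp, std_dur, rho) per waveform state.
EMISSIONS = {
    "P": (0.25, 95.0, 0.05, 10.0, 0.2),
    "QRS": (2.5, 90.0, 0.4, 10.0, 0.1),
    "T": (0.3, 180.0, 0.08, 35.0, 0.3),
}
# Hypothetical class models that differ only in how often a T-wave is followed by a P-wave.
MODELS = {
    "p_wave_present": {
        "transitions": {("P", "QRS"): 0.95, ("QRS", "T"): 0.95,
                        ("T", "P"): 0.95, ("T", "QRS"): 0.05},
        "emissions": EMISSIONS,
    },
    "p_wave_absent": {
        "transitions": {("P", "QRS"): 0.95, ("QRS", "T"): 0.95,
                        ("T", "P"): 0.10, ("T", "QRS"): 0.90},
        "emissions": EMISSIONS,
    },
}

beats_without_p = [("QRS", 2.4, 95.0), ("T", 0.28, 170.0),
                   ("QRS", 2.6, 88.0), ("T", 0.30, 190.0)]
print(classify(MODELS, beats_without_p))
```

Working in log space avoids numerical underflow when many waveforms are scored, which matters when the observation window grows beyond a few beats.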

The second task, classifying premature beats, used the same technique as arrhythmia sub-classification, with one modification. Markov models with Gaussian distribution were developed for each subclass of premature beat. However, instead of developing a transition graph for a patient's signal with a few beats, a transition graph was developed for each beat. The graph matching algorithm used to classify each premature beat remained the same for the second task.

To develop a solution for the third task, pattern matching techniques have been developed. The first part of the solution is to obtain a stream of beats and annotate each beat as either a regular or a premature beat. This is achieved using a rule-based system that analyzes the RR-interval of each beat. Each premature beat is classified based on its ectopic location using the algorithm developed for the second task. Next, a pattern matching algorithm based on nondeterministic finite state automata is used to classify the stream by its beat pattern.
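As a rough illustration of the pattern matching step, the sketch below labels an already annotated beat stream as bigeminy, trigeminy or quadrigeminy. It is an assumption-laden stand-in, not the automaton of this dissertation: beats are encoded as the hypothetical symbols "N" (regular sinus beat) and "P" (premature beat), regular expressions are used because they describe the same regular languages as finite state automata, and a pattern is reported only when it repeats at least twice in a row (an arbitrary choice).

```python
import re

# Each pattern must repeat at least twice in a row to be reported (assumed rule).
PATTERNS = {
    "bigeminy": re.compile(r"(?:NP){2,}"),        # one sinus beat, one premature beat
    "trigeminy": re.compile(r"(?:NNP){2,}"),      # two sinus beats, one premature beat
    "quadrigeminy": re.compile(r"(?:NNNP){2,}"),  # three sinus beats, one premature beat
}


def classify_beat_pattern(labels):
    """Return the names of all beat patterns found in a stream of 'N'/'P' labels."""
    stream = "".join(labels)
    return [name for name, pattern in PATTERNS.items() if pattern.search(stream)]


print(classify_beat_pattern("NNPNNPNNP"))
```

The function accepts either a string or a list of single-character labels, so it can be applied directly to the output of a per-beat annotation step.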

One of the major challenges faced in this research was the similarity of the waveform features exhibited by different subclasses of arrhythmia and premature beats. Some previous researchers tackled this problem using signal processing techniques such as the Discrete Wavelet Transform (DWT) and the Fast Fourier Transform (FFT). However, none of these techniques extract hidden features. For instance, the P-wave of a premature beat originating in the atria can get embedded in the T-wave of the previous beat. This phenomenon is known as the “P-on-T” phenomenon [2, 5]. Since these two waveforms are superimposed on each other, one of the waveforms is considered missing, depending upon the observed magnitude. The presence or absence of each waveform is an important criterion for any classification model. Physicians tackle this problem by closely examining the ECG strip and inferring embedded waveforms from the added height or width that the P-wave gives to the underlying T-wave. Added height and/or width increases the area under the waveform.

To automate embedded waveform detection, a novel technique of area subtraction was used. If a missing waveform is observed during feature-extraction, the area of the possible underlying waveform is calculated and compared with the average area of a normal waveform. Based on a statistically obtained threshold, the waveform is considered either present or absent, and average features are assigned to it.
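A minimal sketch of the area comparison is given below. It is only an illustration under stated assumptions: the area is computed with the trapezoidal rule, the waveforms are synthetic half-sine shapes, and the sampling rate, amplitudes and threshold are hypothetical values rather than the statistically derived ones used in this research.

```python
import math


def trapezoid_area(samples_mv, sampling_rate_hz):
    """Area under a sampled waveform (mV*s) using the trapezoidal rule."""
    dt = 1.0 / sampling_rate_hz
    return sum((a + b) * 0.5 * dt for a, b in zip(samples_mv, samples_mv[1:]))


def has_embedded_wave(segment_mv, sampling_rate_hz, normal_area, threshold):
    """Return (flag, excess): flag is True when the excess area exceeds the threshold."""
    excess = trapezoid_area(segment_mv, sampling_rate_hz) - normal_area
    return excess > threshold, excess


def half_sine(amplitude_mv, duration_s, sampling_rate_hz):
    """Synthetic half-sine waveform used here as a stand-in for a T-wave or P-wave."""
    n = int(round(duration_s * sampling_rate_hz))
    return [amplitude_mv * math.sin(math.pi * i / n) for i in range(n + 1)]


FS = 360             # assumed sampling rate in Hz
THRESHOLD = 0.004    # hypothetical area threshold in mV*s

t_wave = half_sine(0.3, 0.2, FS)      # isolated T-wave
p_wave = half_sine(0.15, 0.08, FS)    # P-wave of a premature beat
t_with_p = list(t_wave)
for i, value in enumerate(p_wave):    # superimpose the P-wave on the T-wave
    t_with_p[30 + i] += value

normal_area = trapezoid_area(t_wave, FS)
print(has_embedded_wave(t_wave, FS, normal_area, THRESHOLD)[0])
print(has_embedded_wave(t_with_p, FS, normal_area, THRESHOLD)[0])
```

In this synthetic case the isolated T-wave yields no excess area, while the superimposed waveform exceeds the threshold and is flagged as containing an embedded wave.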

Another feature-related challenge was P-wave formation on the ECG. The morphology of the P-wave (amplitude, duration) plays a major role in supraventricular arrhythmia analysis [1, 5]. The recorded P-wave is a superimposition of depolarization waves from the left atrium and the right atrium: the first one-third of a P-wave corresponds to right-atrial activation, the final one-third corresponds to left-atrial activation, and the middle one-third combines the two [5]. This information about P-waves reveals the origin of the ectopic location in the atria, which is useful for the finer classification of supraventricular arrhythmia. In this research, a detailed analysis was performed to resolve the P-wave. In this technique, the P-wave is split into two waves by slope analysis of the wave. Based on the rising and falling edges of the two superimposed waves that form the P-wave, four slopes with directions are obtained. These four slopes, with the height and duration of the peak of each edge, are included in the Markov model for the finer classification of supraventricular arrhythmia.

Since the developed algorithms involve extracting finer features from the signal and using them in a multivariate classifier, the computational complexity of the algorithms increased. To achieve a real-time response without reducing the complexity of the analysis, accelerated versions of the classification algorithms for arrhythmia and premature beats were developed for implementation on a Graphics Processing Unit (GPU).

ECG signals used for training and testing were obtained from the freely available PhysioNet [21] databases. The developed algorithms have been implemented in MATLAB [22] using its extensive libraries, and the GPU algorithms were written in C++ using Microsoft Visual Studio 2017. The software was executed on a machine having an Intel(R) Xeon(R) dual core CPU E5-2680 @2.70GHz 64-bit system with 128 GB RAM and a CUDA-enabled GeForce GTX 1050 Ti GPU card [23].

Contributions

The major contributions of this dissertation are:

1) Finer analysis of temporal-patterns using Markov models;

2) Integration of temporal and morphological features such as amplitude, duration, and slope;

3) Resolution of P-waveform and integration in the Markov model for the subclassification of supraventricular arrhythmia;

4) Embedded waveform detection using area subtraction technique;

5) Integration of Markov model with bivariate Gaussian distribution to capture the statistical distribution of both temporal and morphological features;

6) Transition graph and graph matching technique for labeling the patients’ conditions in real-time;

7) Nondeterministic finite state algorithm for classifying premature beat-patterns; and

8) Coarse-grain parallelization of algorithms for implementing on Graphics Processing Unit (GPU) without loss of accuracy.

A prototype system that simulates the analysis in real time has been developed using standard single-lead recordings available from the MIT database. The system can be ported to GPU-based wearable devices for technology transfer.

Applications

When a patient reports symptoms of an irregular cardiac event, physicians can monitor cardiac activity using a Holter monitor, a cardiac event monitor, or mobile cardiac telemetry [7]. Holter monitors are worn by patients for 24-48 hours. A cardiac event monitor is worn for 30 days by patients whose symptoms occur infrequently. Mobile cardiac telemetry (MCT) devices are small portable monitors suitable for elderly patients. When a cardiac anomaly is detected, MCT automatically sends data via a mobile network to a monitoring center that is staffed 24 hours a day, where the data is interpreted by a qualified, cardiac-trained registered nurse. In contrast to the cardiac event monitor, MCT provides real-time monitoring and analysis.


The developed algorithms are suitable for the three monitoring devices mentioned above. The main applications of this research are:

1. Automated tool to help physicians accurately classify signal in real time.

2. Prediction of sudden cardiac death and other life-threatening arrhythmia.

3. Real-time monitoring of the elderly patients.

The developed algorithms for real-time classification can be integrated in any monitor, since they use ECG data from a single lead. The data obtained can be used by health care professionals, including physicians, nurses, therapists and technicians, to bring together knowledge from many technical sources, to identify the pattern of ECG data observed for a patient, and to predict adverse conditions based on signals classified into arrhythmia or premature beat subclasses.

1.7 Roadmap

The overall organization of this research is:

In chapter 2, background knowledge of cardiac disease, heart arrhythmia with subclasses, irregular beat and beat-pattern is discussed. Several statistical background concepts, including probability, Gaussian distribution and machine learning concepts used in developing the dissertation are discussed. ECG signal feature-extraction techniques and parallelism concepts used for the algorithm are discussed.

In chapter 3, a literature survey of previous methods used for ECG signal analysis and classification of heart abnormalities in broader classes, and their limitations, are discussed.

In chapter 4, identification and classification of the ECG signal into finer subclasses of ventricular and supraventricular arrhythmia are discussed. In this chapter, algorithms that improve feature-extraction and integrate it with Markov models are presented.

In chapter 5, identification and classification of irregular beats based on origin of ectopic location are discussed. In this chapter, irregular beat-pattern classification algorithm is presented. Performance analysis of the implemented algorithms is also presented.

In chapter 6, parallel processing of the classification algorithms on a GPU is presented. In this chapter, module dependency and algorithms for parallel processing of arrhythmia and irregular beat-patterns are presented. Performance analysis of the parallelized algorithms is also presented.

In chapter 7, implementation details about the algorithms are discussed along with database resources used. Machine configurations of CPU and GPU used are also discussed.

In chapter 8, performance evaluation of the obtained results is discussed.

In chapter 9, contemporary related work in the field of arrhythmia and irregular beat-pattern classification is discussed. Some limitations of that work are discussed and compared with this research.

The last chapter concludes this dissertation. In this chapter, the contributions and applications of this research, along with its limitations and future work, are discussed.


Background

This chapter describes the basic knowledge of ECG waveforms; signal processing concepts; ECG measurement; arrhythmia subclasses, premature beat subclasses, and beat-patterns; mathematical and statistical concepts; machine learning concepts; GPU-based parallelization concepts; and accuracy of the results obtained.

2.1 ECG Signal Representation

An ECG is the recording of the electrical activity of the heart on a well-defined metric scale. The X-axis measures time, and the Y-axis measures electrical activity in millivolts. The peaks and valleys in the line tracings are called waves. There are five major waveforms: P-wave, Q-wave, R-wave, S-wave, and T-wave [5], as shown in Fig. 2.1. The Q, R and S waves are associated with ventricular depolarization, and are grouped as the QRS-complex.

In the following subsection, individual components and features for those components observed for lead II of ECG are presented.

Figure 2.1 Components of ECG Complex


2.1.1 Components of ECG-complex

The P-wave: The P-wave represents electrical depolarization of the atria. The wave starts when the SA node, located in the top right corner of the right atrium, fires. The duration varies between 80 and 110 milliseconds (ms) and the amplitude is 0.25 millivolt (mV) in healthy adults.

The PQ segment: The PQ segment (also called the PR segment) occupies the time-frame between the end of the P-wave and the beginning of the QRS-complex. It is usually found along the baseline.

The PR-interval: The PR-interval represents the period from the beginning of a P-wave to the beginning of the corresponding QRS-complex. It includes the P-wave and the PR segment. It covers all events from the initiation of the electrical impulse in the SA node up to the moment of ventricular depolarization.

The QRS-complex: It represents ventricular depolarization with the Q, R and S-waves. By convention, the Q-wave is the first negative deflection after the P-wave, with an amplitude of −0.7 mV and a duration of 20 ms. The Q-wave is associated with the earlier part of ventricular depolarization, when the right ventricle gets the electrical signal before the left ventricle. The R-wave represents the major depolarization of the ventricles. It is a sharp peak with an amplitude of 2.5 mV and a duration of 50−60 ms. The first negative deflection after the R-wave is the S-wave, with an amplitude of −0.8 mV and a duration of around 20 ms. The duration of the QRS-complex is less than 100 ms for healthy adults.

The ST segment: The ST segment is the section of the ECG cycle from the end of the QRS-complex to the beginning of the T-wave. The ST segment is usually found along the baseline, with a duration that varies from 50 ms to 150 ms.

The QT-interval: The QT-interval represents the time of ventricular activity, including ventricular depolarization and repolarization. It is measured from the beginning of a QRS-complex to the end of the corresponding T-wave in the same ECG cycle.

The T-wave: The T-wave represents ventricular repolarization. It is the next positive deflection after the ST segment. The amplitude of the T-wave is less than 0.3 mV and its duration varies from 110 to 250 ms for healthy adults.

For accurate heart-rate analysis, a precise and reliable ECG waveform recognition procedure is necessary. The goal of ECG signal analysis is to extract the various temporal and morphological features that characterize various heart diseases.
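As a small illustration of how the durations quoted above can serve as reference values, the following sketch flags measured durations that fall outside the stated ranges for healthy adults. The dictionary and function names are hypothetical, the QRS bound of "less than 100 ms" is approximated by an inclusive range, and the check is not a diagnostic rule from this dissertation.

```python
# Duration ranges in milliseconds, taken from the component descriptions in the text.
NORMAL_DURATION_MS = {
    "P": (80, 110),
    "QRS": (0, 100),   # the text states "less than 100 ms"; approximated as 0..100
    "ST": (50, 150),
    "T": (110, 250),
}


def out_of_range(durations_ms):
    """Return the components whose measured duration lies outside the quoted range."""
    flagged = {}
    for name, value in durations_ms.items():
        low, high = NORMAL_DURATION_MS[name]
        if not (low <= value <= high):
            flagged[name] = value
    return flagged


measured = {"P": 95, "QRS": 140, "ST": 100, "T": 200}  # hypothetical measurements
print(out_of_range(measured))
```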

2.1.2 ECG Measurement

Electrodes are sensing devices that pick up electrical activity on the skin surface. When a positive electrical impulse moves away from the electrode, the ECG machine records it as a negative (downward) waveform. When a positive wave moves towards an electrode, the ECG device records it as a positive (upward) waveform. When the electrode is somewhere in the middle, the ECG shows a positive deflection for the amount of energy that is coming toward it and a negative deflection for the amount going away from it [3].

Fig. 2.2 shows the lead placement for a 12-lead ECG. In total, 10 electrodes are placed on a patient. Six chest leads, or precordial leads (V1 to V6), are placed around the chest to cover the horizontal plane. Four limb leads, or extremity leads, are placed on the right arm (RA), left arm (LA), right leg (RL) and left leg (LL). The electrode on the right leg (RL) is not included in any ECG measurement; rather, it serves as a ground wire [1, 2]. The three leads I, II, and III are computed using the actual measurements from the electrodes RA, LA and LL. In addition, there are three virtual leads aVL, aVR and aVF that are computed using the leads I, II and III. Together, leads I, II, III, aVR, aVL and aVF measure the electrical activities in the vertical plane.


Leads I, II, and III require a negative and a positive electrode (bipolarity) for monitoring. Lead I is formed using RA as the negative electrode and LA as the positive electrode. Lead II is formed using RA as the negative electrode and LL as the positive electrode. Lead III is formed using LA as the negative electrode and LL as the positive electrode. Leads aVL, aVF and aVR are composite leads, and require only a positive electrode for monitoring. Limb leads look at the heart in the coronal (vertical) plane. Chest leads look at the heart in the transverse (horizontal) plane.
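The electrode polarities above translate directly into differences of electrode potentials. The sketch below computes the six limb leads for one sample. The bipolar leads follow the polarities given in the text; the expressions for aVR, aVL and aVF are the standard augmented-lead relations, which are not spelled out in this chapter and are included here as an assumption. The electrode potentials in the example are hypothetical.

```python
def limb_leads(ra, la, ll):
    """Compute the six limb leads from the RA, LA and LL electrode potentials (mV)."""
    lead_i = la - ra     # RA negative, LA positive
    lead_ii = ll - ra    # RA negative, LL positive
    lead_iii = ll - la   # LA negative, LL positive
    return {
        "I": lead_i,
        "II": lead_ii,
        "III": lead_iii,
        # Standard augmented-lead relations expressed through leads I and II.
        "aVR": -(lead_i + lead_ii) / 2.0,
        "aVL": lead_i - lead_ii / 2.0,
        "aVF": lead_ii - lead_i / 2.0,
    }


leads = limb_leads(ra=-0.2, la=0.3, ll=0.9)  # hypothetical potentials in mV
print(leads)
```

Two consistency checks follow from these definitions: lead II equals the sum of leads I and III, and the three augmented leads sum to zero.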

Lead II is used more frequently because most of the heart's electrical current flows parallel to the direction of lead II [2, 3]. In this research, lead II has been analyzed for classifying arrhythmia and irregular beat-patterns.

Figure 2.2 12-leads placement

2.2 Signal Processing Techniques

In this section, fundamental signal processing techniques used for ECG signal analysis are discussed. Signal processing techniques are used for the extraction of the P-QRS-T waveform and for the extraction of various features.

2.2.1 Convolution

Convolution in digital signal processing (DSP) is used to analyze the effect of combining two signals to form a third signal [25]. An input signal is decomposed into simple additive components. Each of these components is passed through a linear system, and the resulting output components are synthesized [25]. Two ways of decomposition are: impulse decomposition and Fourier decomposition. When impulse decomposition is used, the procedure is described by a mathematical operation called convolution. Fig. 2.3 shows convolution using linear systems. An input signal x[n], enters a linear system with an impulse response h[n], resulting in an output signal y[n] as described in equation 2.1.

\[ x[n] * h[n] = y[n] \tag{2.1} \]

Figure 2.3 Convolution
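Equation 2.1 can be made concrete with a short sketch of discrete convolution written directly from its definition. The function name and the example signals are illustrative only; in practice a library routine would normally be used.

```python
def convolve(x, h):
    """Discrete linear convolution y[n] = sum over k of x[k] * h[n - k]."""
    if not x or not h:
        return []
    y = [0.0] * (len(x) + len(h) - 1)
    for n, x_n in enumerate(x):
        for k, h_k in enumerate(h):
            # Each input sample contributes a scaled, shifted copy of the impulse response.
            y[n + k] += x_n * h_k
    return y


print(convolve([1, 2, 3], [1, 1]))  # → [1.0, 3.0, 5.0, 3.0]
```

Each input sample acts as a scaled impulse, and the output is the sum of the correspondingly scaled and shifted impulse responses, which is the impulse decomposition described above.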

2.2.2 Fast Fourier Transform (FFT)

A signal comprises one or more frequencies, and can be viewed from two standpoints: time-domain and frequency-domain [25, 26]. A fast Fourier transform (FFT) is an algorithm that samples a signal over an interval (or space) and divides it into its frequency components. These components are single sinusoidal oscillations at distinct frequencies, each with its own amplitude and phase. An FFT algorithm computes the discrete Fourier transform (DFT) of a sequence, or its inverse (IFFT). Fourier analysis converts a signal from its original domain to a representation in the frequency domain and vice versa. The DFT for complex numbers $(x_0, \ldots, x_{N-1})$ is defined by equation 2.2.

\[ X_k = \sum_{n=0}^{N-1} x_n e^{-i 2\pi k n / N}, \quad k = 0, \ldots, N-1 \tag{2.2} \]

The major disadvantage of a Fourier transform is that it can only derive the constituent frequencies in a signal; it cannot determine the time-of-occurrence or the relative temporal order.
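The following sketch evaluates equation 2.2 directly and illustrates the limitation mentioned above. It is a plain O(N^2) evaluation for illustration, not an FFT implementation, and the test signals are hypothetical. Two impulses that occur at different times produce identical magnitude spectra, so the magnitudes alone do not reveal when an event occurred.

```python
import cmath
import math


def dft(x):
    """Direct O(N^2) evaluation of equation 2.2 (an FFT returns the same values faster)."""
    n_samples = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / n_samples) for n in range(n_samples))
        for k in range(n_samples)
    ]


def magnitudes(x):
    """Magnitude of each DFT coefficient."""
    return [abs(value) for value in dft(x)]


# One cycle of a cosine over 8 samples: energy appears only in bins 1 and 7.
cosine = [math.cos(2 * math.pi * n / 8) for n in range(8)]

# The same impulse at two different times: the magnitude spectra are identical.
early_impulse = [1, 0, 0, 0, 0, 0, 0, 0]
late_impulse = [0, 0, 0, 1, 0, 0, 0, 0]

print([round(m, 6) for m in magnitudes(cosine)])
print([round(m, 6) for m in magnitudes(early_impulse)])
print([round(m, 6) for m in magnitudes(late_impulse)])
```

The timing information is carried only by the phase of the coefficients, which motivates the time-frequency analysis of the next subsection.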

2.2.3 Wavelet Transform

The wavelet transform is a convolution of the wavelet function with the signal.

Wavelet analysis uses a fully scalable modulated window [27, 28]. The window is shifted along the signal and for every position the spectrum is calculated using wavelet. This process is repeated with a slightly shorter (or longer) window for every new cycle. In the end, the result is a collection of time-frequency representations of the signal, all with different resolutions.

Wavelets are described by real-valued or complex-valued waveforms, which have a definite beginning and end with a mean value of zero. The wavelet transform of a signal is obtained by comparing the input signal with the dilated and shifted versions of the wavelet, also known as the mother wavelet. Due to their limited duration, wavelets can deal simultaneously with time and frequency. There are two types of wavelet transforms: continuous wavelet transform (CWT) and discrete wavelet transform (DWT).


The continuous wavelet transform (CWT) is a time–frequency analysis which allows variable window-width related to the scale of observation. The wavelet transform of a continuous signal $x(t)$ is defined by equation 2.3.

$$T(a, b) = \frac{1}{\sqrt{a}} \int_{-\infty}^{\infty} x(t)\, \psi^{*}\!\left(\frac{t-b}{a}\right) dt \tag{2.3}$$

Where $\psi^{*}$ is the complex conjugate of the analyzing wavelet function $\psi(t)$, $a$ is the dilation parameter of the wavelet, and $b$ is the location parameter.

DWT is sufficient for extracting interesting features of ECG. In its most common form, the DWT employs a dyadic grid (integer power of two for the dilation parameter a) and orthonormal wavelet basis functions, and exhibits zero redundancy. DWT is given by the equation 2.4.

$$T_{m,n} = \int_{-\infty}^{\infty} x(t)\, \psi_{m,n}(t)\, dt \tag{2.4}$$

Where $T_{m,n}$ is known as the wavelet (or detail) coefficient at scale and location indices $(m, n)$. The DWT method enables a multiresolution analysis by iteratively decomposing a discrete signal $x(t)$ into low and high frequency components using low-pass $h(t)$ and high-pass $g(t)$ filters determined by a mother wavelet function (see Fig. 2.4). A signal $x(t)$ is decomposed into low (approximation) and high (detail) frequency components for denoising and feature extraction in specific frequency bands.

The choice of mother wavelet depends on the application of interest [28]. Selecting a wavelet that closely matches the signal to be processed is important in wavelet applications [27]. Daubechies 6 (Db6) [28] wavelet is similar in shape to QRS-complex

(see Fig. 2.5) and its energy spectrum is concentrated around low frequencies. In this research, we use the Db6 wavelet to denoise the ECG signal and extract its waveforms.

Figure 2.4 DWT decomposition

Figure 2.5 Daubechies Db6 wavelet and Q-wave match

2.3 ECG Signal Classification

The ECG signal classification process has three generic modules: denoising, feature extraction, and classification, as shown in Fig. 2.6.

Figure 2.6 Block Diagram of ECG Signal Classification

In this subsection, denoising and feature-extraction techniques that are used in this research prior to implementation of classification algorithms are discussed.

2.3.1 Denoising

There are three common sources of noise in an ECG signal: 1) baseline wander (BW), a low-frequency, high-bandwidth component caused by respiration and body movement; 2) power line interference (PLI), caused by electromagnetic fields from power lines; and 3) electromyography (EMG) noise, contributed by the contraction of muscles other than the heart.

In this research, the DWT-based ECG denoising proposed by Sadhukhan [29] and Mahmoodabadi [30] is used. The DWT utilizes two sets of functions, associated with the low-pass and the high-pass filters respectively, to split the signal into frequency components known as frequency sub-bands. Denoising uses the wavelet transform to filter out the wavelet coefficients produced mainly by noise. The two approaches used for denoising are the linear approach and the non-linear approach based on wavelet thresholding.

The linear approach assumes that noise can be found within certain scales. The non-linear approach relies on the fact that the signal has its energy concentrated in a small subset of wavelet coefficients, which are relatively large compared to those of the noise, whose energy is spread over a large number of coefficients. This means that thresholding or shrinking the wavelet coefficients removes the low-amplitude noise. The commonly used threshold selection rules are FIXTHRESH, RIGRSURE, HEURSURE, MINIMAX, and soft thresholding [25].

2.3.1.1 Removal of baseline wander and low-frequency noise

To remove baseline variation, whose frequency lies around 0.15-0.5 Hz, the Daubechies wavelet Db6 is selected as the mother wavelet due to its shape similarity with the QRS-complex. By dividing the sampling frequency along the dyadic scale, the frequency range of each sub-band is identified. The databases used for testing the algorithm have a 360 Hz sampling frequency. Hence, according to the dyadic scale, the signal needs to be decomposed to the 8th level to identify the low-frequency noise. The decomposition level is selected so that the coefficients of the last level correspond to the noise frequency band. The approximation band at level 8 (roughly 0 to 0.7 Hz) contains this frequency component of the ECG. Discarding this band and reconstructing the signal eliminates the low-frequency noise.
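The dyadic sub-band arithmetic can be sketched as follows. The sketch assumes ideal half-band filters, so the band edges are nominal; real Db6 filters overlap at the edges. The function name `dyadic_bands` is introduced only for this illustration.

```python
def dyadic_bands(fs, levels):
    """Nominal DWT sub-bands in Hz for a signal sampled at fs.

    Returns a list of (level, detail_band, approximation_band) tuples,
    assuming ideal half-band splitting at every level.
    """
    nyquist = fs / 2.0
    bands = []
    for level in range(1, levels + 1):
        upper = nyquist / 2 ** (level - 1)
        lower = nyquist / 2 ** level
        bands.append((level, (lower, upper), (0.0, lower)))
    return bands


bands = dyadic_bands(fs=360, levels=8)
for level, detail, approx in bands:
    print(f"level {level}: detail {detail[0]:.3f}-{detail[1]:.3f} Hz, "
          f"approximation 0-{approx[1]:.3f} Hz")
```

For a 360 Hz signal, the level-8 approximation spans 0 to about 0.70 Hz, which covers the 0.15-0.5 Hz baseline-wander range.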

2.3.1.2 Removal of high frequency and EMG noise

The 60 Hz power-line interference, EMG interference, and background noise are high-frequency noises that overlap with the high-frequency components of the ECG signal. This noise is eliminated using wavelet-threshold-based denoising. The Db6 wavelet was chosen because it leads to the lowest reconstruction error [25]. The ECG data is first decomposed up to the 4th level using Db6 as the mother wavelet. The coefficients of each detail sub-band are then thresholded using the HEURSURE threshold selection rule [25]. The approximation coefficients of the ECG signal contain the low frequencies of the original signal, where most of the energy exists. Hence, the approximation coefficients are not thresholded. The denoised signal is then reconstructed using the inverse DWT.
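The decompose, threshold, and reconstruct sequence can be illustrated with a minimal sketch. To stay dependency-free, it swaps in the Haar wavelet for Db6, uses a fixed hand-picked threshold instead of the HEURSURE rule, and decomposes to two levels instead of four. All function names and the toy signal are assumptions for this example. Only the detail coefficients are shrunk, while the approximation is left untouched.

```python
import math


def haar_step(x):
    """One DWT level with the orthonormal Haar filters: (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


def haar_inverse_step(approx, detail):
    """Invert haar_step, interleaving the reconstructed sample pairs."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out


def soft_threshold(coeffs, t):
    """Shrink each coefficient toward zero by t; magnitudes below t become 0."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]


def denoise(x, levels, t):
    """Decompose, soft-threshold the detail bands only, and reconstruct."""
    approx, details = list(x), []
    for _ in range(levels):
        approx, d = haar_step(approx)
        details.append(d)
    for d in reversed(details):  # coarsest detail band first
        approx = haar_inverse_step(approx, soft_threshold(d, t))
    return approx


# Toy signal (length divisible by 2**levels): a step plus small alternating noise.
clean = [0.0] * 8 + [4.0] * 8
noisy = [c + 0.1 * (-1) ** i for i, c in enumerate(clean)]
denoised = denoise(noisy, levels=2, t=0.2)
unchanged = denoise(noisy, levels=2, t=0.0)
```

With a zero threshold the transform reconstructs the input exactly. With the threshold above the noise level, the small detail coefficients are removed while the step is preserved.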

2.3.2 Feature-extraction

ECG delineation (determination of the peaks, onset, and offset points of the individual QRS waves, P-waves, and T-waves) is the key step in feature extraction. The detection is quite challenging due to the time-varying nature of the ECG morphology under different physiological conditions and the presence of noise. In this research, the feature-extraction technique presented by Sadhukhan and Mahmoodabadi [29, 30], based on multiresolution DWT and on relative magnitude and slope comparison, is used. For feature extraction, three separate algorithms are implemented for: 1) R-wave detection, 2) Q-wave and S-wave detection, and 3) P-wave and T-wave detection. Based on this information, the isoelectric lines (baselines) between the waveforms are computed at the end.

2.3.2.1 Detection of R wave

The R-wave is a significant feature in the ECG signal, which is characterized by sharp slopes, and its energy spectrum lies mostly between 1 and 40 Hz, centered near 17 Hz [26]. Accordingly, the detail sub-bands d4 and d5 contribute the most to the QRS regions. The squared product of these bands $(d_4 \cdot d_5)^2$ is used to localize the R-waveform. The bands were sorted in descending order of magnitude.

2.3.2.2 Detection of Q, R and S peaks

The individual peaks within the QRS regions are identified by relative magnitude and slope based criteria used in [31]. The position of the absolute maximum of the relative magnitudes in the corresponding QRS window is detected to be the location of significant

peak. A peak with an amplitude greater than the average amplitude of the signal is labeled as a positive R-peak. A detected peak with negative amplitude before the R-peak is labeled as the Q-peak. A detected peak with negative amplitude after the R-peak, within the QRS window, is labeled as the S-peak.

2.3.2.3 Detection of the onset and offset points

The onset point is detected as the minimum-slope point within a 40 ms window starting from the Q-peak index and moving toward the start of the ECG samples. The offset point is detected as the minimum-slope point within a 40 ms window starting from the S-peak index and moving toward the end of the QRS window. Similarly, the onset and offset points of the individual Q-wave, R-wave, and S-wave are detected. The results are shown in Fig. 2.7.

Figure 2.7 Result for QRS detection

2.3.2.4 Detection of P and T waves

Since the energies of the T and P waves are mainly distributed in sub-bands d6 and d7, the squared sum of these two bands $(d_6 + d_7)^2$ is used to detect the T-wave and P-wave regions. The process is illustrated in Fig. 2.8. Since the d6 and d7 sub-bands also contain a part of the QRS energy, the sampled values corresponding to the QRS time windows are eliminated from these sub-bands. The elements exceeding the threshold value after a detected S-peak, within the first 2/3 of the interval between two successive QRS regions, are identified as the T-wave region. The elements exceeding the threshold within the latter 1/3 of the interval are identified as the P-wave region.

2.3.2.5 Detection of the Inverted T-waves

The absolute maxima points or the slope inversion points within the wave regions are labeled as the peaks based on the QRS location, to ensure detection of negative T-waves in pathological cases.

2.3.2.6 Detection of the onset and offset points

The average slope for each wave region is calculated. A slope-threshold-based search is initiated from the peak point toward either side. The point where the slope first falls below the threshold value on the side preceding the peak is labeled as the onset point, and the corresponding point on the other side is labeled as the offset point. The results of labeling the P-wave and T-wave features are shown in Fig. 2.8.

2.3.2.7 Detection of baselines

Baseline segments are detected from the onset and offset points of the waveforms. For the PQ segment, the P-wave offset and the Q-wave onset are computed, and the difference between the detection times of these points is assigned as the duration of the PQ segment. If one or both of these points do not lie on the zero-crossing baseline, the average of the amplitudes of these points is calculated and assigned as the amplitude of the detected baseline. Similarly, the TP and ST segments are derived.

Figure 2.8 Result for P and T wave detection

2.4 Arrhythmias

Arrhythmias are abnormal heartbeat rhythms caused by irregular electrical impulse generation and conduction [5]. There are many subclasses of arrhythmias based upon their origin. Some cause heart to skip or add a beat, but are benign in the short term. Other arrhythmias are life threatening. Untreated arrhythmias affect heart’s pumping action and blood flow, which can lead to serious problems like coronary heart disease, stroke, sudden cardiac death (SCD), and congestive heart failure [3, 4].

Arrhythmias originating in the atria or above the ventricles are called supraventricular arrhythmias. Ventricular arrhythmias originate in the ventricles. All arrhythmias are caused by the presence of the ectopic nodes – a grouping of fibrous tissues that generate low amplitude electric signals interfering with regular heart-beats.

2.4.1 Supraventricular arrhythmias

Supraventricular arrhythmias originate in the atria, the AV-node, or the atrial conduction pathways. Supraventricular arrhythmias are responsible for blood clots and strokes, and are chronic in nature [2]. The four major subclasses of supraventricular arrhythmia are: atrial fibrillation (AFib), atrial flutter (AFlu), atrial-ventricular nodal reentrant tachycardia (AVNRT), and ectopic atrial tachycardia (EAT).

Atrial Fibrillation (AFib)

AFib occurs when action potentials fire rapidly within the atria in a chaotic manner [1, 2]. The result is fast unsynchronized electrical activity in the atria. The atrial action potentials attempt to conduct through the AV node as shown in Fig. 2.9. However, the AV node becomes intermittently refractory and allows only a fraction of atrial action potentials to reach the ventricles. This results in a ventricular rate of 100-150 beats per minute (bpm).

AFib causes blood to pool in the atria and form clots, which often leads to stroke [5]. The treatment for AFib includes blood thinners to prevent clots and strokes, and/or beta-blocker or calcium-channel blockers [2, 5] to slow down the heart-beats.

Figure 2.9 Atrial fibrillation and atrial flutter

Atrial Flutter (AFlu)

AFlu occurs when a reentrant electrical circuit is present in the atria, causing a repeated loop of electrical activity to depolarize the atria at a rate of about 250-350 bpm (Fig. 2.9b). Due to the faster heart rate, the heart does not pump enough blood to the body. This can lead to congestive heart failure, heart attack, or stroke [2]. It is treated with oral drugs such as beta-blockers or calcium-channel blockers to slow down the heartbeat. Depending on severity and duration, electrical cardioversion (a shock to restore rhythm) is also considered.

Atrial-Ventricular Nodal Reentrant Tachycardia (AVNRT)

AVNRT is a frequently occurring junctional or AV nodal arrhythmia [2, 5]. There are two pathways for the depolarization impulse to travel in the AV junction, as shown in Fig. 2.10. The impulse travels over the slow pathway towards the ventricles and returns via the fast pathway to the atria, causing interference with the next heartbeat. Most episodes of AVNRT do not require treatment unless prolonged or frequent episodes are experienced [1, 2, 5]. Treatments include cardioversion, catheter ablation, beta-blockers, or vagal maneuvers such as holding breath and straining or coughing [1, 2, 5].

Figure 2.10 Atrio-Ventricular reentrant tachycardia and ectopic atrial tachycardia

Ectopic Atrial Tachycardia (EAT)

EAT has a discrete focus in the main mass of the atria, outside the sinus node and junctional region, as shown in Fig. 2.10. This focus creates action potentials faster than the sinus node and becomes the predominant pacemaker of the heart [1, 2, 3]. The treatment of EAT is focused on rhythm control and prevention of recurrence. For persistent EAT, treatment can be beta-blockers or catheter-based ablation [2].

2.4.2 Ventricular arrhythmia

Ventricular Arrhythmias originate in ventricles due to the rapid and irregular firing of ectopic nodes present in ventricles. The electric potential is low, and does not yield regular blood flow in the body. Ventricular Arrhythmias are life threatening, and can lead to sudden cardiac death (SCD) [3, 4].

Ventricular Tachycardia (VTach)

VTach is a sequence of four or more depolarizations with ventricular origin (see Fig. 2.11), with a rate of around 100 bpm. The start and end of a ventricular tachycardia episode are sudden [3]. Long-term treatment of VTach includes oral antiarrhythmic medications, an implantable cardioverter defibrillator to correct abnormal rhythms, radiofrequency ablation, where an electric current produced by radio waves destroys abnormal tissue, or cardiac-resynchronization therapy, where an implantable device regulates heartbeats.

Figure 2.11 Ventricular tachycardia

Ventricular Flutter (VFlu)

VFlu is a rapid ventricular arrhythmia, with a rate of 180−250 bpm. Ventricular flutter occurs in severe organic heart disease, and consists of many ectopic locations that start depolarization. The most common cause of ventricular flutter is ischemia – lack of oxygen [3]. If not treated with either CPR (cardiopulmonary resuscitation) or electric shock in time, it degenerates into ventricular fibrillation [4].

Ventricular Fibrillation (VFib)

VFib represents the disappearance of organized ventricular electrical activity, with ten or more ectopic nodes in the ventricles causing depolarization [2, 3, 4]. It is the most serious cardiac arrhythmia, with serious hemodynamic consequences. The outcome includes loss of the pumping action of the heart, collapse of cardiac output, and loss of blood pressure. Most of the time, ventricular fibrillation is irreversible. Emergency treatment of VFib includes cardiopulmonary resuscitation (CPR) or defibrillation, where an electric shock is delivered to the heart to arrest the chaotic rhythm [2].

2.5 Premature Beats and Patterns

The presence of ectopic nodes in the heart causes irregular, asynchronous beats that interfere with the normal heart rhythm. Premature heartbeats get interleaved with regular sinus beats, producing a pattern of sinus beats and premature beats. Premature beat-patterns are precursors to arrhythmias. Based on the number of premature beats between sinus beats, irregular beat-patterns are classified as bigeminy, trigeminy, and quadrigeminy [1, 2].

Occasionally, the unsynchronized contraction momentarily disrupts the regular blood-ejection volume pattern [3] and can start an arrhythmia, especially in the elderly age group [2]. An increased frequency of premature beats is statistically predictive of serious types of arrhythmia such as ventricular fibrillation and sudden death [2].

2.5.1 Premature atrial contractions (PAC)

PACs occur due to the presence of one or more irritated ectopic nodes in atria as shown in Fig. 2.12. An ectopic node eventually misfires earlier than the SA-node, creating a premature beat. The subclass of PAC where transmission of an atrial impulse is blocked at or below AV-junction is called B-PAC (Blocked PAC). Frequent occurrence of B-PACs leads to hemodynamic compromise [3]. Persistent PACs precipitate into atrial arrhythmia, and are treated with beta-blockers [2].

Figure 2.12 Premature atrial contraction and blocked-premature atrial contraction

2.5.2 Premature ventricular contractions (PVC)

A PVC occurs when an ectopic node in the ventricle misfires, generating an action potential before the next scheduled regular heart-beat. Persistent PVCs precipitate into

ventricular arrhythmia. Treatment for PVCs includes radiofrequency ablation, or oral medication such as beta-blockers for frequent PVCs [2].

2.5.3 Premature junctional contractions (PJC)

PJCs are caused by the increased automaticity of the AV junction and can be recurrent. PJCs are associated with a noncompensatory pause because the retrogradely conducted atrial impulse depolarizes and resets the SA node, as shown in Fig. 2.13. Unless they are frequent, PJCs do not require treatment. For symptomatic PJCs, an antiarrhythmic drug is prescribed before they degenerate into junctional arrhythmia [3].

Figure 2.13 Premature ventricular contraction and premature junctional contraction

2.5.4 Classification of irregular beat-patterns

Beat-patterns are classified based on the periodicity with which premature beats occur among regular sinus beats. If a premature beat occurs in every other cycle, the pattern is known as bigeminy. If a premature beat occurs in every third cycle, it is known as trigeminy. If it occurs in every fourth cycle, it is known as quadrigeminy. Each beat-pattern is characterized by the type of premature beat that accompanies it. Treatment for beat-patterns depends on the ectopic focus in the heart that causes the premature beat in the observed pattern [2, 3]. In terms of severity, bigeminy is more severe than trigeminy, which in turn is more severe than quadrigeminy.
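The periodicity rule can be sketched as a small function over a sequence of beat labels. This is only an illustration of the definitions above and not the look-ahead pattern analysis developed later in this dissertation. The label alphabet ('N' for a sinus beat, 'P' for a premature beat) and the requirement of at least three premature beats are assumptions made for the example.

```python
def classify_pattern(beats, min_premature=3):
    """Name the beat-pattern if premature beats recur with a fixed period.

    beats: sequence of 'N' (sinus beat) and 'P' (premature beat) labels.
    Returns 'bigeminy', 'trigeminy', 'quadrigeminy', or None.
    """
    names = {2: "bigeminy", 3: "trigeminy", 4: "quadrigeminy"}
    positions = [i for i, b in enumerate(beats) if b == "P"]
    if len(positions) < min_premature:
        return None  # too few premature beats to call it a pattern
    # Distances between consecutive premature beats must all be equal.
    gaps = {later - earlier for earlier, later in zip(positions, positions[1:])}
    if len(gaps) != 1:
        return None
    return names.get(gaps.pop())


print(classify_pattern("NPNPNPNP"))       # every other beat is premature
print(classify_pattern("NNPNNPNNP"))      # every third beat is premature
print(classify_pattern("NNNPNNNPNNNP"))   # every fourth beat is premature
```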

2.6 ECG Variations in Arrhythmias and Irregular Beats

An overview of the specific lead II ECG characteristics of each subclass of arrhythmia is given briefly below. Figs. 2.14 and 2.15 show example ECGs for each subclass obtained from the MIT-BIH arrhythmia dataset [24].

Figure 2.14 Beat variations of ventricular and supraventricular arrhythmia

AFib: Since depolarization of the atria and ventricles is not synchronized, the action potentials produced are of low amplitude, and P-waves are not visible. In patients with left atrial enlargement, the P-wave is present with a notch [1].

AFlu: A characteristic "sawtooth" pattern of the P-waves is seen. Baseline segments have negative or positive amplitude. S-waves can have a notch, representing left and right ventricular repolarization at different times.

AVNRT: The retrograde P-wave shows up at the end of the QRS-complex. The PR distance is less than 100 ms, and a narrow QRS-complex is observed. Due to the presence of the loop near the AV junction, the R-waveform has a smaller amplitude. The S-wave can be missing.

EAT: Due to one or more ectopic locations in the atria, the P-wave can be negative or have less than normal amplitude. The PR-interval is shorter than usual.

VTach: It is characterized by AV dissociation, which means that P-waves are not always followed by QRS-complexes. QRS-complexes have a wider duration due to slower depolarization. S-waves can be missing.

VFlu: It is characterized by a continuous sine wave with no identifiable QRS-complexes, T-waves, or baseline segments. The amplitudes of the observed waveforms are more than twice those of normal R-waves.

VFib: Due to chaotic irregular deflections of varying amplitude, no identifiable P-waves or T-waves are detected. The QRS-complex is seen as an action potential traveling from the atria to the ventricles during ventricular repolarization. No baseline segments are observed.

PAC: P-wave morphology and duration vary depending upon the location of the misfiring ectopic nodes. This can include P-wave inversion, burial of the P-wave inside the preceding T-wave, or a change in the PR-interval. Embedding of a P-wave in the previous T-wave alters the shape and amplitude of the T-wave.

B-PAC: B-PACs exhibit a morphology similar to PACs. However, due to the block at the AV node, the P-wave is embedded in the previous beat's T-wave [1, 2].

PVC: It is characterized by a long QRS duration (> 120 ms), ST depression, and missing or retrograde P-waves. The P-wave can be embedded in the QRS-complex, leading to a missing Q-wave.

PJC: Since the impulse starts very close to the ventricles, a narrow QRS-complex is observed, and when a P-wave is present there is often a short PR-interval.

Figure 2.15 Beat variations for premature beats

2.7 Mathematical Concepts

As this study is focused on the analysis of ECG signal, it uses many mathematical concepts such as conditional probability, Gaussian distribution function, mixed Gaussian distribution function and Markov Model. This section briefly describes these mathematical concepts.

2.7.1 Probability

Probability describes the likeliness of an event to occur [32] based upon statistical analysis of unbiased experiments or observations. Probabilities are associated with experiments where the outcome is not known in advance or cannot be predicted. There are three kinds of probabilities: 1) Marginal − probability of an event occurring, 2) Joint − probability of combination of two or more events occurring, and 3) conditional − probability that an event occurs after another event has occurred. The conditional probability of A given B is equal to the joint probability of A and B divided by the marginal probability of B (see equation 2.5).

$$P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)} \tag{2.5}$$

2.7.2 Gaussian distribution

A Gaussian distribution, also commonly called the "normal distribution", has a "bell-shaped curve" [32]. Gaussian distributions are used to model real-valued random variables whose distributions are not known. The probability density of the normal distribution is given by equation 2.6.

$$P(X \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}} \tag{2.6}$$

Where $\mu$ is the mean, $\sigma$ is the standard deviation, and $X$ is a normally distributed variable, as shown in Fig. 2.16.

Figure 2.16 Normal distribution

The multivariate Gaussian distribution is a generalization of the one-dimensional (univariate) normal distribution to higher dimensions [33]. A random vector is said to be k-variate normally distributed if every linear combination of its k components has a univariate normal distribution. The probability density of the multivariate normal distribution is given by equation 2.7.

$$P_X(x_1, \ldots, x_k) = \frac{\exp\left(-\frac{1}{2} (x-\mu)^{T} \Sigma^{-1} (x-\mu)\right)}{\sqrt{(2\pi)^{k} \, |\Sigma|}} \tag{2.7}$$

Where $x$ is a k-dimensional vector, $\Sigma$ is the corresponding covariance matrix, and $\mu$ is the mean.
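The two densities can be evaluated with a few lines of code. The sketch below assumes the standard normal density, with $2\sigma^2$ in the denominator of the exponent, and specializes equation 2.7 to the bivariate case k = 2, using the closed-form inverse of a 2×2 covariance matrix. The function names and test values are illustrative assumptions.

```python
import math


def gaussian_pdf(x, mu, sigma):
    """Univariate normal density with mean mu and standard deviation sigma."""
    var = sigma ** 2
    return math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def bivariate_gaussian_pdf(x, mu, cov):
    """Equation 2.7 for k = 2; cov is a 2x2 covariance matrix as nested lists."""
    (s11, s12), (s21, s22) = cov
    det = s11 * s22 - s12 * s21
    dx, dy = x[0] - mu[0], x[1] - mu[1]
    # Quadratic form (x - mu)^T cov^{-1} (x - mu) via the 2x2 closed-form inverse.
    quad = (s22 * dx * dx - (s12 + s21) * dx * dy + s11 * dy * dy) / det
    return math.exp(-0.5 * quad) / math.sqrt((2 * math.pi) ** 2 * det)


peak_1d = gaussian_pdf(0.0, 0.0, 1.0)
peak_2d = bivariate_gaussian_pdf((0.0, 0.0), (0.0, 0.0), [[1.0, 0.0], [0.0, 1.0]])

# With a diagonal covariance the bivariate density factorizes into two
# univariate densities (independent components).
joint = bivariate_gaussian_pdf((1.0, 2.0), (0.0, 0.0), [[4.0, 0.0], [0.0, 9.0]])
product = gaussian_pdf(1.0, 0.0, 2.0) * gaussian_pdf(2.0, 0.0, 3.0)
```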

2.7.3 Gaussian mixture model (GMM)

GMM is a distribution assembled from weighted multivariate Gaussian distributions [32, 34]. Weighting factors assign each distribution a different level of importance. Each point in the data set is assumed to be generated by a mixture of Gaussians, and its probability is computed as shown in equation 2.8.

$$p(x) = \sum_{j=1}^{k} w_j \, N(x \mid \mu_j, \Sigma_j) \tag{2.8}$$

Where $w_j$ is the weight of the $j$th Gaussian, $\sum_{j=1}^{k} w_j = 1$, and $0 \le w_j \le 1$.

GMM formulates the problem as follows: given a set of data $X = \{x_1, x_2, \ldots, x_n\}$ drawn from a GMM distribution, estimate the parameters $\theta$ of the GMM that fits the data. The solution to this problem is to maximize the likelihood of the data given the model parameters, formulated as $P(X \mid \theta)$.

2.7.4 Expectation maximization (EM)

One of the popular approaches to maximize the likelihood of the GMM parameters (weight, mean, and standard deviation for each component) is to use expectation maximization [32]. The EM iteration alternates between performing an expectation (E) step and a maximization (M) step. The E-step creates a function for the expectation of the log-likelihood evaluated using the current estimate of the GMM parameters. The M-step computes the parameters maximizing the expected log-likelihood found in the E-step. These parameter estimates are then used to determine the distribution of the latent variables in the next cycle [32]. The process is repeated until the parameters do not change significantly from E-step to M-step [33].
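The alternation of E-step and M-step can be sketched for a one-dimensional mixture. The example below is a simplified illustration: it uses a fixed number of iterations instead of a convergence test, a small variance floor to avoid degenerate components, and a hand-made data set with two well-separated groups. All names and values are assumptions.

```python
import math


def normal_pdf(x, mu, var):
    return math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_gmm_1d(data, mus, variances, weights, iterations=50):
    """Fit a 1-D Gaussian mixture with EM, starting from the given parameters."""
    mus, variances, weights = list(mus), list(variances), list(weights)
    k = len(mus)
    for _ in range(iterations):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            p = [weights[j] * normal_pdf(x, mus[j], variances[j]) for j in range(k)]
            total = sum(p)
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weight, mean, and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            mus[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - mus[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # floor keeps the density well defined
    return mus, variances, weights


# Hypothetical data: one group around 0 and one group around 10.
data = [-0.5, -0.2, 0.0, 0.1, 0.4, 0.2, 9.5, 9.8, 10.0, 10.3, 10.4]
mus, variances, weights = em_gmm_1d(
    data, mus=[1.0, 8.0], variances=[1.0, 1.0], weights=[0.5, 0.5]
)
print(mus, variances, weights)
```

On this data the estimated means settle near the two group averages, and the weights approach the fraction of points in each group.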

2.8 Machine Learning Concepts

This section describes the machine learning concepts used in this dissertation.

2.8.1 Supervised learning

Supervised learning is the machine learning task of learning a function that maps

an input to an output based on provided examples of input-output pairs [36]. It infers a function from labeled training data consisting of a set of training examples.

Let $\{(x_1, y_1), \ldots, (x_N, y_N)\}$ be a set of $N$ training examples, such that $x_i$ is the feature vector of the $i$th example and $y_i$ is its label (class). A supervised learning algorithm seeks a function $g: X \rightarrow Y$, where $X$ is the input space and $Y$ is the output space.

The most widely used supervised learning algorithms are support vector machines, naïve Bayes classifiers, linear discriminant analysis, decision trees, and neural networks.

2.8.2 Rule-based system

Rule-based systems are used as a way to store and manipulate knowledge to interpret information in a useful way. They are often used in artificial intelligence applications and research. Rule-based systems involve human-crafted or curated rule sets.

A typical rule-based system has four basic components:

1. A list of rules or rule base, which is a specific type of knowledge base.
2. An inference engine, which infers information or takes action based on the interaction of input and the rule base [36].
3. Temporary working memory to store the rules and input.
4. A user interface or other connection to the outside world through which input and output signals are received and sent.

2.8.3 Decision tree

A decision tree is a supervised machine learning algorithm that uses a tree-like graph or model of decisions and their possible consequences to classify data using training examples [36]. Each internal node of the tree represents a test on an attribute, each branch represents an outcome of the test, and each leaf node represents a class label (the decision taken after evaluating all attributes).

The decision tree algorithm starts with a training data set that contains the classification attributes. The best attribute is the one that produces the most information gain. The data set is split into subsets, one for each possible value of the best attribute. A decision tree node is created for the best attribute, and new subtrees are generated recursively from the subsets until no further classification is possible.
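The selection of the best attribute by information gain can be sketched as follows. The sketch uses Shannon entropy, which is the usual basis for information gain. The attribute names, values, and class labels are toy assumptions chosen only to make the arithmetic easy to follow, and are not clinical rules.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting the data on one attribute."""
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[attribute], []).append(label)
    remainder = sum(len(s) / len(labels) * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder


def best_attribute(rows, labels):
    """Attribute with the largest information gain."""
    return max(rows[0].keys(), key=lambda a: information_gain(rows, labels, a))


# Toy training set (hypothetical attribute names and labels).
rows = [
    {"qrs": "wide", "p_wave": "absent"},
    {"qrs": "wide", "p_wave": "present"},
    {"qrs": "narrow", "p_wave": "absent"},
    {"qrs": "narrow", "p_wave": "present"},
]
labels = ["ventricular", "ventricular", "supraventricular", "supraventricular"]

gain_qrs = information_gain(rows, labels, "qrs")
gain_p = information_gain(rows, labels, "p_wave")
root = best_attribute(rows, labels)
```

Here the "qrs" attribute separates the two classes completely, so it has the full gain of one bit and would be chosen as the root node, while "p_wave" provides no gain.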

2.8.4 Naive Bayes classifier

Naive Bayes classifiers are a family of probabilistic classifiers based on applying Bayes' theorem with strong independence assumptions between the features [33]. Given a problem instance to be classified, represented by a vector $x = (x_1, \ldots, x_n)$ of $n$ features (independent variables), the probabilistic naïve Bayes model assigns to this instance the probabilities $p(C_k \mid x_1, \ldots, x_n)$ for each of the $K$ possible outcomes or classes $C_k$.

The conditional probability $p(C_k \mid x_1, \ldots, x_n)$ is calculated using Bayes' theorem and the conditional independence of the features, as given in equation 2.9.

$$p(C_k \mid x) = \frac{p(C_k)\, p(x \mid C_k)}{p(x)} \tag{2.9}$$

The naive Bayes classifier combines this probabilistic model with a decision rule [32]. A decision rule is a function which maps an observation to an appropriate action. The most popular decision rule for the naïve Bayes classifier is the maximum a posteriori (MAP) estimation function [32].
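A minimal sketch of the MAP decision rule for categorical features is given below. The evidence term $p(x)$ is omitted because it is the same for every class and does not change the arg max. Laplace (add-one) smoothing is an assumption added here to avoid zero probabilities for unseen values. The feature values and class names are toy examples.

```python
from collections import Counter, defaultdict


def train(rows, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    priors = Counter(labels)
    counts = defaultdict(Counter)  # (class, feature index) -> value counts
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            counts[(label, i)][value] += 1
    return priors, counts


def predict(row, priors, counts, alpha=1.0, n_values=2):
    """Return the MAP class and the unnormalized posterior score of each class.

    alpha is the Laplace smoothing constant; n_values is the assumed number of
    distinct values each feature can take.
    """
    total = sum(priors.values())
    scores = {}
    for c, n_c in priors.items():
        score = n_c / total  # prior p(C_k)
        for i, value in enumerate(row):
            # Smoothed estimate of p(x_i | C_k); features treated as independent.
            score *= (counts[(c, i)][value] + alpha) / (n_c + alpha * n_values)
        scores[c] = score
    return max(scores, key=scores.get), scores


# Toy data: (QRS width, P-wave) -> class label (hypothetical).
rows = [("wide", "absent"), ("wide", "absent"), ("wide", "present"),
        ("narrow", "present"), ("narrow", "present"), ("narrow", "absent")]
labels = ["PVC", "PVC", "PVC", "PAC", "PAC", "PAC"]

priors, counts = train(rows, labels)
label, scores = predict(("wide", "absent"), priors, counts)
```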

2.8.5 Support vector machine

Support Vector Machine (SVM) is a discriminative classifier formally defined by a separating hyperplane [35]. It is a supervised algorithm which, given labeled training examples, computes an optimal hyperplane that separates the data. In two-dimensional space, the hyperplane is a line dividing the plane into two parts, with each class lying on one side, as shown in Fig. 2.17. Support vectors are the data points (shown in blue) in the training examples that lie closest to the decision surface, also known as the hyperplane.

Fig. 2.17 shows an example of a dataset with two dimensions $X_1$ and $X_2$. For data that can be separated linearly, two parallel hyperplanes, shown with dashed lines, are computed that separate the two classes of data so that the distance between them is maximum. The region between these hyperplanes is known as the margin, and the maximum-margin hyperplane is the one that lies in the middle of them. The hyperplanes are computed using equation 2.10.

$$w \cdot x_i - b \ge 1 \ \text{ if } \theta_i = 1 \quad \text{and} \quad w \cdot x_i - b \le -1 \ \text{ if } \theta_i = -1 \tag{2.10}$$

Where $w$ is the normal vector to the hyperplane, $\theta_i$ denotes the class, $b$ denotes the bias, and $x_i$ denotes the features. The distance between the two hyperplanes is $\frac{2}{\|w\|}$. To maximize this distance, the denominator $\|w\|$ should be minimized.
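The relation between $\|w\|$ and the margin can be checked numerically. The sketch below assumes the standard form of the constraints, with $w \cdot x_i - b \le -1$ for the class $\theta_i = -1$, and combines both cases as $\theta_i (w \cdot x_i - b) \ge 1$. The points and candidate hyperplanes are made-up values, and no optimization is performed.

```python
import math


def decision_value(w, b, x):
    """Signed value w.x - b of a point x relative to the hyperplane."""
    return sum(wi * xi for wi, xi in zip(w, x)) - b


def margin_width(w):
    """Distance 2 / ||w|| between the two bounding hyperplanes."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))


def satisfies_constraints(w, b, points, classes):
    """Check theta_i * (w.x_i - b) >= 1 for every training point."""
    return all(theta * decision_value(w, b, x) >= 1
               for x, theta in zip(points, classes))


# Hypothetical linearly separable data in two dimensions.
points = [(0.0, 0.0), (0.0, 1.0), (2.0, 0.0), (2.0, 1.0)]
classes = [-1, -1, 1, 1]

wide_ok = satisfies_constraints((1.0, 0.0), 1.0, points, classes)
narrow_ok = satisfies_constraints((2.0, 0.0), 2.0, points, classes)
too_wide_ok = satisfies_constraints((0.5, 0.0), 0.5, points, classes)
print(margin_width((1.0, 0.0)), margin_width((2.0, 0.0)))
```

Both feasible candidates separate the data, but the one with the smaller norm gives the wider margin. Reducing the norm further violates the constraints, which is why the optimization minimizes $\|w\|$ subject to them.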

Figure 2.17 SVM with two hyperplanes for two dimensional data

2.8.6 Clustering

Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar to each other than to those in other groups (clusters) [36]. Clustering can be divided into two subgroups:

Hard clustering: Each data point either belongs to a cluster completely or does not belong to it at all.

Soft clustering: Instead of putting each data point into a single cluster, a probability or likelihood of that data point belonging to each cluster is assigned.

The most popular types of clustering algorithms include:

Connectivity models: These models are based on the notion that data points closer together in the data space exhibit more similarity to each other than data points lying farther away. Examples of these models are the hierarchical clustering algorithm and its variants [36].

Centroid models: These are iterative clustering algorithms in which the notion of similarity is derived from the closeness of a data point to the centroid of a cluster. The K-means clustering algorithm is a popular algorithm that falls into this category [36].

Distribution models: These clustering models are based on the notion of how probable it is that all data points in a cluster belong to the same distribution (for example, the normal or Gaussian distribution). A popular example of these models is the expectation-maximization algorithm, which uses multivariate normal distributions [36, 37].

2.8.7 Genetic algorithm

Genetic algorithms (GA) are commonly used to generate high-quality solutions to optimization and search problems by relying on bio-inspired operators such as mutation, crossover and selection [36]. This algorithm reflects the process of natural selection where the fittest individuals are selected for reproduction in order to produce offspring of the next generation. GA considers five phases:

1. Initial population: The process begins with a set of individuals which is called a population.
2. Fitness function: The fitness function determines how fit an individual is. It gives a fitness score to each individual. The probability that an individual will be selected for reproduction is based on its fitness score.
3. Selection: The idea of the selection phase is to select the fittest individuals and let them pass their genes to the next generation.
4. Crossover: For each pair of parents to be mated, a crossover point is chosen at random from within the genes.
5. Mutation: In certain new offspring formed, some of their genes can be subjected to a mutation with a low random probability. Mutation occurs to maintain diversity within the population and prevent premature convergence.

The algorithm terminates if the population has converged (does not produce offspring which are significantly different from the previous generation).
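The five phases can be sketched on a toy problem. The example below maximizes the number of 1-bits in a binary string (the "OneMax" problem). Tournament selection, elitism (copying the best individual unchanged), a fixed number of generations instead of a convergence test, and all parameter values are assumptions chosen for brevity.

```python
import random


def fitness(individual):
    """Fitness function: number of 1-bits (OneMax toy problem)."""
    return sum(individual)


def tournament(population, rng):
    """Selection: pick two individuals at random and keep the fitter one."""
    a, b = rng.sample(population, 2)
    return a if fitness(a) >= fitness(b) else b


def evolve(n_genes=20, pop_size=30, generations=60, mutation_rate=0.02, seed=0):
    rng = random.Random(seed)
    # Initial population of random bit strings.
    population = [[rng.randint(0, 1) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    history = [max(map(fitness, population))]
    for _ in range(generations):
        children = [max(population, key=fitness)[:]]  # elitism
        while len(children) < pop_size:
            p1, p2 = tournament(population, rng), tournament(population, rng)
            point = rng.randint(1, n_genes - 1)       # crossover point
            child = p1[:point] + p2[point:]
            # Mutation: flip each gene with a low probability.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        population = children
        history.append(max(map(fitness, population)))
    return max(population, key=fitness), history


best, history = evolve()
print(fitness(best), history[0], history[-1])
```

Because the best individual is always carried over, the best fitness recorded in `history` never decreases from one generation to the next.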

2.8.8 Principal component analysis (PCA)

Principal component analysis (PCA) [35] improves computational efficiency by reducing the dimensionality of a data set consisting of many correlated variables, while retaining, to the maximum extent possible, the variation present in the data set [35]. There are four steps to reduce the dimensionality of data using PCA:

1. Data is normalized for each dimension by first subtracting mean from each data point in each dimension. 2. Covariance matrix is calculated for the dimensions. 3. Eigenvectors and eigenvalues of covariance matrices are calculated. 4. Eigenvectors with the highest eigenvalue are chosen as the principle components of dataset. After choosing the components (eigenvectors), transpose of the vector is multiplied with the transposed dataset to obtain data with reduced dimensions.

Figure 2.18 An illustration of principal component analysis

Fig. 2.18 shows an example of PCA with two features: feature1 and feature2. The principal components shown with green lines are the eigenvectors of covariance matrix.

The eigenvector with maximum eigenvalue is picked because it shows the maximum variance.
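For the two-feature case, the four PCA steps can be sketched in plain Python. The function name `pca_2d` and the closed-form eigen-decomposition of the 2 × 2 covariance matrix are choices made for this illustration only; a general implementation would rely on a numerical linear algebra library.

```python
import math

def pca_2d(points):
    """Return (eigenvalues, eigenvectors, projected) for 2-D data.

    Eigenvalues are in decreasing order; projected holds the coordinate
    of every centred point along the first principal component.
    """
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    # Step 1: subtract the mean of each dimension.
    centred = [(x - mean_x, y - mean_y) for x, y in points]
    # Step 2: sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum(x * x for x, _ in centred) / (n - 1)
    syy = sum(y * y for _, y in centred) / (n - 1)
    sxy = sum(x * y for x, y in centred) / (n - 1)
    # Step 3: closed-form eigenvalues of a symmetric 2x2 matrix.
    half_trace = (sxx + syy) / 2
    root = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    eigenvalues = [half_trace + root, half_trace - root]
    if sxy != 0:
        # (sxy, lam - sxx) satisfies (C - lam I) v = 0.
        raw = [(sxy, lam - sxx) for lam in eigenvalues]
    elif sxx >= syy:
        raw = [(1.0, 0.0), (0.0, 1.0)]
    else:
        raw = [(0.0, 1.0), (1.0, 0.0)]
    eigenvectors = [(vx / math.hypot(vx, vy), vy / math.hypot(vx, vy))
                    for vx, vy in raw]
    # Step 4: keep the eigenvector with the largest eigenvalue and project.
    pc_x, pc_y = eigenvectors[0]
    projected = [x * pc_x + y * pc_y for x, y in centred]
    return eigenvalues, eigenvectors, projected
```

For points lying exactly on a line, such as `[(1, 2), (2, 4), (3, 6), (4, 8)]`, the second eigenvalue is zero up to rounding, so a single principal component retains all of the variance.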


2.8.9 Linear Discriminant Analysis

Linear Discriminant Analysis (LDA) is a commonly used supervised dimensionality reduction technique [35]. The goal is to project a dataset with n-dimensional samples onto a space with k dimensions (where k ≤ n − 1) with good class separability, in order to avoid overfitting and to reduce computational costs. There are five steps to reduce data using LDA:

1. Compute the d-dimensional mean vectors for the different classes from the dataset.
2. Compute the between-class and within-class covariance matrices.
3. Compute the eigenvectors and corresponding eigenvalues.
4. Sort the eigenvectors by decreasing eigenvalues and choose the k eigenvectors with the largest eigenvalues to form a d × k dimensional matrix.
5. Use the d × k eigenvector matrix to transform the samples onto the new subspace.

While PCA concentrates on finding the component axes that maximize the variance, LDA attempts to find the feature subspace that maximizes class separability.
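The five steps can be written compactly in the usual textbook notation. The symbols below are assumptions introduced for this illustration and are not taken from the source: $C$ is the number of classes, $D_c$ the set of samples in class $c$, $N_c$ its size, $m_c$ the class mean, $m$ the overall mean, and $X$ the data matrix with one sample per row. The matrices $S_W$ and $S_B$ are the within-class and between-class scatter matrices, which play the role of the covariance matrices in step 2.

```latex
\begin{align*}
S_W &= \sum_{c=1}^{C} \sum_{x \in D_c} (x - m_c)(x - m_c)^{T} \\
S_B &= \sum_{c=1}^{C} N_c \,(m_c - m)(m_c - m)^{T} \\
S_W^{-1} S_B \, w_j &= \lambda_j \, w_j, \qquad \lambda_1 \ge \lambda_2 \ge \dots \\
W &= [\, w_1 \;\; w_2 \;\; \cdots \;\; w_k \,] \in \mathbb{R}^{d \times k}, \qquad Y = X W
\end{align*}
```

Each row of $Y$ is a sample expressed in the new $k$-dimensional subspace.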

2.8.10 Markov model

A Markov model is a probabilistic nondeterministic finite-state automaton used to study temporally changing systems [38] as a set of discrete states and probabilistic transitions between the states in discrete unit time steps. The transition to the next state depends upon the current state and a finite set of prior states.

To reduce the computational complexity, the first-order Markov model, a restricted case of the Markov model, is used. The first-order Markov model [38] assumes that the transition to the next state depends only on the current state; all earlier states are ignored. Given the variables $X_1, X_2, X_3, \ldots, X_t$ taking on values $x_1, x_2, x_3, \ldots, x_t$, the first-order Markov assertion is given by equation 2.11.

$$P(X_t = x_t \mid X_{t-1} = x_{t-1}, \ldots, X_2 = x_2, X_1 = x_1) = P(X_t = x_t \mid X_{t-1} = x_{t-1}) \tag{2.11}$$

A Markov model $M$ is characterized by a quadruple of the form $\{S, T, I, O\}$, where $S$ is the finite set of states of size $N$, $T$ is the $N \times N$ matrix of transition probabilities, $I$ is the initial probability vector for the start states, and $O$ is the set of emissions for each state in $S$. The transition probabilities of a Markov model are generated from an observation sequence by calculating the frequency of transitions from each state to every other state. Each frequency is normalized by dividing the number of transitions between two states by the cumulative number of transitions out of the source state, so that each row of $T$ sums to 1.
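The frequency-counting estimate of the transition matrix can be sketched in Python as follows. The function name `estimate_transitions` and the example state sequence are hypothetical; the sketch assumes that each count is normalized by the number of transitions leaving the source state.

```python
from collections import Counter

def estimate_transitions(sequence, states):
    # Count every observed transition (current state, next state).
    counts = Counter(zip(sequence, sequence[1:]))
    matrix = {}
    for src in states:
        total = sum(counts[(src, dst)] for dst in states)
        # A state with no outgoing transitions keeps an all-zero row.
        matrix[src] = {dst: (counts[(src, dst)] / total if total else 0.0)
                       for dst in states}
    return matrix

# Hypothetical observed state sequence.
sequence = ["S1", "S2", "S1", "S1", "S3", "S1", "S2"]
T = estimate_transitions(sequence, ["S1", "S2", "S3"])
```

In this example, state S1 is left four times (twice to S2, once to S1, once to S3), so its row is 0.25, 0.5 and 0.25 for S1, S2 and S3 respectively.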

Figure 2.19 An illustration of transitions in first-order Markov model

An example of a first-order Markov model comprising three states $S = \{S_1, S_2, S_3\}$ is shown in Fig. 2.19. Each state generates an output from the finite set $O = \{a, b, c\}$. The arrows between states represent the transition probabilities of going from one state to another. It is an example of a cyclic, fully connected Markov model, since each state is reachable from any other state.


Table 2.1 represents the transition probability matrix $T$ for the model. Each cell $T_{ij}$ represents the transition probability of going from state $i$ to state $j$. As can be seen from Table 2.1, each row of the matrix sums to 1, which represents the normalized probability for the Markov model. Table 2.1 also gives the initial state distribution.

Table 2.1 Probability matrices in Markov model

Initial state distribution:

State   Probability
S1      0.8
S2      0.1
S3      0.1

Transition probability matrix T:

        S1     S2     S3
S1      0.2    0.2    0.6
S2      0.3    0.2    0.5
S3      0.2    0.6    0.2

2.8.11 Hidden Markov Model

In Markov models, the state and output are directly visible to the observer. However, in the hidden Markov model (HMM), the state is not directly visible, but multiple signals emitted from the states are visible. There is a many-to-many mapping between the states and the emissions. Each state has a probability distribution over the possible emissions. Therefore, the sequence of tokens generated by an HMM gives probabilistic information about the sequence of states. HMMs have been applied in temporal-pattern recognition tasks such as speech and gesture recognition, and in genomics to identify genes [36].

An example of an HMM comprising three hidden states $\{S_1, S_2, S_3\}$ and three observed outputs $\{O_1, O_2, O_3\}$ generated from the hidden states is shown in Fig. 2.20.

HMMs are modeled as a 5-tuple of the form $\lambda = (\Sigma, E, A, B, \pi)$, where $\Sigma$ is the set of hidden states, $E$ is the set of emissions, $A$ is the set of transition probabilities between hidden states, $B$ represents the observation probabilities of the emissions, and $\pi$ represents the initial probability distribution of the hidden states.

Figure 2.20 Hidden Markov Model Representation

Based on observation sequence, there are three types of problems in HMM:

1. The evaluation problem: Given an HMM $\lambda$ and an observation sequence $O$, find the probability that the observations are generated by the model. This problem is solved using the forward-backward algorithm [36].
2. The decoding problem: Given a model and a sequence of observations, find the most likely state sequence in the model that produced the observations. This problem is solved using the Viterbi algorithm, which is a dynamic programming algorithm. The algorithm generates a path by calculating the most likely sequence leading to each state once and reusing it in the remaining calculations.
3. The learning problem: Given a model and a sequence of observations, adjust the model parameters in order to maximize $p(O \mid \lambda)$. This problem is solved using the Baum-Welch algorithm, which reduces the number of calculations by finding locally optimal parameters [36]. It uses the EM algorithm to re-estimate the parameters until a local maximum is reached.
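The decoding problem can be illustrated with a short Python sketch of the Viterbi algorithm. The initial and transition probabilities below reuse the values of Table 2.1, while the emission probabilities in `EMIT` are hypothetical values chosen only for this example.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    # best[t][s]: probability of the most likely path ending in s at time t.
    best = [{s: start_p[s] * emit_p[s][observations[0]] for s in states}]
    back = [{}]
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            # Most likely predecessor of s, computed once and reused.
            prev = max(states, key=lambda r: best[-1][r] * trans_p[r][s])
            scores[s] = best[-1][prev] * trans_p[prev][s] * emit_p[s][obs]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace the back-pointers from the most likely final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for pointers in reversed(back[1:]):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[-1][last]

STATES = ["S1", "S2", "S3"]
START = {"S1": 0.8, "S2": 0.1, "S3": 0.1}
TRANS = {"S1": {"S1": 0.2, "S2": 0.2, "S3": 0.6},
         "S2": {"S1": 0.3, "S2": 0.2, "S3": 0.5},
         "S3": {"S1": 0.2, "S2": 0.6, "S3": 0.2}}
# Hypothetical emission probabilities (not from the source).
EMIT = {"S1": {"a": 0.8, "b": 0.1, "c": 0.1},
        "S2": {"a": 0.1, "b": 0.8, "c": 0.1},
        "S3": {"a": 0.1, "b": 0.1, "c": 0.8}}

path, prob = viterbi(["a", "c", "b"], STATES, START, TRANS, EMIT)
```

For the observation sequence a, c, b, the most likely state sequence under these assumed parameters is S1, S3, S2, with probability 0.147456.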

2.8.12 Mixing Markov model and Gaussian distribution

A multivariate Markov model is a special case of the Markov model that captures the behavior of observation sequences in which each observation is composed of more than one variable. When each observation in the sequence has more than one variable and each variable is normally distributed, the relationship between the variables of each observation can be captured by describing all variables jointly with a multivariate distribution [38].

As an example, a sequence of observations with two variables $(x_i, y_i)$ is represented as two sequences: $X = \{x_1, x_2, x_3, \ldots\}$ and $Y = \{y_1, y_2, y_3, \ldots\}$. If each variable in the observation sequence is normally distributed, each observation of the sequence is represented as a bivariate distribution of the two variables $x_n$ and $y_n$. In this case, the Markov model $M$ is represented as $\{S, T, I, O\}$, where each observation $O_n$ for state $S_n$ represents a bivariate distribution of $x_n$ and $y_n$.
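As an illustration of how such a per-state observation could be scored, the following Python sketch evaluates the standard bivariate normal density. The function name and the parameterization by the two means, the two standard deviations and the correlation coefficient `rho` are assumptions made for this example, not the notation of the source.

```python
import math

def bivariate_normal_pdf(x, y, mean_x, mean_y, sigma_x, sigma_y, rho):
    # Standardize both variables.
    zx = (x - mean_x) / sigma_x
    zy = (y - mean_y) / sigma_y
    one_minus_rho2 = 1.0 - rho * rho
    # Quadratic form of the bivariate normal with correlation rho.
    exponent = -(zx * zx - 2.0 * rho * zx * zy + zy * zy) / (2.0 * one_minus_rho2)
    norm = 2.0 * math.pi * sigma_x * sigma_y * math.sqrt(one_minus_rho2)
    return math.exp(exponent) / norm
```

With zero correlation the density factorizes into the product of two univariate normal densities; a nonzero `rho` is what lets the model capture the relationship between the two variables.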

2.8.13 Forward algorithm

Given a model $M = (S, T, I, O)$ and an observation sequence $O = o_1, o_2, \ldots, o_t$, the goal of the forward algorithm [36] is to find the conditional probability $p(O \mid M)$ that the observation sequence $O$ was generated by the model $M$. The probability is calculated using an auxiliary variable $\alpha_t(i)$ known as the forward variable. The forward variable holds the probability of the partial observation sequence $o_1, \ldots, o_t$ terminating at state $i$, as given by equation 2.12.

$$\alpha_t(i) = p(o_1, o_2, \ldots, o_t, state_t = i \mid M) \tag{2.12}$$

The forward variable is calculated recursively until the final observation $o_t$ is reached. The required probability is the sum of the forward variable over all $N$ states, as given by equation 2.13.

$$P(O \mid M) = \sum_{i=1}^{N} \alpha_t(i) \tag{2.13}$$
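The recursion can be sketched in Python as follows. The two-state model, its state names and all of its probabilities are hypothetical values chosen for illustration, and emissions are assumed to be discrete symbols.

```python
def forward(observations, states, start_p, trans_p, emit_p):
    # Initialization: forward variable after the first observation.
    alpha = {s: start_p[s] * emit_p[s][observations[0]] for s in states}
    # Recursion: extend the partial observation sequence one step at a time.
    for obs in observations[1:]:
        alpha = {s: emit_p[s][obs] * sum(alpha[r] * trans_p[r][s] for r in states)
                 for s in states}
    # Termination: sum the forward variable over all states.
    return sum(alpha.values())

# Hypothetical two-state model.
STATES = ["A", "B"]
START = {"A": 0.6, "B": 0.4}
TRANS = {"A": {"A": 0.7, "B": 0.3}, "B": {"A": 0.4, "B": 0.6}}
EMIT = {"A": {"x": 0.9, "y": 0.1}, "B": {"x": 0.2, "y": 0.8}}

p_xx = forward(["x", "x"], STATES, START, TRANS, EMIT)
```

A useful sanity check is that the probabilities of all observation sequences of a given length sum to 1.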


2.8.14 Artificial neural network (ANN)

ANNs [36] are computing systems inspired by biological neural networks. An ANN is based on a layered collection of nodes called perceptrons. Each perceptron transmits a signal over a weighted edge to another perceptron in the next layer. A perceptron receives signals from multiple perceptrons and fires if the cumulative value is above a threshold.

An ANN is composed of seven basic elements: 1) input signal, 2) weighted edges for signal transmission, 3) linear aggregator to sum the input signals, 4) negative bias to handle the response in the presence of noise, 5) activation threshold against which the aggregated value is compared, 6) activation function to produce the output value, and 7) output signal produced by a perceptron, which becomes the input for the next layer of perceptrons.
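The seven elements map onto a single perceptron as in the following Python sketch. The step activation function and the weights and bias used for the logical AND example are assumptions made for illustration.

```python
def perceptron(inputs, weights, bias, threshold=0.0):
    # Linear aggregator: weighted sum of the input signals plus the bias.
    aggregate = sum(x * w for x, w in zip(inputs, weights)) + bias
    # Step activation: fire (output 1) only above the activation threshold.
    return 1 if aggregate > threshold else 0

# Hypothetical weights and negative bias realizing a logical AND.
AND_WEIGHTS = [1.0, 1.0]
AND_BIAS = -1.5

outputs = [perceptron([a, b], AND_WEIGHTS, AND_BIAS)
           for a in (0, 1) for b in (0, 1)]
```

The negative bias keeps the perceptron silent unless both inputs are active, so the output signal is 1 only for the input (1, 1).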

Different kinds of ANN can be described by the activation functions of their neurons, the learning rule that adjusts the weights according to the output, and the connection formula that describes how the neurons are connected.

2.8.14.1 Fuzzy neural network

The fuzzy neural network (FNN) refers to the combination of an artificial neural network and fuzzy logic [36]. The integration of neural networks and fuzzy inference systems can be formulated into three main categories: cooperative, concurrent and integrated neuro-fuzzy models.

A cooperative model consists of an ANN learning mechanism that determines the fuzzy inference system membership functions or fuzzy rules from the training data. In a concurrent model, the neural network assists the fuzzy system, or the fuzzy system assists the neural network, continuously to determine the required parameters. Such combinations do not optimize the fuzzy system but only aid in improving the performance of the overall system.

In an integrated model, neural network learning algorithms are used to determine the parameters of fuzzy inference systems to optimize parameters. Integrated systems share data structures and knowledge representations.

2.8.14.2 Self-organizing map

A self-organizing map (SOM) is a type of ANN that is trained using unsupervised learning to produce a low-dimensional, discretized representation of the input space of the training samples, called a map, and is therefore a method for dimensionality reduction [35]. SOM applies competitive learning, in which output neurons compete amongst themselves to be activated based on discriminant functions such as Euclidean distance, with the result that only one is activated at any one time. This activated neuron is called a winner-takes-all neuron. Such competition can be implemented by having lateral inhibition connections (negative feedback paths) between the neurons. As a result, the neurons are forced to organize themselves. The objective of an SOM is to transform an incoming signal pattern of arbitrary dimension into a one- or two-dimensional discrete map, and to perform this transformation adaptively in a topologically ordered fashion.


2.8.14.3 Learning vector quantization

The Learning Vector Quantization (LVQ) algorithm is a supervised neural network that uses a competitive (winner-take-all) learning strategy when labeled input data is available [35, 37]. The objective of the algorithm is to prepare a set of prototype vectors in the domain of the observed input data samples and to use these vectors to classify unseen examples. An initially random pool of vectors is prepared, which are then exposed to training samples. A winner-take-all strategy is employed in which one or more of the vectors most similar to a given input pattern are selected and adjusted to be closer to the input vector. The repetition of this process results in a distribution of prototype vectors in the input space that approximates the underlying distribution of samples from the test dataset.
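A single training step can be sketched in Python as below. The sketch follows the common LVQ1 rule, which is an assumption beyond the description above: the winning prototype is moved toward the sample when the labels agree and, additionally, away from it when they disagree. The function names, the learning rate and the prototype values are hypothetical.

```python
import math

def nearest(prototypes, sample):
    # Winner-take-all: index of the prototype closest to the sample.
    return min(range(len(prototypes)),
               key=lambda i: math.dist(prototypes[i][0], sample))

def lvq1_step(prototypes, sample, label, rate=0.1):
    i = nearest(prototypes, sample)
    vector, proto_label = prototypes[i]
    # Move toward the sample on a label match, away on a mismatch.
    sign = 1.0 if proto_label == label else -1.0
    prototypes[i] = ([v + sign * rate * (s - v) for v, s in zip(vector, sample)],
                     proto_label)
    return i

def classify(prototypes, sample):
    # An unseen example receives the label of its nearest prototype.
    return prototypes[nearest(prototypes, sample)][1]

# Hypothetical prototypes for two beat classes, "N" and "V".
prototypes = [([0.0, 0.0], "N"), ([1.0, 1.0], "V")]
winner = lvq1_step(prototypes, [0.2, 0.0], "N", rate=0.5)
```

After this step the first prototype has moved halfway toward the sample, and `classify(prototypes, [0.9, 0.8])` returns the label of the second prototype.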

2.9 Parallelism

In a multiprocessor system, task parallelism is achieved when each processor executes a different thread on the same or different data [39, 40]. Data parallelism is achieved when each processor performs the same task on different pieces of distributed data [39]. The types of parallelism according to Flynn’s taxonomy are as follows [40]:

• Single Instruction Single Data (SISD) - A single instruction stream operates on a single data stream. It represents the organization of a single computer containing a control unit, a processor unit and a memory unit. Instructions are executed sequentially, and any internal parallelism is achieved by pipelining or multiple functional units. A single-core CPU is enough to execute SISD.

• Multiple Instruction Multiple Data (MIMD) - Multiple autonomous processors simultaneously execute different instructions on different data.

• Single Instruction Multiple Data (SIMD) - A single instruction stream is applied to multiple data elements simultaneously by multiple processing elements under a common control unit.

• Multiple Instruction Single Data (MISD) - Multiple instructions operate on one data stream. Many functional units perform different operations on the same data.

2.9.1 Dependency Analysis

Before exploiting data or task parallelism, the dependencies imposed on the statements to be executed have to be analyzed, and the statements have to be grouped so that the overhead of data transmission and of packing and unpacking data does not negate the improvement in computational efficiency gained by parallelization [41].

Control dependency imposes sequentiality due to the dependency of other statements on conditional expressions such as if-then-else statements or for-loops, where each operation in the for-loop depends on the index variable. This dependency of a for-loop can be removed by a technique called loop unrolling, in which the for-loop is replaced by the block of statements inside it, and these statements are executed concurrently on a multiprocessor machine.

Data dependency is sequential execution restriction imposed because of destructive update of shared memory locations between statements and restrictions imposed by consistency. In order to exploit concurrency, program is modeled as a graph, such that each statement is a node, and dependency is modeled as an edge between nodes. The process of automated parallelization of a program involves following steps:

1. Transform the program to have minimal control dependency by using techniques like loop unrolling.
2. Create the data-dependency graph.
3. Superimpose the data-dependency and control-dependency graphs to create the program dependency graph.
4. Parallelize those parts of the graph that are not connected through any dependency.
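The last step can be illustrated with a small Python sketch that groups the statements of a dependency graph into levels, where all statements in one level are mutually independent and could run concurrently. The level-by-level grouping is one simple strategy chosen for illustration, and the five-statement program in the comments is hypothetical.

```python
def parallel_levels(dependencies):
    """Group statements into levels that can execute concurrently.

    dependencies maps each statement to the set of statements it depends on.
    """
    remaining = {stmt: set(deps) for stmt, deps in dependencies.items()}
    levels = []
    while remaining:
        # Statements whose dependencies have all been satisfied.
        ready = sorted(s for s, deps in remaining.items() if not deps)
        if not ready:
            raise ValueError("cyclic dependency")
        levels.append(ready)
        for s in ready:
            del remaining[s]
        for deps in remaining.values():
            deps.difference_update(ready)
    return levels

# Hypothetical program:
#   s1: a = 1    s2: b = 2    s3: c = a + b    s4: d = a * 2    s5: e = c + d
graph = {"s1": set(), "s2": set(), "s3": {"s1", "s2"},
         "s4": {"s1"}, "s5": {"s3", "s4"}}
levels = parallel_levels(graph)
```

Here s1 and s2 can run concurrently, followed by s3 and s4, and finally s5, which depends on both of its predecessors.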

Fig. 2.21 shows an example of a program dependency graph with data dependency shown with dotted line arrow and control dependency. The end of the arrow in the graph represents dependent statement or block of statements.

Figure 2.21 Example of program dependency graph

2.10 Graphics Processing Unit (GPU)

A GPU is a specialized architecture for compute-intensive, highly data-parallel computations [23]. This is achieved by devoting more processing elements to concurrent data processing. Fig. 2.22 shows a comparison of CPU and GPU architectures. Tasks well suited for data-parallel computations are implemented on the GPU to improve computational efficiency [14, 23]. Popular GPU vendors are AMD, Intel and NVIDIA [14].


Figure 2.22 Architecture of GPU and CPU

2.10.1 Compute unified device architecture (CUDA)

CUDA is a compiler and toolkit for programming NVIDIA GPUs [23]. The CUDA API extends the C programming language and provides a high-level abstraction of the hardware.

2.10.1.1 CUDA components

Grid - A grid is a group of threads all running the same kernel. These threads are not synchronized. Every call to CUDA from the host CPU is made through one grid.

Block - Grids are composed of blocks. Each block is a logical unit containing several coordinating threads that share a certain amount of memory. All blocks in a grid run the same program.

Thread - Blocks are composed of threads. Threads run on individual cores of the multiprocessors.

2.10.1.2 Memory Architecture

CUDA threads may access data from multiple memory spaces during their execution, as illustrated in Fig. 2.23. The types of memory used by the threads are as follows:

• Global memory - It is a read-and-write memory. It is slow and uncached, and requires sequential reads and writes.

• Texture memory - It is a read-only memory. Its cache is optimized for two-dimensional spatial access patterns.

• Constant memory - It is a read-only memory used for data that does not change over the course of a kernel's execution. It is slow, but uses a cache.

• Shared memory - Shared memory is specific to a block. All the threads in a block can use the shared memory for read or write operations. Its size is smaller than that of the global memory. The number of threads that can be executed simultaneously in a block is determined by the shared memory size.

• Local memory - It is generally used for data that does not fit into registers. It is slow and uncached, but allows automatic coalesced reads and writes.

• Registers - This is the fastest memory available. One set of registers is given to each thread. Threads use registers for fast storage and retrieval of data.

Figure 2.23 A schematic of CUDA memory architecture


2.10.2 CUDA programming model

The CUDA programming model is a heterogeneous model in which both the CPU and the GPU are used. In CUDA, the host refers to the CPU and its memory, while the device refers to the GPU and its memory. Code run on the host can manage memory on both the host and the device, and also launches kernels, which are functions executed on the device. These kernels are executed by many GPU threads concurrently.
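The way each thread of a kernel selects its own data element can be emulated sequentially in Python. This is only an illustrative stand-in, not CUDA code: the `launch` helper and the `vector_add` kernel are hypothetical, and the two nested loops take the place of threads that run concurrently on a GPU.

```python
def launch(kernel, grid_dim, block_dim, *args):
    # Sequential stand-in for a kernel launch over a one-dimensional grid.
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            kernel(block_idx, block_dim, thread_idx, *args)

def vector_add(block_idx, block_dim, thread_idx, a, b, out):
    # Global index of this thread within the whole grid.
    i = block_idx * block_dim + thread_idx
    # Guard: threads beyond the end of the data do nothing.
    if i < len(out):
        out[i] = a[i] + b[i]

a = [1, 2, 3, 4, 5]
b = [10, 20, 30, 40, 50]
out = [0] * len(a)
# Two blocks of four threads cover the five elements.
launch(vector_add, 2, 4, a, b, out)
```

Because every thread writes to a different element, the iterations are independent of one another, which is what makes this kind of computation suitable for data-parallel execution.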

2.11 Accuracy Metrics

In this section, the evaluation methods used to measure the accuracy of classification algorithms are discussed.

2.11.1 Sensitivity and specificity analysis

Sensitivity and specificity are used to evaluate classification accuracy [32]. Sensitivity refers to the ability of the classifier to correctly detect cases with a condition. Specificity refers to the classifier’s ability to correctly reject the cases not satisfying the condition. True positives (TP) are cases correctly identified as having the medical condition(s). False positives (FP) are cases incorrectly identified as having the medical condition(s). True negatives (TN) are cases correctly identified as not having the medical condition(s). False negatives (FN) are cases incorrectly identified as not having the medical condition(s). Positive predictive value (PPV) is the proportion of the positive results reported by the classifier that are true positives, given by equation 2.16. The classification accuracy rate is the percentage of correct detections among the total number of cases. The sensitivity, specificity, and accuracy are given by equations 2.14, 2.15 and 2.17, respectively.

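$$\text{Sensitivity}\,\% = \frac{TP}{TP + FN} \times 100 \tag{2.14}$$

$$\text{Specificity}\,\% = \frac{TN}{TN + FP} \times 100 \tag{2.15}$$

$$\text{PPV}\,\% = \frac{TP}{TP + FP} \times 100 \tag{2.16}$$

$$\text{Accuracy}\,\% = \frac{TP + TN}{TP + FP + FN + TN} \times 100 \tag{2.17}$$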
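Equations 2.14 to 2.17 translate directly into code. In the following Python sketch, the function name and the example counts are hypothetical.

```python
def classification_metrics(tp, fp, fn, tn):
    # All four metrics are expressed as percentages.
    return {
        "sensitivity": 100.0 * tp / (tp + fn),
        "specificity": 100.0 * tn / (tn + fp),
        "ppv": 100.0 * tp / (tp + fp),
        "accuracy": 100.0 * (tp + tn) / (tp + fp + fn + tn),
    }

# Hypothetical counts: 90 TP, 5 FP, 10 FN, 95 TN.
metrics = classification_metrics(tp=90, fp=5, fn=10, tn=95)
```

For these counts the sensitivity is 90%, the specificity is 95% and the accuracy is 92.5%; the PPV is 90/95, or roughly 94.7%.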

2.11.2 Receiver Operating Characteristics (ROC) Curve

The ROC curve is a popular method for choosing a cut-off value used in medical diagnosis [32, 35]. A ROC curve is obtained by plotting the true positive rate (TPR) against the false positive rate (FPR) at various threshold settings [32], as shown in Fig. 2.24.

Figure 2.24 ROC curve


It displays the full picture of the trade-off between sensitivity (the true positive rate) and specificity (one minus the false positive rate) across a series of cut-off (threshold) points.

Each point on the ROC curve represents a sensitivity/specificity pair corresponding to a particular decision threshold. There are multiple methods for choosing a threshold based on the ROC curve. The most popular method is to select the point on the ROC curve closest to the top-left corner of the plot, where sensitivity is highest and the false positive rate is lowest, as shown in Fig. 2.24.
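The closest-to-corner rule can be sketched in Python as follows. The function names, the classifier scores, the labels and the candidate thresholds are hypothetical, and a case is assumed to be called positive when its score is at or above the threshold.

```python
import math

def roc_points(scores, labels, thresholds):
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for th in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if s >= th and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= th and y == 0)
        # One ROC point per threshold: (threshold, FPR, TPR).
        points.append((th, fp / negatives, tp / positives))
    return points

def best_threshold(points):
    # Pick the point closest to the corner FPR = 0, TPR = 1.
    return min(points, key=lambda p: math.hypot(p[1], 1.0 - p[2]))[0]

# Hypothetical classifier scores with their true labels (1 = condition present).
scores = [0.1, 0.3, 0.4, 0.6, 0.8, 0.9]
labels = [0, 0, 1, 0, 1, 1]
points = roc_points(scores, labels, [0.2, 0.5, 0.7])
chosen = best_threshold(points)
```

For these values, the threshold 0.7 yields no false positives while detecting two of the three positive cases, which places it closest to the top-left corner.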


Previous Works

The electrocardiogram (ECG) signal contains temporal and morphological information about the depolarization and repolarization of a patient’s heart. Depolarization is associated with heart contractions, and repolarization is associated with the restoration of the resting state before the next cycle. Any variation in the electrical activity within the heart is reflected in the ECG through morphological and temporal changes. Morphological changes refer to changes in the morphology of individual components, such as the amplitude and slope of waveforms and baselines. Temporal changes refer to changes observed in ECG components over time, such as the duration of components and the interval between heartbeats. ECG analysis, which reveals the rhythm and activity of the heart, is an important non-invasive clinical tool for cardiologists to diagnose various heart diseases. Many automatic ECG arrhythmia classification systems based on computational intelligence have been investigated over the years [6, 7, 10, 11, 12, 13, 14, 15, 16].

Early work on ECG signal analysis began with simple rule-based systems, in which rules defined by cardiologists were automated with simple if-then statements [43]. Subsequent attempts focused on heart rate variability (HRV), analyzing the variation in the duration between two consecutive R-waveforms (RR-interval analysis) [43, 44]. Research then progressed to automated analysis, using machine learning techniques, of RR-interval variability together with extracted morphological features such as the amplitude and slope of waveforms and baselines. Many different machine learning concepts were used for the diagnosis of arrhythmia, such as hidden Markov models (HMM) [10, 11], artificial neural networks (ANN) [12, 13], and cluster-based techniques [15, 16].

This chapter briefly describes previous work, their influence on this dissertation, and their inherent limitations in accurate analysis of arrhythmia.

3.1 Rule based systems

Guvenir [6] presented a classification algorithm called Voting Feature Intervals (VFI), which uses a set of feature intervals on each feature dimension, such as QRS duration, RR-intervals, PR-intervals and QT-intervals, separately. Each feature participates in the classification by distributing real-valued votes among the classes based on training examples. The class receiving the highest vote is declared to be the predicted class. VFI can be compared with the Naive Bayesian classifier, which also considers each feature separately. The algorithm classified 15 classes of arrhythmia and irregular beats with 62% accuracy.

Although multiple features were considered in this supervised classification scheme, the relationships between these features were not. These relationships include the transitions between waveforms that indicate the ectopic location of impulse generation, which narrows down the arrhythmia subclass. The scheme also did not consider the changes in morphological features caused by the superimposition of waveforms arising from the different origins of the arrhythmias.


Another rule-based system, presented by Tsipouras [43], explored consecutive RR-intervals for arrhythmia classification. A sliding window of three RR-intervals was used to classify each beat using rules obtained from cardiologists. The RR-interval is sufficient for analyzing heart-rate variability, and it provides information about the prematurity of a beat based on how early the next R-peak appears compared to the previous R-peak. However, to detect and classify finer arrhythmia subclasses, the information provided by other waveform and baseline features should also be included in the classifier. The rule-based system classified beats into normal, premature ventricular, atrial and ventricular flutter beats. A high sensitivity of 98% was achieved for the detection of normal beats and ventricular flutter beats. However, the detection of premature ventricular beats and atrial beats achieved a lower sensitivity of 75%. The research was also limited to broad arrhythmia classification, whereas finer classification of atrial and ventricular arrhythmia is important because each subclass requires different treatment [1, 2, 3].

A study by O. Wieben [44] presented a decision tree algorithm based on inductive learning for the identification of premature ventricular complexes (PVC). The heart rate features used in the classifier included the RR-interval, the ratio of the RR-interval of the current beat to the previous RR-interval, and the standard deviation of the heart rate over the mean. In the decision tree algorithm, nodes contained tests conducted on features, and branches contained the possible outcomes. The classification process started at the root node of the tree, performed tests in successive nodes and took the appropriate branches until a leaf assigned the beat to a class. The rules included in a decision tree set a precedence among the rules for classification.

PVC beats are characterized by the superimposition of a waveform of the beat on a waveform of the previous beat due to the prematurity of the beat [2, 5]. Because of this superimposition, the R-waveform can be missed due to its lower amplitude. RR-interval analysis requires the detection of superimposed waveforms and of R-waves with lower amplitude, which was missing in the scheme. The proposed method does not consider premature beats originating in the atria and the junction, which can degenerate into serious arrhythmia. The scheme achieved 85% sensitivity for PVC beat detection.

3.2 HMM-based techniques

The Hidden Markov Model has been used successfully since the mid-1970s to model signal analysis problems such as speech waveforms for automatic speech recognition. The popularity of these models is primarily due to an automatic parameter estimation algorithm discovered by Baum and Eagon [36] in the late 1960s and subsequently applied to speech processing.

Douglas et al. [10] used HMM to identify each beat by its waveform components to achieve complete arrhythmia analysis. The research formulated the arrhythmia analysis problem as one of decoding the observation sequence of a hidden Markov process. The objective was to recover an unobservable Markov state sequence representing the electrical activation of the heart from the observed ECG signal. Each waveform or interval in the ECG signal was assumed to correspond to a state of this Markov process. The elements of the observation sequence were the digital samples of the ECG signal. The presented approach classified beats into three categories: normal, ventricular and supraventricular beats. Normal beats were classified with 98% sensitivity. Ventricular and supraventricular beats were classified with 80% sensitivity. This study introduced the importance of transitions between waveforms to the classification problem.

However, the Markov chain topology was limited to transitions observed within waveforms. Morphological features such as the amplitude, slope and duration of waveforms pertaining to each class and subclass of arrhythmia were not included in the HMM. The superimposition of waveforms on waveforms of the same or the previous beat, which leads to misclassification, was also not considered.

Another study by W. T. Cheng [11] used HMM to classify ECG beats into three types: normal, PVC and atrial arrhythmic beats. In this research, the original signal was approximated by line segments. The differential amplitude and the normalized interval of each line segment were the observations used in modeling. An HMM for each type of beat was trained on the annotated ECG data. Finally, the trained models were used in the classification of the ECG signals. The average classification accuracy was 93% for normal beats, 65.55% for PVC beats, and 56.38% for atrial arrhythmic beats.

Although these line segments referred to waveform and baseline segments, waveform features (duration, amplitude, and slope) were not included in the model. Temporal features were considered only within one beat, in terms of waveform transitions. Temporal features between multiple beats were not included. The proposed method is limited to arrhythmia classification into broader classes. Ventricular and atrial arrhythmia subclasses that are predictive of serious arrhythmia are not classified.


The study presented by Andreao [45] used an HMM approach for online beat segmentation and classification of ECGs for the detection of PVC. A continuous wavelet transform was used to represent the ECG signal as an observation sequence. The HMM topology used for beat modeling introduced transitions between waveforms for arrhythmic beats so that one model could include several morphological classes. The beat model included a transition from the P-wave to the isoelectric baseline (ISO) to represent missing ventricular activity, and a transition from ISO to QRS to represent missing atrial activity. However, a single ISO line was used to represent several segments (TP, PQ and ST). Each segment is characterized by an amplitude and duration that indicate different pathological conduction of the electrical impulse. The research used HMM for beat segmentation, and PVC beat recognition was done using a rule-based system with 59 to 87% sensitivity.

The HMM topology used for beat segmentation did not consider all the transitions between waveforms observed for arrhythmic beats. Also, because all segments were combined into one ISO line in the HMM, the morphological characteristics of the segments were not utilized for the identification of arrhythmia. The rule-based analysis based on the RR-interval for PVC beat detection did not consider the superimposition of waveforms, which limited the sensitivity.

3.3 SVM-based techniques

SVM is one of the most popular classifiers found in the literature for ECG-based arrhythmia classification models. An SVM performs classification by non-linearly mapping its n-dimensional input into a high-dimensional feature space, in which a linear classifier is constructed.

The study presented by Park [46] used SVM to introduce a hierarchical classification method that classifies beats as normal, ventricular or supraventricular based on QRS morphological similarity. The hierarchical classification was based on the assumption that the QRS-complexes of normal and supraventricular beats are more similar in morphology to each other than to those of ventricular and premature beats. Based on this assumption, two groups were created. In the first phase, heartbeats were classified into the two groups based on fast Fourier transform (FFT) and interval features. In the second phase, the groups were further classified based on RR-interval and morphology features. The classifier obtained a sensitivity of 86% for normal beats, 82% for supraventricular beats, 80% for ventricular beats and 54% for premature ventricular beats.

The assumption of morphological similarity of the QRS-complex limits the classification to ventricular and supraventricular beats. Finer arrhythmia subclasses are not considered. Although some morphological features pertaining to the QRS-complex were considered, morphological features of other waveforms and baselines were not. Temporal features such as RR-intervals and waveform durations were also not included in the classifiers. The analysis of waveform superimposition needed to detect the waveforms of PVC beats was missing in the proposed study. Since the prior probability of the normal class is far larger than that of the other classes, the classifier obtained by standard SVM training achieved higher accuracy for normal beats than for arrhythmic beats.

The study proposed by Lannoy [47] addressed the imbalance of the MIT-BIH database, which contains a higher number of normal beats than arrhythmic beats. The study introduced a weighted SVM, which assigned weights to data points so that the SVM learned the separating hyperplane according to the relative importance of the data points. The feature set used in the SVM classifier comprised seven features, including the RR-interval, segmentation features, and FFT features pertaining to the frequency of waveforms. The classification achieved a sensitivity of 85% for PVC beats, 78% for ventricular beats, 88% for supraventricular beats and 80% for normal beats.

Since morphological features and transitions between waveforms for beats belonging to each subclass of arrhythmia were not included, classification was limited to separating normal, ventricular, supraventricular and PVC beats. The superimposition of waveforms was not considered for PVC beat detection and arrhythmia subclasses, which limited the sensitivity of the scheme.

3.4 ANN-based techniques

There is much interest in the processing of ECG signal with neural networks. ANNs have been used as pattern and statistical classifiers in many application areas, including medicine [12, 13].

A study by Osowski [12] presented the application of a fuzzy neural network for ECG beat recognition and classification. In this method, RR-interval, QRS-area and QRS amplitude features were extracted using FFT analysis and higher order statistics analysis, which considers amplitude variance and shape parameters of waveforms. The final classifier consisted of fuzzy self-organizing subnetworks connected in cascade with a multilayer perceptron (MLP). Self-organization of the neural network was done using a clustering algorithm with fuzzy membership degree. The signals of all self-organizing neurons formed the input vector to the second subnetwork, the MLP. Each output neuron was responsible for one type of beat. Learning of the MLP was done using the gradient method and back propagation.

The operations of both the self-organizing and the MLP layers depend on training examples. Hence, for arrhythmic beats with fewer examples, estimating the gradient of the error function during back propagation training can lead to lower sensitivity. The authors also noted that as the number of training beats of a given type decreased, the test error rate for that beat type increased. In this study, the fuzzy hybrid neural network recognized normal beats with 90%, ventricular beats with 89%, atrial beats with 79%, PVC beats with 90%, flutter beats with 78%, and PAC beats with 70% sensitivity. The classification did not distinguish between atrial and ventricular flutter, which have different morphology of the QRS-complex and T-wave [2, 5]. Some important arrhythmia subclasses that require different treatment, such as ventricular tachycardia and ectopic atrial tachycardia, were not considered for classification.

Another ANN-based method was presented by Shyu [13] where PVC beat detection was achieved using fuzzy neural network (FNN). The feature-extraction was done using wavelet transform to obtain QRS duration and area. For beat detection, five layer FNN was adapted including an input layer, a membership layer, a rule layer, a hidden layer and an output layer. Training of the FNN was done using supervised back-propagation method.

Three membership functions, assumed to be Gaussian, were defined as positive for normal ECG, negative for PVC and zero for other beats.

However, the membership function, which is crucial in defining the degree of membership to each type of beat, considered only a few features such as QRS-area and R-wave amplitude. This can lead to a higher number of false negatives. As the authors noted, when the R-wave amplitude of a PVC beat was smaller than that of other PVC beats, their approach misclassified it as a normal beat. Superimposition of waveforms observed for PVC beats was not considered for classification. The presented approach for PVC classification achieved 95% sensitivity for 7 records tested from the MIT-BIH database [21].

3.5 Clustering-based techniques

Since ECG analysis for classification of beats is based upon the number of extracted features, many clustering algorithms were studied for classification of beats. Cluster-based classification is based on distance measure between feature vectors.

The study presented by Martis [15] introduced an automatic classifier for ECG using a Gaussian Mixture Model (GMM). A set of features obtained using the Pan-Tompkins algorithm [48] and FFT was used as input to the classifier. The features included R-wave amplitude and FFT coefficients. The GMM-based classifier was used to identify two types of beats (normal and arrhythmic), assuming that they were generated by two Gaussian processes. Parameters for the GMM were obtained using expectation maximization clustering. Classification accuracy was 95% for normal beats and 92% for arrhythmic beats.
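To make the expectation maximization step concrete, the following is a minimal sketch of EM for a two-component, one-dimensional Gaussian mixture. It is not the classifier of [15]: that study used several features, whereas this sketch uses a single amplitude feature. The amplitude values, the initial means and the function names are assumptions chosen for illustration.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution with the given variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_two_gaussians(xs, init_means, iterations=50, min_var=1e-6):
    """Fit a two-component 1-D Gaussian mixture with expectation maximization."""
    means = list(init_means)
    variances = [1.0, 1.0]
    priors = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: responsibility of component 1 for every sample.
        resp = []
        for x in xs:
            p0 = priors[0] * normal_pdf(x, means[0], variances[0])
            p1 = priors[1] * normal_pdf(x, means[1], variances[1])
            resp.append(p1 / (p0 + p1))
        # M-step: re-estimate means, variances and priors from the soft assignments.
        n1 = sum(resp)
        n0 = len(xs) - n1
        means = [
            sum((1.0 - r) * x for r, x in zip(resp, xs)) / n0,
            sum(r * x for r, x in zip(resp, xs)) / n1,
        ]
        variances = [
            max(sum((1.0 - r) * (x - means[0]) ** 2 for r, x in zip(resp, xs)) / n0, min_var),
            max(sum(r * (x - means[1]) ** 2 for r, x in zip(resp, xs)) / n1, min_var),
        ]
        priors = [n0 / len(xs), n1 / len(xs)]
    return means, variances, priors


# Hypothetical R-wave amplitudes (mV) forming two well-separated groups.
amplitudes = [0.90, 0.95, 1.00, 1.05, 1.10, 1.90, 1.95, 2.00, 2.05, 2.10]
means, variances, priors = em_two_gaussians(amplitudes, init_means=(0.5, 2.5))
```

Each beat can then be assigned to the component with the higher responsibility. The initial means matter: as noted in the critique that follows, supplying sensible initial parameters per cluster helps the optimization.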

However, for classifying arrhythmia based on ectopic location, more features should be included in the model. By providing initial parameters for each cluster to the EM algorithm, the optimization of the maximum likelihood estimate for clustering can be improved.

Since expectation maximization is a nonlinear optimization approach that optimizes the log likelihood over entire feature space, it is possible to include more morphological and temporal features specific to each subclass of arrhythmia.

Another clustering-based study for detection of abnormal beats from compressed ECG was presented by Sufi [16]. Amplitude, duration and slope of all waveforms and isoelectric lines were used for classification. In the proposed approach, the model learned normal and abnormal ECG using selected attributes for clustering. A greedy best-first algorithm was used to select attributes according to their relative utility for each class and their correlation with other attributes. Attributes that provided maximum separation between the two classes with reduced dimensionality were selected as the subset used for clustering. The incoming ECG signal was first analyzed for feature-extraction, and the subset of attributes was selected. Then multidimensional expectation maximization clustering was used to form two clusters based on the subset of attributes. The number of clusters to be formed is determined by the expectation maximization algorithm without prior information. The approach claims 100% accuracy for separating normal and abnormal beats.

The study does not classify atrial or ventricular arrhythmia. For finer classification of arrhythmia, detailed morphological and temporal features for each baseline and waveform need to be included in the classifier. Although some morphological features were used for classification, temporal features within each beat, captured by transition probabilities between waveforms, were not considered. Since clustering-based classification depends on the Gaussian distribution property of the data [32], including features such as the transition probability between waveforms, which do not exhibit a known distribution function, is challenging.

3.6 Other methods

Research published by Nasiri [49] presented a method that combined both SVM and genetic algorithm approaches. Temporal features included RR-interval and morphological features included P-wave, S-wave, R-wave and R-wave amplitude. All features were selected manually. Features were reduced using a statistical approach of principal component analysis and meta-heuristic approach of genetic algorithm. Genetic algorithm trained to select a proper subset of features was chosen for feature reduction because of its property to converge to a semi-optimal solution. Four types of beats were distinguished using SVM-based classifier. Accuracy achieved for normal beats, left beats, right bundle branch block beats and PVC beats was 93%, 85%, 86% and 86% respectively.

The presented research, which combined a genetic algorithm with an SVM-based classifier for ECG classification, concentrated on offline processing of the signal. The feature-extraction process was not automated, which is crucial for real-time processing of the signal. Although combinations of approaches are used to improve classification, both methods depend on the availability of training data, which is also needed to compute the fitness function of the genetic algorithm, and may have issues with imbalanced data with fewer training examples [50]. Waveform transitions and superimposition of waveforms for PVC beat detection were not considered.

The research study proposed by Lagerholm [51] developed a method for unsupervised characterization of ECG signals. An integrated method for clustering of QRS-complexes was presented, which was based on self-organizing neural networks (NNs). The signal was analyzed using FFT to obtain the RR-interval and the amplitude and width of the QRS-complex. The aim was to partition the beats into clusters based on these parameters, which represent central features of the data, such that similarity structures between clusters are preserved. The clustering was obtained using self-organizing NNs. The accuracy obtained for normal, ventricular, supraventricular and PAC beats was 90%, 94%, 90% and 85% respectively.

The study did not classify finer subclasses of atrial and ventricular arrhythmia, which are characterized by morphological and temporal features of waveforms and baselines. Temporal features in the presented research included the RR-interval. However, intra-beat temporal changes, characterized by waveform transition probabilities for different arrhythmia subclasses, were not considered by the classifier. Classification of premature beats into premature junctional and ventricular beats based on the location of the ectopic focus was not achieved.

To let the classification algorithm adapt to the special characteristics of each patient's ECG, a mixture-of-experts (MOE) approach was developed by Yu Hen Hu, where multiple experts (learners) were used to divide the problem of ECG classification into multiple regions [52]. A small customized classifier was developed based on brief, patient-specific ECG data, and was then combined with a global classifier, tuned to a large ECG database of many patients, to form an MOE classifier structure. The extracted features included the amplitude of the R-wave, the RR-interval, and the QRS-complex duration.

The proposed approach was based on three popular artificial neural network (ANN)-related algorithms: self-organizing maps (SOM), learning vector quantization (LVQ), and the mixture-of-experts (MOE) method. SOM and LVQ were used to train the patient-specific classifier, and MOE was used to combine the two classifiers with patient-specific adaptation. This research emphasized patient-specific training of the classifier to improve accuracy. The algorithm differentiated between normal and PVC beats for 20 records with sensitivity of 86% and 82% respectively.

The presented approach was limited to PVC beat detection. Finer arrhythmia and premature beat subclasses require morphological features for all ECG components and temporal features within beats to be included in the classifier. Since the research focused on patient-specific analysis, at least 5 minutes of training on the observed ECG is required before classification. This increases the real-time response delay of the classifier significantly.

3.7 Limitations

Early research on automated ECG signal classification focused mainly on automation of interval analysis [43] and rule-based analysis [44]. By introducing better feature-extraction techniques with rule-based and interval-based techniques, primary classification of beats into either normal or irregular beats was achieved with limited accuracy of up to 87%. However, other features of the ECG waveform were not considered for classification. For instance, the presence or absence of the P-wave is important to distinguish between atrial flutter and atrial fibrillation [2], and an inverted P-wave appears in junctional tachycardia [2, 3].

Automated classification research progressed into using more pattern recognition and AI techniques such as PCA, LDA, ANN and HMM [36], among others. Hidden Markov beat modeling was presented by Coast [10] for ventricular, supraventricular and normal beats. However, the presence or absence of waveforms and morphological characteristics such as amplitude and duration were not considered for classification. The neural network technique presented by Maglaveras [53] dealt with detection and classification of PVC and ischemic beats using non-linear PCA. The training set consisted of the ST-T segment and QRS width. Other waveform features and waveform transitions between different beats were also not considered, which gave a specificity of 82%. Various approaches with SVM variations [50] perform poorly for imbalanced classes. Since standardized datasets such as the MIT-BIH arrhythmia database [21] also exhibit this imbalance between normal and arrhythmic beats, SVMs show low sensitivity for beat classification.

All the mentioned techniques classify each beat into major arrhythmia classes. Research has shown that the ECG signal is often contaminated by noise such as power line interference or movement of the patient [3]. This noise can affect signal analysis by causing misclassification of a single beat. Although techniques such as DWT and filter banks exist to remove noise from sources like power line interference, it is not possible to remove noise completely from each beat since patient movement cannot be controlled. Instead, the classification process can be improved by analyzing a limited number of beats at a time and classifying a group of beats into a particular subclass of arrhythmia.
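The benefit of classifying a group of beats rather than single beats can be seen in a small sketch. The majority vote used below is only an illustrative stand-in for group-level classification and is an assumption, not the method developed in this work; the labels, window size and function names are likewise hypothetical.

```python
from collections import Counter


def classify_window(beat_labels):
    """Return the most frequent per-beat label in one window of beats."""
    return Counter(beat_labels).most_common(1)[0][0]


def classify_signal(beat_labels, window_size):
    """Assign one label per non-overlapping window of consecutive beats."""
    return [
        classify_window(beat_labels[i:i + window_size])
        for i in range(0, len(beat_labels), window_size)
    ]


# Hypothetical per-beat labels; the isolated "N" inside the first window and
# the isolated "VTach" inside the second stand for noise-induced errors.
per_beat = ["VTach", "VTach", "N", "VTach", "VTach",
            "N", "N", "VTach", "N", "N"]
window_labels = classify_signal(per_beat, window_size=5)
```

A single noisy beat no longer changes the outcome for its window, which is the motivation for group-level analysis given above.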

Previous research focuses mainly on separation of ventricular beats, supraventricular beats and normal beats, with limited identification of subclasses of arrhythmia. Finer subclassification of arrhythmia is missing. Different finer subclasses are treated differently. Misdiagnosis of subclasses of arrhythmia can be fatal [1, 2], and mistreatment can cause serious side-effects [2, 3]. Some of the non-life-threatening arrhythmia subclasses such as VTach need immediate treatment so that they do not degenerate into life-threatening VFib. For these two reasons, correct automated diagnosis of subclasses of arrhythmia is important.

Classification of arrhythmia subclasses poses three challenges not addressed in previous research: 1) detailed morphological analysis of the amplitude and duration of each waveform, 2) detection of embedded/superimposed waveforms, and 3) inclusion of transitions between beats in the classification of the signal. Arrhythmia subclasses and irregular beats exhibit morphological and temporal characteristics that are unique to each subclass. For instance, for supraventricular arrhythmia (arrhythmia that begins in the upper chambers of the heart), P-wave morphology plays a major role in identifying subclasses. It is necessary to include these morphological changes in the classifier to identify subclasses.

When ventricles and atria are depolarized concurrently, one of the waveforms may get embedded in another waveform. If this embedded waveform is mistaken for a missing waveform, the classification process can give wrong results. Transitions between beats provide information about the temporal characteristics of beats that is important for finer classification of the signal.

3.8 Discussion

In this chapter, we summarized popular machine learning, pattern recognition and rule-based approaches used in classification of the ECG signal before we began this research. We noted advantages and limitations observed for these approaches. Our work was influenced by some of the techniques mentioned, and we tried to overcome their limitations by improving existing methods. In the next chapter, we introduce a finer arrhythmia classification approach using a bivariate Gaussian distribution Markov model.


Identification and Classification of Arrhythmia

In this chapter, we present a novel method for finer arrhythmia subclassification using bivariate Gaussian distribution Markov models. The focus of this chapter is on: 1) solving issues in arrhythmia analysis to include finer features, and 2) finer arrhythmia classification technique, which uses Markov model that includes both morphological and temporal features of waveforms and baselines. Each state of the Markov model has bivariate distribution of two features: amplitude and duration. Transitions in Markov model capture probability of transitions between waveforms and baselines for multiple heartbeats of observed ECG signal.

[Figure 4.1 diagram: Level 1 classifies beats into S, V or N using GMM. Level 2 classifies the S signal into finer subclasses of S, and the V signal into finer subclasses of V, using the bivariate Gaussian distribution Markov model.]

Figure 4.1 Two level classification of arrhythmia into finer subclasses

The classification is done in two levels, as shown in Fig. 4.1. The first level of classification separates ventricular (V), supraventricular (S) and normal (N) beats using Gaussian mixture model (GMM) based clustering, where the parameters of the GMM are estimated using expectation maximization (EM) clustering. The second level of classification focuses on finer level classification of supraventricular arrhythmia into five subclasses and ventricular arrhythmia into three subclasses using the bivariate Gaussian distribution Markov model (BGMM).

4.1 Issues in arrhythmia

An abnormality in ECG signal indicative of arrhythmia is captured by studying morphological and temporal changes in the signal [54]. Morphological changes refer to change in amplitude, slope, or shape of the individual components. Temporal changes refer to changes in signal measured over time, which include waveform and baseline duration, and heart rate variability measured using RR-interval.

Arrhythmia is diagnosed as supraventricular or ventricular arrhythmia by studying a subset of morphological and/or temporal changes. Since supraventricular arrhythmia begins in the upper chambers of the heart, it is characterized mainly by changes in P-wave amplitude, P-wave duration, and PQ segment duration. Ventricular arrhythmia is characterized by changes observed in R-wave amplitude, T-wave amplitude and duration, and a faster heart rate [54, 55].

Finer level subclassification of arrhythmia requires study of both temporal and morphological characteristics intertwined. Finer level classification is challenged by two issues:

• Subclasses of arrhythmia may share a similar underlying condition, such as atrial enlargement [1, 3]. Because their features are similar, this requires a detailed study of the morphological and temporal features pertaining to the particular subclasses [54]. For instance, EAT and AVNRT have P-waves of similar duration when atrial enlargement is present [55, 56]. This requires splitting the P-wave into the superimposed P-waves from the left atrium and the right atrium. To study the variations of superimposed P-waves, more P-wave features, such as amplitude and shape, are included.

• In certain subclasses of arrhythmia such as VTach, depolarization and repolarization of the atria and ventricles occur in parallel instead of sequentially, leading to overlapping of the P-wave with the QRS-complex of the same beat or the T-wave of the previous beat. This is known as embedding of waveforms and leads to missed detection of important waveforms suggestive of a particular subclass of arrhythmia.

4.1.1 P-wave in supraventricular arrhythmia due to atrial enlargement

P-wave morphology

The P-wave is the first positive deflection on the ECG and represents atrial depolarization, which is a combination of the left and right atrial depolarization. The right atrial depolarization wave (green) precedes that of the left atrium (blue) since the SA node in the right atrium initiates the impulse, which then travels to the left atrium, as shown in Fig. 4.2 [2, 5]. For a normal sinus beat, the combined depolarization P-wave is observed on the ECG with one rising edge followed by one falling edge.


Figure 4.2 P-wave Morphology

Atrial enlargement

Atrial enlargement is a condition in which one or both atria are enlarged due to multiple issues such as hypertension, and [57]. Atrial enlargements are characterized by morphological and temporal changes in the P-wave because it takes longer to depolarize the enlarged chamber. Left atrial enlargement (LAE) is a condition that occurs when the left atrium is enlarged. Right atrial enlargement (RAE) is a condition that occurs when the right atrium is enlarged. The morphological and temporal changes in the P-wave due to atrial enlargement depend on: 1) where the impulse started, and 2) which atrium is enlarged. The impulse may start from the SA node or from an ectopic node.

When the impulse starts in the right atrium, LAE is characterized by a notch in the P-wave since the impulse takes more time to depolarize the left atrium. Similarly, RAE is characterized by increased P-wave amplitude, as shown in Fig. 4.3.

Figure 4.3 P-wave in LAE (left) and RAE (right) when impulse starts in right atria


When the impulse starts in the left atrium, it depolarizes the left atrium first and then the right atrium. This change in direction causes an inverted P-wave. In this case, LAE is characterized by a negative P-wave with lower amplitude, and RAE is characterized by a notch in the negative P-wave, as shown in Fig. 4.4.

Figure 4.4 P-wave in LAE (left) and RAE (right) when impulse starts in left atria

Arrhythmia and atrial enlargement

Complications from LAE are one of the causes of AFib [57]. In this case, AFib is characterized by a notch in positive P-wave since depolarization wave takes longer time to depolarize left atria [2, 5]. Fig. 4.5 shows an instance of ECG from MIT-BIH dataset [21] of a patient with LAE suffering from AFib.

Figure 4.5 AFib with left atrium enlargement

Complications from RAE also cause AFlu or EAT [57]. When RAE is associated with AFlu, it is characterized by taller P-waves with increased and positive amplitude, since the right atrium takes longer to depolarize due to atrial enlargement and a reentrant circuit is present near the AV junction [56, 57]. Figure 4.6 shows an instance of ECG from MIT-BIH dataset [] of a patient with RAE suffering from AFlu.

Figure 4.6 AFlu with right atrium enlargement

When RAE is associated with EAT, it is characterized by a negative P-wave and a notch in the P-wave. The negative P-wave indicates that the impulse began in an ectopic location in the left atrium and traversed to depolarize the right atrium. This morphology change indicates EAT [2, 57]. A notch in the P-wave indicates RAE since the impulse takes longer to depolarize the right atrium than the left atrium. Fig. 4.7 shows an instance of ECG from MIT-BIH dataset [21] of a patient with RAE suffering from EAT.

Figure 4.7 EAT with right atrium enlargement

4.1.2 Embedded waveforms

The origin of electrical activity in ventricular arrhythmia is one or more ectopic points in the ventricle [55, 57]. Ventricles and atria may be depolarized concurrently, leading to the embedding of P-waves in the larger QRS-complex [3], as shown in Fig. 4.8. Undetected P-waves lead to misdiagnosis of subclasses of ventricular arrhythmia. For instance, JTachy is characterized by a missing P-wave [1, 2, 55], while in VTach and VFlu the P-wave can be embedded in the QRS-complex, where it is not identifiable [57]. Fig. 4.8 shows an instance of the ECG of a patient suffering from VTach with the P-wave embedded in the QRS-complex, collected from the MIT-BIH dataset [21].

Figure 4.8 P-wave embedded in a QRS-complex

4.2 Markov model approach for classification

A heartbeat is a sequence of waveforms separated by isoelectric segments (TP, PQ and ST segments). These waveforms are produced cyclically. Due to this periodic transition, it is reasonable to consider each waveform or segment as a state of a Markov model. For real-time classification of the ECG signal, a cyclic first order Markov model is considered, which assumes that after one beat ends (T-wave), another beat begins, and the cycle repeats from baseline 1 (ISO1). Each state represents either a waveform or a baseline, and each transition represents the probability of changing state. ISO1 refers to the TP baseline segment, ISO2 refers to the PQ baseline segment, and ISO3 refers to the ST baseline segment. In the Markov model of a healthy individual's ECG, there are at least eight states: P-wave, Q-wave, R-wave, S-wave, T-wave, ISO1, ISO2, and ISO3. As a convention, the starting state is ISO1. The normal transition sequence of the Markov model for a sinus heartbeat in a healthy heart is represented as: ISO1→P→ISO2→Q→R→S→ISO3→T.

A Markov model is represented as a weighted directed graph, where the weights are the transition probabilities of going from one state to another and each state represents the features of a waveform or baseline. Since each waveform and baseline of a heartbeat is characterized by amplitude and duration, each state is modeled as a bivariate normal distribution. An example of ECG beat modeling with waveforms and ISO lines as states is given in Fig. 4.9.

A missing waveform can be modeled as a missing state or a missing transition to the fixed state in the Markov model. For the convenience of analysis, a missing waveform has been treated as an edge missing to the state, and the states have been fixed in the Markov model.

The number of states may be over eight because P-waveform can be split into two waveforms changing the morphology and the transitions between individual P-waveforms from the left and right chambers of atria.

Figure 4.9 ECG modeling using a Markov model

4.2.1 Integrating Markov model with bivariate Gaussian distribution

Each waveform and baseline of the signal is characterized by amplitude and duration. To include these two normally distributed features in each state of the Markov model (waveform or ISO), each state is modeled as a bivariate Gaussian distribution of the two variables [17, 18, 19].


The bivariate distribution of two normally distributed variables, denoted as $A$ (amplitude) and $B$ (duration), is calculated using the conditional variance [32]. The conditional variance is used because it accounts for the correlation between the variables [32, 35, 58].

Let $\mu_A$ and $\sigma_A$ represent the mean and standard deviation of the variable $A$, and let $\mu_B$ and $\sigma_B$ represent the mean and standard deviation of $B$.

The conditional mean of the variable $B$ is calculated by equation 4.1.

$$E(B \mid A) = \mu_B + \rho \frac{\sigma_B}{\sigma_A} (A - \mu_A) \qquad (4.1)$$

where $\rho$ represents the correlation coefficient between the variables $A$ and $B$. The correlation coefficient of two dependent variables has a value between -1 (negatively correlated) and +1 (positively correlated). The correlation coefficient of $A$ and $B$ is calculated using equation 4.2.

$$\rho_{AB} = \frac{\mathrm{Cov}(A, B)}{\sigma_A \sigma_B} = \frac{\sigma_{AB}}{\sigma_A \sigma_B} = \frac{E[(A - \mu_A)(B - \mu_B)]}{\sigma_A \sigma_B} \qquad (4.2)$$

The conditional variance of $B$ is calculated using equation 4.3.

$$\sigma_{B|A}^2 = \sigma_B^2 (1 - \rho^2) \qquad (4.3)$$

The conditional distribution of the variable $B$ given $A = a$ is calculated using equation 4.4, where $\mu_{B|A}$ denotes the conditional mean of equation 4.1 evaluated at $A = a$.

$$h(b \mid a) = \frac{1}{\sigma_{B|A} \sqrt{2\pi}} \exp\left[ -\frac{(b - \mu_{B|A})^2}{2 \sigma_{B|A}^2} \right] \qquad (4.4)$$

Using the conditional distribution of $B$, the joint probability distribution is calculated using equation 4.5.

$$f(a, b) = f_A(a) \cdot h(b \mid a) \qquad (4.5)$$


Figure 4.10 Bivariate distribution of P-wave amplitude and duration

Fig. 4.10 shows a bivariate normal distribution of the P-wave plotted using amplitude (millivolts) and duration (milliseconds) from ECG samples for a sample size of 10,000 beats collected from the records of 10 subjects with no heart condition in MIT-BIH [22]. The peak has a duration of about 100 milliseconds and an amplitude of 0.20 millivolts, which corresponds to normal P-wave morphology [1, 2].
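The factorization in equations 4.1 to 4.5 can be sketched in a few lines of code. The sketch treats the sigma parameters as standard deviations. The means are taken from the peak described above (0.20 mV, 100 ms); the standard deviations and the correlation coefficient are hypothetical values chosen only so that the example runs, and the function names are illustrative.

```python
import math


def normal_pdf(x, mean, std):
    """Univariate normal density parameterized by its standard deviation."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def conditional_mean(a, mu_a, sigma_a, mu_b, sigma_b, rho):
    """Equation 4.1: mean of B given A = a."""
    return mu_b + rho * (sigma_b / sigma_a) * (a - mu_a)


def conditional_std(sigma_b, rho):
    """Square root of equation 4.3: standard deviation of B given A."""
    return sigma_b * math.sqrt(1.0 - rho * rho)


def joint_density(a, b, mu_a, sigma_a, mu_b, sigma_b, rho):
    """Equation 4.5: f(a, b) = f_A(a) * h(b | a), with h from equation 4.4."""
    mu_b_given_a = conditional_mean(a, mu_a, sigma_a, mu_b, sigma_b, rho)
    sigma_b_given_a = conditional_std(sigma_b, rho)
    return normal_pdf(a, mu_a, sigma_a) * normal_pdf(b, mu_b_given_a, sigma_b_given_a)


# Means from the text; SIGMA_A, SIGMA_B and RHO are assumed, not measured.
MU_A, SIGMA_A = 0.20, 0.05      # P-wave amplitude (mV)
MU_B, SIGMA_B = 100.0, 15.0     # P-wave duration (ms)
RHO = 0.3

peak_density = joint_density(MU_A, MU_B, MU_A, SIGMA_A, MU_B, SIGMA_B, RHO)
```

The product of the marginal and the conditional density equals the usual closed-form bivariate normal density, so each state of the model needs only five parameters: two means, two standard deviations and one correlation coefficient.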

4.2.2 Bivariate Gaussian distribution Markov model (BGMM)

A bivariate Gaussian distribution Markov model (BGMM) [17, 18] is a weighted directed graph in which the edge weights are the transition probabilities between adjacent states, and each state holds a bivariate Gaussian distribution of two variables: amplitude and duration. The states comprise the P, Q, R, S and T waveforms and the segments (ISO1, ISO2 and ISO3) between waveforms. Modeling isoelectric lines between waveforms as separate states is important to capture delays in the depolarization and repolarization cycle, and is needed to reason about missing or misplaced P-waves, S-waves and T-waves due to the lack of synchronization in arrhythmia.

4.3 Solutions for arrhythmia issues

4.3.1 Resolution of P-waves using BGMM

To include all morphologically varied P-waves in the classification using BGMM, the P-wave is split into four states, as shown in Fig. 4.11: 1) rising edge from the right atrium (green), denoted as P11, 2) falling edge from the right atrium, denoted as P12, 3) rising edge from the left atrium (blue), denoted as P21, and 4) falling edge from the left atrium, denoted as P22.

Figure 4.11 P-wave morphology for the atrial enlargement

The updated BGMM including slopes for supraventricular arrhythmia has 11 states as follows: ISO1→P11→P12→P21→P22→ISO2→Q→R→S→ISO3→T. Each of the new states has two variables to characterize it: the amplitude at the peak of the slope and the duration until the peak of the slope. There are three possibilities of P-wave splitting:

1. The P-wave of the right atrium is superimposed with the P-wave from the left atrium, which results in the transition P11→P22.

2. The falling edge of the P-wave from the right atrium is partially superimposed with the rising edge of the P-wave from the left atrium, which results in the transition P11→P12→P21→P22.

3. For a negative P-wave with a notch, P-wave splitting is modeled as P12→P11→P22→P21.

The second and third transition sequences are suggestive of atrial enlargement, which can be seen as a notch in the P-wave [1, 2, 5, 55].
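The three splitting possibilities listed above can be expressed as a simple lookup over the P-states observed in one beat. The dictionary `P_WAVE_PATTERNS`, the function `p_wave_pattern` and the wording of the returned descriptions are hypothetical and serve only to illustrate the mapping.

```python
# Transition sequences of the four P-states and the morphology they suggest.
# The three keys follow the three possibilities listed in the text.
P_WAVE_PATTERNS = {
    ("P11", "P22"): "superimposed P-wave without notch",
    ("P11", "P12", "P21", "P22"): "positive P-wave with notch",
    ("P12", "P11", "P22", "P21"): "negative P-wave with notch",
}


def p_wave_pattern(beat_states):
    """Extract the P-states of one beat and look up the matching pattern."""
    p_states = tuple(s for s in beat_states if s.startswith("P"))
    return P_WAVE_PATTERNS.get(p_states, "unrecognized P-wave pattern")


# Hypothetical state sequence of one beat with a notched negative P-wave.
beat = ["ISO1", "P12", "P11", "P22", "P21", "ISO2", "Q", "R", "S", "ISO3", "T"]
pattern = p_wave_pattern(beat)
```

In the BGMM itself these patterns are not matched exactly but appear as transition probabilities between the P-states.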

4.3.2 Identification of embedded waveform using area analysis

The embedding of P-waves in the corresponding QRS-complexes is seen as larger QRS-complexes in terms of amplitude and duration [1, 2, 5, 55]. Fig. 4.12 shows the ECG of a patient suffering from VTach on the left and the ECG of a patient suffering from JTachy on the right; in both, the P-wave appears to be missing. However, the QRS-complex for VTach is bigger than the QRS-complex for JTachy in terms of the area occupied on ECG paper. This embedding of P-waves is captured by calculating the area of the QRS-complex and comparing it to the average QRS-complex area.

Figure 4.12 P-wave embedded in QRS-complex for VTach (left) and missing P-wave for JTachy (right)


As noted by cardiologists [55, 56, 57], a difference above a threshold suggests embedded P-waves. The threshold for embedding was calculated by applying the ROC method [58], explained in section 1.10.1, to QRS-complex areas in the annotated MIT-BIH database [21], using records of patients with sinus rhythm and patients suffering from ventricular arrhythmia. The approach is based on cardiologists' analysis of VTach, which shows that embedded P-waves result in an increased QRS-complex area [5, 56].

4.3.3 Identification of embedded wave using Simpson’s rule

The area of the QRS-complex is approximated using Simpson’s rule [59, 60]. As opposed to trapezoidal rule which uses straight line, Simpson’s rule uses parabolas to approximate each part of the curve. Area of the curve is found out by finding area under each of the parabolas and adding the areas. In this method, area of the curve is divided into n equal segments of width ∆푥 as shown in Fig. 4.13.

Figure 4.13 Simpson's rule for area calculation

After feature-extraction, if the P-wave is missing, the area of the QRS-complex is calculated and compared with the average QRS-area obtained from the training phase. If the difference is more than a threshold, we conclude that the P-wave is embedded in the QRS-complex.
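A minimal sketch of the area calculation and the comparison against the average area follows. The composite Simpson's rule is standard; expressing the difference relative to the average QRS-area, the default threshold argument of 0.28 (the 28% value reported in this chapter), and the function names are assumptions made for illustration.

```python
def simpson_area(samples, dx):
    """Composite Simpson's rule for equally spaced samples.

    Requires an even number of intervals, i.e. an odd number of samples.
    """
    n = len(samples) - 1
    if n < 2 or n % 2 != 0:
        raise ValueError("Simpson's rule needs an even number of intervals")
    odd_sum = sum(samples[1:n:2])    # samples at indices 1, 3, ..., n-1
    even_sum = sum(samples[2:n:2])   # samples at indices 2, 4, ..., n-2
    return dx / 3.0 * (samples[0] + 4.0 * odd_sum + 2.0 * even_sum + samples[n])


def p_wave_embedded(qrs_area, average_qrs_area, threshold=0.28):
    """Flag an embedded P-wave when the relative area increase exceeds the threshold.

    Measuring the difference relative to the average area is an assumption.
    """
    relative_increase = (qrs_area - average_qrs_area) / average_qrs_area
    return relative_increase > threshold


# Check on f(x) = x^2 over [0, 2], whose exact area is 8/3.
parabola = [x * x for x in (0.0, 0.5, 1.0, 1.5, 2.0)]
area = simpson_area(parabola, dx=0.5)
```

In practice the samples would be the absolute ECG amplitudes within the detected QRS boundaries and `dx` the sampling interval.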

Threshold calculation for embedded wave

To detect a P-wave embedded in the QRS-complex, a threshold of 28% was used for the difference between the QRS-area with and without an embedded P-wave. The optimal threshold was calculated using the receiver operating characteristic (ROC) curve, as shown in Fig. 4.14 [58]. For each cut-off value, the distance from the corresponding point on the ROC curve to the top-left corner was computed. The cut-off value with the lowest distance to the corner, 0.28, was chosen as the threshold since it gives the best trade-off between high sensitivity and low false positive rate. The cut-off values considered ranged from 0.08 to 0.96 as the difference between the QRS-area with and without a P-wave embedded in it.
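The closest-to-corner selection can be sketched as follows. The ROC points below are made-up numbers chosen only so that the example runs; they are not the data behind Fig. 4.14, and the function name `best_cutoff` is hypothetical.

```python
import math


def best_cutoff(roc_points):
    """Pick the cut-off whose ROC point lies closest to the top-left corner.

    roc_points: list of (cutoff, false_positive_rate, true_positive_rate).
    The top-left corner is the point with FPR = 0 and TPR = 1.
    """
    def distance_to_corner(point):
        _, fpr, tpr = point
        return math.hypot(fpr, 1.0 - tpr)

    return min(roc_points, key=distance_to_corner)[0]


# Illustrative (made-up) ROC points: (cut-off, FPR, TPR).
roc_points = [
    (0.08, 0.60, 0.99),
    (0.28, 0.10, 0.92),
    (0.52, 0.05, 0.70),
    (0.96, 0.01, 0.20),
]
chosen = best_cutoff(roc_points)
```

Very low cut-offs flag almost every beat (high sensitivity, many false positives), while very high cut-offs miss most embedded P-waves; the selected point balances the two.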

Figure 4.14 ROC curve for threshold selection


4.4 BGMM for arrhythmia

4.4.1 BGMM for supraventricular arrhythmia

A BGMM is constructed by extracting the amplitude and duration of each of the eleven states and the transitions between them. Zero duration in any waveform or ISO state reflects a missing state. The BGMM for supraventricular arrhythmia is modeled with 11 states:

1) P11 amplitude, duration;
2) P12 amplitude, duration;
3) P21 amplitude, duration;
4) P22 amplitude, duration;
5) Q-wave amplitude, duration;
6) R-wave amplitude, duration;
7) S-wave amplitude, duration;
8) T-wave amplitude, duration;
9) ISO1 amplitude, duration;
10) ISO2 amplitude, duration; and
11) ISO3 amplitude, duration.

Frequency analysis is used to derive the normalized transition probability between states (see Fig. 4.15).
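The frequency analysis that yields normalized transition probabilities can be sketched as counting state-to-state transitions over annotated beats and dividing by the number of transitions leaving each state. The beat sequences below are hypothetical and only show the mechanics; the function name `transition_probabilities` is an assumption.

```python
from collections import Counter, defaultdict

def transition_probabilities(state_sequences):
    """Count transitions between consecutive states and normalize each row to sum to 1."""
    counts = defaultdict(Counter)
    for seq in state_sequences:
        for src, dst in zip(seq, seq[1:]):
            counts[src][dst] += 1
    probs = {}
    for src, outgoing in counts.items():
        total = sum(outgoing.values())
        probs[src] = {dst: c / total for dst, c in outgoing.items()}
    return probs

# Three hypothetical beats expressed as state sequences (missing states are skipped)
beats = [
    ["ISO1", "P12", "P11", "P22", "P21", "ISO2", "Q", "R", "S", "ISO3", "ISO1"],
    ["ISO1", "P12", "P21", "Q", "R", "S", "ISO3", "T", "ISO1"],
    ["ISO1", "P11", "P22", "P21", "ISO2", "Q", "R", "S", "ISO3", "ISO1"],
]
tp = transition_probabilities(beats)
```

In this toy example, two of the three beats leave ISO1 towards P12, so `tp["ISO1"]["P12"]` is 2/3.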

Example of BGMM for EAT

Fig. 4.15 shows an example of BGMM constructed for annotated ectopic atrial tachycardia (EAT) beats observed for 10 patients’ records in MIT-BIH dataset [18].

Figure 4.15 BGMM for EAT


EAT is characterized by 1) morphology changes in the P-wave (amplitude and duration) when the ectopic focus is closer to the AV node, 2) missing T-waves since the next depolarization begins before the repolarization of the ventricles, and 3) shorter ISO2 duration since the impulse travels to the ventricles without delay. Negative P-waves result in the transition sequence P12→P11→P22→P21, which suggests a notch in the negative P-wave. For heartbeats observed with a negative P-wave without a notch, the transition sequence P12→P21 is included with a probability of 0.12. Also, ISO2 was missing for a few observations, which indicates that the impulse reached the AV node without delay since the ectopic focus was closer to the AV node.

For relatively few patients, positive P-waves were observed when the ectopic focus is away from the AV node, which is indicated by the low probability of 0.17 for ISO1→P11. The transition probability of ISO3→T is 0.25, reflecting missing T-waves during EAT because the next depolarization begins before the repolarization [3].

Table 4.1 Transition probability matrix

        ISO1   P11    P12    P21    P22    ISO2   Q      R      S      ISO3   T
ISO1    0      0.17   0.83   0      0      0      0      0      0      0      0
P11     0      0      0      0      1      0      0      0      0      0      0
P12     0      0.88   0      0.12   0      0      0      0      0      0      0
P21     0      0      0      0      0      0.81   0.19   0      0      0      0
P22     0      0      0      1      0      0      0      0      0      0      0
ISO2    0      0      0      0      0      0      1      0      0      0      0
Q       0      0      0      0      0      0      0      1      0      0      0
R       0      0      0      0      0      0      0      0      1      0      0
S       0      0      0      0      0      0      0      0      0      1      0
ISO3    0.75   0      0      0      0      0      0      0      0      0      0.25
T       1      0      0      0      0      0      0      0      0      0      0


Table 4.2 Bivariate Gaussian distribution

State   Amplitude (mV)   Duration (msec)   Correlation coefficient (ρ)   Bivariate distribution
P11     0.09             32                -0.52                         0.88
P12     -0.03            8                 0.63                          0.63
P21     0.02             10                -0.71                         0.79
P22     -0.10            30                0.74                          0.59
Q       -0.20            18                0.81                          0.82
R       1.8              45                -0.93                         0.88
S       -0.2             22                0.69                          0.70
T       0.12             120               -0.78                         0.56
ISO1    0.0              130               -0.32                         0.52
ISO2    0.0              110               -0.19                         0.82
ISO3    0                90                -0.23                         0.69

Table 4.1 shows the average transition probability matrix. Table 4.2 shows the bivariate Gaussian distribution of amplitude and duration of beats for the observed window. The correlation coefficient for each state of the BGMM indicates whether the amplitude and duration for the state are positively or negatively correlated. For P-wave slopes, the amplitude was taken to be the highest peak of the slope for the positive slopes (P11 and P21) and the lowest peak for the negative slopes (P12 and P22). The duration of each P-wave slope was calculated from the highest to the lowest peak.
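For reference, the density of a bivariate Gaussian over (amplitude, duration) with correlation coefficient ρ can be evaluated as in the sketch below. The standard deviations are not listed in Table 4.2, so the values used here are assumptions made only to give a runnable example, and the density computed is not claimed to reproduce the "Bivariate distribution" column of the table.

```python
import math

def bivariate_gaussian_pdf(a, d, mu_a, mu_d, sigma_a, sigma_d, rho):
    """Density of a bivariate Gaussian at amplitude a and duration d."""
    za = (a - mu_a) / sigma_a
    zd = (d - mu_d) / sigma_d
    one_minus_rho2 = 1.0 - rho * rho
    norm = 2.0 * math.pi * sigma_a * sigma_d * math.sqrt(one_minus_rho2)
    exponent = -(za * za - 2.0 * rho * za * zd + zd * zd) / (2.0 * one_minus_rho2)
    return math.exp(exponent) / norm

# State P11 from Table 4.2: mean amplitude 0.09 mV, mean duration 32 msec, rho = -0.52.
# The standard deviations (0.02 mV, 5 msec) are hypothetical.
at_mean = bivariate_gaussian_pdf(0.09, 32.0, 0.09, 32.0, 0.02, 5.0, -0.52)
off_mean = bivariate_gaussian_pdf(0.12, 40.0, 0.09, 32.0, 0.02, 5.0, -0.52)
```

An observation at the state mean receives the largest density, and the density falls off for observations whose amplitude and duration deviate from the mean.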


4.4.2 BGMM for ventricular arrhythmia

BGMMs developed for ventricular arrhythmia have 8 states and transitions between them. The eight states are: 1) P-wave amplitude, duration; 2) Q-wave amplitude, duration; 3) R-wave amplitude, duration; 4) S-wave amplitude, duration; 5) T-wave amplitude, duration; 6) ISO1 amplitude, duration; 7) ISO2 amplitude, duration; and 8) ISO3 amplitude, duration.

Example of BGMM for VTach

Fig. 4.16 shows BGMM developed for VTach using arrhythmia dataset [22] of 32 patients using 30 min recording for each patient.

Figure 4.16 BGMM for VTach

Defining characteristics of VTach are: 1) broad QRS-complex > 160 ms; 2) fraction of P-wave embedded inside QRS-complex; 3) no correlation between P-wave and QRS-complex; and 4) rare presence of isoelectric lines. The transition probabilities of T→P, P→Q, and S→T are 0.58, 0.89 and 0.69 respectively, which indicate missing ISO lines for most patients. Transitions ISO→P, T→P and S→P indicate that the P-wave is present and embedded in the QRS-complex, as detected using the area subtraction method. The Q→R→S transition suggests the presence of the QRS-complex for most patients.

Table 4.3 Transition probability matrix

        ISO1   P      ISO2   Q      R      S      ISO3   T
ISO1    0      0.69   0      0.31   0      0      0      0
P       0      0      0.01   0.89   0.10   0      0      0
ISO2    0      0      0      0.20   0.80   0      0      0
Q       0      0      0      0      1      0      0      0
R       0      0      0      0      0      0.86   0      0.14
S       0      0      0      0      0      0      0.09   0.69
ISO3    0.75   0      0      0      0      0      0      0.25
T       0.08   0.58   0      0.34   0      0      0      0

Table 4.4 Bivariate Gaussian distribution

State   Amplitude (mV)   Duration (msec)   Correlation coefficient (ρ)   Bivariate distribution
P       0.18             80                0.74                          0.42
Q       -0.20            27                0.81                          0.80
R       1.80             60                -0.93                         0.78
S       -0.20            25                0.69                          0.89
T       0.12             120               -0.78                         0.66
ISO1    0.01             60                -0.12                         0.48
ISO2    0.03             80                -0.19                         0.32
ISO3    0.01             65                -0.15                         0.56

Table 4.3 shows the average transition probability matrix. Table 4.4 shows the bivariate Gaussian distribution of amplitude and duration of beats for the observed window.


4.5 Bivariate Gaussian Probabilistic Transition Graph (BGTG)

A bivariate Gaussian transition graph (BGTG) is a weighted directed graph that shows the probability of transition between the adjacent states of a finite state automaton like Markov model [9, 10]. However, BGTG is made in real-time from a single patient using a small sample-size of heartbeats in a time-window compared to the BGMMs. It is used to match various BGMMs corresponding to different subclasses of arrhythmia for the proper labeling of patient’s ECG.

4.5.1 Example of BGTG for EAT

Fig. 4.17 shows an example of BGTG constructed from data collected in real time from a patient suffering EAT.

Figure 4.17 A sample of BGTG for 20-beat window

BGTG is constructed using a moving window which considers a group of 20 beats at a time. The transition ISO1→P12 represents the negative transition of the P-wave. The sequence P12→P11→P22→P21 suggests a notch in the negative P-wave and atrial enlargement [1, 2]. The transition P21→ISO2→Q→R→S→ISO3 suggests normal ventricular depolarization. The transition probability of 0.12 for ISO3→T suggests missing T-waves.

4.5.2 Example of BGTG for VTach

Fig. 4.18 shows a BGTG constructed for a patient suffering from VTach using a group of 20 beats observed in real time. The missing transitions to and from ISO lines suggest no visible baseline presence in the ECG. The transition T→P indicates that, even though no discernible P-wave could be found in the ECG, the QRS-area increase above the threshold suggested the presence of a P-wave, which was included in the BGTG with average features. The MPP of the observed BGTG, based on the noise threshold, is T→P→Q→R→T.

Figure 4.18 A sample of BGTG for a 20-beat window

4.6 BGMM and BGTG matching

BGTGs for patients are constructed from ECG waves using a moving window to monitor a patient in real-time [35]. The problem of identifying a patient's disease state reduces to labeling the BGTG by matching it against the transition-graphs corresponding to the BGMM of each supraventricular or ventricular arrhythmia-subclass and identifying the closest match.

Step 1: For the constructed BGTG, transition-probabilities below 0.05 are removed to eliminate noise. The threshold is derived from a statistical analysis that plots the ROC curve to attain maximum sensitivity and specificity while eliminating the noise present in the dataset [20].

Step 2: For the constructed BGTG, most probable path (MPP) is identified. MPP is the path from ISO1 to ISO1 with the highest transition probability.

Step 3: A subset of BGMMs is selected that includes all the transitions present in the BGTG. This step gives the list of prospective matching BGMMs.

Step 4: For all BGMMs obtained from Step 3, graph matching is performed by multiplying two values: 1) the probability that the observed bivariate distribution of a state in the BGTG is produced by the state in the BGMM [58] and 2) the probability that the state in the observed beat is generated by the BGMM, based on transition-probabilities using the standard forward-algorithm [36]. The BGTG is classified based upon the BGMM with the highest likelihood.
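Steps 1 and 2 can be sketched in Python as shown below. The sketch assumes a greedy reading of the MPP, in which the heaviest outgoing transition is followed from ISO1 until the path closes; the dictionary representation and the function names are assumptions. The graph reuses the values of Table 4.1, except that the row for R is altered with a hypothetical noisy edge R→T of 0.03 to show the effect of denoising.

```python
def denoise(graph, threshold=0.05):
    """Step 1: drop transitions whose probability is below the noise threshold."""
    return {src: {dst: p for dst, p in out.items() if p >= threshold}
            for src, out in graph.items()}

def most_probable_path(graph, start="ISO1"):
    """Step 2 (greedy reading): follow the heaviest outgoing edge until the path closes."""
    path = [start]
    current = start
    while graph.get(current):
        nxt = max(graph[current], key=graph[current].get)
        path.append(nxt)
        if nxt in path[:-1]:  # returned to ISO1 or closed another cycle
            break
        current = nxt
    return path

graph = {
    "ISO1": {"P11": 0.17, "P12": 0.83},
    "P11": {"P22": 1.0},
    "P12": {"P11": 0.88, "P21": 0.12},
    "P21": {"ISO2": 0.81, "Q": 0.19},
    "P22": {"P21": 1.0},
    "ISO2": {"Q": 1.0},
    "Q": {"R": 1.0},
    "R": {"S": 0.97, "T": 0.03},  # hypothetical noisy edge R -> T
    "S": {"ISO3": 1.0},
    "ISO3": {"ISO1": 0.75, "T": 0.25},
    "T": {"ISO1": 1.0},
}
clean = denoise(graph)
mpp = most_probable_path(clean)
```

For this graph the noisy edge R→T is removed and the resulting path is ISO1→P12→P11→P22→P21→ISO2→Q→R→S→ISO3→ISO1.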

4.7 Overall classification approach

The overall approach of arrhythmia subclassification has training phase, which is executed to collect generic information about ECG features for each class and subclass of arrhythmia using MIT-BIH dataset [21]; and dynamic phase which is executed in real time as patient’s data is collected.

4.7.1 Training phase

In this phase, the clustering parameters for GMM, which include the means, deviations and covariance matrices of the three mixture components for normal (N), ventricular (V) and supraventricular (S) beats, are estimated. GMM classification uses multiple features of ECG. To lower the computational complexity, principal component analysis (PCA) is used to reduce the number of features by retaining only those with an eigenvalue greater than 1. Bivariate Gaussian Markov models (BGMMs) are developed for the various subclasses of ventricular and supraventricular arrhythmia. The average QRS-area of healthy beats is also stored to detect embedded P-waves for ventricular arrhythmia subclassification.

4.7.2 Dynamic analysis phase

Dynamic analysis is executed for an observed ECG to be classified in real-time using information collected from the observed signal, training phase, and classifiers (GMM and Markov models). In this phase, observed ECG signal is preprocessed to remove noise and extract updated features after the embedded wave detection. The first window consisting of limited number of beats of ECG signal is analyzed for each patient. This initial window analysis is used to obtain following statistical information about the features of the waveforms:

1) Number of beats and individual waveforms in window;

2) Mean, median, minimum and maximum of the amplitude and duration of each type of waveform and ISO-lines for bivariate distribution calculation; and

3) Average RR-interval for the observed heartbeats.

The information from the initial window analysis is used for the two-level classification, which is performed in real-time by analyzing consecutive windows of the ECG signal. Each window contains a limited number of beats, chosen by statistical analysis so that the real-time nature is retained without sacrificing accuracy.

In level 1 of the two-level classification, the selected window is analyzed to classify each beat in it as either a V, S or N beat using GMM. Level 2 of the classification takes the same window of labeled beats and performs finer arrhythmia subclassification using BGMM. Further features are extracted based on the labels obtained (V or S), and a BGTG is developed for the beats in the observed window. This BGTG is then matched with the BGMMs for the respective subclass using the developed graph-matching algorithm. The overall approach is described in Fig. 4.19.

Figure 4.19 An overall approach of arrhythmia classification

The window-length for the initial window analysis was chosen to be 30 seconds, and the number of beats considered for BGTG construction was chosen to be 20 beats per BGTG. Both values were chosen empirically, based on ROC criteria and classification error analysis. The length of the window and the number of beats in the BGTG were plotted against the classification results obtained for arrhythmia and premature-beats, and the values that produced the highest sensitivity and the lowest false-positive rate were chosen.

4.7.3 Feature-extraction

Morphological features are collected for five waveforms and three baselines using the DWT technique (see Chapter 2). Each feature is a pair (amplitude, duration). The RR-interval of each beat is computed as the time-difference between adjacent R-peaks.

Initially, eighteen features were collected: ten for the five waveforms, six for the baselines, and two for the RR-intervals of the previous and current beat.

Although all features are important for the classification of the signal into finer subclasses, clustering for GMM requires only eight features [58], as shown in Table 4.5, without any loss of accuracy. Reduction of features is done using PCA (Principal Component Analysis) [32, 33].

Table 4.5. Features selected using PCA

Feature        Description
R-wave         Amplitude, duration
P-wave         Amplitude, duration
T-wave         Amplitude, duration
RR-interval    Previous RR-interval and RR-interval with the current beat


4.7.4 Clustering for Level-1 Arrhythmia Classification

Observed beats are classified as either V (ventricular arrhythmia), S (supraventricular arrhythmia) or N (normal) beats using GMM (Gaussian Mixture Model). GMM-based clustering is a soft-clustering method. Arrhythmia classification requires soft-clustering because each feature characteristic can belong to more than one class of arrhythmia [56]. Clustering was done in eight reduced dimensions (see Table 4.5) using Gaussian distributions. The covariance matrices of heartbeats are diagonal, which implies that the features are statistically independent within each Gaussian mixture component [33, 58].

If $x = (x_1, \ldots, x_8)^T$ represents the eight-dimensional feature-vector of an observed heartbeat, where the superscript $T$ denotes the transpose, the Gaussian distribution of the vector is defined by equation 4.7:

$$P(x \mid \mu, \Sigma) = \frac{1}{(2\pi)^{d/2}\sqrt{|\Sigma|}} \exp\left(-\frac{1}{2}(x-\mu)^T \Sigma^{-1} (x-\mu)\right) \qquad (4.7)$$

where $\mu$ is the mean for each component, $\Sigma$ is the covariance matrix and $d$ is the dimension of the data. The probability of $x$ in a mixture of $k$ Gaussians is given by equation 4.8:

$$p(x) = \sum_{j=1}^{k} w_j \, N(x \mid \mu_j, \Sigma_j) \qquad (4.8)$$

where $w_j$ is the prior probability of the $j^{th}$ Gaussian, $N(x \mid \mu_j, \Sigma_j)$ is the Gaussian density of equation 4.7, and the cumulative probability $\sum_{j=1}^{k} w_j$ is equal to 1.0.
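Equations 4.7 and 4.8 can be evaluated directly when the covariance matrices are diagonal, because the determinant is then the product of the variances and the quadratic form is a sum of scaled squared differences. The sketch below assumes this diagonal case; the function names and the toy parameters are assumptions made for illustration.

```python
import math

def diag_gaussian_pdf(x, mu, var):
    """Equation 4.7 for a diagonal covariance matrix with variances `var`."""
    d = len(x)
    det = math.prod(var)  # |Sigma| of a diagonal matrix
    quad = sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mu, var))
    return math.exp(-0.5 * quad) / ((2.0 * math.pi) ** (d / 2.0) * math.sqrt(det))

def mixture_pdf(x, weights, means, variances):
    """Equation 4.8: weighted sum of the component densities."""
    return sum(w * diag_gaussian_pdf(x, m, v)
               for w, m, v in zip(weights, means, variances))

# Toy two-dimensional mixture with two components (hypothetical parameters)
weights = [0.7, 0.3]
means = [[0.0, 0.0], [3.0, 3.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]
p_origin = mixture_pdf([0.0, 0.0], weights, means, variances)
```

For a standard two-dimensional Gaussian the density at the mean is 1/(2π), which gives a quick check of the normalizing constant.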

The GMM is trained to obtain the model parameters for each Gaussian component. Model parameters include the prior probability, mean, and covariance matrix of each component. The conditional probability density is estimated using equation 4.7. Given a sequence of heartbeats $X = \langle x_1, x_2, \ldots, x_N \rangle$, the parameters of the GMM, represented by $\theta$, are estimated by maximizing the likelihood $p(X|\theta)$ using the expectation maximization (EM) algorithm.

4.8 Algorithms

4.8.1 Identification of embedded P-waves

Embedded wave identification is done in the dynamic phase after level 1 classification. The input to this phase is 1) a stream of labeled beats of the form $\{b_1^S, b_2^S, b_3^S, b_4^S, \ldots\}$ such that $b$ represents a beat and $S$ represents supraventricular arrhythmia, 2) features extracted for each beat of the form [Pamplitude, Pduration, Qamplitude, Qduration], and 3) the average QRS-area, P-wave amplitude and duration, represented as QRSaverage-area, Paverage-amplitude and Paverage-duration, obtained in the training phase. Identification of a P-wave embedded in the QRS-complex follows three steps:

Algorithm to identify P-wave embedded in QRS complex

Input: Set of labeled beats $S = \{b_1^S, b_2^S, b_3^S, b_4^S, \ldots\}$
       Set of extracted feature vectors [Pamplitude, Pduration, Qamplitude, Qduration]
       Average features for beats: QRSaverage-area, Paverage-amplitude, Paverage-duration
Output: P-wave presence or absence with amplitude and duration

{ for each beat $b_i^S \in S$
  { if (Pamplitude == 0 && Pduration == 0) // if P-wave is missing
    { QRSarea = area of the QRS-complex calculated using Simpson’s rule;
      diff = QRSarea - QRSaverage-area; // perform subtraction
      if (diff > 0.28) // mark P-wave present and embedded; assign average features
      { Pamplitude = Paverage-amplitude; Pduration = Paverage-duration; }
    }
    else exit; // exit the analysis of this beat without changing the feature vector
  }
}

Figure 4.20 An algorithm for embedded P-waves


Step 1: Analyze the features of each beat for the presence or absence of a P-wave.

Step 2: If P-wave is missing, then QRS-area is calculated for the beat using Simpson’s rule. Calculated QRS-area is compared with average QRS-area obtained in training phase.

Step 3: If the difference between areas is greater than threshold, P-wave is marked present and average features of P-wave are assigned for the embedded P-wave.

An algorithm to detect embedded P-waves in QRS-complex is given in Fig. 4.20.
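The three steps can also be sketched as runnable Python. This is an illustrative sketch rather than the implementation behind Fig. 4.20. It assumes that the 28% threshold is applied to the difference relative to the average QRS-area, and the dictionary keys (`p_amplitude`, `p_duration`, `p_embedded`) and function names are hypothetical.

```python
def simpson_area(samples, dx):
    """Composite Simpson's rule over equally spaced samples (even number of segments)."""
    n = len(samples) - 1
    if n < 2 or n % 2 != 0:
        raise ValueError("Simpson's rule needs an even number of segments")
    return dx / 3.0 * (samples[0] + 4.0 * sum(samples[1:n:2])
                       + 2.0 * sum(samples[2:n:2]) + samples[n])

def resolve_embedded_p(beat, qrs_samples, dx, qrs_avg_area,
                       p_avg_amplitude, p_avg_duration, threshold=0.28):
    """Return the beat features, with average P-wave features filled in if a P-wave is embedded."""
    # Step 1: nothing to do when a P-wave was already extracted
    if beat["p_amplitude"] != 0 or beat["p_duration"] != 0:
        return beat
    # Step 2: compare the QRS-area of this beat with the average from training
    area = simpson_area(qrs_samples, dx)
    relative_diff = (area - qrs_avg_area) / qrs_avg_area  # assumption: relative difference
    # Step 3: mark the P-wave as embedded and assign average features
    if relative_diff > threshold:
        return {**beat, "p_amplitude": p_avg_amplitude,
                "p_duration": p_avg_duration, "p_embedded": True}
    return beat

beat = {"p_amplitude": 0, "p_duration": 0}
qrs = [0.0, 1.0, 0.0]  # toy QRS samples, area = 4/3 by Simpson's rule
embedded = resolve_embedded_p(beat, qrs, 1.0, 1.0, 0.09, 32)
unchanged = resolve_embedded_p(beat, qrs, 1.0, 1.2, 0.09, 32)
```

In the first call the area exceeds the average by about 33%, so the average P-wave features are assigned; in the second call the excess is about 11% and the feature vector is left unchanged.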

4.8.2 Expectation maximization (EM)

The EM algorithm maximizes the likelihood of the clustering with respect to the GMM parameters, which comprise the means and covariances of the components and the prior probabilities.

The algorithm has three steps:

Step 1. Provide initial parameters: mean and standard deviation for each type of beat (V, S, and N) obtained from the training phase.

Step 2. Iteratively renew the parameters with E and M steps until $P(X|\theta)$ converges. In the E step, compute the membership probability of each instance based on the current parameter values (initially, those derived from the extracted temporal and morphological features). In the M step, recompute the parameters based on the new membership probabilities. Continue the E and M steps until the parameters converge, as defined by $|\theta_t - \theta_{t-1}| \le \epsilon$, where $\theta_t$ is the parameter at iteration $t$ and $\epsilon$ is the defined tolerance. The tolerance is selected as $10^{-6}$ based upon ROC criteria [58].

Step 3. Assign each instance to the cluster with which it has the highest membership probability. The algorithm is given in Fig. 4.21.


Algorithm: Classification of beats into N, V or S beats using GMM

Input: Feature vector set of heartbeats $X = \{x_1, x_2, \ldots, x_N\}$
       Trained GMM $\theta = \{\mu_j, \Sigma_j, w_j\}$, $j = 1, \ldots, k$, where the $\mu_j$ and $\Sigma_j$ are the means and covariances, $j$ indexes the Gaussian components or clusters, and the $w_j$ are the prior probabilities of the $j^{th}$ Gaussian, subject to $\sum_{j=1}^{k} w_j = 1$ and $0 \le w_j \le 1$

Output: Model parameter $\theta$ that maximizes the data likelihood $p(X|\theta) = \prod_{n} \sum_{j} w_j \, p(x_n|\theta_j)$
        Partition of the data vectors given by the cluster identity vector $Y = \{y_1, \ldots, y_N\}$, $y_n \in \{1, \ldots, k\}$

Algorithm:
{ // Initialize the model parameters
  $\theta = \{\mu_j, \Sigma_j, w_j\}$ using the training dataset
  // Expectation maximization step to update parameters
  Loop: until ($|\theta_t - \theta_{t-1}| \le \epsilon$) // until $p(X|\theta)$ converges
  { // Step E: find the posterior probability of model j, given data vector $x_n$ and current model parameters $\theta$
    $p(j|x_n, \theta) = \frac{w_j \, p(x_n|\theta_j)}{\sum_{j'} w_{j'} \, p(x_n|\theta_{j'})}$
    // Step M: maximum likelihood re-estimation of the model parameters $\theta$ is given by:
    $\mu_j^{updated} = \frac{\sum_n p(j|x_n, \theta) \, x_n}{\sum_n p(j|x_n, \theta)}$
    $\Sigma_j^{updated} = \frac{\sum_n p(j|x_n, \theta) \, (x_n - \mu_j^{updated})(x_n - \mu_j^{updated})^T}{\sum_n p(j|x_n, \theta)}$
    $w_j^{updated} = \frac{1}{N} \sum_n p(j|x_n, \theta)$
  }
  // for each data vector $x_n$, find the cluster with maximum likelihood
  $y_n = \arg\max_j \left( w_j \, p(x_n|\theta_j) \right)$
}

Figure 4.21 GMM based classification
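To make the E and M steps concrete, the following sketch runs EM on a one-dimensional Gaussian mixture. It is a deliberately reduced illustration, not the eight-dimensional classifier of Fig. 4.21: it uses a single feature, checks convergence on the means only, and the data are hypothetical RR-interval values chosen so that two clusters are clearly separated.

```python
import math

def normal_pdf(x, mu, var):
    """Univariate Gaussian density."""
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def em_gmm_1d(data, means, variances, weights, tol=1e-6, max_iter=200):
    means, variances, weights = list(means), list(variances), list(weights)
    k, n = len(means), len(data)
    for _ in range(max_iter):
        # E step: posterior p(j | x_n, theta) for every sample and component
        resp = []
        for x in data:
            joint = [weights[j] * normal_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(joint)
            resp.append([v / total for v in joint])
        # M step: re-estimate mean, variance and prior weight of each component
        new_means, new_vars, new_weights = [], [], []
        for j in range(k):
            nj = sum(r[j] for r in resp)
            mu = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - mu) ** 2 for r, x in zip(resp, data)) / nj
            new_means.append(mu)
            new_vars.append(max(var, 1e-9))  # floor avoids a zero variance
            new_weights.append(nj / n)
        shift = max(abs(a - b) for a, b in zip(new_means, means))
        means, variances, weights = new_means, new_vars, new_weights
        if shift <= tol:  # simplification: convergence is checked on the means only
            break
    # Assign each sample to the component with the highest weighted density
    labels = [max(range(k), key=lambda j: weights[j] * normal_pdf(x, means[j], variances[j]))
              for x in data]
    return means, variances, weights, labels

rr = [0.58, 0.60, 0.62, 0.98, 1.00, 1.02]  # hypothetical RR-intervals in seconds
means, variances, weights, labels = em_gmm_1d(rr, [0.5, 1.1], [0.05, 0.05], [0.5, 0.5])
```

With these inputs the two component means settle near 0.60 and 1.00, and the first three samples are assigned to the first component.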

4.8.3 Algorithm for BGMM and BGTG matching

The input to the graph matching block is a BGTG represented as $G$ and BGMMs. Let $V_G$ denote the set of vertices in $G$ and $Tr_G$ denote a transition from state $St_G$ in $G$. Let $bd(St_G)$ denote the bivariate distribution of a state in the BGTG and $bd(St_M)$ denote the bivariate distribution of a state in the BGMM. Let $Tr(St_G)$ denote the transition probability from a state in $G$ and $Tr(St_M)$ denote the transition probability from a state in the BGMM.

Algorithm: Transition Graph Matching

Input: 1. Transition graph or BGTG constructed for the patient, denoted by G
       2. Set of Markov models or BGMMs, S = {G_AFib, G_AFlu, G_AVNRT, G_EAT, G_JTachy} or S = {G_VTach, G_VFib, G_VFlu}
Output: classification C

{ // Denoise the transition graph G
  let E_G be the set of weighted edges in G;
  for each e ∈ E_G do
  { if (weight(e) < noise-threshold) E_G = E_G - e; } // remove the insignificant edge from the set

  // Prune the Markov models that do not match with E_G
  possible matching set S_M = { };
  for each Markov model M ∈ S
  { // find potential Markov models
    let E_M be the set of edges in M;
    if (E_G ⊆ E_M) S_M = S_M + M; }

  // Construct MPP
  let W be the set of weights associated with each edge;
  let W_ij be the weight that denotes the transition probability from e_i to e_j;
  for each e_i ∈ E_G do
  { e_j = successor of e_i with max [W_ij]; // find the maximum outgoing weight for e_i
    MPP = MPP + e_j; } // add edge to the MPP

  // Use forward algorithm to match and find the probability of the best matching path
  let bd(St_G) denote the bivariate distribution of a state in the BGTG and bd(St_M) denote the bivariate distribution of a state in the BGMM;
  let Tr(St_G) denote the transition probability from a state in G and Tr(St_M) denote the transition probability from a state in the BGMM;
  let the set of vertices in G be V_G;
  for each matching Markov model M ∈ S_M
  { V = V_G; c = first state in V; L_{G-M} = 1;
    while (V is not empty) do
    { // calculate state and transition probability in BGTG belonging to BGMM
      L_{G-M} = L_{G-M} * P(Tr(St_G)|Tr(St_M)) * P(bd(St_G)|bd(St_M));
      V = V - c; c = next-state(V); } }

  // classify BGTG as a subclass of the BGMM with the highest likelihood
  C = the M ∈ S_M with max (L_{G-M})
}

Figure 4.22 An algorithm for transition graph matching

Transition graph matching is done in four steps:

Step 1: For the constructed BGTG $G$, transition probabilities below the threshold are removed to eliminate noise. The BGMMs that contain all the remaining transitions are then selected, which gives a subset of BGMMs denoted as $S_M$.

Step 2: For the constructed BGTG $G$, the most probable path (MPP) is identified. The MPP is the path from ISO1 to ISO1 with the highest transition probability.

Step 3: For each BGMM in $S_M$, graph matching is performed by multiplying two values:

a) The probability that the state $St_G$ in the observed BGTG is generated by a state $St_M$ in the BGMM, based on transition probabilities using the standard forward algorithm. The probability is represented as $P(Tr(St_G)|Tr(St_M))$.

b) The probability that the observed bivariate distribution of state $St_G$ in the BGTG is produced by state $St_M$ in the BGMM [22], represented as $P(bd(St_G)|bd(St_M))$.

Step 4: BGTG is classified based upon the BGMM with the highest likelihood.

The detailed algorithm for graph matching is given in Fig. 4.22.
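The pruning and scoring steps can be sketched as follows. The sketch scores a single path by multiplying, for each transition, the model transition probability with a per-state match probability; this path product is a simplification and not the full forward algorithm. All model names, probabilities and the `state_probs` table are hypothetical values chosen for illustration.

```python
def edge_set(graph, threshold=0.0):
    """Set of (source, destination) edges with non-zero probability at or above the threshold."""
    return {(s, d) for s, out in graph.items() for d, p in out.items()
            if p > 0 and p >= threshold}

def candidate_models(bgtg, models, noise_threshold=0.05):
    """Keep the models that contain every denoised transition of the BGTG."""
    observed = edge_set(bgtg, noise_threshold)
    return [name for name, m in models.items() if observed <= edge_set(m)]

def classify(path, models, candidates, state_probs):
    """Score each candidate along the path and return the best label with all scores."""
    scores = {}
    for name in candidates:
        score = 1.0
        for src, dst in zip(path, path[1:]):
            score *= models[name].get(src, {}).get(dst, 0.0) * state_probs[name].get(dst, 0.0)
        scores[name] = score
    return max(scores, key=scores.get), scores

models = {  # hypothetical, truncated transition graphs
    "EAT": {"ISO1": {"P12": 0.83, "P11": 0.17}, "P12": {"P11": 0.88, "P21": 0.12}},
    "OTHER": {"ISO1": {"P12": 0.20, "P11": 0.80}, "P12": {"P11": 0.50, "P21": 0.50}},
    "VTACH": {"ISO1": {"P": 0.69, "Q": 0.31}},
}
state_probs = {  # hypothetical P(bd(St_G) | bd(St_M)) per model and state
    "EAT": {"P12": 0.63, "P11": 0.88},
    "OTHER": {"P12": 0.50, "P11": 0.50},
}
bgtg = {"ISO1": {"P12": 0.96, "Q": 0.04}, "P12": {"P11": 1.0}}

candidates = candidate_models(bgtg, models)
best, scores = classify(["ISO1", "P12", "P11"], models, candidates, state_probs)
```

Here the edge ISO1→Q is discarded as noise, the model "VTACH" is pruned because it lacks the observed transitions, and "EAT" receives the highest score.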

4.9 Discussion

In this chapter, we have discussed the classification method for arrhythmia. Finer arrhythmia classification is done in two stages. In the first stage, each beat is classified as either N, V or S using GMM clustering, where EM is used for parameter estimation. In the second stage, finer-level classification is done using a novel method, the bivariate Gaussian Markov model (BGMM). This model is capable of classifying a group of beats observed in real time as a specific subclass of arrhythmia. Since both the amplitude and the duration of waveforms and baselines are included in this model, it accurately classifies the subclasses of arrhythmia. To tackle issues in arrhythmia, a method of P-wave resolution and embedded wave detection has been introduced. These techniques help to analyze finer morphological features of each beat and include these details in the BGMM to accurately classify the signal. In the next chapter, we introduce methods for classifying premature beats and patterns of premature beats with normal beats, using the techniques introduced in this chapter.


Identification of Premature Beats and Irregular Beat-patterns

This chapter describes finer premature beat classification based on Markov models with bivariate Gaussian distribution, and irregular beat-pattern classification based on a lookahead mechanism. The focus of this chapter is: 1) finer premature beat classification based on the location of the ectopic focus, using a Markov model that includes both morphological and temporal features of waveforms and baselines, and 2) identification of various patterns combining premature beats and regular sinus beats. Detection of beat-patterns along with the location of the ectopic focus is important to prevent degeneration into serious arrhythmia.

A premature beat occurs due to an ectopic location that acts as a pacemaker and starts an impulse in the heart before the next regular sinus beat. This results in an extra unexpected heartbeat. Based on the location of the ectopic focus, premature beats are classified as premature atrial contraction (PAC), blocked-premature atrial contraction (B-PAC), premature junctional contraction (PJC), and premature ventricular contraction (PVC).

A premature beat occurs between two regular sinus beats. Based on the number of premature beats between sinus beats, irregular beat-patterns are classified as bigeminy, trigeminy, and quadrigeminy. Premature beat-patterns are challenging due to embedded waveforms, which can result in a missed beat and cause misclassification. The identification of irregular beat-patterns requires: 1) correct identification of embedded waveforms, and 2) a lookahead algorithm to disambiguate between the beat-pattern subclasses.

Premature beat classification is performed at two levels. The first level separates premature beats from regular sinus beats using a rule-based system based on RR-interval analysis. The second level performs finer classification using BGMMs, which consider morphological features of individual beats and transitions within each beat.

5.1 Challenges in premature beat analysis

Certain classes of premature beats are difficult to identify due to similar morphology. PVCs have an aberrant but consistent morphology, allowing an easy identification. The PACs and B-PACs have a morphology similar to that of the normal sinus beat and PJC [1], which introduces ambiguity in the detection and accurate classification among PAC, B-PAC and PJC [1, 2, and 5]. To identify subclasses of premature beats, detailed analysis of morphology and temporal changes is required.

5.1.1 Embedded waveforms

Premature beats exhibit two types of superimpositions of waveforms: 1) embedding of the P-wave of a PAC or B-PAC in the T-wave of the preceding beat; and 2) superimposition of the R-wave of a PVC with the T-wave of the preceding beat [1, 2]. The phenomenon of embedding of the P-wave in PAC or B-PAC in the preceding T-wave is also known as “P-on-T phenomenon” [1, 2]. Similarly, the superimposition of the R-wave in PVC with the preceding T-wave is also known as “R-on-T phenomenon”.

P-on-T Phenomenon

In the case of the P-on-T phenomenon seen in B-PACs and PACs, the P-wave starts before the ventricular repolarization is complete. The superimposition of the P-wave makes the T-wave appear biphasic, and alters the area of the T-wave [1, 2, 5, and 61]. In B-PAC, this is followed by a pause, with an absent QRS-complex and T-wave caused by blockage of the impulse at the AV-junction, which prevents ventricular depolarization. In PAC, P-on-T is followed by a QRS-complex and T-wave. Fig. 5.1 shows such an instance of an ECG signal containing a B-PAC beat from the MIT-BIH database [21].

When a premature beat occurs due to an ectopic focus generating an electrical impulse, the depolarization wave from this ectopic focus resets the sinus node. This phenomenon is known as non-compensatory pause [1, 2]. Due to this, the interval between the premature beat and the next regular sinus beat is increased.

Figure 5.1 P-on-T phenomenon in B-PAC


R-on-T Phenomenon

PVCs typically occur after the previous complex has finished repolarization. In the case of R-on-T phenomenon, PVC beat starts during the refractory period of previous complex causing R-wave to superimpose on the preceding T-wave. This deforms and alters the morphology of T-wave significantly, and the area of the T-wave is altered [1, 2].

Fig. 5.2 shows an instance of R-on-T phenomenon from MIT-BIH database [21]. R-on-T phenomenon can create potential for serious reentry-loops, which can lead to arrhythmia like VTach [1, 2, 5, 61, and 62].

Figure 5.2 R-on-T phenomenon in PVC

5.1.2 Irregular beat-pattern identification

Based upon the patterns of premature and regular sinus beats, ECG rhythms with premature contractions are further subclassified as bigeminy, trigeminy, quadrigeminy, couplets and triplets [1, 2, and 5].

The number of occurrences of premature beats (P) between regular sinus beats (R) denotes the rate at which the ectopic focus is irritated. As the number of P beats in the pattern increases, the risk of degeneration into serious arrhythmia increases [3, 62]. PVC bigeminy has the potential of degenerating into VTach or life-threatening VFib if not treated in time [1, 2, 62]. Similarly, PAC bigeminy has the potential of degenerating into AFib, and PJC can degenerate into JTachy [1, 2, and 62].

Table 5.1 illustrates embedded occurrences of bigeminy, trigeminy, and quadrigeminy. Regular sinus beats are denoted as ‘R’, and premature beats are denoted as ‘P’. For the first row in Table 5.1, if P denotes PVC, then PVC bigeminy is detected in a patient.

Table 5.1 Patterns of Irregular Beats

Irregularity     Beat-patterns (R is sinus and P is premature beat)
Bigeminy         R R R P R P R P R P R P R P R P R R
Trigeminy        R R R R R R R P R R P R R P R R P R R R R
Quadrigeminy     R R R R R R R P R R R P R R R P R R R R

After the first occurrence of a premature contraction ‘P’, the repeating pattern for bigeminy, trigeminy and quadrigeminy is (2n – 1), (3n – 1) and (4n – 1) (n being a positive integer), respectively [23]. There must be two or more repeats of the pattern to make a distinction. For example, a single occurrence of “R R R P” can have three interpretations (the bracketed beats form the candidate pattern):

1) Two regular sinus beats followed by a bigeminy “R R [R P]”;

2) One regular beat followed by a trigeminy “R [R R P]”; or

3) A quadrigeminy “[R R R P]”.

Hence, a single occurrence of “… R R R P …” is not treated as quadrigeminy; a single occurrence of “… R R P …” is not treated as trigeminy; and a single occurrence of “… R P …” is not treated as bigeminy [63, 64, 65, 66]. This treatment of the singleton “R R R P” is consistent with the MIT-BIH [21] annotations. Actual patterns are not as clean as shown in Table 5.1, and include occasional sandwiched sinus rhythm and switching between different beat-patterns that prohibit straightforward pattern analysis [66].
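The requirement of two or more repeats can be illustrated with a simple pattern search over a string of beat labels. This is only a sketch for clean sequences such as those in Table 5.1, using regular expressions; it is not the lookahead algorithm developed in this chapter and does not handle sandwiched sinus rhythm or switching between patterns. The names `PATTERNS` and `find_beat_patterns` are assumptions.

```python
import re

# Repeating unit of each irregular beat-pattern (R: sinus beat, P: premature beat)
PATTERNS = {"bigeminy": "RP", "trigeminy": "RRP", "quadrigeminy": "RRRP"}

def find_beat_patterns(labels, min_repeats=2):
    """Return the spans where a unit repeats at least min_repeats times in a row."""
    found = {}
    for name, unit in PATTERNS.items():
        regex = re.compile("(?:%s){%d,}" % (unit, min_repeats))
        spans = [m.span() for m in regex.finditer(labels)]
        if spans:
            found[name] = spans
    return found

bigeminy_row = "RRRPRPRPRPRPRPRPRR"
trigeminy_row = "RRRRRRRPRRPRRPRRPRRRR"
quadrigeminy_row = "RRRRRRRPRRRPRRRPRRRR"
single = find_beat_patterns("RRRP")  # a single occurrence matches nothing
```

Each row of Table 5.1 matches only its own pattern, while the singleton "RRRP" is not labeled as any pattern.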

5.2 Premature beats detection

Premature beat detection separates premature beats and regular sinus beats using rule-based analysis. The input to this stage is an ordered set of beats with various feature-values from the feature-extraction phase. The output is a sequence of beats labeled as P for premature beats and R for regular sinus beats. This stage identifies the premature beats and the P-waves and R-waves superimposed on the T-wave, estimates the feature-values of the embedded R-waves, and returns another ordered set of corrected feature values.

There are two steps in premature beat detection: 1) identification of premature beats using rule-based RR-interval analysis; and 2) identification of the embedded P-waves and R-waves using the area subtraction method, which estimates the area of T-waves using Simpson’s rule as described in Chapter 4.

Integration of embedded waveform analysis with rule-based analysis is required for two reasons:

a) R-waves embedded in the T-wave, as observed in PVC, remain undetected. This causes RR-interval-based analysis to miss PVC beats [62].

b) B-PAC beats blocked at the AV junction are characterized by P-waves embedded in the T-wave of the previous beat, without QRS-complex or T-waves. Due to the absence of the R-wave, B-PAC beats go undetected.

5.2.1 Integration of rule-based analysis and embedded waveform analysis

Rule-based analysis

Rule-based interval analysis [43] identifies premature beats by analyzing heartbeat variability in three consecutive RR-intervals (containing four beats), as shown in Fig. 5.3. The symbol RR1 denotes the interval between the first and second beat. The symbol RR2 denotes the interval between the second and third beat, and the symbol RR3 denotes the interval between the third and fourth beat. Initially, all the beats are presumed to be normal sinus beats.

Figure 5.3 RR-interval variability due to premature beats

If rule 1 given by equation 5.1 is satisfied, then the beat corresponding to the start of the RR2 is a premature beat. The rationale for the RR2 to be larger is that the premature second beat causes a non-compensatory pause in the regular sinus beat elongating the interval between the premature beat and the regular sinus beat.

(RR2 > 1.13 RR1) ⋀ (RR2 > 1.15 RR3) ⋀ (1.15 RRaverage < RR2 < 1.8 RRaverage) (5.1)


The rule to check the prematurity of the second beat is a conjunction of three expressions. The first expression, (RR2 > 1.13 RR1), ensures that RR2 is greater than RR1 by a margin that cannot be attributed to the normal Gaussian distribution of intervals. The second expression, (RR2 > 1.15 RR3), ensures that RR2 is likewise significantly greater than RR3. The increase in the value of RR2 suggests an early occurrence of the R-wave of the premature beat and a regular occurrence of the R-wave of the next normal sinus beat. The third expression, (1.15 RRaverage < RR2 < 1.8 RRaverage), ensures that the value of RR2 is within a range based on the value of RRaverage. RR2 values outside this range fall outside the 95% confidence-interval of the Gaussian distribution obtained by analyzing premature beats [43].
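Rule 1 of equation 5.1 translates directly into a Boolean test on the three RR-intervals and the average RR-interval. The function name `is_premature` and the example intervals (in seconds) are assumptions made for illustration.

```python
def is_premature(rr1, rr2, rr3, rr_average):
    """Equation 5.1: the beat at the start of RR2 is premature if all three expressions hold."""
    return (rr2 > 1.13 * rr1
            and rr2 > 1.15 * rr3
            and 1.15 * rr_average < rr2 < 1.8 * rr_average)

# Hypothetical intervals in seconds
premature = is_premature(0.60, 1.00, 0.80, 0.80)   # short RR1 followed by a long RR2
regular = is_premature(0.80, 0.82, 0.80, 0.80)     # ordinary variability
too_long = is_premature(0.80, 1.60, 0.80, 0.80)    # RR2 exceeds 1.8 x average
```

The last case fails the third expression; such intervals are handled by the embedded waveform analysis of equation 5.2 instead.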

Embedded waveform analysis

For B-PAC and PVC, the R-wave morphology is changed, or the R-wave is missing [62]. In these cases, RR-interval based analysis fails to detect the premature beat [62]. To prevent this, RR2-intervals are checked by the rule in equation 5.2 to detect a possibly missing B-PAC or PVC beat.

RR2 ≥ 1.8 RRaverage (5.2)

The rule checks if the value of RR2 is greater than the value of RRaverage by a threshold. If the condition is satisfied, it suggests a possible occurrence of premature beat with either missing R-wave or R-wave with different morphology within the RR2.


Embedded P-wave analysis

B-PAC is characterized by the absence of both the QRS-complex and the T-wave due to the block at the AV junction [1, 2, and 62]. To identify an undetected B-PAC within RR2, the presence of an embedded P-wave is checked by the rule in equation 5.3. If the rule is satisfied, then the location of the T-wave corresponding to the start-beat of RR2 is marked as B-PAC. Fig. 5.4 shows an example of a missed B-PAC with missing QRS-complex and T-wave.

(Tarea > 1.38 Tarea-average) ⋀ (Qamplitude-average > 2.4 negative-peaks within RR2) (5.3)

Figure 5.4 RR-interval analysis for B-PAC with P-on-T

The rule given by equation 5.3 is a conjunction of two expressions. The first expression, (Tarea > 1.38 Tarea-average), ensures that the T-wave area is significantly larger than the average T-wave area. The second expression, (Qamplitude-average > 2.4 negative-peaks within RR2), ensures that any negative peak observed within RR2 is significantly smaller than the average Q-wave amplitude, as observed for B-PAC beats [62]. The amplitude of the Q-wave observed for a B-PAC beat is smaller since the AV junction blocks the impulse conduction through the ventricles.


Embedded R-wave analysis

PVC beats occasionally have missing P-waves and R-wave embedded in previous beat’s T-wave. Also, the amplitude is lower for R-waves since the ectopic focus is located in ventricles causing the impulse to move in different direction to depolarize ventricles.

PVCs can also have a prominent Q-wave with a negative amplitude greater than the normal Q-wave amplitude, since the ventricular depolarization traversal from the left to the right ventricle takes more time. RR2 is checked by the rule given by equation 5.4. If the rule is satisfied, then the next positive deflection within RR2 is marked as an R-wave, and the beat within RR2 is marked as a PVC beat.

(Tarea > 1.38 Tarea-average) ⋀ (negative peak in RR2 ≥ 2.4 Qamplitude-average) (5.4)

The rule is a conjunction of two expressions. The first expression ensures that the T-wave area is increased significantly. The second expression, (negative peak in RR2 ≥ 2.4 Qamplitude-average), ensures the presence of a Q-wave with amplitude much greater than the average Q-wave amplitude. The left side of Fig. 5.5 shows an example of a missed PVC beat and the right side shows the detected PVC beat with the R-wave marked.

Figure 5.5 RR-interval analysis for PVC with R-on-T
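The three embedded-waveform rules (equations 5.2, 5.3 and 5.4) can be chained into one decision. The sketch below is only an illustration under stated assumptions: amplitudes are passed as magnitudes (absolute values), a single negative peak within RR2 is considered, and the function name and sample numbers are hypothetical.

```python
from typing import Optional


def classify_embedded_beat(rr2: float, rr_average: float,
                           t_area: float, t_area_average: float,
                           q_amp_average: float, negative_peak: float) -> Optional[str]:
    """Return 'B-PAC', 'PVC', or None for an elongated RR2 interval.

    Assumption: q_amp_average and negative_peak are magnitudes (>= 0).
    """
    if rr2 < 1.8 * rr_average:                 # equation 5.2 not satisfied
        return None
    if t_area <= 1.38 * t_area_average:        # first expression of 5.3 and 5.4
        return None
    if q_amp_average > 2.4 * negative_peak:    # second expression of 5.3
        return "B-PAC"
    if negative_peak >= 2.4 * q_amp_average:   # second expression of 5.4
        return "PVC"
    return None


# Hypothetical values: RR in seconds, areas in arbitrary units, amplitudes in mV.
small_peak = classify_embedded_beat(1.6, 0.8, 1.5, 1.0, 0.16, 0.05)
large_peak = classify_embedded_beat(1.6, 0.8, 1.5, 1.0, 0.16, 0.50)
```

A negative peak between the two thresholds satisfies neither rule, so the function returns None and the interval is left unlabeled by this step.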


The thresholds used in the rules to detect a premature beat are clinically significant, and were provided by medical experts based on the RR-intervals in premature beats and arrhythmic events [43, 62]. The chosen thresholds yield the maximum sensitivity and specificity that can be achieved in detecting premature beats. The stepwise description of the RR-interval rule-based analysis is given in section 5.6.1.

5.3 Markov model approach for finer classification

To include finer features (amplitude and duration) for each component of the ECG, the bivariate Gaussian distribution Markov model (BGMM) approach explained in chapter 4, section 4.4, is used for premature beat classification, such that each state of a BGMM includes the bivariate Gaussian distribution of amplitude and duration. A BGTG is specific to one patient and is constructed for each beat of the observed ECG signal. The input required to construct a BGTG is a sequence of beats labeled as premature or regular.

5.3.1 Example of BGMM

Fig. 5.6 illustrates an example of a BGMM for PAC after the analysis of 2500 PAC beats from the MIT-BIH dataset [21]. Table 5.2 shows the transition probability matrix. Table 5.3 lists, for each state of the BGMM, the amplitude measured in millivolts, the duration measured in milliseconds, the correlation coefficient between amplitude and duration, and the resulting bivariate distribution.


Figure 5.6 BGMM for PAC

Table 5.2 Transition probability matrix

From \ To   ISO1   P      ISO2   Q      R      S      ISO3   T
ISO1        0      0.98   0.02   0      0      0      0      0
P           0      0      0.54   0.36   0.10   0      0      0
ISO2        0      0      0      1      0      0      0      0
Q           0      0      0      0      1      0      0      0
R           0      0      0      0      0      1      0      0
S           0      0      0      0      0      0      1      0
ISO3        0.05   0      0      0      0      0      0      0.95
T           1      0      0      0      0      0      0      0

Table 5.3 Bivariate distribution for each state in BGMM of PAC

State   Amplitude (mV)   Duration (msec)   Correlation coefficient (ρ)   Bivariate distribution
P       -0.25            78                +0.85                         0.86
Q       -0.16            12                +0.75                         0.78
R       1.32             45                -0.80                         0.80
S       -0.12            26                +0.59                         0.59
T       0.25             100               -0.76                         0.63
ISO1    0.0              102               -0.56                         0.36
ISO2    0.0              69                -0.57                         0.30
ISO3    0.0              75                -0.20                         0.10

Since PAC is characterized by a different morphology of the P-wave, and not by the presence or absence of a P-wave, the BGMM has all the states and transitions (also see Table 5.2). Detailed information about the morphology of the P-wave is given in Table 5.3. For most PACs observed, the P-wave was negative. As the amplitude of a P-wave becomes more negative, the duration of the P-wave decreases, giving a positive correlation (0.85) between P-wave amplitude and duration. PAC is also characterized by a short or missing ISO2 when the ectopic location is closer to the AV-node. This can be seen from Table 5.3: the ISO2 duration is 69 milliseconds, which is significantly lower than the ISO2 duration observed in a healthy heartbeat (≥ 120 milliseconds). The transition probability P → ISO2 is only 0.54 because of missing ISO2 segments.
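To make the reading of Table 5.2 concrete, the matrix can be stored row by row (source state) and column by column (destination state). The following Python sketch is an illustration only; the helper names are hypothetical. It checks that every row sums to one and adds up the probability of leaving the P state without passing through ISO2.

```python
STATES = ["ISO1", "P", "ISO2", "Q", "R", "S", "ISO3", "T"]

# Rows are source states, columns are destination states (values from Table 5.2).
TRANSITIONS = [
    [0,    0.98, 0.02, 0,    0,    0, 0, 0],     # ISO1
    [0,    0,    0.54, 0.36, 0.10, 0, 0, 0],     # P
    [0,    0,    0,    1,    0,    0, 0, 0],     # ISO2
    [0,    0,    0,    0,    1,    0, 0, 0],     # Q
    [0,    0,    0,    0,    0,    1, 0, 0],     # R
    [0,    0,    0,    0,    0,    0, 1, 0],     # S
    [0.05, 0,    0,    0,    0,    0, 0, 0.95],  # ISO3
    [1,    0,    0,    0,    0,    0, 0, 0],     # T
]


def transition(src: str, dst: str) -> float:
    """Look up the probability of moving from state src to state dst."""
    return TRANSITIONS[STATES.index(src)][STATES.index(dst)]


def rows_are_stochastic(matrix, tol: float = 1e-9) -> bool:
    """Check that the outgoing probabilities of every state sum to 1."""
    return all(abs(sum(row) - 1.0) <= tol for row in matrix)


# Probability that a P-wave is followed directly by Q or R, skipping ISO2.
iso2_bypass = transition("P", "Q") + transition("P", "R")
```

With the values of Table 5.2, the bypass probability is 0.36 + 0.10 = 0.46, the complement of the 0.54 transition from P to ISO2.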

5.3.2 Example of BGTG

Fig. 5.7 illustrates a BGTG constructed from record 124 of the MIT-BIH dataset [21]. For the baseline segments ISO1 (TP-segment), ISO2 (PQ-segment) and ISO3 (ST-segment), zero durations imply that the corresponding state in the BGMM is bypassed. However, zero amplitude does not imply the absence of transitions between the ISO-states and the corresponding P-QRS-T states, because ISO-states ideally have zero amplitude in the ECG of a healthy heart. For non-zero durations in ISO line segments, the corresponding states are considered present, and transitions to and from those states are counted. With P-waves embedded in the T-wave, a baseline ISO1 is not observed and is considered missing. Table 5.4 shows the amplitude and duration of each waveform and baseline, along with the bivariate distribution obtained using the correlation coefficient for the observed beat. The missing entry for ISO2 amplitude and duration indicates a missing baseline.


Figure 5.7 BGTG constructed for a patient

Table 5.4 Bivariate distribution for each state of BGTG

State   Amplitude (mV)   Duration (msec)   Correlation coefficient (ρ)   Bivariate distribution
P       -0.23            80                0.82                          0.98
Q       -0.16            18                +0.81                         0.81
R       1.6              45                -0.90                         0.78
S       -0.2             22                +0.79                         0.83
T       0.19             120               -0.89                         0.92
ISO1    0.0              130               -0.62                         0.52
ISO2    absent           absent            absent                        absent
ISO3    0                80                -0.23                         0.19

5.3.3 BGTG and BGMM graph matching

BGTGs for patients are constructed from the ECG signal one beat at a time to monitor a patient in real-time [17, 18, and 19]. The problem of labeling a BGTG reduces to matching the BGTG with the probability-graphs corresponding to the BGMM of each subclass of premature beat and identifying the closest match.

For the BGTG constructed for each beat of the observed signal, graph matching is performed by multiplying two values: 1) the probability that the observed bivariate distribution of a state in the BGTG is produced by the corresponding state in the BGMM [33, 35], and 2) the probability that the state in the observed beat is generated by the BGMM based on transition probabilities, using the standard forward algorithm [36]. The BGTG is classified according to the BGMM with the highest likelihood. The algorithm for graph matching is given in Chapter 4 and is omitted here.
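The product of an emission term and a transition term can be illustrated with a small Python sketch. This is not the Chapter 4 algorithm: it is a simplified stand-in that assumes every state of the observed beat is already labeled, so it scores the single labeled path instead of summing over hidden paths as the forward algorithm does. The standard deviations, the second model named "OTHER", and all function names are hypothetical; only the PAC means, correlation coefficients and the P → Q transition are taken from Tables 5.2 and 5.3.

```python
import math


def bivariate_normal_pdf(x, y, mx, my, sx, sy, rho):
    """Density of a bivariate Gaussian with correlation coefficient rho."""
    zx, zy = (x - mx) / sx, (y - my) / sy
    one_minus_r2 = 1.0 - rho * rho
    q = (zx * zx - 2.0 * rho * zx * zy + zy * zy) / one_minus_r2
    return math.exp(-q / 2.0) / (2.0 * math.pi * sx * sy * math.sqrt(one_minus_r2))


def log_score(observed, emissions, transitions):
    """Sum of log(emission * transition) along the labeled state path."""
    total, previous = 0.0, None
    for state, amplitude, duration in observed:
        p = bivariate_normal_pdf(amplitude, duration, *emissions[state])
        if previous is not None:
            p *= transitions.get((previous, state), 0.0)
        if p <= 0.0:
            return float("-inf")
        total += math.log(p)
        previous = state
    return total


def classify(observed, models):
    """Return the name of the model with the highest score."""
    return max(models, key=lambda name: log_score(observed, *models[name]))


# Emission tuples: (mean amplitude mV, mean duration ms, sd amplitude, sd duration, rho).
# Standard deviations and the whole "OTHER" model are hypothetical.
MODELS = {
    "PAC": ({"P": (-0.25, 78.0, 0.05, 10.0, 0.85), "Q": (-0.16, 12.0, 0.05, 5.0, 0.75)},
            {("P", "Q"): 0.36}),
    "OTHER": ({"P": (0.15, 100.0, 0.05, 10.0, 0.85), "Q": (-0.16, 12.0, 0.05, 5.0, 0.75)},
              {("P", "Q"): 0.10}),
}
observed_beat = [("P", -0.23, 80.0), ("Q", -0.15, 13.0)]
label = classify(observed_beat, MODELS)
```

Working in log space avoids numerical underflow when many small probabilities are multiplied along the path.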

Example of beat annotations

The following beat classification was derived for a 39-beat sequence from record 136 of MIT-BIH using: 1) premature beat detection using integration of rule-based analysis, 2) embedded waveform detection, and 3) finer classification of premature beats using the BGMM approach.

푅 푅 푃푉퐶 푅 푃푉퐶 푅 푃푉퐶 푅 푃푉퐶 푅 푃푉퐶 푅 푅 푃푉퐶 푅 푃푉퐶 푅 푅 푃푉퐶 푅 푃퐴퐶 푅 푅 푃푉퐶 푅 푃푉퐶 푅 푃푉퐶 푅 푃푉퐶 푅 푃푉퐶 푅 푃푉퐶 푅 푃푉퐶 푅 푃푉퐶 푅

Multiple types of premature contractions such as PVC and PAC are present in the same sequence. The sequence shows the presence of PVC-bigeminy of the form … R PVC R PVC …. Single patterns … R R PVC … and … R R PAC … are not treated as trigeminy due to the ambiguity with bigeminy.

5.4 Identifying beat-pattern

After the beat-classification, the labeled sequence of beats with premature beat subclass and regular sinus beats is further analyzed using a look-ahead pattern-analysis algorithm developed by me to distinguish between bigeminy, trigeminy and quadrigeminy.


The input for the algorithm is a sequence of labeled beats. The output from the algorithm is a sequence of 5-tuples of the form (beat-type, beat-pattern type, start of the beat-pattern, end of the beat-pattern, count of beats in the beat-pattern).

To avoid ambiguity between beat-patterns, there must be at least two consecutive repeats of a beat-pattern of the same type. The overall algorithm is based upon: 1) skipping regular sinus beats 'R' until a premature beat is identified, and 2) using a lookahead of beat values in the beat sequence: four for quadrigeminy, three for trigeminy and two for bigeminy.

The conditions for the premature beat-patterns are given by equations 5.5, 5.6, and 5.7. The notation '⋀' denotes logical-AND. The index i of the first occurrence of a premature beat is treated as 0.

Quadrigeminy: B_{i-3} … B_{i} = 'RRRp' ⋀ B_{i+1} = 'R' ⋀ B_{i+2} = 'R' ⋀ B_{i+3} = 'R' ⋀ B_{i+4} = 'p' ⋀ i - index(last beat-pattern end) ≥ 4 (5.5)

Trigeminy: B_{i-2} … B_{i} = 'RRp' ⋀ B_{i+1} = 'R' ⋀ B_{i+2} = 'R' ⋀ B_{i+3} = 'p' ⋀ i - index(last beat-pattern end) ≥ 3 (5.6)

Bigeminy: B_{i-1} … B_{i} = 'Rp' ⋀ B_{i+1} = 'R' ⋀ B_{i+2} = 'p' ⋀ i - index(last beat-pattern end) ≥ 2 (5.7)


5.4.1 Nondeterministic automata

Each of the beat-pattern equations discussed above is modeled as a nondeterministic automaton to capture the lookahead technique. For each pattern in Fig. 5.8, the first P observed from left to right is the current beat, and the next P or R beats are the look-ahead beats used to classify the observed pattern.

Figure 5.8 Nondeterministic automata for beat-pattern analysis (panels: bigeminy, trigeminy, quadrigeminy)

Example of beat-pattern analysis

The indexed version of the example has been given below in Fig. 5.9 for convenience. It is a sequence of pairs of the form (index-value, beat-type).


< (0, R) (1, R) (2, PVC) (3, R) (4, PVC) (5, R) (6, PVC) (7, R) (8, PVC) (9, R) (10, PVC)
(11, R) (12, R) (13, PVC) (14, R) (15, PVC) (16, R) (17, R) (18, PVC) (19, R) (20, PAC)
(21, R) (22, R) (23, PVC) (24, R) (25, PVC) (26, R) (27, PVC) (28, R) (29, PVC) (30, R)
(31, PVC) (32, R) (33, PVC) (34, R) (35, PVC) (36, R) (37, PVC) (38, R) >

Figure 5.9 Indexed sequence of beat-pattern

The output sequence is: <(푃푉퐶, 푏푖푔푒푚푖푛푦, 1, 10, 5), (푃푉퐶, 푏푖푔푒푚푖푛푦, 12, 15, 2), (푃푉퐶, 푏푖푔푒푚푖푛푦, 22, 37, 8) >
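The look-ahead conditions of equations 5.5 to 5.7 can be exercised on the example sequence with a short Python sketch. It is a simplified reading of the approach, not the algorithm of Fig. 5.12: the function names are hypothetical, patterns are tried in the order quadrigeminy, trigeminy, bigeminy, and a pattern is reported only when at least two consecutive repeats with the same premature beat type are found.

```python
from typing import List, Optional, Tuple

# Period k of each pattern: (k - 1) regular beats followed by one premature beat.
PATTERNS = (("quadrigeminy", 4), ("trigeminy", 3), ("bigeminy", 2))
PatternTuple = Tuple[str, str, int, int, int]


def matches(beats: List[str], i: int, k: int, last_end: Optional[int]) -> bool:
    """Look-ahead condition for period k at premature beat index i."""
    if i - (k - 1) < 0 or i + k >= len(beats):
        return False
    if last_end is not None and i - last_end < k:
        return False
    before = all(b == "R" for b in beats[i - k + 1:i])
    ahead = all(b == "R" for b in beats[i + 1:i + k])
    return before and ahead and beats[i + k] == beats[i]


def find_beat_patterns(beats: List[str]) -> List[PatternTuple]:
    """Return (beat-type, pattern, start, end, count) tuples."""
    found: List[PatternTuple] = []
    last_end: Optional[int] = None
    i = 0
    while i < len(beats):
        if beats[i] == "R":          # skip regular sinus beats
            i += 1
            continue
        for name, k in PATTERNS:
            if matches(beats, i, k, last_end):
                start, count, j = i - (k - 1), 2, i + k
                # Extend while the same unit keeps repeating.
                while (j + k < len(beats)
                       and all(b == "R" for b in beats[j + 1:j + k])
                       and beats[j + k] == beats[i]):
                    j += k
                    count += 1
                found.append((beats[i], name, start, j, count))
                last_end = j
                i = j
                break
        i += 1                       # move past the current premature beat
    return found


SEQUENCE = ("R R PVC R PVC R PVC R PVC R PVC R R PVC R PVC R R PVC R PAC "
            "R R PVC R PVC R PVC R PVC R PVC R PVC R PVC R PVC R").split()
patterns = find_beat_patterns(SEQUENCE)
```

On the 39-beat example, this sketch yields the same three tuples as the output sequence listed above.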

5.5 Overall approach

The overall approach of premature beat detection using rule-based analysis and finer classification using BGMM involves a training phase and a dynamic phase.

5.5.1 Training phase

In the training phase, rules based on RR-intervals that separate premature beats from regular beats were derived. For finer classification of premature beats, a BGMM is trained for each subclass of premature beat. For embedded waveform analysis, the average area of the T-wave is stored. Information collected in the training phase is used in the dynamic phase for real-time classification of premature beats.

5.5.2 Dynamic phase

Dynamic analysis is executed on the observed ECG to classify subclasses of premature beats and irregular beat-patterns in real-time, using information collected from the observed signal, the training phase and the classifiers (rule-based system and BGMMs). In this phase, the observed ECG signal is preprocessed to remove noise and extract features using the method described in chapter 2. The first window of a limited number of beats of the ECG signal is analyzed for each patient. This initial window analysis is used to obtain the following statistical information about the features of the waveforms:

1) Number of beats and individual waveforms in the window;

2) Mean, median, minimum and maximum of the amplitude and duration of each type of waveform and ISO-lines for bivariate distribution calculation; and

3) Average values of RR-intervals, average value of Q-wave amplitudes, and average value of PR-intervals for rule-based analysis.

The window length selected for initial window analysis was chosen to be 30 seconds. The collected information in initial window analysis is used for two-level classification of premature beats. For the rule-based classification, a window of four consecutive beats is analyzed in real-time to classify the second beat in window as a premature or normal beat. Each RR-interval is analyzed based on the rules to detect embedded premature beat. The number of beats chosen for BGTG construction is 20 for each BGTG.

The window-size for the initial window analysis and the number of beats per BGTG are based on ROC criteria, compared against classification-error analysis. The window-size and the number of beats in BGTGs were plotted against the classification results, and the values that produced the highest sensitivity and the lowest false-positive rate were chosen. The overall approach is given in Fig. 5.10.

Figure 5.10 Overall approach

5.6 Premature beat detection algorithms

5.6.1 Rule-based beat detection algorithm

The input to the algorithm is the set of features extracted in the feature-extraction stage, in the form Pamplitude, Pduration, Qamplitude, Qduration, …, such that Pamplitude is the amplitude of the P-wave, Pduration is the duration of the P-wave, etc. The algorithm also takes as input the average features obtained in the training phase and the initial window analysis, in the form Paverage-amplitude, Qaverage-amplitude, Tarea-average, RRaverage, …, such that RRaverage denotes the average value of RR-intervals, Tarea-average denotes the average area of T-waves, and Qaverage-amplitude denotes the average of Q-wave amplitudes. The algorithm is described in Fig. 5.11. The algorithm for matching BGTG and BGMM is similar to the algorithm for graph matching described in Chapter 4, and has been omitted in this chapter.

Algorithm: Premature beat detection
Input: 1. Features in the form Pamplitude, Qamplitude, …
       2. Average features for beat: Paverage-amplitude, Qaverage-amplitude, Tarea-average, RRaverage
Output: Beat classification P or R

Do until end of beat sequence is reached, for each window
{   let RR1, RR2, RR3 be the intervals considered for beats b1, b2, b3, b4
    // check rule 1 to detect premature beat
    if ((RR2 > 1.13 RR1) ⋀ (RR2 > 1.15 RR3) ⋀ (1.15 RRaverage < RR2 < 1.8 RRaverage))
        Mark b2 P;   // mark second beat as premature beat
    else
    {   // check rule 2 for embedded beat in RR2
        if (RR2 ≥ 1.8 RRaverage)
        {   // check rule 3 to detect embedded P-wave for B-PAC
            if ((Tarea > 1.38 Tarea-average) ⋀ (Qamplitude-average > 2.4 negative-peak within RR2))
            {   Mark bnew B-PAC;              // mark the new beat as B-PAC
                Pamp = Paverage-amplitude; }  // assign average features
            // check rule 4 to detect embedded R-wave
            else if ((Tarea > 1.38 Tarea-average) ⋀ (negative-peak within RR2 ≥ 2.4 Qamplitude-average))
            {   Mark bnew PVC; positive peak = Rnew; }
        }
        else Mark b2 R;   // if none of the rules are satisfied, beat is marked as regular beat
    }
}

Figure 5.11 Algorithm to classify premature beat


5.6.2 Algorithm to identify beat-pattern

A lookahead-based algorithm to identify beat-pattern subclassification is given in Fig. 5.12. The algorithm uses three Boolean variables, quadBool, triBool, and biBool, for checking the first occurrence of quadrigeminy, trigeminy and bigeminy, respectively. The variables start and end describe the start of the current beat-pattern and the end of the last beat-pattern, respectively. The variable count stores the number of repeats in one beat-pattern. The variable PValue holds the type of the premature beat associated with the beat-pattern.

The algorithm begins by assuming quadBool, triBool and biBool to be true. The assumption is used to eliminate possible sub-patterns observed for quadrigeminy, trigeminy and bigeminy. For instance, the pattern R R R p is ambiguous, suggesting the possibility of all three beat-patterns: quadrigeminy, trigeminy or bigeminy. If the quadrigeminy condition is not met, then quadBool is assigned false, and trigeminy and bigeminy are explored. Similarly, after trigeminy is ruled out, the possibility of bigeminy is explored.

The stream of labeled beats is traversed until the first premature beat is observed. The subclass of the premature beat is stored in the variable PValue. To check the presence of quadrigeminy, equation 5.5 is evaluated. The algorithm checks the three beats preceding the current premature beat for the pattern R R R p. It also checks the next four beats to find the second occurrence of the quadrigeminy pattern using the lookahead strategy. If equation 5.5 is true, and PValue is the same for the preceding and succeeding beat-patterns, then the presence of the pattern is stored in the beat-pattern sequence Bp. If one of the conditions is false, then quadBool is assigned false and the possibility of trigeminy and bigeminy is analyzed in a similar way.

Algorithm identify-beat-pattern

Input: Sq - sequence of beats in a 30-second window, labeled with the type of premature contraction
Output: Bp - a beat-pattern sequence of 5-tuples of the form (beat-class, beat-pattern-type, start, end, count)

start = 0; end = 0; i = 0; Bp = < >;    // initialize the beat-pattern sequence to empty
N = length(Sq);                          // N is the number of beats in the input Sq

Label 1:
quadBool = true; triBool = true; biBool = true;
while (i < N and Sq[i] == 'R') i++;      // skip to the next premature beat
if (i ≥ N) return (Bp);                  // exit when the end of the sequence is reached
PValue = Sq[i];                          // get the specific beat-type

// check for quadrigeminy
if (quadrigeminy condition in equation 5.5 is true) {
    quadBool = true; triBool = false; biBool = false;
    start = i - 3; count = 2; i = i + 4;
    while (quadBool is true and quadrigeminy condition in equation 5.5 is true)
        { count = count + 1; i = i + 4; }    // iterate for similar beat-patterns
    end = i; quadBool = false; triBool = true; biBool = true;
    Bp = Bp + (PValue, quadrigeminy, start, end, count);
    Go to Label 1; }
// check for trigeminy
elseif (trigeminy condition in equation 5.6 is true) {
    triBool = true; biBool = false;
    start = i - 2; count = 2; i = i + 3;
    while (triBool is true and trigeminy condition in equation 5.6 is true)
        { count = count + 1; i = i + 3; }    // iterate for similar beat-patterns
    end = i; triBool = false; biBool = true;
    Bp = Bp + (PValue, trigeminy, start, end, count);
    Go to Label 1; }
// check for bigeminy
elseif (bigeminy condition in equation 5.7 is true) {
    biBool = true;
    start = i - 1; count = 2; i = i + 2;
    while (biBool is true and bigeminy condition in equation 5.7 is true)
        { count = count + 1; i = i + 2; }    // iterate for similar beat-patterns
    end = i; biBool = false;
    Bp = Bp + (PValue, bigeminy, start, end, count);
    Go to Label 1; }
// no beat-pattern involves this premature beat; advance so that the scan terminates
else { i = i + 1; Go to Label 1; }

Figure 5.12 Algorithm to identify beat-pattern

5.7 Discussion

In this chapter, we discussed finer premature beat classification using rule-based analysis and a BGMM-based method. Issues in premature beat detection due to superimposition of waveforms were addressed by integrating rule-based analysis with embedded waveform analysis. Embedded waveform analysis is based on the area subtraction technique discussed in chapter 4. The integration of rule-based analysis and embedded waveform analysis is used to detect premature beats and regular sinus beats.

Finer classification of premature beats based on ectopic location was done using the BGMM-based method. BGMM-based classification includes the bivariate distribution of both amplitude and duration of waveforms and baselines, along with the transition probabilities between them. The classification was done by constructing a BGTG for each premature beat detected and then matching it with all BGMMs for premature beats to obtain the BGMM with the highest likelihood, computed using the forward algorithm.


Parallel Processing Using CUDA Enabled GPU

The demand for an improved healthcare system requires the development of information technology, and one area of opportunity is wearable smart monitoring devices [8, 9]. Advances in microelectronics have provided smaller, faster and more affordable embedded platforms for personal monitoring systems, such as the NVIDIA Jetson GPU [67]. Most of these wearable biomedical systems can detect a variety of abnormalities, such as stress, oxygen saturation level, and limited cases of ischemia and arrhythmias, but only with limited accuracy.

ECG signal analysis for real-time detection of abnormalities involves computationally expensive modules such as signal denoising, morphological and temporal feature-extraction, complex functional transforms [13, 15], and computational intelligence techniques for classification and machine learning. The AI techniques include the use of PCA (Principal Component Analysis) [35], neural networks [79, 80, 81], and Markov models [10, 11, 12, 13]. The computational overhead of these techniques is significant and conflicts with the basic requirement that resource-limited smart wearable devices diagnose abnormalities accurately in real-time.

Recent developments in Graphics Processing Units (GPUs) facilitate faster execution of these modules by accelerating computations without losing accuracy. In recent years, several researchers have exploited GPU-based SIMT (Single Instruction Multiple Threads) parallelism to improve the computational efficiency of automated ECG analysis by developing concurrent algorithms for denoising [68], wavelet transform [68, 69], and classification [70, 71], separately. However, limited research has been conducted on accelerating arrhythmia classification through concurrent execution of all modules on a GPU for real-time monitoring of ECG.

The focus of this chapter is to develop parallel versions of the algorithms that exploit SIMT parallelism on a GPU. The parallel algorithms perform: 1) concurrent analysis of ECG for the arrhythmia-subclassification, and 2) concurrent analysis of the ECG for the premature beat-subclassification.

6.1 GPU and CUDA architecture

A CUDA-based GPU has multiprocessor cores, and acts as a coprocessor to the main CPU. CUDA (Compute Unified Device Architecture) supports SIMT parallelism by spawning a high number of concurrent threads on different sets of data elements in compute-intensive applications [23, 73, 74, and 75]. Groups of concurrently executing threads, called blocks, are assigned to streaming multiprocessors (SMs) using a grid architecture, as shown in Fig. 6.1.

The threads in a block communicate with each other using low latency shared memory. The threads are automatically allocated to CUDA cores to exploit concurrency and balance the load; the programmer has no control over this allocation. Distribution of data on SMs for exploiting concurrency is also automated, and cannot be specified by the programmer. The spawning of multiple blocks enhances the chance of concurrent utilization of multiple SMs by mapping different blocks on different SMs [74].

Figure 6.1 A CUDA architecture

Each SM has multiple CUDA cores that comprise ALUs, FPUs (Floating-Point Units), load/store units and registers. These cores are assigned automatically by the SM scheduler to balance the load. The GPU supports high latency global memory to share information between CPU and GPU, short latency constant memory that cannot be altered during a thread's execution [73, 74, and 75], limited on-chip shared memory, and local memory. Global memory is also used to share information across SMs.

Three major issues in exploiting GPU-based parallelism are: 1) mismatch of the latency time of different memories, 2) mismatch in the data transfer rate between CPU and GPU, and 3) low data transfer rate between streaming multiprocessors (SMs) within GPUs. Thus, optimized task distribution needs to be performed, so that faster memory accesses in GPUs are exploited without excessive data transfer through the slower global memory. In addition, the accuracy of the diagnosis has to be maintained while distributing the beats across SMs in the GPU based on statistical analysis.

The simplest approach to solve the first issue is to have thread blocks use shared memory, which is read/write memory with the lowest latency. However, shared memory is of limited size, and only threads within the same block can use it. For concurrent blocks scheduled on multiple SMs, constant memory, which has the next-lowest latency, is used. However, constant memory cannot be altered during execution. If blocks across SMs need to update information concurrently, then global memory is used.

In this research, the first issue is solved by using a combination of shared and constant memory whenever updates across blocks on multiple SMs are not required. To address the mismatch in data transfer rates, the CPU performs real-time ECG collection and spawns the data analysis, while SIMT parallelism is exploited in the GPU for the beat-level analysis.

Figure 6.2 Timing analysis for arrhythmia classification


6.2 Parallelization of Arrhythmia Subclasses using CUDA enabled GPU

6.2.1 Dependency analysis for arrhythmia classification

Arrhythmia subclassification performed in real-time involves sequential execution of the following modules: preprocessing module; GMM-based beat-level classification module; embedded wave detection module; BGTG construction module from a group of beats; graph matching module for the labeling.

Table 6.1 Average processing time for modules

Module                    Processing time (ms)
Initial window analysis   650
Preprocessing             950
Embedded wave             200
GMM                       300
BGTG construction         2800
Graph matching            3200

Figure 6.2 shows the various modules and their execution time. Table 6.1 shows the average processing time required for each module. Based on the values in Table 6.1, the most time-consuming modules, in decreasing order of time consumption, are: graph matching - 39%, BGTG construction - 35%, preprocessing - 12%, and initial window analysis - 8%.
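The share of each module in the total processing time follows directly from Table 6.1. The short Python sketch below recomputes the shares from the tabulated values; the variable names are illustrative.

```python
# Average processing time per module in milliseconds (values from Table 6.1).
times_ms = {
    "Initial window analysis": 650,
    "Preprocessing": 950,
    "Embedded wave": 200,
    "GMM": 300,
    "BGTG construction": 2800,
    "Graph matching": 3200,
}

total_ms = sum(times_ms.values())
shares = {module: 100.0 * t / total_ms for module, t in times_ms.items()}

# Modules ordered from most to least time-consuming.
ranking = sorted(shares, key=shares.get, reverse=True)
for module in ranking:
    print(f"{module:<24} {shares[module]:5.1f}%")
```

The two graph-related modules together account for roughly three quarters of the total, which is why they are the main targets for parallelization in the rest of the chapter.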


6.2.2 Types of SIMT parallelism in ECG analysis

There is inherent dependency between various high-level modules. However, two types of SIMT parallelism can be exploited in ECG analysis: 1) parallelism within the module for beat-level analysis, and 2) parallelism in the analysis of a group of beats.

Beat-level parallelism exploits the same instruction executed on different components of the same beat for performing finer beat level analysis. For beat level granularity, same SM is used to perform concurrent executions using shared memory to load and store information. Feature-extraction, embedded wave analysis, P-wave resolution, and BGTG graph construction modules require beat-level analysis and shared memory to merge the data from individual beat analysis. Preprocessing, embedded waveform analysis, P-wave resolution, GMM-based classification, and BGTG construction modules can exploit beat-level parallelism within the same set of SMs.

Another level of parallelism is based on executing the same instruction on different groups of beats. For this level of granularity, different SMs are used to perform analysis using global memory to collect and store information. Graph matching module matches one BGTG, which is constructed for a group of beats, with multiple BGMMs. Hence, graph matching module requires concurrent analysis of group of beats on different SMs.

6.2.3 Feature-extraction

Feature-extraction has two functions: 1) identification of the waveforms; and 2) extraction of amplitude and duration of each waveform and ISO line. The first task does not have any prior knowledge about the waveforms, and has eight subtasks: 1) R-wave extraction; 2) Q-wave extraction; 3) S-wave extraction; 4) zero-crossing detection to get the ISO2 baseline; 5) zero-crossing detection to get the ISO3 baseline; 6) P-wave extraction; 7) T-wave extraction; and 8) ISO1 extraction using knowledge of P- and T-waves (see Fig. 6.3).

Figure 6.3 Dependency graph for feature-extraction

There is a task-dependency in identifying the beats. The R-wave is identified first, followed by two task chains: (Q-wave detection → zero-crossing to get ISO2 → P-wave detection) and (S-wave detection → zero-crossing to get ISO3 → T-wave detection). After the detection of the P-wave and T-wave, ISO1 is identified.

Beat-detection and feature-vector analysis are done in one block to exploit low-latency shared memory. Based upon the estimate of the R-waveform count obtained in the initial window analysis, an equal number of threads is spawned for the concurrent detection of multiple R-waveforms using barrier-based synchronization (see chapter 2).

After detecting R-waveforms, two sets of concurrent threads are spawned to detect and derive features of (Q-wave, ISO2, P-wave) and (S-wave, ISO3, T-wave) respectively.

The number of threads spawned in each set is equal to number of detected R-waves. After detecting the waveforms, one thread is spawned to identify all ISO1 lines in the sample.

After the feature-extraction, feature data is transferred to global memory.
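The dependency structure of the eight feature-extraction subtasks determines which of them can run concurrently. The Python sketch below groups the subtasks into stages using the standard-library `graphlib` module (Python 3.9 or later). It assumes that the second chain starts with S-wave detection, as in the (S-wave, ISO3, T-wave) thread set described above; the function name `parallel_stages` is hypothetical.

```python
from graphlib import TopologicalSorter

# Each subtask maps to the set of subtasks it must wait for.
DEPENDENCIES = {
    "R": set(),
    "Q": {"R"},
    "S": {"R"},
    "ISO2": {"Q"},
    "ISO3": {"S"},
    "P": {"ISO2"},
    "T": {"ISO3"},
    "ISO1": {"P", "T"},
}


def parallel_stages(dependencies):
    """Group subtasks into stages; subtasks in one stage are mutually independent."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())   # everything whose prerequisites are done
        stages.append(ready)
        sorter.done(*ready)
    return stages


stages = parallel_stages(DEPENDENCIES)
```

Each stage corresponds to a point where a barrier would be placed: the two chains advance side by side, and ISO1 extraction waits for both.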


6.2.4 Embedded waveform detection

Embedded waveform detection, which detects a P-wave embedded in the QRS-complex, involves two tasks: 1) calculation of the area of the observed QRS-complex; and 2) comparison of the average area of the QRS-complex with the observed area of the QRS-complex to detect an embedded P-wave. The inputs to this task are the average QRS-complex area and the beats with the detected features, stored in the global memory of the GPU. To detect embedded waveforms, threads are spawned such that each thread is responsible for detecting an embedded waveform in one beat and updating the feature vector for the beat by assigning average P-wave features. For an observed window of 30 seconds for real-time classification (containing around 120 beats), one block of 120 threads is sufficient to perform this calculation.

Figure 6.4 Dependency graph for embedded waveform detection

6.2.5 GMM-based classification

As discussed in Chapter 4, the EM algorithm is executed for parametric estimation of the GMM. The EM algorithm is compute-intensive, and takes the output of the feature-extraction stage and the parameters obtained in the training phase as input. At the end of every iteration, new estimates are formed, which update the existing parameters. Each iteration of EM clustering is executed on the GPU until convergence.

GMM parameter estimation involves the calculation of mixture-coefficients, mean and covariance for three Gaussian mixtures. The EM algorithm uses five sequential steps to obtain these parameters, as shown in Fig. 6.5. The first step calculates the probability-density for each beat. The second step uses the probability-density to perform the expectation step of the EM algorithm, finding the probability that each beat belongs to each cluster. It also performs the maximization step, and finds the mixture coefficients and mean for each mixture. The third step estimates maximization by calculating the mean for each mixture. The fourth step calculates the variance for each beat based on eight dimensions. The fifth step calculates the covariance for each beat and the three Gaussian components.

Figure 6.5 Dependency graph for GMM parameter estimation


Figure 6.6 Concurrent thread execution approach for GMM-based classification

For each iteration, multiple threads are concurrently spawned such that each thread performs five sequential steps for one beat as shown in Fig. 6.6. For a window of 30 seconds, approximately 120 beats are analyzed for real time classification by executing a block of 120 threads. Six iterations are required until convergence is obtained, based on the training data.

6.2.6 BGTG formation

For each window of 30 seconds (120 beats) observed in real time, six BGTGs are constructed, each from a group of 20 beats. To construct one state of a BGTG, two tasks are executed: 1) computing the joint probability of amplitude and duration; and 2) computing the transition probability, as shown in Fig. 6.7. Since the calculations for each state of a BGTG are independent of the other states, multiple thread blocks are spawned concurrently. Threads performing calculations for a BGTG are synchronized using a barrier.


Figure 6.7 Dependency graph for BGTG formation

6.2.7 BGMM and BGTG graph matching

The concurrent graph matching algorithm has three sequential tasks: 1) denoising the BGTGs and obtaining a set of potential matching BGMMs for each BGTG; 2) computing the most probable path (MPP) in each BGTG; and 3) matching BGTG-BGMM pairs using the set of potential BGMMs for each BGTG.

The first task is performed by spawning multiple thread blocks concurrently, such that each block works on one BGTG and each thread in a block works on one state of that BGTG.

The second task, obtaining the MPPs for the BGTGs in a window, is achieved by spawning multiple thread blocks such that each block works on one BGTG, and each thread works on one state of the BGTG to find the outgoing edge with the maximum transition probability.

The graph matching task takes the output of the first two tasks as input and matches BGTG-BGMM pairs by spawning multiple thread blocks to exploit SIMT parallelism. Each block takes a BGTG-BGMM pair as input and stores it in its shared memory. Each thread in a block compares one state of the BGTG-BGMM pair. The results obtained by the threads in a block are synchronized. The output of the graph matching task is the probability of each BGTG belonging to each BGMM. The results for the pairs are passed back to the CPU, which classifies each BGTG with the BGMM that has the highest probability.

The concurrent approach for the three sequential tasks, in terms of block execution, is given in Fig. 6.8. The block execution illustrates the task implemented by each block and by each thread in the block. For instance, the task of cleaning one BGTG is implemented by one block, and each thread in the block works on a single state of that BGTG.

Figure 6.8 Concurrent graph matching tasks

6.2.8 Parallelization of arrhythmia subclassification

The overall approach to exploiting SIMT parallelism for arrhythmia classification is:

1. Block-level parallelism for denoising and waveform extraction, obtained by dividing the data into equal time-slots;

2. SIMT parallelism at the beat level for the amplitude and duration analysis, embedded wave detection, GMM-based classification using EM, and BGTG construction; and

3. Concurrent matching of BGTG-BGMM pairs by spawning multiple blocks, one for each BGTG-BGMM pair.

The overall approach for the parallel version of arrhythmia classification is illustrated in Fig. 6.9.

[Figure 6.9 flowchart with CPU and GPU panels. Stages shown: ECG data collection of 30 sec.; initial window analysis; preprocessing (noise removal, feature extraction); embedded waveform detection; EM to estimate GMM parameters (basis, prior, mean, variance and covariance matrix construction, with convergence criteria for the GMM); beats labeled S, V or N; transition graph construction; graph matching (concurrent BGMM pruning, concurrent MPP, concurrent maximum probability) yielding BGTG-BGMM pairs with probabilities; selection of the maximum probability for each BGTG to classify the group of beats.]

Figure 6.9 Overall parallel approach for arrhythmia subclassification


Before the concurrent processing of windows starts, the initial window analysis is performed sequentially on the CPU, using 30 seconds of ECG signal, to estimate initial statistical information about the features of the waveforms from a limited sample size. The analyzed features are: 1) the number of beats and individual waveforms in the window; and 2) the mean, median, minimum and maximum of the amplitude and duration of each type of waveform and of the ISO-lines. The obtained information is transferred from the CPU to the GPU, where it is used to spawn multiple threads during the concurrent analysis in subsequent modules. This information is stored in the global memory and the constant memory of the GPU to be used by the corresponding SMs.

The noise removal submodule takes a window of 30 seconds of raw ECG signal as input and has no knowledge of the waveforms. It performs convolution, low-pass filtering, and high-pass filtering. Because none of these operations depends on waveform boundaries, each window is divided equally among multiple blocks (> 6 sec. per window in our case).

After noise removal, the signal is input to the feature-extraction module. Since the data is already present on the GPU, there is no data transfer overhead. After the feature-extraction phase, each beat is analyzed further for embedded waveform detection on the GPU, and the updated beats are stored in global memory. Next, each detected beat in global memory is classified as a ventricular, supraventricular or regular sinus beat using the GMM.

GMM parameters are estimated using the EM algorithm by sequentially launching five kernels for each iteration on the GPU. Classified beats are stored on the GPU for BGTG construction. Next, each BGTG is matched with the BGMMs on the GPU by spawning multiple threads in a block, such that each block is responsible for matching one BGTG-BGMM pair. The result obtained from each block is the probability calculated for its BGTG-BGMM pair.

The result is passed back to the CPU, where the pair with the maximum probability is selected, and the BGTG is classified with the respective arrhythmia subclass.

6.3 Parallelization of Arrhythmia Algorithms

This section describes the algorithms for the major concurrent tasks: 1) concurrent denoising and feature-extraction; 2) concurrent embedded-wave detection; 3) concurrent GMM parameter estimation using EM for beat classification; 4) concurrent BGTG construction; 5) concurrent MPP (most probable path) detection; and 6) concurrent matching.

6.3.1 Parallel algorithm for feature-extraction

To execute this kernel function, 30 seconds of data is divided into a number of blocks, each corresponding to a set of beats, based on the average beat area calculated in the initial window analysis. Within each block, the data is divided among threads. Noise removal and R-wave detection, with amplitude and duration, are performed by the threads concurrently.

A barrier is used to wait for all R-wave detection threads to finish. Next, the data is divided into two chunks: the left of the R-wave (R-wave – Δ) and the right of the R-wave (R-wave + Δ), which are processed by multiple threads concurrently. The symbol 'Δ' denotes 500 milliseconds of ECG signal on both the left and the right of an R-wave.
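The split around each R-wave can be illustrated with a small CPU-side sketch. It is not the dissertation's kernel code: the function name `r_wave_windows`, the half-open index convention, and the clipping at the signal boundaries are assumptions made for illustration. Only Δ = 500 ms comes from the text.

```python
def r_wave_windows(r_index, fs_hz, n_samples, delta_ms=500):
    """Return the left and right sample ranges around one R-wave.

    Each range is a half-open (start, end) pair of sample indices,
    clipped to the valid range [0, n_samples). The left range would be
    searched for Q-wave, ISO2 and P-wave; the right range for S-wave,
    ISO3 and T-wave.
    """
    # Convert the 500 ms span into a number of samples.
    delta = int(round(fs_hz * delta_ms / 1000.0))
    left = (max(0, r_index - delta), r_index)
    right = (r_index + 1, min(n_samples, r_index + 1 + delta))
    return left, right


if __name__ == "__main__":
    # An R-wave at sample 1000 in a signal sampled at 360 samples per second.
    print(r_wave_windows(1000, 360, 5000))
```

In a GPU version, each (left, right) pair would be handed to a pair of threads; here the function only computes the index ranges those threads would work on.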

Each of the left-side threads extracts the features of one corresponding Q-wave, ISO2 and P-wave. Similarly, each of the right-side threads extracts the features of one corresponding S-wave, ISO3 and T-wave. The final data is transferred to the global memory. The detailed algorithm is given in Fig. 6.10.

Algorithm: Concurrent denoising and feature extraction
Input: ECG signal, D6 wavelet
Output: denoised beats with features extracted
{ //Execute grid of blocks on GPU for window of 30 sec.
  forall block1 : blockn //dispatch 5 second window to block
  { forall threads T1 : Tm { //denoising and R-wave detection
      spawn Ti for denoising and R-wave detection;
      store derived information in memory, and wait;
      end barrier; }
    count number of R-waves from memory. Let it be k; }
  Co-begin
    forall threads T1 : TK {
      spawn Ti to detect and store Q-wave → ISO2 → P-wave;
      store derived information in memory;
      terminate if distance > R-wave-location – Δ;
      wait end barrier; }
    forall threads TK+1 : T2*K {
      spawn Ti to detect and store S-wave → ISO3 → T-wave;
      store derived information in memory;
      terminate if distance > R-wave-location + Δ;
      wait end barrier; }
  Co-end
  calculate ISO1 based on P-wave and Q-wave;
  store ISO1 information }

Figure 6.10 Algorithm for concurrent denoising and feature-extraction


6.3.2 Algorithm to detect embedded waveform

A kernel function with a grid of six blocks is launched on one SM. Each block spawns multiple threads such that each thread works on one beat. To detect an embedded wave, each block obtains the average-area information from the initial window analysis and the features calculated for each beat in the previous module. This information is stored in the shared memory of the block. If the P-wave of a beat is missing, the area difference is checked against the threshold, and the average features are assigned for the missing waveform when the threshold is exceeded. Otherwise, the unchanged features are passed back to global memory.

Detailed algorithm for kernel function is given in Fig. 6.11.

Algorithm: Concurrent embedded waveform detection
Input: Set of labeled beats S = {$b_1^S$, $b_2^S$, $b_3^S$, $b_4^S$, …}
       Set of extracted feature vectors [Pamplitude, Pduration, Qamplitude, Qduration, …]
       Average features for beats: QRSaverage-amplitude, Paverage-amplitude, Paverage-duration
Output: updated-beats-and-features
{ //Execute grid of blocks on GPU
  forall block1 : blockm //execute m concurrent blocks with 20 beats/block
    forall T1 : Tm // each thread works on one beat
    { if (Pamplitude == 0 && Pduration == 0) { //if P-wave is missing
        QRSarea = calculate the QRS-complex area using Simpson's rule
        diff = QRSarea – QRSaverage-area //perform subtraction
        if (diff > 0.28) { //mark P-wave present and embedded; assign average features for P-wave
          Pamplitude = Paverage-amplitude, Pduration = Paverage-duration
          update vector for beat features } }
      //pass results back to global memory } }

Figure 6.11 Algorithm for concurrent embedded waveform detection
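The per-beat work of Fig. 6.11 can be sketched sequentially as follows. This is a minimal illustration rather than the author's CUDA kernel: the function names, the dictionary keys `p_amplitude` and `p_duration`, and the choice of composite Simpson's rule over equally spaced samples are assumptions. The threshold 0.28 is taken from the figure, whose units are not stated in this section.

```python
def simpson_area(samples, dx):
    """Composite Simpson's rule over equally spaced samples.

    Requires an odd number of samples (an even number of intervals).
    """
    n = len(samples)
    if n < 3 or n % 2 == 0:
        raise ValueError("Simpson's rule needs an odd number of samples (>= 3)")
    odd = sum(samples[1:n - 1:2])   # interior points with weight 4
    even = sum(samples[2:n - 1:2])  # interior points with weight 2
    return dx / 3.0 * (samples[0] + 4.0 * odd + 2.0 * even + samples[-1])


def detect_embedded_p(beat, qrs_samples, dx, qrs_avg_area,
                      p_avg_amp, p_avg_dur, threshold=0.28):
    """Return a copy of the beat's features, with average P-wave features
    filled in when the P-wave is missing and the QRS area exceeds the
    average QRS area by more than the threshold."""
    updated = dict(beat)
    if beat["p_amplitude"] == 0 and beat["p_duration"] == 0:
        diff = simpson_area(qrs_samples, dx) - qrs_avg_area
        if diff > threshold:
            updated["p_amplitude"] = p_avg_amp
            updated["p_duration"] = p_avg_dur
    return updated
```

On a GPU, one thread would run `detect_embedded_p` for one beat; a loop over the beats of a window plays that role in a sequential test.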

6.3.3 Parallel algorithm for GMM-based classification

To estimate the parameters of the GMM in order to classify each beat, five kernel functions are launched sequentially. Each kernel function takes its input from the previous kernel function. Let N be the number of beats, M be the mixture length (the number of mixture components) and D be the dimension of the mixture components. For real-time analysis of data, these parameters are initialized as N = 120 beats, M = 3 clusters and D = 8.

Algorithm: Concurrent GMM-based classification
Input: vectors for each beat with eight selected features
Output: classified beats
Let N be the number of beats, M be the mixture length and D be the dimension of the mixture components.
Let X be the vector of observed beats in the form $X = [\bar{x}_1, \ldots, \bar{x}_N]$, where $\bar{x}_i = [x_{i1}, \ldots, x_{iD}]$ for $i = 1, \ldots, N$.
Mean $\mu$ in the matrix form $[\bar{\mu}_1, \ldots, \bar{\mu}_M]$, where $\bar{\mu}_i = [\mu_{i1}, \ldots, \mu_{iD}]$ for $i = 1, \ldots, M$.
Covariance $\Sigma$ in the matrix form $[\bar{\Sigma}_1, \ldots, \bar{\Sigma}_M]$, where $\bar{\Sigma}_i = [\sigma_{i1}, \ldots, \sigma_{iD}]$ for $i = 1, \ldots, M$.

Kernel 1: Construction of Basis Matrix B of order N by M
Input: X, N, M, initial parameters µ and Σ
Output: Basis Matrix B
{ forall block1 : blockm { //Execute four thread blocks.
    forall T1 : T30 { // spawn concurrent threads for each probability calculation
      Probability $P = \frac{1}{(2\pi)^{D/2}\sqrt{|\Sigma|}} \exp\left(-\frac{1}{2}(x-\mu)^{T}\Sigma^{-1}(x-\mu)\right)$
      wait; end-barrier
      return B of order N by M = P //Transfer result back to global memory } } }

Figure 6.12 Algorithm for concurrent GMM based classification

The first kernel function computes the probability densities for the input data by constructing a matrix, called the Basis matrix, of order N × M. The kernel function spawns 120 thread blocks, each containing three threads. Each block computes the probability densities of the three Gaussian mixtures for one beat. Each thread calculates the probability density of one beat under one mixture, i.e., one entry in one of the three columns of the N × M matrix. Once all threads within the block have performed their calculations, they are synchronized using a barrier. Another thread from a block then builds the matrix of order N × M. The detailed algorithm for the first kernel function, which constructs the Basis matrix, is given in Fig. 6.12.
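As a CPU-side illustration of what Kernel 1 computes, the sketch below fills an N × M Basis matrix with Gaussian densities. It assumes a diagonal covariance per mixture (one variance per dimension, in line with the vector form of each covariance given in Fig. 6.12); the function names are hypothetical and the nested loops stand in for the GPU threads.

```python
import math


def gaussian_density(x, mean, var):
    """Density of a D-dimensional Gaussian with diagonal covariance.

    x, mean and var are sequences of length D; var holds the per-dimension
    variances, so the determinant is their product.
    """
    d = len(x)
    det = math.prod(var)
    quad = sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mean, var))
    norm = (2.0 * math.pi) ** (d / 2.0) * math.sqrt(det)
    return math.exp(-0.5 * quad) / norm


def basis_matrix(X, means, variances):
    """Return B with B[n][m] = density of beat n under mixture m."""
    return [[gaussian_density(x, mu, var) for mu, var in zip(means, variances)]
            for x in X]
```

Each entry of the result is independent of every other entry, which is the property that lets the dissertation assign one thread per beat and mixture.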

The second kernel builds an N × M matrix called Prior, which holds the probability of each mixture component for the given data and parameters, computed from the values of the Basis matrix. It also estimates the mixture coefficients (weights) using the elements of this matrix. Thread blocks are spawned such that each block has three threads. Each thread performs the calculations for one column of the matrix, and the threads are synchronized to create the Prior matrix. The matrix is stored in the constant memory. Next, a thread from another block uses the Prior matrix to estimate the weights for the three mixtures. The results of the prior calculations and the mixture coefficients are transferred to the global memory. The detailed algorithm is given in Fig. 6.13.

Kernel 2: Construction of prior matrix and mixture coefficients
Input: Basis Matrix B, X, N, M
Output: Posterior Matrix and mixture coefficients
{ forall block1 : blockm { //Execute thread blocks, one for each beat
    forall T1 : T30 { // spawn concurrent threads for calculating probability with three mixtures
      Probability P for Prior Matrix: $P = \frac{w_M \, B}{\sum_{j=1}^{M} w_j \, p(x_n \mid \theta_j)}$
      wait; end-barrier;
      Prior Matrix = Probability P //send Prior to constant memory }
    //thread to calculate mixture coefficient
    forall T1 : T3 //spawn threads to calculate mixture coefficients for three mixtures
      $w_M = \frac{1}{N} \cdot \text{Prior Matrix}$
    return Prior Matrix and mixture coefficients
    //Transfer result back to global memory } }

Figure 6.13 Construction of Posterior Matrix and mixture coefficients
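The two quantities produced by Kernel 2 can be sketched on the CPU as below. The sketch follows the standard EM formulation, which is an assumption about how the figure's formulas are meant to be read: each row of the Basis matrix is weighted by the current mixture coefficients and normalized, and each new coefficient is the column sum of the resulting matrix divided by N. The function names are hypothetical.

```python
def prior_matrix(B, weights):
    """Return P with P[n][m] = w_m * B[n][m] / sum_j (w_j * B[n][j]).

    Each row of the result sums to 1. A row of B whose weighted sum is
    zero would raise ZeroDivisionError; that case is not handled here.
    """
    prior = []
    for row in B:
        weighted = [w * b for w, b in zip(weights, row)]
        total = sum(weighted)
        prior.append([v / total for v in weighted])
    return prior


def mixture_coefficients(prior):
    """Return the new weight of each mixture: column sum of prior / N."""
    n = len(prior)
    m = len(prior[0])
    return [sum(row[j] for row in prior) / n for j in range(m)]
```

Because every row is normalized independently, the row computations map naturally to one block (or thread group) per beat, with only the column sums requiring a reduction across beats.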

The third kernel estimates the mean matrix by multiplying the input matrix with the Prior matrix. Thread blocks are spawned such that each block works on one row of the matrix. The calculations of the different thread blocks are synchronized to collect the rows and construct the mean matrix. The results are stored in global memory, as described in Fig. 6.14.

Kernel 3: Estimate Mean
Input: Posterior Matrix, Input Matrix X
Output: Mean Matrix
{ forall block1 : blockm { //Execute thread blocks for each beat.
    forall T1 : Tn { // spawn concurrent threads for calculation of each column of matrix
      Mean Matrix = Posterior Matrix * Input Matrix
      wait; end-barrier;
      return (Mean Matrix); //Transfer result back to global memory }
    return Posterior Matrix and transfer result back to global memory } }

Figure 6.14 Estimate Mean
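A sequential sketch of the mean update is given below. It multiplies the Prior (posterior) matrix with the input matrix as in Fig. 6.14 and, as in the standard EM maximization step, divides each mixture's result by the total responsibility of that mixture. The normalization is an assumption, since the figure shows only the product, and the function name is hypothetical.

```python
def estimate_means(prior, X):
    """Return the M x D matrix of mixture means.

    prior is N x M (responsibility of mixture m for beat n) and X is N x D.
    means[m][d] = sum_n prior[n][m] * X[n][d] / sum_n prior[n][m]
    """
    n, m, d = len(X), len(prior[0]), len(X[0])
    means = []
    for j in range(m):
        resp = sum(prior[i][j] for i in range(n))  # total responsibility
        means.append([sum(prior[i][j] * X[i][k] for i in range(n)) / resp
                      for k in range(d)])
    return means
```

With hard (0 or 1) responsibilities this reduces to the ordinary per-cluster average, which is a convenient way to check the function by hand.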

The fourth kernel estimates the variance matrix, and the fifth kernel estimates the covariance matrix using the variance matrix. Both matrices are of order N × MD. Thread blocks are spawned, and each thread performs the calculation for one column of the matrix. The variance and covariance matrices are stored back in global memory. The detailed algorithms are given in Figs. 6.15 and 6.16, respectively.

Kernel 4: Estimate Variance
Input: Mean Matrix, Input Matrix X
Output: Variance Matrix of order N by MD
{ forall block1 : blockm { //Execute thread blocks for each beat
    forall T1 : Tn { // spawn concurrent threads for calculation of each column of matrix
      Variance Matrix = (Input Matrix element − Mean Matrix element)^2
      wait; end-barrier;
      return Variance Matrix //Transfer result back to global memory } } }

Figure 6.15 Estimate Variance


All threads from all blocks are synchronized using a barrier. The data is collected from the global memory, and convergence is checked. After convergence, the GMM parameters are passed back to the CPU, and the beats are labeled. These labeled beats, with their feature vectors, are passed back to the GPU and are used as input for the BGTG construction.

Kernel 5: Estimate Covariance
Input: Variance Matrix, Posterior Matrix
Output: Covariance Matrix
{ forall block1 : blockm { // Execute thread blocks for each beat.
    forall T1 : T20 { // spawn concurrent threads for calculation of each column of matrix
      Covariance Matrix = Variance Matrix * Posterior Matrix
      wait; end-barrier;
      return Covariance Matrix //Transfer result back to global memory } } }

Figure 6.16 Estimate Covariance
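Kernels 4 and 5 can be read together as a two-stage computation: squared deviations per beat, mixture and dimension, followed by a responsibility-weighted combination. The sketch below shows one such reading on the CPU. It assumes diagonal covariances and the usual EM normalization by total responsibility, neither of which is spelled out in the figures, and the function names are hypothetical.

```python
def squared_deviations(X, means):
    """Kernel-4-style step.

    Returns V with V[n][m][d] = (X[n][d] - means[m][d]) ** 2, i.e. an
    N x M x D array that corresponds to a matrix of order N by MD.
    """
    return [[[(x[k] - mu[k]) ** 2 for k in range(len(x))] for mu in means]
            for x in X]


def diagonal_covariances(sq_dev, prior):
    """Kernel-5-style step.

    Returns C with C[m][d] = sum_n prior[n][m] * V[n][m][d] / sum_n prior[n][m].
    """
    n, m, d = len(sq_dev), len(prior[0]), len(sq_dev[0][0])
    cov = []
    for j in range(m):
        resp = sum(prior[i][j] for i in range(n))  # total responsibility
        cov.append([sum(prior[i][j] * sq_dev[i][j][k] for i in range(n)) / resp
                    for k in range(d)])
    return cov
```

The first function is element-wise and needs no communication between beats; only the second requires a reduction over beats, which matches the barrier placed before the convergence check.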

6.3.4 Parallel algorithm for BGTG construction

To construct BGTGs, a grid of multiple blocks is spawned, and each block constructs one BGTG. A BGTG is constructed for a group of beats to be classified in real time. Since each state of the BGTG requires information for a group of beats, the block spawns multiple threads. Each thread in the block performs the calculations for one state of the BGTG. Each thread computes two sets of values: 1) the bivariate probability; and 2) the transition probabilities to other states. Thread termination is synchronized using a barrier to ensure that the BGTG is fully constructed before the data is transferred to the global memory. The detailed algorithm is given in Fig. 6.17.


Algorithm: Concurrent BGTG construction
Input: vectors of amplitude, duration, transition probability and mean, variance, covariance
Output: BGTGs
{ let St be the set of states in one BGTG,
      $St_i^{bd}$ be the bivariate distribution of state i,
      $St_i^{tr}$ be the transition probability from state i
  forall block1 : blockm { //Execute one block to construct one BGTG
    forall T1 : Tn { // spawn concurrent threads for each state in BGTG
      $St_i^{bd}$ = calculate joint probability
      forall edges $E_i$ originating from $St_i$ // for all edges starting from the state $St_i$
        $St_i^{tr}(E_i \rightarrow E_j) = \frac{\text{transition count between two states}}{\text{total transitions from source state}}$
      wait; end-barrier;
      return BGTG = {$St_i^{bd}$, $St_i^{tr}$}; }
    //Transfer result back to global memory } }

Figure 6.17 Algorithm for concurrent BGTG construction
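The transition-probability part of Fig. 6.17 amounts to counting transitions between consecutive states and dividing by the number of transitions leaving the source state. A minimal sequential sketch follows. It assumes that the states of a group of beats are available as a sequence of labels, and the function name is hypothetical. The bivariate joint probability of amplitude and duration is not covered here.

```python
from collections import Counter, defaultdict


def transition_probabilities(states):
    """Return {source: {destination: probability}} for a state sequence.

    probability = (transitions from source to destination)
                  / (all transitions leaving source)
    """
    counts = defaultdict(Counter)
    for src, dst in zip(states, states[1:]):
        counts[src][dst] += 1
    return {src: {dst: c / sum(dsts.values()) for dst, c in dsts.items()}
            for src, dsts in counts.items()}


if __name__ == "__main__":
    # Example label sequence: N = normal, V = ventricular, S = supraventricular.
    print(transition_probabilities(["N", "N", "V", "N", "S"]))
```

Since the row for one source state depends only on the transitions leaving that state, each row can be computed by its own thread once the counts are available.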

6.3.5 Parallel algorithm for BGTG-BGMM graph matching

The concurrent graph matching algorithm has three kernel functions: 1) denoising the observed BGTG and pruning BGMMs that lack an edge present in the BGTG; 2) computing the most probable path (MPP) in each BGTG; and 3) classifying the BGTG using the forward algorithm.

Parallel algorithm for denoising and pruning

A grid of multiple blocks is launched such that each block is dedicated to denoising one BGTG and obtaining the subset of BGMMs that potentially match it. Each block takes one BGTG and all BGMMs as input, and stores the required data in its shared memory.

Each thread in a block works on one BGTG-BGMM pair. The first thread in each block loops over all states of the BGTG and prunes transitions with probabilities less than the threshold. Next, concurrent threads are launched in the same block, such that each thread compares one BGTG-BGMM pair. If the transitions in the BGTG have corresponding transitions in the BGMM, the BGMM is considered a potential match for the BGTG and is stored in the common vector SUB, which is accessible to all the threads in the block. The detailed algorithm is given in Fig. 6.18.

Algorithm: Concurrent denoise and prune BGMM-set
Input: 1. BGTG = [BGTG1 …]
       2. Set of BGMMs as MM = [MM1 …]
Output: BGMM subclasses considered for BGTG, denoted by SUB
{ forall block1 : blockm //compare BGMMs and a BGTG for potential match
    for T1 // eliminate noise from edges of BGTG
      forall states in BGTG {$St^G$}
        if (weight($E^G$) < threshold) remove $E^G$ //prune transition
    forall T1 : Tm //compare BGMM-BGTG pair for potential match
      select MMi ∈ MM and BGTGi ∈ BGTG
      // if all transitions in BGTG exist in BGMM, add it to vector SUB
      forall states in BGTG {$St^G$} and states in MMi {$St^M$}
        if ($St^G$ == $St^M$) SUB[i] = MMi;
    return potential BGMMs to global memory }

Figure 6.18 Algorithm for concurrent denoising and pruning
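The two steps of Fig. 6.18 can be sketched sequentially as below. The graph representation (a dictionary mapping each source state to a dictionary of destination states and probabilities), the function names, and the threshold value used in the example are assumptions for illustration; the threshold itself is not given in this section.

```python
def denoise(bgtg, threshold):
    """Drop every transition whose probability is below the threshold."""
    return {src: {dst: p for dst, p in edges.items() if p >= threshold}
            for src, edges in bgtg.items()}


def candidate_models(bgtg, models):
    """Return the names of the models that contain every transition
    present in the (denoised) BGTG."""
    edges = {(src, dst) for src, out in bgtg.items() for dst in out}
    return [name for name, mm in models.items()
            if all(dst in mm.get(src, {}) for src, dst in edges)]


if __name__ == "__main__":
    observed = {"N": {"N": 0.70, "V": 0.28, "S": 0.02}, "V": {"N": 1.0}}
    models = {
        "model_a": {"N": {"N": 0.6, "V": 0.4}, "V": {"N": 1.0}},
        "model_b": {"N": {"N": 0.9, "S": 0.1}, "S": {"N": 1.0}},
    }
    clean = denoise(observed, threshold=0.05)  # threshold is hypothetical
    print(candidate_models(clean, models))
```

Denoising first matters here: without it, the rare N to S transition in the example would rule out the first model even though it accounts for every well-supported edge.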

Parallel algorithm for the most probable path

On the GPU, a grid of multiple blocks is launched such that one block constructs the MPP for one BGTG. In each block, each thread analyzes one state of the BGTG to find the highest transition probability from that state.


Algorithm: Concurrent Most Probable Path
Input: BGTGs
Output: updated BGTG with MPP traveled
{ forall block1 : blockm //Execute one block to compute MPP for one BGTG
    forall T1 : Tm { //Each thread works on one St of BGTG
      for $St_i$ //for one state of PTG
        max_prob = maximum(weight($E_i$)) //calculate edge with highest prob.
        pair($St_i$, $St_j$) } //store state $St_j$ with max. probability from state $St_i$
    //wait for barrier
    //calculate MPP using thread after calculating all rows
    forall $St_i$ do
      store MPP = ($St_i$, $St_j$) ($St_j$, $St_k$) … //store path by connecting pairs
    return MPP for each BGTG using one block to global memory }

Figure 6.19 Algorithm for concurrent most probable path
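The most-probable-path computation can be sketched in two stages that mirror the figure: each state first selects its highest-probability outgoing edge (the per-thread work), and the selected pairs are then chained into a path (the work done after the barrier). The start state, the rule of stopping when a state repeats, and the function names are assumptions made for illustration.

```python
def best_successors(bgtg):
    """For each state with outgoing edges, pick the destination reached
    by the edge with the highest transition probability."""
    return {src: max(out, key=out.get) for src, out in bgtg.items() if out}


def most_probable_path(bgtg, start):
    """Chain the best-successor pairs from the start state.

    The path ends at a state with no outgoing edge, or as soon as it
    returns to a state that was already visited.
    """
    succ = best_successors(bgtg)
    path, seen = [start], {start}
    while path[-1] in succ:
        nxt = succ[path[-1]]
        path.append(nxt)
        if nxt in seen:
            break
        seen.add(nxt)
    return path


if __name__ == "__main__":
    graph = {"N": {"N": 0.2, "V": 0.8}, "V": {"S": 0.9, "N": 0.1}, "S": {}}
    print(most_probable_path(graph, "N"))
```

The first stage is embarrassingly parallel across states; the chaining stage is inherently sequential, which is why the figure assigns it to a single thread.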

Parallel algorithm for maximum probability

To calculate the probabilities of matching each BGTG with the set of candidate BGMMs, a kernel function with a grid of multiple blocks is launched, such that each block matches one BGTG-BGMM pair. The detailed algorithm is given in Fig. 6.20.

Each block uses the transition probabilities and the bivariate distributions of its BGTG-BGMM pair, stored in shared memory, and spawns multiple threads, one for each state of the BGTG. Each thread calculates the probability of a state in the BGTG belonging to a state in the BGMM. Based on the maximum number of blocks that can be executed on one SM, blocks are distributed across multiple SMs by the OS.

To give each block access to the information it requires regardless of the SM on which it runs, the common information about the BGMMs is stored in the faster-access constant memory. Each thread multiplies two values: 1) the probability of the state value (bivariate Gaussian distribution) of the BGTG being produced by the BGMM; and 2) the probability of the transition probabilities in the BGTG being produced by the BGMM, using the forward algorithm [16].

Algorithm: Concurrent Maximum Probability
Input: 1. BGTG = [$BGTG_i$ …] and, for each $BGTG_i$, $SUB_i$ = [MM1 … MMn]
Output: Probability for each BGTG-BGMM pair
{ //for $BGTG_i$ and $SUB_i$
  forall block1 : blockm //grid of blocks
    for $BGTG_i$ select $MM_i \in SUB_i$ //select one BGMM for BGTG
    forall T1 : Tm // probability for one state of BGTG and BGMM
      let $bd(St^G)$ denote the bivariate distribution of a state in the BGTG and $bd(St^M)$ denote the bivariate distribution of a state in the BGMM.
      let $Tr(St^G)$ denote the transition probability from a state in the BGTG and $Tr(St^M)$ denote the transition probability from a state in the BGMM.
      let St be the set of states
      select $St_i \in St$ //for each state of BGTG
      // calculate state and transition probability in BGTG belonging to BGMM
      $St_i^{Pr} = P(Tr(St^G) \mid Tr(St^M)) \cdot P(bd(St^G) \mid bd(St^M))$
      wait for barrier
    //thread to calculate total probability for all states of BGTG
    parfor for all $St_i \in St$ and BGTG
      $Pr(BGTG_i \in MM_i)$ += $St_i^{Pr}$ }
  //pass result back to global memory

Figure 6.20 Algorithm concurrent maximum probability

Threads for the multiple states of one BGTG are synchronized using a barrier. The final thread in the block calculates the probability of the BGTG belonging to the BGMM, and the result is stored in global memory. The outputs of each block are transferred back to the global memory.
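The accumulation in Fig. 6.20 and the final CPU-side selection can be illustrated with the sketch below. It takes the two per-state probabilities as given inputs and does not implement the forward algorithm or the bivariate Gaussian comparison that produce them. The function names and the example values are hypothetical.

```python
def match_probability(transition_probs, state_probs):
    """Score one BGTG-BGMM pair.

    For each state i, multiply the transition term by the bivariate
    state term (the per-thread product), then add the products over
    all states (the accumulation after the barrier).
    """
    return sum(t * b for t, b in zip(transition_probs, state_probs))


def classify(pair_scores):
    """Return the BGMM name with the highest score for one BGTG."""
    return max(pair_scores, key=pair_scores.get)


if __name__ == "__main__":
    scores = {
        "model_a": match_probability([0.5, 0.2], [0.4, 0.5]),
        "model_b": match_probability([0.9, 0.6], [0.5, 0.5]),
    }
    print(classify(scores))
```

Keeping the scoring and the selection separate mirrors the split in the text: the scores are produced per block on the GPU, and the maximum is chosen on the CPU.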

6.4 Parallelization of premature beats using CUDA enabled GPU

In this section, the concurrent approach for premature beat detection and classification, along with irregular beat-pattern classification, is described.


6.4.1 Dependency analysis

The premature beat subclassification algorithm has the following modules: preprocessing; rule-based analysis integrated with embedded waveform detection; transition graph formation; and transition graph and Markov model matching. Table 6.2 shows the processing time required by each module for real-time subclassification of beats in a window of 30 seconds.

All the modules remain the same as the arrhythmia subclassification modules, except that the GMM-based classification module in arrhythmia subclassification is replaced with a rule-based analysis module for premature beat classification. The feature-extraction, embedded wave analysis, rule-based analysis and BGTG construction modules require beat-level analysis and shared memory to merge the data from the individual beat analyses. The level of dependency between these modules is the same. These modules are followed by the premature beat classification module, which includes graph matching and simulation of the nondeterministic automata for identifying beat-patterns.

Table 6.2 Average processing time for modules

Module                      Processing time (ms)
Initial window analysis     650
Preprocessing               950
Embedded wave               200
Rule-based analysis         300
BGTG construction           3000
Graph matching              4000


6.4.2 Overall approach for parallelization of irregular beat-pattern classification

The overall approach begins by collecting ECG data on the CPU and performing the initial window analysis for the first 30 seconds, as explained in the previous sections. Then the preprocessing module, the integrated rule-based and embedded waveform analysis module, the classification module, and the graph matching modules are executed on the GPU. The overall approach of concurrent execution is given in Fig. 6.21.

6.4.3 Parallel premature beat detection

To detect premature beats, a kernel function with a grid of multiple blocks is launched such that one beat is analyzed by one block. Each block spawns multiple threads such that each thread checks one rule and detects one embedded waveform.

[Figure 6.21 flowchart with CPU and GPU panels. Stages shown: ECG data collection of 30 sec.; initial window analysis; preprocessing (noise removal, feature extraction); integration of embedded waveform and rule-based analysis; separated premature and regular beats; transition graph construction; graph matching (concurrent BGMM pruning, concurrent maximum probability) yielding BGTG-BGMM pairs with probabilities; selection of the maximum probability for each BGTG to classify beats.]

Figure 6.21 Parallel approach for premature beat and beat pattern classification


To detect a premature beat, a window of four beats is analyzed by each block. The information required by each block to perform the analysis is: 1) the features of all four beats; 2) the average RR-interval in the window; and 3) the average P-wave amplitude, P-wave duration, R-wave amplitude, R-wave duration, Q-wave amplitude, and T-wave area.

6.4.4 Parallelization of rule-based analysis

The rule-based analysis module analyzes a window of four beats with three RR-intervals and checks four rules to detect a premature beat, as explained in Section 5.2. The dependency between the four rules can be expressed as: rule 1 → rule 2, rule 2 → rule 3, and rule 2 → rule 4, as shown in Fig. 6.22. Rules 3 and 4 are independent of each other. Hence, rules 3 and 4 are checked concurrently to exploit intra-beat SIMT parallelism. Also, multiple beats can be labeled P or R independently.

Premature beat detection for an observed window of 30 seconds is parallelized by launching a grid of multiple blocks such that each block analyzes one beat, to exploit beat-level parallelism. Each block performs the rule-based analysis for one beat by spawning multiple threads such that each thread checks one rule, to exploit intra-beat parallelism. Each block labels a beat as premature or regular by analyzing all the rules, and detects embedded waveforms. The labeled beats are stored in global memory.

In each block, the first thread checks the first rule to detect prematurity of the second beat. If the condition is satisfied, the beat is marked as P and the block finishes its execution. If rule 1 is not satisfied, the second thread checks rule 2 to detect an embedded waveform in the RR2 interval. If rule 2 is satisfied, thread 3 and thread 4 are launched concurrently to detect the embedded waveform, assign average features to it, and mark the newly detected beat as a P beat. Rules 3 and 4 are synchronized before the labeled beat is passed back to global memory. If rule 2 is not satisfied, the beat is marked as an R beat. The detailed algorithm for the kernel function is given in Fig. 6.23.

[Figure 6.22 diagram. Nodes shown: RR1, RR2, RR3; Rule 1; Rule 2; Rule 3; Rule 4; Premature beat.]

Figure 6.22 Dependency graph for premature beat detection

6.4.5 Parallelization of graph matching for beat-labeling

The concurrent graph matching algorithm has two sequential tasks: 1) pruning the BGMMs that do not contain all the edges in the BGTG; and 2) classifying the BGTG using the forward algorithm.

The first task is performed by spawning multiple thread blocks concurrently such that each block obtains the potential BGMMs for one BGTG. Each thread in a block compares the BGTG with one BGMM. The second task takes the output of the first task as input and matches BGTG-BGMM pairs by spawning multiple thread blocks. Each block calculates the probability of one BGTG belonging to all BGMMs. The output of the graph matching task is the probability of each BGTG belonging to each BGMM. The results for the pairs are passed back to the CPU, which labels each BGTG with the BGMM that has the highest probability.

Algorithm: Concurrent rule-based analysis
Input: 1. Features in the form Pamplitude, Qamplitude, …
       2. Average features for beat: Paverage-amplitude, Qaverage-amplitude, Tarea-average, RRaverage
Output: Beat classification P or R
{ //Execute grid of blocks on GPU
  forall block1 : blockm //execute grid of multiple blocks
    forall T1 : Tm // each thread works on one rule
    { let RR1, RR2, RR3 be the intervals considered for beats b1, b2, b3, b4
      Spawn T1 //first thread checks for first rule
      { //check rule 1 to detect premature beat
        if ((RR2 > 1.13 RR1) ∧ (RR2 > 1.15 RR3) ∧ (1.15 RRaverage < RR2 < 1.8 RRaverage))
          Mark b2 P //mark second beat as premature beat
        else Spawn T2 //spawn thread 2 to check rule 2
          if (RR2 ≥ 1.8 RRaverage)
            Cobegin
              for each T3 { //check rule 3 to detect embedded P-wave for B-PAC
                if ((Tarea > 1.38 Tarea-average) ∧ (Qamplitude-average > 2.4 negative-peak))
                  Mark bnew B-PAC; assign average P-wave features
                wait end barrier; }
              for each T4 { //check rule 4 to detect embedded R-wave in PVC
                if ((Tarea > 1.38 Tarea-average) ∧ (negative-peak > 2.4 Qamplitude-average))
                  Mark bnew PVC; positive peak = Rnew
                wait end barrier; }
            Coend
          else Mark b2 R //if none of the conditions is satisfied
      }
    } //Pass the sequence of labeled beats with newbeat back to global memory
}

Figure 6.23 Algorithm for concurrent rule based analysis
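The rule cascade of Fig. 6.23 can be written as a sequential function for one four-beat window, as sketched below. The thresholds are taken from the figure. The function name, the argument names, the returned label strings, and the fallback of returning "R" when rule 2 holds but neither rule 3 nor rule 4 fires are assumptions, since the figure does not state that case.

```python
def classify_beat(rr1, rr2, rr3, rr_avg, t_area, t_area_avg,
                  q_amp_avg, negative_peak):
    """Label the second beat of a four-beat window.

    Returns "P" (premature), "B-PAC", "PVC" or "R" (regular).
    """
    # Rule 1: RR2 compared with its neighbours and with the window average.
    if (rr2 > 1.13 * rr1 and rr2 > 1.15 * rr3
            and 1.15 * rr_avg < rr2 < 1.8 * rr_avg):
        return "P"
    # Rule 2: a long RR2 suggests a beat embedded in the interval.
    if rr2 >= 1.8 * rr_avg:
        large_t = t_area > 1.38 * t_area_avg
        # Rule 3: embedded P-wave (B-PAC).
        if large_t and q_amp_avg > 2.4 * negative_peak:
            return "B-PAC"
        # Rule 4: embedded R-wave (PVC).
        if large_t and negative_peak > 2.4 * q_amp_avg:
            return "PVC"
    return "R"
```

Rules 3 and 4 read the same inputs and write nothing shared, which is what allows the dissertation to run them on two concurrent threads; in this sequential sketch they are simply checked one after the other.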

The algorithms for the BGTG construction and graph matching modules of the premature beat classification process closely resemble the corresponding algorithms for arrhythmia classification. Hence, the algorithms for these modules are omitted here.

6.5 Chapter Summary and Discussion

In this chapter, parallel processing approaches for arrhythmia subclassification and premature beat subclassification were discussed. Inter-module dependency prohibits parallel execution of the modules. However, each module exploits SIMT parallelism at the beat level, with multiple beats analyzed concurrently. The granularity of the operations is at the block level, the beat level or the group-of-beats level. Block-level parallelism was exploited for denoising and waveform extraction. SIMT parallelism was exploited at the beat level for the feature analysis, embedded wave detection, GMM-based classification using EM, rule-based analysis, and BGTG construction. Concurrent matching of BGTG-BGMM pairs for a group of beats was done by spawning multiple blocks, one for each BGTG-BGMM pair, with multiple threads within each block.


Implementation

In this chapter, the implementation resources and the low-level implementation modules are discussed. The implementation resources are the ECG databases, software libraries and interfaces, machine configuration, implementation of modules, and a real-time simulation for the classification. Database resources include the database formats and the sampling frequencies used in the implementation. Software resources include the libraries. Machine configuration includes the details of the CPU and GPU used to obtain results with the techniques discussed in the previous chapters.

7.1 ECG databases

All experiments performed in this research were carried out using data from several public ECG databases [Goldberger et al., 2000]. PhysioNet provides the PhysioBank [21] databases, a large and growing archive of physiological data. For the identification of finer arrhythmia subclasses, irregular beats and beat-patterns, the MIT-BIH dataset [24, 76], the MIT-BIH Atrial Fibrillation Database [24], the MIT-BIH Malignant Ventricular Arrhythmia Database [24], the MIT-BIH Supraventricular Arrhythmia Database [24], the MIT-BIH Normal Sinus Rhythm Database [24], the Creighton University ventricular tachyarrhythmia dataset [24] and the long-term AF database [24] were used.

The MIT-BIH Arrhythmia Database recordings were digitized at 360 samples per second per channel with 11-bit resolution over a 10 mV range. Two or more cardiologists independently annotated each record. Disagreements were resolved to obtain the computer-readable reference annotations for each beat. There are approximately 110,000 annotations in all.

The MIT-BIH Arrhythmia Database contains 48 half-hour excerpts of two-channel ambulatory ECG recordings of clinically significant arrhythmias, obtained from 47 subjects, chosen at random from a set of 4000 24-hour ambulatory ECG recordings, collected from a mixed population of inpatients and outpatients.

The normal sinus rhythm database includes 18 long-term ECG recordings of subjects not exhibiting significant arrhythmias.

The atrial fibrillation database includes 25 long-term ECG recordings of human subjects with atrial fibrillation. The individual recordings are each 10 hours in duration and contain two ECG signals, each sampled at 250 samples per second with 12-bit resolution over a range of ±10 millivolts.

The supraventricular arrhythmia database includes 78 half-hour ECG recordings that supplement the examples of supraventricular arrhythmias in the MIT-BIH Arrhythmia Database.

The Creighton University database includes 35 eight-minute ECG recordings of human subjects who experienced episodes of sustained ventricular tachycardia, ventricular flutter, and ventricular fibrillation.

The long-term atrial-fibrillation database includes 84 long-term ECG recordings of subjects with paroxysmal or sustained atrial fibrillation (AF). Record durations vary but are typically 24 to 25 hours.

7.2 Software and libraries

For viewing, reading, and analyzing the ECG signals, the following Physionet resources have been used: 1) PhysioBank [21], a large collection of physiological data; 2) PhysioToolkit [24], a library of software for physiologic signal processing and analysis, and for detection of physiologically significant events; and 3) the WFDB software package [77], written in C, to read PhysioBank data. WFDB routines are called from user-written applications in C or C++.

The major component of the WFDB software package is the WFDB library, which supports signal processing and automated analysis, as well as viewing, annotating, and interactively analyzing the waveform data. It has been used to create median filters for noise removal and to calculate the average amplitude and duration of the waveforms.

7.2.1 WAVE software and LightWAVE editor

WAVE is an extensible interactive graphical environment for manipulating sets of digitized signals with optional annotations [21, 77]. It is built using the WFDB library developed for physiologic signal processing, so it can be applied to any of the wide variety of data formats supported by the WFDB library. WAVE has been used to visualize the signals.

LightWAVE is a lightweight waveform and annotation viewer and editor used to view any recording of signal and time series with annotations.


7.2.2 WFDB toolbox for MATLAB

The WFDB Toolbox for Matlab, contributed by Michael Craig and available from http://physionet.org/physiotools/matlab/wfdb-swig-matlab/, is a collection of WFDB applications implemented as functions in Matlab, built using the Java wrappers for the WFDB library. This toolbox has been used to access the database and annotate the rhythms. The toolbox was used in conjunction with other software developed by me.

The MATLAB library is a collection of high-level functions for signal analysis. PRTools is a Matlab toolbox for pattern recognition. It supplies over 300 user routines for traditional statistical pattern recognition tasks. It includes procedures for data generation, training classifiers, combining classifiers, feature selection, linear and non-linear feature-extraction, density estimation, cluster analysis, evaluation, and visualization.

Due to the increased complexity of the algorithms, and to integrate with the available libraries from the WFDB package, MATLAB and C++ were used for the implementation by me. MATLAB has been used for signal processing techniques, statistical analysis, feature-extraction, computation of area under a curve, embedded waveform detection, slope analysis of waveforms, wavelet analysis, classification algorithms, EM clustering, Markov models with bivariate Gaussian distributions, graph matching, and beat-pattern analysis.

7.3 NVIDIA CUDA

CUDA (Compute Unified Device Architecture) is a parallel computing platform and programming model developed by NVIDIA that exploits SIMT (Single Instruction Multiple Threads) parallelism on graphics processing units (GPUs). The CUDA platform works with programming languages such as C, C++, and FORTRAN. The CUDA architecture has been used to execute parallel algorithms that speed up the arrhythmia classification, premature beat classification, and beat-pattern analysis algorithms.

NVIDIA provides the CUDA compiler driver (nvcc) to build code that runs on both the CPU and the GPU. nvcc simplifies the compilation of C programs. Source files for CUDA applications consist of a mixture of conventional C host code and GPU device functions. Since the CUDA framework provides the nvcc compiler, the parallel algorithms were written in C [74]. MATLAB's Parallel Computing Toolbox was also used for CUDA programming, in addition to the C programs written by me.

7.4 Machine configuration

The software was executed on a Dell E5-2680 64-bit machine with dual Xeon quad-core processors (2.70 GHz clock speed) and 128 GB RAM. A CUDA-enabled GeForce GTX 1050 Ti GPU card was used for implementing the GPU-accelerated algorithms.

The GTX 1050 Ti is based on the Pascal architecture, provides a 128-bit memory bus, and has a memory clock of 3504 MHz. Each block can run up to 1024 threads, scheduled in batches (warps) of 32 parallel threads at a time. There are a total of 24 blocks. The GPU has six SMs (streaming multiprocessors). Each SM has four blocks with 32 cores per block, and 48 KB of shared memory. The number of blocks is adjusted by increasing or decreasing the number of threads allocated per block [74]. There are a total of 768 cores (6 × 4 × 32) in the GPU for SIMT processing. The total amount of constant memory is 64 KB (65,536 bytes). A total of 65,536 registers are available per block to store data accessed by threads.

7.5 Preprocessing waveforms

The preprocessing software implementation includes denoising, feature-extraction, and embedded waveform detection. Each step has a corresponding implementation module, as illustrated in Fig. 7.1. This step is common to both arrhythmia classification and irregular beat analysis. The majority of the software for ECG preprocessing, feature-extraction, and classification in this research has been implemented using MATLAB libraries.

The WFDB software acts as the basis of the software architecture. Other modules for arrhythmia classification are built on top of this architecture. The WFDB toolbox for MATLAB is used for reading, writing, and processing data. The functions are implemented as system calls through MATLAB and C wrappers to native binary executables based on the WFDB packages [22, 75].

Figure 7.1 ECG preprocessing step (ECG samples from WFDB → denoising module → denoised samples → feature extraction module → initial feature vectors → embedded waveform detection module → updated feature vectors)

Waveform preprocessing is implemented in both the training phase and the dynamic phase of the implementation. In the training phase, a window containing the samples for approximately 30 seconds of ECG signal is selected for analysis in real time. The number of samples in a window is selected on the basis of statistical analysis to obtain a real-time response and the desired accuracy.

7.5.1 ECG signal denoising

To perform ECG signal denoising, samples of the ECG signal are read from the ECG databases and stored as MATLAB-readable files for further analysis. Two types of noise are removed from the ECG signals, namely baseline wander and power line interference.

Baseline wander is removed using a median filter and functions provided in PRTools. The delay caused while processing the signal is added to the processing time for the real-time simulation explained in section 7.10. This delay is added to the real-time calculation at each stage of processing the signal. Power line interference is removed using the wavelet-transform function available in MATLAB, with Db6 passed as the prototype wavelet. Fig. 7.2 shows the structure of the denoising module.
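As an illustration of median-filter baseline removal, the following is a minimal Python sketch and not the MATLAB/PRTools code used in this work. It assumes a cascade of two median filters with hypothetical window lengths of 200 ms and 600 ms, a common choice in the ECG literature; the function names and the default sampling rate are assumptions made for the example.

```python
from statistics import median

def median_filter(signal, width):
    """Sliding-window median; the window is truncated at the signal edges."""
    half = width // 2
    return [median(signal[max(0, i - half): i + half + 1])
            for i in range(len(signal))]

def remove_baseline_wander(signal, fs=360, short_ms=200, long_ms=600):
    """Estimate the baseline with two cascaded median filters and subtract it.

    The window lengths are assumptions for this sketch, not values
    taken from the dissertation.
    """
    w1 = int(fs * short_ms / 1000) | 1   # force odd window lengths
    w2 = int(fs * long_ms / 1000) | 1
    baseline = median_filter(median_filter(signal, w1), w2)
    return [x - b for x, b in zip(signal, baseline)]
```

The first filter suppresses narrow deflections such as the QRS-complex, the second smooths what remains, and the result is treated as the slowly varying baseline.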

Figure 7.2 ECG denoising module (input: ECG samples; output: clean ECG)

7.5.2 Feature-extraction

ECG samples obtained from the denoising phase are the input to the feature-extraction module. Feature-extraction is performed using wavelet-decomposition. The feature-vector for a beat consists of the amplitude and duration of all waveforms and baselines.

Wavelet-decomposition is done using a function available in the MATLAB library that extracts detail coefficients using the Db6 wavelet. The R-peak is chosen as the maximum of all the coefficients. Ron (start of R-wave) and Roff (end of R-wave) are chosen using the zero-crossing method in MATLAB. The overall scheme is illustrated in Fig. 7.3.
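The peak selection and zero-crossing search can be sketched as follows. This is a simplified Python illustration under the assumption that a single beat is available as a list of baseline-corrected values (or detail coefficients); the function name locate_r_wave is hypothetical and does not correspond to a MATLAB routine.

```python
def locate_r_wave(coeffs):
    """Return (r_on, r_peak, r_off) indices for one beat.

    r_peak is the index of the maximum value. r_on and r_off are found
    by walking outwards from the peak until the value is no longer
    positive, i.e. up to the nearest zero crossing on each side.
    """
    r_peak = max(range(len(coeffs)), key=lambda i: coeffs[i])
    r_on = r_peak
    while r_on > 0 and coeffs[r_on - 1] > 0:
        r_on -= 1
    r_off = r_peak
    while r_off < len(coeffs) - 1 and coeffs[r_off + 1] > 0:
        r_off += 1
    return r_on, r_peak, r_off
```

The R-wave duration then follows from (r_off - r_on) divided by the sampling frequency, and the amplitude from the value at r_peak.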

7.5.3 Embedded waveform analysis

Next phase analyzes feature-vectors, collected from each beat, to detect embedded waveforms by taking two inputs: 1) feature-vectors for beats, and 2) average feature- values. For a missing P-wave, area of QRS-complex is calculated by a MATLAB-function based on Simpson’s rule, and the P-wave features are added to the feature-vector if embedded waveform is detected.
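The area computation can be illustrated with a composite Simpson's rule over equally spaced samples. The Python sketch below is only an approximation of the idea: the comparison against an average area plus a threshold is an assumed form of the area-subtraction test, and the names simpson_area, embedded_wave_detected, average_area, and threshold are hypothetical.

```python
def simpson_area(samples, dt):
    """Composite Simpson's rule for equally spaced samples (spacing dt).

    If the number of intervals is odd, the last interval is integrated
    with the trapezoidal rule.
    """
    n = len(samples) - 1              # number of intervals
    if n < 2:
        raise ValueError("need at least three samples")
    tail = 0.0
    if n % 2:
        tail = 0.5 * dt * (samples[-2] + samples[-1])
        samples = samples[:-1]
        n -= 1
    odd = sum(samples[1:n:2])
    even = sum(samples[2:n:2])
    return dt / 3.0 * (samples[0] + 4.0 * odd + 2.0 * even + samples[n]) + tail

def embedded_wave_detected(samples, dt, average_area, threshold):
    """Assumed test: flag a wave whose area exceeds the average by more than threshold."""
    return simpson_area(samples, dt) - average_area > threshold
```

For a signal sampled at 360 Hz, dt would be 1/360 seconds, so the area is expressed in amplitude units times seconds.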

Figure 7.3 Feature-extraction module (input: denoised ECG; outputs for n beats, e.g. Beat1_featvec: Rpeak, Ron, Roff; Qpeak, Qon, Qoff; Speak, Son, Soff; Ppeak, Pon, Poff; Tpeak, Ton, Toff; ISO1, ISO2, ISO3)

7.6 Bivariate Gaussian Distribution Markov Model (BGMM)

BGMMs for each subclass of arrhythmia and premature beats are constructed in the training phase for both arrhythmia classification and irregular beat analysis. To construct a BGMM, vectors of beats with features corresponding to each subclass, taken from various records of the MIT-BIH database [21, 24], are required as input. To construct a Markov model with bivariate distributions, the discrete-time Markov-chain object framework provided by MATLAB is used. The framework provides basic tools for modeling and analyzing discrete-time Markov chains.

Figure 7.4 Embedded waveform detection module (inputs: Beat_featvec, average waveform features; output: updated Beat_featvec with embedded Ppeak, Pon, Poff)

BGMMs for each subclass are constructed in the form of multidimensional matrices using MATLAB libraries. A MATLAB function is used to create multidimensional matrices of dimension (8 × 8 × 2) corresponding to the transition matrix of a BGMM [21]. Each state of the matrix stores two values: 1) the bivariate Gaussian distribution of amplitude and duration, and 2) the transition probability to other states. The bivariate distribution is calculated using a MATLAB function. The transition probability for 20 beats is calculated using frequency analysis. The overall scheme is illustrated in Fig. 7.5.
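The two quantities stored per state can be sketched in Python as follows. This is a hedged illustration rather than the MATLAB implementation: states are assumed to be encoded as integers 0 to 7, frequency analysis is taken to mean relative transition counts, and the function names are hypothetical.

```python
import math

def transition_matrix(state_sequence, n_states=8):
    """Estimate transition probabilities as relative transition frequencies."""
    counts = [[0] * n_states for _ in range(n_states)]
    for a, b in zip(state_sequence, state_sequence[1:]):
        counts[a][b] += 1
    probs = []
    for row in counts:
        total = sum(row)
        # Rows for states that were never left stay all zero.
        probs.append([c / total if total else 0.0 for c in row])
    return probs

def bivariate_gaussian_pdf(a, d, mean_a, mean_d, sd_a, sd_d, rho):
    """Density of a bivariate Gaussian over amplitude a and duration d.

    rho is the correlation coefficient and must satisfy |rho| < 1.
    """
    za = (a - mean_a) / sd_a
    zd = (d - mean_d) / sd_d
    one_minus_r2 = 1.0 - rho * rho
    expo = -(za * za - 2.0 * rho * za * zd + zd * zd) / (2.0 * one_minus_r2)
    norm = 2.0 * math.pi * sd_a * sd_d * math.sqrt(one_minus_r2)
    return math.exp(expo) / norm
```

In this sketch the means, standard deviations, and correlation would be estimated per state from the training beats of one subclass.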

Figure 7.5 BGMM construction module (input: vector of beats with features; outputs for n beats: BGMM1 … BGMMn)

7.7 Implementing arrhythmia classification

Arrhythmia subclassification includes the following modules, executed sequentially in the dynamic phase: waveform preprocessing as described in section 7.5, GMM-based classification, BGTG construction, and graph matching. Fig. 7.6 shows the input and output of each module. The next subsections discuss each module in detail.

Figure 7.6 Preprocessing stage for updated feature-extraction (updated feature vectors → GMM classification → vector of labeled beats → BGTG construction, which also takes the updated feature vectors → BGTG matrices → graph matching)

7.7.1 GMM parameter using EM

In the next phase of signal analysis, a function is implemented to estimate the GMM parameters using the EM (Expectation Maximization) algorithm. The inputs to the function are the updated feature-vectors from the embedded waveform detection phase. The output of the function is a vector of beats, each labeled with one of three classes: N (normal), V (ventricular), or S (supraventricular). The module structure is shown in Fig. 7.7.

Figure 7.7 GMM based classification module (input: Beat_featvec; GMM parameter estimation using EM; output for n beats: Beat_primaryclass_vec)

Twenty-two features were reduced to eight features using the PCA function in the PRTools library. The number of components for the GMM was obtained using a PRTools function for GMMs. The function implements the EM algorithm by calculating the covariance of all eight features for a maximum of 100 iterations.
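The EM iteration can be illustrated on a reduced problem. The Python sketch below fits a one-dimensional GMM, whereas the implementation described above works on eight features with full covariances through PRTools; the function names, the initial means, the convergence tolerance, and the variance floor are all assumptions made for the example.

```python
import math

def normal_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def fit_gmm_1d(data, initial_means, max_iter=100, tol=1e-6):
    """Fit a 1-D Gaussian mixture with EM; returns (weights, means, variances, labels)."""
    k = len(initial_means)
    means = list(initial_means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    prev_ll = -math.inf
    resp = []
    for _ in range(max_iter):
        # E-step: responsibility of each component for each sample.
        resp, ll = [], 0.0
        for x in data:
            p = [w * normal_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300          # guard against underflow
            ll += math.log(total)
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)     # variance floor (assumption)
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    labels = [max(range(k), key=lambda j: r[j]) for r in resp]
    return weights, means, variances, labels
```

Each sample is labeled with the component of highest responsibility, which corresponds to assigning a beat to its most likely primary class.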

7.7.2 Bivariate Gaussian Distribution Transition Graph (BGTG)

This phase takes the vector of classified beats and the feature-vectors of the beats as input. The output of this phase is a set of BGTGs. To construct a graph with bivariate distributions, the discrete-time Markov-chain object framework provided by MATLAB is used. Each Markov chain is modeled using a transition matrix independent of the initial and final states. Each BGTG is constructed using a group of beats, as shown in Fig. 7.8.

As with the BGMM, a MATLAB function is used to create multidimensional transition matrices of dimension (8 × 8 × 2) that store a BGTG. Transition probabilities for 20 beats are calculated using frequency analysis.

Figure 7.8 BGTG construction module (inputs: Beat_featvec, Beat_primaryclass_vec; outputs for n beats: bgtg1 … bgtgn)

7.7.3 Graph matching for labeling

For matching a BGTG with a subset of BGMMs, the object functions available for Markov-chain framework objects in MATLAB are used. Three functions are executed sequentially to perform three tasks: 1) identifying candidate BGMMs; 2) identifying the most probable path (MPP); and 3) graph matching by the forward algorithm. These three tasks are performed sequentially using three modules.

Figure 7.9 Denoising and BGMM pruning module (inputs: bgtg1 … bgtgn and S = [BGMM1 … BGMMn]; outputs for n BGTGs: denoised bgtg1 … bgtgn, and candidate subsets SUB1 for bgtg1, SUB2 for bgtg2, …)

The first module takes as input the set of BGTGs constructed in the previous phase and a set of BGMMs to match, as shown in Fig. 7.9. The second module takes a set of clean BGTGs with low-probability transitions removed, and finds the most probable path in each BGTG by looping over all states. The output of this module is the set of updated BGTGs with MPPs (see Fig. 7.10).

Figure 7.10 Most probable path module (input: denoised bgtg1 … bgtgn; outputs for n BGTGs: bgtg_MPP1 … bgtg_MPPn)

The third module takes the BGTGs with MPP information and the candidate BGMMs as input. The output of this module is the probability for each BGTG-BGMM pair, calculated using the maximum likelihood function and the decode function available in PRTools (see Fig. 7.11).
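The scoring of a beat sequence against candidate models can be sketched with the standard forward algorithm. The Python code below is an illustration under stated assumptions and does not reproduce the PRTools calls: each model is assumed to be given as an initial distribution, a transition matrix, and a precomputed table of per-state observation likelihoods, and the names forward_likelihood and best_model are hypothetical.

```python
def forward_likelihood(initial, transition, emission_likelihoods):
    """Forward algorithm for a Markov model with state-dependent emissions.

    emission_likelihoods[t][s] is the likelihood of observation t in
    state s (for example, a bivariate Gaussian density). Returns the
    total likelihood of the observation sequence.
    """
    n = len(initial)
    alpha = [initial[s] * emission_likelihoods[0][s] for s in range(n)]
    for obs in emission_likelihoods[1:]:
        alpha = [obs[j] * sum(alpha[i] * transition[i][j] for i in range(n))
                 for j in range(n)]
    return sum(alpha)

def best_model(models):
    """Return the name of the candidate model with the highest likelihood.

    models maps a name to a tuple (initial, transition, emission_likelihoods).
    """
    return max(models, key=lambda name: forward_likelihood(*models[name]))
```

For long sequences a practical implementation would rescale alpha at each step or work with log-likelihoods to avoid numerical underflow.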

Figure 7.11 Maximum probability module (inputs: updated bgtg_MPP1 … bgtg_MPPn with candidate subsets SUB1 for bgtg_MPP1, SUB2 for bgtg_MPP2, …; outputs for n BGTGs: Probability (bgtg_MPP1, BGMM1), Probability (bgtg_MPP1, BGMM2), … and the maximum probability)

7.8 Implementing premature beat and irregular beat-pattern classification

The premature beat classification modules include denoising, feature-extraction, integrated rule-based and embedded waveform analysis, BGTG construction, graph matching, and beat-pattern analysis, as shown in Fig. 7.12.

Figure 7.12 Premature beat and beat pattern classification modules (ECG samples from WFDB → denoising → denoised samples → feature extraction → feature vectors → integrated rule-based analysis and embedded waveform analysis → updated feature vectors and vector of labeled beats → BGTG construction → set of multidimensional matrices → graph matching of BGTGs → BGTGs classified as subclasses of premature beats → beat pattern analysis → beat pattern with premature beat subclass)

The implementation of the denoising, feature-extraction, BGTG construction, and graph matching modules is similar to that of the arrhythmia classification modules. In the following subsections, the implementation of the integrated rule-based and embedded waveform analysis module and of the beat-pattern analysis module is discussed.

7.8.1 Integrated rule-based and embedded waveform analysis

The inputs to this module are: 1) feature-vectors for the beats containing the amplitude and duration of waveforms and baselines; 2) RR-intervals for the detected beats; and 3) average features for embedded waveforms. The outputs of this module are the detected embedded waveforms and the set of beats labeled as either P (premature) or R (regular sinus), as shown in Fig. 7.13.

Figure 7.13 Integrated rule based and embedded waveform analysis module (inputs: Beat_featvec, average features, RR_interval_vec; outputs for n beats: updated Beat_featvec, Beat_primaryclass_vec)

To detect prematurity of an observed beat, a group of four sequential beats containing three RR-intervals is analyzed. To detect an embedded waveform, a MATLAB function is used to calculate the area of the T-wave using Simpson's rule. If the area is greater than a threshold, then an embedded waveform is detected and the average features are assigned. The feature vectors for the beat are updated to include the embedded waveforms. A new feature vector for the beats is constructed to label each beat as either P or R based on the rules.
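The use of three consecutive RR-intervals can be illustrated with a simple rule. The Python sketch below is not the rule set of this work: it assumes, purely for illustration, that a beat is premature when its preceding RR-interval is shorter than a hypothetical fraction (0.8) of the interval before it and is followed by a longer interval. The function name and the ratio are assumptions.

```python
def label_beats(r_times, ratio=0.8):
    """Label each beat 'P' (premature) or 'R' (regular) from R-peak times in seconds.

    For beat i, three RR-intervals spanning four beats (i-2 .. i+1) are
    inspected. The rule and the ratio are assumptions for this sketch.
    """
    labels = ["R"] * len(r_times)
    for i in range(2, len(r_times) - 1):
        rr_prev = r_times[i - 1] - r_times[i - 2]
        rr_cur = r_times[i] - r_times[i - 1]
        rr_next = r_times[i + 1] - r_times[i]
        if rr_cur < ratio * rr_prev and rr_next > rr_cur:
            labels[i] = "P"
    return labels
```

Beats at the edges of the window keep the default label because fewer than three intervals are available around them.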

7.8.2 Irregular beat-pattern analysis

The irregular beat-pattern classification module takes as input a vector of beats classified as normal (N) or premature (PAC, B-PAC, PJC, or PVC). The output of the module is a beat-pattern vector for a group of beats with five fields: (pattern observed P, premature beat type T, start index s, end index e, count c). A structure array in MATLAB is used to implement the beat-pattern analysis. The implementation module for beat-pattern classification is shown in Fig. 7.14.
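The five-field output can be illustrated with a small pattern scanner. This Python sketch is a simplification and not the look-ahead algorithm of this work: it assumes that bigeminy, trigeminy, and quadrigeminy are repeated units of one, two, or three normal beats followed by one premature beat of a single type, and that a hypothetical minimum of two repetitions is required.

```python
PATTERNS = {2: "bigeminy", 3: "trigeminy", 4: "quadrigeminy"}

def find_patterns(labels, min_repeats=2):
    """Return tuples (pattern, beat_type, start, end, count) found in labels.

    labels is a sequence such as ["N", "PVC", "N", "PVC", ...].
    min_repeats is an assumption made for this sketch.
    """
    labels = list(labels)
    found = []
    for period, name in PATTERNS.items():
        i = 0
        while i + period <= len(labels):
            unit = labels[i:i + period]
            ectopic = unit[-1]
            # A unit is (period - 1) normal beats followed by one premature beat.
            if unit[:-1] != ["N"] * (period - 1) or ectopic == "N":
                i += 1
                continue
            count, j = 1, i + period
            while labels[j:j + period] == unit:
                count += 1
                j += period
            if count >= min_repeats:
                found.append((name, ectopic, i, j - 1, count))
                i = j
            else:
                i += 1
    return found
```

Each tuple mirrors the fields (P, T, s, e, c): the pattern name, the premature beat type, the start and end indices, and the number of repetitions.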

Figure 7.14 Irregular beat pattern analysis module (input: Beat_finerclass_vec; pattern analysis for bigeminy, trigeminy, and quadrigeminy; outputs for n beats: Struc_beat1 (P, T, s, e, c))

7.9 Implementation of GPU accelerated algorithms

Algorithms that accelerate the real-time response using the GPU have been implemented in C and imported into the MATLAB platform before being executed on the GPU. The GPU has been used for accelerating algorithms through the Parallel Computing Toolbox in MATLAB [22], which supports:

1) CUDA-enabled NVIDIA GPUs, where the codes for the GPU and CPU are written in C and imported into MATLAB [75], and

2) direct use of the GPU through MATLAB functions for linear algebra operations, signal processing, statistics, and machine learning applications. It also includes the integration of GPU kernel functions written in C with CUDA applications in MATLAB [74, 75].

For the implementation of the parallel algorithms, the integration of CUDA-enabled functions in MATLAB has been used for signal processing tasks and for executing C code files on the GPU directly from MATLAB. The kernel is represented in MATLAB by a CUDAKernel object, which operates on MATLAB arrays and matrices using built-in functions.

The workflow for implementation of algorithms is as follows:

1. Code to be executed on the GPU is written in C in a file with the extension .cu.

2. The C code is compiled with the nvcc compiler, invoked from MATLAB, into PTX (parallel thread execution) code.

3. The compiled PTX code is used to create a CUDAKernel object, which contains the GPU executable code.

4. Properties are set on the CUDAKernel object to control its execution on the GPU. These include the number of threads, the size of the grid, the type of memory used, etc.

5. The function feval is used to evaluate the CUDAKernel on the GPU with the required inputs, such as multidimensional matrices to store BGTGs, ECG samples, or feature vectors [75].
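The choice of thread and grid sizes in step 4 can be illustrated with the arithmetic that maps beats to threads. The Python sketch below only mimics the index calculation a kernel would perform; it assumes one thread per beat and 32 threads per block, and the function names are hypothetical rather than part of the CUDA or MATLAB APIs.

```python
import math

def launch_config(n_beats, threads_per_block=32, max_threads_per_block=1024):
    """Return (number_of_blocks, threads_per_block) for one thread per beat."""
    if not 1 <= threads_per_block <= max_threads_per_block:
        raise ValueError("threads_per_block out of range")
    blocks = math.ceil(n_beats / threads_per_block)
    return blocks, threads_per_block

def beat_index(block_idx, thread_idx, threads_per_block, n_beats):
    """Global beat index handled by a thread, or None if the thread is idle."""
    i = block_idx * threads_per_block + thread_idx
    return i if i < n_beats else None
```

For a window of 120 beats this gives four blocks of 32 threads, with the last eight threads of the fourth block left idle.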

7.10 Real-time simulation

To acquire results based on real-time analysis of ECG, we performed a real-time simulation of the ECG signals in the Physionet datasets using the WFDB software library and LightWAVE [78]. Using LightWAVE, the number of samples acquired per second from a patient is viewed and recorded. For executing the algorithms in real time, an input file named primary was maintained. The information obtained from LightWAVE is used to load the same number of samples observed per second, with annotations from experts, into the primary input .mat file (a file that stores ECG samples in MATLAB format) before the signal is analyzed.


Fig. 7.15 shows an instance of 10 seconds of ECG signal observed in LightWAVE for record 16795 collected from the MIT-BIH dataset [21, 24]. The left corner shows the start time of the displayed segment and the right corner shows its end time.

Figure 7.15 Real time simulation of ECG in LightWAVE

The real-time simulation of accessing samples of the ECG signal was achieved using the timer class available in MATLAB. For the classification of the ECG signal, the first module collects patient-specific information using the first 30 seconds of the patient's data, as explained in chapter 4. While the initial window analysis module is executing, the next 10 seconds of samples are simultaneously collected in the primary file by adding them to a queue of samples. Consecutive samples for the next window to be analyzed are obtained from the primary file after the initial window analysis has finished.

The classification process is defined by two basic operations, reading the samples and classifying the samples, which are executed in a loop until all the samples are read. Reading of the samples depends on the number of samples observed in one second in LightWAVE. The next set of samples (next_samp) is collected and added to the queue in the primary file while the current samples (curr_samp) are being analyzed. Classification of the samples depends on: 1) availability of samples, and 2) completion of the previous iteration of subclassification. Each iteration of arrhythmia subclassification checks these two criteria every 0.1 seconds. An object of the timer class in MATLAB is used to schedule the criteria check with a fixed delay of 0.1 seconds. Every 0.1 seconds, the check tests whether the next set of samples is available in the primary file and whether curr_samp has finished executing. If both criteria are met, then the next iteration is executed on next_samp.
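The polling scheme can be illustrated with a deterministic simulation that uses a simulated clock instead of a real timer. The Python sketch below is not the MATLAB timer implementation: the function name simulate, the list of arrival times, and the fixed processing time per window are assumptions made for the example.

```python
from collections import deque

def simulate(arrival_times, processing_time, poll=0.1):
    """Simulate polling every `poll` seconds for available samples.

    arrival_times: times (s) at which each window of samples becomes
    available in the primary file. Returns a list of (start, finish)
    times, one per analyzed window.
    """
    pending = deque(sorted(arrival_times))
    queue = deque()
    schedule = []
    busy_until = 0.0
    tick = 0
    eps = 1e-9
    while pending or queue:
        now = tick * poll
        # Criterion 1: move windows that have arrived into the queue.
        while pending and pending[0] <= now + eps:
            queue.append(pending.popleft())
        # Criterion 2: the previous iteration must have finished.
        if queue and busy_until <= now + eps:
            queue.popleft()
            busy_until = now + processing_time
            schedule.append((round(now, 3), round(busy_until, 3)))
        tick += 1
    return schedule
```

When the processing time is shorter than the spacing between windows, each window starts as soon as it arrives; otherwise the windows wait in the queue and start at the first poll after the previous one has finished.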

Figure 7.16 Real time simulation timeline

Fig. 7.16 shows the timeline of the simulation of real-time execution using the timer object. The green boxes depict reading of the samples, and the blue boxes depict the execution of the samples. The grey box illustrates that curr_samp has not finished execution.


7.11 Chapter Summary

In this chapter, implementation details for arrhythmia, premature beat, and irregular beat-pattern classification were discussed. The implementation details include the software libraries and the object-oriented design in Matlab for the various classification modules. The machine configuration used for implementing the modules and the database resources used for training and testing the classification were discussed. The implementation model for accelerating the code using the GPU for real-time classification was also described, along with the method for real-time simulation of the ECG signal used to obtain results in real time. In the next chapter, the classification results obtained using the implementation model described in this chapter are discussed.


Performance Evaluations

In this chapter, the results obtained using the bivariate Gaussian distribution Markov model (BGMM) method described in previous chapters for finer subclasses of arrhythmia and premature beats are discussed. The performance obtained using the look-ahead method for irregular beat-pattern identification and classification based on ectopic focus is also described. The performance is evaluated in terms of sensitivity, specificity, accuracy, and positive predictive value. The accelerated algorithm implementation using the GPU is also discussed, with a comparison of the speeds obtained using the GPU and the CPU. The machine configurations for the CPU and GPU, and the datasets used to obtain the results, are discussed in chapter 7.
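For reference, the four measures follow the standard definitions in terms of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN). The short Python sketch below states them explicitly; the function name is hypothetical and the counts in the usage note are made up for illustration, not taken from the tables in this chapter.

```python
def classification_metrics(tp, tn, fp, fn):
    """Standard performance measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),                 # Se  = TP / (TP + FN)
        "specificity": tn / (tn + fp),                 # Sp  = TN / (TN + FP)
        "ppv": tp / (tp + fp),                         # PPV = TP / (TP + FP)
        "accuracy": (tp + tn) / (tp + tn + fp + fn),   # Acc = (TP + TN) / all
    }
```

For example, with the hypothetical counts TP = 90, TN = 80, FP = 20, and FN = 10, the function gives a sensitivity of 0.90, a specificity of 0.80, and an accuracy of 0.85.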

8.1 Arrhythmia classification

Real-time arrhythmia classification includes a denoising and feature-extraction module, embedded wave detection or resolution of the P-wave, GMM-based classification of beats, bivariate Gaussian distribution transition graph (BGTG) construction, and BGTG-BGMM graph matching to classify a group of 20 beats.

For each classification module, 60% of the classified signals were used for training. The remaining 40% were used for testing the accuracy of classification at each stage. The distribution of True Positives (TP) and True Negatives (TN) is 55% and 45%, respectively.

Three different datasets, obtained from Physionet [21, 24], were used for the analysis:

1) MIT-BIH arrhythmia dataset [21] that included 48 patients’ half hour ECG recordings for ventricular (V) and supraventricular (S) subclasses;

2) Creighton University Ventricular Tachyarrhythmia Database [77] which includes 35 eight-minute ECG recordings of human subjects who experienced episodes of sustained ventricular tachycardia, ventricular flutter, and ventricular fibrillation; and

3) MIT BIH normal sinus rhythm (nsrdb) dataset [24, 77] which included 18 long term patients without any significant arrhythmia for testing accuracy of normal (N) beat classified.

Each ECG signal was sampled at 360Hz frequency to achieve classification and clustering. We obtained about 500,000 samples to be used for training and testing.

8.1.1 Level 1 classification using GMM

Table 8.1 summarizes results obtained for GMM-based classification where parameters were estimated using EM method.

Table 8.1 Result of GMM-based classification using EM clustering

Class             Sensitivity (Se %)   Specificity (Sp %)   Accuracy (Acc %)
Normal            99                   98.6                 98.3
Supraventricular  98                   96.9                 97.9
Ventricular       98.6                 96.8                 99.1

Fig. 8.1 illustrates the two-dimensional projection of the eight-dimensional clusters obtained using GMM for record 233 from the MIT-BIH dataset [21, 24]. In eight dimensions, the clusters are well separated, and there is no overlap.

The visualized clusters in Figure 8.1 are obtained by plotting eight features using multidimensional scaling in MATLAB, with the eight features grouped into two sets to show maximum separation [34]. Feature set 1 includes the R-wave amplitude and duration, and the current and previous RR-intervals. Feature set 2 includes the P-wave amplitude and duration, and the T-wave amplitude and duration. The clusters were obtained after five iterations, which showed the maximum cluster separation while maintaining the convergence parameter.

Figure 8.1 Result of GMM-based clustering

8.1.2 Level 2 classification using Markov model approach

Table 8.2 summarizes the results for finer supraventricular subclassification. Sensitivity for AVNRT and AFlu is less than 95%. The lower sensitivity obtained for AVNRT is due to misclassifications into AFlu caused by similar transition probabilities of the QRS-complex for both subclasses [2, 5]. The morphological similarity due to the reentrant circuit present in AFlu and AVNRT is reflected in similar amplitude and duration for the R-wave and S-wave. EAT is sometimes misclassified into AFlu due to the negative P-waves and the missing Q-wave. The sensitivity obtained for JTachy is lower due to misclassification into EAT when the ectopic focus of EAT is closer to the AV junction.

Table 8.2 Result for supraventricular classification

          Sensitivity (Se %)        Specificity (Sp %)        Accuracy
          Without      With         Without      With         (Acc %)
          P-wave       P-wave       P-wave       P-wave
Subclass  resolution   resolution   resolution   resolution
AFib      90.2         95.8         92           95.3         95.1
AFlu      91.2         94.1         93           94.2         94.5
AVNRT     90.3         94.0         93.2         96.0         96.2
JTachy    90.2         95.2         92.2         96.1         96.8
EAT       92.3         95.6         90.8         95.2         96.0

The accuracy for identifying subclasses was improved by at least 2% by resolving the P-wave into four states in the BGMMs and BGTGs. The accuracy for AFlu was increased by 2.9% with P-wave resolution. The improved sensitivity was observed for records in the database for patients suffering from AFlu and right atrial enlargement (RAE), who were correctly identified by detecting the increased positive amplitude of the P-wave [56]. Similarly, improved sensitivity for EAT was observed for patients with RAE, who were correctly identified by the notch present in the negative P-wave.

Table 8.3 summarizes the results of ventricular subclassification. Sensitivity for VFlu was lower due to misclassification into VFib and VTach caused by the similar morphology of the QRS-complex and the missing P-waves and T-waves.

The accuracy of subclassification was improved by at least 2% by recognizing the P-wave embedded in the QRS-complex. VFlu is a subclass of ventricular arrhythmia that is preceded by VTach and degenerates into VFib. It is short-lived and difficult to identify. Sensitivity for VFlu and VFib increases after the embedded P-waves are correctly identified.

Table 8.3 Result for ventricular classification

          Sensitivity (Se %)        Specificity (Sp %)        Accuracy
          Without      With         Without      With         (Acc %)
          embedded     embedded     embedded     embedded
Subclass  P-wave       P-wave       P-wave       P-wave
VTach     93.1         95.2         92.2         94.3         95.6
VFlu      91.3         94.6         93.5         97.0         96.3
VFib      94.5         96.0         97.5         98.0         97.2

8.2 Premature beat and irregular beat-pattern classification

To detect P-waves embedded in T-waves observed for PAC and B-PAC, 3194 beats were analyzed, and area subtraction achieved 97.9% sensitivity. To detect embedded R-waves (in the case of PVC), 2245 beats were analyzed, with 98.8% sensitivity. Table 8.4 shows the results. ‘TP’ denotes true positive, ‘TN’ denotes true negative, ‘FP’ denotes false positive, and ‘FN’ denotes false negative. TN refers to beats with enlarged T-waves that have no embedded P-wave or R-wave.

Table 8.4 Result for embedded wave detection

Wave     Beats   TP     TN   FP   FN   Se. %   PPV %   Acc. %
P-wave   1599    1560   22   1    12   97.6    97.5    97.9
R-wave   896     876    10   5    5    98.9    99.4    98.8

To classify the type of premature beat, 1000 beats of each type were analyzed. Table 8.5 shows the results obtained. For a few records with B-PAC beats, P-waves with lower amplitude were embedded in T-waves. Due to the lower amplitude of the P-waves, the area of the T-wave was not affected enough to become detectable. This led to a higher number of false negatives for embedded P-wave detection.

Table 8.5 Result for premature beat classification

Class   Heartbeats   TP    FN   FP   TN
PAC     1000         940   5    5    50
B-PAC   1000         892   20   8    80
PJC     1000         903   7    7    83
PVC     1000         958   5    3    34

Table 8.6 shows the classification accuracy for premature beat subclassification. Both specificity and sensitivity are quite high for all four classes. The positive predictive value (PPV) for PVC in our technique is 99.2%, which is much higher than the PPV values of 86.5% and 90.2% for PVC obtained by other researchers using interval analysis alone [43, 44].

Table 8.6 Result for classification accuracy of premature beats

Rule-based analysis + BGMM
Class   Se. %   Sp. %   PPV %   Acc. %
PAC     99.5    93.4    99.7    99.5
B-PAC   97.8    94.3    99.1    97.2
PJC     99.2    92.2    99.2    98.6
PVC     99.5    91.9    99.7    99.2

The lower specificity observed for PJC and PVC indicates a higher number of false positives. Since the locations of the ectopic nodes in PJC and PVC were close in a few cases, these subclasses exhibit similar R-wave and T-wave morphology in terms of amplitude and duration, leading to misclassification of PJC into PVC.

Similarly, a closer location of the ectopic nodes in PJC and PAC leads to similar morphology of the P-wave (lower amplitude) and ISO2 (decreased duration). This led to higher false positives due to misclassification of PJC into PAC.

8.2.1 Irregular beat-pattern classification

To classify the ECG signal as bigeminy, trigeminy, or quadrigeminy based on ectopic location, arrhythmia datasets from MIT-BIH [21, 24] were analyzed. Sensitivity was 98.4% for ventricular patterns, 96.9% for atrial patterns, and 95.3% for junctional patterns. Table 8.7 shows the results.

Table 8.7 Result for irregular beat-pattern classification

Type   Pattern        Se. %   Sp. %   Acc. %
PAC    Bigeminy       98.2    93.3    98.0
PAC    Trigeminy      98.7    92.0    97.3
PAC    Quadrigeminy   98.9    93.03   96.2
PJC    Bigeminy       99.3    92.6    98.9
PJC    Trigeminy      98.8    92.4    98.6
PJC    Quadrigeminy   98.3    91.1    95.8
PVC    Bigeminy       99.0    92.0    98.5
PVC    Trigeminy      98.8    92.7    97.6
PVC    Quadrigeminy   98.4    92.6    97.0


The accuracy obtained for irregular beat-patterns depends on the detection of the premature beat subclasses. Accuracy for PAC quadrigeminy and PJC quadrigeminy is lower because a few PJC beats were misclassified as PAC beats, which increased the number of false negatives. Similarly, misclassification of PVC beats as PJC beats reduced the specificity for PVC beats.

8.3 GPU-based acceleration

In this subsection, the speedup obtained with GPU-based acceleration for arrhythmia subclassification and premature beat subclassification is described. The overall execution efficiency and speedup are tested at the module level using a single-core CPU and 768 CUDA cores. The effect of using different types of GPU memory on the overall speedup is also tested.

Based on the limitations and advantages of each memory type, two approaches are analyzed to exploit data parallelism:

Approach 1: combination of constant memory and global memory;
Approach 2: combination of shared memory and constant memory.

In the first approach, constant memory was used for data that are accessed multiple times and do not need modification during execution of the kernel function. Because constant memory is read-only during kernel execution, dynamic data required by the concurrent modules were stored in global memory.

In the second approach, the faster shared memory within a single block was used as read/write memory during kernel function execution. However, due to its limited size, read-only data accessed multiple times were stored in constant memory. The execution times of the different modules are based on the analysis of a 30-second window observed in real time, containing 120 beats per execution, over 500 iterations. The comparison of real-time response at the module level does not include the initial 30-second window analysis for each patient, since this module is implemented on the CPU for both the sequential and concurrent approaches.

8.3.1 Arrhythmia

Table 8.8 shows the speedup comparison for arrhythmia classification at the module level. The experiment was run with 20 beats per BGTG for a 30-second window analyzed in real time. The average time taken to execute the BGMM approach on the CPU for arrhythmia classification is around 7.5 seconds. The average time taken to execute the modules concurrently using the GPU is around 1.7 seconds. The overall speedup is 4.3 for arrhythmia classification.

After the GPU implementation finishes in 1.7 seconds, the GPU remains idle for the next 28.3 seconds while the CPU collects the next 30-second window in real time. This idle time can be utilized to analyze other abnormalities using the same ECG data.

Table 8.8 Speedup comparison for arrhythmia classification

Module               Single CPU (ms)   Concurrency using GPU (ms)   Speedup
Preprocessing        950               503                          1.8
GMM classification   300               136                          2.2
Embedded wave        200               102                          1.9
BGTG construction    2800              489                          5.7
Graph Matching       3200              492                          6.5
Total time           7450              1722                         4.3
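The speedup column of Table 8.8 is the ratio of CPU time to GPU time for each module. The short Python sketch below recomputes it from the listed timings. Truncating to one decimal place is an assumption, chosen only because it reproduces the tabulated values; the names `TIMINGS_MS` and `speedup` are illustrative.

```python
import math

# Module-level timings in milliseconds from Table 8.8: (CPU, GPU).
TIMINGS_MS = {
    "Preprocessing": (950, 503),
    "GMM classification": (300, 136),
    "Embedded wave": (200, 102),
    "BGTG construction": (2800, 489),
    "Graph matching": (3200, 492),
}

def speedup(cpu_ms, gpu_ms):
    """CPU time divided by GPU time, truncated to one decimal place."""
    return math.floor(10 * cpu_ms / gpu_ms) / 10

per_module = {name: speedup(c, g) for name, (c, g) in TIMINGS_MS.items()}
total_cpu = sum(c for c, _ in TIMINGS_MS.values())
total_gpu = sum(g for _, g in TIMINGS_MS.values())
overall = speedup(total_cpu, total_gpu)
print(per_module, total_cpu, total_gpu, overall)
```

The totals come out as 7450 ms and 1722 ms, so the overall figure is dominated by BGTG construction and graph matching, the two modules with the largest CPU times.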

The speedup comparison for arrhythmia classification using the CPU and GPU is shown in Fig. 8.2. The sequential execution time increased linearly as the number of BGTGs increased. The speedup with the combination of shared memory and constant memory (approach 2) is higher than the speedup with the combination of constant memory and global memory (approach 1). This difference is expected because shared memory is an on-chip memory with low access latency. One more interesting result was observed. The execution time of the concurrent approach increased linearly up to six BGTGs. Beyond six BGTGs, the execution time became constant, possibly because additional CUDA resources or SMs are allocated automatically. Thus, additional load on the GPU is automatically compensated by the allocation of additional resources to exploit concurrency. This is a useful feature for analyzing other abnormalities of the ECG without increasing execution time.

Figure 8.2 Effect of memory optimization on speedup for arrhythmia


Classification accuracy

Table 8.9 shows the accuracy of arrhythmia subclassification implemented using a combination of shared memory and constant memory. The specificity and sensitivity obtained with the sequential approach and the GPU-based concurrent approach are nearly the same for all subclasses of arrhythmia.

The slightly lower sensitivity and specificity observed for a few arrhythmia subclasses are due to the division of the observed samples in the feature-extraction stage of the GPU-based approach. In this stage, the samples are divided among different blocks to extract R-waves without any knowledge of waveforms or beats, which can lead to a beat being missed in a window.

Table 8.9 Accuracy of arrhythmia subclassification

                              Sequential approach   GPU-based concurrent approach
Class              Subclass   Se. %     Sp. %       Se. %     Sp. %
Supraventricular   AFib       95.8      95.3        95.6      95.4
Supraventricular   AFlu       94.1      94.2        94.0      94.3
Supraventricular   AVNRT      94.1      94.0        94.1      94.0
Supraventricular   EAT        95.6      94.0        95.6      94.0
Ventricular        VTach      95.2      94.3        94.9      94.0
Ventricular        VFlu       94.6      97.0        94.1      97.1
Ventricular        VFib       96.0      98.0        95.5      97.5

8.3.2 Premature beats

The premature beat classification on the GPU was run with one beat per BGTG. Table 8.10 depicts the speedup comparison obtained by running the experiment on 30-second ECG samples (approximately 120 beats). The average time taken to execute the BGMM approach on the CPU for premature beat classification is around 8.5 seconds. The average time taken to execute the modules concurrently using the GPU is around 1.7 seconds. The overall speedup is 4.9 for premature beat classification. After the GPU implementation finishes in 1.7 seconds, the GPU remains idle for the next 28.3 seconds while the CPU collects the next 30-second window in real time. This idle time can be utilized to analyze other abnormalities using the same ECG data.

The highest speedup was obtained for the graph matching module, due to the computational complexity of the likelihood calculation involved in the forward algorithm. The transition graph construction phase also gained a speedup of 5.7, since the bivariate distributions of amplitude and duration were calculated in this phase. It was observed that computationally complex modules performed better on the GPU, while modules with less data and less complex instructions did not gain much speedup.

Table 8.10 Speedup comparison for premature beat classification

Module                Single CPU (ms)   Concurrency using GPU (ms)   Speedup
Preprocessing         950               503                          1.8
Rule-based Analysis   520               230                          2.2
Transition Graph      3000              460                          5.7
Graph Matching        4000              502                          7.9
Total time            8470              1695                         4.9
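The idle time quoted for the GPU follows directly from the window length and the total GPU time in Table 8.10. A small Python sketch of that arithmetic is given below; `WINDOW_S`, `gpu_idle_time` and the utilization figure are illustrative names and quantities, not part of the implementation described in this chapter.

```python
WINDOW_S = 30.0  # length of one real-time ECG window in seconds

def gpu_idle_time(gpu_total_ms, window_s=WINDOW_S):
    """Seconds the GPU waits for the next window after finishing one."""
    busy_s = gpu_total_ms / 1000.0
    if busy_s > window_s:
        raise ValueError("analysis does not keep up with real time")
    return window_s - busy_s

idle = gpu_idle_time(1695)            # GPU total time from Table 8.10
utilization = 1.0 - idle / WINDOW_S   # fraction of the window the GPU is busy
print(round(idle, 1), round(100 * utilization, 1))
```

With these figures the GPU is busy for under 6% of each window, which is the headroom available for analyzing other abnormalities on the same data.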


Figure 8.3 Effect of memory optimization on speedup for premature beats

Fig. 8.3 shows the speedup comparison of the CPU and GPU for the two memory optimization approaches. The sequential execution time increased linearly as the number of BGTGs increased. The speedup with the combination of shared memory and constant memory (approach 2) was higher than the speedup with the combination of constant memory and global memory (approach 1). This difference is expected because shared memory is an on-chip memory with low access latency. After 120 beats, the execution time for the GPU increased steadily, possibly because additional load on the GPU is automatically compensated by the allocation of additional resources to exploit concurrency.

Classification accuracy

Table 8.11 shows the classification accuracy of the premature beat classification algorithm implemented using a combination of shared memory and constant memory. The specificity and sensitivity obtained with the sequential approach and the GPU-based concurrent approach are nearly the same for all subclasses of premature beats. The slightly lower sensitivity and specificity observed for a few subclasses are due to the division of the observed samples in the feature-extraction stage of the GPU-based approach.

Table 8.11 Accuracy of premature beat and beat-pattern classification

           Sequential approach   GPU-based concurrent approach
Subclass   Se. %     Sp. %       Se. %     Sp. %
PAC        99.5      93.4        99.0      92.5
B-PAC      97.7      94.3        97.1      94.3
PJC        99.2      92.2        98.6      91.0
PVC        99.5      91.9        98.6      91.2

8.3.3 Speedup comparison to obtain optimum number of BGTGs

The window size, i.e. the number of samples analyzed in real time by the concurrent approach, is chosen based on the speedup obtained for each module. The simplest way to obtain maximum speedup is to analyze the maximum possible number of samples concurrently. However, as the number of samples read in a window increases, the GPU idle time increases and, in turn, so does the response time of classification. On the other hand, if the number of samples analyzed in real time is decreased below a limit, the GPU does not perform as well as the CPU. Hence, the optimum number of samples in a window is chosen by obtaining the speedup for each module in arrhythmia and premature beat classification.

In this section, a performance comparison of the CPU and GPU for various numbers of samples is presented for each module of arrhythmia classification. Performance is measured in terms of the time required by the CPU and the GPU to analyze a given number of samples for each module. All ECG signals are sampled at 360 Hz (360 samples per second). Since classification is focused on arrhythmia with faster heartbeats, each patient is observed with 120-250 bpm (beats per minute). One beat is composed of approximately 180 samples. A group of twenty beats that forms a BGTG is composed of approximately 3600 samples. For the implementation on GPU, six BGTGs were analyzed concurrently, as explained in Chapter 6. The optimum window of BGTGs to be analyzed concurrently was chosen such that each module gains a speedup of at least 2.5.
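The sample counts used in the following tables can be derived from the sampling rate and the heart rate. The Python sketch below shows the arithmetic. It assumes a heart rate of 120 bpm, the lower end of the stated range, since that is the rate at which one beat spans 180 samples at 360 Hz; the function names are illustrative.

```python
SAMPLING_RATE_HZ = 360   # samples per second
BEATS_PER_BGTG = 20      # beats grouped into one BGTG

def samples_per_beat(heart_rate_bpm, fs=SAMPLING_RATE_HZ):
    """Number of samples covering one beat at the given heart rate."""
    return fs * 60.0 / heart_rate_bpm

def samples_per_window(n_bgtg, heart_rate_bpm=120):
    """Samples processed when n_bgtg BGTGs are analyzed concurrently."""
    return round(n_bgtg * BEATS_PER_BGTG * samples_per_beat(heart_rate_bpm))

print(samples_per_beat(120))                           # one beat
print([samples_per_window(n) for n in range(1, 11)])   # 1 to 10 BGTGs
```

At higher heart rates a beat spans fewer samples, so the counts for 120 bpm are an upper bound within the observed range.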

In this subsection, the speedup obtained for each module is compared for 1 to 10 BGTGs, implemented concurrently on the GPU and sequentially on a single CPU core. A similar analysis was performed for the premature beat classification modules to choose the window of 30 seconds.

ECG denoising and preprocessing

As can be seen from Fig. 8.4, as the number of samples increased, the time taken by the CPU increased linearly, and the GPU performed better for higher numbers of samples.

For 3600 samples (20 beats), the CPU required only 100 ms to execute while the GPU required 260 ms. This is because the GPU is at a disadvantage for small data sizes and relatively few computationally intensive operations. For a small number of samples, the cost of data transfer between the GPU and the CPU dominates, which increases the execution time. As the number of samples increased, the GPU performed better, while the CPU time increased linearly.

Figure 8.4 Speedup comparison of ECG denoising and preprocessing

Table 8.12 Execution time for denoising and preprocessing module

Number of BGTG   Number of ECG samples   CPU Time (ms)   GPU Time (ms)
1                3600                    100             260
2                7200                    200             265
3                10800                   400             300
4                14400                   600             400
5                18000                   800             436
6                21600                   950             503
7                25200                   1200            515
8                28800                   1362            520
9                32400                   1493            536
10               36000                   1605            553


Table 8.12 compares the time required to execute the ECG preprocessing module for different numbers of ECG samples. The CPU performed better for up to 7200 samples. The GPU performed better as the number of samples increased further.
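The point at which the GPU overtakes the CPU can be read off the table programmatically. The following Python sketch does so for the preprocessing timings of Table 8.12; the helper `crossover` is a hypothetical name, and the same function could be applied to the timing tables of the other modules.

```python
# (number of BGTGs, CPU ms, GPU ms) from Table 8.12.
PREPROCESSING_MS = [
    (1, 100, 260), (2, 200, 265), (3, 400, 300), (4, 600, 400),
    (5, 800, 436), (6, 950, 503), (7, 1200, 515), (8, 1362, 520),
    (9, 1493, 536), (10, 1605, 553),
]

def crossover(rows):
    """Smallest number of BGTGs from which the GPU stays faster than the CPU."""
    first = None
    for n_bgtg, cpu_ms, gpu_ms in rows:
        if gpu_ms < cpu_ms:
            if first is None:
                first = n_bgtg
        else:
            first = None  # the GPU fell behind again, so restart the search
    return first

print(crossover(PREPROCESSING_MS))
```

For the preprocessing module the crossover is at three BGTGs (10800 samples), consistent with the CPU being faster for up to 7200 samples.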

Embedded wave detection

Since the number of complex operations for embedded wave detection is even smaller, the CPU performed better for up to 18000 samples (100 beats).

Figure 8.5 Speedup comparison of embedded waveform detection

When the number of samples reached 21600, the GPU performed better by distributing the area calculation task for each group of 20 beats across separate SMs, while the CPU time continued to grow as the number of samples increased. Fig. 8.5 shows the speedup comparison for embedded wave detection using the CPU and GPU. Table 8.13 shows the execution times for different numbers of samples.


Table 8.13 Execution time required for embedded wave detection module

Number of BGTG   Number of ECG samples   CPU Time (ms)   GPU Time (ms)
1                3600                    5               75
2                7200                    10              80
3                10800                   22              83
4                14400                   44              90
5                18000                   90              92
6                21600                   200             94
7                25200                   403             106
8                28800                   815             159
9                32400                   1520            240
10               36000                   3053            302

BGTG construction

BGTG construction involves analyzing the patient’s ECG and creating one BGTG for each group of 20 heartbeats. Construction of a BGTG requires transition graph construction and calculation of the bivariate normal distribution of waveform amplitude and duration.

As can be seen from Table 8.14 and Fig. 8.6, the GPU performs better starting from the construction of a single BGTG. However, the speedup gain is not more than 0.5 for the construction of two BGTGs. The GPU performs better when the number of samples is increased to 18000, with a speedup gain of 2.7. For six BGTGs, the speedup gain is 2.9. Increasing the number of BGTGs constructed concurrently increases the gain linearly, with a speedup gain of 0.2 for each additional BGTG.


Figure 8.6 Speedup comparison for BGTG construction

Table 8.14 Execution time for BGTG construction module

Number of BGTG   Number of ECG samples   CPU Time (ms)   GPU Time (ms)
1                3600                    200             300
2                7200                    308             450
3                10800                   639             480
4                14400                   1289            520
5                18000                   1496            560
6                21600                   1700            600
7                25200                   1920            649
8                28800                   2156            702
9                32400                   2369            768
10               36000                   2600            850

Graph matching

Graph matching is also a computationally intensive task, since it involves calculating likelihood estimates using the forward algorithm.


Figure 8.7 Speedup comparison for BGTG-BGMM matching

Table 8.15 Execution time for graph matching module

Number of BGTG   Number of ECG samples   CPU Time (ms)   GPU Time (ms)
1                3600                    201             360
2                7200                    406             490
3                10800                   598             500
4                14400                   1069            520
5                18000                   1596            540
6                21600                   2398            560
7                25200                   3002            608
8                28800                   3809            702
9                32400                   4450            806
10               36000                   5040            900

For this module, it was observed that the GPU started performing better from the construction of two BGTGs. This was because the GPU can distribute the task of matching one BGTG with multiple BGMMs across separate SMs, while the CPU performs the comparisons sequentially.


Fig. 8.7 depicts the speedup comparison for graph matching for different numbers of samples. Table 8.15 shows the execution time required by the CPU and GPU for different numbers of samples.

8.4 Chapter summary and discussion

In this chapter, the performance obtained for arrhythmia and premature beat subclassification using the BGMM-based method, and for irregular beat-pattern classification using the look-ahead technique, is evaluated. The BGMM-based approach classified subclasses of supraventricular arrhythmia with more than 93% accuracy, ventricular arrhythmia with more than 95% accuracy, and premature beats with more than 97% accuracy. Irregular beat-patterns with premature beats were classified with 95% or higher accuracy.

The concurrent algorithms for arrhythmia and premature beat classification were tested on the GPU. The speedup obtained for arrhythmia subclassification is 4.3 and the speedup obtained for premature beat classification is 4.9. The 30-second window analyzed in real time for classification was chosen based on a speedup gain of at least 2.5 for each module of arrhythmia and premature beat classification.


Related Works and Limitations

This chapter discusses the development of current research and compares the results of the techniques presented in this dissertation with new and evolving research in the same area that appeared while this work was in progress. Current research in arrhythmia classification, irregular beat classification and beat-pattern classification is discussed. Evolving research on accelerating the ECG signal classification process using GPUs is also discussed. Some of the limitations observed in the different techniques, and issues that were not addressed, are discussed as well.

9.1 Arrhythmia Classification

ANN-based techniques are among the most commonly used techniques for arrhythmia classification. Several researchers have used neural networks and their variations, such as probabilistic neural networks [79, 80] with back propagation and convolutional neural networks [81] with multilayer perceptrons, to classify each beat into different arrhythmia classes.

Kiranyaz [79] presents an ECG classification and monitoring system using adaptive 1-D convolutional neural networks (CNN) to classify three different types of beats: normal, supraventricular, and ventricular beats. Each neuron is fed morphological and temporal characteristics including RR-intervals, P-waves and the shape of the QRS-complex, using information about two neighboring beats. The CNN is made patient-adaptive by training the classifier with the first 5 minutes of each patient’s ECG record. The proposed solution gives high accuracy: 99% for ventricular beats and 97% for supraventricular beats.

However, since the features fed to the neurons do not include morphological waveform features pertaining to each subclass of arrhythmia, the system does not classify the finer subclasses of arrhythmia. As a result, supraventricular beat classification shows a low sensitivity of 60%, which indicates a higher number of false negatives. Although an inter-beat temporal feature, i.e., the RR-interval, is included in the classification process, intra-beat temporal features, which include the transition probabilities between waveforms and baselines, are not considered. The focus of the work presented in this dissertation is to extract more morphological features pertaining to each finer subclass of arrhythmia and to include them in the classification approach to obtain finer real-time classification.

Rajpurkar [81] presented an algorithm that trains a 34-layer convolutional neural network to map a sequence of ECG samples to a sequence of rhythm classes. The algorithm is claimed to exceed the performance of board-certified cardiologists in detecting arrhythmias from ECGs recorded with a single-lead wearable monitor. The model is trained end-to-end on a single-lead ECG signal sampled at 200 Hz, with a sequence of annotations for every second of ECG as supervision. They also collected a dataset of about 30,000 unique patients exhibiting abnormal rhythms in order to make the class balance of the dataset more even.


However, we worked with the limited training dataset available on Physionet [23], whereas the mentioned research obtained data from a wearable monitoring device. Their larger dataset resulted in better training for each arrhythmia subclass and high sensitivity for ventricular arrhythmia subclasses. For some supraventricular subclasses such as EAT, their algorithm confused the rhythm with normal beats. Their work is also limited by changes in P-wave morphology when EAT is present with atrial enlargement [2]. Since the waveform slopes and other morphological features are not considered during the training phase, JTachy is also sometimes confused with normal rhythm. In addition, the neural network is not able to perform embedded wave analysis. In this research, we tried to make the most of the data available on Physionet [24] by extracting as many features as possible and utilizing them in the most useful way to detect and classify arrhythmia. Our research would also benefit from working with a larger dataset and continuous monitoring of patients.

Elhaja [82] presents the representation ability of linear and nonlinear features and proposes a combination of such features in order to improve the classification of ECG data. In this study, five types of beats are classified using nonlinear features such as high order statistics and cumulants. Nonlinear feature reduction methods such as independent component analysis (ICA) are combined with linear features, namely, the principal component analysis (PCA) of discrete wavelet transform (DWT) coefficients. Classification with these features was tested using an SVM classifier and an NN classifier. Both achieved 98% classification accuracy. Feature-extraction involved heart rate calculation, RR-interval, location of the P-wave with respect to the QRS-complex, PR-interval duration and duration of the Q-wave. The feature-extraction was done using DWT before classifying the features using a probabilistic neural network. Since the extracted features are reduced using PCA and ICA, some subclass-specific waveform features might get lost.

Although morphological features were extracted, certain temporal features such as the average RR-interval, and the amplitude of waveforms, which is indicative of the direction of impulse travel, were not analyzed. This limited the classification of ectopic beats to ventricular or supraventricular beats. Classification of arrhythmia into finer subclasses, which may not be malignant, is required because it can predict the occurrence of serious life-threatening arrhythmia [1, 2, and 5]. In this research, the focus was mainly on finding the location of the ectopic node that controls impulse travel in the heart, which in turn points to a specific arrhythmia subclass.

The primary goal of the study presented by Haldar [5] was to classify different arrhythmic beats with a reduced set of relevant-only ECG attributes based on the Fuzzy C-Means (FCM) algorithm for real-time monitoring. The relevant feature selection was based on a multi-dimensional pattern recognition tool known as the Mahalanobis-Taguchi System (MTS). Out of the initial eleven attributes, only two to five are selected using MTS to detect and classify five types of beats using a soft clustering method. The accuracy of the technique ranges between 74% and 88%. Most of the features selected for classification are based on the RR-interval and other morphological features, and embedded waveform analysis is lacking, which leads to a higher number of false negatives. Clustering has been proven to be a good classifier for separating ventricular and supraventricular arrhythmia from normal rhythm [77].

However, for finer subclasses, clustering techniques fail to consider transitions within the waveforms of a single beat and between the waveforms of multiple beats. The classification technique presented in this research uses clustering to classify the signal into ventricular, supraventricular and normal rhythm. For finer classification, we use a Markov model to consider transitions within and between beats along with finer morphological features.

Table 9.1 Comparison of classification accuracy

Technique                                         Arrhythmia class        Accuracy %   Accuracy % using BGMM
ANN by Kiranyaz [79]                              Supraventricular        97.0         99.0
                                                  Ventricular             98.0         99.0
CNN by Rajpurkar [81]                             EAT                     92.0         96.0
                                                  JTachy                  90.0         96.8
SVM by Elhaja [82]                                PVC                     95.2         99.2
FCM by Haldar [5]                                 PAC                     85.3         99.5
                                                  PVC                     86.9         99.2
RR interval analysis, Tsipouras [43]              Premature beats         98.1         99.5
                                                  Supraventricular        85.7         99.0
                                                  Ventricular             83.6         99.0
                                                  VFlu                    95.8         96.3
                                                  Ventricular bigeminy    91.0         98.5
                                                  Ventricular trigeminy   71.0         97.6
Fisher discriminant analysis by Elgendi [84]      PAC                     98.3         99.5


9.2 Premature Beat and Irregular Heartbeat Patterns Classification

Premature beat analysis has been attempted by several researchers [43, 44, 89, and 90]. However, limited research has been conducted on pattern analysis to classify patterns of premature beats with sinus beats [43, 90].

Tsipouras [43] proposed a method for the classification of cardiac rhythms based only on RR-interval signal. The method consisted of four steps: Preprocessing, QRS detection and computation of RR-interval, arrhythmic beat classification and arrhythmia episode detection and classification. Arrhythmia beat-by-beat classification was performed on the RR-interval signal using a set of rules provided by medical experts.

However, none of the other temporal or morphological features were considered, which limited the classification of beats to three types: premature ventricular beat, ventricular flutter and . Although rules based on RR-intervals detect the prematurity of beats, the research does not consider cases in which the R-wave is embedded in the T-wave [1, 2, and 3]. When the R-wave is embedded, its amplitude is decreased and the entire beat goes undetected. To overcome this issue, the rules are modified in this research to integrate embedded waveform detection. Also, detection of other types of premature beats (PAC, B-PAC, and PJC), which requires morphological and temporal features to be included in the classifier, is not performed by the researchers.

The premature atrial contraction (PAC) classification technique presented by Mohamed Elgendi [84] uses the overlap of T-waves with P-waves, applying a threshold rule-based algorithm and Fisher linear discriminant analysis based on a linear combination of RR-interval, P-wave duration and T-wave duration.

Although the technique reduces the dimensionality of the feature-space by giving smaller variance and good discrimination, it is limited by the number of features that can be considered for dimensionality reduction. It does not separate PACs from B-PACs and PJCs. These premature beat types are characterized by the presence or absence of P-waves, transitions from the P-wave to ISO2 and the duration of ISO2 [2, 3, 5]. Our technique captures waveform transitions by developing a bivariate Markov model and transition graphs for every premature beat, which include the appropriate features and transitions within waveforms to separate PAC, PJC and B-PAC.

A hybrid classifier is proposed by Muthuvel and Suresh [85] to classify beats in three stages: preprocessing, hybrid feature-extraction and hybrid feature classification. The proposed technique classifies beats into bundle branch block beats, PAC, PVC and PJC. The extracted features include the PR, PT and ST segments and the amplitudes of waveforms. The tri-spectrum method is also used to extract frequency-based features of the ECG. The hybrid classifier uses a genetic algorithm to train a feed-forward neural network on the beats. The hybrid classifier achieved 91% accuracy in classifying the beats.

Although both morphological and temporal features were included in the proposed hybrid classifier, RR-interval variation was not included. RR-intervals are one of the most significant indicative characteristics of premature beats. Also, embedding of the waveforms for PAC was not considered. Premature B-PAC beats with an embedded P-wave go undetected.

Irregular beat-patterns of premature beats with sinus beats were analyzed by a few researchers using string pattern analysis methods [43, 89, and 90]. This analysis is limited to identifying a pattern of 'P R' for bigeminy, 'P R R' for trigeminy, and 'P R R R' for quadrigeminy. One of the challenges encountered during automated analysis of irregular beat-patterns is that it requires looking ahead a certain number of beats and then looking back a number of beats to ascertain the presence or absence of a beat-pattern. For instance, to classify a sequence as bigeminy, two sequential occurrences of 'P R' are required. This means that the beat before P has to be R and the beat after R has to be P to classify the pattern as bigeminy.
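The string patterns described above can be illustrated with a short Python sketch. It is a simplified regular-expression matcher, not the look-ahead algorithm developed in this dissertation. It assumes that each beat has already been labeled with a single letter, P for a premature beat and R for a regular sinus beat; the names `UNITS` and `find_beat_patterns` are hypothetical.

```python
import re

# One premature beat (P) followed by one, two or three regular beats (R).
UNITS = {"bigeminy": "PR", "trigeminy": "PRR", "quadrigeminy": "PRRR"}

def find_beat_patterns(labels, min_repeats=2):
    """Return (pattern, start, end) for each run of repeated units.

    labels is a string such as "RRPRPRPRR" with one letter per beat.
    """
    episodes = []
    for name, unit in UNITS.items():
        # (?!R) is the look-ahead: the beat after the last unit must not be
        # another R, otherwise the last unit belongs to a longer pattern.
        regex = re.compile(f"(?:{unit}){{{min_repeats},}}(?!R)")
        for match in regex.finditer(labels):
            episodes.append((name, match.start(), match.end()))
    return sorted(episodes, key=lambda e: e[1])

print(find_beat_patterns("PRPRPRRPRR"))
```

In the example, the first four beats are reported as bigeminy and the remaining six as trigeminy, because the third 'P R' is followed by another R and therefore belongs to a 'P R R' unit. A single isolated 'P R' is not reported, in line with the requirement of two sequential occurrences.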

The study by Tsipouras discussed earlier analyzed such patterns of premature beats by feeding the pattern to a deterministic finite state automaton (FSA) to classify an episode as ventricular bigeminy, trigeminy or couplet. In the FSA proposed by Tsipouras [43], the beats preceding the premature beat are not considered for sequence classification; it achieved 91% sensitivity for ventricular bigeminy and 71% sensitivity for ventricular trigeminy. In the pattern analysis algorithm presented in this dissertation, both sequences are considered to classify the irregular beat-pattern, and the source of the ectopic beat is also located.

Another study, by N. Ikeda [90], examined two types of distribution patterns of P and R beats observed for ventricular bigeminy and trigeminy. The analysis distinguishes the interval between P and R for bigeminy and trigeminy. A modulated parasystole model was assumed, and it was modeled as a system of difference equations in terms of the phase of the sinus pacemaker on the ectopic pacemaker (ventricles). The solution for differentiating between bigeminy and trigeminy patterns was presented based on the interval between beats. According to the research, when the interval is small, bigeminy is preceded by trigeminy, and when the interval is large, bigeminy is followed by trigeminy. The proposed method did not discuss results obtained using any standard medical database. Although the pattern identification depends on the PR-interval, a look-ahead method is still necessary to identify a second pattern with the same premature beats in order to detect concealed patterns [56, 57].

9.3 GPU-based classification

Real-time processing is critical for the analysis of ECG signals. It requires fast execution of the different signal processing modules, which are computationally expensive and need more execution time to maintain accuracy. Since GPUs can provide a remarkable performance gain over CPUs for computationally intensive applications, several researchers have exploited parallel techniques using GPU cores for various subtasks, such as processing the signal using DSP filters [68] and wavelet transforms [69, 71]. Some researchers have focused on performing the classification task as a whole on the GPU, using neural network-based classifiers to classify beats into supraventricular and ventricular arrhythmia [68, 69, 70, 71, 87, 88, 91, 92, and 93].

N. Lopes [87] proposed ventricular arrhythmia diagnosis using a parallel implementation of Multiple Back Propagation (MBP) neural networks to reduce the computational cost of neural network implementation. MBP is a generalization of the Back-Propagation (BP) algorithm that is used to train multiple feed-forward networks. The GPU has been used to increase the training speed of multiple feed-forward networks using MBP.

However, the research did not use the GPU for other subtasks such as feature-extraction and noise removal. GPU memory usage was also not optimized to improve the performance gain. Their technique is limited to PVC beat detection and does not address real-time classification; its focus is on long-term ECG recordings. The approach does not detect embedded P-waves in T-waves, which is an important characteristic of PVC beats, and this in turn lowers the accuracy. The sensitivity obtained using their approach is 94.5% [87], compared to 98.8% obtained using our approach for detecting PVC beats.

Another neural network-based classification approach with a GPU implementation has been proposed by Pengfei Li [88]. The research implemented a parallel general regression neural network to classify each heartbeat as either a supraventricular or a ventricular beat. The features considered were the amplitude and slope of the Q, R and S waveforms together with the slope of the ST segment. In their study, the GPU was used for the data processing modules of the algorithm, namely feature-extraction and the pattern, summation and output layers of the neural network. However, since their research is based on long-term ECG recordings, data transfer between the CPU and GPU can be frequent. Also, data used by threads and SMs can incur the overhead of transfers to and from global memory. This can be optimized by using constant memory, shared memory or a combination of various memories. They obtained a sensitivity of 88.0% [88] in detecting PVC beats, compared to 99.3% with our approach. In addition, we diagnose all seven major subclasses in real time.


We did not come across research that considered using the GPU for parallelizing classifiers other than neural networks. Some researchers have utilized the GPU for denoising and feature-extraction [71, 92]. Domazet [68] has proposed an optimization with shared and constant memory for a DSP filter for ECG denoising. Our goal in this research, however, is much broader than parallelizing a single module of ECG signal classification. We also tested our approach with two memory optimization techniques to utilize memory resources in an optimal way.

9.4 Limitations

One of the major limitations across the various techniques used for signal classification lies in the first stage of processing, which is feature-extraction. In this research, we concentrated on obtaining as much information as possible in this stage by using DWT. Among the obtained features, we considered morphological and temporal features.

Another limitation observed in this stage was that transitions between waveforms, and between beats within the observed beats, were not considered. To overcome this limitation, a Markov model was chosen as the classifier, since it considers waveform transitions in the classification process. The ECG signal also carries information about waveforms that can be hidden in another waveform. To detect the presence or absence of hidden waveforms, area calculation and embedded waveform detection were used.

In terms of accelerating the classification process, many researchers have focused on a single subtask, namely denoising using either DSP filters [68] or wavelet transforms [91, 92]. Limited research has been done on accelerating the overall classification process in real time using a GPU. In this work, real-time classification was accelerated on a GPU by developing parallel algorithms for all subtasks, including denoising, feature extraction, embedded-waveform detection, and classification. Among the several challenges faced during this task, one of the most difficult was the overhead of accessing data in global memory from threads in different blocks across SMs, which limited the speedup to 1.2X compared to CPU time. To address this issue, a memory optimization technique was used to store the data accessed by threads in a combination of constant and shared memory. This led to a speedup of 4.3X compared to the CPU execution time. Although this speedup might not look impressive, in real-time classification the processing of 30 seconds of ECG samples takes only 1.7 seconds on the GPU compared to 7.5 seconds on the CPU.
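The real-time budget implied by these figures can be checked with a few lines of arithmetic. The sketch below uses only the rounded timings quoted above; the function and variable names are illustrative and not taken from the implementation. Recomputing the ratio from the rounded timings gives roughly 4.4, close to the reported 4.3X; the small gap may simply reflect rounding of the quoted timings.

```python
# Timings below are the rounded figures quoted in the text.
WINDOW_S = 30.0  # seconds of ECG per analysis window
CPU_S = 7.5      # reported CPU processing time per window
GPU_S = 1.7      # reported GPU processing time per window


def speedup(cpu_seconds: float, gpu_seconds: float) -> float:
    """Return how many times faster the GPU run is than the CPU run."""
    if gpu_seconds <= 0:
        raise ValueError("gpu_seconds must be positive")
    return cpu_seconds / gpu_seconds


def idle_time(processing_seconds: float, window_seconds: float) -> float:
    """Return the time left in the window once processing has finished."""
    return window_seconds - processing_seconds


ratio = speedup(CPU_S, GPU_S)           # about 4.4 from the rounded timings
cpu_slack = idle_time(CPU_S, WINDOW_S)  # time left per window on the CPU
gpu_slack = idle_time(GPU_S, WINDOW_S)  # time left per window on the GPU
```

The slack values indicate how much of each 30-second window remains available for other analyses once arrhythmia classification has finished.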

9.5 This research

This research presents a novel technique that integrates Markov models with bivariate Gaussian distributions to include both morphological and temporal features of the ECG signal. The integration provides a way to include the bivariate distribution of amplitude and duration of each waveform and baseline for each beat, together with transitions within and between beats. Including morphological and temporal features of the signal makes it possible to characterize finer subclasses of arrhythmia and premature beats. A graph-matching technique has also been developed that matches the Markov model for a patient (BGTG) with the trained Markov model of each subclass (BGMM) by considering both the morphological feature distribution and the transition probabilities between waveforms.
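To make the idea of matching a patient model against subclass models concrete, the following is a deliberately simplified sketch and not the matching procedure developed in this dissertation. It assumes a hypothetical model layout (per-state mean amplitude and duration, plus a sparse transition table) and scores a pair of models by a weighted sum of morphological and transition differences. All state names, subclass names, weights, and numbers are invented for illustration.

```python
from math import hypot


def model_distance(patient, reference, w_morph=1.0, w_trans=1.0):
    """Weighted sum of per-state mean differences and transition differences.

    Both arguments use a hypothetical layout:
      {"means": {state: (amplitude, duration)},
       "trans": {(from_state, to_state): probability}}
    """
    states = set(patient["means"]) | set(reference["means"])
    morph = 0.0
    for state in states:
        p_amp, p_dur = patient["means"].get(state, (0.0, 0.0))
        r_amp, r_dur = reference["means"].get(state, (0.0, 0.0))
        # Plain Euclidean distance; a real system would normalize the units.
        morph += hypot(p_amp - r_amp, p_dur - r_dur)

    edges = set(patient["trans"]) | set(reference["trans"])
    trans = sum(
        abs(patient["trans"].get(e, 0.0) - reference["trans"].get(e, 0.0))
        for e in edges
    )
    return w_morph * morph + w_trans * trans


def best_match(patient, references):
    """Return the name of the reference model closest to the patient model."""
    return min(references, key=lambda name: model_distance(patient, references[name]))


# Invented toy numbers, for illustration only.
references = {
    "subclass_A": {
        "means": {"P": (0.15, 0.09), "QRS": (1.00, 0.09), "T": (0.30, 0.16)},
        "trans": {("P", "QRS"): 1.0, ("QRS", "T"): 1.0},
    },
    "subclass_B": {
        "means": {"QRS": (1.40, 0.14), "T": (0.40, 0.18)},
        "trans": {("QRS", "T"): 1.0},
    },
}
patient = {
    "means": {"QRS": (1.35, 0.13), "T": (0.38, 0.18)},
    "trans": {("QRS", "T"): 1.0},
}
label = best_match(patient, references)
```

In this toy case the patient model has no P state, so it lies far from `subclass_A` in both the morphological and the transition term and is labeled `subclass_B`.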

The research focuses on finer analysis of the ECG signal for capturing life-threatening arrhythmia subclasses as well as benign premature-beat subclasses, which can degenerate into malignant arrhythmia in elderly patients and in emergency-room scenarios. Such situations require fast and accurate classification of the observed ECG signal. To achieve faster classification in real time without compromising the complexity of the algorithms, this research has presented GPU-accelerated algorithms using the NVIDIA CUDA framework. Previous research has attempted to accelerate individual modules of ECG signal processing and to analyze Holter ECG collected over a 24- to 48-hour period. This research focuses on the judicious use of GPU resources with the goal of real-time processing of the data, with individual patient training during the first 30 seconds. In the literature review, no other ECG classification technique was found that accelerates all the modules of the classification process using a GPU.


Conclusion

This chapter concludes the work described in this dissertation, discusses its limitations, and offers some insights into future work and improvements to the current technique.

10.1 Conclusion

This dissertation proposes machine learning techniques that integrate morphological and temporal features of the ECG signal to identify arrhythmia and premature beats and classify them into finer subclasses in real time. Identifying finer subclasses is clinically significant because different subclasses have different risk factors, associated morbidity, and mortality. In addition, different subclasses have different clinical outcomes and are treated differently. Another factor that motivated this research was the need to understand the location of the ectopic nodes. Currently, cardiac physiologists study the ECG manually to understand the origin of the ectopic nodes for further ambulatory investigation and surgery. Identifying the origin in the different chambers of the heart will certainly help their exploratory investigation.

Most of the data available in public databases was single-lead data. The effort to get a large set of twelve-lead data was unsuccessful partly because of the unwillingness of the vendors to part with software data structures that would produce a raw data file that could be used for software development for automated analysis.


The techniques existing at the time this research began addressed this need by incorporating waveform amplitudes and RR-intervals into various machine learning techniques. However, these techniques did not consider intra-beat temporal variations, which are important for distinguishing arrhythmia subclasses with closely similar features.

The research began by analyzing different techniques and the challenges faced in classifying arrhythmia into finer subclasses. It became clear that the temporal features, specifically intra-beat variations, missing waveforms, and superimposed waveforms, carry significant weight in the subclassification process. This led to the choice of Markov models in this research to model beats for each subclass of arrhythmia and premature beats. The states of the Markov model are the waveforms and baselines, and intra-beat temporal features are captured by the transitions between states.
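As a minimal illustration of how transitions between waveform and baseline states can carry temporal information, the sketch below estimates first-order transition probabilities from a labeled state sequence. The state names (`P`, `PQ`, `QRS`, `ST`, `T`, `TP`) and the toy sequence are assumptions made for this example, not the state set used in the dissertation.

```python
from collections import Counter, defaultdict


def estimate_transitions(states):
    """Estimate first-order transition probabilities from a state sequence."""
    counts = defaultdict(Counter)
    for src, dst in zip(states, states[1:]):
        counts[src][dst] += 1
    return {
        src: {dst: n / sum(dsts.values()) for dst, n in dsts.items()}
        for src, dsts in counts.items()
    }


# Two complete beats followed by a beat with no visible P-wave.
# State names (waveforms and baselines) are illustrative assumptions.
sequence = [
    "P", "PQ", "QRS", "ST", "T", "TP",
    "P", "PQ", "QRS", "ST", "T", "TP",
    "QRS", "ST", "T", "TP",
]
probs = estimate_transitions(sequence)
```

In this toy sequence the missing P-wave shows up as probability mass on the `TP` to `QRS` transition, which is the kind of temporal cue a transition-based model can exploit.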

The challenges faced in the arrhythmia subclassification process are: supraventricular arrhythmia occurring with atrial enlargement, variation in the location of the ectopic nodes, and variation in the irritability pattern of ectopic nodes. These variations cause ambiguities and become the cause of many false-positives and false-negatives.

It also became clear after studying the various subclasses that there were significant morphological variations among the ECGs of different subclasses, such as: 1) saddled waveforms for atrium enlargement; 2) sawtooth waveforms in atrial fibrillation; 3) negative P-wave for ectopic atrial tachycardia; 4) irregular low amplitude in ventricular fibrillation; and 5) change in amplitude due to wave superimposition. This led me to investigate the integration of morphological features with temporal features.

To include morphological features in the states of the Markov model, a bivariate Gaussian distribution of the amplitudes and durations of individual waveforms and baselines was chosen. Each state of the Markov model is represented by a bivariate Gaussian distribution. The proposed model for each subclass of arrhythmia is called the Bivariate Gaussian Distribution Markov Model (BGMM). The study showed that the variation in these features followed a normal distribution.
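The per-state distribution can be illustrated with a short sketch that fits a bivariate Gaussian to (amplitude, duration) samples of one waveform and evaluates its density. This uses the standard sample mean, sample covariance, and the closed-form 2x2 inverse; the sample values are invented, and the function names are hypothetical rather than taken from the implementation.

```python
from math import exp, pi, sqrt


def fit_bivariate_gaussian(samples):
    """Return ((mean_x, mean_y), (var_x, cov_xy, var_y)) for 2-D samples."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mx = sum(x for x, _ in samples) / n
    my = sum(y for _, y in samples) / n
    sxx = sum((x - mx) ** 2 for x, _ in samples) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in samples) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in samples) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def bivariate_pdf(point, mean, cov):
    """Density of a bivariate Gaussian at `point`."""
    (x, y), (mx, my), (sxx, sxy, syy) = point, mean, cov
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance must be positive definite")
    dx, dy = x - mx, y - my
    # Squared Mahalanobis distance using the closed-form 2x2 inverse.
    m2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return exp(-0.5 * m2) / (2 * pi * sqrt(det))


# Invented (amplitude in mV, duration in s) samples for a single state.
samples = [(1.0, 0.10), (1.2, 0.12), (0.8, 0.08), (1.0, 0.11), (1.0, 0.09)]
mean, cov = fit_bivariate_gaussian(samples)
typical = bivariate_pdf((1.0, 0.10), mean, cov)
unusual = bivariate_pdf((0.4, 0.20), mean, cov)
```

A waveform whose amplitude and duration fall near the state's mean receives a much higher density than an atypical one, which is what allows a state to score how well an observed waveform fits a subclass.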

Variations in the location and irritability of ectopic nodes cause waveforms to become superimposed, that is, one waveform becomes embedded in another, especially if the ectopic nodes are close to the AV-junction. This superimposition causes ambiguity and increases the number of false-positives and false-negatives. This challenge has been addressed partially by identifying the embedded waves using embedded-wave analysis. The challenge of atrium enlargement is addressed by introducing a P-wave resolution technique.

The idea of real-time diagnosis led to a Markov model with limited sample space. However, accuracy is an issue with a limited sample space. My experimentation showed that 30 seconds or 120 beats gave an error of at most 2%, which is within the statistically accepted error margin. The limited-space probabilistic transition graph was christened BGTG. The diagnosis became a labeling problem of comparing the BGTG with the already developed Markov models. The final model needed not only subclassification but also the separation of supraventricular and ventricular arrhythmia. A Gaussian mixture model could separate ventricular, supraventricular, and normal beats with high accuracy. Thus, the overall model became an integration of clustering and a Markov model that combines morphology and temporality based on the analysis of intra-beat variations.

The next step was to identify the premature beats that are precursors to arrhythmia and are sometimes associated with fatal heart conditions, especially in old age. The subclassification was already available from the arrhythmia analysis. However, no research was found that could classify premature beats accurately. This led me to investigate patterns of irregular premature beats interleaved with regular sinus beats using a novel look-ahead pattern analysis algorithm, which models a general-purpose nondeterministic automaton and classifies the pattern based on the location of the ectopic focus in the heart.
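The flavor of look-ahead pattern analysis can be conveyed with a toy scanner over beat labels. This sketch is not the dissertation's algorithm: it ignores the location of the ectopic focus and simply looks ahead from each position for repeated motifs, assuming `N` marks a normal sinus beat and `P` a premature beat. The motif table follows the usual clinical definitions (every second, third, or fourth beat premature), and the `min_repeats` threshold is an arbitrary assumption.

```python
# Repeating motifs: N = normal sinus beat, P = premature beat.
MOTIFS = {"bigeminy": "NP", "trigeminy": "NNP", "quadrigeminy": "NNNP"}


def find_patterns(labels, min_repeats=3):
    """Return (name, start, end) for each run of a repeated motif.

    At every position the scanner looks ahead to count how many times
    each motif repeats, and keeps the motif that covers the longest run.
    """
    found = []
    i = 0
    while i < len(labels):
        best = None
        for name, motif in MOTIFS.items():
            reps, j = 0, i
            while labels.startswith(motif, j):
                reps += 1
                j += len(motif)
            if reps >= min_repeats and (best is None or j > best[2]):
                best = (name, i, j)
        if best is not None:
            found.append(best)
            i = best[2]  # resume scanning after the matched run
        else:
            i += 1
    return found


runs = find_patterns("NNNN" + "NP" * 4 + "NNNN")
```

Here `runs` contains a single bigeminy episode spanning the alternating section of the sequence; a run of purely normal beats yields no pattern.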

Subclassification of arrhythmia could be done in real time on a sequential machine. However, the heart has many other abnormalities, such as ischemia, myocardial infarction, and electrolyte imbalance. In order to make time for these analyses, and to extend the analysis to three leads in the future, the computational time for the arrhythmia analysis had to be reduced. Currently available chips for miniaturization use GPUs, and GPUs exploit SIMT parallelism. GPU architectures have their own limitations in terms of memory capabilities, data transfer, and latency.

Fortunately, beats are the basic unit of analysis, which allowed SIMT parallelism to be exploited at the beat level. It turned out that beat-level parallelism was the right granularity. Careful spawning of the various tasks improved the execution time by a factor of four, creating enough time for other abnormality detection in real time.
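The reason beats make a convenient unit of parallel work is that each beat can be analyzed independently of the others. The sketch below illustrates only this decomposition, using a CPU thread pool as a stand-in for GPU threads; it is not SIMT code and is not expected to reproduce the reported speedups. The per-beat function and the toy samples are assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def analyze_beat(beat):
    """Toy per-beat feature: (peak amplitude, number of samples)."""
    return max(beat), len(beat)


def analyze_beats_concurrently(beats, max_workers=4):
    """Apply the same analysis to every beat; results keep the beat order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(analyze_beat, beats))


# Invented sample values for three already segmented beats.
beats = [
    [0.0, 0.1, 1.0, 0.2],
    [0.0, 0.9, 0.1],
    [0.05, 1.2, 0.3, 0.0, 0.0],
]
results = analyze_beats_concurrently(beats)
```

Because no beat depends on the result of another, the same function can be applied to all beats at once, which is the property that beat-level SIMT parallelism relies on.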


Another challenge observed in supraventricular arrhythmia was the similarity of the notched P-wave morphology across various subclasses in the presence of atrial enlargement. To solve this issue, the P-wave was resolved into four states based on the slope of the waveform. The BGMM was modified to include these four states so that it captures the positive and negative notches present in the P-wave due to atrial enlargement. This also improved the classification accuracy by over 2% for supraventricular arrhythmia.

The performance analysis shows that subclasses can be separated with very high accuracy, ranging from 95% to 99% for different subclasses. GPU-based parallel execution for arrhythmia and premature-beat classification achieved speedups of 4.3 and 4.9, respectively. With this accuracy, the diagnosis is currently performed continuously in quantum steps of 30 seconds, alongside data collection. Real-time operation was simulated using delay factors while reading the data.

10.2 Limitations

There are still false-positives and false-negatives due to the inherent problem of separating embedded waves from missing waves, partly because the P-wave area is quite small compared to that of the T-wave. When ectopic nodes are close to the AV-junction, and because of the random irritability of the ectopic nodes, there are cases of PJC and B-PAC that give nearly the same waveform, causing false-positives and false-negatives.

The irregular beat-pattern algorithm has not been parallelized, since it is an iterative algorithm and the boundaries at which to divide the sequence of beats for concurrent analysis are not demarcated.

The current system uses only one lead (Lead II). The accuracy can be improved further using three-lead analysis.

10.3 Future Work

Currently, T-wave inversion, three-dimensional ischemia analysis leading to myocardial infarction, and electrolyte imbalance have not been addressed. Another major problem in old age is hyper myopathy, which can cause valve deformations that lead to blood-flow problems despite a proper heartbeat.

T-wave inversion, together with multiple other morphological features, suggests serious conditions such as myocardial ischemia or pulmonary embolism (blockage in one of the arteries in the lung) [1, 2, 93]. These conditions can exist without showing any other abnormality in the ECG signal, which makes diagnosis based on T-wave inversion challenging [2]. In the future, I plan to extend this research to include a detailed study of T-wave inversion and to analyze the relation between multiple features such as QT elongation and ST elevation.

Another direction for future work is to perform context-sensitive analysis based on knowledge of an existing heart condition, such as hyper myopathy (thickening of heart muscles) and hypertrophies (enlargement of heart chambers). A change in the structure of the heart also deforms the valve structure, which affects the blood flow. In the future, I also intend to integrate MRI images of the heart [94, 95, 96, 97, 98], echograms of blood flow [99, 100, 101, 102], and ECG variations. The ischemia analysis is currently limited to ST-segment analysis. It can be further improved to include T-wave inversion analysis.


REFERENCES

[1] D. H. Bennett, Bennett’s Cardiac Arrhythmias. Wiley, 2013.

[2] D. P. Zipes and J. Jalife, Cardiac Electrophysiology: From Cell to Bedside: Sixth

Edition. 2013.

[3] D. P. Zipes, A. J. Camm, J. L. Tamargo, and J. L. Zamorano, “ACC/AHA/ESC 2006

Guidelines for Management of Patients With Ventricular Arrhythmias and the

Prevention of Sudden Cardiac Death. A Report of the American College of

Cardiology/American Heart Association Task Force and the European Society of

Cardiology Com,” J. Am. Coll. Cardiol., vol. 48, no. 5, 2006.

[4] S. A. Hunt, “ACC/AHA 2005 Guideline Update for the Diagnosis and Management

of Chronic Heart Failure in the Adult: A Report of the American College of

Cardiology/American Heart Association Task Force on Practice Guidelines

(Writing Committee to Update the 2001 Guideli,” Circulation, vol. 112, no. 12, pp.

e154–e235, 2005.

[5] T. Garcia and G. Miller, Arrhythmia Recognition - The Art of Interpretation. 2004.

[6] H. A. Guvenir, B. Acar, G. Demiroz, and A. Cekin, “A supervised machine learning

algorithm for arrhythmia analysis,” Comput. Cardiol., vol. 24, pp. 433–436, 1997.

[7] A. H. Tayal, M. Tian, K. M. Kelly, S. C. Jones, D. G. Wright, D. Singh, J. Jarouse,

J. Brillman, S. Murali, and R. Gupta, “Atrial fibrillation detected by mobile cardiac

outpatient telemetry in cryptogenic TIA or stroke,” in Neurology, vol. 71, no. 21,


pp. 1696–1701, 2008.

[8] K. Van Laerhoven and B. Lo, “Medical healthcare monitoring with wearable and

implantable sensors,” Proc. 3rd Int. Work. Ubiquitous Comput. Healthc. Appl.,

January, pp. 11, 2004.

[9] F. Guo, Y. Li, M. S. Kankanhalli, and M. S. Brown, “An evaluation of wearable

activity monitoring devices,” Proc. 1st ACM Int. Work. Pers. Data Meets Distrib.

Multimed., pp. 31–34, 2013.

[10] D. Coast and R. Stern, “An approach to cardiac arrhythmia analysis using hidden

Markov models,” IEEE Trans. Biomed. Eng., vol. 37, no. 9, pp. 826–836, 1990.

[11] W. T. Cheng and K. L. Chan, “Classification of electrocardiogram using hidden

Markov models,” Proc. 20th Annu. Int. Conf. IEEE Eng. Med. Biol. Soc. Vol.20

Biomed. Eng. Towar. Year 2000 Beyond (Cat. No.98CH36286), pp. 143–146 vol.1,

1998.

[12] S. Osowski and Tran Hoai Linh, “ECG beat recognition using fuzzy hybrid neural

network,” IEEE Trans. Biomed. Eng., vol. 48, no. 11, pp. 1265–1271, 2001.

[13] L. Y. Shyu, Y. H. Wu, and W. Hu, “Using wavelet transform and fuzzy neural

network for VPC detection from the Holter ECG,” IEEE Trans. Biomed. Eng., vol.

51, no. 7, pp. 1269–1273, 2004.

[14] E. D. Übeyli, “ECG beats classification using multiclass support vector machines

with error correcting output codes,” Digit. Signal Process. A Rev. J., vol. 17, no. 3,

pp. 675–684, 2007.


[15] R. J. Martis, C. Chakraborty, and A. K. Ray, “A two-stage mechanism for

registration and classification of ECG using Gaussian mixture model,” Pattern

Recognit., vol. 42, no. 11, pp. 2979–2988, 2009.

[16] F. Sufi, I. Khalil, and A. N. Mahmood, “A clustering based system for instant

detection of cardiac abnormalities from compressed ECG,” Expert Syst. Appl., vol.

38, no. 5, pp. 4705–4713, 2011.

[17] P. R. Gawde, A. K. Bansal, and J. A. Nielson, “Integrating Markov model and

morphology analysis for finer classification of ventricular arrhythmia in real time,”

in 2017 IEEE EMBS International Conference on Biomedical and Health

Informatics (BHI 2017), pp. 409–412, 2017.

[18] P. R. Gawde, A. K. Bansal, and J. A. Nielson, “ECG Analysis for Automated

Diagnosis of Subclasses of Supraventricular Arrhythmia,” in Int’l Conf. Health

Informatics and Medical Systems, pp. 10-16, 2015.

[19] P. R. Gawde, A. K. Bansal, J. A. Nielson, and J. I. Khan, “Bivariate Markov Model

Based Analysis of ECG for Accurate Identification and Classification of Premature

Heartbeats and Irregular Beat-Patterns,” in Proc. IEEE Intelligent Systems

Conference (IntelliSys 2018), pp. 850–859, 2018.

[20] P. R. Gawde, A. K. Bansal, and J. A. Nielson, “Integrating Markov Model, Bivariate

Gaussian Distribution and GPU based Parallelization for Accurate Real-time

Diagnosis of Arrhythmia Subclasses,” in 2018 IEEE Technically Sponsored Future

Technologies Conference, FTC 2018, 2018, to appear.


[21] G. B. Moody, R. G. Mark, and A. L. Goldberger, “Physionet: A web-based resource

for the study of physiologic signals,” IEEE Engineering in Medicine and Biology

Magazine, Vol. 20, No. 3, pp. 70-75, 2001.

[22] The MathWorks, Inc., “MATLAB,” Natick, Massachusetts, United States, 488,

2013, Available at https://www.mathworks.com/.

[23] NVIDIA, “GPU Computing SDK,” 2012. [Online]. Available:

https://developer.nvidia.com/gpu-computing-sdk.

[24] A. L. Goldberger, L. A. N. Amaral, L. Glass, J. M. Hausdorff, P. C. Ivanov, R. G.

Mark, J. E. Mietus, G. B. Moody, C.-K. Peng, and H. E. Stanley, “PhysioBank,

PhysioToolkit, and PhysioNet : Components of a New Research Resource for

Complex Physiologic Signals,” Circulation, vol. 101, no. 23, pp. e215–e220, 2000.

[25] Richard G. Lyons, “Understanding Digital Signal Processing,” pp. 46–50, 2004.

[26] S. W. Smith, The Scientist and Engineer’s Guide to Digital Signal Processing. San Diego: California Technical Publishing, 1997.

[27] L. A. Barford, R. S. Fazzio, and D. R. Smith, “An introduction to wavelets,”

Hewlett-Packard Labs, Bristol, UK, Tech. Rep. HPL-92-124, vol. 2, pp. 1–29, 1992.

[28] A. Einstein, R. G. Henao, P. Delicado, J. Mateu, I. I. Frederick Pearson, R. G. Henao,

G. P. H. Styan, C. Ashcraft, R. G. Grimes, J. G. Lewis, M. Franco-Villoria, R.

Ignaccolo, and J. A. Calvin, “An Introduction to Wavelets,” Stoch. Environ. Res.

Risk Assess., vol. 47, no. 2, pp. 129–146, 2015.

[29] D. Sadhukhan and M. Mitra, “ECG noise reduction using linear and non-linear


wavelet filtering,” Proc. Int. Conf. Comput. Commun. Manuf., pp. 22–23, 2014.

[30] S. Z. Mahmoodabadi, A. Ahmadian, and M. D. Abolhasani, “ECG feature extraction

using Daubechies wavelets,” Proc. Fifth IASTED Int. Conf., vol. 2, no. 2, pp. 343–348, 2005.

[31] U. Rajendra Acharya, J. S. Suri, J. A. E. Spaan, and S. M. Krishnan, “Advances in

cardiac signal processing,” Adv. Card. Signal Process., pp. 1–468, 2007.

[32] A. Haldar and S. Mahadevan, “Probability, Reliability and Statistical Methods in

Engineering Design,” p. 321, 2001.

[33] H. Jeffreys, “Theory of Probability,” Theory Probab., 1967.

[34] R. F. Vieira and E. M. B. D. A. Ten, “Speaker Verification Using Adapted Gaussian

Mixture Models,” pp. 1–4, 2014.

[35] W.-H. Steeb, Y. Hardy, and R. Stoop, “The nonlinear workbook: chaos, fractals,

cellular automata, neural networks, genetic algorithms, gene expression

programming, support vector machine, wavelets, hidden Markov models, fuzzy

logic with C++, Java and symbolic C++ programs,” 2006.

[36] S. J. Russell and P. Norvig, Artificial Intelligence: A Modern Approach, 4th edition,

Pearson Education Ltd., 2016.

[37] L. Scharf, Statistical Signal Processing: Detection, Estimation, and Time Series

Analysis, Pearson Education Ltd., 1990.

[38] A. Gelman, Markov Chain Monte Carlo in Practice, CRC Press, 1997.

[39] W. W. Hwu and D. Kirk, Programming Massively Parallel Processors: A Hands-on Approach, Morgan Kaufmann, 2010.

[40] M. J. Quinn, Parallel Programming, McGraw Hill, 2003.

[41] A. K. Bansal, Introduction to Programming Languages. Chapman and Hall/CRC

Press, 2013.

[42] J. Nickolls, I. Buck, M. Garland, and K. Skadron, “Scalable parallel programming

with CUDA,” Queue - GPU Computing, vol. 6, no. 2, pp. 40-53, 2008.

[43] M. G. Tsipouras, D. I. Fotiadis, and D. Sideris, “Arrhythmia classification using the

RR-interval duration signal,” Comput. Cardiol., pp. 485–488.

[44] O. Wieben, V. X. Afonso, and W. J. Tompkins, “Classification of premature

ventricular complexes using filter bank features, induction of decision trees and a

fuzzy rule-based system,” Med. Biol. Eng. Comput., vol. 37, no. 5, pp. 560–565,

1999.

[45] R. V. Andreao, B. Dorizzi, and J. Boudy, “ECG signal analysis through hidden

Markov models,” IEEE Trans. Biomed. Eng., vol. 53, no. 8, pp. 1541–1549, 2006.

[46] K. S. Park, B. H. Cho, D. H. Lee, S. H. Song, J. S. Lee, Y. J. Chee, I. Y. Kim, and

S. I. Kim, “Hierarchical support vector machine based heartbeat classification using

higher order statistics and hermite basis function,” Comput. Cardiol. pp. 229–232,

2008.

[47] G. De Lannoy, D. François, J. Delbeke, and M. Verleysen, “Weighted SVMs and

feature relevance assessment in supervised heart beat classification,” Commun.

Comput. Inf. Sci. (CCIS), vol. 127, pp. 212–223, 2011.


[48] S. Gradl, P. Kugler, C. Lohmuller, and B. Eskofier, “Real-time ECG monitoring and

arrhythmia detection using Android-based mobile devices,” Proc. Annu. Int. Conf.

IEEE Eng. Med. Biol. Soc. (EMBS), pp. 2452–2455, 2012.

[49] J. A. Nasiri, M. Naghibzadeh, H. S. Yazdi, and B. Naghibzadeh, “ECG Arrhythmia

Classification with Support Vector Machines and Genetic Algorithm,” in Proc. of

Third UKSim European Symposium on Computer Modeling and Simulation, pp.

187-192, 2009.

[50] R. Akbani, S. Kwek, and N. Japkowicz, “Applying support vector machines to

imbalanced datasets,” European Conference of Machine Learning (ECML), pp. 39–

50, 2004.

[51] M. Lagerholm and G. Peterson, “Clustering ECG complexes using hermite

functions and self-organizing maps,” IEEE Trans. Biomed. Eng., vol. 47, no. 7, pp.

838–848, 2000.

[52] Y. H. Hu, S. Palreddy, and W. J. Tompkins, “A patient-adaptable ECG beat

classifier using a mixture of experts approach,” IEEE Trans. Biomed. Eng., vol. 44,

no. 9, pp. 891–900, 1997.

[53] N. Maglaveras, T. Stamkopoulos, K. Diamantaras, C. Pappas, and M. Strintzis,

“ECG pattern recognition and classification using non-linear transformations and

neural networks: A review,” Int. J. Med. Inform., vol. 52, no. 1–3, pp. 191–208, 1998.

[54] P. M. Rautaharju, B. Surawicz, and L. S. Gettes, “AHA/ACCF/HRS


Recommendations for the Standardization and Interpretation of the

Electrocardiogram: Part IV: The ST Segment, T and U Waves, and the QT Interval,”

Circulation, vol. 119, no. 10, pp. e241–e250, 2009.

[55] P. Van Leeuwen, B. Halier, W. Bader, J. Geissler, E. Trowitzsch, and D. H. W.

Grönemeyer, “Magnetocardiography in the diagnosis of fetal arrhythmia,” BJOG

An Int. J. Obstet. Gynaecol., vol. 106, no. 11, pp. 1200–1208, 1999.

[56] N. B. Schiller, P. M. Shah, and A. J. Tajik, “Recommendations for Quantitation of

the Left Ventricle by Two-Dimensional Echocardiography,” J. Am. Soc.

Echocardiogr., vol. 2, no. 5, pp. 358–367, 1989.

[57] R. B. Stamm, B. A. Carabello, D. L. Mayers, and R. P. Martin, “Two-dimensional

echocardiographic measurement of left ventricular ejection fraction: Prospective

analysis of what constitutes an adequate determination,” Am. Heart J., vol. 104, no.

1, pp. 136–144, 1982.

[58] B. S. Everitt and A. Skrondal, “The Cambridge Dictionary of Statistics,” J. Chem.

Inf. Model., vol. 53, no. 9, p. 480, 2010.

[59] W. Press, S. Teukolsky, W. Vetterling, and B. Flannery, “Numerical Recipes: The Art of

Scientific Computing,” Cambridge Univ. Press, 1992.

[60] K. Peeva, H.-J. Vogel, R. Lozanov, and P. N. Peeva, Elsevier’s Dictionary of Mathematics,

Elsevier Science, 2000.

[61] R. J. Prineas, R. S. Crow, and Z.-M. Zhang, “The Minnesota Code Manual of


Electrocardiographic Findings,” Minnesota Code Man. Electrocardiogr. Find., pp.

i–xiii, 2010.

[62] J. W. Mason, E. W. Hancock, and L. S. Gettes, “Recommendations for the

Standardization and Interpretation of the Electrocardiogram,” J. Am. Coll. Cardiol.,

vol. 49, no. 10, pp. 1128–1135, 2007.

[63] M. N. Levy, N. Kerin, and M. Rubenfire, “Three variants of concealed

bigeminy,” Journal of Electrocardiology, vol. 11, no. 2, pp. 185–189, 1978.

[64] N. Kerin, I. Mori, and M. N. Levy, “Ventricular quadrigeminy as a manifestation of

concealed bigeminy,” Circulation, vol. 52, no. 6, pp. 1023–1029, 1975.

[65] M. H. Lee, M. N. Levy, and H. Zieske, “Role of the compensatory pause in the

production of concealed bigeminy,” Am. J. Cardiol., vol. 34, no. 6, pp. 697–703,

1974.

[66] M. N. Levy, N. Kerin, and M. Rubenfire, “Concealed atrial bigeminy and

trigeminy,” J. Electrocardiol., vol. 11, no. 2, pp. 185–189, 1978.

[67] M. T. Satria, S. Gurumani, W. Zheng, K. P. Tee, A. Koh, P. Yu, K. Rupnow,

and D. Chen, “Real-time system-level implementation of a telepresence robot using

an embedded GPU platform,” Proc. 2016 Des. Autom. Test Eur. Conf. Exhib. DATE

2016, pp. 1445–1448, 2016.

[68] E. Domazet, M. Gusev, and S. Ristov, “Optimizing high-performance CUDA DSP

filter for ECG signals,” Ann. DAAAM Proc. Int. DAAAM Symp., vol. 27, no. 1, 2016.

[69] T. J. Jun, H. J. Park, H. Yoo, Y. H. Kim, and D. Kim, “GPU based cloud system for


high-performance arrhythmia detection with parallel k-NN algorithm,” Proc. Annu.

Int. Conf. IEEE Eng. Med. Biol. Soc. (EMBS), pp. 5327–5330, 2016.

[70] X. Fan, R. Chen, C. He, Y. Cai, P. Wang, and Y. Li, “Toward Automated Analysis

of Electrocardiogram Big Data by Graphics Processing Unit for Mobile Health

Application,” IEEE Access, vol. 5, pp. 17136–17148, 2017.

[71] S. Cuomo, P. De Michele, A. Galletti, and L. Marcellino, “A GPU-parallel algorithm

for ECG signal denoising based on the NLM method,” Proc. - IEEE 30th Int. Conf.

Adv. Inf. Netw. Appl. Work. (WAINA), pp. 35–39, 2016.

[72] O. Krejcar, D. Janckulik, L. Motalova, and K. Musil, “Real time processing of ECG

signal on mobile embedded monitoring stations,” 2010 2nd Int. Conf. Comput. Eng.

Appl. (ICCEA), vol. 2, pp. 107–111, 2010.

[73] J. Sanders and E. Kandrot, “CUDA by Example: An Introduction to General-

Purpose GPU Programming,” Concurr. Comput. Pract. Exp., vol. 21, p. 312, 2010.

[74] S. Cook, CUDA Programming: A Developer’s Guide to Parallel Computing with

GPUs, Newnes, 2013.

[75] J. Reese and S. Zaranek, “GPU Programming in MATLAB,” GPU Program.

MATLAB, available at https://www.mathworks.com/company/newsletters/articles/

gpu-programming-in-matlab.html.

[76] PhysioBank Archive, “MIT-BIH Arrhythmia Database,” 2015. [Online]. Available:

https://www.physionet.org/physiobank/database/mitdb/.

[77] I. Silva and G. B. Moody, “An Open-source Toolbox for Analysing and Processing


PhysioNet Databases in MATLAB and Octave,” J. Open Res. Softw., vol. 2, e27,

2014, DOI: 10.5334/jors.bi.

[78] G. B. Moody, “LightWAVE: Waveform and annotation viewing and editing in

a Web browser,” Comput. Cardiol., vol. 40, pp. 17–20, 2013.

[79] S. Kiranyaz, T. Ince, and M. Gabbouj, “Real-Time Patient-Specific ECG

Classification by 1-D Convolutional Neural Networks,” IEEE Trans. Biomed. Eng.,

vol. 63, no. 3, pp. 664–675, 2016.

[80] J. A. Gutiérrez-Gnecchi, R. Morfin-Magaña, D. Lorias-Espinoza, A. D. C. Tellez-

Anguiano, E. Reyes-Archundia, A. Méndez-Patiño, and R. Castañeda-Miranda,

“DSP-based arrhythmia classification using wavelet transform and probabilistic

neural network,” Biomed. Signal Process. Control, vol. 32, pp. 44–56, 2017, DOI:

10.1016/j.bspc.2016.10.005.

[81] F. A. Elhaj, N. Salim, A. R. Harris, T. T. Swee, and T. Ahmed, “Arrhythmia

recognition and classification using combined linear and nonlinear features of ECG

signals,” Comput. Methods Programs Biomed., vol. 127, pp. 52–63, 2016, DOI:

10.1016/j.cmpb.2015.12.024.

[82] P. Rajpurkar, A. Y. Hannun, M. Haghpanahi, C. Bourn, and A. Y. Ng,

“Cardiologist-Level Arrhythmia Detection with Convolutional Neural Networks,”

arXiv, 2017, available at https://arxiv.org/pdf/1707.01836.pdf.

[83] N. A. H. Haldar, F. A. Khan, A. Ali, and H. Abbas, “Arrhythmia Classification using

Mahalanobis Distance based Improved Fuzzy C-Means Clustering for Mobile


Health Monitoring Systems,” Neurocomputing, vol. 220, no. 1, pp. 221–235, 2017,

DOI: 10.1016/j.neucom.2016.08.042.

[84] M. Elgendi, M. Jonkman, and F. De Boer, “Premature Atrial Complexes detection

using the Fisher Linear Discriminant,” Proc. 7th IEEE Int. Conf. Cogn. Informatics

(ICCI), pp. 83–88, 2008.

[85] K. Muthuvel and L. P. Suresh, “Hybrid Features and Classifier for Classification of ECG

Signal,” Research Journal of Applied Sciences, Engineering and Technology, vol.

9, no. 12, pp. 1034–1050, 2015.

[86] C. Phaudphut, C. So-In, and W. Phusomsai, “A parallel probabilistic neural network

ECG recognition architecture over GPU platforms,” Proc. 13th Int. Jt. Conf.

Comput. Sci. Softw. Eng. (JCSSE), pp. 1-7, 2016.

[87] N. Lopes and B. Ribeiro, “Fast Pattern Classification of Ventricular Arrhythmias

Using Graphics Processing Units,” Proc. Iberoamerican Congress on Pattern

Recognition, pp. 603–610, 2009.

[88] P. Li, Y. Wang, J. He, L. Wang, Y. Tian, T. S. Zhou, T. Li, and J. S. Li, “High-

performance personalized heartbeat classification model for long-term ECG signal,”

IEEE Trans. Biomed. Eng., vol. 64, no. 1, pp. 78–86, 2017.

[89] J. W. Chong, N. Esa, D. D. McManus, and K. H. Chon, “Arrhythmia Discrimination

using a Smart Phone,” IEEE J. Biomed. Heal. Informatics, vol. 19, no. 3, pp. 815-

824, 2015.

[90] N. Ikeda, K. Takayanagi, A. Takeuchi, N. Mamorita, and H. Miyahara, “Two types


of distribution patterns of bigeminy and trigeminy in long-term ECG: A model-

based interpretation,” Comput. Cardiol., vol. 35, pp. 1049–1052, 2008.

[91] M. M. A. Rahhal, Y. Bazi, H. Alhichri, N. Alajlan, F. Melgani, and R. R. Yager,

“Deep learning approach for active classification of electrocardiogram signals,” Inf.

Sci. (Ny)., vol. 345, pp. 340–354, 2016.

[92] M. Fatemi and R. Sameni, “An online subspace denoising algorithm for maternal

ECG removal from fetal ECG signals,” Iran. J. Sci. Technol. - Trans. Electr. Eng.,

vol. 41, no. 1, pp. 65–79, 2017.

[93] U. R. Acharya, H. Fujita, S. L. Oh, Y. Hagiwara, J. H. Tan, and ..., “Application of

deep convolutional neural network for automated detection of myocardial infarction

using ECG signals,” Information Science, vol. 415, pp. 190-198, 2017.

[94] G. Garcia, G. Moreira, D. Menotti, and E. Luz, “Inter-Patient ECG Heartbeat

Classification with Temporal VCG Optimized by PSO,” Scientific Reports, vol. 7,

no. 1, 2017.

[95] E. E. Eddleman and H. V. Pipberger, “Computer analysis of the orthogonal

electrocardiogram and vectorcardiogram in 1,002 patients with myocardial

infarction,” Am. Heart J., vol. 81, no. 5, pp. 608–621, 1971.

[96] V. Queiroz, E. Luz, G. Moreira, A. Guarda, and D. Menotti, “Automatic cardiac

arrhythmia detection and classification using vectorcardiograms and complex

networks,” Proc. Annu. Int. Conf. IEEE Eng. Med. Biol. Soc. (EMBS), pp. 5203–

5206, 2015.


[97] M. Vozda and M. Cerny, “Methods for derivation of orthogonal leads from 12-lead

electrocardiogram: A review,” Biomed. Signal Process. Control, vol. 19, pp. 23–34,

2015, DOI: 10.1016/j.bspc.2015.03.001.

[98] L. D. Sharma and R. K. Sunkaria, “Inferior myocardial infarction detection using

stationary wavelet transform and machine learning approach,” Signal, Image Video

Process., vol. 12, no. 2, pp. 199-206, 2018.

[99] W. T. O’Neal, Z. M. Zhang, L. R. Loehr, L. Y. Chen, A. Alonso, and E. Z. Soliman,

“Electrocardiographic Advanced Interatrial Block and Atrial Fibrillation Risk in the

General Population,” Am. J. Cardiol., vol. 117, no. 11, pp. 1755–1759, 2016.

[100] Q. Zhang, X. Huang, R. Eagleson, G. Guiraudon, and T. M. Peters, “Real-time

dynamic display of registered 4D cardiac MR and ultrasound images using a GPU,”

Proc. Medical Imaging: Visualization and Image Guided Procedures, vol. 6509, p.

65092D, 2007.

[101] M. Uecker, S. Zhang, and J. Frahm, “Nonlinear inverse reconstruction for real-time

MRI of the human heart using undersampled radial FLASH,” Magn. Reson. Med.,

vol. 63, no. 6, pp. 1456–1462, 2010.

[102] Q. Zhang, R. Eagleson, and T. M. Peters, “Dynamic real-time 4D cardiac MDCT

image display using GPU-accelerated volume rendering,” Comput. Med. Imaging

Graph., vol. 33, no. 6, pp. 461–476, 2009.
