Data Science and Artificial Intelligence (2)

Approaches to ML (Alpaydin, 2020)

• Supervised – learn to predict a known output
  – Learns from training data
  – Evaluated on separate test data
    • To detect and avoid “overfitting”
• Unsupervised – find naturally occurring patterns or groupings within data
• Semi-supervised – mixture of the two, combining labeled and unlabeled inputs
  – Algorithms find structure and patterns on their own with help from labeled inputs
• Reinforcement learning – learns from ongoing data and results, e.g., from ongoing use in a clinical setting (Gottesman, 2019)
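Why supervised models are evaluated on held-out test data can be sketched in a few lines of plain Python. This is an illustration only, not taken from the cited sources: the toy data, the `train_test_split` helper, the "memorizer" model, and the threshold rule are all hypothetical.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle rows and hold out a fraction of them as test data."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def accuracy(predict, rows):
    """Fraction of (feature, label) rows that predict() labels correctly."""
    return sum(1 for x, y in rows if predict(x) == y) / len(rows)

# Hypothetical toy data: the label is 1 whenever the feature exceeds 50.
rows = [(x, int(x > 50)) for x in range(100)]
train, test = train_test_split(rows)

# An overfit "model": memorizes training pairs, answers 0 for anything unseen.
memory = dict(train)

def memorizer(x):
    return memory.get(x, 0)

# A model that captures the underlying pattern rather than the examples.
def threshold_rule(x):
    return int(x > 50)

print("memorizer train/test:", accuracy(memorizer, train), accuracy(memorizer, test))
print("threshold train/test:", accuracy(threshold_rule, train), accuracy(threshold_rule, test))
```

The memorizer is perfect on the data it has seen, so only the test score can show whether a model generalizes.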

WhatIs7 2

Tasks of ML

• Classification – predict class from one or more features of data
• Regression – predict numerical value from data
• Clustering – group similar items together
• Density estimation – estimate the probability distribution underlying the data
• Dimensionality reduction – reduce many features to few


Approaches to classification

• k-Nearest Neighbors (kNN) – assign the category most common among the k “closest” training examples
• Naïve Bayes – derive conditional probabilities that classify into categories
• Support vector machines (SVMs) – for binary classification, draw “line” that separates one category from the other
• Decision trees – develop set of rules that classify into categories
• Neural networks – (somewhat) mimic human brain using artificial neurons
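As one concrete instance of the approaches above, kNN fits in a few lines. The sketch below is a minimal plain-Python version under stated assumptions: Euclidean distance, a simple majority vote, and hypothetical two-feature training points. The function name `knn_predict` is illustrative, not a library API.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical training examples: (features, label)
train = [
    ((1.0, 1.0), "class A"), ((1.5, 2.0), "class A"), ((2.0, 1.5), "class A"),
    ((6.0, 6.5), "class B"), ((7.0, 6.0), "class B"), ((6.5, 7.5), "class B"),
]

print(knn_predict(train, (1.2, 1.4)))
print(knn_predict(train, (6.8, 6.9)))
```

In practice the choice of k and of the distance measure both matter, and features usually need to be put on comparable scales first.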


Neural networks

• Have come to the fore as the main approach for machine learning, given large amounts of data and increased modern computing power (Choi, 2020)
  – Particular success has been achieved with deep learning, with much internal complexity to networks (Goodfellow, 2016)
  – Neural networks had been around for many decades, but deep learning successes are often attributed to the work of Hinton (2006)
• Mathematics of neural networks is complex, but what they do can be understood in the context of ML tasks


Anatomy and physiology of neural networks (Taylor, 2017; Yiu, 2019)

• Anatomy
  – Layers
  – Nodes and weights – connected like neurons
• Physiology
  – Feedforward – processing from input to output
    • Convolutional neural networks (CNNs) particularly effective for image analysis
  – Feedback – processing may loop backwards
    • Sometimes called recurrent neural networks (RNNs), not as much success as CNNs
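The layers, nodes, weights, and feedforward processing described above can be sketched as a tiny network in plain Python. All weights, biases, and the choice of a sigmoid activation here are hypothetical and chosen only for illustration; real networks learn their weights from training data.

```python
import math

def sigmoid(z):
    """Squash a weighted sum into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(inputs, weights, biases):
    """One layer: each node takes a weighted sum of all inputs, then a sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(node_weights, inputs)) + b)
        for node_weights, b in zip(weights, biases)
    ]

def feedforward(inputs, layers):
    """Pass activations through each layer in turn, from input to output."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations

# Hypothetical network: 2 inputs -> 2 hidden nodes -> 1 output node.
layers = [
    ([[0.5, -0.4], [0.3, 0.8]], [0.0, -0.1]),  # hidden layer: weights, biases
    ([[1.2, -0.7]], [0.05]),                    # output layer: weights, biases
]
output = feedforward([1.0, 2.0], layers)
print(output)
```

A recurrent network differs in that some activations are fed back in as inputs at the next step; this sketch covers only the feedforward case.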


CNNs found to be effective for image classification

• CNNs effective because of subnetworks that distinguish individual features (Geitgey, 2016)
• Thought to mimic human visual cortex (Taylor, 2017)
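How a subnetwork can pick out an individual image feature can be illustrated with a single convolution filter. The sketch below is plain Python with a hypothetical 4x4 image and a hand-written vertical-edge kernel; in an actual CNN the kernel values are learned rather than specified, and many such filters are stacked in layers.

```python
def convolve2d(image, kernel):
    """Slide kernel over image (no padding, stride 1) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

# Hypothetical image: dark left half (0), bright right half (1).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-written kernel that responds where brightness rises from left to right.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d(image, vertical_edge)
for row in feature_map:
    print(row)
```

The feature map is nonzero only in the column where the dark-to-bright edge sits, which is the sense in which a filter "detects" a feature.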


Earliest success with images – comparable to clinicians

Early studies

• Diabetic retinopathy (DR) (Gulshan, 2016; Ting, 2017)
• Histology of cancer (Bejnordi, 2017; Yu, 2017) and metastases (Bejnordi, 2017)
• Tuberculosis (Lakhani, 2017) and pneumonia (Rajpurkar, 2018)
• Skin cancer (Esteva, 2017; Haenssle, 2018; Tschandl, 2018)


Systematic review and meta-analysis of imaging (Liu, 2019)

• Evaluated diagnostic accuracy of deep learning algorithms versus healthcare professionals in classifying diseases using medical imaging
• 69 studies with enough data to construct contingency tables
  – Sensitivity from 9.7% to 100.0% (mean 79.1%)
  – Specificity from 38.9% to 100.0% (mean 88.3%)
• Out-of-sample external validation done in 25 studies, of which 14 made comparison between deep learning models and healthcare professionals in same sample
  – Pooled sensitivity of 87.0% for deep learning models vs. 86.4% for healthcare professionals
  – Pooled specificity of 92.5% for deep learning models vs. 90.5% for healthcare professionals
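Sensitivity and specificity follow directly from the four cells of a contingency table. The sketch below uses hypothetical counts chosen for easy arithmetic; they are not data from Liu (2019) or any study in the review.

```python
def sensitivity(tp, fn):
    """Proportion of diseased cases that the test correctly flags."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Proportion of non-diseased cases that the test correctly clears."""
    return tn / (tn + fp)

# Hypothetical 2x2 contingency table (counts are illustrative only).
tp, fn = 90, 10    # diseased: detected vs. missed
tn, fp = 160, 40   # not diseased: correctly cleared vs. falsely flagged

print(f"sensitivity = {sensitivity(tp, fn):.1%}")
print(f"specificity = {specificity(tn, fp):.1%}")
```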


Clinical prediction beyond images

• Length of stay, mortality, readmission, and diagnosis at two large medical centers (Rajkomar, 2018)
• Prognosis in palliative care (Avati, 2018)
• 30-day readmission in heart failure (Golas, 2018)
• ML-selected variables outperformed expert-selected variables in predicting patient mortality from coronary artery disease (Steele, 2018)
• Age and sex determination from retinal images (Poplin, 2018) or EKG (Attia, 2019)
• Early risk of chronic kidney disease in patients with diabetes (Ravizza, 2019)
• Wide variety of pediatric diagnoses from EHR data at major referral center (Liang, 2019)
• Dementia from EHR data up to two years before clinical diagnosis (Wang, 2019)
• Potential to transform surgical care by augmenting decision to operate, identification and mitigation of modifiable risk factors, decisions regarding postoperative management, and shared decisions regarding resource use (Loftus, 2020)


Other benefits to clinicians

• Automatically charting symptoms from patient-physician conversations (Rajkomar, 2019)
• “Weakly supervised” (using clinical diagnoses) interpretation of pathology slides would allow pathologists to exclude 65–75% of slides while retaining 100% sensitivity (Campanella, 2019)
• Automated software for clinicians with no coding experience able to achieve state-of-the-art results (Faes, 2019)
• Learning clinical alerts to reduce drug prescribing errors and adverse events (Segal, 2019)
  – 85% confirmed clinically valid
  – 80% considered clinically useful
  – 43% caused changes in subsequent medical orders
  – Alert burden low – 0.4% of all medication orders
• Ensemble of ML approaches plus radiologists outperformed any single method for mammography interpretation (Schaffter, 2020)


Results from other applications of deep learning

• Waveforms
  – Cardiac arrhythmia detection comparable to cardiologists (Rajpurkar, 2017)
  – EKG interpretation better than conventional algorithm (Smith, 2018)
  – Arrhythmia detection and classification in ambulatory electrocardiograms (Hannun, 2019)
  – Detecting hyperkalemia from 2 (of 12) EKG leads (Galloway, 2019)
• Genomics
  – Predicting clinical outcomes from cancer genomic profiles (Yousefi, 2017)
  – Calling gene variants in sequencing data (Poplin, 2019)
  – Identifying facial phenotypes of genetic disorders (Gurovich, 2019)
• Drug discovery
  – Retrospective discovery of molecule shown to be effective against a variety of bacteria, with possible prospective benefit of eight more (Stokes, 2020)
• Mobile devices
  – Detect anemia from smartphone pictures (Mannino, 2018)
• Social media data
  – Detecting foodborne illness from Web searching and social media (Sadilek, 2018)


Real-world studies of ML and AI in medicine

• Most evaluation studies have been retrospective and have used cleaned, high-quality data sets
  – Important for developing and validating algorithms
• AI and ML must also be evaluated prospectively in real-world conditions (Keane, 2018; Stead, 2018; Nsoesie, 2018)


Real-world studies

• Eye diseases
  – Diagnosis and treatment decisions for congenital cataracts
    • High accuracy for diagnosis (98%), risk stratification (93-100%), and treatment suggestions (93%) (Long, 2017)
    • Accuracy for diagnosis and treatment determination was 87.4% and 70.8%, significantly lower than the 99.1% and 96.7% of senior consultants, but took less time (2.79 min vs. 8.53 min) (Lin, 2019)
  – Detect previously undiagnosed DR at primary care clinics (Abràmoff, 2018)
    • Sensitivity 87.2%, specificity 90.7%, imageability rate 96.1%
  – Use in rural India (Gulshan, 2019)
    • Sensitivity 88.9%, specificity 92.2%, comparable to manual grading
  – Use in smartphone (Natarajan, 2019)
    • Images from 18 of 231 were deemed ungradable
    • For rest, sensitivity and specificity of referable DR were 100.0% and 88.4%


Real-world studies (cont.)

• Algorithm-assisted pathologists demonstrated higher accuracy than either the deep learning algorithm or pathologist alone (Steiner, 2018)
  – Assistance significantly increased sensitivity of detection for micrometastases (91% vs. 83% alone)
  – Reduced time compared to pathologist alone for positive (61 vs. 116 sec) and negative images (111 vs. 137 sec)
• In colonoscopy
  – Predicted pathology of detected diminutive polyps (≤5 mm) on basis of real-time comparison with pathologic diagnosis of resected specimen (gold standard) to “detect and leave” (Mori, 2018)
    • Negative predictive value 94%
  – Adenoma detection rate improved from 20-30% to 50%, although additional polyps mostly small and benign (Wang, 2019)


Real-world studies (cont.)

• Sepsis surveillance reduced in-hospital mortality and length of stay (Shimabukuro, 2017)
• For noncontrast head CT scans acquired at a single emergency department over three months, sensitivity (98% vs. 87%) and specificity (95% vs. 58%) were lower in the “real-world” setting, but negative predictive value was comparable (97.9% vs. 98.5%) (Lee, 2018)
• ML system better able to detect blind spots in EGD than human endoscopists (Wu, 2019)
• ML better than physicians at identifying major thoracic diseases and their locations on chest x-rays, and increased physician accuracy when used by them (Hwang, 2019)
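Negative predictive value can stay high even when sensitivity and specificity drop, because it also depends on how common the condition is. The sketch below applies Bayes' rule to the two sensitivity/specificity pairs quoted above. The 5% prevalence is an assumption made for illustration and is not reported in these slides, so the outputs are not expected to reproduce the study's own NPV figures.

```python
def negative_predictive_value(sensitivity, specificity, prevalence):
    """Probability that a negative result is a true negative (Bayes' rule)."""
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)

assumed_prevalence = 0.05  # hypothetical; not taken from the source

# Sensitivity/specificity pairs as quoted in the slide text.
npv_higher = negative_predictive_value(0.98, 0.95, assumed_prevalence)
npv_lower = negative_predictive_value(0.87, 0.58, assumed_prevalence)

print(f"NPV at 98%/95%: {npv_higher:.3f}")
print(f"NPV at 87%/58%: {npv_lower:.3f}")
```

When the condition is uncommon, most negative results are true negatives regardless of the test, which is why NPV alone says little about how well a system performs.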


How will ML and AI impact clinical practice?

• Physicians (Jha, 2016; Jha, 2018; Shah, 2019) and ML (Verghese, 2018) must adapt
• “AI won’t replace radiologists, but radiologists who use AI will replace radiologists who don’t,” Langlotz, Stanford radiologist (Reardon, 2019)
  – True for all physicians, even Dr. McCoy?
• Must be “democratizing” role for all in healthcare (Allen, 2019)
