Healthcare Solutions White Paper
Nuance AI Marketplace and mPower

Radiology analytics: Empowering radiologists and accelerating AI adoption
Using clinical analytics for AI model validation, evaluation, and performance monitoring.

By Karen Holzberger, Senior Vice President and General Manager, Diagnostics

Woojin Kim, MD, Musculoskeletal Radiologist and Imaging Informaticist

In a few short years, perceptions of AI in radiology have shifted from viewing the technology as a threat to the profession to seeing it as an exciting innovation with the potential for faster and more accurate diagnosis, reduction of repetitive and mundane tasks, and faster scanning with improved image quality.

Commercial AI developers are now offering applications for a range of medical imaging findings and modalities, and AI marketplaces or “app stores” simplify access to these AI models. Multiple industry initiatives also are focused on the regulatory, medico-legal, ethical, data science, and other challenges of developing and using AI in clinical practice. They include activities conducted under the auspices of the American College of Radiology (ACR) and the Radiological Society of North America (RSNA), at radiology teaching institutions, and by hospital systems, healthcare IT vendors, and AI developers. Many efforts are focused on the quantity, quality, and use of data for AI model training, validation, and surveillance over time.

Radiologists as the “essential data scientists of medicine”

At a grassroots level, radiologists are fulfilling the role urged by the ACR in 2016 as “essential data scientists of medicine,” actively guiding the development and adoption of AI in ways similar to how they shaped the implementation of PACS and RIS in the past. One important area where they are taking the lead is in applying advanced clinical analytics to create their own datasets for AI model training and validation. Clinical analytics are typically used to mine radiology report data for performance measurement, follow-up tracking, and other quality improvement programs. But analytics tools also can quickly define and extract data to generate and validate AI models. In addition, analytics reports can be used to assess clinical needs and associated opportunities for AI use cases.

The use of internal data helps ensure that the results of AI models accurately represent the demographics of a hospital system’s patient population, imaging systems, and protocols in use. Internal data access and analytics also enable quicker AI model generation and evaluation, as well as comparisons of the performance of various AI models. At the same time, patient data remains secure within the hospital’s systems.

A surge of activity

It was only in late 2016 that Geoffrey Hinton, “the Godfather of Deep Learning,” declared that deep learning algorithms would outperform human radiologists within a decade, and that “people should stop training radiologists now.” The same year, AI pioneer Andrew Ng told The Economist that it would be easier to replace radiologists with machines than it would be to replace their executive assistants.

Nonetheless, radiologists not only continued but expanded research into potential uses of the technology. In April 2018, the editor of the journal Radiology noted “a remarkable transformation” underway in the investigation of AI to diagnose and detect disease. He noted that the journal had no publications about AI in 2015, three in 2016, and 17 in 2017. In 2019, the RSNA launched a new journal, Radiology: Artificial Intelligence, dedicated to AI applications in medical imaging.


The publication boom across the profession has itself prompted studies. One, in the December 2019 issue of the American Journal of Roentgenology, examined global trends in radiology AI research published in scientific and medical journals and proceedings. “Our bibliometric analysis yielded 8813 radiology AI publications worldwide from 2000 to 2018,” wrote authors West et al., who called the growth in AI research “exponential.”

Much of the attention has focused on the development and performance of AI models for a variety of image characterization use cases. For example, some of the AI models already on the Nuance AI Marketplace include:

• Triage algorithms from Aidoc for the detection and prioritization of intracranial hemorrhage, pulmonary embolism, and spinal fractures on the radiologist worklist.

• FDA-cleared algorithms from MaxQ for the detection of intracranial hemorrhage.

• An FDA-cleared application for the automatic detection of lung nodules from Riverain Technologies, which is working with Nuance to pilot its ClearRead CT algorithm.

• VIDA’s LungPrint Discovery algorithm, which is an automated AI-powered analysis of an inspiratory chest CT scan that identifies lung density abnormalities. VIDA also is collaborating with Nuance on a pilot.

• Zebra Medical Vision’s four FDA-cleared AI algorithms: a cardiology offering that calculates coronary calcium scores based on gated CT scans, and three worklist triage offerings for intracranial hemorrhage (automatically detecting brain bleeds on standard, non-contrast brain CTs), pneumothorax (detected on chest CR, DR, and/or DX images), and pleural effusion.

Understanding and responding to the data science challenges

Most AI models for medical imaging use machine learning algorithms. First introduced in 1959 by AI pioneer Arthur Samuel, the term machine learning describes computers that learn automatically from exposure to data; AI models learn by being trained with more data over time with the goal of improved performance. Developers and users validate performance by comparing model output to “ground truth,” or known results in the real world. For example, a finding detected by a model can be compared to data in the medical record or radiology report. A radiologist’s own observations can also serve as a reference standard to supplement ground truth data for evaluating model output.

However, monitoring of AI model performance must continue beyond initial development and training due to the nature of machine learning algorithms. AI developers and users need to monitor continuously for performance decay and adjust the algorithm, the data, or both as needed to maintain optimal performance.

Radiologists should be aware of three important data science concerns when developing or applying AI models in clinical use:

– Brittleness: AI models can be brittle, or prone to producing less accurate results when exposed to new or different datasets. The term “overfitting” sometimes applies, where an algorithm essentially memorizes or over-learns the statistical characteristics or parameters of its training data and then performs less well when exposed to new data. Developers address this with larger training datasets and multiple rounds of testing until accuracy warrants more generalized use.

In radiology, brittle models perform well with data from a single hospital environment but stumble when used with different data from other facilities. In a study published in November 2018 in PLOS Medicine, researchers at the Icahn School of Medicine reported that AI model performance in diagnosing pneumonia on chest X-rays was significantly lower using data from two other institutions.

– Concept drift: Concept drift describes changes to the properties of the “target variable,” or the result that the model is expected to produce. While the input data remain consistent, the condition or concept that you are trying to predict changes. AI models that predict human behavior or biological processes are particularly prone to concept drift. For example, a radiology practice using an AI model that predicts patient no-show rates for an MRI exam may find that it becomes less accurate over time. That could be caused by seasonality, where the no-show rates may be higher during the winter holiday season than during the fall. By adding a seasonality variable to the AI model, its performance may improve, and the rate of model decay may decrease. Users need to monitor the quality of model output, adjust model inputs to account for the drift, re-label the training data, and retrain or “refresh” the model.

– Data drift: In data drift, the input data can change unexpectedly as new sources are used, or as changes are made to the systems generating data. In radiology, that can be caused by different data from external sources or changes in imaging devices, new imaging protocols, and other factors. Again, model output quality should be monitored, data labeled with new classes, and the model retrained.

There are different approaches to how concept drift and data drift are detected and remedied, and AI developers also have various methods for predicting those phenomena. Radiologists, in any case, need to be aware of both concerns when evaluating and using AI models in clinical practice. In fact, radiologists are instrumental in the ongoing surveillance of AI model performance, and in helping AI developers diagnose and correct problems. The simplified sketches below illustrate how checks for each of the three concerns might look in practice.
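First, a minimal sketch of how a site might quantify brittleness by comparing a model’s discrimination on internal versus external test data. It assumes a scikit-learn-style binary classifier; the variable names and the AUC gap mentioned in the comments are illustrative assumptions, not established standards.

```python
# Sketch: quantify brittleness as the gap between internal (same-site) and
# external (other-site) test performance. Assumes a scikit-learn-style
# binary classifier; all names here are hypothetical placeholders.
from sklearn.metrics import roc_auc_score

def brittleness_gap(model, internal_X, internal_y, external_X, external_y):
    """Return internal AUC, external AUC, and their difference."""
    auc_internal = roc_auc_score(internal_y, model.predict_proba(internal_X)[:, 1])
    auc_external = roc_auc_score(external_y, model.predict_proba(external_X)[:, 1])
    return auc_internal, auc_external, auc_internal - auc_external

# A large gap (say, more than 0.05 AUC) would suggest the model has
# over-learned site-specific characteristics such as scanner models,
# protocols, or patient demographics.
```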
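Next, a sketch of concept-drift monitoring for the no-show example above: track the model’s accuracy over monthly windows and flag a sustained dip below the validation baseline. The column names and tolerance are hypothetical.

```python
# Sketch: monitor a no-show prediction model for concept drift by tracking
# accuracy per calendar month. Column names and threshold are hypothetical.
import pandas as pd

def monthly_accuracy(df: pd.DataFrame) -> pd.Series:
    """df columns: exam_date (datetime), predicted_no_show, actual_no_show."""
    by_month = df.groupby(df["exam_date"].dt.to_period("M"))
    return by_month.apply(lambda g: (g["predicted_no_show"] == g["actual_no_show"]).mean())

def is_drifting(acc_by_month: pd.Series, baseline: float, tolerance: float = 0.05) -> bool:
    # Flag drift when the most recent month falls well below the baseline
    # accuracy measured at validation time, e.g., during the holiday season.
    # A remedy might be retraining with an added seasonality feature.
    return acc_by_month.iloc[-1] < baseline - tolerance
```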
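Finally, a sketch of data-drift detection: compare the distribution of an input feature between training data and recent production data with a two-sample test. The alpha cutoff is an illustrative choice, not a universal standard.

```python
# Sketch: detect data drift by comparing the distribution of an input
# feature (e.g., patient age or an image intensity statistic) between
# training data and recent production data.
import numpy as np
from scipy.stats import ks_2samp

def feature_drifted(train_values: np.ndarray, recent_values: np.ndarray,
                    alpha: float = 0.01) -> bool:
    """Two-sample Kolmogorov-Smirnov test; True if distributions differ."""
    statistic, p_value = ks_2samp(train_values, recent_values)
    return p_value < alpha

# A new scanner or protocol often shifts such statistics before any drop
# in model accuracy becomes visible, which is why input monitoring helps.
```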

Radiologists are addressing the data science challenges and the need for AI model evaluation and performance surveillance within their own institutions and, at scale, in collaborative efforts with other organizations. In one notable pilot, radiologists from seven leading healthcare organizations are partnering with the ACR, NVIDIA, and Nuance to speed and simplify the development and adoption of high-quality AI models.

The pilot includes Massachusetts General Hospital, Ohio State University, Lahey Hospital and Medical Center, Emory University, the University of Washington, the University of California San Francisco, and Brigham and Women’s Hospital. Participants are using AI-LAB™, ACR’s free software platform, to develop, validate, and refine AI models developed at the other organizations using their own patient data. Patient data stays securely on-site at each originating facility. The goal is to develop AI models that meet each institution’s specific clinical needs and have the potential for generalized commercial use. The initiative is expected to expand to other institutions after the pilot is completed.

NVIDIA® is providing its Clara™ AI software toolkits at no cost to enable participants to create AI models without programming knowledge. Nuance provides the last-mile technology required to integrate AI models on the radiologist’s desktop through its PowerShare image sharing network and its flagship PowerScribe radiology reporting solutions.

In addition to the ACR pilot, the Nuance AI Marketplace enables user-developer collaboration by automatically capturing differences between the final radiology report and the AI results, and it makes these performance results available so that AI developers can improve model performance. The AI Marketplace currently reaches 80 percent of radiologists at the 7,000+ healthcare facilities already using PowerScribe and PowerShare.

[Figure: Nuance AI Marketplace workflow. AI developers define and build models, publish and share them on the Nuance AI Marketplace, and radiologists access, subscribe, and use them within Nuance diagnostic solutions (such as PowerScribe) via Nuance PowerShare, with validation and feedback flowing back to developers.]

Empowering AI adoption and improving radiology outcomes with mPower Clinical Analytics

As you explore the potential of AI models within your own radiology practice, mPower Clinical Analytics from Nuance can inform and guide your process before, during, and after AI adoption. At the same time, it can generate meaningful and actionable insights to implement quality and productivity improvements. A packaged subset of mPower features also is available in HealthCheck, a clinical analytics service that provides a customized performance snapshot.

Designed and validated by radiologists, mPower uses advanced text mining and natural language processing to identify and extract data elements from unstructured free text in radiology reports. It automates what otherwise would be time-consuming and hard-to-scale manual searching for data. That makes it practical to create AI model training and validation datasets using patient data that remains safely on site.

Currently, AI models are tested against either publicly available datasets or datasets obtained by AI developers. As noted in the description of brittleness above, there is considerable variability in real-world performance. That may be the result of data generated by different makes and models of imaging equipment, as well as varying scanning protocols and parameters. Differences in patient populations also can have a profound impact on AI model performance and lead to clinical and ethical concerns related to algorithm bias. Concept drift and data drift make it crucial to perform ongoing surveillance and repeated validations.

[Sidebar: Nuance provides a more natural and insightful approach to clinical documentation, freeing clinicians to spend more time caring for their patients. Nuance healthcare solutions capture and communicate more than 300 million patient stories each year, helping more than 500,000 clinicians in 10,000 healthcare organizations globally. Nuance’s award-winning clinical speech recognition, medical transcription, CDI, coding, quality, and diagnostic imaging solutions provide a more complete and accurate view of patient care, which drives meaningful clinical and financial outcomes.]

In the September 2019 issue of the Journal of the American College of Radiology, Daniel L. Rubin, MD, MS, describes in detail a structured process for radiologists to actively engage in the adoption of AI tools in clinical practice. Dr. Rubin defines seven steps for radiologists to follow when evaluating AI models:

1. Understand the key model outputs and decide which are relevant to clinical needs.
2. Collect representative patient test cases.
3. Establish ground truth for each test case.
4. Choose an appropriate evaluation metric, such as sensitivity, specificity, or positive predictive value.
5. Define a performance threshold for the metric.
6. Evaluate the test cases against the metric.
7. Implement a monitoring strategy.
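To make steps 4 through 6 concrete, here is a minimal sketch that computes sensitivity, specificity, and positive predictive value from binary model findings and ground-truth labels, then checks a chosen threshold. The sample labels and the 0.90 threshold are illustrative assumptions, not recommendations from Dr. Rubin’s paper.

```python
# Sketch of steps 4-6: compute sensitivity, specificity, and positive
# predictive value (PPV) over a set of test cases, then check a threshold.
def evaluate(model_findings: list[bool], ground_truth: list[bool]) -> dict:
    tp = sum(m and g for m, g in zip(model_findings, ground_truth))
    tn = sum(not m and not g for m, g in zip(model_findings, ground_truth))
    fp = sum(m and not g for m, g in zip(model_findings, ground_truth))
    fn = sum(not m and g for m, g in zip(model_findings, ground_truth))
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        "ppv": tp / (tp + fp) if tp + fp else float("nan"),
    }

# Hypothetical example: four test cases with radiologist-confirmed labels.
metrics = evaluate(model_findings=[True, True, False, False],
                   ground_truth=[True, False, False, True])
meets_threshold = metrics["sensitivity"] >= 0.90  # step 5: chosen threshold
```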

AI in Action: at scale and with confidence

With mPower, radiologists can train, validate, and monitor the performance of AI models with data that remains protected within their institutions. They can identify test datasets for specific diagnoses to test various AI models from different developers against the institution’s own data. And its AI-powered analytics functions support continuous improvement programs and follow-up compliance tracking.
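mPower’s text mining is proprietary and far more capable than any toy example, but the following sketch illustrates the general idea of mining report text to assemble a diagnosis-specific test set with labels kept on site; the regexes, report texts, and accession numbers are hypothetical.

```python
# Toy illustration only, not mPower's implementation: label reports for a
# specific finding (pneumothorax) with a crude regex and negation check,
# producing a candidate test set to be confirmed by radiologist review.
import re

PNEUMOTHORAX = re.compile(r"\bpneumothorax\b", re.IGNORECASE)
NEGATION = re.compile(r"\bno (evidence of )?pneumothorax\b", re.IGNORECASE)

def label_report(report_text: str) -> bool:
    """True if the report asserts a pneumothorax finding (crudely)."""
    return bool(PNEUMOTHORAX.search(report_text)) and not NEGATION.search(report_text)

reports = {
    "acc-001": "Small right apical pneumothorax.",
    "acc-002": "No evidence of pneumothorax or effusion.",
}
test_set = {acc: label_report(text) for acc, text in reports.items()}
# {'acc-001': True, 'acc-002': False}: a starting point for ground truth.
```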

mPower is a key element of Nuance’s comprehensive “AI in Action” solution set, combining the PowerScribe One radiology reporting cloud platform, the PowerShare Network, the Nuance AI Marketplace, and NVIDIA Clara support to give radiologists the tools and the confidence to realize the clinical and financial value of AI at scale.

About the authors

Karen Holzberger Senior Vice President and General Manager, Diagnostics Nuance Communications

Karen joined Nuance in 2014 with more than 15 years of experience in the healthcare industry. Prior to Nuance, she was vice president and general manager of Global Radiology Workflow at GE Healthcare, where she managed service, implementation, product management, and development for mission-critical healthcare IT software. Karen attended Stevens Institute of Technology, where she earned a B.S. in Mechanical Engineering.

Woojin Kim, MD Musculoskeletal Radiologist and Imaging Informaticist

Dr. Kim is a musculoskeletal radiologist at the VA Palo Alto, CA. He was the Chief Medical Information Officer at the Healthcare Division of Nuance Communications from 2016 to 2019. He was a co-founder, member of the Board of Directors, and Director of Innovation at Montage Healthcare Solutions, which was acquired by Nuance in 2016. Previously, Dr. Kim served as interim Chief of the Division of Musculoskeletal Imaging, Director of the Center for Translational Imaging Informatics, and Chief of Radiography at the Hospital of the University of Pennsylvania, where he also completed his radiology residency and MSK fellowship training. He completed an Imaging Informatics fellowship at the University of Maryland/Baltimore VA Medical Center. Dr. Kim has been an active member of the imaging informatics community within various societies, including ACR, SIIM, and RSNA, with a focus on data mining, analytics, and machine learning.

About Nuance Communications, Inc. Nuance Communications, Inc., is a leading provider of voice and language solutions for businesses and consumers around the world. Its technologies, applications, and services make the user experience more compelling by transforming the way people interact with devices and systems. Every day, millions of users and thousands of businesses experience Nuance’s proven applications. For more information, visit www.nuance.com/healthcare or call 1-877-805-5902. Connect with us through the healthcare blog, What’s next, Twitter, LinkedIn, and Facebook.

Copyright © 2019 Nuance Communications, Inc. All rights reserved. Nuance, Dragon, PowerScribe, and the Nuance logo are registered trademarks and/or trademarks of Nuance Communications, Inc., and/or its subsidiaries in the United States and/or other countries. All other trademarks referenced herein are properties of their respective owners.

HC_4370 JAN 2020