AI: A Glossary of Terms

Disclaimer

The glossary of terms contains entries that we think might come in handy when studying Artificial Intelligence in Medical Imaging. Many of the terms can be found in the preceding chapters. Some descriptions were found on the Internet (see excellent websites such as Techopedia and Medium). In most cases no authors could be traced. If sources were identified, we obtained permission to reproduce. For legibility we avoided mentioning the sources at each entry. References will be given upon request. The authors appreciate feedback if sources are wrongly omitted.

Thanks to (in alphabetical order): Ameen Abu-Hanna, Anjum Ahmed, Brad Genereaux, Peter van Ooijen, and Martijn Schut.

Jiapan Guo, Postdoc in Medical Imaging Informatics in University Medical Center Groningen, The Netherlands ([email protected])

Violet Farhang-Razi, MD Northwest Hospital Alkmaar, The Netherlands ([email protected])

Paul Algra MD PhD, neuroradiologist Northwest Hospital Alkmaar, The Netherlands ([email protected])

Assembled by Jiapan Guo, Violet Farhang-Razi and Paul Algra (editor).

© Springer Nature Switzerland AG 2019 347 E. R. Ranschaert et al. (eds.), Artificial Intelligence in Medical Imaging, https://doi.org/10.1007/978-3-319-94878-2 Glossary

A

Algorithm A formula or set of rules (or procedure, processes, or instructions) for solving a problem or for performing a task. In Artificial Intelligence, the algorithm tells the machine how to find answers to a question or solutions to a problem. In machine learning, systems use many different types of algorithms. Common examples include decision trees, clustering algorithms, classification algorithms, and regression algorithms.

AlexNet The name of a neural network that won the ImageNet Large Scale Visual Recognition Challenge in 2012. It is named after Alex Krizhevsky, then a computer science PhD student at the University of Toronto. See ImageNet.

AlphaGo AlphaGo is the first computer program that defeated a professional player at the board game Go, in October 2015. Later, in October 2017, AlphaGo’s team released a new version named AlphaGo Zero, which is stronger than any of the previous versions that defeated human champions. Go is played on a 19 by 19 board, which allows for 10^171 possible layouts (chess: 10^50 configurations). It is estimated that there are 10^80 atoms in the universe.

Analogical Reasoning Solving problems by using analogies, that is, by comparing to past experiences.

Anonymization The process in which data is de-identified as part of a mechanism to submit data for machine learning.

Area under curve (AUC) The area under a curve between two points is calculated by performing the definite integral. In the context of a receiver operating characteristic for a binary classifier, the AUC summarizes how well the classifier separates the two classes: it equals the probability that a randomly chosen positive case is scored higher than a randomly chosen negative case.

Artificial Intelligence (AI) Artificial intelligence (or machine intelligence) refers to systems that display intelligent behavior by analyzing their environment and taking actions, with some degree of autonomy, to achieve specific goals. AI-based systems can be purely software-based, acting in the virtual world (e.g., voice assistants, image analysis software, search engines, speech and face recognition systems), or AI can be embedded in hardware devices (e.g., advanced robots, autonomous cars, drones, or Internet of Things applications). The term AI was first coined by John McCarthy in 1956.

Artificial Intelligence complete AI-complete, which is short for Artificial Intelligence complete and is sometimes called AI-hard, describes computational problems whose complexity is equal to that of the entire AI problem, which aims at producing a general computerized system with human-level intelligence. Calling a problem AI-complete expresses the fact that it cannot easily be solved by a simple specific algorithm.

Artificial Intelligence Winters (AIWI) Artificial Intelligence Winters are periods of time during which artificial intelligence experienced reduced funding for research
and low interest from the public. Two major winter periods were in 1974–1980 and 1987–1993. AI winters are the result of inflated, unrealistic expectations.

Artificial General Intelligence (AGI) Artificial general intelligence, as opposed to narrow intelligence, is also known as complete, strong, or super intelligence, or Human Level Machine Intelligence. It indicates the ability of a machine to successfully perform any intellectual task that a human being can.

Artificial Superintelligence (ASI) Artificial superintelligence is a term referring to the time when the capability of computers will surpass humans. “Artificial intelligence,” which has been much used since the 1970s, refers to the ability of computers to mimic human thought. Artificial superintelligence goes a step beyond and posits a world in which a computer’s cognitive ability is superior to a human’s.

Artificial Narrow Intelligence (ANI) Artificial Narrow Intelligence, also known as weak or applied intelligence, represents most of the current artificial intelligence systems, which usually focus on a specific task. Narrow AIs are mostly much better than humans at the task they were made for: consider, for example, face recognition, chess computers, calculus, and translation. The definition of artificial narrow intelligence is in contrast to that of strong AI or artificial general intelligence, which aims at providing a system with consciousness or the ability to solve any problem. Virtual assistants and AlphaGo are examples of artificial narrow intelligence systems.

Artificial Neural Network (ANN) Artificial Neural Network (ANN) is a computational model in machine learning, which is inspired by the biological structures and functions of the mammalian brain. Such a model consists of multiple units called artificial neurons, which build connections between each other to pass information. The advantage of such a model is that it progressively “learns” the tasks from the given data without task-specific programming.

Artificial Neuron An artificial neuron is a digital construct that seeks to simulate the behavior of a biological neuron in the brain. Artificial neurons are typically used to make up an artificial neural network; these technologies are modeled after human brain activity.

Asimov Isaac Asimov’s Three Laws are as follows: (1) A robot may not injure a human being. (2) A robot must obey orders, unless they conflict with law number one. (3) A robot must protect its own existence, as long as those actions do not conflict with either the first or second law.

Association Subcategory of unsupervised learning. It can be best explained by market basket analysis (MBA). MBA attempts to identify associations or relations between various items that have been chosen by a particular shopper and placed in their respective baskets (real or virtual). The value of the output lies in cross marketing of products and customer behavior analysis. Association is the generalization of MBA. Example: there is a good chance that a customer will buy bread if they have already bought milk and eggs.

Augmented Intelligence Augmented Intelligence is the intersection of machine learning and advanced applications, where clinical knowledge and medical data converge on a single platform. The potential benefits of Augmented Intelligence are realized when it is used in the context of workflows and systems that healthcare practitioners operate and interact with. Unlike Artificial Intelligence, which tries to replicate human intelligence, Augmented Intelligence works with and amplifies human intelligence.

Autoregressive Model An autoregressive model is a time series model that uses observations from previous time steps as input to a regression equation to predict the value at the next time step. In statistics and signal processing, an autoregressive model is a representation of a type of random process. It is used to describe certain time-varying processes in nature, economics, etc.
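
As a minimal sketch of the Autoregressive Model entry above, the following Python code fits a first-order model x[t] = c + phi * x[t-1] by ordinary least squares and forecasts one step ahead. The function names fit_ar1 and predict_next, the toy series, and the restriction to a single lag are assumptions made for illustration; they are not taken from the glossary.

```python
def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1]; returns (c, phi)."""
    prev = series[:-1]  # inputs: observations at time t-1
    curr = series[1:]   # targets: observations at time t
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_curr = sum(curr) / n
    # Slope and intercept of a simple linear regression of curr on prev.
    cov = sum((p - mean_prev) * (q - mean_curr) for p, q in zip(prev, curr))
    var = sum((p - mean_prev) ** 2 for p in prev)
    phi = cov / var
    c = mean_curr - phi * mean_prev
    return c, phi


def predict_next(series, c, phi):
    """One-step-ahead forecast from the last observation."""
    return c + phi * series[-1]


# Toy, noise-free series generated by x[t] = 1.0 + 0.5 * x[t-1]
# (hypothetical data, chosen only so the fit can be checked by eye).
series = [0.0]
for _ in range(10):
    series.append(1.0 + 0.5 * series[-1])

c, phi = fit_ar1(series)
forecast = predict_next(series, c, phi)
print(c, phi, forecast)
```

On this noise-free series the fit should recover the generating coefficients up to floating-point error. Real data would be noisy and might call for more lags, in which case a statistics library is the more appropriate tool.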

B

Backpropagation Backpropagation, also called “backward propagation of errors,” is an approach that is commonly used in the training process of deep neural networks to reduce errors. It allows the machine learning algorithm to adjust itself by looking at its past performance. It involves the calculation of errors between the prediction and the target values, the computation of the gradient of the error function, and then the update of the weights. See also feedforward neural network.

Backward Chaining Backward chaining, also called goal-driven inference technique, is an inference approach that reasons backward from the goal to the conditions used to get the goal. Backward chaining inference is applied in many different fields, including game theory, automated theorem proving, and artificial intelligence.

Batch Normalization A preprocessing step where the data are centered around zero, and often the standard deviation is set to unity.

Bayesian Filter A Bayesian filter is a program using Bayesian logic. It is used to evaluate the header and content of email messages and determine whether or not they constitute spam, that is, unsolicited email or the electronic equivalent of hard copy bulk mail or junk mail. A Bayesian filter works with probabilities of specific words appearing in the header or content of an email. Certain words, such as Viagra and refinance, indicate a high probability that the email is spam.

Bayesian Network A Bayesian Network, also called Bayes Network, belief network, or probabilistic directed acyclic graphical model, is a probabilistic graphical model (a statistical model) that represents a set of variables and their conditional dependencies via a directed acyclic graph (see DAG).

Biased algorithm See Inadvertent effects of AI.

Big Data The term big data is used when traditional data mining and handling techniques cannot uncover the insights and meaning of the underlying data. Data that are unstructured or time sensitive or simply very large cannot be processed by relational database engines. This type of data requires a different processing approach which uses massive parallelism on readily available hardware.

Blockchain Blockchain is a distributed system that records transactions across all users in an expanding chain of encrypted blocks. Blockchain builds a decentralized ledger in which every user has the same copy of the record. The records cannot be easily altered unless all of them are altered. Blockchain was invented in 2008 to serve as the public transaction ledger of the cryptocurrency Bitcoin. Such a system also shows potential applications in different fields regarding the recording of events, medical records, and other record management systems.

Boolean neural network A Boolean neural network is an artificial neural network approach which consists only of Boolean neurons (and, or, not). Such an approach reduces the use of memory space and computation time. It can be implemented on programmable circuits such as FPGAs (field-programmable gate arrays) or integrated circuits.

C

Caffe Caffe is short for Convolutional Architecture for Fast Feature Embedding, which is an open source deep learning framework developed at Berkeley AI Research. It supports many different deep learning architectures and GPU-based acceleration computation kernels.

Case-Based Reasoning (CBR) Case-Based Reasoning is a way to solve a new problem by using solutions to similar problems. It has been formalized as a process consisting of case retrieval, solution reuse, solution revision, and case retention.

CE Marking A certification marking indicating conformity with standards for products sold within the European Economic Area. In the context of medical devices, CE Marking is similar to US Food and Drug Administration approval.

Central processing unit (CPU) Central processing unit is the electronic circuit within a computer
that carries out the instructions of a computer program by performing the basic arithmetic, logical, control, and input/output operations specified by the instructions (see also GPU).

Chatbot Chatbot, also known as interactive agent, is an artificial intelligence system that uses natural language processing techniques to conduct a conversation via audio or text. The most recognizable examples of chatbots are Apple’s Siri, Microsoft’s Cortana, and Amazon’s Alexa.

Classification Classification is a general process for categorization which assigns a label to the samples. A classification system is an approach to accomplish categorization of samples.

Clinical Decision Support (CDS) A clinical decision support system is a health information technology system that is designed to provide physicians and other health professionals with clinical decision support, that is, assistance with clinical decision-making tasks.

Cloud The cloud is a general metaphor that is used to refer to the Internet. Initially, the Internet was seen as a distributed network and then, with the invention of the World Wide Web, as a tangle of interlinked media. As the Internet continued to grow in both size and the range of activities it encompassed, it came to be known as “the cloud.” The use of the word cloud may be an attempt to capture both the size and nebulous nature of the Internet.

Cloud Computing Cloud Computing enables access to and usage of shared computer resources that can be provisioned with minimum management effort. The cloud is a general metaphor to refer to a group of networked computer resources that can provide computing services while avoiding up-front IT infrastructure costs.

Clustering Clustering is a task to organize data into groups based on certain properties. Clustering analysis is widely used in data mining for tasks such as image analysis and computer graphics, among others.

Cognitive computing Cognitive computing is used to refer to the systems that simulate the human brain to help with decision-making. It uses self-learning algorithms that perform tasks such as natural language processing, image analysis, reasoning, and human–computer interaction. Examples of cognitive systems are IBM’s Watson and Google DeepMind.

Cohort A sample in a clinical study (conducted to evaluate a machine learning algorithm, for example) that is followed prospectively or retrospectively. Subsequent status evaluations with respect to a disease or outcome are conducted to determine which of the participants’ initial exposure characteristics (risk factors) are associated with it.

Computer-Aided Detection/Diagnosis (CAD) Computer-aided detection (CAD), or computer-aided diagnosis (CADx), uses computer programs to assist radiologists in the interpretation of medical images. CAD systems process digital images for typical appearances and highlight suspicious regions in order to support a decision taken by a professional.

Common Data Element (CDE) Common Data Element is a tool to support data management for clinical research.

Convolution The process of filtering. A filter (or equivalently: a kernel or a template) is shifted over an input image. Each pixel of the output image is the sum of the products of the values in the filter pixels and the corresponding values in the underlying image.

Convolutional neural network (CNN) A convolutional neural network is a specific type of artificial neural network that uses perceptrons, a machine learning unit algorithm, for supervised learning, to analyze data. CNNs are applied to image processing, natural language processing, and other kinds of cognitive tasks. A convolutional neural network is also known as a ConvNet. A CNN consists of an input and an output layer as well as multiple hidden layers, which are formed as mathematical operations. The hidden layers include convolutional layers, pooling layers, normalization layers, and fully connected layers. Since the success of AlexNet (see AlexNet) in the ImageNet competition in 2012, there has been a rapid evolution of CNNs. VGGNet, GoogLeNet, ResNet, and DenseNet are some successful examples. See Multilayer neural network.

Computer Vision Computer Vision is an interdisciplinary field that uses computer science techniques to analyze and understand digital images and videos. Computer vision tasks include object recognition, event detection, motion detection, and object tracking, among others.

D

Data Data is a collection of qualitative and quantitative variables. It contains the information that is represented numerically and needs to be analyzed.

Data Cleaning Data Cleaning is the process of identifying, correcting, or removing inaccurate or corrupt data records.

Data Curation Data Curation includes the processes related to the organization and management of data which is collected from various sources.

Data-Driven Science Data-Driven Science, or Data Science, is an interdisciplinary field that employs computing algorithms to extract knowledge or insights from data acquired from different sources.

Data Extraction Data Extraction is the act or process of retrieving data out of data resources for further data processing or data storage.

Data Integration Data Integration involves combining data residing in different resources and then supplying it to the users in a unified view. Data integration is in high demand in both commercial and scientific domains, which need to merge the data and research results from different repositories.

Data Lake A type of data repository that stores data in its natural format and relies on various schemata and structure to index the data.

Data Mining Data Mining is the process of data analysis and information extraction from large datasets using machine learning, statistical approaches, and many others.

Deductive Reasoning Deductive Reasoning, also known as logical deduction, is a reasoning method that relies on premises to reach a logical conclusion. It works in a top-down manner, in which the final conclusion is obtained by reducing the general rules that hold over the entire domain until only the conclusion is left.

Data Refinement Data refinement is used to convert an abstract data model (in terms of sets, for example) into implementable data structures such as arrays.

Decision Tree A decision tree uses a tree-like graph or model as a structure to perform decision analysis. It uses each node to represent a test on an attribute, each branch to represent the outcome of the test, and each leaf node to represent a class label.

Data Warehouse A data warehouse is typically an offline copy of production databases and copies of files in a non-production environment.

Deep Blue Deep Blue was a chess supercomputer developed by IBM. It was the first computer chess player that beat the world champion Garry Kasparov, after a six-game match in 1997.

Deep Learning (DL) Deep Learning is a subfield of machine learning concerned with algorithms that are inspired by the human brain, which works in a hierarchical way. Deep Learning models, which are mostly based on (artificial) neural networks, have been applied to different fields, such as computer vision and natural language processing.

DeepMind DeepMind is an artificial intelligence company founded in 2010 and acquired by Google in 2014. DeepMind developed the AlphaGo program that beat a human professional Go player for the first time.

Deep neural network A neural network architecture with many layers, typically 5–100. A network with only a few layers is called a shallow network.

Dice coefficient A measure to compare the similarity of two segmentations, e.g., by expert
and by machine. It is the ratio of twice the number of common pixels to the sum of all pixels in both sets.

Directed Acyclic Graph (DAG) In computer science and mathematics, a directed acyclic graph is a finite directed graph with no directed cycles. It consists of finitely many vertices and edges, with each edge directed from one vertex to another, such that there is no way to start at any vertex and follow a consistently directed sequence of edges that eventually loops back to that starting vertex again.

E

Electronic Medical Record (EMR) An electronic medical record, or electronic health record, is the systematized collection of electronically stored patient and population health information in a digital format. These records can be shared across different healthcare settings. Records are shared through network-connected, enterprise-wide information systems or other information networks and exchanges.

ELIZA The ELIZA effect is a term used to discuss progressive artificial intelligence. It is the idea that people may falsely attach meaning to symbols or words and ascribe it to artificial intelligence in technologies.

Enterprise Imaging Enterprise Imaging has been defined as “a set of strategies, initiatives and workflows implemented across a healthcare enterprise to consistently and optimally capture, index, manage, store, distribute, view, exchange, and analyze all clinical imaging and multimedia content to enhance the electronic health record” by members of the HIMSS-SIIM Enterprise Imaging Workgroup.

Error backpropagation The process of adjusting the weights in a neural network by minimizing the error at the output. It involves a large number of iteration cycles with the training data.

Ethics of Artificial Intelligence The ethics of artificial intelligence is the part of the ethics of technology specific to robots and other artificially intelligent beings. It is divided into robot ethics and machine ethics. The former concerns the moral behavior of humans as they design, construct, use, and treat artificially intelligent beings, and the latter concerns the moral behavior of artificial moral agents (see also Inadvertent effects of AI).

Expert System An expert system is a computer system that simulates the ability or behavior of a human expert in performing a task. An expert system incorporates a knowledge base that represents facts and rules, and an inference engine that uses the knowledge base to deduce new conclusions.

Explainable artificial intelligence (XAI) Explainable artificial intelligence is a key term in AI design and in the tech community as a whole. It refers to efforts to make sure that artificial intelligence programs are transparent in their purposes and how they work. Explainable AI is a common goal and objective for engineers and others trying to move forward with artificial intelligence progress.

F

Fast Healthcare Interoperability Resources (FHIR) Fast Healthcare Interoperability Resources is a draft standard describing data formats and elements (known as “resources”) and an application programming interface for exchanging electronic health records. The standard was created by the Health Level Seven International healthcare standards organization.

Forward Chaining Forward Chaining, also called forward reasoning, is a reasoning approach that searches inference rules from available data and then makes deductions and decisions based on the rules. Forward Chaining works in the opposite direction to backward chaining.

Feedforward Neural Network A feedforward neural network is an artificial neural network in which the connections between units do not form a cycle. The feedforward neural network has an input layer, hidden layers, and an output layer. Information always travels in one direction, from the input layer to the
output layer, and never goes backward. See also backpropagation.

Fully Convolutional Network (FCN) Fully Convolutional Network is the first convolutional neural network for semantic segmentation. It is trained end-to-end, pixel-to-pixel, from arbitrary-sized inputs. Both learning and inference are performed on the whole image at a time by dense feedforward computation and backpropagation.

G

Generative Adversarial Network (GAN) A class of artificial intelligence algorithms used in unsupervised machine learning, where two neural networks (a generative network and a discriminative one) are pitted against one another: one network generates candidates, and the other evaluates them in a zero-sum game framework.

Genetic Algorithm Genetic algorithms are heuristic search and optimization algorithms inspired by the theory of natural selection. A genetic algorithm requires a genetic representation of the solution and a fitness function to evaluate the solution.

Genomic data Genomic data refer to the genome and DNA data of an organism. They are used in bioinformatics for collecting, storing, and processing the genomes of living things. Genomic data generally require a large amount of storage and purpose-built software to analyze.

Gradient boost machine A type of machine learning technique that uses an ensemble of weak prediction models to perform regression and classification tasks.

Gradient descent A fast optimization method to find a minimum (e.g., of an error). The gradient is computed at the local position, and only a single step is taken in the downward direction. Repeating this process follows the steepest path toward a (local) minimum.

Graphical Processing Unit (GPU) A graphical processing unit is a single chip processor designed for efficient manipulation of computer graphics and image processing, especially for computations that can be processed in parallel. GPUs are widely used in embedded systems, mobile phones, personal computers, workstations, and many others. The rapid development of GPUs contributes to the rise of deep learning systems. The first GPU was developed by NVIDIA in 1999 and called the GeForce 256.

H

Heuristics A heuristic is a technique to provide fast or approximate solutions when the traditional methods are too slow or fail to give an accurate solution. A heuristic is commonly called a rule of thumb. While faster, it is typically less optimal than the classic methods it replaces.

Heuristic search techniques Techniques that narrow down the search for optimal solutions to a problem by eliminating options that are incorrect.

Human Level Machine Intelligence See: Artificial General Intelligence.

I

ImageNet ImageNet is a large image database with more than 14 million images in over 20,000 categories. Since 2010, the ImageNet project has run an annual contest called the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) for object and scene recognition. The winning algorithm of the 2012 contest is considered the beginning of the deep learning revolution. See AlexNet.

Inadvertent effects of AI If training sets are poisoned with faulty data, then the algorithm will render faulty outcomes. A system is only as good as the data it learns from, and databases must increase in order to let AI grow. See the literature for examples of racist or sexist algorithms. See also Ethics of Artificial Intelligence.

Inductive reasoning Inductive reasoning is a reasoning method which uses premises to supply evidence in order to support the conclusion. As opposed to deductive reasoning, inductive reasoning works as a bottom-up logic
which provides the conclusion by generalizing or extrapolating from special cases to general rules.

Interactive Machine Learning Interactive Machine Learning approaches are based on a coupling of human input and machines during the learning process.

Internet of Things (IoT) The Internet of Things (IoT) is the network of electronic devices embedded with software and sensors that enable the interaction between machines. The connectivity between devices helps the collection of large amounts of data, which can be analyzed by computer-based systems.

Internet of Medical Things (IoMT) Internet of Medical Things (IoMT) specifies the network of devices that are used to monitor health status during daily life.

Interoperability Interoperability is the property that allows for the unrestricted sharing of resources between different systems. This can refer to the ability to share data between different components or machines, or it can be defined as the exchange of information and resources between different computers through local area networks (LANs) or wide area networks (WANs). Broadly speaking, interoperability is the ability of two or more components or systems to exchange information and to use the information that has been exchanged.

Isaac Asimov Isaac Asimov (1920–1992) was a science fiction author who formulated the Three Laws of Robotics, which continue to influence researchers in robotics and artificial intelligence (AI).

K

Kaggle Kaggle is a data science platform to host data analysis competitions launched by companies and users.

Knowledge-Based Systems A knowledge-based system is a computer system that uses knowledge to solve a problem or support a decision. A knowledge-based system has three types of subsystems: a knowledge base, a user interface, and an inference engine.

L

Label Also known as annotation. In supervised learning, the answer or result portion of an example. Each example in a labeled dataset consists of one or more features and a label. For instance, in a housing dataset, the features might include the number of bedrooms, the number of bathrooms, and the age of the house, while the label might be the house’s price. In a spam detection dataset, the features might include the subject line, the sender, and the email message itself, while the label would probably be either spam or not spam.

Layer A layer, as in convolutional layer, is a set of neurons in a neural network that process a set of input features, or the output of those neurons. Deep learning networks get their name because they have many layers; most systems now have 30–150 layers, compared with traditional ANNs that would fail if they had more than about three layers.

Learning Learning is the process of acquiring new or modifying existing knowledge, behaviors, skills, values, or preferences. The ability to learn is possessed by humans, animals, and some machines, and there is also evidence for some kind of learning in some plants. Some learning is immediate, induced by a single event, but much skill and knowledge accumulates from repeated experiences. See also deep learning, machine learning, unsupervised and reinforcement learning.

Learning algorithm A learning algorithm is an algorithm used in machine learning to help the technology to imitate the human learning process. Combined with technologies like neural networks, learning algorithms create involved, sophisticated learning programs.

Learning algorithm, examples Logistic regression, linear regression, decision trees, and random forests are all examples of learning algorithms. Algorithms like “nearest neighbor” also involve the ways that these algorithms are used to affect decision-making and learning in machine learning. In general, what all of these algorithms have in common is their ability to extrapolate from test or training data to make projections or build models in the real
world. Think of these algorithms as tools for “pulling data points together” from a raw data mass or a relatively unlabeled background. Where learning algorithms are useful in both supervised and unsupervised machine learning, they are used in different ways in each type of discipline. Supervised machine learning benefits from having already labeled and isolated data, so the learning algorithms that are used will be different in some ways.

Learning rate A scalar used to train a model via gradient descent. During each iteration, the gradient descent algorithm multiplies the learning rate by the gradient. The resulting product is called the gradient step. Learning rate is a key hyperparameter.

Linear regression Linear regression is a kind of statistical analysis that attempts to show a relationship between two variables. Linear regression looks at various data points and plots a trend line. Linear regression can create a predictive model on apparently random data, showing trends in data. See Learning algorithm, examples.

Logistic regression Logistic regression is a kind of statistical analysis that is used to predict the outcome of a dependent variable based on prior observations. For example, an algorithm could predict the winner of a presidential election based on past election results and economic data. Logistic regression algorithms are popular in machine learning. See Learning algorithm, examples.

M

Machine intelligence See Artificial Intelligence.

Machine Learning Machine Learning is a field in computer science that builds computational models that have the ability of “learning” from the data and then provide predictions. Depending on whether there is a supervisory signal, machine learning can be divided into three categories: supervised learning, unsupervised learning, and reinforcement learning.

Machine Vision Machine Vision is the technology used to provide image-based automatic analysis for applications in industry such as automatic inspection, process control, and robot guidance.

Markov Chain Any multivariate probability density whose independence diagram is a chain. In other words, the variables are ordered, and each variable “depends” only on its neighbors in the sense of being conditionally independent of the others. An equivalent definition is that you sample the variables left-to-right, conditional only on the last outcome.

Mask R-CNN Mask R-CNN is a general deep learning-based framework for object instance segmentation. It consists of two stages: the first stage runs a region proposal network that proposes candidate object bounding boxes, while the second stage provides a class prediction for the instance in each bounding box as well as a binary mask for instance segmentation.

Medical Imaging Informatics (MII) MII is the development, application, and assessment of information technology (IT) for clinical medical imaging. It includes the interfaces of IT and people. In practical terms, MII already occurs at a basic level throughout radiology practice, from the moment a clinician considers ordering an imaging study until images and interpretation are used to plan the patient’s treatment.

Monte Carlo Methods Monte Carlo Methods, or Monte Carlo Simulation, are computational algorithms that rely on random sampling to obtain numerical results based on probability distributions. One example of using a Monte Carlo Method is to approximate the value of π. It is done by uniformly scattering random points inside a square and then computing the ratio between the number of points falling in the inscribed circle and the total number of points within the square, which is approximately equal to π/4.

Moore’s Law Named after Gordon Moore, the cofounder of Intel, who predicted in 1965 that the number of transistors that can be placed on an integrated circuit doubles every 2 years. This trend has been continuing since 1965 with no signs of any slowdown yet. It can be
applied in general to a range of technology areas that are growing at an accelerating rate.

Multilayer neural network A multilayer neural network contains more than one layer of artificial neurons or nodes. They differ widely in design. It is important to note that while single-layer neural networks were useful early in the evolution of AI, the vast majority of networks used today have a multilayer model. Multilayer neural networks can be set up in numerous ways. Typically, they have at least one input layer, which sends weighted inputs to a series of hidden layers, and an output layer at the end. These more sophisticated setups are also associated with nonlinear builds using sigmoids and other functions to direct the firing or activation of artificial neurons. While some of these systems may be built physically, with physical materials, most are created with software functions that model neural activity. Convolutional neural networks (CNNs), used for image processing and computer vision, as well as recurrent neural networks, deep networks, and deep belief systems are all examples of multilayer neural networks. CNNs, for example, can have dozens of layers that work sequentially on an image. All of this is central to understanding how modern neural networks function.

Multilayer Perceptrons (MLP) A multilayer perceptron is a class of feedforward artificial neural network. An MLP consists of at least three layers of nodes. Except for the input nodes, each node is a neuron that uses a nonlinear activation function. MLP utilizes a supervised backpropagation technique for training. Its multiple layers and nonlinear activation distinguish MLP from a linear perceptron. It can distinguish data that is not linearly separable.

Multi-task Learning Multi-task learning (MTL) is a subfield of machine learning in which multiple learning tasks are solved at the same time while exploiting commonalities and differences across tasks. This can result in improved learning efficiency and prediction accuracy for the task-specific models, when compared to training the models separately.

N

Narrow artificial intelligence (narrow AI) Narrow artificial intelligence (narrow AI) is a specific type of artificial intelligence in which a technology outperforms humans in some very narrowly defined task. Unlike general artificial intelligence, narrow artificial intelligence focuses on a single subset of cognitive abilities and advances in that spectrum.

Natural Language Processing Natural language processing (NLP) is a method to translate between computer and human languages. Traditionally, feeding statistics and models have been the method of choice for interpreting phrases. Recent advances in this area include voice recognition software, human language translation, information retrieval, and artificial intelligence. There is difficulty in developing human language translation software because language is constantly changing. Natural language processing is also being developed to create human-readable text and to translate between one human language and another. Already existing reports associated with radiology images can be used to learn about disease and conditions. The ultimate goal of NLP is to build software that will analyze, understand, and generate human languages naturally, enabling communication with a computer as if it were a human.

Neural networks Also known as artificial neural network, neural net, deep neural net; a computer system inspired by living brains. Neural networks found to perform best in ImageNet data challenges were convolutional neural networks (CNNs). This name comes from the mathematical concept of convolution, which is similar to the CNN convolutional operation wherein filters are applied to an image in fixed spatial regions and are swept across, or integrated, over the entire image. The resulting activations can then be aggregated in pooling operations,

subjected to repeated convolutions, and eventually mapped to a vector of probabilities corresponding to likelihoods that the image belongs to a certain class.

O

Omics The word omics indicates the study of a body of information and refers to the fields of biology ending in -omics, such as genomics, proteomics, microbiomics, and exposomics. Many of the emerging fields of large-scale data-rich biology are designated by adding the suffix -omics onto previously used terms.

OpenAI OpenAI is a nonprofit artificial intelligence research company (founded in December 2015 by partners including Elon Musk) that aims to promote and develop friendly AI in such a way as to benefit humanity as a whole. The organization aims to “freely collaborate” with other institutions and researchers by making its patents and research open to the public.

Overfitting In statistics and machine learning, overfitting occurs when a model tries to predict a trend in data that is too noisy. Overfitting is the result of an overly complex model with too many parameters. A model that is overfitted is inaccurate because the trend does not reflect the reality of the data. An overfitted model is a model with a trend line that reflects the errors in the data that it is trained with, instead of accurately predicting unseen data. This is better seen visually with a graph of data points and a trend line. An overfitted model shows a curve with higher and lower points, while a properly fitted model shows a smooth curve or a linear regression.

Overfitting, compensation of Overfitting typically results from a model that is too complex for the amount of training data available. There are a number of techniques that machine learning researchers can use to mitigate overfitting, including cross-validation, regularization, early stopping, pruning, Bayesian priors, dropout, and model comparison.

P

Pattern matching Pattern recognition and pattern matching are sometimes confused as the same thing when, in fact, they are not. Whereas pattern recognition looks for a similar or most likely pattern in given data, pattern matching looks for exactly the same pattern. Pattern matching is not considered part of machine learning, although in some cases it leads to similar results as pattern recognition. Pattern recognition has its origins in engineering, whereas machine learning grew out of computer science. Both can be viewed as two facets of the same field.

Pattern recognition In IT, pattern recognition is a branch of machine learning that emphasizes the recognition of data patterns or data regularities in a given scenario. It is a subdivision of machine learning and it should not be confused with actual machine learning study. Pattern recognition can be either supervised, where previously known patterns can be found in given data, or unsupervised, where entirely new patterns are discovered. The objective behind pattern recognition algorithms is to provide a reasonable answer for all possible data and to classify input data into objects or classes based on certain features. A most likely matching is performed between various data samples and their key features are matched and recognized.

Perceptron The perceptron computes a single output from multiple real-valued inputs by forming a linear combination according to its input weights and then possibly putting the output through some nonlinear activation function. A multilayered perceptron is a network of simple neurons called perceptrons. The basic concept of a single perceptron was introduced by Rosenblatt in 1958.

Perceptron algorithm Perceptron algorithm is a machine learning algorithm that helps provide classified outcomes for computing. Perceptron algorithm is called supervised classification because the computer is aided by the human classification of data points. Perceptron is also related to the development of “artificial neural networks,”

where computing structures are based on the design of the human brain.

Planning A branch of AI dealing with planned sequences or strategies to be performed by an AI-powered machine. Things such as actions to take, variables to account for, and duration of performance are considered.

Principal component analysis (PCA) Constructing new features which are the principal components of a dataset. The principal components are random variables of maximal variance constructed from linear combinations of the input features. Equivalently, they are the projections onto the principal component axes, which are lines that minimize the average squared distance to each point in the dataset. To ensure uniqueness, all of the principal component axes must be orthogonal. PCA is a maximum-likelihood technique for linear regression in the presence of Gaussian noise on both inputs and outputs. In some cases, PCA corresponds to a Fourier transform, such as the DCT used in JPEG image compression.

Pruning The use of a search algorithm to cut off undesirable solutions to a problem in an AI system. It reduces the number of decisions that can be made by the AI system.

Python Programming language that runs on most platforms and is often used for data science, machine learning, and deep learning.

R

Radiomics The -omics of images is an expansion of CADx. Radiomics refers to the extraction and analysis of large amounts of advanced quantitative image features with the intent of creating mineable databases from radiological images. From these databases, prognostic associations can be made between images and outcomes.

Radiogenomics This term is used in two contexts. It refers either to the study of genetic variation associated with response to radiation or to the correlation between cancer imaging features and gene expression. It is the combination of radiomics and genomics (the gene profile of, for example, a tumor). Combining both radiomics and radiogenomics may enable AI to predict which kind of gene profile defect is present based on the features seen on scans.

Random Forests (or Random Decision Forests) Random Forests or Random Decision Forests are ensemble learning methods for data classification and regression. They construct a multitude of decision trees during training and output the class that is the mode of the classes (classification) or the mean prediction (regression) of the individual trees.

Receptive field (RF) The sensitivity pattern of a neuron. For example, the receptive field of a simple cell in the primary visual cortex V1 is determined by measuring its firing rate as a function of the pointwise scanning light stimulation of its receptive field area on the retina. A receptive field is the biological implementation of a filter.

Recurrent neural network (RNN) A type of neural network that makes sense of sequential information, recognizes patterns, and creates outputs based on those calculations. It remembers a previous state in its memory and feeds this back as one of the inputs. It is characterized by a recurrent loop in the architecture. This type of neural network is used for sequential data, e.g., text and video.

Regression Regression is the process of predicting a value that falls on a continuous spectrum from input values, as opposed to a yes or no label. It is a subcategory of supervised learning.

Reinforcement Learning Reinforcement learning is a type of machine learning, closely related to dynamic programming, that trains algorithms using a system of reward and punishment. The algorithm is exposed to a totally random and new dataset and it automatically finds patterns and relationships inside of that dataset. The system is rewarded when it finds a desired relationship inside of that dataset but it is also punished when it finds an undesired relation. The algorithm learns from rewards and punishments and updates itself continuously. This type of algorithm is

always in production mode. It requires real-time data to be able to update and present actions. The agent learns without intervention from a human by maximizing its reward and minimizing its penalty.

Residual neural network (ResNet) This network uses skip connections, shortcuts that jump over network layers. A ResNet block typically skips over two or three layers.

S

Scikit-learn Scikit-learn (formerly scikits-learn) is a free software machine learning library for the Python programming language. It features various classification, regression, and clustering algorithms, including support vector machines, random forests, gradient boosting, k-means, and DBSCAN, and is designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy.

Singularity Singularity refers to the emergence of superintelligent machines with capabilities that cannot be predicted by humans. The word singularity comes from astrophysics where it is used to refer to a point in spacetime where the rules of ordinary physics do not apply. This idea is parallel to the way the term is used in a technological context, because if a technological singularity were to occur, humans would become unable to predict events beyond that point. See Superintelligence.

Strong AI An area of AI development that is working toward the goal of making AI systems that are as useful and skilled as the human mind.

Stride The step size in the shift of convolution filters. It is normally set to 1, but can be 2–10 or even higher, to increase the computational efficiency.

Supervised Learning Training a model from input data and its corresponding labels. Supervised machine learning is analogous to a student learning a subject by studying a set of questions and their corresponding answers. After mastering the mapping between questions and answers, the student can then provide answers to new questions on the same topic. See also unsupervised machine learning.

Support Vector Machine (SVM) Support Vector Machine, or in short SVM, is a supervised machine learning model for data classification and regression. It is one of the most used classifiers in machine learning. It maximizes the width of the gap (the margin) between the points of separate categories in feature space.

Superintelligence A superintelligence is an intelligence system that rapidly increases its intelligence in a short time, specifically, to surpass the cognitive capability of the average human being. Part of the idea of superintelligence is that certain kinds of artificial intelligence work are theoretically capable of triggering a “runaway reaction” where an artificial intelligence far exceeds human capacity for thought and starts to manipulate or control humans in specific ways. Superintelligence is tied to the idea of a “singularity,” which is based on the idea that a catalyst or trigger would cause rapid change beyond what humans can anticipate. See Singularity.

T

TensorFlow TensorFlow is an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) communicated between them. The flexible architecture allows you to deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device with a single API. TensorFlow was originally developed by researchers and engineers working on the Google Brain Team within Google’s Machine Intelligence research organization for the purposes of conducting machine learning and deep neural networks research, but the system is general enough to be applicable in a wide variety of other domains as well.

Tensors Multidimensional arrays of primitive data values that are used in TensorFlow. A

tensor consists of a set of primitive values shaped into an array of any number of dimensions. These massive numbers of large arrays are the reason that GPUs and other processors designed to do floating point mathematics excel at speeding up these algorithms.

Tensor Processing Unit (TPU) A processing unit similar to a Graphics Processing Unit (GPU), built for tensor processing.

Turing test A test developed by Alan Turing in the 1950s that tests the ability of a machine to mimic human behavior (see Turing’s paper “Computing Machinery and Intelligence”). The test involves a human evaluator who undertakes natural language conversations with another human and a machine and rates the conversations. It is designed to determine whether or not a computer could be classed as intelligent. The test (also referred to as the imitation game) is conducted by having human judges chat to several people via a computer. Most of the people the judges will be speaking to are humans, but one will actually be a chatbot. The chatbot’s objective will be to convince the human judges that they are speaking to a real person. If it does this, it has passed the Turing test.

U

Uncanny valley The uncanny valley is a phenomenon that occurs in the human psyche and perception with regard to objects that are human-like, usually robots and images, and determines our reaction toward that object. It is still just a hypothesis, and it is stated to the effect of “as an object such as a robot gets more human-like, the response of some observers will become increasingly positive and empathic, until a point is reached in the robot’s human-likeness beyond which the reactions turn to strong revulsion.”

U-net A network with a U-shape, where connections exist between the horizontally corresponding layers of the contracting input branch and the expanding output branch. It was designed to work with fewer training images and to yield more precise segmentations.

Unsupervised learning Unsupervised learning is a type of machine learning algorithm used to draw inferences from sets of data consisting of input data without labeled responses. This means that the system is exposed to a totally random and new dataset and it automatically finds patterns and relationships inside of that dataset. Unsupervised learning is used in email clustering in order to distinguish between spam emails and useful emails. It can also be seen as Learning by Example. Another example of unsupervised machine learning is principal component analysis (PCA). For example, applying PCA on a dataset containing the contents of millions of shopping carts might reveal that shopping carts containing lemons frequently also contain antacids.

Underfitting Underfitting occurs when a statistical model cannot adequately capture the underlying structure of the data.

V

Variational Autoencoder Variational autoencoder (VAE) models inherit the autoencoder architecture, but make strong assumptions concerning the distribution of latent variables. They use a variational approach for latent representation learning, which results in an additional loss component and a specific training algorithm called Stochastic Gradient Variational Bayes. It assumes that the data is generated by a directed graphical model p_θ(x|z) and that the encoder is learning an approximation q_φ(z|x) to the posterior distribution p_θ(z|x), where φ and θ denote the parameters of the encoder (recognition model) and decoder (generative model), respectively.

W

Watson Watson is named after Thomas J. Watson, IBM’s founder and first CEO. It is a question-answering supercomputer that uses AI to perform cognitive computing and data analysis. In the year 2011, Watson competed on the Jeopardy!

show against human contestants and won the first place prize. Since then, Watson has been used for utilization management in medical centers.

Weak AI See Narrow artificial intelligence.

Weights The connection strength (coefficients) between units or nodes in a neural network. These weights can be adjusted in a process called learning. The goal of training a linear model is to determine the ideal weight for each feature. If a weight is 0, then its corresponding feature does not contribute to the model.

Winters See Artificial winters.
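As a rough illustration of the Weights and Learning rate entries, the sketch below fits a two-feature linear model by gradient descent in plain Python. The toy data, the function names (`predict`, `train_linear_model`), and the hyperparameter values are assumptions chosen for illustration and do not come from the glossary's sources. The second feature is deliberately unrelated to the target, so its learned weight should settle near 0, which is the sense in which such a feature does not contribute to the model.

```python
def predict(weights, features):
    """Linear model: weighted sum of the input features."""
    return sum(w * x for w, x in zip(weights, features))


def train_linear_model(rows, targets, learning_rate=0.05, epochs=2000):
    """Fit one weight per feature by batch gradient descent on mean squared error."""
    n_features = len(rows[0])
    n_samples = len(rows)
    weights = [0.0] * n_features
    for _ in range(epochs):
        gradient = [0.0] * n_features
        for features, target in zip(rows, targets):
            error = predict(weights, features) - target
            for j, x in enumerate(features):
                # Derivative of the mean squared error with respect to weight j.
                gradient[j] += 2.0 * error * x / n_samples
        # Gradient step: learning rate multiplied by the gradient.
        weights = [w - learning_rate * g for w, g in zip(weights, gradient)]
    return weights


# Hypothetical toy data: the target depends only on the first feature.
rows = [(1.0, 2.0), (2.0, -1.0), (3.0, 0.5), (4.0, -2.0), (5.0, 1.0)]
targets = [3.0 * x1 for x1, _ in rows]

weights = train_linear_model(rows, targets)
print("learned weights:", weights)
```

A learning rate that is too large for the data makes these updates diverge instead of converging, which is one reason the learning rate is treated as a key hyperparameter.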



resources framework early diagnosis, 187–188 computing resources, 57 using mammography, 188 data resources, 58 patient harms, 188 expertise, 57 population-based screening programmes, 187 software resources, 57–58 randomized clinical trials, 187 value-driven integration of, 50 screening, 187 value, structured reporting for Business analytics (BA), 136 analytics and clinical decision support, 78–80 Business ecosystems, 291–293 conventional radiological reporting vs. structured Business intelligence (BI), 136 reporting, 74–75 Business models, AI, 318–320 external data, integration of, 78 NLP, 74, 76–77 radiology, 73 C structured reporting and IHE MRRT, technical Cancer Imaging Archive (CIA), 253 implementations of, 75–76 Cardiac CT angiography, 337 Artificial intelligence ecosystem, 293–295, 297 Cardiac fluid flow reserve computed tomography, 316, Artificial legal persons and objects, 265 317 Artificial neural network (ANN), 26, 27, 49, 56, 132 Cardiovascular diseases Artificial products, 309–310 AI, practical use of arXiv.org, 32 clinical practice, algorithms deployment, 179–180 Automated computer-aided tool, 146 composite data, outcome prediction, 178–179 Automated hanging protocols, 51 computed tomography, 172–175 Automated organ segmentation, 240 echocardiography, 170–172 Automated systems magnetic resonance imaging, 175–177 decision support, recall rates, 194–195 nuclear imaging, 177–178 density stratification and risk prediction, 197–199 cardiovascular imaging, AI impact lesion localization, 195–197 combining imaging with data sources, 170 Automated transform by manifold approximation decision support, 168 (AUTOMAP), 219 image acquisition, 169 Automatic liver segmentation, 240, 241 image quality, image reconstruction and Autonomous artificial intelligent entities, 270–273 improvement of, 169 Auto-regressive models, 206 interpretation and diagnosis, 169–170 opportunistic screening and prognosis, 170 post-processing and image analysis, 169 B computer vision and 
artificial intelligence, 167–168 Backpropagation, 40 practical use, 168 Batch normalization, 32 Case-by-case basis, 113 Bayesian network, 42 Cheshire Cat, 276, 280, 288 Bayesian network interface, 133 Chest pathology, AI and computer-assisted evaluation Bayesian support vector machine learning method, 159 chest radiography, 145–146 Big data, 9, 30 computer-assisted diagnosis for, 145 Binary large objects (BLOB’s), 66 interstitial lung disease, 159–162 Biobanking and BioMolecular resources Research lung cancer radiomics, 153–155 Infrastructure (BBMRI-ERIC), 123 lung nodules, 146 Biological robot, 257–258 chest radiography, 147–148 Biotechnology, 260, 266 computed tomography, 148–152 BI-RADS density scale, 197–199 parenchymal lung and airways diseases, 157–159 Blockchain technology, 339–340 pulmonary embolism, 155–157 Bonferroni correction, 98 Chest X-rays, 39, 40 Brain–computer interfaces, 257, 260 Class, deep learning and machine learning, 40 Brain extraction based on nonlocal segmentation Class imbalance, 202 technique (BEaST), 217 Clinical decision support (CDS), 130 Brain imaging preprocessing, 217–218 Clinical integration, 317 Brain research, deep learning Clinical studies, medical artificial intelligence, 85–86 brain efficiency, 32 analysis, 95–99 foveated vision, 34–35 clinical question, 87 visual learning, 32–34 Cohort studies, 86, 90–93 Brain Surface Extractor (BSE), 217 ground truth, 87–89 Breast cancer screening metrics, 93–95 early detection of, 187 retrospective cohort studies, 86 Index 367

study design, 86 Computer Assisted Reporting Data Science (CARDS), target population, 89–90 305, 311 Clinical trials, medical imaging in, 231 Computing power, AI, 298 AI techniques, 236, 238–240 Confidence intervals, 97 and biomarker quantification, 232 Congenital heart disease, 177 classification algorithms, 239–240 Consciousness, behavior and self-reporting, 280 image acquisition protocol design, 233 Constrictive pericarditis (CMP), 171 image acquisitions, 232 Content-based image retrieval (CBIR), AI applications, imaging biomarkers in clinical research, 238 136–137 magnetic resonance/computed tomography scanners, Content-based visual information retrieval (CBVIR), 136 233 Contextual region, 27 medical image standardization pipeline, 232 Conventional radiological reporting, 74–75 quality assurance, 235–236 ConvNetJS, 32 segmentation algorithms, 239–240 Convolution, 26 site validation Convolutional autoencoder (CAE), 174 cross-calibration, 234, 235, 237 Convolutional layers, 43 DICOM capabilities, 234 Convolutional neural networks (CNN), 25, 28–30, 36, dummy run exploration, 235 44, 49, 50, 56, 57, 114, 134, 169, 240, 330 MR machine, 234 COPDGene, 158 site survey, 233–234 Coronary artery calcium scoring (CACS), 172, 174 standardization of, 232 Coronary CT angiography (CCTA), 172 CNN, see Convolutional neural networks Criminal Law for Robots, 274–275 Coefficients of variation (CoV), 122 , 312 Common data elements (CDEs), 76 CT-fractional flow reserve (CT-FFR), 175 Compressed sensing MRI (CS-MRI), 131 CT post processing technique (CALIPER), 160 Computational fluid dynamics (CFD) method, 174 CT pulmonary angiography (CTPA), 78–79, 155 Computed tomography (CT), 132 cardiovascular diseases, 172–175 lung nodules, 148–152 Computer aided detection (CAD) systems, 187 D CADETII, 190 Data access and quality, 200–201 digital mammography, 192 techniques, 239 with digital mammography, 191 Data collection, 248 feature extraction, 189 Data curation full field digital mammography 
reading, 191, 192 and aggregation, 322 ImageChecker M1000, 189, 190 process of, 248 image sensitivity, 191 Data discovery and retrieval, 249–251, 253 microcalcifications, 193 Data-driven texture analysis (DTA) approach, 160 performance of, 189–192 Data handling and curation tools, 253, 254 reported and minimum human-level performance, 193 Data issues, during training screen film mammography, 191 class imbalance, 202 screening programs, 189 dataset bias, 202 support-vector machines, 189 dataset imbalance, 201–202 Computer-aided diagnosis software (CADx), 302, 303 generalizability, 203 Computer-aided diagnostic (CAD) device, 25, 26, 101, minority class, data insufficiency, 202 145 over-fitting, 202, 203 Computer-assisted diagnosis of pulmonary embolism under-fitting, 202, 203 (PE-CAD), 155, 156 Data labeling, 203, 204 Computer-assisted evaluation, chest pathology Data quality, 248, 251–252 chest radiography, 145–148 Data security and privacy, 341–342 computed tomography, lung nodules, 148–152 Dataset bias, 202 computer-assisted diagnosis for, 145 Dataset imbalance, 201–202 interstitial lung disease, 159–162 Deep convolutional neural networks (CNN), 240 lung cancer radiomics, 153–155 Deep learning (DL), 13, 14, 16, 20, 42–43, 49, 114, 171, lung nodules, 146 247, 248 parenchymal lung and airways diseases, 157–159 activation layers, 43 pulmonary embolism, 155–157 artificial intelligence, 26 368 Index

  based neural networks, 4
  big data, 30
  brain research
    brain efficiency, 32
    foveated vision, 34–35
    visual learning, 32–34
  in breast cancer (see Breast cancer screening)
  CAD, 25, 26
  CNNs, 44
  convolutional layers, 43
  convolutional neural networks, 28–30
  data access and quality, 200–201
  data issues during training
    class imbalance, 202
    dataset bias, 202
    dataset imbalance, 201–202
    generalizability, 203
    minority class, data insufficiency, 202
    over-fitting, 202, 203
    under-fitting, 202, 203
  data labeling, 203, 204
  diabetic retinopathy, screening for, 30–31
  FCNs, 44
  fully connected layers, 43
  GANs, 45
  GPUs, 30
  interpretability, 204–206
  for mammography, 193–194
  memory constraints and image size, 199–200
  neural networks, 26–28
  output layer, 44
  pooling layers, 43
  principled uncertainties, 204
  residual layer, 44
  smart network architectures, 30
  use of, 129
  on web, 31–32
Deep neural network, 27, 28
Default display protocols (DDP), 133
Defense Advanced Research Projects Agency (DARPA), 273
Diabetic retinopathy, screening for, 30–31
Diagnostic biomarker, 123
Dice score, 240
DICOM, see Digital Imaging and Communications in Medicine
DICOM presentation state, 252
DICOM SR, 78
Diffusion-perfusion mismatch, 221–222
Digital breast tomosynthesis (DBT), 208–209
Digital data collection, 248
Digital Imaging Adoption Model (DIAM), 115
Digital Imaging and Communications in Medicine (DICOM), 107, 112–113, 133, 248, 306
Digital radiography, 147
Digital twins, 240, 242
Disease-oriented biobanks, 124
DL-based applications in PACS interface, 338
DNA research, 271
Domain adaptation, 207
3D Simultaneous Localization and Mapping (3D SLAM), 34
3D voxel-based method, 159
Dynamic time warping (DTW), 52

E
Echocardiography, cardiovascular diseases, 170–172
Economics of AI, 318–320
Electronic health record (EHR), 108–111, 304
Electronic Signatures Act (ESign), 285
Emphysema index, 157
Endotracheal (ET) tubes, 146
Energy minimization, 14–15
Enterprise content management (ECM), 112
Enterprise Image Repository (EIR), 112
Enterprise imaging
  in context of AI, 114–115
  enterprise imaging platform, 109–112
  legal aspects, 113–114
  maturity model, stage description for, 116
  PACS, 107–108
  principles of, 108
  standards and technology for, 112–113
Enterprise Viewer, 111
Epigenomics, 299
Ergonomic software, 52
Error backpropagation, 29
Ethics, 342–344
Euro-BioImaging project, 54
European General Data Protection Regulation (GDPR), 286–288
European Imaging Biomarkers Alliance (EIBALL), 119
European Medicines Agency (EMA), 121
European Society of Medical Imaging Informatics (EuSoMII), 341
Evidence imaging, 109
Expectation-maximization (EM) algorithm, 19

F
FAIR guiding principles, 253
False discoveries, 97
Faster-RCNN object detection network, 197
Fast Healthcare Interoperability Resources (FHIR), 109, 306
Fast segmentation of brain imaging, 220–221
Flexible model-to-image fitting, 21
Flexible shape fitting, 17–19
FMRIB Software Library (FSL) Brain Extraction Tool (BET), 217
Food and Drug Administration (FDA) programs, 302
4D flow imaging, 177
Framingham risk score (FRS), 175
Fully connected layers, 29, 43

Fully connected networks (FCNs), 44
Fully convolutional network (FCN), 132

G
GANs, see Generative adversarial networks
Gaussian mixture model, 19
Generative adversarial networks (GANs), 30, 45, 56, 206–207
Genomics, 209, 299
Gini Index, 42
Google, 90
Google Neural Machine Translation (GNMT), 134
Graphical processing units (GPUs), 30, 49

H
Hanging protocols, 133
Healthcare information, 113
Healthcare Information and Management Systems Society (HIMSS), 295
Health information exchange (HIE), 109, 111
Health information technology (HIT) industry, 305
Health Insurance Portability and Accountability Act of 1996 (HIPAA), 300
High-quality datasets for algorithm testing and training, 313–315
HIPAA Security Rule, 300
HL7, 109
HTML5, 75
Human-like behavior, 267
  artificial legal persons and objects, 265
  computational decisions, 264
  free will concept, 267
  law and legal opinions, 266
  rational intelligence, 267–268
  social intelligence, 267–268
Human–Robot integration, 273–275
Hyperbolic tangent function, 41
Hyperparameters, 41

I
IDx system, 85, 102
IHE MRRT, 75–76, 78
Image acquisition, 50–51, 169
ImageChecker CT, 151
Image construction, neuroimaging, 218–220
ImageNet, 43, 98, 239
ImageNet Large Scale Visual Recognition Challenge (ILSVRC), 129
Image recognition, 331
Image registration, 10–11, 19
Image segmentation, 10
Imaging biobanks, imaging biomarkers and
  in artificial intelligence, 124, 125
  BBMRI-ERIC, 123
  biomarkers, clinical scenario of oncology, 123
  European Society of Radiology, 123
  types of, 124
Imaging biomarkers
  and imaging biobanks
    in artificial intelligence, 124, 125
    BBMRI-ERIC, 123
    biomarkers, clinical scenario of oncology, 123
    European Society of Radiology, 123
    types of, 124
  stepwise development, 120–121
  validation, 121–123
Imaging 3.0 informatics tools, 303–305
Imaging protocoling, 131
Imaging value chain, 129
Informatics solutions, 296
In silico clinical trials, 240, 242
Integrated research application system (IRAS), 66
International Medical Device Regulators Forum (IMDRF), 301
Internet Brain Segmentation Repository (IBSR), 221
Internet of things, 30
Interoperability, integration, 303–307
Interpretability, 204–206
Interstitial lung disease, chest pathology, 159–162

K
Kaggle Data Science Bowl, 152

L
Late gadolinium enhancement (LGE), 177
Lateral geniculate nucleus (LGN), 29
Lateral inhibition, 32
Learning, 26, 27
Learning algorithm for brain extraction and labeling (LABEL), 217
Left ventricle (LV) myocardium, 174
Legal enforcement robots, 285–286
Legal orientation, 261
Legal personality of robots, 259
Legal personhood, 260
Legal position, robots, 260–261
LeNet-5, 49
LI-DAR, 34
Logical Observation Identifiers Names and Codes (LOINC), 305
Lowest limit of quantification (LLOQ), 122, 123
Lung cancer radiomics, chest pathology, 153–155
Lung nodules, chest pathology, 146
  chest radiography, 147–148
  computed tomography, 148–152
Lung PBV software, 156
Lung-RADS classification, 307, 308
Lung texture analysis, 159

M
Machine learning (ML), 39, 50, 114, 160, 171
  Bayesian networks, 42
  big data, 61
  CT, cardiovascular diseases, 175
  data, 61–62

  decision trees, 42
  and deep learning systems, 248
  DICOM standards, 62–63
  FAIR Guiding Principles, 63
  features and classes, 39–40
  good quality data management, 63
  MIDaR scale, 63–64
    Level A, 69–70
    Level B, 67–68
    Level C, 66–67
    Level D, 64–66
  MRI, cardiovascular diseases, 175–177
  neural networks, 40–41
  nuclear imaging, cardiovascular diseases, 177–178
  strategies, 16
  SVM, 41
Magnetic resonance imaging (MRI), 130, 131, 175–177
Major adverse cardiovascular events (MACE), 178
Mammography Quality Standards Act (MQSA), 323
Markov models, 52
Mask-RCNN network, 197
Medical artificial intelligence
  assessing safety, 100–102
  clinical studies, assessing model efficacy using, 85–86
    analysis, 95–99
    clinical question, 87
    Cohort studies, 86, 90–93
    ground truth, 87–89
    metrics, 93–95
    retrospective cohort studies, 86
    study design, 86
    target population, 89–90
  efficacy/performance, 84
  safety, 84–85
  study design
    analysis, 100
    clinical question, 99
    Cohort, 99–100
    final design, 100
    ground truth, 99
    metrics, 100
    target population, 99
Medical Image and Statistical Interpretation Lab Multi-Atlas Labelling Challenge, 221
Medical image computing, 13–14
  computational strategies
    flexible shape fitting, 17–19
    Pixel classification, 19–20
  fundamental issues
    appearance, global vs. local representations, 21–22
    data congruency vs. model fidelity, 22
    deterministic vs. statistical models, 22
    geometry, explicit vs. implicit representation, 21
Medical imaging
  analysis, 9–10
    image registration, 10–11
    image segmentation, 10
    image visualization, 11
  in clinical trials, 231
    AI techniques, 236, 238–240
    and biomarker quantification, 232
    classification algorithms, 239–240
    cross-calibration, 234, 235, 237
    DICOM capabilities, 234
    dummy run exploration, 235
    image acquisition, 232
    image acquisition protocol design, 233
    imaging biomarkers in clinical research, 238
    magnetic resonance/computed tomography scanners, 233
    medical image standardization pipeline, 232
    MR machine, 234
    quality assurance, 235–236
    segmentation algorithms, 239–240
    site survey, 233–234
    standardization of, 232
  community, 295
  data, complexity of, 12
  in general and radiology, 3
  objects of interest, complexity of, 12
  validation, complexity of, 12–13
Medical imaging data readiness (MIDaR) scale, 63–64
  Level A, 69
    data labelling, 69
    powering, 69–70
  Level B
    data selection, 67
    data visualisation, 68
    quality control, 67, 68
  Level C
    access control, 67
    data extraction, 66–67
    ethical approval, 66
  Level D, 64–66
    contains patient identifiable information, 65
    quantity, unverified in, 65
    researchers, inaccessible to, 65–66
Medical information databases, 338–339
Medium.com, 32
Memory constraints, and image size, 199–200
Merit-based Incentive Payment System (MIPS), 137, 319
Merkle tree technology, for health systems, 316
Metabolomics, 299
Missing data, 87
Model-based image analysis, 13, 14
  Bayes’ rule, 14
  classification/regression, 15–16
  energy minimization, 14–15
Model fitting, 22
Monitoring biomarker, 123
Morphologic biomarker, 123
MR imaging-based attenuation correction (MRAC), 132

Multi-ethnic study of atherosclerosis (MESA), 158, 179
Multiple hypothesis testing, 97
Muscular/neural implants, 273

N
Naffine’s concept, 278
Napkin-ring sign, 173
Narrow AI, 330
National Evaluation System for Health Technology (NEST) program, 301
National Imaging Informatics Course (NIIC), 341
National Institute of Arthritis and Musculoskeletal and Skin Diseases (NIAMS), 253
Native DICOM devices, 112
Natural language processing (NLP), 52, 54, 74, 76–77, 133–135, 252
NEST Coordinating Center (NESTcc), 301
Neural network, 4, 13
  architectural diagram of, 40
  deep learning, 26–28
  machine learning, 40–41
Neurological diseases
  acquisition, 218–220
  brain imaging preprocessing, 217–218
  disease detection, 223–224
  fast segmentation, 220–221
  image construction, 218–220
  protocoling, 218–220
  stroke, 221–222
  tumor classification, 222–223
Neurons, 27
Neuroscience, 265
NIH chest X-ray dataset, 98
NLP, see Natural language processing
Nodule Enhanced Viewing, 151
Noise reduction, 172–173
Nonhuman Rights Project, 268
Non-interpretive AI algorithms, 320
Non-natural (artificial) legal persons, 269–270
No-show visits (NSVs), 130, 131
Nuclear imaging, cardiovascular diseases, 177–178
Null hypothesis significance testing, 96, 97
NVIDIA.com blog, 32

O
Online cross-enterprise image transfer, 111
Osteoarthritis Initiative (OAI), 253
Output layer, 44

P
PACS, see Picture archiving and communication system
Parametric response map (PRM), 158
Parenchymal lung, chest pathology, 157–159
Patient- and family-centered care (PFCC), 137
Patient data privacy, AI algorithms, 315–316
Patient scheduling, 130–131
PE-CAD, see Computer-assisted diagnosis of pulmonary embolism
Personally identifying information (PII), 65
Personhood in legal sense, 275–280
Picture archiving and communication system (PACS), 9, 51, 55, 107–108, 133, 248, 304
Pixel classification, 19–20
Point distribution models (PDM), 17
Pooling layers, 43
Population-based biobanks, 124
Population health management and research, 306
Population stratification, 90
Positive predictive value (PPV), 94
Positron-emission tomography (PET) imaging, 132, 177–178
Precision, 94
Predictive biomarker, 123
Principal component analysis (PCA), 32, 177
Procedural imaging, 109
Proteomics, 299
Protocoling, neuroimaging, 218–220
Pseudo-CT imaging, 132
Public-private partnerships, 300–303
Pulmonary CT angiography, 156
Pulmonary embolism, chest pathology, 155–157

Q
Quadruple Aim, 294, 303
Quality Payment Program (QPP), 319
Quantitative Imaging Biomarker Alliance (QIBA), 115, 119
Query-based case retrieval, 134
Query by image content (QBIC), 136
QUIBIM Precision® platform, 53

R
Radiological Society of North America (RSNA), 75–76, 119
Radiology
  AI, applications
    clinical problem definition, 54
    components, development, 58
    computing resources, 57
    data annotation, 55–56
    data resources, 58
    dataset collection, 54
    engineering, 54
    expertise, 57
    image acquisition, 50–51
    image interpretation, 51–53
    knowledge extraction, 52, 54
    management, 54
    reporting, 52
    software resources, 57–58
    testing, 56
  AI value, structured reporting for, 73
  innovation in

    AI, 330–331
    radiologists’ perspective, 331–333
  level of expectation, 333–337
Radiology information systems (RIS), 54
Radiomics, 51, 153–155, 299
RadLex, 75, 79, 110
Random forest, 29, 42
Randomised controlled trial (RCT), 102
Reasonable Human, 276–277
Receiver operating characteristic (ROC) curve, 94, 95
Receptive fields, 32
Recurrent neural networks (RNN), 30, 52
Regions of interest (ROI), 51, 55
Reinforced learning, 299
ReLU, 43
Reporting, AI in, 133–134
Report turnaround times (RTAT), 135–136
Representational state transfer (REST), 306
Research Ethics Service (RES), 66
Residual layer, deep learning, 44
Restrictive cardiomyopathy (RCM), 171
Rights-based conception, 277
Robotics
  development of, 258
  entities, 270–273
Robots
  classification, 281
  Criminal Law, 274–275
  legal acts, 283–285
  legal personality, 259, 261–264
  legal personhood, 262–264
  legal positioning, 260–261, 286
  legal subjectivity, 262–264
  liability and legal subjectivity, 281–283
  role and personality, 280
  technological evolution, 258
Robust Learning-based Brain Extraction System (ROBEX), 217
RSNA, see Radiological Society of North America

S
Sampling bias, 91
Segment involvement scores (SIS), 175
Segment stenosis score (SSS), 175
SegNet, 44
Self-learning algorithms development, 280
Self-learning computer systems, 247
Semi-supervised learning, 207
Sensitivity, 93, 94
Service-oriented architecture (SOA) technology, 338
7th Framework Programme (FP7), 66
Signal-to-noise ratio (SNR), 132
SIIM Workflow Initiative in Medicine (SWIM), 136
Single image super-resolution, 132
Single-photon emission computed tomography (SPECT), 177–178
Smart network architectures, 30
Smart tissue autonomous robot (STAR), 257
Social robots, 285–286
Society for Imaging Informatics in Medicine (SIIM), 136, 341
Software as a Medical Device (SaMD), 301
Software development community, 295
Specificity, 93, 94
Spectral detector CT, 157
Speech recognition (SR), 52, 135
Staffing optimization, AI applications, 136
Staging biomarker, 123
Stratification, 92
Stress perfusion imaging, 177
Structured reporting
  conventional radiological reporting vs., 74–75
  integration, external data into, 78
  technical implementations, 75–76
Subjective definitions, 87
Subpopulations and intermediate outcomes in COPD study (SPIROMICS), 158
Super-resolution imaging, 132
Supervised learning, 299, 341
Support vector machine (SVM), 29, 41, 130, 174
Surround suppression, 32
SVM, see Support vector machine
Swiss Knife for Radiologists, 338
Synapses, 27
Synthetic CT, 132
Synthetic data generation, 206–207

T
Tabar parenchymal patterns, 197, 198
Technological evolution, 258
Tensor, 28
Test set, 90
Text summarization technology, 134–135
Texture analysis (TA), 173
3D adaptive multiple feature method, 159
Tomosynthesis, 208–209
Total stress perfusion deficit (TPD), 178
TOUCH-AI development platform, 311
Trade-off, 94
Transfer learning, 239
Transthoracic echocardiogram (TTE) imaging, 171
Tutorial papers, 32

U
UK Breast Screening History, 188–189
U-Net, 44, 45
UNET architecture, 240
Uniform Computer Information Transaction Act (UCITA), 285
Uniform Electronic Transaction Act (UETA), 285
Unsupervised learning, 299

V
Variational auto-encoders (VAEs), 206
Vector-based classification/regression, 16
Vendor Neutral AI Platform (VNAP), 179

Vendor neutral archive (VNA), 107, 179
Veye Chest product, 152
Visual learning, 32–34
Voice recognition software (VRS) platforms, 305–306

W
Wells’ score, 78, 79