European Journal of Nuclear Medicine and Molecular Imaging (2019) 46:2722–2730 https://doi.org/10.1007/s00259-019-04382-9

ORIGINAL ARTICLE

Why imaging data alone is not enough: AI-based integration of imaging, omics, and clinical data

Andreas Holzinger1 · Benjamin Haibe-Kains2,3 · Igor Jurisica3,4,5

Received: 12 May 2019 / Accepted: 28 May 2019 /Published online: 15 June 2019 © Springer-Verlag GmbH Germany, part of Springer Nature 2019

Abstract Artificial intelligence (AI) is currently regaining enormous interest due to the success of machine learning (ML), and in particular deep learning (DL). Image analysis, and thus radiomics, strongly benefits from this research. However, effectively and efficiently integrating diverse clinical, imaging, and molecular profile data is necessary to understand complex diseases, and to achieve accurate diagnosis in order to provide the best possible treatment. In addition to the need for sufficient computing resources, suitable algorithms, models, and data infrastructure, three important aspects are often neglected: (1) the need for multiple independent, sufficiently large and, above all, high-quality data sets; (2) the need for domain knowledge and ontologies; and (3) the requirement for multiple networks that provide relevant relationships among biological entities. While one will always get results out of high-dimensional data, all three aspects are essential to provide robust training and validation of ML models, to provide explainable hypotheses and results, and to achieve the necessary trust in AI and confidence for clinical applications.

Keywords Precision medicine · Artificial intelligence · Machine learning · Decision support · Integrative computational biology · Network-based analysis · Radiomics

Data explosion and the vital need extremely sensitive to poor data quality [2]. In the medi- for high-quality data cal domain, data quality (i.e., completeness, correctness) is further extended by confidentiality requirements, and the First and foremost, the biggest problem in the AI world is need for comprehensive annotation. It is amazing how bad the quality and sufficient volume of data. The most suc- the standard data sets in the medical domain are (noisy, cessful machine learning methods are data hungry [1], and sparse, wrong, biased, etc.), which was already described by Komaroff in 1979, when he stated that “the taking of a medi- This article is part of the Topical Collection on Advanced Image cal history, the performance of the physical examination, the Analyses (Radiomics and Artificial Intelligence) interpretation of laboratory tests, even the definition of dis-  Igor Jurisica eases, are surprisingly inexact” [3]. Unfortunately, with the [email protected] introduction of sophisticated electronic patient record sys- tems this is often even worse today, as incorrect data escape routine checks and the errors are compounded across com- 1 Institute for Medical Informatics / Statistics, Medical University Graz, Auenbruggerplatz 2/V, putational workflows, frequently increasing clinical errors 8036 Graz, Austria [4]. Therefore, within a whole ML pipeline the aspect of 2 Princess Margaret Cancer Centre, University Health Network, data processing is of utmost importance. Consequently, for Toronto, ON, Canada AI applications to be successfully applied to the medical 3 Departments of Medical Biophysics and Computer Science, domain, an integrative machine learning approach might University of Toronto, Toronto, Canada be necessary [5], calling for the integration and fusion of 4 Krembil Research Institute, UHN, 60 Leonard Avenue, heterogeneous data sets, e.g., images, physiological data, 5KD-407, Toronto, Ontario, M5T 0S8, Canada text (non-standardized“unstructure” and structured patient 5 Institute of Neuroimmunology, Slovak Academy of Sciences, records), and diverse omics profiles (, microRNA, pro- Bratislava, Slovakia tein, metabolic, etc.) [6]. Eur J Nucl Med Mol Imaging (2019) 46:2722–2730 2723

Generally, in the health domain one can identify different to reproduce the results, for examples and a discussion areas of data volumes depending on the medical context in see [9]. which they have been created [7]: • Patient data from the electronic patient records (EPR), Integrative computational biology including clinical reports, mainly unstructured informa- tion in written clinical reports, but also lab test data, and Developing effective knowledge systems to support the all types of biomedical signals (ECG, EEG, EOG, etc). governance, processing, inference, analysis and interactive • Imaging data, e.g., from radiology, pathology (whole visualization of integrated omics data is critical to maximize slide images), dermatology (dermoscopy), but also the impact on translational research. The importance of sonography, etc. visualization is often underestimated, but it is the quality • Biomedical research data, including clinical trial data of the visualization that enables the experts end users to and all sort of omics data, particularly from next- understand the data in the context of their problems [10]. generation sequencing (NGS), etc. In addition to this Visualization enables more accurate and relevant modeling data, which is produced in a clinical environment, more of healthy and disease states, and in turn support precision and more data is being generated outside the clinical medicine [11]. sector, i.e., • Health business data, including management data, Integrating layers of omics data logistics, and accounting but also more and more prediction (e.g., resource planning); Understanding complex diseases (e.g., arthritis, brain • Private patient data, produced completely outside the disorders, cancer) requires computational analyses that clinical context, which is fostered by the possibilities integrate diverse layers of data—imaging data, omics of modern low-cost smartphones such as sport data, profiles, clinical data, and annotations. Such complex data wellness and data for ambient assisted living (sensor needs to be analyzed using scalable data mining, machine data) learning and statistical methods. In turn, comprehensive The US Department of Health and Human Services integration, further analysis and modeling requires detailed (HHS) created a taxonomy of health data with the following annotations and relationships among these entities (see seven dimensions: Fig. 1). Such integrated network-based analyses help create and validate explainable disease models, and enable 1. Demographics and socioeconomic data including age, improved treatments strategies and patient outcomes. gender, education, etc. Radiomics can aid this process (see Fig. 1), by pro- 2. Health status data, including disabilities, diagnoses, and viding additional features for the analysis, improved dis- symptoms ease characterization and patient stratification. Radiomics 3. Health resources data including the capacities of the enables more accurate disease classification than traditional health system, performance and operating data disease grading; for example, it significantly improves sen- 4. Health-care utilization, including data about treatment sitivity of identifying structural knee osteoarthritis [12]. and duration Imaging data can provide integrated in the ”front end” by 5. Health-care cost and expenditure data, including supporting cancer detection Cameron-2016, characterizing charges, insurance status, etc. micro-environment [13], tracking tumor biology [14], cell 6. Health-care outcomes of current and past prevention, proliferation, and blood vessel formation [15]. This can aid treatments, etc. in more precise and earlier tumor detection and sybtyping 7. Other data including omics data, but also environmental [16], more accurate staging, treatment planning, progno- exposure data such as environmental impacts, etc. sis [17], and non-invasive response monitoring [18–20]. Reproducibility is another recognized but still mostly From the machine learning perspective, this helps reducing ignored issue. Currently, there is a huge trend in the noise and heterogeneity. Radiomics can also be integrated opposite direction: most scientific contributions are judged in the ”back end” by helping quantifying disease dynamics according to their novelty—rarely for their reproducibility, and predicting response to treatment [21] through series of which would be the essential criteria of good science— “digital biopsies”. see the recent debate in Science [8]. A typical example is that the data are considered as inaccessible due to ethical From individual biomarkers to signatures restrictions. Data protection is of course an issue but there must be solutions to publish data along with the results One of the central tasks in precision medicine is the and the international scientific community should be invited identification of groups of markers for aiding diagnosis, 2724 Eur J Nucl Med Mol Imaging (2019) 46:2722–2730

Binding U - Uncharacterized Chlorprothixene Thioridazine Clozapine D - Genome Maintenance VIPR2 CHRM4 TMEM120A PLLP HTR4 CCR9 CCR8 CCR6 CCR4 CERS6 BDKRB2 NPSR1 PRNP FA2H CCR2 MGLL CHRM3 OXTR CCR1 MC4R RXFP4 WHSC2 CYSLTR2 SYT1 PTAFR ATP2A2 FZD7 GPR37 GNRHR RHBDD2 ADRB2 DRD2 CALM3 ATP1B1 OAZ1 GPRC5B HTR2C PTGER4 F2RL1 LTB4R2 MID1IP1 HTR2B RXFP3 AGTR1 YIF1A CHRM2 OPRM1 CHRM5 BUD31 SLC25A11 TTYH1 SLC3A2 GPR35 C4orf3 SYNGR2 DNAJC8 GPRIN2 YIF1B TMEM161A CCR5 HTR6 SMO ADORA2A ADORA1 HRH1 ADRA1A BDKRB1 GABBR1 OPRL1 MBP HNRNPK APLNR GPR77 HSPA1B GPR182 TAAR1 TSHR Catalytic Activity HTR4 C - Cellular Fate and Organization CCR9 Channel Regulator Activity P - Translation CCR8 B - Transcriptional Control CCR6 CCR5 Enzyme Regulator Activity T - Transcription CCR4 M - Other Metabolism CERS6 F - Fate BDKRB2 Regulation of Molecular Function NPSR1 G - Amino Acid Metabolism PRNP Transcription Factor Activity A - Transport and Sensing GLP3LGLP3L FA2H R - Stress and Defence ATF4 CCR2 ATF4 MGLL Receptor Regulator Activity E - Energy Production SQSTMSQSTM CHRM3 Unmatched RELBRELB TRPV1TRPV1 OXTR FXR2FXR2 CCR1 Structural Molecule Activity MC4R DNJB1DNJB1 B3KQ79B3KQ79 RELREL RXFP4 Translation Regulator Activity GBRL1GBRL1 NFKB2NFKB2 WHSC2 CYSLTR2 Transporter Activity MAOXMAOX TRFETRFE TANKTANK SYT1 S35B1S35B1 PTAFR UVRAGUVRAG MYO10MYO10 ATP2A2 WFS1WFS1 FZD7 CASP1CASP1 MYO6MYO6 GPR37 IBP3IBP3 RTN3RTN3 HTR6 Uncategorized GRB7GRB7 GNRHR CHKBCHKB GABR2GABR2 RHBDD2 IKBBIKBB ADRB2 ARFRPARFRP GABR1GABR1 Chlorprothixene DRD2 AKAP9AKAP9 CALM3 ATF6AATF6A RTN4RTN4 ATP1B1 NED4LNED4L OAZ1 ING2ING2 GPRC5B SCNAASCNAA MYO1EMYO1E HTR2C SMUF1SMUF1 NBR1NBR1 PTGER4 PIGGPIGG OPTNOPTN SMO GOGA4GOGA4 F2RL1 GPR37GPR37 3BP23BP2 ADORA2A LTB4R2 ANK3ANK3 R3GEFR3GEF APRAPR ARL1ARL1 MID1IP1 MARK4MARK4 HTR2B VIPR2 RXFP3 AGTR1 ESR1ESR1 HERP1HERP1MYOMEMYOME ADORA1 LASS2LASS2 YIF1A SCN1BSCN1B RAB3ARAB3A MCL1MCL1 CREL1CREL1 CHRM4 SERC3SERC3 S22AHS22AH ATP9AATP9A CHRM2 Thioridazine Clozapine OPRM1 VATLVATL HRH1 BASIBASI ADRA1A ADA1AADA1A SYUASYUA CHRM5 BDKRB1 CS042CS042 CALMCALM M3K1 GABBR1 ACM4 DRD2 ACM3 ACM5 OPRL1 MBP BUD31 HNRNPK GP125GP125 SLC25A11 PPIL2PPIL2 RANB3RANB3 PPIAPPIA TBA3CTBA3C TTYH1 PWP1PWP1 SLC3A2 WWTR1WWTR1 TMEM120A H33H33 GPR35 EXOS4EXOS4 C4orf3 APLNR MRT4MRT4 CUL3CUL3 SYNGR2 PLLP DNJA1DNJA1 P5CR3P5CR3 DNAJC8 GPRIN2 UBE3AUBE3A CIA30CIA30 YIF1B ALG6ALG6 GPR77 K0020K0020 TMEM161A ASF1AASF1A HSPA1B MSH6MSH6 NIP7NIP7 GPR182 UBE2NUBE2N NOL7NOL7 TAAR1 TSHR BCS1BCS1 HR = 2.09 (1.76 − 2.49) logrank P < 1E−16

Edge color: PPI supporting evidence-ovary AF10AF10

HLFHLF FBLN3FBLN3 MYPT2MYPT2 RNAseq MITFMITF MEOX2MEOX2 CD53CD53 TP53TP53 RNAseq and protein DYR1BDYR1B SC31ASC31A MYO6MYO6 CO5A2CO5A2 AT1B1AT1B1 STACSTAC ITB8ITB8 GLIS3GLIS3 RADIRADI PMP22PMP22 CHKACHKA No evidence TRPS1TRPS1 CTND1CTND1 IMB1IMB1 NFAC1NFAC1 AT8A1AT8A1 hsa-mir-143 CRESTCREST S20A1S20A1 FOSFOS SIX4SIX4 K2022K2022 PTPRDPTPRD CCD69CCD69 hsa-mir-148a K1024K1024 GNPTAGNPTA GAB1GAB1 PPM1EPPM1E DAPK1DAPK1 COBA1COBA1 VASH1VASH1 ZN318ZN318 SVEP1SVEP1 AKP13AKP13 GPM6AGPM6A SPT2SPT2 UBP6UBP6 CO2A1CO2A1 LIMD2LIMD2 FND3BFND3B STRN3STRN3 AKT3AKT3 IPKAIPKA BCL2L12BCL2L12 MTMR9MTMR9 WDR47WDR47 UBP32UBP32 AKAP6AKAP6 T22D3T22D3 AT11AAT11A BAI3BAI3 DYR1ADYR1A RND3RND3 BCL2BCL2 ENPLENPL AAKG2AAKG2 KLF4KLF4 AGO1AGO1 5HT7R5HT7R CYCCYC ICALICAL E41L2E41L2 SPSYSPSY CPEB4CPEB4 CAC1CCAC1C ULK2ULK2 ENPP2ENPP2 BCL2L14BCL2L14 RBM24RBM24 DCP2DCP2 NLKNLK RFIP4RFIP4 SPTA2SPTA2 MOT8MOT8 MD12LMD12L CHD9CHD9 BCORLBCORL FBN1FBN1 SPD2ASPD2A ITM2BITM2B E41L1E41L1 AP1G1AP1G1 SAPSAP ADRB2ADRB2 LAMA2LAMA2 BCL2L13BCL2L13 PKHH1PKHH1 ROBO1ROBO1 hsa-let-7fhsa-let-7f Probability DMDDMD GNSGNS NPTX1NPTX1 S4A4S4A4 RMD5ARMD5A OTUD4OTUD4 PLPL6PLPL6 PROSPROS MTMR4MTMR4 K2C5K2C5 ELNELN CRIM1CRIM1 DICERDICER PER1PER1 GSH1GSH1 ERR3ERR3 S38A2S38A2 QKIQKI JAG2JAG2 IGF1BIGF1B IL13IL13 ANR12ANR12 MAFBMAFB ID1ID1 Q32M63Q32M63 ZBTB5ZBTB5 CO1A2CO1A2 CREMCREM hsa-mir-375 HOME1HOME1 DDAH1DDAH1 DIAP2DIAP2 AVR2BAVR2B BZW1BZW1 CCND2CCND2 BCL2L10BCL2L10 NFIXNFIX ATX1ATX1 KS6A3KS6A3 CO4A6CO4A6 MPIP1MPIP1 ARF4ARF4 PDE4DPDE4D Q8IWE6Q8IWE6 CO3A1CO3A1 HXA5HXA5 JIP4JIP4 CPEB1CPEB1 SENP5SENP5 INSI1INSI1 M3K4M3K4 CLD1CLD1 BCL2L11BCL2L11 FLRT3FLRT3 CD5R1CD5R1 WDFY3WDFY3 PLAL2PLAL2 BUB1BBUB1B CHD4CHD4 VINCVINC NSD2NSD2 GP116GP116 ZDH17ZDH17 TOM70TOM70 JARD2JARD2 NFAT5NFAT5 DGKHDGKH TUSC2TUSC2 GAS7GAS7 BCL2L1BCL2L1 E2F7E2F7 VP37AVP37A ZFY26ZFY26 SMC1ASMC1A K1539K1539 TNR6BTNR6B UBP47UBP47 ARRD3ARRD3 RLFRLF PROF2PROF2 FRAS1FRAS1 HMGA2HMGA2 TBB2BTBB2B PTN4PTN4 DCXDCX BCL2L2BCL2L2 DKK1DKK1 PK3CAPK3CA CLCN3CLCN3 ZNT8ZNT8 ANR49ANR49 E2F5E2F5 VEGFAVEGFA STEFNB2EFNB2A13STA13 TF2H1TF2H1 SACSSACS DPOLZDPOLZ S1PR1S1PR1 LTBP1LTBP1 STXB5STXB5 PP2ABPP2AB SRPK2SRPK2 MEIS1MEIS1 ABI2ABI2 K1468K1468 GALT1GALT1 PPM1DPPM1D BCL2A1BCL2A1 B4DN89B4DN89 PRIC2PRIC2 Q5VZZ6Q5VZZ6 ITPK1ITPK1 PICALPICAL BZW2BZW2 MYBBMYBB CDC7CDC7 BCL2BCL2 E2F3E2F3 FOG2FOG2 ST7LST7L KLDC3KLDC3 ADNPADNP ZN462ZN462 BACH2BACH2 S12A6S12A6 SYNMSYNM EPNRP1NRP1AS1EPAS1 RNF6RNF6 Expression PAN3PAN3 hsa-mir-29b ICA69ICA69 ITA5ITA5 BC11ABC11A SAC1SAC1 CNOT6CNOT6 SNX1SNX1 PARD3PARD3 low MUC4MUC4 PKD2PKD2 RAB14RAB14 AP2CAP2C PHF6PHF6 CPEB3 CNBP1CNBP1 PAG1PAG1 ZEB2ZEB2 SMAD2SMAD2 NCOR2NCOR2 SESD1SESD1 XKR6XKR6 high HAS3HAS3 AT2B4AT2B4 RECKRECK SETD5SETD5 0.0 0.2 0.4 0.6 0.8 1.0 hsa-mir-10a ELF2ELF2 UN13BUN13B TYB4TYB4 VRK1VRK1 BRCA2BRCA2 CNOT8CNOT8 HBEGFHBEGF IRS1IRS1 PALM2PALM2 LRIG2LRIG2 NR4A3NR4A3 PKHA1PKHA1 HIF3AHIF3A BNI3LBNI3L NEUMNEUM SOX6SOX6 AJAP1AJAP1 IF2B3IF2B3 CTDSLCTDSL Node color: GO Biological Process WN10BWN10B ITSN2ITSN2 AVR2AAVR2A BLMBLM CRBL2CRBL2 CMTA1CMTA1 PRKRAPRKRA GCSHGCSH 0 50 100 150 200 NFIBNFIB BAZ2BBAZ2B CSMD1CSMD1 NSMA2NSMA2 Binding BRCA1BRCA1 KLF9KLF9 SCML2SCML2 AN13CAN13C TLE4TLE4 PTPRMPTPRM M3K8M3K8 PO2F2PO2F2 KMT2EKMT2E ABCA1ABCA1 CHD7CHD7 DTX4DTX4 KIF2AKIF2A Catalytic Activity TR11BTR11B KLHL3KLHL3 TNR1BTNR1B SATB1SATB1 IOD2IOD2 PHAR2PHAR2 ZMY11ZMY11 WNK3WNK3 Channel Regulator Activity Time (months) DYRK2DYRK2 ZWINTZWINT DSC2DSC2 STC1STC1 PDZD2PDZD2 hsa-mir-183 MBNL1MBNL1 TOP3ATOP3A LIPLLIPL SULF1SULF1 SPY1SPY1 Enzyme Regulator Activity DNMT1DNMT1CSF1CSF1 ZN410ZN410 Number at risk TBX2TBX2 FCSD2FCSD2 DTNADTNA AKA12AKA12 SNX2SNX2 TNFL6TNFL6 IF4E2IF4E2 Regulation of Molecular Function GTR3GTR3 SMAD7SMAD7 hsa-mir-210 MUC1MUC1 SAFBSAFB BCL6BCL6 TIAM1TIAM1 NDST1NDST1 low 481 273 46 11 2 PAQRBPAQRB PDCD4PDCD4 Transcription Factor Activity U - Uncharacterized TF7L2TF7L2 K0240K0240 AGO2AGO2 SESN1SESN1 high 1445 555 157 46 5 VIMEVIME SUZ12SUZ12 D - Genome Maintenance Receptor Regulator Activity MPDZMPDZ PLAG1PLAG1 C - Cellular Fate and Organization NKRFNKRF CNN1CNN1 P - Translation Structural Molecule Activity ERGERG RUNX3RUNX3 B - Transcriptional Control S10AAS10AA hsa-mir-21hsa-mir-21 NRDCNRDC MYT1LMYT1L T - Transcription Translation Regulator Activity Mutation disrupted interaction LIFRLIFR M - Other Metabolism RYBPRYBP PPBTPPBT CNTFRCNTFR F - Protein Fate Transporter Activity DAXXDAXX STAT6STAT6 G - Amino Acid Metabolism S35A1S35A1 Mutation increased interaction A - Transport and Sensing ZMYM2ZMYM2 R - Stress and Defence PIAS4PIAS4 FOXA1FOXA1 Uncategorized E - Energy Production EGR1EGR1 Unmatched A6NNU5A6NNU5 Mutation reduced interaction YPEL5YPEL5

Omics profiles across genome, proteome, Layers of annotated networks; Discovered relationships across Network relationships link Combined biomarkers Treatment tailored metabolome can be annotated with tissue, disease, data layers identify combined relevant entities within identify clinically-relevant to patient subgroups analyzed separately network properties can further biomarkers, drug mechanism each data layer and identify patient subgroups results in improved or combined to find characterize potential of action and create explainable better biomarkers patient outcomes differentially biomarkers disease models expressed entities

Fig. 1 Network-based omics data integration and analysis

and for predicting risk and treatment outcome. The different cohort of patients [23–25]. Reasons for such discovery and validation of such biomarkers is a complex failures include: (1) patient and sample heterogeneity, and computationally intensive process, utilizing advanced (2) range of biological assays with different technical statistics and ML algorithms. Both unsupervised data and analytical biases, (3) diversity of statistical and exploration and pattern discovery methods are useful. An bioinformatics algorithms and annotation databases used, ideal approach is one that characterizes each sample on the (4) disproportionately many more variables than samples basis of a small number of biologically relevant, pathway- analyzed, (5) and existence of multiple, clinically equivalent related variables, integrating omics data with imaging and biomarkers that basic statistical and ML algorithms cannot clinical data. distinguish. A promising alternative to the brute-force Current radiomics studies usually use smaller cohorts, approach that works in the space of expression levels of extract many features, and have limited validation across thousands of , microRNAs, metabolites or different instruments, leading to overfitting and in turn takes advantage of relationships among these entities and overoptimistic results that do not generalize. An overfit identifies a small number of biologically relevant, network- signature or model contains more features than could be structure or pathway-related variables, integrating high- justified from the training data set. Recently, Phantom study throughput, imaging, and clinical data. [22] explored reproducibility and robustness of radiomic Successful biomarker discovery and building explain- features for MRI, concluding that large fraction of imaging able models require integration of diverse and heterogenous features lack robustness. Standardizing instruments and databases, ontologies and biological networks. Ontolo- protocols will lead to ability for integrating larger datasets, gies and pathway/network annotations come from diverse, and validating signatures and models on independent distributed repositories, including (http:// datasets. This will increase generalizability, and in turn www.geneontology.org), Uniprot (http://www.uniprot.org), will help increasing reproducibility and robustness. In ProteinData Bank (http://www.rcsb.org), Disease Ontology turn, using robust features will result in better classifier (http://disease-ontology.org), Online Mendelian Inheri- performance Robinson-2019. tance in Man (http://www.omim.org), DrugBank (http:// Many methods have been introduced to generate useful www.drugbank.ca), DisGeNET (http://www.disgenet. biomarkers from individual and combined layers of omics, org), The Comparative Toxicogenomics Database (http:// imaging and clinical data, but results remain unsatisfactory; ctdbase.org), Integrated Interactions Database (http:// existing methods often suffer from overfitting due to ophid.utoronto.ca/iid), pathDIP (http://ophid.utoronto.ca/ small numbers of samples. Proposed biomarkers frequently pathDIP), etc. (see http://omictools.com/ for extensive do not validate using other biological assays or on a list). Eur J Nucl Med Mol Imaging (2019) 46:2722–2730 2725

Network-based computational biology Resulting multi-modal data can be further analyzed and visualized using algorithms from graph theory, and Successful network-based methods for class prediction identify patterns characterizing graph structure-function take advantage of network modules to identify score- relationship. Linking network structure to properties of based sub-network biomarkers. This could take advantage genes and proteins that form it provides powerful method of identifying network structures (complexes, graphlets, for predicting function [36, 38–40], identifying robust hubs and articulation points, etc.) or identifying overlap biomarkers [41, 42], and modeling drug mechanism of with curated pathway databases. Multiple studies have action [43, 44]. shown that resulting biomarkers are highly conserved across Integrated network-based analyses will lead to improving studies. data interpretation, help generating testable hypotheses, and These analytics workflows have to integrate diverse creating biologically meaningful models with clinical rel- heterogeneous data, comprising confidential and publicly evance [6, 45]. Importantly, broad imaging modalities and available data, local and broadly distributed data and diverse image data analysis algorithms will lead to robust annotations. Some parts of the analysis may strongly biomarkers for diagnosis and prognosis, but especially non- benefit from GPU (graphical processing unit) acceler- invasive monitoring of response to therapy [21]. Examples ations (image feature extraction algorithms, deep neu- include molecular, functional and anatomical imaging and ral networks, graph analysis algorithms), while others image feature extraction [46], including CT, ultrasound, require multi-core CPU (central processing unit) and magnetic resonance imaging, magnetic resonance spec- large bandwidth/storage. Implementing such discovery troscopy, positron emission tomography, etc. Imaging helps pipelines using automated workflows [26] support effi- to identify subregions for further omics studies [47]. Net- ciency, increase validation rates and reproducibility [27]. works bring individual data layers together, help de-noise Combining results of such analyses with diverse biological individual data sets, validate models, and explain results networks, including transcription regulatory, protein inter- [31]. action and microRNA:target networks, and metabolic and Despite noise present in interaction data sets, their sys- signaling pathways, with transcriptional changes induced by tematic analysis uncovers biologically relevant information: drugs [28] enables modeling drug mechanism of action. In lethality and synthetic lethality, functional organization, turn, radiomics can help predict [29] and measure treatment hierarchical structure, modularity, and network-building response [30]. motifs. Synthetic lethality has been first explored by syn- Interaction networks underlie the genotype to phenotype thetic genetic arrays to study genetic interaction in yeast relationship, understanding of which is the prime goal [48], then highly explored using graph theory algorithms for a system’s view of the cell, tissue or organism. (e.g., [36–38, 49]), and understanding of these principles While each omics data layer can be analyzed separately, resulted in applications for predicting drug combinations integrating findings across data layers using biological (e.g., [44, 50]. As such, network-based analysis of patient networks enables discovering new results through these data is essential for developing stratified and patient-centric relationships. Integrated data and network models reduce treatment strategies. Besides improved analytics and scala- biases, improve coverage and quality by eliminating noise bility, medical applications also require reliability, robust- and using reinforcement learning strategy to strengthen the ness, and explainability. This allows transitions from pat- signal [31]. terns and correlations to causation and explainable models Applications of ”integrative radiomics” include combin- [51]. As such, they will provide “an intellectual prosthesis” ing radiomics features from MRI with peak area features for the domain experts, striving for augmented rather than from MR spectroscopy (prostate cancer), integrating histo- just “artificial” intelligence. morphometric features with protein MS features for pre- dicting 5-year recurrence (prostate cancer), or integrating volumetric measurements on MRI with protein expression AI for precision medicine features (Alzheimers’ diagnosis) [32]. Importantly, fMRI data analysis helps creating complex network structures, AI holds great promise to revolutionize cancer care and such as structural brain network [33], neuro-connectivity bring precision medicine closer to reality [52–54]. There after brain injury [34], or schizophrenia characterization is an urgent need to invest in collecting, curating and by functional connectivity [35]. All such application would annotating the data and developing the algorithms necessary benefit from all the graph theoretical algorithms devel- to implement AI tools to optimize research and care [55]. oped mainly for protein interaction network analysis and We describe below the opportunities provided by AI for characterization (e.g., [36, 37]). precision medicine. 2726 Eur J Nucl Med Mol Imaging (2019) 46:2722–2730

Biomarker discovery which can be overcome by the use of AI tools for (semi-)automated segmentation of radiological images [65]. Developing biomarkers for cancer diagnostic, prognosis AI-based segmentation will not only save time but will also and prediction of therapy response or adverse side effects decrease the inter-observer variance that is currently high constitutes one of the main challenges in precision in clinical settings, enabling clinicians to better identify medicine. Combining the established clinical parameters and monitor lymph nodes that either benign or at risk of with other non-invasive measurements from radiological being invaded during tumor progression. Cardenas et al. images, blood or urine samples, would allow us to outline the most recent deep learning approaches using develop better predictors of clinical outcome and improve in radiology showing unprecedented performance but also monitoring of treatment effects over the course of the poses serious challenges for their clinical deployment [66]. therapy. Deep neural networks have recently been used Notably, these challenges include the retraining of the deep to extract more information from radiological images, a learning model on new data for calibration and performance relatively new field referred to as radiomics [56], and improvement, as described in the recent FDA while paper intense research is being pursued to further integrating on this topic.1 Similarly, AI can help pathologists speed multiple data modalities for higher accuracy [57]. up and increase the reliability of the diagnosis based on Imaging and genomic data have been shown to carry histo-pathological slides. Moreover, AI-based tools can redundant and complementary information to develop be developed to go beyond current diagnostic capabilities biomarkers. For instance, Sun et al. showed that it is possi- and allow for more comprehensive identification of cell ble to develop a radiomics signature from CT images as a types and their associations with aggressive phenotypes or surrogate for the level of CD8 cell tumor infiltration quan- response to targeted and chemotherapies. tified using RNA-sequencing [58]. This is an important finding, as it provides a non-invasive way to quantify the Natural language processing level of tumor infiltrated lymphocytes (TILs), an important pathological measurement during immunotherapy). How- Rich clinical data are embedded into dictated and tran- ever, molecular profiling from tumor biopsy or blood draw scribed clinical notes. Recent advances in AI for natu- may reveal genomic aberrations that are undetectable from ral language processing (NLP) allows for automated and images, supporting their complementary value with image- efficient extraction of discrete data elements from these based features. clinical notes [67]. NLP methods can be used to parse In addition to biomarkers developed from data directly the large amount of physicians’ notes, pathology reports, measured in clinical settings, the massive amount of radiology reports and other free-form clinical documenta- imaging and genomics data generated in the pre-clinical tion sources generated in clinical trials and standard-of- and research settings can be used to build novel predictors care. This is essential to provide context to other data of therapy response. Images and genomic data collected types, such as radiological, histo-pathological images and during drug testing in vitro (e.g., established cancer cell genomics. lines or patient-derived organoids) [59–61]andin vivo (e.g., genetically-engineered mouse models or patient-derived Ontology learning xenografts) [62] is a prime example of the richness of data that can be analyzed using AI methodologies to improve While vast amounts of heterogeneous data are collected in clinical predictors. administrative, clinical and research databases, integration and classification of these rich data sets are challenging Image segmentation due to the use of different systems involved in data collection and storage. Data across these various systems Imaging technologies are omnipresent in cancer research are often coded inconsistently, and lack the defined context and healthcare. AI, especially deep neural networks, already required to fully understand the relationship between demonstrated superior performance in image classification different data concepts. Increasing use of coding standards and segmentation of natural images compared to other and the application of ontologies to these heterogeneous machine learning approaches [63]. AI tools are being data sets will help ensure the data is optimally classified developed to optimize the current workflows in radiology and integrated to enable advanced analytics. AI-based and pathology [64]. Tumor and lymph node delineations methodologies can be used in NLP and ontology learning are complex and lengthy tasks performed by radiologists and radiation oncologists and represent bottlenecks for radiation therapy and monitoring of treatment effects, 1https://www.regulations.gov/document?D=FDA-2019-N-1185-0001 Eur J Nucl Med Mol Imaging (2019) 46:2722–2730 2727 to build and apply ontologies to better annotate and connect Future challenges and opportunities data assets across the institution. Ontology learning is the (semi-)automatic creation of ontologies, extracting the One of the grand objectives of the AI community is to domain’s terms and the relationships between the concepts develop algorithms that can automatically learn from data that these terms represent from a corpus of natural language and to provide predictions—without any human interaction, text, and encoding them with an ontology language for called automatic or autonomous machine learning (aML). easy retrieval [68]. As building ontologies manually is A close concept is automated ML (AutoML), which extremely labor-intensive and time-consuming, there is focuses on end-to-end automation of ML and helps, for a great need to automate the process using recent AI- example, to solve the problem of automatically producing based ontology learning applied to oncology, allowing us test set predictions for a new data set - without any to annotate and connect data at the scale of multiple human interaction [72]. Such automatic approaches are institutions. currently very successful in daily routine and can solve already a number of problems in an sufficient manner. Patient-reported outcomes Standard best-practices include the advances achieved with deep learning models in automatic speech recognition, Learning more about the phenotypes of the patients autonomous driving or game playing without human (disease, treatment response, adverse side effects, survival) intervention. Here the current best example is the mastering is crucial enable more relevant analyses of biomedical data of the game of Go, which has a long tradition in the AI [69]. AI enables collection of health-related data from the community and is indeed a good benchmark for progress in patients through the use of healthcare bots able to mimic automatic approaches [73]. human conversations, incorporate NLP, sentiment analysis, However, all these approaches are limited to certain and perform image recognition tasks to analyze photos, tasks in a very narrow field of specialization and totally handwritten notes and barcodes related to the patients’ lacking human-centered aspects, e.g., emotion, which is diseases, medications or treatment side effects [70]. AI- influencing cognition, perception, learning, communication based solutions can be used to better engage the patients and decision making [74]. Such approaches are not yet and collect high-quality patient-reported outcomes in a applied in the medical domain or on very specific small- systematic way. Such patient-reported outcomes can then scale problems. The currently best performing working be used as output variables to predict from imaging and/or examples are in automatic image classification, which are other data types, therefore enriching the pool of clinical on par with medical doctors or even outperforms them [75]. questions that can be addressed using machine learning However, bio-medical applications are diverse and as such, approaches. there will only be some parts fully automated, while other will require human experts involved in the decision-support Monitoring patients with wearable devices process – one size will not fit all. Importantly, considering the complexities of medical decisions and liability, we Wearable devices hold the promise to complement patient- will get further and faster by empowering/augmenting reported outcomes with longitudinal data streams for a experts with AI-technology rather than replacing them, more comprehensive assessment of symptom progression, using the human-in-the-loop concept [76] or at least requiring minimal patient interventions to inform their put the human-in-control. Using AI/ML solutions as an physician on their health status [71]. These devices augmenting/assistive technology [77] has the benefit that are becoming increasingly popular among the healthy both can complement each other, the human and the population and cancer patients. Although the current sensors machine. To date, only human experts are able to understand are limited to a few vital signs and are still pending the overall context [78]. regulatory approvals, the technology is evolving fast and Definitively the future will be in integrative approaches: are being applied in clinical trials for a variety of diseases. Integrative omics will link biopsy or liquid biopsy to provide AI methods have the potential to efficiently integrate these time-course view of disease progression and treatment new data streams in clinical applications, while important response, effective and efficient analytic pipelines will challenges remain to be addressed regarding possible ensure scalable and reproducible results, diverse networks artifacts due to noisy sensors or incorrect use by the patients, will help link relationships among measured entities across and issues about patient privacy. Wearable devices can multiple data layers and build explainable models, which supplement current patient information to provide clinicians in turn will provide improved hypothesis generation, with a more accurate view of the habits and needs of their validation and translation into clinical practice. A look patients, allowing for treatment to be more tailored and further is in bringing wearable devices, which will enable a effective. true “patient-centric” approach—as each patient’s data over 2728 Eur J Nucl Med Mol Imaging (2019) 46:2722–2730 time will be compared not to other patients with a similar 3. Komaroff AL. The variability and inaccuracy of medical data. condition/symptoms; rather, to their own, long-term data. Proc IEEE. 1979;67(9):1196–207. Finally, reproducibility, cost-efficiency, acquisition time, 4. Kadmon G, Pinchover M, Weissbach A, Kogan Hazan S, Nahum E. Case not closed: prescription errors 12 years after computeri- resolution and resulting confidence and precision are vital. zed physician order entry implementation. J Pediatr. 2017;190: AI and ML will become mainstream in the medical field, 236–40.e2. https://doi.org/10.1016/j.jpeds.2017.08.013. but cohort differences, instrument and protocol variations, 5. Holzinger A, Jurisica I. Knowledge discovery and data mining overall data quality, integrity and relevant annotation in biomedical informatics: the future is in integrative, interactive machine learning solutions. Interactive knowledge discovery and are paramount. The goals of comprehensively integrative data mining in biomedical informatics: state-of-the-art and future systems are not just classification; instead, the most useful challenges. Lecture notes in computer science LNCS 8401. In: application will be helping human experts discover new Holzinger A and Jurisica I, editors. Springer; 2014. p. 1–18. knowledge, form new questions and help explain and 6. Wang B et al. Similarity network fusion for aggregating data types on genomic scale. Nat Methods. 2014;11(3):3377. comprehend complex biological states and processes. 7. Holzinger A. Biomedical informatics: discovering knowledge in big data. New York: Springer; 2014. Funding This study was funded in part by Ontario Research 8. Hutson M. Artificial intelligence faces reproducibility crisis. Fund (34876) to IJ. B.H.K. was supported by the Gattuso-Slaight Science. 2018;359(6377):725–6. Personalized Cancer Medicine Fund at Princess Margaret Cancer 9. Goodman SN, Fanelli D, Ioannidis JPA. What does research Centre and the Artificial Intelligence and Microbiome Program reproducibility mean? Sci Transl Med. 2016;8(341):341ps12. supported by the Princess Margaret Cancer Foundation. 10. Turkay C, Jeanquartier F, Holzinger A, Hauser H. On computationally-enhanced visual analysis of heterogeneous data Compliance with Ethical Standards and its application in biomedical informatics. In: Interactive knowledge discovery and data mining: state-of-the-art and future challenges in biomedical informatics. Lecture notes in computer Conflict of interest Author AH declares no conflict of interest. Author science LNCS 8401. Springer; 2014. p. 117–140. BHK declares no conflict of interest. Author IJ declares no conflict of 11. Collins FS, Varmus H. A new initiative on precision medicine. interest. England J Med. 2015;372(9):793–5. 12. Schiphof D, Oei EH, Hofman A, Waarsing JH, Weinans H, Ethical approval This article does not contain any studies with human Bierma-Zeinstra SM. Sensitivity and associations with pain and participants or animals performed by any of the authors. body weight of an MRI definition of knee osteoarthritis compared with radiographic Kellgren and Lawrence criteria: a population- based study in middle-aged females. Osteoarthritis Cartilage. Appendix: Abbreviations 2014;22(3):440–6. 13. Mlecnik B, Bindea G, Kirilovsky A, Angell HK, Obenauf AC, Tosolini M, Church SE, Maby P, Vasaturo A, Angelova M, AI ... Artificial Intelligence Fredriksen T, Mauger S, Waldner M, Berger A, Speicher MR, CPU ... Central Processing Unit Pages F, Valge-Archer V, Galon J. The tumor microenvironment CT ... Computer Tomography and immunoscore are critical determinants of dissemination to ECG ... Electrocardiography distant metastasis. Sci Transl Med. 2016;8(327):327ra26. 14. Sanduleanu S, Woodruff HC, de Jong EEC, van Timmeren JE, EEG ... Electroencephalography Jochems A, Dubois L, Lambin P. Tracking tumor biology with EOG ... Electrooculography radiomics: a systematic review utilizing a radiomics quality score. EPR ... Electronic Patient Record Radiotherapy and Oncology: Journal of the European Society fMRI ... Functional Magnetic Resonance Imag- for Therapeutic Radiology and Oncology. 2018;127(3):349– 60. ing 15. Vargas HA, Veeraraghavan H, Micco M, Nougaret S, Lakhman GPU ... raphical Processing Unit Y, Meier AA, Sosa R, Soslow RA, Levine DA, Weigelt ICD ... International Classification of Diseases B, Aghajanian C, Hricak H, Deasy J, Snyder A, Sala E. ML ... Machine Learning A novel representation of inter-site tumour heterogeneity from pre-treatment computed tomography textures classifies ovarian MRI ... Magnetic Resonance Imaging cancers by clinical outcome. Eur Radiol. 2017;27(9):3991– NGS ... Next-Generation Sequencing 4001. SNOMED CT ... Standard Nomenclature of Medicine 16. Yin Q, Hung SC, Rathmell WK, Shen L, Wang L, Lin Clinical Terms W, Fielding JR, Khandani AH, Woods ME, Milowsky MI, Brooks SA, Wallen EM, Shen D. Integrative radiomics TILs ... Tumor Infiltrated Lymphocytes expression predicts molecular subtypes of primary clear cell renal cell carcinoma. Clin Radiol. 2018;73(9):782–91. 17. Liao Xin, Bo Cai, Tian Bin, Luo Yilin, Song Wen, Li Yinglong. Machine-learning based radiogenomics analysis of MRI features References and metagenes in glioblastoma multiforme patients with different survival time. J Cell Mol Med, 4. 2019. 1. Marcus G. Deep learning: A critical appraisal. arXiv:1801.00631. 18. Danala G, Thai T, Gunderson CC, Moxley KM, Moore K, 2018. Mannel RS, Liu H, Zheng B, Qiu Y. Applying quantitative 2. Zhang Q, Yang LT, Chen Z, Li P. A survey on deep learning for CT image feature analysis to predict response of ovarian cancer big data. Inf Fus. 2018;42:146–57. patients to chemotherapy. Acad Radiol. 2017;24(10):1233–9. Eur J Nucl Med Mol Imaging (2019) 46:2722–2730 2729

19. Cook GJR, Azad G, Owczarczyk K, Siddique M, Goh V. 33. Ingalhalikar M, Smith A, Parker D, Satterthwaite TD, Elliott Challenges and promises of PET radiomics. Int J Radiat Oncol MA, Ruparel K, Hakonarson H, Gur RE, Gur RC, Verma R. Biol Phys. 2018;102(4):1083–89. Sex differences in the structural connectome of the human brain. 20. Lin G, Lai CH, Yen TC. Emerging molecular imaging Proc Natl Acad Sci USA. 2014;111(2):823–8. techniques in gynecologic oncology. PET Clin. 2018;13(2):289– 34. Bigler ED. Systems biology, neuroimaging, neuropsychology, 99. neuroconnectivity and traumatic brain injury. Front Syst Neurosci. 21. Xu Y, Hosny A, Zeleznik R, Parmar C, Coroller T, Franco 2016;10:55. I, Mak RH, Aerts HJWL. Deep learning predicts lung cancer 35. Cui LB, Liu L, Wang HN, Wang LX, Guo F, Xi YB, Liu treatment response from serial medical imaging. Clinical Cancer TT, Li C, Tian P, Liu K, Wu WJ, Chen YH, Qin W, Yin Research. 2019. H. Disease definition for schizophrenia by functional connectivity 22. Baessler B, Weiss K, Pinto Dos Santos D. Robustness and using radiomics strategy. Schizophrenia Bull. 2018;44(5):1053–9. reproducibility of radiomics in magnetic resonance imaging: a 36. Jeong H, Mason SP, Barabasi AL, Oltvai ZN. Lethality and phantom study. Invest Radiol. 2019;54(4):221–8. centrality in protein networks. Nature. 2001;411(6833):41–2. 23. Lau SK, Boutros PC, Pintilie M, Blackhall FH, Zhu CQ, 37. Przulj N, Wigle DA, Jurisica I. Functional topology in a network Strumpf D, Johnston MR, Darling G, Keshavjee S, Waddell of protein interactions. Bioinformatics. 2004;20(3):340–8. TK,LiuN,LauD,PennLZ,ShepherdFA,JurisicaI,Der 38. Ye P, Peyser BD, Pan X, Boeke JD, Spencer FA, Bader JS. Gene SD, Tsao MS. Three-gene prognostic classifier for early-stage function prediction from congruent synthetic lethal interactions in non small-cell lung cancer. J Clin Oncol. 2007;25(35):5562–9. yeast. Molec Syst Biol. 2005;1:1. 24. Shedden K, Taylor JM, Enkemann SA, Tsao MS, Yeatman 39. Sharan R, Ulitsky I, Shamir R. Network-based prediction of TJ, Gerald WL, Eschrich S, Jurisica I, Giordano TJ, Misek protein function. Mol Syst Biol, 3. 2007. DE, Chang AC, Zhu CQ, Strumpf D, Hanash S, Shepherd FA, 40. Ulitsky I, Shamir R. Identification of functional modules using Ding K, Seymour L, Naoki K, Pennell N, Weir B, Verhaak network topology and high-throughput data. BMC Syst Biol, 1. R, Ladd-Acosta C, Golub T, Gruidl M, Sharma A, Szoke J, 2007. Zakowski M, Rusch V, Kris M, Viale A, Motoi N, Travis 41. Chuang HY, Lee E, Liu YT, Lee D, Ideker T. Network-based W, Conley B, Seshan VE, Meyerson M, Kuick R, Dobbin classification of breast cancer metastasis. Mol Syst Biol, 3. 2007. KK, Lively T, Jacobson JW, Beer DG. Gene expression-based 42. Fortney K, Kotlyar M, Jurisica I. Inferring the functions survival prediction in lung adenocarcinoma: a multi-site, blinded of longevity genes with modular subnetwork biomarkers of validation study. Nat Med. 2008;14(8):822–7. Caenorhabditis elegans aging. Genome Biol. 2010;11(2):R13. 25. Boutros PC, Lau SK, Pintilie M, Liu N, Shepherd FA, 43. Kotlyar M, Fortney K, Jurisica I. Network-based characterization Der SD, Tsao MS, Penn LZ, Jurisica I. Prognostic gene of drug-regulated genes, drug targets, and toxicity. Methods. signatures for non-small-cell lung cancer. Proc Natl Acad Sci 2012;57(4):499–507. Protein-Protein Interactions. USA. 2009;106(8):2824–8. 44. Hinze L, Pfirrmann M, Karim S, Degar J, McGuckin C, 26. Beaulieu-Jones BK, Greene CS. Reproducibility of computa- Vinjamur D, Sacher J, Stevenson KE, Neuberg DS, Orellana tional workflows is automated using continuous analysis. Nat E, Stanulla M, Gregory RI, Bauer DE, Wagner FF, Stegmaier Biotechnol. 2017;34(4):342–6. K, Gutierrez A. Synthetic lethality of Wnt pathway activation 27. Munafo` MR, Nosek BA, Bishop VMD, Button SB, Chambers and asparaginase in drug-resistant acute leukemias. Cancer Cell. CD, du Sert N, Simonsohn U, Wagenmakers E-J, Ware JJ, 2019;35(4):664–676.e667. Ioannidis JPA. A manifesto for reproducible science. Nat Hum 45. Wong SWH, Pastrello C, Kotlyar M, Faloutsos C, Jurisica Behav. 2017;1(1):s41562–016–0021. I. Sdregion: fast spotting of changing communities in biological 28. Lamb J, Crawford ED, Peck D, Modell JW, Blat IC, networks. In: Proceedings of the ACM SIGKDD international Wrobel MJ, Lerner J, Brunet J-P, Subramanian A, Ross conference on knowledge discovery and data mining. New York: KN, Reich M, Hieronymus H, Wei G, Armstrong SA, ACM; 2018. p. 867–875. Haggarty SJ, Clemons PA, Wei R, Carr SA, Lander 46. Lambin P, Rios-Velazquez E, Leijenaar R, Carvalho S, van ES, Golub TR. The connectivity map: using gene-expression Stiphout RG, Granton P, Zegers CM, Gillies R, Boellard R, signatures to connect small molecules, genes, and disease. Dekker A, Aerts HJ. Radiomics: extracting more information Science. 2006;313(5795):1929–35. from medical images using advanced feature analysis. Eur J 29. Sun R, Limkin EJ, Vakalopoulou M, Dercle L, Champiat S, Han Cancer. 2012;48(4):441–6. SR, Verlingue L, Brandao D, Lancia A, Ammari S, Hollebecque 47. Ger RB, Zhou S, Chi PM, Lee HJ, Layman RR, Jones A, Scoazec JY, Marabelle A, Massard C, Soria JC, Robert AK, Goff DL, Fuller CD, Howell RM, Li H, Stafford C, Paragios N, Deutsch E, Ferte C. A radiomics approach to RJ, Court LE, Mackin DS. Comprehensive investigation on assess tumour-infiltrating cd8 cells and response to anti-pd-1 or controlling for CT imaging variabilities in radiomics studies. Sci anti-pd-l1 immunotherapy: an imaging biomarker, retrospective Rep. 2018;8(1):13047. multicohort study. Lancet Oncol. 2018;19(9):1180–91. 48. Jorgensen P, Nelson B, Robinson MD, Chen Y, Andrews B, 30. Aerts HJ, Grossmann P, Tan Y, Oxnard GR, Rizvi N, Schwartz Tyers M, Boone C. High-resolution genetic mapping with ordered LH, Zhao B. Defining a radiomic response phenotype: a pilot arrays of Saccharomyces cerevisiae deletion mutants. Genetics. study using targeted therapy in NSCLC. Sci Rep. 2016;6:33860. 2002;162(3):1091–9. 31. Tokar T, Pastrello C, Ramnarine VR, Zhu CQ, Craddock 49. Suthers PF, Zomorrodi A, Maranas CD. Genome-scale KJ,PikorLA,VucicEA,VaryS,ShepherdFA,TsaoMS, gene/reaction essentiality and synthetic lethality analysis. 2009, Lam WL, Jurisica I. Differentially expressed microRNAs in Vol. 5. lung adenocarcinoma invert effects of copy number aberrations of 50. Chan EM, Shibue T, McFarland JM, Gaeta B, Ghandi M, prognostic genes. Oncotarget. 2018;9(10):9137–55. Dumont N, Gonzalez A, McPartlan JS, Li T, Zhang Y, Liu 32. Viswanath SE, Tiwari P, Lee G, Madabhushi A. Dimensionality JB, Lazaro J-B, Gu P, Piett CG, Apffel A, Ali SO, Deasy R, reduction-based fusion approaches for imaging and non-imaging Keskula P, Ng RWS, Roberts EA, Reznichenko E, Leung L, biomedical data: concepts, workflow, and use-cases. BMC Med Alimova M, Schenone M, Islam M, Maruvka YE, Liu Y, Roper Imaging. 2017;17(1):2. J, Raghavan S, Giannakis M, Tseng Y-Y, Nagel ZD, D’Andrea 2730 Eur J Nucl Med Mol Imaging (2019) 46:2722–2730

A, Root DE, Boehm JS, Getz G, Chang S, Golub TR, Tsherniak 63. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. A, Vazquez F, Bass AJ. Wrn helicase is a synthetic lethal target 2015;521(7553):436. in microsatellite unstable cancers. Nature. 2019;568(7753):551–6. 64. Chartrand G, Cheng PM, Vorontsov E, Drozdzal M, Turcotte 51. Holzinger A, Langs G, Denk H, Zatloukal K, Mueller H. S, Pal CJ, Kadoury S, An T. Deep learning A primer for Causability and explainability of AI in medicine. Wiley Inter- radiologists. Radiographics Rev Publ Radiological Soc North Am disciplinary Reviews, Data Mining and Knowledge Discovery. Inc. 2017;37(7):2113–31. 2019. 65. Sharp G, Fritscher KD, Pekar V, Peroni M, Shusharina 52. Krittanawong C, Zhang HJ, Wang Z, Aydar M, Kitai T. N, Veeraraghavan H, Yang J. Vision 20/20: perspectives Artificial intelligence in precision cardiovascular medicine. J Am on automated image segmentation for radiotherapy. Med Phys. Coll Cardiol. 2017;69(21):2657–64. 2014;41(5):050902. 53. Hamet P, Tremblay J. Artificial intelligence in medicine. 66. Cardenas CE, Yang J, Anderson BM, Court LE, Brock Metabolism. 2017;69:S36–40. KB. Advances in auto-segmentation. Semin Radiat Oncol. 54. Robertson S, Azizpour H, Smith K, Hartman J. Digital image 2019;29(3):185–97. Adaptive Radiotherapy and Automation. analysis in breast pathology—from image processing techniques 67. Kreimeyer K, Foster M, Pandey A, Arya N, Halford G, Jones to artificial intelligence. Transl Res. 2018;194:19–35. SF, Forshee R, Walderhaug M, Botsis T. Natural language 55. Topol EJ. High-performance medicine: the convergence of human processing systems for capturing and standardizing unstructured and artificial intelligence. Nat Med. 2019;25(1):44–56. clinical information: a systematic review. J Biomed Inform, 73. 56. Hosny A, Parmar C, Quackenbush J, Schwartz LH, 2017. Aerts H. Artificial intelligence in radiology. Nat Rev Cancer. 68. Asim M, Wasim M, Khan M, Mahmood W, Abbasi H. A survey 2018;18(8):500–510. of ontology learning techniques and applications. Database, 2018. 57. Kang J, Rancati T, Lee S, Oh J, Kerns SL, Scott JG, 2018. Schwartz R, Kim S, Rosenstein BS. Machine learning and 69. Deshpande PR, Rajan S, Sudeepthi LB, Abdul CN. Patient- radiogenomics: lessons learned and future directions. Frontiers reported outcomes: a new era in clinical research. Perspectives Oncol. 2018;8:228. Clin Res. 2011;2(4):137–44. 58. Sun R, Limkin E, Vakalopoulou M, Dercle L, Champiat S, Han 70. Loh E. Medicine and the rise of the robots: a qualitative review S, Verlingue L, Brandao D, Lancia A, Ammari S, Hollebecque of recent advances of artificial intelligence in health. BMJ Leader. A, Scoazec J-Y, Marabelle A, Massard C, Soria J-C, Robert 2018;2(2):59–63. C, Paragios N, Deutsch E, Ferte´ C. A radiomics approach to 71. Dias D, Cunha J. Wearable health Devices—Vital sign assess tumour-infiltrating CD8 cells and response to anti-PD-1 or monitoring, systems and technologies. Sensors. 2018;18(8):2414. anti-PD-L1 immunotherapy: an imaging biomarker, retrospective 72. Guyon I, Bennett K, Cawley G, Escalante HJ, Escalera S, Ho multicohort study. Lancet Oncol. 2018;19(9):1180–91. TK, Macia N, Ray B, Saeed M, Statnikov A. Design of the 59. Iorio F, Knijnenburg TA, Vis DJ, Bignell GR, Menden MP, 2015 ChaLearn AutoML Challenge. In: 2015 International joint Schubert M, Aben N, Gonc¸alves E, Barthorpe S, Lightfoot H, conference on neural networks (IJCNN). IEEE; 2015. p. 1–8. Cokelaer T, Greninger P, van Dyk E, Chang H, de Silva H, 73. Silver D, Huang A, Maddison CJ, Guez A, Sifre L, van den Heyn H, Deng X, Egan RK, Liu Q, Mironenko T, Mitropoulos Driessche G, Schrittwieser J, Antonoglou I, Panneershelvam V, X, Richardson L, Wang J, Zhang T, Moran S, Sayols S, Lanctot M, Dieleman S, Grewe D, Nham J, Kalchbrenner N, Soleimani M, Tamborero D, Lopez-Bigas´ N, Ross-Macdonald Sutskever I, Lillicrap T, Leach M, Kavukcuoglu K, Graepel T, P, Esteller M, Gray NS, Haber DA, Stratton MR, Benes Hassabis D. Mastering the game of go with deep neural networks CH, Wessels LFA, Saez-Rodriguez J, McDermott U, Garnett and tree search. Nature. 2016;529(7587):484–9. MJ. A landscape of pharmacogenomic interactions in cancer. Cell. 74. Stickel C, Ebner M, Steinbach-Nordmann S, Searle G, 2016;166(3):740–54. Holzinger A. Emotion detection: application of the valence 60. Seashore-Ludlow B, Rees MG, Cheah JH, Cokol M, Price arousal space for rapid biological usability testing to enhance EV, Coletti ME, Jones V, Bodycombe NE, Soule CK, Gould universal access. Universal access in human–computer interaction. J, Alexander B, Li A, Montgomery P, Wawer MJ, Kuru Addressing diversity, lecture notes in computer science, LNCS N, Kotz JD, Hon S-YC, Munoz B, Liefeld T, Dancik V, 5614. In: Stephanidis C, editors. Springer; 2009. p. 615–24. Bittker JA, Palmer M, Bradner JE, Shamji AF, Clemons 75. Esteva A, Kuprel B, Novoa RA, Ko J, Swetter SM, Blau HM, PA, Schreiber SL. Harnessing connectivity in a large-scale small- Thrun S. Dermatologist-level classification of skin cancer with molecule sensitivity dataset. Cancer Discovery. deep neural networks. Nature. 2017;542(7639):115–8. 61. Haverty PM, Lin E, Tan J, Yu Y, Lam B, Lianoglou S, Neve 76. Holzinger A. Interactive machine learning for health informat- RM, Martin S, Settleman J, Yauch RL, Bourgon R. Reproducible ics: when do we need the human-in-the-loop? Brain Inform. pharmacogenomic profiling of cancer cell line panels. Nature. 2016;3(2):119–31. 2016;533(7603):333–7. 77. Holzinger A, Stocker C, Ofner B, Prohaska G, Brabenetz 62. Gao H, Korn JM, Ferretti S, Monahan JE, Wang Y, Singh M, A, Hofmann-Wellenhof R. Combining HCI, natural language Zhang C, Schnell C, Yang G, Zhang Y, Balbin AO, Barbe S, processing, and knowledge discovery - potential of IBM content Cai H, Casey F, Chatterjee S, Chiang DY, Chuai S, Cogan analytics as an assistive technology in the biomedical domain. In: SM, Collins SD, Dammassa E, Ebel N, Embry M, Green J, Springer lecture notes in computer science LNCS 7947. Springer; Kauffmann A, Kowal C, Leary RJ, Lehar J, Liang Y, Loo 2013. Alice, Lorenzana EIII, Robert E, McLaughlin ME, Merkin J, 78. Holzinger A, Plass M, Holzinger K, Crisan GC, Pintea C-M, Meyer R, Naylor TL, Patawaran M, Reddy A, Roelli¨ C, Ruddy Palade V. Towards interactive machine learning (IML): applying DA, Salangsang F, Santacroce F, Singh AP, Tang Y, Tinetto ant colony algorithms to solve the traveling salesman problem W, Tobler S, Velazquez R, Venkatesan K, Arx F, Wang H, with the human-in-the-loop approach. In: Springer lecture notes in Wang Z, Wiesmann M, Wyss D, Xu F, Bitter H, Atadja P, computer science LNCS 9817. Springer; 2016. p. 81–95. Lees E, Hofmann F, Li E, Keen N, Cozens R, Jensen M, Pryer NK, Williams JA, Sellers WR. High-throughput screening using patient-derived tumor xenografts to predict clinical trial drug Publisher’s note Springer Nature remains neutral with regard to response. Nat Med. 2015;21(11):1318–25. jurisdictional claims in published maps and institutional affiliations.