Artificial Intelligence in Bulk and Single-Cell RNA-Sequencing Data
Total Page:16
File Type:pdf, Size:1020Kb
International Journal of Molecular Sciences Review Artificial Intelligence in Bulk and Single-Cell RNA-Sequencing Data to Foster Precision Oncology Marco Del Giudice 1,2,† , Serena Peirone 1,3,†, Sarah Perrone 1,4, Francesca Priante 1,4, Fabiola Varese 1,5, Elisa Tirtei 6 , Franca Fagioli 6,7 and Matteo Cereda 1,2,* 1 Cancer Genomics and Bioinformatics Unit, IIGM—Italian Institute for Genomic Medicine, c/o IRCCS, Str. Prov.le 142, km 3.95, 10060 Candiolo, TO, Italy; [email protected] (M.D.G.); [email protected] (S.P.); [email protected] (S.P.); [email protected] (F.P.); [email protected] (F.V.) 2 Candiolo Cancer Institute, FPO—IRCCS, Str. Prov.le 142, km 3.95, 10060 Candiolo, TO, Italy 3 Department of Physics and INFN, Università degli Studi di Torino, via P.Giuria 1, 10125 Turin, Italy 4 Department of Physics, Università degli Studi di Torino, via P.Giuria 1, 10125 Turin, Italy 5 Department of Life Science and System Biology, Università degli Studi di Torino, via Accademia Albertina 13, 10123 Turin, Italy 6 Paediatric Onco-Haematology Division, Regina Margherita Children’s Hospital, City of Health and Science of Turin, 10126 Turin, Italy; [email protected] (E.T.); [email protected] (F.F.) 7 Department of Public Health and Paediatric Sciences, University of Torino, 10124 Turin, Italy * Correspondence: [email protected]; Tel.: +39-011-993-3969 † The authors wish it to be known that, in their opinion, the first two authors should be regarded as Joint First Authors. Abstract: Artificial intelligence, or the discipline of developing computational algorithms able to perform tasks that requires human intelligence, offers the opportunity to improve our idea and Citation: Del Giudice, M.; Peirone, S.; delivery of precision medicine. Here, we provide an overview of artificial intelligence approaches for Perrone, S.; Priante, F.; Varese, F.; the analysis of large-scale RNA-sequencing datasets in cancer. We present the major solutions to dis- Tirtei, E.; Fagioli, F.; Cereda, M. entangle inter- and intra-tumor heterogeneity of transcriptome profiles for an effective improvement Artificial Intelligence in Bulk and of patient management. We outline the contributions of learning algorithms to the needs of cancer Single-Cell RNA-Sequencing Data to genomics, from identifying rare cancer subtypes to personalizing therapeutic treatments. Foster Precision Oncology. Int. J. Mol. Sci. 2021, 22, 4563. https://doi.org/ Keywords: artificial intelligence; RNA sequencing; cancer heterogeneity 10.3390/ijms22094563 Academic Editor: Jung Hun Oh Received: 20 March 2021 1. Introduction Accepted: 23 April 2021 Artificial intelligence (AI) is becoming a fundamental asset for healthcare and life Published: 27 April 2021 science research. Despite being in its infancy, research activities employing AI are chang- ing our understanding and vision of science. The European Commission has recently Publisher’s Note: MDPI stays neutral estimated that 13% of global venture capital investments (i.e., ~5 billion of Euros) are with regard to jurisdictional claims in for start-ups dedicated to AI application in medicine [1]. This commitment reflects the published maps and institutional affil- interest in the potential of AI to improve healthcare. Precision medicine is a new approach iations. to health. In the last decade, the generation of Big Data through genome sequencing (i.e., genomic Big Data), the collection of clinical data, and the growth of bioinformatics has made it possible to identify the genetic causes responsible for onset and progression of diseases and to support the clinical management of patients. Despite the high expectations, Copyright: © 2021 by the authors. personalized therapeutic treatments still remain limited. A breakdown is the lack of AI Licensee MDPI, Basel, Switzerland. infrastructure and models capable of supporting the constant generation of genomic Big This article is an open access article Data [2]. Consequently, the challenge remains how to interpret the variety of information distributed under the terms and contained in these data [3]. conditions of the Creative Commons The need for AI models is even more evident in complex diseases such as cancer. Attribution (CC BY) license (https:// The heterogeneity that characterizes Big Data is amplified in cancer, where diversity not creativecommons.org/licenses/by/ only manifests itself across individuals (i.e., inter-tumor) but also within each tumor 4.0/). Int. J. Mol. Sci. 2021, 22, 4563. https://doi.org/10.3390/ijms22094563 https://www.mdpi.com/journal/ijms Int. J. Mol. Sci. 2021, 22, x FOR PEER REVIEW 2 of 18 Int. J. Mol. Sci. 2021, 22, 4563 2 of 18 manifests itself across individuals (i.e., inter‐tumor) but also within each tumor (i.e., intra‐ (i.e.,tumor) intra-tumor) [4]. So far, [ 4cancer]. So far,sequencing cancer sequencing projects have projects made have available made genomic available profiles genomic for profilesthousands for of thousands biological of samples, biological corresponding samples, corresponding to petabytes toof petabytesgenetic information of genetic [5]. in- formationWith the introduction [5]. With the of introductionsingle‐cell technologies, of single-cell the technologies, complexity of the genomic complexity information of ge- nomichas grown information rapidly. has This grown heterogeneity rapidly. Thisrepresents heterogeneity the major represents hurdle to the achieve major effective hurdle toprecision achieve oncology. effective Therefore, precision oncology. AI is the pivotal Therefore, tool to AI exploit is the the pivotal information tool to exploitavailable the in informationgenomic Big available Data and in ultimately genomic Big“deliver” Data and a medicine ultimately of precision. “deliver” aThe medicine COVID of‐19 preci- pan‐ sion.demicThe has COVID-19 opened uppandemic new possibilities has opened for upAI development. new possibilities The for pandemic AI development. has increased The pandemicthe use of hasAI in increased biomedical the research: use of AI from in biomedical remotely monitoring research: from patients, remotely to predicting monitoring the patients,spread of to the predicting SARS‐CoV the‐2 spread coronavirus of the or SARS-CoV-2 in developing coronavirus new drugs or [6,7]. in developing The pandemic new drugshas also [6, 7brought]. The pandemic about new has clinical also brought practices, about primarily new clinical the use practices, of mRNA primarily vaccines. the This use oftechnological mRNA vaccines. leap forward This technological gives the possibility leap forward of accelerating gives the possibility the delivery of accelerating of similar ther the‐ deliveryapies to cancer of similar [8]. therapies to cancer [8]. TranscriptomicsTranscriptomics generally generally refers refers to to the the high-throughput high‐throughput profiling profiling of of all all RNA RNA species species producedproduced by by cells. cells. Among Among genomic genomic Big Data,Big Data, transcriptomics transcriptomics has seen has an seen explosive an explosive growth ingrowth recent in years recent [9]. years RNA [9]. sequencing RNA sequencing (RNA-seq) (RNA profiles‐seq) dynamic profiles biologicaldynamic biological processes thatpro‐ arecesses active that in are a active population in a population of cells or of in cells single or in cells. single Assessing cells. Assessing the complexity the complexity of these of profilesthese profiles could could inform inform the discovery the discovery of new of new biomarkers biomarkers and and therapeutic therapeutic targets. targets. Since Since RNA-seqRNA‐seq screeningsscreenings areare becomingbecoming partpart ofof precisionprecision medicinemedicine trialstrials [[10,11],10,11], AI AI mining mining of of thesethese datadata isis thusthus required required to to determine determine novel novel clinical clinical targets. targets. InIn thisthis paper,paper, we we provide provide an an overview overview of of AI AI approachesapproaches applied applied to to high-volume high‐volume bulk bulk andand single-cellsingle‐cell RNA-seqRNA‐seq inin cancercancer genomicsgenomics andand precisionprecision oncology.oncology. WeWe do do not not intend intend to to provideprovide a a comprehensive comprehensive characterization characterization of of all all published published AI AI methods methods and and their their technical technical details.details. ByBy contrast, contrast, we we illustrate illustrate the the major major AI AI solutions solutions to to disentangle disentangle the the heterogeneity heterogeneity of cancerof cancer transcriptomes transcriptomes for for an an effective effective improvement improvement of of patient patient management. management. We We explain explain distinctdistinct strategies strategies to to face face the the “heterogeneity“heterogeneity challenge”. challenge”. We We then then outline outline some some of of the the major major contributionscontributions of of applying applying AI AI to to the the needsneeds ofof cancercancer genomics,genomics, fromfrom identifying identifying rare rare cancer cancer subtypessubtypes toto personalizingpersonalizing treatment treatment for for individuals. individuals. 2.2. AIAI inin thethe EraEra ofof TranscriptomicTranscriptomic Big Big Data Data FromFrom thethe firstfirst drafts of of the the human human genome genome [12], [12], 20 20 years years ago, ago, the the number number of ofscientific scien- tificworks works employing employing sequencing sequencing data data has hasexponentially