Image Bio-markers and Gene Expression Data Correlation Framework for Lung Cancer Radio-genomics Analysis Based on Deep Learning

Dong Sui (  [email protected] ) Beijing University of Civil Engineering and Architecture https://orcid.org/0000-0002-7887-2111 Maozu Guo Beijing University of Civil Engineering and Architecture Xiaoxuan Ma Beijing University of Civil Engineering and Architecture Julian Baptiste University of Maryland Medical Center Lei Zhang University of Maryland Baltimore County

Research

Keywords: Radiomics, Radiogenomics, Deep Learning, Genomics Biomarker, GSEA

Posted Date: January 18th, 2021

DOI: https://doi.org/10.21203/rs.3.rs-144196/v1

License: This work is licensed under a Creative Commons Attribution 4.0 International License. Read Full License

Image Bio-markers and Gene Expression Data Correlation Framework for Lung Cancer Radio-genomics Analysis Based on Deep Learning

Dong Sui1*, Maozu Guo1, Xiaoxuan Ma1, Julian Baptiste2 and Lei Zhang2

*Correspondence: [email protected]
1 School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing, China
Full list of author information is available at the end of the article

Abstract

Background: Precision medicine, a popular treatment strategy, has become increasingly important to the development of targeted therapy. To correlate medical imaging with prognostic and genomic data, research in radiomics and radiogenomics has provided many pre-defined image features that describe image information quantitatively or qualitatively. However, previous studies only report statistical evidence of high correlation among multi-source medical data; they cannot provide intuitive, visual results.

Results: In this paper, a deep learning based radio-genomics framework is presented that constructs the linkage from lung tumor images to genomics data and implements the generation process in the reverse direction, forming a bi-directional framework for mapping multi-source medical data. Imaging features are extracted by an autoencoder conditioned on genomics data, which yields features that are considerably more relevant than those of traditional radio-genomics methods. Finally, we use a generative adversarial network to transform genomics data into tumor images, which gives a cogent result to explain the linkage between them.

Conclusions: Our proposed framework provides a deep learning method for conducting radio-genomics research more functionally and intuitively.

Keywords: Radiomics; Radio-genomics; Deep Learning; Genomics Bio-markers; GSEA

Background

Currently, lung cancer accounts for a large share of cancer incidence and cancer-related deaths worldwide. Over 70% of lung cancer patients are diagnosed only after the onset of symptoms from advanced local or metastatic disease. Unfortunately, even when the disease can be localized, the survival rate is only about 50%, and fewer than 20% of these patients are diagnosed at a very early stage. In this context, precision medicine is gaining popularity for providing customized or personalized healthcare, and quantitative imaging has contributed to significant improvement of diagnostic procedures [1].

From another perspective, personalized medicine aims to tailor medical care to the individual at the molecular level. High-throughput molecular biology technologies promise biomarkers for disease diagnosis and prognosis prediction [2, 3, 4]. However, biopsies taken from heterogeneous lesions cannot completely represent anatomic and physiologic properties such as tumor size, anatomic location, and morphology. In contrast, image features extracted from these lesions construct a highly informative pathway to disease diagnosis, treatment planning, and clinical analysis.
Nevertheless, only a few studies have constructed radio-genomics frameworks that integrate genomic and image data to correlate these sources of information [5, 6, 7, 8].

Traditionally, radiology and image-guided interventional therapy have been used to support diagnosis and to provide anatomical information. However, invasive procedures wound the patient and require a long healing time. To overcome these shortcomings, radiomics extracts image features and sub-visual features from radiological images and applies state-of-the-art machine learning techniques, offering unique potential for faster and more accurate lung cancer screening. By improving the practice of qualitative and quantitative analysis, it is expected to improve prognosis prediction and the assessment of response to some treatments. In addition, radiogenomics studies have shown that some image features are even associated with genomic changes in tumor DNA [1]. These characteristics can identify specific changes in biological pathways that in turn affect patient management and health outcomes [11, 10].

We develop a radiogenomic framework centered on deep learning (DL) to map image features and genomic data, building on our previous work [12] on the correlation between genomics and images. A conditional autoencoder replaces the original correlation model, and additional genomic analyses are conducted in this paper. Moreover, inspired by latent-space projection models, we realize the transition between tumor images and genomic data.

Radiomics and Radiogenomics Approaches

Traditionally, radiomics and radiogenomics comprise four steps: image acquisition, lesion segmentation, feature extraction, and model validation. In radiomics and radiogenomics research, features are extracted from medical images qualitatively and quantitatively, including semantic (prognostic) and numerical (statistical) features. Statistical methods then correlate them with genomic data using gene-set enrichment analysis (GSEA). These works improve diagnostic and prognostic performance in various oncologic applications and ultimately promote the development of precision medicine. Coroller et al. built a correlation between image features and clinical data to predict distant metastasis in lung adenocarcinoma [13]. Abdollahi et al. used statistical features to predict sensorineural hearing loss with high accuracy [27]. The correlation between genomic data and images can indicate the linkage between gene changes and tumor variation. Aerts et al. provided a quantitative method to correlate gene expression profile data with low-level image features, which can support decision making in cancer treatment at low cost [14]. Additionally, Gevaert et al. proposed a protocol that maps semantic features to genomic data and obtained a model with an area under the receiver operating characteristic curve (AUC) of 65% or greater [29]. Beyond these, many studies have shown that linkage exists among multi-source medical data. However, the pre-defined features used in these studies, such as the sum of pixels within different percentage thresholds, are neither rich nor effective enough. This motivates mapping these data with DL methods, which produce multi-level, rich image features for correlation.
Deep Learning for Radiological Analysis

Radiomics and DL are focal points in the medical imaging field [13]. Radiomics applies image features to prognosis prediction, which is vital because of its clinical significance. DL has been used in medical image analysis tasks, for instance on CT, MRI, and PET, because of its high precision. It can provide informative results about diagnosis, prognostic data, tumor phenotypes, and gene-protein signatures in lung cancer treatment and prediction [13]. Some recent advances are summarized below.

As a basic model related to convolutional neural networks (CNNs), the autoencoder is a popular encoder model. In medical image segmentation tasks, it can provide an anatomical prior that removes the burden of supplying paired example segmentations [21, 19]. As an end-to-end method, the autoencoder has been applied to lesion detection and segmentation, pixel repair, and prognosis validation [21, 22, 39], which shows that the features extracted by an autoencoder retain most of the image information.

With a structure similar to the autoencoder, the U-Net architecture achieves excellent performance on different biomedical applications, including tumor segmentation and CT reconstruction [17, 18]. As the most popular method in the segmentation field, U-Net has demonstrated robustness and effectiveness, so we use it to obtain the lesion region.

Recently, GAN [20] frameworks have found many applications in medical imaging, including label-to-image translation, mask-to-image translation, and cross-modality synthesis [24, 25, 26]. These generative models can output the images we need according to the input data. The Conditional Generative Adversarial Net (CGAN) utilizes extra inputs to fuse more information [23]. It produces more precise results than a plain GAN because of the extra condition y in the generation process (a minimal training sketch of this objective is given at the end of this subsection):

\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}[\log D(x \mid y)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z \mid y)))]. (1)

It has been applied in many medical fields, including organ segmentation [34] and lesion generation [33]. These studies indicate that the CGAN architecture can generate the expected medical images.

Considering the distribution of images of different categories, Bao et al. proposed the CVAE-GAN framework [35], based on the VAE [36] rather than a plain GAN. This model projects images of different categories into separate regions of the latent space, which makes it easy to generate images of a specified category. Inspired by this work, we propose a similar method to visualize tumors using genomics data.
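To make the conditional objective in Eq. (1) concrete, the sketch below shows one way a CGAN discriminator and generator could be alternately updated on (image, condition) batches with binary cross-entropy. It is only an illustrative TensorFlow/Keras sketch, not code from this paper: `build`-style models `G` and `D` taking list inputs are assumed placeholders, and the generator uses the common non-saturating form of the loss.

```python
import tensorflow as tf

def cgan_step(G, D, g_opt, d_opt, x_real, y, z_dim=100):
    """One alternating CGAN update for Eq. (1). G([z, y]) -> image, D([x, y]) -> prob(real)."""
    bce = tf.keras.losses.BinaryCrossentropy()
    z = tf.random.normal(tf.stack([tf.shape(x_real)[0], z_dim]))

    # Discriminator: maximize log D(x|y) + log(1 - D(G(z|y))).
    with tf.GradientTape() as tape:
        x_fake = G([z, y], training=True)
        d_real = D([x_real, y], training=True)
        d_fake = D([x_fake, y], training=True)
        d_loss = bce(tf.ones_like(d_real), d_real) + bce(tf.zeros_like(d_fake), d_fake)
    d_opt.apply_gradients(zip(tape.gradient(d_loss, D.trainable_variables),
                              D.trainable_variables))

    # Generator: minimize log(1 - D(G(z|y))), written non-saturating as -log D(G(z|y)).
    with tf.GradientTape() as tape:
        g_loss = bce(tf.ones_like(d_fake),
                     D([G([z, y], training=True), y], training=True))
    g_opt.apply_gradients(zip(tape.gradient(g_loss, G.trainable_variables),
                              G.trainable_variables))
    return d_loss, g_loss
```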
Methods

In this section, we introduce each stage of the whole framework shown in Fig. 1. First, a U-Net based segmentation method is applied to extract the tumor region (TR) from the original CT image. Then, we use an autoencoder, conditioned on the gene data, to encode the images; image features are extracted from different levels of the encoder. A series of analysis experiments, including prognosis analysis and GSEA, is applied to these features, the prognostic data, and the genes to demonstrate the correlation among these multi-source data. Finally, a modified CVAE-GAN transforms genes into the corresponding TR and gives an intuitive result.

Tumor Detection and Segmentation

In the segmentation stage, we apply a U-Net model to obtain cropped tumor images: the original CT image is input to the U-Net, which is trained to fit the corresponding mask. The architecture of the U-Net is displayed in Fig. 2. We choose the Dice loss (Eq. 2) [16] to measure the overlap between the predicted mask (PM) and the ground truth (GT):

\mathrm{Dice} = \frac{2 \sum_i^N p_i t_i}{\sum_i^N p_i + \sum_i^N t_i + \epsilon}, (2)

where p_i is a pixel value in PM, t_i is the corresponding pixel value in GT, and N is the total number of pixels in an image. To avoid division by zero, a small constant \epsilon is added to the denominator.
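For illustration, the Dice measure of Eq. (2) can be turned into a trainable loss as in the following minimal TensorFlow/Keras sketch (the paper does not publish its training code); the smoothing constant `eps` is an arbitrary choice.

```python
import tensorflow as tf

def dice_loss(y_true, y_pred, eps=1e-6):
    """Soft Dice loss: 1 - Dice of Eq. (2), averaged over the batch."""
    # Flatten each mask into a vector of N pixel values.
    y_true = tf.reshape(tf.cast(y_true, tf.float32), [tf.shape(y_true)[0], -1])
    y_pred = tf.reshape(tf.cast(y_pred, tf.float32), [tf.shape(y_pred)[0], -1])
    intersection = tf.reduce_sum(y_true * y_pred, axis=1)                 # sum_i p_i t_i
    denom = tf.reduce_sum(y_true, axis=1) + tf.reduce_sum(y_pred, axis=1) + eps
    dice = (2.0 * intersection) / denom                                   # Eq. (2)
    return 1.0 - tf.reduce_mean(dice)
```

Minimizing 1 - Dice drives the predicted mask toward maximal overlap with the ground truth.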
Correlation between Gene and Image

To build the correlation between gene expression data and image features of the tumor region in CT image series, we establish a conditional autoencoder that maps the distinct source data while keeping the feature extraction ability and avoiding mismatches. In our previous work, we found two key problems: 1) the dimension of the gene data can greatly influence training and even lead to model collapse; 2) the basic autoencoder can extract multi-level image features, but these features are obtained without any restriction from domain knowledge, which usually leads to poor correlation between the genes and the image features. In this part, we therefore introduce a gene encoder for dimension reduction and a conditional autoencoder for knowledge-related feature extraction.

Gene Encoder: The huge dimension of the genomic data array, usually up to 10k, is an obvious obstacle. To handle this problem, similar to the operation used in [40] for encoding word vectors, we introduce an encoder for gene data dimension reduction as follows. Let the gene data be a matrix g_{mn}, where m is the number of subjects and n is the length of the gene array. We use a matrix w_{ns} (s is the target dimension and is much smaller than n) to multiply g_{mn} for encoding the gene array: y = gw. We then normalize each dimension of the encoded gene array y to [0, 1]. Details of the gene encoder are shown in Fig. 4. A minimal sketch of this projection follows.
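The sketch below is one NumPy reading of this gene encoder. It assumes the projection matrix w is either a fixed random matrix or a jointly learned weight (the paper does not state how w is obtained), and it normalizes each encoded dimension to [0, 1] with min-max scaling.

```python
import numpy as np

def encode_genes(g, s, w=None, seed=0):
    """Project gene data g (m subjects x n genes) to s dimensions, then
    min-max normalize each encoded dimension to [0, 1]: y = g @ w as in the text."""
    m, n = g.shape
    if w is None:
        # Assumption: a random projection stands in for the (possibly learned) w_ns.
        w = np.random.default_rng(seed).standard_normal((n, s)) / np.sqrt(n)
    y = g @ w                                           # shape (m, s)
    y_min, y_max = y.min(axis=0), y.max(axis=0)
    return (y - y_min) / (y_max - y_min + 1e-12), w

# Example usage (64 is an arbitrary target dimension):
# y, w = encode_genes(gene_matrix, s=64)
```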
Conditional Autoencoder: When the output of the encoder E is fitted to the encoded gene y, the weights of E tend to transform the image into y, but the image information gradually vanishes in this process. To avoid this, we add a decoder D after E to reconstruct the original image X, so that every layer of E keeps as much image information as possible. This process can be formulated as:

feat, \tilde{y} = E(X; \theta_e), (3)

\tilde{X} = D(feat; \theta_d), (4)

where \tilde{y} is the encoder's estimate of the encoded gene y, \theta_e is the weight of E, \theta_d is the weight of D, and feat is the intermediate output of E.

Thus, the conditional autoencoder can extract image features that have high correlation with the gene data while simultaneously keeping the image information. We apply this architecture to map genes and TRs. When training the conditional autoencoder, we use a perceptual loss [31] and an L_{1,2} loss to fit \tilde{X} to X:

L^{\phi,j}_{feat}(X, \tilde{X}) = \frac{1}{C_j H_j W_j} \| \phi_j(X) - \phi_j(\tilde{X}) \|_2^2, (5)

where \phi is the VGG16 CNN pre-trained on ImageNet, j denotes the j-th layer in \phi (here the layer named 'block5_conv3'), and C_j H_j W_j is the size of the feature map of the j-th layer. The L_{1,2} loss combines the L1 loss and the L2 loss:

L^{\phi}_{1,2}(X, \tilde{X}) = L_1(X, \tilde{X}) + L_2(X, \tilde{X}). (6)

We use the same L_{1,2} loss to fit \tilde{Y} to Y:

L^{y}_{1,2}(Y, \tilde{Y}) = L_1(Y, \tilde{Y}) + L_2(Y, \tilde{Y}). (7)

The final loss function L_{CA} is:

L_{CA}(X, Y) = L^{\phi,j}_{feat}(X, \tilde{X}) + L^{\phi}_{1,2}(X, \tilde{X}) + L^{y}_{1,2}(Y, \tilde{Y}). (8)

A minimal sketch of this combined loss follows.
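The following TensorFlow/Keras sketch shows one way Eqs. (5)-(8) could be written; it is not the authors' code. It assumes the grayscale 128x128 TRs are tiled to three channels before passing through the ImageNet-pretrained VGG16, and it omits VGG-specific input preprocessing for brevity.

```python
import tensorflow as tf
from tensorflow.keras.applications import VGG16

# Fixed VGG16 feature extractor phi_j with j = 'block5_conv3', as in Eq. (5).
_vgg = VGG16(weights="imagenet", include_top=False, input_shape=(128, 128, 3))
_phi = tf.keras.Model(_vgg.input, _vgg.get_layer("block5_conv3").output)
_phi.trainable = False

def _to_rgb(x):
    # Assumption: TRs are (batch, 128, 128, 1); tile to 3 channels for VGG16.
    return tf.tile(x, [1, 1, 1, 3])

def perceptual_loss(x, x_rec):
    """Eq. (5): squared L2 between VGG16 features, normalized by C_j * H_j * W_j."""
    fx, fr = _phi(_to_rgb(x)), _phi(_to_rgb(x_rec))
    chw = tf.cast(tf.reduce_prod(tf.shape(fx)[1:]), tf.float32)
    return tf.reduce_sum(tf.square(fx - fr)) / chw

def l12(a, b):
    """Eqs. (6)-(7): L1 loss plus L2 loss."""
    return tf.reduce_mean(tf.abs(a - b)) + tf.reduce_mean(tf.square(a - b))

def conditional_ae_loss(x, x_rec, y, y_pred):
    """Eq. (8): image reconstruction terms plus the gene-fitting term."""
    return perceptual_loss(x, x_rec) + l12(x, x_rec) + l12(y, y_pred)
```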
GCVAE-GAN for Pathological Visualization

Traditional radiogenomic methods can give a mathematical demonstration of the correlation between genes and images, but most of them provide the relationship only numerically, without any visual demonstration, and visual results are greatly demanded for prognosis inference.

The CVAE-GAN [35] encodes images and categories into a Gaussian distribution, then decodes them back to the original images under the guidance of a classifier and a discriminator. It can generate new images within a category by sampling noise, because many images belong to the same category. In radiogenomics, however, each tumor region CT image corresponds to only one gene array, which leads to the collapse of within-category generation. Focusing on this data imbalance problem, we introduce a genomic conditional variational autoencoder GAN (GCVAE-GAN).

Specifically, we adapt the CVAE-GAN to fit the tumor images as follows. 1) To address the data limitation, we let each subject form its own class, and each class contains several different slices of the tumor; our model can then generate different TRs by interpolating between genes instead of using different noises. 2) We apply the encoded gene data as the categories of the corresponding TRs. To magnify the distance between encoded gene vectors in gene space, we threshold them at 0.5 and replace the gene values with 0 and 1. Thus, the gene data act as multi-class labels that control the generation process of the model.

As shown in Fig. 4, our model is composed of four parts: 1) an encoder E, which projects TRs to the latent space z; 2) a generator G, which transforms the latent vector z into a TR; 3) a discriminator D, which judges whether an image is a real or a fake TR; and 4) a classifier C, which projects TRs to the gene space.

To train this model, six loss functions support the whole process [35]. To give D a strong ability to distinguish real TRs from synthesized TRs, D minimizes the loss function:

L_D = -\mathbb{E}_{x \sim P_r}[\log D(x)] - \mathbb{E}_{z \sim P_z}[\log(1 - D(G(z)))]. (9)

To address the instability of the gradient of G during training, we use the mean feature matching objective proposed in [35]:

L_{GD} = \frac{1}{2} \| \mathbb{E}_{x \sim P_r} f_D(x) - \mathbb{E}_{z \sim P_z} f_D(G(z)) \|_2^2, (10)

which acts as a perceptual loss, where f_D(x) denotes the features of the first fully connected (FC) layer of the discriminator. It accelerates convergence and improves the stability of the whole model.

Because the encoded gene data c only indicate whether a group of genes tends to be overexpressed or underexpressed, we do not fit them with a logarithmic loss function. Instead, we apply the mean square loss:

L_C = \frac{1}{2} \| \mathbb{E}_{x \sim P_r} C(x) - c \|_2^2. (11)

To make the TR generated by G belong to the corresponding category c, G needs to minimize

L_{GC} = \frac{1}{2} \| \mathbb{E}_{x \sim P_r} f_C(x) - \mathbb{E}_{z \sim P_z} f_C(G(z, c)) \|_2^2. (12)

The mean feature matching objective is used here again for the same purpose as in L_{GD}.

For the latent vector of each sample, we adopt the same strategy as [36]. The encoder outputs the mean \mu and covariance \epsilon of the latent vector, and a Kullback-Leibler (KL) loss keeps the latent distribution close to a Gaussian:

L_{KL} = \frac{1}{2} \sum \left( \mu^T \mu + \exp(\epsilon) - \epsilon - 1 \right). (13)

The latent vector z is then sampled as z = \mu + r \odot \exp(\epsilon), where r \sim N(0, I) is a random vector and \odot denotes element-wise multiplication [35].
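As a hedged TensorFlow sketch (not the authors' code), the KL term of Eq. (13), the reparameterized sampling of z, and the mean feature matching of Eqs. (10)/(12) could look like this, reading the text's \epsilon as a log-scale term:

```python
import tensorflow as tf

def kl_loss(mu, log_scale):
    """Eq. (13), treating epsilon as a log-scale (log sigma) output of the encoder."""
    return 0.5 * tf.reduce_sum(tf.square(mu) + tf.exp(log_scale) - log_scale - 1.0)

def sample_latent(mu, log_scale):
    """Reparameterization: z = mu + r * exp(log_scale), with r ~ N(0, I)."""
    r = tf.random.normal(tf.shape(mu))
    return mu + r * tf.exp(log_scale)

def mean_feature_matching(f_real, f_fake):
    """Eqs. (10)/(12): half squared L2 distance between batch-mean features."""
    diff = tf.reduce_mean(f_real, axis=0) - tf.reduce_mean(f_fake, axis=0)
    return 0.5 * tf.reduce_sum(tf.square(diff))
```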
After z is input to the generator G, we obtain the synthetic TR \tilde{x}. We combine the L2 reconstruction loss, the perceptual loss, and a pair-wise feature matching loss to minimize the difference between x and \tilde{x}:

L_G = \frac{1}{2} \left( \| x - \tilde{x} \|_2^2 + L^{\phi,j}_{feat}(x, \tilde{x}) + \| f_D(x) - f_D(\tilde{x}) \|_2^2 + \| f_C(x) - f_C(\tilde{x}) \|_2^2 \right), (14)

where L^{\phi,j}_{feat} is defined as in Equation 5, and f_D and f_C are the features of the first FC layer of the discriminator D and the classifier C, respectively.

The final objective function is expressed as:

L = L_D + L_C + \lambda_1 L_{KL} + \lambda_2 L_G + \lambda_3 L_{GD} + \lambda_4 L_{GC}, (15)

where the individual terms are defined in Equations 9-14. The model is trained in the same way as the original CVAE-GAN, with parameters \lambda_1 = 3, \lambda_2 = 1, \lambda_3 = 10^{-3}, and \lambda_4 = 10^{-3}.
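For illustration, Eqs. (14)-(15) could be assembled as below. This is a sketch under the assumption that `perceptual_loss`, `kl_loss`, and `mean_feature_matching` are the helpers sketched earlier in this section and that `f_D`/`f_C` return first-FC-layer features; none of this is code released with the paper, and, as in the original CVAE-GAN, each sub-network would in practice be updated only with the terms that involve it.

```python
import tensorflow as tf

LAMBDA_1, LAMBDA_2, LAMBDA_3, LAMBDA_4 = 3.0, 1.0, 1e-3, 1e-3   # values from the text

def generator_reconstruction_loss(x, x_rec, f_D, f_C):
    """Eq. (14): L2 + perceptual + pair-wise feature matching on D and C features."""
    return 0.5 * (tf.reduce_sum(tf.square(x - x_rec))
                  + perceptual_loss(x, x_rec)
                  + tf.reduce_sum(tf.square(f_D(x) - f_D(x_rec)))
                  + tf.reduce_sum(tf.square(f_C(x) - f_C(x_rec))))

def total_loss(L_D, L_C, L_KL, L_G, L_GD, L_GC):
    """Eq. (15): weighted sum of the six terms."""
    return (L_D + L_C + LAMBDA_1 * L_KL + LAMBDA_2 * L_G
            + LAMBDA_3 * L_GD + LAMBDA_4 * L_GC)
```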
Experiments and Results

In this section, we first conduct three experiments to separately demonstrate the outputs of the models at the segmentation, correlation, and generation stages. We then show that the whole framework is comparable with traditional radiogenomics research in terms of statistical analysis results. Finally, we show the generation results under the influence of gene changes, demonstrating that variance in gene expression affects the tumor's appearance in radiology.

Datasets

NSCLC Radiogenomics: The Non-Small Cell Lung Cancer (NSCLC) collection is from The Cancer Imaging Archive [30], where cancerous regions were relabeled as masks by doctors. In this paper, we use this NSCLC collection to construct our framework and evaluate the linkage among multi-source data. The dataset contains 211 subjects from an NSCLC cohort. Each subject includes a CT series containing hundreds of DICOM images, together with annotations of each tumor. Because these annotations are only tumor coordinates, we invited doctors to relabel the tumors as masks.

ROIs Extraction from CT Series

We separated the NSCLC dataset into training and testing sets. The training set has 50 subjects with up to 15000 CT images and contains only image data for TR segmentation [14]. The testing set has 161 subjects, each composed of a CT series and its gene expression data; these 161 subjects are employed later for the gene and image correlation. We choose U-Net with the Dice loss as the detector to locate the tumor. We train it for 500 epochs using the Adam optimizer [32] with a learning rate of 0.001, and then apply it to the testing set to detect the tumor position. Each TR is cropped at a size of 128x128; the cropped TRs and their corresponding gene data then form the dataset of the correlation stage (a minimal cropping sketch is given at the end of this subsection).

Fig. 5 shows the predicted masks of the U-Net. The pixels belonging to the tumor are colored in blue, which also shows the tumor location. These masks cover the main part of the tumor except for some irregular edges where the tumor boundary is less obvious, which is precise enough for the detection task. The size and location of the tumor are obtained at the same time. When we crop the CT images, some surrounding information, such as the bronchial tube and lung edge, which is crucial to the mapping stage, is included to provide more useful information.
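As an illustration of this cropping step, a fixed 128x128 TR could be extracted around the predicted mask as in the NumPy sketch below. The exact cropping rule used by the authors is not published; this version simply centers the crop on the mask centroid so that some surrounding context (lung edge, bronchial tube) is retained.

```python
import numpy as np

def crop_tr(ct_slice, mask, size=128):
    """Crop a size x size tumor region (TR) centered on the predicted mask."""
    ys, xs = np.nonzero(mask)
    if len(ys) == 0:
        raise ValueError("empty mask: no tumor pixels predicted")
    cy, cx = int(ys.mean()), int(xs.mean())        # mask centroid
    half = size // 2
    h, w = ct_slice.shape
    # Clamp the window so it stays inside the slice while keeping its size.
    y0 = min(max(cy - half, 0), h - size)
    x0 = min(max(cx - half, 0), w - size)
    return ct_slice[y0:y0 + size, x0:x0 + size]
```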
Correlation between TRs and Genomic Data

In this part, we preprocess the gene data to remove subjects with unavailable gene values, leaving a dataset of 113 subjects, each with 6 TRs of one tumor. We randomly sample 90 subjects as the training set and 23 subjects as the testing set (a test-to-train ratio of approximately 1:4). The gene data are encoded using the gene encoder described above. The conditional autoencoder can then be trained as a multi-task network. Data augmentation, including horizontal/vertical flips and rotation, is employed during training. We train with the Adam optimizer [32] at a learning rate of 0.001 until the mean square error on the testing set has decreased to 0.002.

The images output by the conditional autoencoder are shown in Fig. 6. The similarity between the autoencoder's outputs and the original TRs indicates that the model has extracted the image features sufficiently and is not affected by geometric transformation, while keeping the correlation between genes and images.

To demonstrate the effectiveness of the image features extracted from TRs, we apply hierarchical clustering to the gene data and the image features. We use ResNet50 as the encoder of the autoencoder and extract features of different levels from the last layer of each residual block. We then apply Locally Linear Embedding (LLE) to reduce the feature dimensions, obtaining four groups of features per subject with 64, 256, 512, and 1024 features, respectively. We cluster the image features across samples to demonstrate the correlation between features and disease (Fig. 8). As in [14], we also perform a Chi-squared test to verify significant association of the DL features with the prognostic data (Table 1).

To assess the contribution of the gene condition, we perform an ablation experiment: we train a basic autoencoder without the condition using the same procedure and extract features from the same positions of the encoder. We then calculate the correlation matrices between the features (with and without the condition) and the gene data. The box plots in Fig. 7 show the difference in similarity with and without the condition at each level.

To further establish the correlation between genes and images, we perform gene-set enrichment analysis using the GSEA software [37]. First, the Pearson correlation coefficient matrix c is calculated between the features f and the genes g. At each level, 10 features are selected according to s, the sum of the absolute correlation coefficients of each feature. The genes are then sorted according to s to form pre-ranked files l, which are the input of the GSEA pre-ranked analysis. GSEA determines whether a gene set with known function appears at the top or bottom of l, indicating that the features have a positive or negative association with that gene set. Finally, we obtain the false discovery rate (FDR) and normalized enrichment score (NES) of each identified gene set. The whole analysis process is summarized in Algorithm 1, and the correlation result is shown in Fig. 9; a minimal sketch of the pre-ranked file preparation is given at the end of this subsection.

In this part, we construct a linkage between DL features and genomic data from the image to the gene and prognosis domains. The results demonstrate that this linkage is as effective as that obtained with traditional radiology features; in addition, our linkage gives multi-level results, unlike handcrafted features. The DL features behave like traditional radiology features at this stage, and the results of the GSEA and prognosis validation are convincing.
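The pre-ranked list preparation described above could be sketched as follows. This is an illustrative NumPy/pandas reading of Algorithm 1 (features kept by summed absolute correlation, genes then ranked by their correlation with the kept features), not the authors' exact script; the `.rnk` output is the standard tab-separated input format of GSEA pre-ranked analysis.

```python
import numpy as np
import pandas as pd

def make_preranked_list(features, genes, gene_ids, top_k=10, out_path="level.rnk"):
    """features: (n_samples, n_features); genes: (n_samples, n_genes)."""
    # Pearson correlation between every feature and every gene.
    f = (features - features.mean(0)) / (features.std(0) + 1e-12)
    g = (genes - genes.mean(0)) / (genes.std(0) + 1e-12)
    corr = f.T @ g / len(features)                      # (n_features, n_genes)

    # Keep the top_k features with the largest summed |correlation| (Algorithm 1).
    keep = np.argsort(np.abs(corr).sum(axis=1))[-top_k:]

    # Rank genes by their summed correlation with the selected features.
    gene_score = corr[keep].sum(axis=0)
    ranked = (pd.DataFrame({"gene": gene_ids, "score": gene_score})
                .sort_values("score", ascending=False))
    ranked.to_csv(out_path, sep="\t", header=False, index=False)   # GSEA .rnk file
    return ranked
```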
TRs Generation under Gene Guidance

In this part, the gene data are transformed into binary arrays that we call controllers; the task is to visualize a TR from its corresponding controller. Each controller is derived from a weighted sum of all gene expression values (the encoded gene), thresholded to 0 or 1. We train this model for 4k epochs and obtain cogent results.

The encoder of the model first encodes the genes and TRs into a latent space that is very close to Gaussian. To demonstrate the continuity of the model when generating TRs, we interpolate between two controllers to generate new TRs whose state falls between the two corresponding TRs. Unlike linear interpolation, we switch the controller entries one by one, which we call "controller interpolation", because a controller value only indicates whether a gene group is overexpressed or underexpressed (0 or 1); a minimal sketch of this procedure is given later in this subsection. The TRs generated by this method are shown in Fig. 10; the generated images change gradually and continuously as the controllers are switched.

The generator then concatenates the gene array and a noise vector z and raises its dimension for the generation process. We use t-distributed stochastic neighbor embedding (TSNE) to visualize these arrays in the latent space; part of the result is shown in Fig. 11. Tumors with similar characteristics are projected close together in the latent space: tumors in Group 1 are close to the right edge of the right lung, and tumors in Group 2 are close to the front edge of the lung. Given this result, we look for the common ground of these similar tumors during the generation process and study how the controllers affect tumor generation.

We select several groups of tumors, mainly by location and size. The controllers that are activated (value = 1) in every sample of a group are extracted as the characteristic controllers. To study the location controllers, a template tumor is generated as a control using a random noise z and an all-zero array, which indicates that all controllers are frozen. We then activate the location controllers (front, behind, left, and right side) and use them to generate tumors (Fig. 12, row a). An enhancement can be seen in the corresponding location, which means the tumor is closer to the thorax. To further demonstrate the effect of the location controllers, we freeze them while generating tumors near the corresponding lung edge (Fig. 13 (a, b, c, d)). When the corresponding controllers are frozen, the outline of the lung is partly lost in the corresponding position, which means the tumor is farther from the thorax.

We apply a similar method to study the size controllers. They are activated one by one while generating tumors (Fig. 12, row b); the template tumor becomes larger and larger as the size controllers are activated gradually. To further demonstrate the effect of the size controllers, we freeze them while generating big tumors and activate them while generating small tumors. The results, shown in Fig. 13 (e, f), indicate a strong correlation between these controllers and tumor size.
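The "controller interpolation" described above could be implemented as in the short sketch below. It assumes a trained generator `G` that takes a binary controller and a noise vector (a hypothetical interface, since the paper's code is not published) and flips the differing controller entries one at a time.

```python
import numpy as np

def controller_interpolation(G, c_start, c_end, z):
    """Generate a sequence of TRs while switching controller entries one at a time."""
    c_start, c_end = np.asarray(c_start, float), np.asarray(c_end, float)
    c = c_start.copy()
    frames = [G(c, z)]                            # TR for the start controller
    for idx in np.flatnonzero(c_start != c_end):  # entries where the two controllers differ
        c[idx] = c_end[idx]                       # flip one entry (0 <-> 1)
        frames.append(G(c, z))                    # intermediate TR
    return frames                                 # last frame uses the end controller
```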
In summary, we link the genomic data and the corresponding TRs in this part. From the direction of raw gene data to TR images, we build a GCVAE-GAN model for this transformation. Considering the generation ability of the model and the limitations of the data, the interpolation experiment shows that the model is continuous (Fig. 10). In addition, we change the gene expression values manually in a reasonable way. In this way, we validate the proposed framework in the direction from genomic data to TRs: interpolation is used to generate new TRs, and gene expression values are changed to observe the variation of the TR. The results indicate a strong linkage from genome data to CT images.

Discussion

Aerts et al. [14] showed that genomic data and computational image features are strongly correlated with each other. We also demonstrate this result with the proposed framework, which provides a new noninvasive way of investigating phenotypic information, as traditional radiogenomics does.

After bi-directional validation (gene to image and image to gene), we find that some gene sets have high NES when correlated with the DL features; most of these gene sets are associated with cancer directly, or indirectly through the regulation of cell activity (Table 2). We cluster the samples into several distinct groups using the deep features and perform a statistical analysis between these cluster results and the prognostic data (such as T-stage). The p value of the Chi-squared test between them is low (<= 0.05); see Table 1. These results demonstrate that our proposed framework can fulfill the correlation task using only multi-level deep features, especially for the correlation with tumor types and stages, in line with traditional methods [14].

Previous work [38] demonstrated that tumor location has a strong correlation with specific metagenes. In this context, we extend this anatomical information with parameters such as tumor location relative to the thorax and tumor size. In this way, we identify genes, acting as controllers, that can guide tumor growth. Different controllers enhance the pixel values of the corresponding part, which implies relationships between genes and tumors. Because each controller is composed of all gene values with different weights, an activated controller indicates high expression values of the genes with high weights.

We change the location controllers and find that the tumor moves away from the thorax when the corresponding controllers are frozen (Fig. 13 (a, b, c, d)). Tumor size, in turn, is important prognostic information and can reflect the tumor state and the response to chemoradiotherapy. We investigated the tumor-size related gene sets and found that the size controllers have strong correlations with tumor growth. In our framework, we attempt to transform the size of the tumor anatomy through the related genes, and the results indicate that the size controllers can enlarge and shrink the tumors when activated or frozen, without disturbing other parts of the tumor region (Fig. 13 (e, f)).

Advantages: Radiogenomics research with handcrafted features is obscure and has no anatomical significance for clinical doctors. Furthermore, most such features are redundant and strongly depend on domain knowledge. In contrast, our proposed framework can produce as many features as needed, conditioned by genes. Highly gene-associated features are easy to filter through the similarity matrix between genes and features, and it is then easy to build an effective correlation between genomic and radiological information with these hierarchical features.
Benefiting from this integrated dataset, which has all the data needed for radiogenomic research, including original CT images, genomic data, and prognostic data, we can identify and study the linkage between images and genes. This framework provides a method to find the specific gene sets that are highly correlated with specific DL features, and the genes that are highly correlated with a desired trait of the TRs.

Limitations: Because of the limited dataset scale and imbalance issues, our visualization results contain some abnormal instances caused by the lung edge and other organs such as the heart; the model cannot ignore these obvious traits to attend to the others. With a larger dataset classified by tumor location, there would be more controllers governing other traits of the tumor. In our work, however, these controllers are not completely independent because of coupling in the tumor generation: different groups of controllers influence each other and can change other traits of the tumor.

As the proposed framework has shown its capability in mapping genomic data and tumor images, we will focus on enlarging the dataset and combining datasets from different sources and data categories.

Conclusion

In this paper, we proposed a radiogenomics analysis framework based on DL, through which most radiogenomic analysis tasks can be accomplished. Moreover, based on the prognosis analysis, we could establish a preliminary visual representation of cancer growth in its later stages. We validated the framework bi-directionally, from gene to image and from image to gene. From image to gene, the statistical results (Table 1) show that the proposed framework is consistent with traditional radiogenomic analysis, and differentially expressed genes can be screened out through the multi-level features of the networks, which improves on traditional radiogenomic methods. From gene to image, the results validate that gene expression differences can influence tumor growth, which in turn shows the association between the two domains. Our future work will mainly focus on the following aspects: 1) finding a more effective method to encode genes so as to preserve as much genomic information as possible; 2) extending the research beyond lung cancer to other cancer types using our framework; 3) constructing correlations between multi-modal images (including CT, MRI, and pathological images) and genomic data; 4) developing the generation method with larger and more homogeneous datasets.

Acknowledgements
...

Funding
This research is supported by the National Natural Science Foundation of China (Grant Nos. 61702026 and 62031003), the Pyramid Talent Training Project of Beijing University of Civil Engineering and Architecture (Grant No. JDYC20200318), and the National Key Research and Development Program of China (Grant No. 2020YFF0305504).

Authors' contributions
Dong Sui made substantial contributions to the conception, design, acquisition of data, and analysis and interpretation of data, and agrees to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.
Julian Baptiste and Lei Zhang were involved in drafting the manuscript and revising it critically for important intellectual content. Xiaoxuan Ma and Maozu Guo have given final approval of the version to be published.

Availability of data and materials
The image and genomics datasets are downloaded from The Cancer Imaging Archive, a public resource for the development of cancer detection and analysis methods. We use the NSCLC-Radiomics-Genomics dataset, which contains images from 89 non-small cell lung cancer (NSCLC) patients treated with surgery; pretreatment CT scans, gene expression data, and clinical data are available for these patients. This dataset corresponds to the Lung3 dataset of the study published in Nature Communications. URL: https://www.cancerimagingarchive.net/

Ethics approval and consent to participate
This manuscript has no ethics problems.

Competing interests
No.

Consent for publication
Yes.

Author details
1 School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing, China. 2 Diagnostic Radiology and Nuclear Medicine, University of Maryland, Baltimore, MD, USA.
References
1. Thawani R, McLane M, Beig N, et al. Radiomics and radiogenomics in lung cancer: A review for the clinician[J]. Lung Cancer (Amsterdam, Netherlands), 2018, 115: 34.
2. Sotiriou C. Molecular biology in oncology and its influence on clinical practice: gene expression profiling [abstr]. Ann Oncol 2009;20(Suppl 4):v10.
3. Pao W, Kris MG, Iafrate AJ, et al. Integration of molecular profiling into the lung cancer clinic. Clin Cancer Res 2009;15(17):5317-5322.
4. Gevaert O, De Moor B. Prediction of cancer outcome using DNA microarray technology: past, present and future. Expert Opin Med Diagn 2009;3(2):157-165.
5. Segal E, Sirlin CB, Ooi C, et al. Decoding global gene expression programs in liver cancer by noninvasive imaging. Nat Biotechnol 2007;25(6):675-680.
6. Diehn M, Nardini C, Wang DS, et al. Identification of noninvasive imaging surrogates for brain tumor gene-expression modules. Proc Natl Acad Sci U S A 2008;105(13):5213-5218.
7. Kuo MD, Gollub J, Sirlin CB, Ooi C, Chen X. Radiogenomic analysis to identify imaging phenotypes associated with drug response gene expression programs in hepatocellular carcinoma. J Vasc Interv Radiol 2007;18(7):821-831.
8. Rutman AM, Kuo MD. Radiogenomics: creating a link between molecular diagnostics and diagnostic imaging. Eur J Radiol 2009;70(2):232-241.
9. Fitzmaurice C, Allen C, Barber RM, et al. Global, regional, and national cancer incidence, mortality, years of life lost, years lived with disability, and disability-adjusted life-years for 32 cancer groups, 1990 to 2015: a systematic analysis for the Global Burden of Disease study[J]. JAMA Oncology, 2017, 3(4): 524-548.
10. Zhuo Y, Feng M, Yang S, et al. Radiomics nomograms of tumors and peritumoral regions for the preoperative prediction of spread through air spaces in lung adenocarcinoma[J]. Translational Oncology, 2020, 13(10): 100820.
11. Giger ML, Karssemeijer N, Schnabel JA. Breast image analysis for risk assessment, detection, diagnosis, and treatment of cancer[J]. Annual Review of Biomedical Engineering, 2013, 15(1): 327-357.
12. Li S, Han H, Sui D, et al. A novel radiogenomics framework for genomic and image feature correlation using deep learning[C]. 2018 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). IEEE, 2018: 899-906.
13. Coroller TP, Grossmann P, Hou Y, et al. CT-based radiomic signature predicts distant metastasis in lung adenocarcinoma[J]. Radiotherapy & Oncology, 2015, 114(3): 345-350.
14. Aerts HJWL, Velazquez ER, Leijenaar RTH, et al. Decoding tumour phenotype by noninvasive imaging using a quantitative radiomics approach[J]. Nature Communications, 2014, 5: 4006.
15. Milletari F, Navab N, Ahmadi SA. V-Net: fully convolutional neural networks for volumetric medical image segmentation[J]. 2016: 565-571.
16. Ronneberger O, Fischer P, Brox T. U-Net: convolutional networks for biomedical image segmentation[C]. International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, Cham, 2015: 234-241.
17. Dong H, Yang G, Liu F, et al. Automatic brain tumor detection and segmentation using U-Net based fully convolutional networks[J]. 2017: 506-517.
18. Han YS, Yoo J, Ye JC. Deep residual learning for compressed sensing CT reconstruction via persistent homology analysis[J]. arXiv preprint arXiv:1611.06391, 2016.
19. Dalca AV, Guttag J, Sabuncu MR. Anatomical priors in convolutional networks for unsupervised biomedical segmentation[C]. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018: 9290-9299.
20. Goodfellow IJ, Pouget-Abadie J, Mirza M, et al. Generative adversarial networks[J]. Advances in Neural Information Processing Systems, 2014, 3: 2672-2680.
21. Xu J, Xiang L, Liu Q, et al. Stacked sparse autoencoder (SSAE) for nuclei detection on breast cancer histopathology images[J]. IEEE Trans Med Imaging, 2016, 35(1): 119-130.
22. Gondara L. Medical image denoising using convolutional denoising autoencoders[J]. arXiv preprint arXiv:1608.04667, 2016: 241-246.
23. Mirza M, Osindero S. Conditional generative adversarial nets[J]. Computer Science, 2014: 2672-2680.
24. Nie D, Trullo R, Lian J, et al. Medical image synthesis with context-aware generative adversarial networks[C]. International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, Cham, 2017: 417-425.
25. Schlegl T, Seeböck P, Waldstein SM, et al. Unsupervised anomaly detection with generative adversarial networks to guide marker discovery[J]. 2017: 146-157.
26. Ben-Cohen A, Klang E, Raskin SP, et al. Virtual PET images from CT data using deep convolutional networks: initial results[J]. 2017: 49-57.
27. Abdollahi H, Mostafaei S, Cheraghi S, et al. Cochlea CT radiomics predicts chemoradiotherapy induced sensorineural hearing loss in head and neck cancer patients: a machine learning and multi-variable modelling study[J]. Physica Medica, 2018, 45: 192-197.
28. Wang K, Lu X, Zhou H, et al. Deep learning radiomics of shear wave elastography significantly improved diagnostic performance for assessing liver fibrosis in chronic hepatitis B: a prospective multicentre study[J]. Gut, 2018: gutjnl-2018-316204.
29. Gevaert O, Xu J, Hoang CD, et al. Non-small cell lung cancer: identifying prognostic imaging biomarkers by leveraging public gene expression microarray data - methods and preliminary results[J]. Radiology, 2012, 264(2): 387-396.
30. Bakr S, Gevaert O, Echegaray S, et al. Data for NSCLC Radiogenomics Collection. The Cancer Imaging Archive, 2017.
31. Johnson J, Alahi A, Fei-Fei L. Perceptual losses for real-time style transfer and super-resolution[C]. European Conference on Computer Vision. Springer, Cham, 2016: 694-711.
32. Kingma DP, Ba J. Adam: a method for stochastic optimization[J]. arXiv preprint arXiv:1412.6980, 2014.
33. Jin D, Xu Z, Tang Y, et al. CT-realistic lung nodule simulation from 3D conditional generative adversarial networks for robust lung segmentation[J]. 2018.
34. Dong S, Luo G, Wang K, et al. VoxelAtlasGAN: 3D left ventricle segmentation on echocardiography with atlas guided generation and voxel-to-voxel discrimination[J]. arXiv preprint arXiv:1806.03619, 2018.
35. Bao J, Chen D, Wen F, et al. CVAE-GAN: fine-grained image generation through asymmetric training[J]. CoRR, abs/1703.10155, 2017.
36. Larsen ABL, Sønderby SK, Larochelle H, et al. Autoencoding beyond pixels using a learned similarity metric[J]. arXiv preprint arXiv:1512.09300, 2015.
37. Subramanian A, Tamayo P, Mootha VK, et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles[J]. Proceedings of the National Academy of Sciences, 2005, 102(43): 15545-15550.
38. Zhou M, Leung A, Echegaray S, et al. Non-small cell lung cancer radiogenomics map identifies relationships between molecular and imaging phenotypes with prognostic implications[J]. Radiology, 2017, 286(1): 161845.
39. Lao J, Chen Y, Li ZC, et al. A deep learning-based radiomics model for prediction of survival in glioblastoma multiforme[J]. Scientific Reports, 2017, 7(1): 10353.
40. Xu T, Zhang P, Huang Q, et al. AttnGAN: fine-grained text to image generation with attentional generative adversarial networks[C]. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018: 1316-1324.

Figures

Figure 1: Work flow of our proposed framework. Stage 1: tumor segmentation. Stage 2: gene and image mapping. Stage 3: synthetic tumor region generation.

Figure 2: U-Net architecture for tumor segmentation.

Figure 3: Illustration of our conditional autoencoder architecture.

Figure 4: Illustration of the GCVAE-GAN architecture. For clarity the model is shown in a 3D view. The upright labels are the names of the different modules and the italicized labels are operations between modules. The direction of the arrows indicates the flow of data. Each sub-model is shown on a different axis with its name on the left (E, G, D, C), and all six losses used in this model are connected to the outputs of the model and their expected values.

Figure 5: Images in row a are original CT images; tumors are labeled with red bounding boxes and the blue masks are the segmentation results. Images in row b are the cropped TRs with size 128x128.

Figure 6: The tumor images output by the decoder on the test set. Images in row a are different original TRs, and images in row b are the corresponding synthetic TRs.

Figure 7: The difference in similarity with and without the condition at each level. The similarity with the condition is clearly higher than without it, except at level 1.

Figure 8: Hierarchical clustering of the samples based on the multi-level DL features.

Figure 9: The correlation matrix between the top ten DL features (level 4) related to all genes at each level and the gene sets with high NES (NES >= 0.25).

Figure 10: Tumor generation process while changing the controllers gradually. Different rows are different samples. The first and last images in each row are real TRs, and the images between them are generated by "controller interpolation".

Figure 11: Part of the samples' distribution in the TSNE space. The numbers are sample indices. The samples in Group 1 are all close to the right edge of the lung and those in Group 2 are close to the front of the lung.
Figure 12: Tumors generated using different location controllers (row a) and size controllers (row b). In row a, the first image (a-1) is the template; the others are tumors generated with the front, behind, left, and right controllers activated, respectively. In row b, the first image (b-1) is also the template; the others are generated with the size controllers activated gradually.

Figure 13: TRs generated by the GCVAE-GAN with controllers activated or frozen. The images in groups (a) to (d) are generated with the corresponding controllers (front, behind, left, right) frozen. The images in groups (e) and (f) are generated with the size controllers activated and frozen, respectively. In each group, the images in the first row are real TRs, the images in the second row are generated from the corresponding genes acting as controllers, and the images in the third row are generated from the switched controllers (frozen in (e) and activated in the others).

Tables

Table 1: P values of the Chi-squared test between features and prognosis.

            T-stage     N-stage     M-stage     Histology
Level 1     0.0775848   0.0073137   0.8742585   0.4826525
Level 2     0.0000462   0.0686717   0.2358527   0.0012095
Level 3     0.0013855   0.1856763   0.1745525   0.0097451
Level 4     0.0012979   0.0603604   0.0541809   0.0258452

Table 2: Some high-NES gene sets correlated with the multi-level DL features. X means that the gene set has a high NES associated with the corresponding DL features; × means the opposite. All of these gene sets have a strong association with cancer, directly or indirectly.

Gene Set                               Level1   Level2   Level3   Level4
ALLOGRAFT REJECTION                    X        X        ×        X
MITOTIC SPINDLE                        X        X        X        ×
DNA REPAIR                             X        X        ×        X
KRAS SIGNALING UP                      X        ×        X        ×
KRAS SIGNALING DOWN                    ×        X        X        ×
MYC TARGETS V1                         X        X        ×        X
MYC TARGETS V2                         X        X        ×        X
COMPLEMENT                             X        X        ×        X
TNFA SIGNALING VIA NFKB                X        X        ×        X
IL6 JAK STAT3 SIGNALING                X        ×        X        ×
E2F TARGETS                            X        X        ×        X
TGF BETA SIGNALING                     X        X        X        ×
IL2 STAT5 SIGNALING                    X        ×        ×        X
EPITHELIAL MESENCHYMAL TRANSITION      X        X        ×        X
P53 PATHWAY                            ×        X        ×        X

Algorithms

Algorithm 1 Correlation Analysis
Input: the gene matrix g; the feature matrix f
Output: the cluster heatmap between features and patients, h; the p value of the Chi-squared test on predicted clusters and prognostic data, p; the GSEA analysis result of each level of features and genes, r
For i = 1 to 4 do
 1: // Preprocess the features
 2: f_i ← transpose the last dimension of f_i to the second and flatten the last two dimensions
 3: f_i ← apply LLE on f_i to reduce the last dimension to 1
 4: // Correlation between patients and features
 5: h_i ← apply hierarchical clustering on f_i
 6: p_i ← Chi-squared test between predicted clusters and prognosis
 7: // Correlation between g_i and f_i
 8: // Now the shape of f_i is (n_samples, n_features) and the shape of g_i is (n_genes, n_samples)
 9: c_i ← g_i f_i
10: s_i ← sum(c_i) over the gene dimension
11: g_i ← sort(g_i) according to s_i
12: sg_i ← select the top ten genes in g_i
13: sg_i ← sort(sg_i) according to c_i
14: l_i ← combine gene identifiers with sg_i
15: r_i ← input l_i to GSEA
return h, p, r
