Gene Expression Network Analysis Provides Potential Targets Against SARS-Cov-2

bioRxiv preprint doi: https://doi.org/10.1101/2020.07.06.182634; this version posted July 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. 1 Gene Expression Network Analysis Provides Potential Targets Against SARS-CoV-2 2 Ana I. Hernández Cordero1*, Xuan Li1, Chen Xi Yang1, Stephen Milne1,2,3, Yohan Bossé4, Philippe 3 Joubert4, Wim Timens5, Maarten van den Berge6, David Nickle7, Ke Hao8, Don D. Sin1,2 4 5 1 Centre for Heart Lung Innovation, University of British Columbia, Vancouver, BC, Canada 6 2 Division of Respiratory Medicine, Faculty of Medicine, University of British Columbia, Vancouver, 7 BC, Canada 8 3 Faculty of Medicine and Health, University of Sydney, Sydney, New South Wales, Australia 9 4 Institut universitaire de cardiologie et de pneumologie de Québec, Université Laval, Québec City, 10 QC, Canada 11 5 University of Groningen, University Medical Centre Groningen, Department of Pathology and 12 Medical Biology, Groningen, The Netherlands 13 6 University of Groningen, University Medical Center Groningen, Department of Pulmonary Diseases, 14 Groningen, The Netherlands 15 7 Merck Research Laboratories, Genetics and Pharmacogenomics, Boston, Massachusetts, United 16 States of America 17 8 Department of Genetics and Genomic Sciences and Icahn Institute for Data Science and Genomic 18 Technology, Icahn School of Medicine at Mount Sinai, New York, NY, United States of America 19 20 * Corresponding author 21 E-mail: [email protected] 22 23 24 25 26 27 28 29 30 31 32 1 bioRxiv preprint doi: https://doi.org/10.1101/2020.07.06.182634; this version posted July 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. 33 ABSTRACT 34 BACKGROUND: Cell entry of SARS-CoV-2, the novel coronavirus causing COVID-19, is facilitated 35 by host cell angiotensin-converting enzyme 2 (ACE2) and transmembrane serine protease 2 36 (TMPRSS2). We aimed to identify and characterize genes that are co-expressed with ACE2 and 37 TMPRSS2, and to further explore their biological functions and potential as druggable targets. 38 METHODS: Using the gene expression profiles of 1,038 lung tissue samples, we performed a 39 weighted gene correlation network analysis (WGCNA) to identify modules of co-expressed genes. We 40 explored the biology of co-expressed genes using bioinformatics databases, and identified known 41 drug-gene interactions. RESULTS: ACE2 was in a module of 681 co-expressed genes; 12 genes with 42 moderate-high correlation with ACE2 (r>0.3, FDR<0.05) had known interactions with existing drug 43 compounds. TMPRSS2 was in a module of 1,086 co-expressed genes; 15 of these genes were enriched 44 in the gene ontology biologic process ‘Entry into host cell’, and 53 TMPRSS2-correlated genes had 45 known interactions with drug compounds. CONCLUSION: Dozens of genes are co-expressed with 46 ACE2 and TMPRSS2, many of which have plausible links to COVID-19 pathophysiology. Many of 47 the co-expressed genes are potentially targetable with existing drugs, which may help to fast-track the 48 development of COVID-19 therapeutics. 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 2 bioRxiv preprint doi: https://doi.org/10.1101/2020.07.06.182634; this version posted July 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. 67 INTRODUCTION 68 69 Since the outbreak of severe acute respiratory syndrome coronavirus (SARS coronavirus) in 70 2002 and 2003, coronaviruses have been considered to be highly pathogenic for humans 1–3. A new 71 coronavirus (SARS-CoV-2) that spreads through respiratory droplets emerged in December of 2019 72 in Wuhan, Hubei province, China 4. Its rapid spread across China and the rest of the world ultimately 73 resulted in the global COVID-19 pandemic, which was officially declared in March 2020 by the World 74 Health Organization (WHO). 75 76 It has since been shown that SARS-CoV-2 shares a common host cell entry mechanism with 77 the 2002-2003 SARS coronavirus 5,6. The viral genome encodes for multiple viral components, 78 including the spike protein (S), which facilitates viral entry into the host cell 7,8. The S protein 79 associates with the angiotensin-converting enzyme 2 (ACE2) to mediate infection of the target cells9. 80 ACE2 is a type 1 transmembrane metallocarboxypeptidase that is an important negative regulator of 81 the renin–angiotensin system (RAS). Once SARS-CoV-2 gains entry into epithelial cells, surface 82 expression of ACE2 is rapidly downregulated in the infected cell 10, which leads to an imbalance in 83 angiotensin II-mediated signalling and predisposes the host to acute lung injury. SARS-CoV-2 also 84 employs transmembrane serine protease 2 (TMPRSS2) to proteolytically activate the S protein, which 85 is essential for viral entry into target cells 11. 86 87 In view of the rapid spread and the mortality of COVID-19 worldwide, there is an urgent need 88 to find effective treatments against this infection, especially for severe cases. However, the 89 development of a vaccine and novel treatments may take months to years, requiring billions of dollars 90 in investment and with no certainty of their ultimate success. Bioinformatic approaches, however, can 91 rapidly identify relevant gene-drug interactions that may contribute to the understanding of the 3 bioRxiv preprint doi: https://doi.org/10.1101/2020.07.06.182634; this version posted July 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. 92 mechanisms of viral infection and reduce the time to finding potential drug targets and existing drugs 93 that could be repurposed for this indication. Here, we performed a gene expression network analysis 94 on data generated in the Lung eQTL Consortium Cohort to investigate the mechanisms of ACE2 and 95 TMPRSS2 expression in lung tissue. We identified potential targets to be explored as possible 96 treatments for COVID-19. We hypothesized that the mechanisms associated with ACE2 and TMPRSS2 97 likely encompass protein coding genes involved in the pathogenesis of COVID-19. 98 99 RESULTS 100 The Lung eQTL Consortium cohort used in this gene network analysis is described in Table 101 1. Supplementary Fig. S1 shows the expression levels of ACE2 and TMPRSS2 in the three centres 102 that are part of the Lung eQTL Consortium (see methods); ACE2 had low to moderate expression 103 levels in lung tissue; whereas TMPRSS2 was highly expressed. Based on the study cohort lung 104 expression profile, we determined that ACE2 and TMPRSS2 were contained in distinct modules. The 105 module containing ACE2 (ACE2 module) included 681 unique genes, while the modules containing 106 TMPRSS2 (TMPRSS2 module) encompassed 1,086 unique genes (Supplementary Tables S1 and S2). 107 Only 41 genes were found in both modules. The hub gene for the ACE2 module was TMEM33, and 108 hub gene for the TMPRSS2 module was PDZD2 (see methods for the definition of ‘hub gene’). Fig. 1 109 shows the top 50 genes with the highest connectivity to ACE2 and TMPRSS2 within their respective 110 modules, based on the weighted gene correlation network analysis (WGCNA) analysis. 111 112 Table 1. Study cohort demographics Lung eQTL Consortium Cohort n 1,038 Age, years† 61 (52-69) Females, n (%) 472 (45.47) BMI, kg/m2† 24.60 (21.80-27.98) COPD‡, n (%) 426 (41.04) Asthma, n (%) 37 (3.56) 4 bioRxiv preprint doi: https://doi.org/10.1101/2020.07.06.182634; this version posted July 6, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. Cardiac disease, n (%) 192 (18.50) Hypertension, n (%) 142 (13.68) Diabetes, n (%) 81 (7.80) Never smokers, n (%) 162 (15.61) Former smokers, n (%) 631 (60.79) Current smokers, n (%) 245 (23.60) 113 †Median (interquartile range). ‡ chronic obstructive pulmonary disease 114 115 116 Figure 1. ACE2 and TMPRSS2 expression modules. The center of each graph represents ACE2 (A) 117 or TMPRSS2 (B), the circles at the edges represent the top 50 genes with the highest connectivity to 118 ACE2 or TMPRSS2 based on the WGCNA analysis. The circle size represents the size of each gene 119 node in their respective modules. The arm thickness represents the relative strength of the connection 120 to the ACE2 or TMPRSS2 expression. 121 122 ACE2 module 123 124 The median module membership (MM) (see methods) across the genes in the ACE2 module 125 was 0.40, and the minimum and maximum values were 0.002 and 0.79, respectively. The MM for 126 ACE2 was 0.25. We utilized genes in the ACE2 module to execute a pathway enrichment analysis, 127 which showed significant enrichment of four Kyoto Encyclopedia of Genes and Genomes (KEGG) 128 pathways (Lysosome, Metabolic pathways, N-Glycan biosynthesis and Endocytosis) (Supplementary 5 bioRxiv preprint doi: https://doi.org/10.1101/2020.07.06.182634; this version posted July 6, 2020.

Gene Expression Network Analysis Provides Potential Targets Against SARS-Cov-2

Mouse Pign Conditional Knockout Project (CRISPR/Cas9)

Congenital Disorders of Glycosylation from a Neurological Perspective

Supplementary Table 1: Adhesion Genes Data Set

Supplementary Table S4. FGA Co-Expressed Gene List in LUAD

A Genome-Wide Association Study of Calf Birth Weight in Holstein Cattle Using Single Nucleotide Polymorphisms and Phenotypes Predicted from Auxiliary Traits

A Novel Resveratrol Analog: Its Cell Cycle Inhibitory, Pro-Apoptotic and Anti-Inflammatory Activities on Human Tumor Cells

High Mutation Frequency of the PIGA Gene in T Cells Results In

Whole Genome Sequencing of Nearly Isogeneic WMI and WLI Inbred Rats Identifies Genes Potentially Involved in Depression

Hypotonia and Intellectual Disability Without Dysmorphic Features in A

Genome-Wide Association and Gene Enrichment Analyses of Meat Sensory Traits in a Crossbred Brahman-Angus

Mechanisms Underlying Phenotypic Heterogeneity in Simplex Autism Spectrum Disorders

Demographic History and Genetic Adaptation in the Himalayan