Charting the COVID-19 human : an early report

Paolo Tieri
CNR National Research Council, IAC Institute for Applied Computing, Rome, Italy

Abstract

The ongoing COVID-19 pandemic requires fast and effective efforts on all fronts, including epidemiological, clinical and molecular. A comprehensive molecular framework is needed to better understand the disease mechanisms and to design successful treatments able to slow down and stop the impressive pace of the outbreak. Although the number of COVID-19-associated genes is still limited because the disease is so recent, the study of their molecular context should be considered of paramount importance. Here, in order to provide a first, wider human molecular landscape of the SARS-CoV-2 infection, the protein-protein interaction (PPI) networks of four key human genes (ACE2, FURIN, TMPRSS2 and AGTR1) involved in various capacities in the SARS-CoV-2 life cycle, host cell entry and host defense, and of 89 more genes from a recent study of host-virus protein interactions, have been derived from available experimentally validated data and analyzed via a network approach. Building on prior knowledge and using network analysis techniques such as network propagation and connectivity significance, the host molecular reaction network to the viral invasion has been explored. Results based on the current PPI and pathway data show that these genes mostly work in isolation with respect to one another, and that a partial overlap exists between COVID-19 mechanisms and those related to the earlier SARS-CoV, responsible for the 2003 SARS epidemic. Moreover, the network analysis outcomes suggest that macropinocytosis may be critical for sustaining the viral processes, and that there is a significant overlap between COVID-19 genes and those associated with the major critical conditions leading to patients’ death. All interactomes are available via the NDEx framework for the sharing of biological network knowledge.

1 Introduction

The worldwide ongoing COVID-19 outbreak numbers ~745000 confirmed cases and a death toll above 35000 at the time of writing (footnote 1). Worse, it is not yet possible to forecast when the pace of new infections will slow down, and epidemiological data seem to exclude it in the short term. The ultimate goal in fighting a pandemic is to completely stop the spread, but slowing it is also crucial, to mitigate otherwise devastating effects on health and socioeconomic systems on a global scale. Thus, it is necessary to interfere by every possible means with the natural, deadly flow of the outbreak, in order to reduce and flatten the epidemic curve and relieve the pressure on hospital capacity (Qualls et al. 2017; Anderson et al. 2020).

1 March 30, 2020, data from https://gisanddata.maps.arcgis.com/apps/opsdashboard/index.html#/bda7594740fd40299423467b48e9ecf6

In this perspective, aside from all the epidemiological, clinical and immunological measures already implemented, a picture of the comprehensive molecular landscape of COVID-19 is timely and needed for a better understanding of the infection mechanisms. Such a picture includes the charting of the physical molecular interactions around the host genes that, in the current state of knowledge, are considered critical in the host infection processes. In the wider context of network medicine (Silverman and Loscalzo 2017), the protein-protein interaction (PPI) framework provides a widely assessed and effective heuristic approach for the identification of disease genes (Gustafsson et al. 2014; Taylor et al. 2009; Tieri et al. 2019).

In this study, to account for the complexity of the molecular processes underlying the COVID-19 host response, and to provide integrated knowledge that molecular and network biologists can further exploit (Bauer-Mehren et al. 2011), four genes identified as being among the main culprits of, or involved in various capacities in, the SARS-CoV-2 infection have been considered to build the COVID-19-related interactome. Moreover, taking advantage of the latest available data about host-virus protein interactions (Cui et al. 2020), an additional set of 89 human genes has been added, and the resulting COVID-19 extended interactome has been examined with available network medicine tools. In addition, gene-disease associations (GDA) for Severe Acute Respiratory Syndrome (SARS) and Middle East Respiratory Syndrome (MERS) have been retrieved from a specialized platform (DisGeNET) to reconstruct a SARS-MERS PPI map to be compared with the COVID-19 map. This information has then been analyzed via a network medicine approach and exploited to infer candidate genes, pathways and drugs for the treatment of COVID-19.

The workflow followed here included the identification of COVID-19, SARS and MERS seed genes, the reconstruction of the interactomes, the disease genes inference analysis, the functional enrichment analysis and finally the discussion of candidate genes, target pathways and drugs (fig. 1).

It can be argued that the number of established COVID-19 gene-disease associations (GDA) is still limited, and with it the completeness of the related interactomes, because COVID-19 is so recent. However, such associations are not static in nature: they are updated continuously as new knowledge surfaces, and their current scarcity should not discourage their investigation.

2 Methods

This section details which genes have been taken as starting (seed) genes for the reconstruction of the COVID-19, SARS and MERS molecular interaction maps, and which validated PPI data have been used for the reconstruction (2.1); which analyses, and related tools, have been carried out on the resulting interactomes (2.2); and where all data are available (2.3).

2.1 Data types and datasets

COVID-19 seed genes

Several research groups are currently investigating the mechanisms exploited by SARS-CoV-2 to penetrate host cells, and several observations have led to a focus on two cell surface receptors and two cleavage enzymes in particular, namely ACE2, AGTR1, FURIN, and TMPRSS2. Interfering with such disease seed genes can offer potential mechanisms to disrupt the coronavirus functionality and provide prospective targets for drugs to inhibit the pathogen (Mallapaty 2020).

The Angiotensin I Converting Enzyme 2 ACE2 has been deemed the host cells' main access gate to SARS-CoV-2 (Hoffmann et al. 2020), as it was already considered a functional receptor for the 2003 SARS coronavirus (Li et al. 2003). ACE2 forms a cell surface complex, exploited by SARS-CoV-2, with a separate endoprotease, the Transmembrane Protease/Serine Subfamily Member 2 TMPRSS2, which in turn acts to enhance the cell entry of SARS-CoV (Shulla et al. 2011).

The Paired Basic Amino Acid Cleaving Enzyme FURIN also seems crucial for cleaving and activating the viral S protein, making it able to bind the ACE2 receptor (Wu et al. 2020).

Finally, observations of a complex angiotensin regulation mechanism (Gurwitz 2020) led to the suggestion that blockade of Angiotensin II Receptor Type 1 AGTR1 (obtained via drugs such as losartan) stimulates higher ACE2 expression which, apparently paradoxically, may protect SARS-CoV-2-infected individuals from severe lung injury (Gurwitz 2020; Sun et al. 2020; Kickbusch and Leung 2020).

Moreover, a recent analysis of SARS-CoV- and SARS-CoV-2-host protein interactions proposes that, on top of the four genes mentioned above, 89 further human genes (hereinafter, for brevity, the Korkin lab dataset) may play a relevant role in the onset and development of COVID-19 (Cui et al. 2020) (Suppl. table 1.1).

Full lists of the above mentioned genes are available in supplementary table 1.

SARS and MERS GDA data

SARS and MERS GDA data have been retrieved from the DisGeNET database (Piñero et al. 2020), a discovery platform containing one of the largest publicly available collections of genes and variants associated with human diseases. The search in DisGeNET for SARS and MERS (Concept ID: C1175175 and C3694279, respectively) retrieved 84 genes and 14 genes, respectively (Supplementary table 1). Nine of these are shared by the two syndromes, giving a total of 89 distinct genes related to the two diseases.

Protein-protein interaction data and interactomes reconstruction

Protein-protein interaction data for interactome reconstruction have been retrieved from BioGRID, one of the most comprehensive interaction repositories, with freely provided data compiled through manual curation efforts. It currently contains more than 1.7 million protein and genetic interactions from major species, including Homo sapiens (Oughtred et al. 2019). For this study, the latest version available at the time of the analysis (BIOGRID-ORGANISM-Homo_sapiens-3.5.182.tab2.txt) has been used. From this original dataset, a two-column interaction file (a.k.a. “edge list”) of H. sapiens official gene symbols has been extracted, restricted to the largest connected component (LCC), i.e. excluding isolated nodes and islands of nodes.
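The extraction step described above can be sketched as follows. This is only an illustration using the Python standard library, not the procedure actually used in the study. The two column headers are assumed to match the BioGRID tab2 layout and should be checked against the downloaded file, and the toy rows are hypothetical placeholders rather than real interactions.

```python
import csv
import io
from collections import deque

# Assumed tab2 column headers; verify against the actual BioGRID download.
SYM_A = "Official Symbol Interactor A"
SYM_B = "Official Symbol Interactor B"

def read_edge_list(handle):
    """Return a set of undirected gene-symbol pairs, without self-loops or duplicates."""
    edges = set()
    for row in csv.DictReader(handle, delimiter="\t"):
        a, b = row[SYM_A], row[SYM_B]
        if a != b:
            edges.add(tuple(sorted((a, b))))
    return edges

def build_adjacency(edges):
    """Map every gene to the set of its interaction partners."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def largest_connected_component(adj):
    """Breadth-first search over every component; keep the largest node set."""
    seen, best = set(), set()
    for start in adj:
        if start in seen:
            continue
        comp, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for nb in adj[node] - comp:
                comp.add(nb)
                queue.append(nb)
        seen |= comp
        if len(comp) > len(best):
            best = comp
    return best

# Toy stand-in for the BioGRID file (hypothetical genes G1..G5).
toy = io.StringIO(
    "Official Symbol Interactor A\tOfficial Symbol Interactor B\n"
    "G1\tG2\nG2\tG3\nG2\tG1\nG4\tG5\nG3\tG3\n"
)
edges = read_edge_list(toy)
lcc = largest_connected_component(build_adjacency(edges))
lcc_edges = sorted(e for e in edges if e[0] in lcc)  # two-column edge list of the LCC
```

On the real file, `read_edge_list` would be called on an opened handle of the tab2 file instead of the in-memory toy table.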

Starting from this comprehensive set, interactomes of COVID-19 seed genes, as well as of SARS and MERS GDA, have been reconstructed using the well-known Cytoscape network analysis platform (Shannon et al. 2003), version 3.7. The human interactome derived from BioGRID (BioGRID LCC) has been uploaded, and the different subnetworks have been created starting from seed genes and from GDA by retrieving all interacting proteins. The union of the ACE2, AGTR1, FURIN and TMPRSS2 interactomes has been named ‘COVID interactome’. The 89-gene Korkin lab dataset has been integrated into the COVID interactome to build a wider view of the disease molecular landscape (COVID extended interactome). Cytoscape-embedded network editing, management and analysis tools allowed the computation of centralities and other elaborations of the interactomes.
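As a rough illustration of the subnetwork construction, the sketch below retrieves the seed genes and their first neighbors from an adjacency map and keeps every interaction among the retrieved nodes. Keeping neighbor-to-neighbor edges is an assumption here, not something the text states. The function names and the toy graph (seeds S1 and S2) are hypothetical.

```python
def build_adjacency(edges):
    """Map every node to the set of its interaction partners."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def first_neighbor_interactome(adj, seeds):
    """Seeds plus their direct interactors, with all edges among those nodes."""
    nodes = set()
    for seed in seeds:
        if seed in adj:
            nodes.add(seed)
            nodes |= adj[seed]
    edges = {tuple(sorted((a, b))) for a in nodes for b in adj[a] if b in nodes}
    return nodes, edges

def average_neighbors(nodes, edges):
    """Average number of neighbors = 2 * edges / nodes."""
    return 2 * len(edges) / len(nodes)

# Hypothetical toy network: two seeds with partly connected neighborhoods.
toy_adj = build_adjacency(
    [("S1", "a"), ("S1", "b"), ("S2", "c"), ("a", "c"), ("c", "d"), ("d", "e")]
)
s1_nodes, s1_edges = first_neighbor_interactome(toy_adj, ["S1"])
union_nodes, union_edges = first_neighbor_interactome(toy_adj, ["S1", "S2"])
```

Calling the function with several seeds at once gives the union of the single-seed interactomes plus any interaction linking them, such as the edge between `a` and `c` in the toy network.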

Full lists of the interactomes genes are available in supplementary table 1, and all PPI networks are available via a Cytoscape .cys file and the biological network sharing platform NDEx (see section 2.3).

2.2 Network analysis and enrichment tools

Network diffusion

Network diffusion (or network propagation) is a process that uses network topology (and optionally other features) to identify genes that are proximal to a starting list of seed genes. It can be used to identify genes and genetic modules that underlie human diseases (Cowen et al. 2017). The network propagation process begins with a subset of seed nodes of the network (e.g. genes in a specific interactome, or genes associated with a disease). A diffusion/quenching process is applied to the initial values assigned to the seed nodes, passing some of these values to neighboring nodes according to the topology of the network. The final distribution of the passed values identifies a subnetwork whose genes are closely associated with the seed gene network. In particular, network propagation has been used in network medicine to identify causal paths linking mutations to expression regulators, or to discover significantly mutated subnetworks in cancer (Paull et al. 2013; Vandin, Upfal, and Raphael 2011). The Cytoscape-embedded function ‘Diffuse’, based on a heat diffusion algorithm, has been used for the analysis (Carlin et al. 2017). The diffusion algorithm has been run starting from the COVID, the SARS-MERS and the COVID extended interactomes. The lists of resulting genes are available in suppl. table 1. Full details of the propagation analysis, including network measures, are available in suppl. tables 2-2b-3. The overlap between the COVID and SARS-MERS diffusion interactomes is available in suppl. table 1 and the functional enrichment in suppl. tables 7-9. The networks are available on NDEx (named ‘disease_diffusion’).
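To make the diffusion idea concrete, here is a minimal sketch of heat diffusion on an undirected graph. It is not the Cytoscape ‘Diffuse’ implementation: the heat equation dh/dt = -L h (with L the graph Laplacian) is approximated with explicit Euler steps, and the diffusion time `t`, the step count and the toy path graph are illustrative assumptions. The top-10% selection follows the default reported in the Results.

```python
def heat_diffusion(adj, seeds, t=0.1, steps=100):
    """Approximate h(t) = exp(-L t) h(0), with L = D - A, by explicit Euler steps.

    adj maps each node to the set of its neighbors; seeds start with heat 1.0.
    The step t/steps must be small relative to 1/max_degree for stability.
    """
    heat = {n: (1.0 if n in seeds else 0.0) for n in adj}
    dt = t / steps
    for _ in range(steps):
        new_heat = {}
        for node, nbrs in adj.items():
            # (L h)_node = degree * own heat - heat of the neighbors
            outflow = len(nbrs) * heat[node] - sum(heat[m] for m in nbrs)
            new_heat[node] = heat[node] - dt * outflow
        heat = new_heat
    return heat

def top_fraction(heat, fraction=0.10):
    """Nodes with the highest final heat; fraction=0.10 mirrors the reported default."""
    k = max(1, int(len(heat) * fraction))
    return sorted(heat, key=heat.get, reverse=True)[:k]

# Hypothetical path graph A - B - C - D with a single seed, A.
path = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
final_heat = heat_diffusion(path, seeds={"A"})
hottest = top_fraction(final_heat, fraction=0.5)
```

Heat is conserved across the network and decays with distance from the seed, so ranking nodes by final heat gives a proximity ordering around the seed list.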

Connectivity significance

The concept of connectivity significance, proposed by Ghiassian et al. (Ghiassian, Menche, and Barabási 2015), has been used to uncover genes associated with a particular pathophenotype. It rests on the observation that proteins associated with specific diseases show peculiar patterns of interaction with each other, patterns that in turn help in the identification of neighborhoods not previously associated with the disease. The methodology, together with an efficient algorithm calculating this quantitative measure, known as DIAMOnD (DIseAse MOdule Detection), is available (Ghiassian, Menche, and Barabási 2015), and has been used to find the first 200 genes (an arbitrary value suggested by the tool’s authors) that are closely linked to the set of genes of interest in this study. The lists of COVID, COVID extended and SARS-MERS ‘DIAMOnD’ genes are available in suppl. table 1.
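The core of the connectivity significance idea can be sketched as below. This is a simplified reading of the DIAMOnD scheme and not the authors' released tool. At each iteration, every candidate node with k links, ks of them to the current module, gets a hypergeometric p-value for having at least ks such links, and the node with the smallest p-value joins the module. Seed weighting and other refinements are left out, and the toy network is hypothetical.

```python
from math import comb

def connectivity_pvalue(n_nodes, n_seeds, k, ks):
    """P(X >= ks) for a node with k links in a network of n_nodes with n_seeds seeds."""
    upper = min(k, n_seeds)
    favourable = sum(
        comb(n_seeds, i) * comb(n_nodes - n_seeds, k - i) for i in range(ks, upper + 1)
    )
    return favourable / comb(n_nodes, k)

def diamond_sketch(adj, seeds, n_add):
    """Iteratively add the node whose links to the module are the most significant."""
    module, added = set(seeds), []
    for _ in range(n_add):
        best, best_p = None, 2.0
        for node in sorted(adj):
            if node in module:
                continue
            ks = len(adj[node] & module)
            if ks == 0:
                continue  # no link to the module, cannot be significant
            p = connectivity_pvalue(len(adj), len(module), len(adj[node]), ks)
            if p < best_p:
                best, best_p = node, p
        if best is None:
            break
        module.add(best)
        added.append(best)
    return added

# Hypothetical toy network: X touches both seeds, hub H touches only one.
toy_edges = [("X", "S1"), ("X", "S2"), ("H", "S1"),
             ("H", "a"), ("H", "b"), ("H", "c"), ("H", "d")]
toy_adj = {}
for u, v in toy_edges:
    toy_adj.setdefault(u, set()).add(v)
    toy_adj.setdefault(v, set()).add(u)

ranking = diamond_sketch(toy_adj, seeds={"S1", "S2"}, n_add=2)
```

In the toy network the low-degree node `X` is ranked before the hub `H`, because two seed links out of two are far less likely by chance than one seed link out of five.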

Enrichment analysis

Pathway, Gene Ontology (GO) and disease functional enrichment analyses of all retrieved datasets have been carried out through the Enrichr web service (Kuleshov et al. 2016). All functional enrichment data are available in the supplementary tables 4-15.

2.3 Data availability

All interactome data are freely available through a Cytoscape session .cys file (COVID-19_interactome_v.2020-03.cys, from which it is also possible to run the same or additional analyses) and accessible via the NDEx platform (www.ndexbio.org; Pillich et al. 2017), which supplies an open-source framework where it is possible to share, store, manipulate, and publish biological network knowledge. The COVID-19 interactome data shown here are provided in a dedicated NDEx section (footnote 2). The lists of genes of all interactomes are available in supplementary table 1.1. All network data are available on the NDEx platform, named ‘genename_interactome’, ‘disease_interactome’ and ‘disease_process_interactome’.

3 Results

3.1 Characteristics of COVID and SARS-MERS interactomes

The four interactomes of ACE2, FURIN, AGTR1 and TMPRSS2, i.e. the subnetworks composed of each gene of interest and its first neighbors (available in NDEx as ‘genename_interactome’ networks; fig. 2a-d), do not share any gene and have relatively few connections with respect to the number of nodes: while the average number of neighbors in the LCC is 40.83, for the ACE2, FURIN, AGTR1 and TMPRSS2 interactomes it is 1.75, 2.14, 2.94, and 1.5, respectively. Thus, signaling related to these four genes and their interactomes seems to proceed in relative isolation and independently. The union of these interactomes, the COVID interactome (fig. 3), is composed of 58 genes (the mere sum of all four interactomes’ nodes, since there is no gene in common) and 183 interactions. This integrated interactome uses the HNRNPL, ELAVL1, CALM1 and GNB2L1 genes, among a few others with high betweenness centrality values, as communication linkers. Taken together, these 58 genes are implicated in the renin-angiotensin system, systemic arterial blood pressure and vasoconstriction (as expected from the involvement of ACE2 and downstream signaling), in the cGMP-PKG signaling pathway, and in heart failure (Suppl. tables 1.4-6). The COVID extended interactome accounts for 140 genes (LCC: 115 genes) and 430 interactions (fig. 4). From the topological point of view, the most important nodes are AGTR1, for its high degree, and ELAVL1 and FURIN, for their betweenness centrality/degree ratio (Joy et al. 2005).

2 Interactomes are available at the following NDEx address: http://www.ndexbio.org/#/networkset/7ae142e8-6d22-11ea-bfdc-0ac135e8bacf?accesskey=a3cb0ddd62f1cd2ee055ea99d38e58deda824519abd3b33d64f4c79873a44cf5
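The topological measures mentioned above (degree, betweenness centrality and their ratio) can be illustrated with a short standard-library sketch. It uses Brandes' algorithm for unweighted, undirected graphs. This stands in for the Cytoscape network analysis tools used in the study, and the toy network and its node names are hypothetical.

```python
from collections import deque

def betweenness(adj):
    """Unnormalized betweenness centrality (Brandes) for an undirected, unweighted graph."""
    bc = dict.fromkeys(adj, 0.0)
    for s in adj:
        order = []                          # nodes in order of non-decreasing distance
        preds = {v: [] for v in adj}        # predecessors on shortest paths from s
        sigma = dict.fromkeys(adj, 0)       # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        for w in reversed(order):           # accumulate dependencies back towards s
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return {v: b / 2 for v, b in bc.items()}  # each pair was counted in both directions

# Hypothetical toy network: a hub with three leaves, plus a short chain c - LNK - z.
toy_edges = [("HUB", "a"), ("HUB", "b"), ("HUB", "c"), ("c", "LNK"), ("LNK", "z")]
toy_adj = {}
for u, v in toy_edges:
    toy_adj.setdefault(u, set()).add(v)
    toy_adj.setdefault(v, set()).add(u)

degree = {n: len(nbrs) for n, nbrs in toy_adj.items()}
bc = betweenness(toy_adj)
ratio = {n: bc[n] / degree[n] for n in toy_adj}
top_linker = max(ratio, key=ratio.get)
```

In this toy case `HUB` has the highest degree and betweenness, while `c` has the highest betweenness/degree ratio, the kind of low-degree linker that the ratio is meant to highlight.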

Comparison of the COVID interactome with the SARS-MERS interactome (composed of 81 genes and 86 interactions) showed a very small overlap, with only four genes shared: ACE2, AGT, MAPK1 and TMPRSS2. The overlap between the COVID extended and SARS-MERS interactomes accounts for 6 more genes: BCL2L1, CD209, CLEC4M, IRF3, PPIA and UBE2I. It should be noted that SARS and MERS genes are experimentally associated with the disease via curated repositories, GWAS catalogues, animal models and scientific literature, while the COVID interactome has been directly derived from PPI data. This consideration can partly account for the topological differences between the COVID and SARS-MERS interactomes (e.g. the COVID interactome showing many more interactions than SARS-MERS). Nevertheless, given the virological and clinical similarities of these respiratory syndromes, it is possible that a number of SARS and MERS genes should be associated with COVID-19 too. Thus, in this work, further potential overlaps (other than the mere count of shared genes) have been studied on the basis of robust genetic association techniques such as network propagation and connectivity significance analyses.

3.2 ‘Diffusion’ interactomes

Network diffusion algorithms have been run starting from the BioGRID LCC and selecting the COVID, COVID extended and SARS-MERS gene lists separately; the results have then been compared. By default, the networks returned by the algorithm are composed of 10% of the nodes of the starting network (BioGRID LCC, 17987 nodes), so each of the three diffusion subnetworks is composed of 1798 nodes (note that this is an arbitrary default setting that can be changed by the user). The SARS-MERS diffusion interactome (30932 interactions) is approximately two-fold denser in terms of interactions than the COVID and COVID extended diffusion interactomes (17654 and 16064 interactions, respectively), even though the original starting SARS-MERS interactome was definitely less dense (81 genes and 86 interactions) than the original COVID (58 genes and 183 interactions) and COVID extended (140 genes and 430 interactions) interactomes. Apart from the consideration above about the different nature of the COVID and SARS-MERS interactomes, this result could suggest that SARS-MERS core genes communicate to a lesser extent among themselves, but involve other genes that are much more tightly linked to each other, compared to COVID interactome genes. Notably, the number of genes shared between the COVID and SARS-MERS diffusion interactomes is 277, i.e. ~15%, and between COVID extended and SARS-MERS it is 372, i.e. ~21%, while the original COVID and SARS-MERS interactomes only share from ~4 to ~7% of their genes. Diffusion-derived shared genes mainly pertain to antigen presenting processes, angiotensin levels, systemic scleroderma and breast cancer (Suppl. tables 7-9).
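The shared-gene percentages can be reproduced from the reported counts with a few lines. The helper `shared_fraction` and the placeholder gene sets are hypothetical; the real gene lists are in the supplementary tables.

```python
def shared_fraction(genes_a, genes_b):
    """Number of shared genes and their fraction of the first gene set."""
    a, b = set(genes_a), set(genes_b)
    shared = a & b
    return len(shared), len(shared) / len(a)

# Reported counts: each diffusion interactome has 1798 nodes.
n_diffusion = 1798
pct_covid_vs_sarsmers = 100 * 277 / n_diffusion      # about 15.4%
pct_extended_vs_sarsmers = 100 * 372 / n_diffusion   # about 20.7%

# Usage on placeholder sets (not real gene lists).
n_shared, frac = shared_fraction({"g1", "g2", "g3", "g4"}, {"g3", "g4", "g5"})
```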

Of note, in Italy the most common pre-existing condition among deceased COVID-19 patients (as of March 19th, 2020) was hypertension (HT, 73.8% of patients), and the most common acute conditions observed were respiratory insufficiency (RI, 96.5%), acute kidney failure (AKF, 29.2%) and acute myocardial damage (AMD, 10.4%), among others (COVID-19 Surveillance Group, ISS, Italy 2020). Notably, COVID extended diffusion interactome genes significantly overlap with all four conditions, and genes shared between the COVID extended diffusion and SARS-MERS diffusion interactomes significantly overlap with HT, AKF and AMD (Table 1; GDA data from DisGeNET, p-value computed by applying a hypergeometric distribution test), possibly suggesting an impact of the virus infection on the associated disease genes and an impairment of the respective pathways and biological processes.
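The hypergeometric test behind such overlap p-values can be sketched as follows, assuming the usual one-sided (upper tail) form. The function name and the small worked numbers are illustrative. The population size 17549 comes from the Table 1 caption, while the actual disease gene counts and overlaps are in Table 1 and are not reproduced here.

```python
from math import comb

def hypergeom_pvalue(population, disease_genes, interactome_genes, shared):
    """P(X >= shared): chance of drawing at least `shared` disease genes when
    `interactome_genes` genes are taken without replacement from `population`
    genes, of which `disease_genes` are associated with the condition."""
    upper = min(disease_genes, interactome_genes)
    favourable = sum(
        comb(disease_genes, i) * comb(population - disease_genes, interactome_genes - i)
        for i in range(shared, upper + 1)
    )
    return favourable / comb(population, interactome_genes)

# Small worked example: 10 genes, 4 disease genes, 5 drawn, at least 3 shared.
p_small = hypergeom_pvalue(10, 4, 5, 3)   # (4*15 + 1*6) / 252 = 66/252

# With the study's population, a call would look like this (the counts 500 and
# 80 are hypothetical placeholders, not values from Table 1):
# p = hypergeom_pvalue(17549, 500, 1798, 80)
```

Exact integer binomials avoid floating-point issues in the sum; the only rounding happens in the final division.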

3.3 Connectivity significance

In parallel, connectivity significance has been computed with the DIAMOnD tool to find genes that are closely linked to the COVID, COVID extended and SARS-MERS interactome genes, and the results have then been compared for similarities and differences. COVID and SARS-MERS results obtained through DIAMOnD showed only a very small overlap (COVID/SARS-MERS, 9 genes in common: AKT1, ARNT, CDKN2A, CTNNB1, HIF1A, MAPK8, STAT1, STAT3, TP53; COVID extended/SARS-MERS, 10 genes in common: AKT1, CDKN1A, CHD3, CHD4, CTNNB1, EP300, ESR1, HDAC4, MDM2, TP53). The 200 genes most significantly connected to the COVID (and similarly to the COVID extended) interactome genes (suppl. table 1.1) are closely involved in the ErbB (EGFR) signaling pathway (Suppl. table 1.10), which has been shown to be utilized by the respiratory pathogens Influenza A virus, respiratory syncytial virus, and coronaviruses for host cell entry (Ho et al. 2017). Of note, SARS-CoV is able to induce EGFR-dependent macropinocytosis ('cell drinking', a type of endocytosis involving nonspecific uptake of extracellular material, such as antigens, among others), which peculiarly occurs late in infection and persistently, is unrelated to cell entry and is associated with increased virus titers (Freeman et al. 2014). Similarly, SARS-CoV-2 could exploit macropinocytosis at multiple stages in its replication, possibly contributing to the severity of the pathogenesis currently being observed. To this end, macropinocytosis inhibitors, such as imipramine, phenoxybenzamine and vinblastine, could be considered candidates for repurposing as therapeutic agents for the treatment of COVID-19 (Lin et al. 2018).

4 Conclusions and perspectives

In the dramatic scenario of the ongoing COVID-19 outbreak, an extensive molecular landscape of COVID-19 is urgently needed, as it can help to shed light on the mechanisms involved in the infection as well as on its potential treatments. This work promptly provides a first view of the molecular context of the genes that are currently considered among the most important in the host entry and development of the virus. From the latest data related to SARS-CoV-2, relevant interactome maps have been reconstructed and compared to those relating to SARS and MERS. Results suggest that the four main genes involved in SARS-CoV-2 entry do not share interactors and may work in isolation from one another, so that targeting only one of them might not be sufficient to completely hinder virus entry. The analysis outcomes also suggest that the different respiratory syndromes still have several processes in common. For example, the indication that the EGFR pathway and the macropinocytosis process, for which effective inhibitors exist, can be involved in COVID-19 is worth exploring further as a potential treatment route for the reduction of potentially lethal virus titers in affected patients. Further, it is interesting to note that diffusion interactome genes are associated in a statistically highly significant way with the most common conditions of deceased patients in Italy.

A further significant step from the work presented here could consist in mapping cell type-specific and time-dependent COVID interactomes, to show the specificity and dynamics of viral entry and invasion. As a whole, the integrative network approach reported here has been conceived to identify the comprehensive molecular landscape of COVID-19, as well as genes and pathways that are not immediately associated with SARS-CoV-2 invasion, or not yet considered with respect to host defense regulation and dynamics, and may thus suggest new directions for further studies and analyses. To this end, all data have been shared via the open collaborative platform NDEx for the incorporation of new data, analyses, revisions and exploitation.

References

Anderson, Roy M., Hans Heesterbeek, Don Klinkenberg, and T. Déirdre Hollingsworth. 2020. “How Will Country-Based Mitigation Measures Influence the Course of the COVID-19 Epidemic?” The Lancet 395 (10228): 931–34.

Bauer-Mehren, Anna, Markus Bundschus, Michael Rautschka, Miguel A. Mayer, Ferran Sanz, and Laura I. Furlong. 2011. “Gene-Disease Network Analysis Reveals Functional Modules in Mendelian, Complex and Environmental Diseases.” PloS One 6 (6): e20284.

Carlin, Daniel E., Barry Demchak, Dexter Pratt, Eric Sage, and Trey Ideker. 2017. “Network Propagation in the Cytoscape Cyberinfrastructure.” PLoS Computational Biology 13 (10): e1005598.

COVID-19 Surveillance Group, ISS, Italy. 2020. “Istituto Superiore Di Sanità. Characteristics of COVID-19 Patients Dying in Italy Report Based on Available Data on March 20th, 2020.” 2020. https://www.epicentro.iss.it/coronavirus/bollettino/Report-COVID-2019_20_marzo_eng.pdf.

Cowen, Lenore, Trey Ideker, Benjamin J. Raphael, and Roded Sharan. 2017. “Network Propagation: A Universal Amplifier of Genetic Associations.” Nature Reviews. Genetics 18 (9): 551–62.

Cui, Hongzhu, Ziyang Gao, Ming Liu, Senbao Lu, Sun Mo, Winnie Mkandawire, Oleksandr Narykov, Suhas Srinivasan, and Dmitry Korkin. 2020. “Structural Genomics and Interactomics of 2019 Wuhan Novel Coronavirus, 2019-nCoV, Indicate Evolutionary Conserved Functional Regions of Viral Proteins.” Bioinformatics. bioRxiv.

Freeman, Megan Culler, Christopher T. Peek, Michelle M. Becker, Everett Clinton Smith, and Mark R. Denison. 2014. “Coronaviruses Induce Entry-Independent, Continuous Macropinocytosis.” mBio 5 (4): e01340–14.

Ghiassian, Susan Dina, Jörg Menche, and Albert-László Barabási. 2015. “A DIseAse MOdule Detection (DIAMOnD) Algorithm Derived from a Systematic Analysis of Connectivity Patterns of Disease Proteins in the Human Interactome.” PLoS Computational Biology 11 (4): e1004120.

Gurwitz, David. 2020. “Angiotensin Receptor Blockers as Tentative SARS-CoV-2 Therapeutics.” Drug Development Research, March. https://doi.org/10.1002/ddr.21656.

Gustafsson, Mika, Colm E. Nestor, Huan Zhang, Albert-László Barabási, Sergio Baranzini, Sören Brunak, Kian Fan Chung, et al. 2014. “Modules, Networks and Systems Medicine for Understanding Disease and Aiding Diagnosis.” Genome Medicine 6 (10): 82.

Hoffmann, Markus, Hannah Kleine-Weber, Nadine Krüger, Marcel Müller, Christian Drosten, and Stefan Pöhlmann. 2020. “The Novel Coronavirus 2019 (2019-nCoV) Uses the SARS-Coronavirus Receptor ACE2 and the Cellular Protease TMPRSS2 for Entry into Target Cells.” bioRxiv.

Ho, Jemima, David L. Moyes, Mahvash Tavassoli, and Julian R. Naglik. 2017. “The Role of ErbB Receptors in Infection.” Trends in Microbiology. https://doi.org/10.1016/j.tim.2017.04.009.

Joy, Maliackal Poulo, Amy Brock, Donald E. Ingber, and Sui Huang. 2005. “High-Betweenness Proteins in the Protein Interaction Network.” Journal of Biomedicine and Biotechnology. https://doi.org/10.1155/jbb.2005.96.

Kickbusch, Ilona, and Gabriel Leung. 2020. “Response to the Emerging Novel Coronavirus Outbreak.” BMJ 368 (January): m406.

Kuleshov, Maxim V., Matthew R. Jones, Andrew D. Rouillard, Nicolas F. Fernandez, Qiaonan Duan, Zichen Wang, Simon Koplev, et al. 2016. “Enrichr: A Comprehensive Gene Set Enrichment Analysis Web Server 2016 Update.” Nucleic Acids Research. https://doi.org/10.1093/nar/gkw377.

Lin, Hui-Ping, Bhupesh Singla, Pushpankur Ghoshal, Jessica L. Faulkner, Mary Cherian-Shaw, Paul M. O’Connor, Jin-Xiong She, Eric J. Belin de Chantemele, and Gábor Csányi. 2018. “Identification of Novel Macropinocytosis Inhibitors Using a Rational Screen of Food and Drug Administration-Approved Drugs.” British Journal of Pharmacology. https://doi.org/10.1111/bph.14429.

Li, Wenhui, Michael J. Moore, Natalya Vasilieva, Jianhua Sui, Swee Kee Wong, Michael A. Berne, Mohan Somasundaran, et al. 2003. “Angiotensin-Converting Enzyme 2 Is a Functional Receptor for the SARS Coronavirus.” Nature. https://doi.org/10.1038/nature02145.

Mallapaty, Smriti. 2020. “Why Does the Coronavirus Spread so Easily between People?” Nature 579 (7798): 183.

Oughtred, Rose, Chris Stark, Bobby-Joe Breitkreutz, Jennifer Rust, Lorrie Boucher, Christie Chang, Nadine Kolas, et al. 2019. “The BioGRID Interaction Database: 2019 Update.” Nucleic Acids Research 47 (D1): D529–41.

Paull, Evan O., Daniel E. Carlin, Mario Niepel, Peter K. Sorger, David Haussler, and Joshua M. Stuart. 2013. “Discovering Causal Pathways Linking Genomic Events to Transcriptional States Using Tied Diffusion Through Interacting Events (TieDIE).” Bioinformatics 29 (21): 2757–64.

Pillich, Rudolf T., Jing Chen, Vladimir Rynkov, David Welker, and Dexter Pratt. 2017. “NDEx: A Community Resource for Sharing and Publishing of Biological Networks.” Methods in Molecular Biology 1558: 271–301.

Piñero, Janet, Juan Manuel Ramírez-Anguita, Josep Saüch-Pitarch, Francesco Ronzano, Emilio Centeno, Ferran Sanz, and Laura I. Furlong. 2020. “The DisGeNET Knowledge Platform for Disease Genomics: 2019 Update.” Nucleic Acids Research 48 (D1): D845–55.

Qualls, Noreen, Alexandra Levitt, Neha Kanade, Narue Wright-Jegede, Stephanie Dopson, Matthew Biggerstaff, Carrie Reed, et al. 2017. “Community Mitigation Guidelines to Prevent Pandemic Influenza - United States, 2017.” MMWR. Recommendations and Reports. https://doi.org/10.15585/mmwr.rr6601a1.

Shannon, Paul, Andrew Markiel, Owen Ozier, Nitin S. Baliga, Jonathan T. Wang, Daniel Ramage, Nada Amin, Benno Schwikowski, and Trey Ideker. 2003. “Cytoscape: A Software Environment for Integrated Models of Biomolecular Interaction Networks.” Genome Research 13 (11): 2498–2504.

Shulla, Ana, Taylor Heald-Sargent, Gitanjali Subramanya, Jincun Zhao, Stanley Perlman, and Tom Gallagher. 2011. “A Transmembrane Serine Protease Is Linked to the Severe Acute Respiratory Syndrome Coronavirus Receptor and Activates Virus Entry.” Journal of Virology 85 (2): 873–82.

Silverman, Edwin K., and Joseph Loscalzo. 2017. “1. Scientific Basis of Network Medicine.” Network Medicine. https://doi.org/10.4159/9780674545533-002.

Sun M. L., Yang J. M., Sun Y. P., and Su G. H. 2020. “[Inhibitors of RAS Might Be a Good Choice for the Therapy of COVID-19 Pneumonia].” Zhonghua jie he he hu xi za zhi = Zhonghua jiehe he huxi zazhi = Chinese journal of tuberculosis and respiratory diseases 43 (0): E014.

Taylor, Ian W., Rune Linding, David Warde-Farley, Yongmei Liu, Catia Pesquita, Daniel Faria, Shelley Bull, Tony Pawson, Quaid Morris, and Jeffrey L. Wrana. 2009. “Dynamic Modularity in Protein Interaction Networks Predicts Breast Cancer Outcome.” Nature Biotechnology 27 (2): 199–204.

Tieri, Paolo, Lorenzo Farina, Manuela Petti, Laura Astolfi, Paola Paci, and Filippo Castiglione. 2019. “Network Inference and Reconstruction in Bioinformatics.” Encyclopedia of Bioinformatics and Computational Biology. https://doi.org/10.1016/b978-0-12-809633-8.20290-2.

Vandin, Fabio, Eli Upfal, and Benjamin J. Raphael. 2011. “Algorithms for Detecting Significantly Mutated Pathways in Cancer.” Journal of Computational Biology: A Journal of Computational Molecular Cell Biology 18 (3): 507–22.

Wu, Canrong, Yang Yueying, L. I. U. Yang, Zhang Peng, Wang Yali, Wang Qiqi, X. U. Yang, L. I. Mingxue, Zheng Mengzhu, and Li Hua. 2020. “Furin, a Potential Therapeutic Target for COVID-19.” Chinaxiv. https://doi.org/10.12074/202002.00062.

Tables and figures captions

Table 1. Overlaps between interactome genes and disease-associated genes (no. of genes, statistical significance [p-value]). Number of interactome genes (columns) shared with pathological condition-associated genes (rows) and statistical significance (computed via hypergeometric test with population = total no. of genes in DisGeNET = 17549; bootstrap tests with the same number of disease genes: p-value = 0). A statistically significant number of COVID extended diffusion interactome genes are in common with all the major COVID-19 conditions in deceased patients. (Red: high statistical significance, orange: statistically significant.)

Figure 1. Workflow of the study: disease genes have been selected from recent COVID-19 literature and from SARS-MERS GDA; PPI networks have been reconstructed via the BioGRID PPI dataset; disease gene inference approaches have been used to finally obtain candidate genes and biological processes relevant for COVID-19.

Figure 2. Depiction of the interactomes of the four genes ACE2 (a), AGTR1 (b), FURIN (c) and TMPRSS2 (d) (in yellow).

Figure 3. Depiction of the COVID interactome, accounting for 58 genes and 183 interactions. Node dimension proportional to the node degree; node color: from red to green for high to low betweenness values.

Figure 4. Depiction of the COVID extended interactome, accounting for 140 genes and 430 interactions. Node dimension proportional to the node degree; node color: from red to green for high to low betweenness values.
