A Comprehensive SARS-CoV-2 Genomic Analysis Identifies Potential Targets for Drug Repurposing

PLOS ONE RESEARCH ARTICLE

Nithishwer Mouroug Anand1☯, Devang Haresh Liya1☯, Arpit Kumar Pradhan2,3*, Nitish Tayal4, Abhinav Bansal5, Sainitin Donakonda6, Ashwin Kumar Jainarayanan7,8*

1 Department of Physical Sciences, Indian Institute of Science Education and Research, Mohali, India
2 Graduate School of Systemic Neuroscience, Ludwig Maximilian University of Munich, Munich, Germany
3 Klinikum rechts der Isar, Technische Universität München, München, Germany
4 Department of Biological Sciences, Indian Institute of Science Education and Research, Mohali, India
5 Department of Chemical Sciences, Indian Institute of Science Education and Research, Mohali, India
6 Institute of Molecular Immunology and Experimental Oncology, Klinikum rechts der Isar, Technische Universität München, München, Germany
7 The Kennedy Institute of Rheumatology, University of Oxford, Oxford, United Kingdom
8 Interdisciplinary Bioscience DTP, University of Oxford, Oxford, United Kingdom

☯ These authors contributed equally to this work.
* [email protected] (AKP); [email protected] (AKJ)

OPEN ACCESS

Citation: Anand NM, Liya DH, Pradhan AK, Tayal N, Bansal A, Donakonda S, et al. (2021) A comprehensive SARS-CoV-2 genomic analysis identifies potential targets for drug repurposing. PLoS ONE 16(3): e0248553. https://doi.org/10.1371/journal.pone.0248553
Editor: Malaya Kumar Sahoo, Stanford University School of Medicine, UNITED STATES
Received: November 3, 2020
Accepted: March 1, 2021
Published: March 18, 2021
Peer Review History: PLOS recognizes the benefits of transparency in the peer review process; therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. The editorial history of this article is available here: https://doi.org/10.1371/journal.pone.0248553
Copyright: © 2021 Anand et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability Statement: All relevant data are within the manuscript and its Supporting Information files.
Funding: The author(s) received no specific funding for this work.
Competing interests: The authors have declared that no competing interests exist.

Abstract

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), a novel human coronavirus (HCoV) strain, was first reported in December 2019 in Wuhan City, China. The acute infection causes pneumonia-like symptoms and other respiratory tract illness. Its high transmission and infection rate enabled it to spread globally in a short time. One of the major concerns about SARS-CoV-2 is its mutation rate, which drives viral evolution and genome variability and thereby makes the design of therapeutics difficult. In this study, we identified the most common haplotypes from the haplotype network. The conserved genes and population-level variants were analysed. Non-Structural Protein 10 (NSP10), Nucleoprotein, Papain-like protease (Plpro or NSP3) and 3-Chymotrypsin-like protease (3CLpro or NSP5), which were conserved at the highest threshold, were used as drug targets for molecular dynamics simulations. Darifenacin, Nebivolol, Bictegravir, Alvimopan and Irbesartan are among the potential drugs suggested for further pre-clinical and clinical trials. This study provides a comprehensive targeting of the conserved genes. We also identified the mutation frequencies across the viral genome.

Introduction

The 2019 novel coronavirus strain (2019-nCoV, later officially named SARS-CoV-2), which was initially reported in Wuhan, Hubei Province, People's Republic of China (PRC), belongs to the Coronaviridae family of viruses, which possess a positive-sense single-stranded RNA genome [1, 2]. Compared to the previous outbreaks of severe acute respiratory syndrome coronavirus (SARS-CoV) in 2003 and Middle East respiratory syndrome coronavirus (MERS-CoV) in 2012, 2019-nCoV has a higher transmission and infection rate with an increasing mortality rate [3]. The SARS-CoV-2 genome, like that of other betacoronaviruses, encodes a long ORF1ab polyprotein at the 5′ end, which is followed by a set of four major structural proteins: the spike surface glycoprotein, small envelope protein, matrix protein, and nucleocapsid protein (Fig 1) [4]. The 2019-nCoV strain and SARS-CoV share a genome sequence homology of about 79%. 2019-nCoV is more similar to the SARS-like bat CoVs (MG772933) than to SARS-CoV [1]. The high similarity of the receptor-binding domain (RBD) in the Spike protein, together with several other analyses, indicates that SARS-CoV-2 uses angiotensin-converting enzyme 2 (ACE2) as its receptor, just like SARS-CoV. The coronavirus identifies the corresponding receptor on the target cell via the S protein on its surface and thereby enters the host cell [5]. The higher transmissibility and infection rate of 2019-nCoV compared to SARS-CoV is attributed to the higher binding affinity of SARS-CoV-2 for the ACE2 receptor [6, 7]. In one structural model analysis, SARS-CoV-2 showed a 10-fold higher binding affinity for ACE2 than SARS-CoV [7].
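To make the "about 79%" figure above concrete, the following is a minimal sketch of how a percent identity can be computed from two sequences that have already been aligned. It is an illustration only, not the method used in the cited work: the function name `percent_identity`, the choice to skip gap columns, and the toy sequences are all assumptions, and other identity conventions (for example, counting gaps in the denominator) give different numbers.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of ungapped alignment columns where both sequences agree.

    Assumes the two strings come from the same pairwise alignment,
    with '-' marking gaps (an illustrative convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy, made-up sequences (not real viral data).
toy_a = "ATGGC-TTAGC"
toy_b = "ATGACGTTTGC"
identity = percent_identity(toy_a, toy_b)
print(f"{identity:.1f}% identity")  # prints "80.0% identity"
```

In the toy example, 10 columns are ungapped and 8 of them match, which gives 80%. A genome-scale comparison would first require a proper alignment step, which this sketch deliberately leaves out.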
The sequence similarity between SARS-CoV-2 and SARS-CoV allows known protein structures to be used to build models for drug discovery against SARS-CoV-2. A comprehensive genomic study could identify the start of community spread promptly and could help in imposing restrictions that prevent subsequent infections [8]. As of January 23, 2021, a total of 99,298,747 cases of COVID-19 had been reported in at least 219 countries and territories, with a fatality rate of approximately 3%. Coronaviruses, like other RNA viruses, are characterized by significant genetic variability and a high recombination rate, which allow them to spread easily among humans and animals in different geographic locations [9]. Numerous coronavirus strains exist within human and animal populations without causing life-threatening diseases [10]. However, in certain rare cases, genetic recombination of viruses produces infectious strains that are pathogenic to humans [11]. What makes SARS-CoV-2 more powerful is the mutation events that allow structural changes in the virus. One recent study suggests the existence of three central variants of SARS-CoV-2 distinguished by amino acid changes [12]. Many studies have performed phylogenetic analysis on SARS-CoV-2 genomes sampled from across the world. These studies have detailed the confounding role of founder effects and of genetic, immunological and environmental factors in the evolution of SARS-CoV-2, and have identified several core mutations in the viral genome, linking them to COVID-19 transition events [12–15]. As the virus spreads, mutations accumulate, which makes pharmaceutical interventions difficult. We urgently need therapeutic options to combat this viral infection.
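The accumulation of mutations described above can be quantified per genome position once sequences are aligned. The sketch below is a hedged illustration rather than the authors' pipeline: it computes, for each alignment column, the fraction of sequences that differ from the most common base, and then reports runs of fully conserved columns. The function names, the `max_freq` and `min_length` parameters, and the toy alignment are all assumptions made for this example.

```python
from collections import Counter


def site_mutation_frequencies(alignment):
    """Per column, the fraction of sequences differing from the major base."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")
    freqs = []
    for column in zip(*alignment):
        major_count = Counter(column).most_common(1)[0][1]
        freqs.append(1.0 - major_count / len(column))
    return freqs


def conserved_stretches(freqs, max_freq=0.0, min_length=3):
    """Half-open (start, end) runs of columns with frequency <= max_freq."""
    stretches = []
    start = None
    for i, freq in enumerate(freqs):
        if freq <= max_freq:
            if start is None:
                start = i  # a conserved run begins here
        else:
            if start is not None and i - start >= min_length:
                stretches.append((start, i))
            start = None
    # Close a run that extends to the end of the alignment.
    if start is not None and len(freqs) - start >= min_length:
        stretches.append((start, len(freqs)))
    return stretches


# Toy, made-up alignment of four sequences (not real viral data).
toy_alignment = [
    "ATGCCGTA",
    "ATGCTGTA",
    "ATGCCGTT",
    "ATGCCGTA",
]
freqs = site_mutation_frequencies(toy_alignment)
stretches = conserved_stretches(freqs)
print(freqs)
print(stretches)
```

Here columns 4 and 7 each vary in one of four sequences, so only columns 0 to 3 form a conserved run of at least three positions. A real analysis would also need to handle gaps and ambiguous bases, which this sketch treats as ordinary characters.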
In this study, we therefore performed a wide array of analyses that address the mutation problem and systematically identified drug targets to aid therapeutic design. First, we performed haplotype analysis, which identified several different primary clusters based on the haplotype network, suggesting the presence of different variants of SARS-CoV-2. We also found the genes that are conserved and the population-level variants, and we highlight the mutation frequencies across the viral genome. We then identified the stable genes, which have stretches of conserved regions and can thereby be used as efficient drug targets. On this basis, we identified 4 genes that are stable and conserved in all the strains. We used them as targets for in silico drug design, molecular docking and molecular dynamics simulations. Given the fast mutation rate of these viruses, targeting the stable genes with small molecules should provide a better therapeutic approach and greater confidence in subsequent clinical trials. This study provides new insights into the evolution of COVID-19, identifies the divergence pattern and the spread of the virus at the population level, and utilises a unique and efficient method of targeting the stable genes for drug discovery.

Fig 1. A detailed schematic representation of the SARS-CoV-2 viral genome. The figure represents the detailed view of structural and non-structural proteins (NSPs). https://doi.org/10.1371/journal.pone.0248553.g001

Results and discussions

Viral clusters identified via haplotype network

In order to understand the population-level divergence of SARS-CoV-2, we sought to map the haplotype network and establish the relationships among the SARS-CoV-2 haplotypes from genome data collected across the globe. A total of 194 haplotypes were identified
