Unique Features Concerning the Proximal Origin of SARS-CoV-2
Murat Seyran1,2, Damiano Pizzol3, Parise Adadi4, Tarek Mohamed Abd El-Aziz5,6, Sk. Sarif Hassan7, Antonio Soares5, Ramesh Kandimalla8,9, Kenneth Lundstrom10, Murtaza Tambuwala11, Alaa A. A. Aljabali12, Amos Lal13, Gajendra Kumar Azad14, Pabitra Pal Choudhury15, Vladimir N. Uversky16, Samendra P. Sherchan17, Bruce D. Uhal18, Nima Rezaei19,20,*, Adam M. Brufsky21,*

1Doctoral studies in natural and technical sciences (SPL 44), University of Vienna, Austria.
2Infection, Malignancy and Autoimmunity (NIIMA), Universal Scientific Education and Research Network (USERN), Austria.
3Italian Agency for Development Cooperation - Khartoum, Sudan Street 33, Al Amarat, Sudan.
4Department of Food Science, University of Otago, Dunedin 9054, New Zealand.
5Department of Cellular and Integrative Physiology, University of Texas Health Science Center at San Antonio, 7703 Floyd Curl Dr, San Antonio, TX 78229-3900, USA.
6Zoology Department, Faculty of Science, Minia University, El-Minia 61519, Egypt.
7Department of Mathematics, Pingla Thana Mahavidyalaya, Maligram, Paschim Medinipur, 721140, West Bengal, India.
8CSIR-Indian Institute of Chemical Technology, Uppal Road, Tarnaka, Hyderabad-500007, Telangana State, India.
9Kakatiya Medical College/MGM-Hospital, DME/TSPSC, Hyderabad, Warangal, Telangana State-506007, India.
10PanTherapeutics, Rte de Lavaux 49, CH1095 Lutry, Switzerland.
11School of Pharmacy and Pharmaceutical Science, Ulster University, Coleraine BT52 1SA, Northern Ireland, UK.
12Department of Pharmaceutical Sciences, Yarmouk University, Faculty of Pharmacy, Irbid 566, Jordan.
13Division of Pulmonary and Critical Care Medicine, Mayo Clinic, Rochester, Minnesota, USA.
14Department of Zoology, Patna University, Patna-800005, Bihar, India.
15Applied Statistics Unit, Indian Statistical Institute, 203 B T Road, Kolkata 700108, West Bengal, India.
16Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA.
17Department of Environmental Health Sciences, Tulane University, New Orleans, LA, 70112, USA.
18Department of Physiology, Michigan State University, East Lansing, MI 48824, USA.
19Center for Immunodeficiencies, Pediatrics Center of Excellence, Children’s Medical Center, Tehran University of Medical Sciences, Tehran, Iran.
20Network of Immunity in Infection, Malignancy and Autoimmunity (NIIMA), Universal Scientific Education and Research Network (USERN), Tehran, Iran.
21University of Pittsburgh School of Medicine, Department of Medicine, Division of Hematology/Oncology, UPMC Hillman Cancer Center, Pittsburgh, PA, USA.
*Corresponding author

Abstract

There is a consensus that SARS-CoV-2 originated naturally from BatCoVs. Although phylogenetic analysis suggests that SARS-CoV-2 is related to the BatCoVs RaTG13 or RmYN02, its exact lineage of origin is not clear. In fact, the flat and non-sunken surface of the sialic acid-binding domain of the SARS-CoV-2 spike protein conflicts with the general adaptation and survival pattern observed for all other coronaviruses (CoVs). Unlike in RaTG13, a recombination apparently occurred between the S1 and S2 domains of the SARS-CoV-2 spike protein, enabling furin protease utilization. Interestingly, the clinical strains of SARS-CoV-2 do not show any further recombination, which would conflict with the recombination models of CoVs.
Despite millions of SARS-CoV-2 infections, the spike protein receptor-binding domain (RBD) has not accumulated a high-frequency substitution. This differentiates SARS-CoV-2 from other CoVs, which have positive selection sites in their RBDs, and calls into question the origin of the apparently optimized angiotensin-converting enzyme 2 (ACE2) binding of SARS-CoV-2. The host tropism pattern of SARS-CoV-2 thus shows major discrepancies compared with other CoVs, raising questions concerning the proximal origin of SARS-CoV-2.

Keywords: Severe Acute Respiratory Syndrome Coronavirus 2, Canyon Hypothesis, Template-switching model, Receptor Binding Domain, Positive Selection Site

Human Pathogenic CoVs and SARS-CoV-2

The possible natural origin of Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) from Bat CoV RaTG13 was explained by Andersen and colleagues 1. However, another Bat CoV (RmYN02) was recently claimed to be genetically closer to SARS-CoV-2 than RaTG13, based on whole-genome phylogenetic analysis 1,4. Therefore, there is no genomic consensus on the exact lineage origin of SARS-CoV-2 1,4. SARS-CoV-2 is the seventh CoV capable of infecting humans, but the first and only human coronavirus (HCoV) with pandemic potential 5. During host tropism, bat or rodent CoVs have demonstrated certain changes in their S protein receptor-binding domain (RBD) and glycan-binding N-terminal domain (NTD) 3,6. SARS-CoV-2, unlike other CoVs, does not have those signature changes.

SARS-CoV-2 NTD composition conflicting with the "Canyon Hypothesis"

The formation of canyons, depression zones, or cavities on the surfaces of the influenza virus, human rhinovirus, and Mengo virus is explained by the "Canyon Hypothesis", which suggests that the host cell receptor attachment site on a viral surface is hidden from immune surveillance in deep surface depressions or "canyons" 7.
In CoVs (except SARS-CoV-2), the S protein NTD glycan-binding domain has several sugar-binding modes, with a common feature being the hidden localization of the corresponding binding sites in cavities, which limits their access to antibodies and immune cells 6. This pattern of CoVs was considered an evolutionary measure to limit recognition of these active sites by the host immune system 3.

The comparative genomic analysis of six HCoVs with their parent bat or rodent CoVs indicated the presence of several deletions, insertions, and recombination events in their S protein NTD that allow evasion of the host glycan-binding immune receptors, compatible with the "Canyon Hypothesis" 3,6,8. Compared to its parent bat CoV, the HCoV-229E S protein has a 187-amino-acid deletion within the NTD that provides a means to evade the host immune C-type lectin receptors (CLRs), which detect glycan-binding viruses 8. Similarly, in MERS-CoV, the sialic acid binding site of the S protein NTD is located 280 Å² beneath the protein surface 9. However, the flat pattern of the SARS-CoV-2 S protein NTD conflicts with the evolutionary host tropism strategy, not only of the HCoVs, but also of many other human viral pathogens 2,3,6,7.

SARS-CoV-2 recombination pattern is not compatible with the Template Switching (Copy-Choice) mechanism

The high rate of RNA recombination in CoVs can be explained by a template-switching (copy-choice) mechanism 10. CoV RNA shows discontinuous and non-processive replication in its host cells: replication pauses at specific RNA sites, creating blocks of free RNA that could be incorporated into another round of CoV replication, leading to the formation of different recombinant CoV strains 10. These pauses in RNA replication of CoVs could allow recombination as a result of mixed infection with other CoVs, other viruses, or even the host RNA sequences.
As an example, the sialic acid binding domain of the mouse hepatitis coronavirus S protein NTD is considered to have originated from recombination with human galectin RNA sequences 14. HCoV-HKU1, when compared to the genetically related HCoV-OC43, shows a high rate of recombination in specific ORFs, e.g., P65 to NSP10, suggesting the presence of several RNA replication stops in these domains 11. Recombination can also occur with deletions in the CoV RNA. Both HCoV-229E and the genetically related Alpaca CoV show deletions ranging from 185 to 404 nucleotides in the S1 region compared to the genetically related BatCoV 12. Strikingly, ORF8 was deleted in HCoV-229E compared to the related alpaca and parent bat viruses, suggesting that the alpaca might have served as the first intermediate host of the bat virus 12.

The furin recognition motif at the S1/S2 site of the SARS-CoV-2 S protein does not exist in other "lineage B" betacoronaviruses, e.g., Pangolin-CoV or RaTG13, suggesting that the S protein S1/S2 site is not a recombination hot spot or RNA termination site under the template-switching (copy-choice) model 1,10. Additionally, the S proteins of clinical SARS-CoV-2 isolates have not shown any recombination in this area, suggesting that the insertion of the furin recognition motif at the S1/S2 site represents a one-time event of special recombination.

RmYN02, which is genetically the closest related virus to SARS-CoV-2, was proposed to have an insert like that at the S1/S2 site of the SARS-CoV-2 S protein 4. However, based on the RBD, RmYN02 is phylogenetically not related to SARS-CoV-2, and the insert between the S1 and S2 domains is not the PRRA found in SARS-CoV-2, but P(deletion)AA, and the function of this insertion in RmYN02 as a furin cleavage site is not clear 4. In addition, another virus closely related to SARS-CoV-2, RmYN01, does not have that insert in the S1/S2 region 4.
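Whether a given S1/S2 junction carries a furin-like recognition motif can be checked mechanically. The following is a minimal sketch in Python, assuming the simplified minimal furin pattern R-X-X-R (X being any residue), which is a common shorthand and not a full cleavage-site predictor. The function name `find_furin_like_motifs` and the two short peptide fragments are illustrative assumptions introduced here; they should be checked against the deposited reference sequences before being relied upon.

```python
import re
from typing import List, Tuple

# Simplified minimal furin recognition pattern R-X-X-R.
# The lookahead makes overlapping occurrences visible as well.
FURIN_MINIMAL = re.compile(r"(?=(R..R))")


def find_furin_like_motifs(peptide: str) -> List[Tuple[int, str]]:
    """Return (0-based start, matched 4-mer) for every R-X-X-R occurrence."""
    return [(m.start(), m.group(1)) for m in FURIN_MINIMAL.finditer(peptide.upper())]


# Illustrative junction fragments (assumed for this sketch, not taken
# from the manuscript): one with a PRRA-type insert and one without.
fragments = {
    "with_PRRA_insert": "TNSPRRARSVAS",
    "without_insert": "TNSRSVAS",
}

for name, seq in fragments.items():
    print(name, find_furin_like_motifs(seq))
```

A pattern match only indicates that the minimal motif is present; whether furin actually cleaves at that site also depends on the surrounding structure and has to be established experimentally.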
The S protein of RmYN02 also has many deletions (e.g., residues 473 to 480 and 485 to 490) that do not exist in the closely related SARS-CoV-2 4.

SARS-CoV-2 RBD is not a high-frequency positive selection site

CoV genomes have hot spots with a high frequency of amino acid substitutions, also called positive selection sites, that favor host tropism, antibody resistance, or immune evasion 5. For example, analysis of the S protein RBD in HCoV-229E strains revealed the presence of 36 amino acid substitutions, suggesting that the RBD is a positive selection site and highlighting viral adaptation toward low pathogenicity 13.
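The idea of a high-frequency substitution site can be illustrated with a simple count over aligned sequences. The sketch below, in Python, tallies non-reference residues per site and reports sites where at least a chosen fraction of variants differ from the reference. The function names, the 0.5 threshold, and the short peptides are hypothetical and made up for illustration; they are not real RBD sequences. Note that this only counts substitutions: formal tests for positive selection rely on codon-based models such as dN/dS ratios, which are not shown here.

```python
from collections import Counter
from typing import Dict, List


def substitution_profile(reference: str, variants: List[str]) -> Dict[int, Counter]:
    """Map 1-based site -> Counter of non-reference residues seen at that site.

    Assumes all sequences are pre-aligned, gap-free, and of equal length.
    """
    if any(len(v) != len(reference) for v in variants):
        raise ValueError("all sequences must have the same aligned length")
    profile: Dict[int, Counter] = {}
    for variant in variants:
        for site, (ref_aa, var_aa) in enumerate(zip(reference, variant), start=1):
            if var_aa != ref_aa:
                profile.setdefault(site, Counter())[var_aa] += 1
    return profile


def high_frequency_sites(profile: Dict[int, Counter], n_variants: int,
                         threshold: float = 0.5) -> List[int]:
    """Sites where the fraction of variants differing from the reference
    is at least `threshold` (an arbitrary cut-off chosen for this sketch)."""
    return sorted(site for site, counts in profile.items()
                  if sum(counts.values()) / n_variants >= threshold)


# Toy, made-up peptides purely for illustration.
reference = "ACDEFGHIK"
variants = ["ACDEFGHIK", "ACNEFGHIK", "ACNEFGHLK", "ACNEYGHIK"]

profile = substitution_profile(reference, variants)
print(profile)
print(high_frequency_sites(profile, len(variants)))
```

In this toy alignment, only site 3 is substituted in most variants; sites changed in a single variant fall below the threshold.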