Domains and Functions of Spike Protein in SARS-Cov-2 in the Context of Vaccine Design
Total Page:16
File Type:pdf, Size:1020Kb
viruses Review Domains and Functions of Spike Protein in SARS-Cov-2 in the Context of Vaccine Design Xuhua Xia 1,2 1 Department of Biology, University of Ottawa, Marie-Curie Private, Ottawa, ON K1N 9A7, Canada; [email protected]; Tel.: +1-613-562-5718 2 Ottawa Institute of Systems Biology, University of Ottawa, Ottawa, ON K1H 8M5, Canada Abstract: The spike protein in SARS-CoV-2 (SARS-2-S) interacts with the human ACE2 receptor to gain entry into a cell to initiate infection. Both Pfizer/BioNTech’s BNT162b2 and Moderna’s mRNA-1273 vaccine candidates are based on stabilized mRNA encoding prefusion SARS-2-S that can be produced after the mRNA is delivered into the human cell and translated. SARS-2-S is cleaved into S1 and S2 subunits, with S1 serving the function of receptor-binding and S2 serving the function of membrane fusion. Here, I dissect in detail the various domains of SARS-2-S and their functions discovered through a variety of different experimental and theoretical approaches to build a foundation for a comprehensive mechanistic understanding of how SARS-2-S works to achieve its function of mediating cell entry and subsequent cell-to-cell transmission. The integration of structure and function of SARS-2-S in this review should enhance our understanding of the dynamic processes involving receptor binding, multiple cleavage events, membrane fusion, viral entry, as well as the emergence of new viral variants. I highlighted the relevance of structural domains and dynamics to vaccine development, and discussed reasons for the spike protein to be frequently featured in the conspiracy theory claiming that SARS-CoV-2 is artificially created. Keywords: COVID-19; spike protein; S-2P; SARS-CoV-2; cleavage; vaccine; protein structure; Citation: Xia, X. Domains and hydrophobicity; isoelectric point Functions of Spike Protein in SARS-Cov-2 in the Context of Vaccine Design. Viruses 2021, 13, 109. https://doi.org/10.3390/v13010109 1. Introduction Academic Editors: SARS-CoV-2 uses its trimeric spike protein for binding to host angiotensin-converting Kenneth Lundstrom and Alaa. A. enzyme 2 (ACE2) and for fusing with cell membrane to gain cell entry [1–4]. This is a A. Aljabali multi-step process involving three separate S protein cleavage events to prime the SARS-2-S Received: 15 December 2020 for interaction with ACE2 [2,3], and subsequent membrane fusion and cell entry. These Accepted: 12 January 2021 processes involve different domains of the S protein interacting with host cell and other Published: 14 January 2021 intracellular and extracellular components. Efficiency in each step could contribute to virulence and infectivity. Disrupting any of these steps could lead to medical cure. Publisher’s Note: MDPI stays neu- The domain structure is very similar between SARS-S (UniProtKB: P59594) and SARS- tral with regard to jurisdictional clai- 2-S (UniprotKB: P0DTC2). Both are cleaved to generate S1 and S2 subunits at specific ms in published maps and institutio- cleavage sites (Figure1A). S1 serves the function of receptor-binding and contains a signal nal affiliations. peptide (SP) at the N terminus, an N-terminal domain (NTD), and receptor-binding domain (RBD). S2 (Figure1A) functions in membrane fusion to facilitate cell entry, and it contains a fusion peptide (FP) domain, internal fusion peptide (IFP), two heptad-repeat domains (HR1 Copyright: © 2021 by the author. Li- and HR2), transmembrane domain, and a C-terminal domain [2,3,5–8]. However, there are censee MDPI, Basel, Switzerland. also significant differences between SARS-S and SARS-2-S. For example, the contact amino This article is an open access article acid sites between SARS-S and human ACE2 (hACE2) [5,7,9,10] differ from those between distributed under the terms and con- SARS-2-S and hACE2 [11–14]. This may explain why some antibodies that are effective ditions of the Creative Commons At- against SARS-S are not effective against SARS-2-S [4], especially those developed to target tribution (CC BY) license (https:// the ACE2 binding site of SARS-S [15]. In this article, numerous experiments on SARS-S are creativecommons.org/licenses/by/ considered to facilitate comparisons and to highlight differences between the two. 4.0/). Viruses 2021, 13, 109. https://doi.org/10.3390/v13010109 https://www.mdpi.com/journal/viruses Viruses 2021, 13, x FOR PEER REVIEW 2 of 17 Viruses 2021, 13, 109 2 of 16 (http://creativecommons.org/licenses experiments on SARS-S are considered to facilitate comparisons and to highlight differ- /by/4.0/). ences between the two. (A) S1: 14-667 S2: 668-1255 14-685 686-1273 (site 1) 697-1273 (site 2) NTD:14-290 RBD: 306-527 S2': 798-1255 FP CT IFP TM HR2 SP 14-303 319-541 HR1 816-1273 (B) 770-788 873-888 524CVNFNFN530 788-806 1145-1184 1196-1216 MFIFLLFLTLTSG 696 site2: cleavage 891-906 ||||||| 1163-1202 1214-1234 || || | | | 685 site1: cleavage 538CVNFNFN544 MFVFLVLLPLVSS SARS-S SARS-2-S (D) 287KCSVKSFEIDKGIYQTSNFRVVP309 3 815 site cleavage 924-972 942-990 2.4 ||: ||| :||||||||||| | STALGKL ASALGKL g 300KCTLKSFTVEKGIYQTSNFRVQP322 QDVVNQN QDVVNQN c d AQALNTL AQALNTL 1.9 (C) | f a 697 VKQLSSN VKQLSSN | b e 686 FGAISSV FGAISSV LNDILSR LNDILSR 1.4 | (E) 816 LDKVEAE LDKVEAE 0.9 0.4 Hydrophobicity -0.1 -0.6 -1.1 -1.6 1 201 401 601 801 1001 1201 (F) Window start Figure 1. Domain structure of SARS-S and SARS-2-S. (A) Key domains in SARS-S and SARS-2-S. FigureSP, signal 1.peptide;Domain NTD, N structure-terminal domain; of SARS-S RBD, receptor and-binding SARS-2-S. domain; (A FP,) Keyfusion domains peptide; in SARS-S and SARS-2-S. SP, signalIFP, internal peptide; fusion peptide; NTD, HR,N heptad-terminal repeats; domain; TM, transmembrane RBD, receptor-bindingdomain; CT, cytoplasmic domain; FP, fusion peptide; IFP, tail. The top and bottom numbers in each domain pertain to SARS-S and SARS-2-S, respectively. internalThe red arrows fusion indicate peptide; cleavage sites HR,, and heptad their numbers repeats; pertain TM, to SARS transmembrane-2-S; (B) Alignment domain; of CT, cytoplasmic tail. The topSP between and bottomSARS-S (top) numbers and SARS-2 in-S (bottom) each domain; (C,D) Alignment pertain of two to inter SARS-S-domain andsegments SARS-2-S,; respectively. The red (E) HR1 in SARS-S and SARS-2-S, together with the top view of a helix showing hydrophobic posi- arrowstions a and indicate d on the same cleavage side; (F) Hydrophobicity sites, and their plot generated numbers from pertain DAMBE [16]. to SARS-2-S; (B) Alignment of SP between SARS-S (top) and SARS-2-S (bottom); (C,D) Alignment of two inter-domain segments; (E) HR1 in 2. General Features of SARS-S and SARS-2S SARS-SSARS and-2-S is SARS-2-S, 1273 aa long together, in contrast with to 1255 the aa top in viewSARS-S. of Individual a helix showing protein do- hydrophobic positions a and d onmains the in samethe S protein side; (tendF) Hydrophobicity to fold independently plot and generated are associated from with DAMBE specific func- [16]. tions. The numbers (Figure 1A) that indicate the start/end of individual domains in SARS- S and SARS-2-S may mislead readers to think that the boundary is based on some clearly 2.recognizable General physiochemical Features oflandmarks. SARS-S In fact, and these SARS-2S numbers are for rough reference only. For example, the boundaries of RBD in SARS-S mainly result from experiments with differentSARS-2-S RNA clones is containing 1273 aa different long, inparts contrast of RBD [17 to–19]. 1255 The aa 5′ side in SARS-S. is delimited Individual protein domains inby the the site S where protein upstream tend mutations/deletions to fold independently do not affect receptor and binding are associated, but down- with specific functions. stream mutations/deletions do affect receptor binding. Similarly, the 3′ side is where up- Thestream numbers mutations/deletions (Figure affec1A)t receptor that binding indicate, but downstream the start/end mutations/deletions of individual domains in SARS-S anddo notSARS-2-S have an effect.may Boundaries mislead of some readers domains are to substantiated think that bythe protein boundary structure, is based on some clearly recognizablefor example, the boundaries physiochemical of RBD [11–14,20], landmarks. but some are Innot fact,substantiated these by numbers protein are for rough reference structure. only.Some For inter example,-domain segments the boundaries(Figure 1C,D) could of RBDbe much in more SARS-S conserved mainly than result from experiments withneighboring different domains. RNA For example, clones C822, containing D830, L831, and different C833 in SARS parts-S (correspond- of RBD [17–19]. The 50 side is de- limiteding to C840, by D848, the L849 site, and where C851 in upstream SARS-2-S) are mutations/deletions located between FP and IFP but do are not affect receptor binding, highly conserved and critically important for membrane fusion [21]. Similarly, V601 in 0 butSARS- downstreamS (corresponding to mutations/deletions V615 in SARS-2-S) does not belong do affect to any recognized receptor domain binding. Similarly, the 3 side is(Figure where 1A) but upstream is highly conserved. mutations/deletions Replacing it by G contributes affect receptor to viral escape binding, from but downstream muta- tions/deletions do not have an effect. Boundaries of some domains are substantiated by protein structure, for example, the boundaries of RBD [11–14,20], but some are not substantiated by protein structure. Some inter-domain segments (Figure1C,D) could be much more conserved than neighboring domains.