Targeted in Situ Cross-Linking Mass Spectrometry and Integrative Modeling Reveal the Architectures of Three Proteins from SARS-Cov-2

Targeted in Situ Cross-Linking Mass Spectrometry and Integrative Modeling Reveal the Architectures of Three Proteins from SARS-Cov-2

Targeted in situ cross-linking mass spectrometry and integrative modeling reveal the architectures of three proteins from SARS-CoV-2 Moriya Slavina,1, Joanna Zamela,1, Keren Zohara,1, Tsiona Eliyahua, Merav Braitbarda, Esther Briellea, Leah Barazb,c, Miri Stolovich-Rainc, Ahuva Friedmanc, Dana G. Wolfd, Alexander Rouvinskic, Michal Liniala,2, Dina Schneidman-Duhovnya,e,2, and Nir Kalismana,2 aDepartment of Biological Chemistry, Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem 9190401, Israel; bHadassah Academic College Jerusalem, Jerusalem 9101001, Israel; cDepartment of Microbiology and Molecular Genetics, Institute for Medical Research Israel-Canada, The Kuvin Center for the Study of Infectious and Tropical Diseases, The Hebrew University-Hadassah Medical School, The Hebrew University of Jerusalem, Jerusalem 9190401, Israel; dClinical Virology Unit, Hadassah Hebrew University Medical Center, 9190401 Jerusalem, Israel; and eThe Rachel and Selim Benin School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem 9190401, Israel Edited by Ivet Bahar, University of Pittsburgh School of Medicine, Pittsburgh, PA, and approved July 13, 2021 (received for review February 22, 2021) Atomic structures of several proteins from the coronavirus family In situ cross-linking and mass spectrometry (in situ CLMS) are still partial or unavailable. A possible reason for this gap is the allows probing protein structure inside intact cells (4, 5). In this instability of these proteins outside of the cellular context, thereby approach, cells are incubated with a membrane-permeable cross- prompting the use of in-cell approaches. In situ cross-linking and linking reagent, which reacts with the cellular proteins in their mass spectrometry (in situ CLMS) can provide information on the native environment. Following the chemical cross-linking, the structures of such proteins as they occur in the intact cell. Here, we cells are lysed and their protein content is analyzed by mass applied targeted in situ CLMS to structurally probe Nsp1, Nsp2, spectrometry (MS). Computational search can then identify from and nucleocapsid (N) proteins from severe acute respiratory syn- the MS data the pairs of residues that were covalently linked. drome coronavirus 2 (SARS-CoV-2) and obtained cross-link sets Because a link between two residues reports on their structural with an average density of one cross-link per 20 residues. We then proximity, the list of identified links is a rich resource for mod- employed integrative modeling that computationally combined eling protein structures and interactions (6–10). In situ CLMS BIOPHYSICS AND COMPUTATIONAL BIOLOGY the cross-linking data with domain structures to determine full- has progressed significantly in recent years, with applications on length atomic models. For the Nsp2, the cross-links report on a isolated organelles (11–15), bacteria (16, 17), human cells complex topology with long-range interactions. Integrative mod- (18–20), and heart tissue (21). eling with structural prediction of individual domains by the An inherent difficulty of in situ CLMS is the high complexity AlphaFold2 system allowed us to generate a single consistent of the initial cell lysates and subsequent tryptic digests. Two all-atom model of the full-length Nsp2. The model reveals three general strategies have been employed to reduce the complexity putative metal binding sites and suggests a role for Nsp2 in zinc prior to the MS analysis. One strategy enriches the cross-linked regulation within the replication–transcription complex. For the N peptides out of the total tryptic digest by either tagging the cross- protein, we identified multiple intra- and interdomain cross-links. linker itself (21) or by extensive chromatography (16, 17). The Our integrative model of the N dimer demonstrates that it can other strategy aims to purify a specific protein of interest out of accommodate three single RNA strands simultaneously, both ster- eochemically and electrostatically. For the Nsp1, cross-links with Significance the 40S ribosome were highly consistent with recent cryogenic electron microscopy structures. These results highlight the impor- We present a generic methodology that extracts structural tance of cellular context for the structural probing of recalcitrant data from living, intact cells for any protein of interest. Appli- proteins and demonstrate the effectiveness of targeted in situ cation of this methodology to different viral proteins resulted CLMS and integrative modeling. in significant cross-link sets that revealed the connectivity within their structures. Importantly, we show that these cross- structural biology | mass spectrometry | in-cell techniques | integrative link sets are detailed enough to enable the integrative mod- modeling eling of the full-length protein sequence. Consequently, we report the global structural organization of Nsp2 and the dimer he genome of severe acute respiratory syndrome coronavirus of the nucleocapsid protein. We foresee that similar applica- T2 (SARS-CoV-2) encodes 29 major proteins: 16 nonstruc- tions will be highly useful to study other recalcitrant proteins tural proteins (Nsp1 to Nsp16), 4 structural proteins (S, E, M, on which the mainstream structural approaches currently fail. and N), 9 major open reading frames, and several additional noncanonical gene products (1). Despite significant progress in Author contributions: M.S., J.Z., K.Z., A.R., M.L., D.S.-D., and N.K. designed research; M.S., J.Z., K.Z., and T.E. performed research; M.B., E.B., L.B., M.S.-R., A.F., D.G.W., A.R., and D.S.-D. viral protein structure determination (2), there are still gaps in contributed new reagents/analytic tools; M.S., J.Z., K.Z., M.B., D.S.-D., and N.K. analyzed the structural knowledge of several proteins from the coronavi- data; and M.S., J.Z., K.Z., M.L., D.S.-D., and N.K. wrote the paper. rus family (3). For example, the nucleocapsid (N) protein is known The authors declare no competing interest. to form multisubunit assemblies that were not yet resolved struc- This article is a PNAS Direct Submission. turally. A possible reason for these difficulties may be the insta- This open access article is distributed under Creative Commons Attribution License 4.0 bility of certain viral proteins in the in vitro state. In such cases, (CC BY). 1 purification procedures that are an integral part of mainstream M.S., J.Z., and K.Z. contributed equally to this work. 2 structural approaches (X-ray crystallography, NMR, and cryogenic To whom correspondence may be addressed. Email: [email protected], dina. [email protected], or [email protected]. electron microscopy [cryo-EM]) may cause the purified proteins to This article contains supporting information online at https://www.pnas.org/lookup/suppl/ disassemble, denature, or aggregate. To avoid such artifacts, in doi:10.1073/pnas.2103554118/-/DCSupplemental. situ techniques for structural studies are required. Published August 9, 2021. PNAS 2021 Vol. 118 No. 34 e2103554118 https://doi.org/10.1073/pnas.2103554118 | 1of9 Downloaded by guest on September 30, 2021 the cell lysate prior to digestion. Wang et al. (18, 19) effectively to the plate was considerably weaker. These observations are in used the second strategy to study the human proteasome by accordance with the known toxic role of Nsp1, which is mediated expressing several of its subunits with a biotin tag. We propose through a global translation inhibition (24). Following the Strep the term “targeted in situ CLMS” to describe the latter ap- purifications, proteomics analyses of the elutions detected the proach, which allows the user to focus the MS resources on a tagged proteins to be the most abundant by a large margin for all small set of predetermined proteins (targets). three proteins and for both the DSS (disuccinimidyl suberate) and In this work we used targeted in situ CLMS and integrative formaldehyde cross-linking reagents (Dataset S1). The Strep tag modeling (22, 23) to probe the structures of three SARS-CoV-2 performance was also satisfactory in two other aspects. First, it proteins: Nsp1, Nsp2, and the N protein. Our motivation for enabled repeatable purification yields ranging from ∼5 μg(N choosing these proteins is the incomplete knowledge on their protein) to 0.3 μg (Nsp1) from a single plate of cells (SI Appendix, structures and functions. For each protein we identified consid- Fig. S2), which were sufficient for the MS analyses. Second, the erable cross-link sets of in situ origin. Computational integration small Strep tag did not seem to interfere with the structures of the of the cross-links with additional structural information allowed expressed proteins (SI Appendix, Fig. S3). Overall, we conclude us to build almost complete models for Nsp2 and the N protein. that the Strep tag activity is largely unaffected by amine-reactive cross-linking and suggest it as a tag of choice for such pursuits. Results While this methodology is similar to the one introduced by Targeted In Situ CLMS to Study Viral Proteins. We employed a Wang et al. (18, 19), the current protocol was modified in three targeted strategy for in situ CLMS of viral proteins inside intact major aspects: 1) The incubation time of the cells with the cross- human cells. To that end, HEK293 cells were transfected with a linker was shortened to 20 min, rather than 60 min in the original plasmid of a selected viral protein fused to a Strep tag (Fig. 1). protocol, 2) we used a Strep tag rather than a biotin tag, which We then cross-linked the intact cells with a membrane-permeable allowed us to remove biotinylated proteins from the purification, cross-linker, washed away the excess cross-linker, lysed the cells, and 3) we established a transient transfection protocol rather and purified the viral protein via the Strep tag. The purification than producing a stable cell line. The transient transfection step greatly enriches and simplifies the sample for MS and in- provides the flexibility of expressing proteins that might be toxic creases the subsequent identification rate of cross-links on the to the cells, such in the case of Nsp1 expression.

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    9 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us