3092–3108 Nucleic Acids Research, 2021, Vol. 49, No. 6
Published online 8 March 2021
doi: 10.1093/nar/gkab119
Downloaded from https://academic.oup.com/nar/article/49/6/3092/6163084 by guest on 02 October 2021

De novo 3D models of SARS-CoV-2 RNA elements from consensus experimental secondary structures

Ramya Rangan1,†, Andrew M. Watkins2,†, Jose Chacon2, Rachael Kretsch1, Wipapat Kladwang2, Ivan N. Zheludev2, Jill Townley3, Mats Rynge4, Gregory Thain5 and Rhiju Das1,2,6,*

1 Biophysics Program, Stanford University, Stanford, CA 94305, USA
2 Department of Biochemistry, Stanford University School of Medicine, Stanford, CA 94305, USA
3 Eterna Massive Open Laboratory
4 Information Sciences Institute, University of Southern California, Marina Del Rey, CA 90292, USA
5 Department of Computer Sciences, University of Wisconsin–Madison, Madison, WI 53706, USA
6 Department of Physics, Stanford University, Stanford, CA 94305, USA

Received December 16, 2020; Revised February 08, 2021; Editorial Decision February 09, 2021; Accepted February 16, 2021

* To whom correspondence should be addressed. Tel: +1 650 723 597; Email: [email protected]
† The authors wish it to be known that, in their opinion, the first two authors should be regarded as Joint First Authors.

© The Author(s) 2021. Published by Oxford University Press on behalf of Nucleic Acids Research. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

ABSTRACT

The rapid spread of COVID-19 is motivating development of antivirals targeting conserved SARS-CoV-2 molecular machinery. The SARS-CoV-2 genome includes conserved RNA elements that offer potential small-molecule drug targets, but most of their 3D structures have not been experimentally characterized. Here, we provide a compilation of chemical mapping data from our and other labs, secondary structure models, and 3D model ensembles based on Rosetta's FARFAR2 algorithm for SARS-CoV-2 RNA regions including the individual stems SL1–8 in the extended 5′ UTR; the reverse complement of the 5′ UTR SL1–4; the frameshift stimulating element (FSE); and the extended pseudoknot, hypervariable region, and s2m of the 3′ UTR. For eleven of these elements (the stems in SL1–8, reverse complement of SL1–4, FSE, s2m and 3′ UTR pseudoknot), modeling convergence supports the accuracy of predicted low energy states; subsequent cryo-EM characterization of the FSE confirms modeling accuracy. To aid efforts to discover small molecule RNA binders guided by computational models, we provide a second set of similarly prepared models for RNA riboswitches that bind small molecules. Both datasets ('FARFAR2-SARS-CoV-2', https://github.com/DasLab/FARFAR2-SARS-CoV-2; and 'FARFAR2-Apo-Riboswitch', at https://github.com/DasLab/FARFAR2-Apo-Riboswitch) include up to 400 models for each RNA element, which may facilitate drug discovery approaches targeting dynamic ensembles of RNA molecules.

INTRODUCTION

The COVID-19 outbreak has rapidly spread through the world, presenting an urgent need for therapeutics targeting the betacoronavirus SARS-CoV-2. RNA-targeting antivirals have the potential to be effective against SARS-CoV-2, as the virus's RNA genome harbors conserved regions predicted to have stable secondary structures (1,2) that have been verified by chemical probing (3–8), some of which have been shown to be essential for the life cycle of related betacoronaviruses (9). Efforts to identify small molecules that target stereotyped 3D RNA folds have advanced over recent years (10), making RNA structures like those in SARS-CoV-2 potentially attractive targets for small molecule drugs.

Several RNA regions in betacoronavirus genomes, including the 5′ UTR, the frameshift stimulating element (FSE), and 3′ UTR, feature RNA structures with likely functional importance. These regions include a series of five conserved stem–loops in the 5′ UTR, the FSE along with a proposed dimerized state, a pseudoknot in the 3′ UTR proposed to form two structures, and the hypervariable region in the 3′ UTR, which includes an absolutely conserved octanucleotide and the stem–loop II-like motif ('s2m'). An NMR structure of stem–loop 2 in the 5′ UTR has been solved, adopting a canonical CUYG tetraloop fold (11). A crystal structure for s2m in the 3′ UTR has been solved for the original SARS virus, SARS-CoV-1 (12). Since reporting this work on the bioRxiv preprint server, structures for the FSE have been determined as an isolated RNA and in association with the ribosome through cryo-EM (13,14). Beyond these regions, however, 3D structures for RNA genome regions of SARS-CoV-2 or homologs have not been solved.

In advance of detailed experimental structural characterization, computational predictions for the 3D structural conformations adopted by conserved RNA elements may aid the search for RNA-targeting antivirals. Representative conformations from these RNA molecules' structural ensembles can serve as starting points for virtual screening of small-molecule drug candidates. For example, a computational model for the FSE of SARS-CoV-1 was used in a virtual screen to discover the small-molecule binder MTDB (15), and recently, SARS-CoV-2 models for 5′ UTR regions have been used for virtually docking small molecules (16). In other prior work by Stelzer et al. (17), virtual screening of a library of compounds against an ensemble of modeled RNA structures led to the de novo discovery of a set of small molecules that bound a structured element in HIV-1 (the transactivation response element, TAR). Such work motivates our modeling of not just a single 'native' structure but an ensemble of states for SARS-CoV-2 RNA regions. As with HIV-1 TAR, many of the SARS-CoV-2 elements are unlikely to adopt a single conformation but instead sample conformations from a heterogeneous ensemble. Furthermore, transitions among these conformations may be implicated in the viral life cycle, as RNA genome regions change long-range contacts with other RNA elements or form interactions with viral and host proteins at different steps of replication, translation, and packaging. A possible therapeutic strategy is therefore to find drugs that stabilize an RNA element in a particular conformation incompatible with conformational changes and/or changing interactions with biological partners at different stages of the complete viral replication cycle. Consistent with this hypothesis, prior genetic selection and mutagenesis experiments stabilizing single folds for stem–loops in the 5′ UTR and the pseudoknot in the 3′ UTR demonstrate that changes to these RNA elements' structural ensembles can prove lethal for viral replication (18–20).

[…] complement of SL1–4 in the 5′ UTR, the FSE and its dimerized form, the 3′ UTR pseudoknot, and the 3′ UTR hypervariable region, along with homology models of SL2 and s2m. The use of Rosetta's FARFAR2 is motivated by extensive testing: FARFAR2 has been benchmarked on all community-wide RNA-Puzzle modeling challenges to date (22–24), achieving accurate prediction of complex 3D RNA folds for ligand-binding riboswitches and aptamers, and producing models with 3–14 Å RMSD across six additional recent blind modeling challenges (21). For our SARS-CoV-2 study, the accuracy of our original de novo models for the FSE predicted in early 2020 has also been validated by subsequent cryo-EM, as described below. In addition to providing structural ensembles for SARS-CoV-2 RNA elements, we provide analogous FARFAR2 de novo and homology models for 10 riboswitch aptamers, providing a benchmark dataset for virtual screening approaches that make use of computational RNA models.

MATERIALS AND METHODS

Chemical reactivity experiments

We collected chemical reactivity profiles for SL1–4 and SL2–6 of the 5′ UTR, the reverse complement of SL1–4, and the hypervariable region of the 3′ UTR. The DNA templates for the stem–loop 1–4 RNA were amplified from a gBlock sequence for the extended 5′ UTR, and the DNA template for the hypervariable region was amplified from a gBlock sequence for the 3′ UTR. The SL2–6 construct was designed using the Primerize webserver (25) with built-in 5′ and 3′ 'reference hairpins' for signal normalization flanking the region of interest, and was built using PCR assembly following the Primerize protocol (primers and gBlock sequences ordered from Integrated DNA Technologies, sequences in Supplementary Table S5). For amplification off of gBlocks, primers were designed to add a Phi2.5 T7 RNA polymerase promoter sequence (26) (TTCTAATACGACTCACTATT) at the amplicon's 5′ end and a 20 bp Tail2 sequence (AAAGAAACAACAACAACAAC) at its 3′ end. The PCR reactions contained 5 ng of gBlock DNA template, 2 μM of forward and reverse primer, 0.2 mM of dNTPs, 2 units of Phusion DNA polymerase, and 1X of HF buffer. The reactions were first denatured at 98°C for 30 s. Then for 35 cycles, the samples were denatured at 98°C for 10 s, annealed at 64°C for 30 s, and extended at 72°C for 30 s. This was followed by an incubation at 72°C for 10 min for a final extension. Assembly products were verified for size via agarose gel electrophoresis and subsequently purified using Agencourt RNAClean XP beads. Purified DNA was quantified via NanoDrop (Thermo Scientific) and 8 pmol of purified DNA was then used for in vitro transcription with T7 TranscriptAid kits (Thermo Scientific). The resulting RNA was purified with Agencourt RNAClean XP
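The primer design described above for gBlock amplification (a Phi2.5 T7 promoter added at the amplicon's 5′ end and a Tail2 sequence at its 3′ end) can be illustrated with a short sketch. The two tail sequences are taken from the text. Everything else is an assumption made for illustration: the function names, the 20-nt annealing length, and the placeholder target sequence are hypothetical and do not come from the paper or from Primerize.

```python
# Illustrative sketch only: builds tailed PCR primers for a DNA target.
# The promoter and Tail2 sequences are quoted from the Methods text; the
# annealing length and the example target below are hypothetical.

T7_PHI25_PROMOTER = "TTCTAATACGACTCACTATT"  # added at the amplicon's 5' end
TAIL2 = "AAAGAAACAACAACAACAAC"              # added at the amplicon's 3' end

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def tailed_primers(target: str, anneal_len: int = 20) -> tuple[str, str]:
    """Return (forward, reverse) primers, both written 5'->3'.

    The forward primer is the promoter followed by the first `anneal_len`
    bases of the target. The reverse primer is the reverse complement of
    the last `anneal_len` bases of the target followed by Tail2, so that
    the sense strand of the product ends in Tail2.
    """
    target = target.upper()
    if set(target) - set("ACGT"):
        raise ValueError("target must contain only A, C, G, T")
    if len(target) < anneal_len:
        raise ValueError("target is shorter than the annealing length")
    forward = T7_PHI25_PROMOTER + target[:anneal_len]
    reverse = reverse_complement(target[-anneal_len:] + TAIL2)
    return forward, reverse


def expected_amplicon(target: str) -> str:
    """Sense strand of the expected PCR product: promoter + target + Tail2."""
    return T7_PHI25_PROMOTER + target.upper() + TAIL2


if __name__ == "__main__":
    # Hypothetical placeholder target, not a SARS-CoV-2 sequence.
    demo_target = "GGACATTCGGATCCTTAGCAAGCTTGACCTGATCGTAGGCATTCAGGT"
    fwd, rev = tailed_primers(demo_target)
    print("forward primer:", fwd)
    print("reverse primer:", rev)
    print("amplicon length:", len(expected_amplicon(demo_target)))
```

In practice the annealing length would be chosen per construct (for example to suit the annealing temperature used in the PCR), which this sketch does not attempt to model.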