Group II Intron Rnps and Reverse Transcriptases: from Retroelements to Research Tools
Total Page:16
File Type:pdf, Size:1020Kb
Downloaded from http://cshperspectives.cshlp.org/ on September 29, 2021 - Published by Cold Spring Harbor Laboratory Press Group II Intron RNPs and Reverse Transcriptases: From Retroelements to Research Tools Marlene Belfort1 and Alan M. Lambowitz2 1Department of Biological Sciences and RNA Institute, University at Albany, State University of New York, Albany, New York 12222 2Institute for Cellular and Molecular Biology and Department of Molecular Biosciences, University of Texas at Austin, Austin, Texas 78712 Correspondence: [email protected]; [email protected] SUMMARY Group II introns, self-splicing retrotransposons, serve as both targets of investigation into their structure, splicing, and retromobility and a source of tools for genome editing and RNA anal- ysis. Here, we describe the first cryo-electron microscopy (cryo-EM) structure determination, at 3.8–4.5 Å, of a group II intron ribozyme complexed with its encoded protein, containing a reverse transcriptase (RT), required for RNA splicing and retromobility. We also describe a method called RIG-seq using a retrotransposon indicator gene for high-throughput integration profiling of group II introns and other retrotransposons. Targetrons, RNA-guided gene targeting agents widely used for bacterial genome engineering, are described next. Finally, we detail thermostable group II intron RTs, which synthesize cDNAs with high accuracy and processiv- ity, for use in various RNA-seq applications and relate their properties to a 3.0-Å crystal structure of the protein poised for reverse transcription. Biological insights from these group II intron revelations are discussed. Outline 1 Introduction 5 Technique 4. RTs with novel enzymatic properties as potent research tools 2 Technique 1. Isolation of native RNPs for cryo-EM structure determination 6 Concluding remarks 3 Technique 2. Retromobility indicator genes References and RIG-seq for high-throughput integration profiling 4 Technique 3. Targetrons for bacterial gene targeting applications Editors: Thomas R. Cech, Joan A. Steitz, and John F. Atkins Additional Perspectives on RNA Worlds available at www.cshperspectives.org Copyright # 2019 Cold Spring Harbor Laboratory Press; all rights reserved; doi: 10.1101/cshperspect.a032375 Cite this article as Cold Spring Harb Perspect Biol 2019;11:a032375 1 Downloaded from http://cshperspectives.cshlp.org/ on September 29, 2021 - Published by Cold Spring Harbor Laboratory Press M. Belfort and A.M. Lambowitz 1 INTRODUCTION intron RNPs to the spliceosome (Fig. 2). High-resolution crystal structures of amino-terminal proteolytic fragments Group II introns are self-splicing RNAs and mobile retro- of group II intron RTs have also been reported, but with the elements that occupy a pivotal evolutionary niche as puta- RT active site region improperly folded (Zhao and Pyle tive ancestors of spliceosomal introns and the spliceosome, 2016). A recent 3.0-Å crystal structure of a full-length group retrotransposons, telomerase, and retroviruses (Lambowitz II intron RT in complex with an RNA-template–DNA and Belfort 2015; Novikova and Belfort 2017). They also are primer substrate and incoming dNTP revealed the active important genome editing tools (Enyeart et al. 2014) and structure of the enzyme for reverse transcription and a host the source of reverse transcriptases (RTs) that have broad of novel structural features that may contribute to the application in high-throughput characterization of RNAs distinctive enzymatic properties of group II intron RTs (Mohr et al. 2013b). These group II intron ribozymes cat- (Stamos et al. 2017). alyze their self-excision from RNA precursors, and ligation As progress was being made on discerning the biochem- of the flanking exons, with the aid of an intron-encoded 3 ical reactions and different functionalities of group II in- protein (IEP), which is an ancient form of RT (Fig. 1A) trons, they began to be exploited as genome editing tools (Lambowitz and Zimmerly 2011). The ribozyme comprises (Guo et al. 2000; Mohr et al. 2000; Karberg et al. 2001; a conserved, six-domain structure (DI–DVI) (Fig. 1B). Enyeart et al. 2014). The specificity of the IBS–EBS inter- Splicing occurs via two transesterification reactions, which ′ actions was harnessed to target introns to preprogrammed are initiated by the 2 -OH of a bulged adenosine nucleo- positions in a wide variety of bacterial genomes. These phile in DVI and result in ligated exons and an excised “targetrons” were then used for a range of genome editing intron lariat, in precise analogy to spliceosomal intron applications in bacteria. Subsequently, the robust properties splicing (Fig. 1A). The IEP functions as a “maturase” to of thermostable group II intron RTs (TGIRTs) were exploit- facilitate RNA splicing and reverse splicing of the excised ed for many different high-throughput RNA characteriza- intron RNA into a DNA site on the top strand. After endo- tion purposes (Mohr et al. 2013b). Thus, these key elements nuclease cleavage of the bottom strand, retrohoming pro- in evolution also provide a toolbox for genetic manipulation ceeds when the RT reverse transcribes the reverse-spliced and biochemical analysis. intron RNA into an intron cDNA that is incorporated into the host genome. Group II intron RTs have a DNA-binding domain (denoted D) and many also have a DNA endo- 2 TECHNIQUE 1. ISOLATION OF NATIVE RNPs FOR nuclease (EN) domain, both of which are involved in CRYO-EM STRUCTURE DETERMINATION retromobility. Cryo-EM structures were determined for the group II Structural studies have provided insight into the RNA Ll.LtrB intron from the ltrB gene of Lactococcus lactis.To component of the ribonucleoprotein (RNP) complex (Toor isolate RNPs from the native host, an elaborate expression et al. 2008; Robart et al. 2014; Costa et al. 2016). DI, the vector was constructed, whereby the intron RNA and the largest intron RNA domain, folds first. DI docks the active IEP were expressed under separate nisin-inducible promot- site domain, DV, and also base-pairs its exon-binding se- ers (Fig. 2A) (Gupta et al. 2014). The IEP (denoted LtrA quences (EBSs) with the intron-binding sequences (IBSs) in protein) was fused to an affinity tag, a chitin-binding do- the exons (Fig. 1B,C). These base-pairings determine the main (CBD), via an intein, to allow cleavage of the IEP from specificity of forward and reverse splicing reactions (Lam- a chitin-affinity column. After induction by nisin, a poly- bowitz and Zimmerly 2011; Marcia and Pyle 2012). DII and cyclic peptide, the L. lactis host cells containing the plasmid DIII contribute structurally to the catalytic complex, where- were lysed and the lysate applied to a chitin resin. To release as DIV encompasses an unstructured open reading frame the bound IEP and its associated intron RNA, the column (ORF) that encodes the RT, which in turn binds tightly to a was immersed in the reducing agent and nucleophile, di- stem-loop preceding the ORF (DIVa for the Ll.LtrB intron thiothreitol. For cryo-EM analysis, after purification by su- shown in Fig. 1B). How the IEP interacts with the intron crose gradient sedimentation, residual genomic DNA was RNA remained elusive until a 3.8–4.5 Å cryo-electron mi- digested with DNase and the pure RNP was recovered by gel croscopy (cryo-EM) structure of the RNP was solved (Qu filtration (Qu et al. 2016). et al. 2016). The RNP complex, purified from native host For EM, the RNP complex was prepared for negative cells, revealed its major structural features including that of staining or frozen-hydrated for cryo-EM (Qu et al. 2016). In the RT and also underscored the relationship of the group II determining the structure of the RNP, negative staining was followed by random conical tilt reconstruction, in which 3We use RT and IEP interchangeably, in keeping with nomenclature for other two major class averages were established. Then micro- RTs that have multiple activities (e.g., RNase H). graphs of frozen-hydrated intron RNP particles were 2 Cite this article as Cold Spring Harb Perspect Biol 2019;11:a032375 Downloaded from http://cshperspectives.cshlp.org/ on September 29, 2021 - Published by Cold Spring Harbor Laboratory Press Group II Intron RNPs and Reverse Transcriptases A Precursor Splicing Intron lariat (1st step) IEP 5′ IEP A 3′ A 3′ 5′ OH Splicing (2nd step) Transcription IEP 5′ 3′ + IEP A Ligated exons Retro- homing 5′ 3′ 5′ 3′ 3′ 5′ 3′ 5′ cDNA Target DNA B δ ld2 EBS1 ld3 S A D U G ld1 EBS2 DIV DII DIII IVa ld (iii)a IVb Triad (ii)a lc2 IBS1 A DV la δ′ DI lb IBS2 3′ DVI 5′ lc1 C 3′ lntron 5′ δ EBS2 EBS1 GUGU A A GUGUUG GU CA CA U AU 3′ 5′ C CA UA A C C δ′ IBS2 IBS1 Exons ′ Figure 1. Group II intron structure and function. (A) Intron life cycle. Splicing is initiated by attack on the 5 splice site by the 2′-OH of the bulged adenosine (circled A) in DVI to form a lariat intermediate (first step). Attack on the 3′ splice site by the 3′-OH of the upstream exon results in ligated exons and the intron lariat (second step). The intron lariat can reverse splice into target DNA (retrohoming). The endonuclease of the intron-encoded protein (IEP) then nicks the opposite DNA strand, and the reverse transcriptase (RT) uses the cleaved DNA strand as a primer to make a cDNA copy of the integrated intron. After degradation of the intron RNA, second-strand DNA synthesis and repair, leading to a fully integrated DNA copy of the intron in the genome (not shown), transcription can ensue to complete the cycle. (B) Secondary structure of a group II intron. The six domains (DI–DVI) radiating from a central hub are labeled along with various subdomains, the “exon-binding sequence”–“intron-binding sequence” (EBS–IBS) pair- ings, the IEP anchor site in DIV (dashed green box), the catalytic triad in DV, and the bulged adenosine in DVI.