Enhanced group II intron retrohoming in magnesium-deficient Escherichia coli via selection of mutations in the ribozyme core

David M. Truong1, David J. Sidote, Rick Russell, and Alan M. Lambowitz1

Institute for Cellular and Molecular Biology, Departments of Molecular Biosciences and Chemistry, University of Texas at Austin, Austin, TX 78712

Contributed by Alan M. Lambowitz, August 22, 2013 (sent for review July 5, 2013)

Mobile group II introns are bacterial retrotransposons thought to be evolutionary ancestors of spliceosomal introns and retroelements in eukaryotes. They consist of a catalytically active intron RNA (“ribozyme”) and an intron-encoded reverse transcriptase, which function together to promote RNA splicing and intron mobility via reverse splicing of the intron RNA into new DNA sites (“retrohoming”). Although group II introns are active in bacteria, their natural hosts, they function inefficiently in eukaryotes, where lower free Mg2+ concentrations decrease their ribozyme activity and constitute a natural barrier to group II intron proliferation within nuclear genomes. Here, we show that retrohoming of the Ll.LtrB group II intron is strongly inhibited in an Escherichia coli mutant lacking the Mg2+ transporter MgtA, and we use this system to select mutations in catalytic core domain V (DV) that partially rescue retrohoming at low Mg2+ concentrations. We thus identified mutations in the distal stem of DV that increase retrohoming efficiency in the MgtA mutant up to 22-fold. Biochemical assays of splicing and reverse splicing indicate that the mutations increase the fraction of intron RNA that folds into an active conformation at low Mg2+ concentrations, and terbium-cleavage assays suggest that this increase is due to enhanced Mg2+ binding to the distal stem of DV. Our findings indicate that DV is involved in a critical Mg2+-dependent RNA folding step in group II introns and demonstrate the feasibility of selecting intron variants that function more efficiently at low Mg2+ concentrations, with implications for evolution and potential applications in gene targeting.

directed evolution | Mg2+ transport | RNA structure

Significance

Mobile group II introns are bacterial retrotransposons. They consist of an autocatalytic intron RNA (“ribozyme”) and an intron-encoded reverse transcriptase and were likely ancestors of spliceosomal introns and retroelements in eukaryotes. Although active in bacteria, group II introns function inefficiently in eukaryotes, where lower Mg2+ concentrations decrease their ribozyme activity and constitute a natural barrier to group II intron proliferation within nuclear genomes. By using an Escherichia coli Mg2+-transport mutant, we selected mutations near the intron RNA’s active site that enhance group II intron function at low Mg2+ concentrations. Our results have implications for ribozyme mechanisms, evolution, and gene targeting.

Author contributions: D.M.T. and A.M.L. designed research; D.M.T. and D.J.S. performed research; D.M.T., D.J.S., R.R., and A.M.L. analyzed data; and D.M.T., D.J.S., R.R., and A.M.L. wrote the paper.

The authors declare no conflict of interest.

1To whom correspondence may be addressed. E-mail: [email protected] or [email protected].

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1315742110/-/DCSupplemental.

Mobile group II introns are retrotransposons that are found in prokaryotes and the mitochondrial and chloroplast DNAs of some eukaryotes and are thought to be evolutionary ancestors of spliceosomal introns, the spliceosome, retrotransposons, and retroviruses in higher organisms (1). They consist of two components, an autocatalytic intron RNA (“ribozyme”) and an intron-encoded protein (IEP) with reverse transcriptase activity, that function together in a ribonucleoprotein (RNP) complex to promote RNA splicing and site-specific integration of the intron into new DNA sites in a process called retrohoming (1). Like spliceosomal introns, group II introns splice via two sequential transesterification reactions that yield an excised intron lariat RNA (2). For group II introns, the splicing reactions are catalyzed by the intron RNA with the assistance of the IEP, which binds specifically to the intron RNA and stabilizes the catalytically active RNA structure. The IEP then remains bound to the excised intron lariat RNA in an RNP that promotes retrohoming via reverse splicing of the intron RNA directly into DNA sites followed by reverse transcription by the IEP. The resulting intron cDNA is integrated into the genome by host enzymes (3, 4). The ribozyme-based, site-specific DNA integration mechanism used by group II introns enabled their development into gene targeting vectors (“targetrons”), which combine high integration efficiency with high and readily programmable DNA target specificity (5). However, although group II introns retrohome and function efficiently for gene targeting in bacteria, their natural hosts, they do so inefficiently in eukaryotes, at least in part owing to lower free Mg2+ concentrations (6), which decrease group II intron ribozyme activity (discussed below). These lower Mg2+ concentrations constitute a natural barrier that impedes group II introns from invading the nuclear genomes of present-day eukaryotes and limits their utility as gene targeting vectors for higher organisms.

Recent X-ray crystal structures of a group II intron RNA provide a structural framework for investigating group II intron splicing and retrohoming mechanisms and potentially for improving their function in gene targeting (7–9). Group II intron RNAs consist of six conserved domains (denoted DI–DVI) that interact via tertiary contacts to fold the RNA into a catalytically active 3D structure (Fig. 1A) (1). DV is a small conserved domain that binds catalytic metal ions and interacts with DI and J2/3 to form the intron RNA’s active site. It is thought to be the cognate of the U2/U6 snRNAs of the spliceosome, and consequently its architecture and function are central to understanding the mechanism and evolution of RNA splicing in higher organisms (10, 11). DI, the largest domain, provides a structural scaffold for the assembly of the other domains and contains exon-binding sites that position the 5′- and 3′-splice sites and ligated-exon junction at the ribozyme active site for RNA splicing and reverse splicing reactions. DIII functions as a catalytic effector, DIV is the location of the ORF encoding the IEP, and DVI contains the branch-point adenosine used for lariat formation.

Three major structural subclasses of group II intron RNAs, denoted IIA, IIB, and IIC, have been identified with differences in both peripheral and active-site elements (1). X-ray crystal structures of the Oceanobacillus iheyensis group IIC intron reveal the folded structure of DI-V, with and without bound exons (7–9).

E3800–E3809 | PNAS | Published online September 16, 2013 | www.pnas.org/cgi/doi/10.1073/pnas.1315742110

Although the O. iheyensis structures do not include DVI and the intron RNA construct lacks the ability to self-splice via lariat formation, they do show the active-site region formed by DI and DV, which can catalyze splicing via 5′-splice-site hydrolysis rather than branching. Importantly, the structures show that this active site contains a heteronuclear metal ion-binding center consisting of two pairs of Mg2+ and K+ ions bound site-specifically to DV, one pair (M1/K1) to a catalytic triad of three highly conserved nucleotides in the proximal stem of DV, and the other pair (M2/K2) to a dinucleotide bulge at the hinge region between the proximal and distal stems (Fig. 1 B and C). These two pairs of metal ions are brought together at the active site by a sharp bend in DV that moves the distal stem toward the catalytic triad. This sharp bend is stabilized both by bound metal ions bridging different regions of the phosphodiester backbone and by interactions between DV and other regions of the intron RNA (7). By contrast, structures of DV of the yeast aI5γ and Pylaiella littoralis LSU/2 group II introns in isolation showed an extended helix with many of the same metal ion-binding sites but without the bend (12, 13). Besides the catalytic metal ions, several other Mg2+ ions are seen at different sites in DV in the O. iheyensis RNA crystal structures. Three putative Mg2+ ions lie within the major and minor grooves near the κ′ motif in the proximal stem, and three others (denoted here as M3–M5) are bound to the distal stem, one (M3) within a kink adjacent to the G of the GNRA tetraloop, and another (M4) at the R of the GNRA tetraloop. The third Mg2+ bound to the distal stem (M5) forms a bridge between the base pair distal to λ′ and the third nucleotide of the catalytic triad, potentially stabilizing the sharp bend in DV. These additional Mg2+-binding sites in the O. iheyensis intron are generally consistent with the locations of terbium-cleavage sites in DV of the aI5γ intron (14).

Fig. 1. Group II intron RNA and DV structures. (A) Secondary structure model of the Ll.LtrB-ΔORF group II intron, with DV highlighted in red. Greek letters indicate sequence elements involved in long-range tertiary interactions (dashed red lines). The locations of the ApaLI, KpnI, and MluI sites used in library construction and the inserted phage T7 promoter sequence (PT7) used for genetic selections are indicated. (B) Metal ions bound to DV in an ensemble of 15 superposed X-ray crystal structures of the O. iheyensis group IIC intron (3BWP, 3EOG, 3EOH, 3IGI, 4E8M, 4E8P, 4E8Q, 4E8R, 4E8V, 4FAQ, 4FAR, 4FAU, 4FAW, 4FAX, and 4FB0). Mg2+ and K+ ions are shown as blue and violet spheres, respectively. The Mg2+ and K+ ions that comprise the heteronuclear metal ion cluster at the active site are numbered M1/K1 and M2/K2, and the three Mg2+ ions bound to the distal stem of DV are numbered M3, M4, and M5. (C) Secondary structure diagrams of DV of the O. iheyensis and L. lactis Ll.LtrB group II introns showing the locations of Mg2+ and K+ ions identified in the ensemble of O. iheyensis crystal structures of panel B. Ions shown outside the secondary structure are bound to the phosphodiester backbone, and those shown inside are bound within the helix in the O. iheyensis intron structures. A range of possible locations (red arrows) is shown for M5 in the Ll.LtrB intron owing to the different lengths of the distal stem in subgroup IIA introns (Ll.LtrB; 5 bp) and IIC introns (O. iheyensis; 3 bp) (1). Dashed red lines show regions of the phosphodiester backbone that are bridged by bound Mg2+ ions in the O. iheyensis structures. Abbreviations: CT, catalytic triad; H, sugar-edge/Hoogsteen contact with I(i) loop in domain I.

Like other ribozymes, group II introns use Mg2+ for both RNA folding and catalysis (15, 16). However, the Mg2+ concentrations required for group II intron function are higher than those for other ribozymes. Thus, mutations in the yeast mitochondrial Mg2+ transporter Mrs2 specifically inhibit the splicing of all four yeast mt group II introns, while having minimal effects on the transcription or splicing of group I introns (17). Moreover, efficient retrohoming in Xenopus laevis oocytes, Drosophila melanogaster embryos, and zebrafish (Danio rerio) embryos requires the coinjection of additional Mg2+ to achieve an intracellular concentration of 5–10 mM (6). Bacteria, the natural hosts of group II introns, typically have free intracellular Mg2+ concentrations of 1–4 mM (18), whereas X. laevis oocyte nuclei contain 0.3 mM Mg2+ (19), and mammalian cells contain 0.2–1 mM Mg2+ during the majority of the cell cycle (20). The latter values are well below the optimal Mg2+ concentrations for protein-assisted group II intron RNA splicing and reverse splicing (5–10 mM) (21). For some ribozymes, it has been possible to select new variants that function at lower Mg2+ concentrations or use different metal ions (22–24). However, the extent to which group II introns can be similarly evolved or engineered to decrease their Mg2+ dependence is unknown.

Here, we used an E. coli mutant with a disruption in the magnesium transporter gene mgtA for in vivo selection of group II intron variants with enhanced function at lower Mg2+ concentrations from libraries of Lactococcus lactis Ll.LtrB introns with mutations in DV. We thus identified 43 improved variants that have greater than threefold increased retrohoming efficiency within the mgtA disruptant. We find that all of these improved variants have mutations in the distal stem of DV, with the two most active variants (16- and 22-fold increased) having mutations restricted to the distal stem. Biochemical analysis demonstrates that the enhanced retrohoming of these variants reflects a higher fraction of intron RNAs that folds into an active structure at low Mg2+ concentrations, and terbium-cleavage assays on isolated DV indicate that the mutations enhance Mg2+ binding to sites in the distal stem. Together, our results suggest a model in which the variants achieve more efficient retrohoming and splicing by enhancing Mg2+ binding to the distal stem of DV and promoting a conformational change that is required for formation of an active RNA structure, perhaps the sharp bending of DV needed to form the heteronuclear metal ion center at the active site.

Results

E. coli Mutants with Defects in Mg2+ Transport Are Deficient in Group II Intron Retrohoming. Because dysfunction of mitochondrial Mg2+ transporters in Saccharomyces cerevisiae inhibits the splicing of endogenous group II introns (17), we reasoned that disrupting Mg2+ transporters should similarly impair the function of the Ll.LtrB intron in E. coli. The resulting Mg2+-deficient intracellular environment could then be used to select DV variants with a decreased Mg2+ requirement. E. coli relies on two major Mg2+ transporters, CorA and MgtA (25), and the Salmonella homolog of CorA can rescue a group II intron splicing defect in

a yeast mutant with a disruption of the mitochondrial Mg2+ transporter Mrs2 (26).

We determined the retrohoming efficiency of the Ll.LtrB intron in E. coli mutants with targetron-mediated disruptions of the corA and mgtA genes (27) by using a plasmid-based assay (28) (Fig. 2A). In this assay, a precursor RNA containing an Ll.LtrB-ΔORF intron (i.e., an Ll.LtrB intron deleted for the ORF encoding the IEP) with a phage T7 promoter sequence inserted near its 3′ end is expressed in tandem with the IEP (denoted LtrA protein) from a CapR donor plasmid. The LtrA protein promotes the splicing and retrohoming of the Ll.LtrB-ΔORF intron carrying the T7 promoter into a target site cloned in an AmpR recipient plasmid upstream of a promoterless tetracycline-resistance (tetR) gene, thereby activating that gene. After plating on LB medium containing antibiotics, retrohoming efficiencies are quantified as the ratio of (TetR + AmpR)/AmpR colonies.

The retrohoming efficiency of the Ll.LtrB-ΔORF intron in the wild-type E. coli HMS174(DE3) strain was 48% and was decreased approximately twofold in the corA disruptant and 200-fold in the mgtA disruptant (Fig. 2B). Measurements of Mg2+ concentration with the fluorescent probe mag-fura-2 (29) showed that exponentially grown wild-type HMS174(DE3) has an intracellular free Mg2+ concentration of 3.1 mM, in the range found previously for E. coli (18), whereas the corA and mgtA disruptants have lower but similar Mg2+ concentrations (1.6 and 1.7 mM, respectively; Fig. 2B). The much stronger inhibition of retrohoming in the mgtA disruptant may reflect that the fluorescent probe does not accurately measure the effective Mg2+ concentration during group II intron homing in exponentially growing cells, either because of the nongrowth conditions needed to permeabilize the cells to the fluorescent probe or because of differences in the local Mg2+ concentrations in regions of the cell in which retrohoming occurs (Discussion). In the experiments below, we used the mgtA disruptant to provide a stringent environment for the selection of group II intron variants with improved function at lower Mg2+ concentrations.

Selection of Functional DV Variants from a Partially Randomized (“Doped”) Library in an E. coli mgtA Disruptant. We used the plasmid-based mobility system described above to select variants of the Ll.LtrB-ΔORF intron that are active in retrohoming within the mgtA disruptant. The selection was done from a library of ∼10^9 intron variants in which a 36-nt region encompassing DV was partially randomized (“doped”) with 70% of the wild-type nucleotides and 10% of each of the three other nucleotides at each position. After two rounds of selection, we used colony PCR to amplify and sequence DVs from ∼300 individual TetR colonies and identified 106 unique DV sequences that had retrohomed in the mgtA disruptant (Fig. S1). These variants contained one to nine mutations, with an average of 2.6 mutations over the 36-nt randomized region. From these sequences, we generated a mutational map displaying regions amenable or refractory to mutagenesis (Fig. 3A). As expected, most of the mutants maintained the DV secondary structure and had mutations outside of regions known to be important for RNA catalysis and tertiary-structure formation. The most conserved regions were those containing the catalytic triad, the κ′ motif, and a conserved G involved in a trans sugar-edge/Hoogsteen contact to the I(i) loop (11). Within the catalytic triad, the first two nucleotides (A and G) and the central G-U base pair were invariant, and the third nucleotide was mutated in only a single variant with strongly decreased retrohoming efficiency (DV93; C → G with a compensatory G → U mutation in the opposite nucleotide; Fig. S1). Only three other nucleotides were invariant: the G immediately preceding the catalytic triad, a G that base pairs with a U or C in the λ′ motif, and the A in the terminal GNRA tetraloop. The most mutable regions were the three terminal base pairs of the distal stem, a nucleotide between the catalytic triad and sugar-edge/Hoogsteen contact, and the dinucleotide bulge, which is a major Mg2+-binding site in the O. iheyensis group IIC and aI5γ group IIB introns (7, 14).

Characteristics of Variants with Increased Retrohoming Efficiency in the mgtA Disruptant. To determine which variants have increased retrohoming efficiency in the mgtA disruptant, we screened all 106 unique variants recovered from the selection by using a high-throughput version of the plasmid-based mobility assay (Fig. S1B). This screen was done in 96-well deep culture plates and measured the number of TetR cells via OD595 after 2-d growth under selective conditions. We thus identified 30 candidates that had mobility efficiencies equal to or higher than the wild-type intron in the mgtA disruptant (candidate map, Figs. 3B and 4). We tested all 30 of these candidates by plating assays and found 20 DV variants (denoted “improved variants”) that performed significantly (greater than threefold) better than the wild-type DV sequence in the mgtA disruptant, with one (DV20) reaching 22-fold enhancement (improved map, Figs. 3C and 4).

The 30 candidate variants had relatively few mutations, with a maximum of seven (Fig. 4). Most mutations were found in the distal stem of DV, but a small number were found in the proximal stem outside the catalytic triad. Most of the mutant DVs retained a fully wild-type secondary structure, but 11 mutant DVs, with up to sixfold increased retrohoming efficiency, had U-U or A-C mispairs in the distal stem (DV88, DV54, DV36, DV35, DV49, DV104, DV46, DV44, DV94, DV7, and DV32; Fig. 4). A number of the lower-performing improved variants contained mutations in the dinucleotide bulge, which binds Mg2+

Fig. 2. E. coli selection system for variants of the Ll.LtrB intron with mutations in DV that increase retrohoming efficiency at low Mg2+ concentrations. (A) Plasmid-based group II intron retrohoming system used to select active DV variants in the E. coli mgtA disruptant. The CapR intron-donor plasmid pACD2 uses a T7lac promoter to produce group II intron RNPs consisting of the Ll.LtrB-ΔORF intron lariat RNA and the IEP, denoted LtrA protein. The intron carries a phage T7 promoter sequence within DIV. The AmpR-recipient plasmid contains an Ll.LtrB target site (ligated E1-E2 sequence of the ltrB gene from position −30 upstream to +15 downstream of the intron-insertion site) preceding a promoterless tetR marker. Retrohoming of the Ll.LtrB-ΔORF lariat RNA into the target site introduces the T7 promoter needed for tetR expression, enabling selection for TetR colonies. (B) Retrohoming efficiency of the Ll.LtrB-ΔORF intron in E. coli wild-type (WT) HMS174(DE3) and mutants with disruptions in the corA and mgtA genes encoding Mg2+ transporters. Retrohoming efficiencies were determined as the ratio of (TetR + AmpR)/AmpR colonies in the plasmid-based retrohoming assay, and intracellular free Mg2+ concentrations ([Mg2+]i) were measured by using the fluorescent probe mag-fura-2. The values shown are the mean ± the SEM for three determinations, with the range of Mg2+ concentrations in different experiments also indicated.

[Panel A diagram labels: Donor Plasmid pACD2 (capR); Recipient Plasmid pBRR3-ltrB (ampR); Mobility Product (ampR).]

Panel B:

Strain            Retrohoming efficiency (%)   [Mg2+]i (mM)   [Mg2+]i range (mM)
WT (HMS174)       48 ± 3                       3.1 ± 0.4      2.7–3.9
corA disruptant   27 ± 2                       1.6 ± 0.1      1.4–1.7
mgtA disruptant   0.24 ± 0.02                  1.7 ± 0.1      1.6–1.8
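The arithmetic behind the retrohoming efficiencies in Fig. 2B can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis code: the function names and the colony counts are hypothetical, and only the mean efficiencies (48%, 27%, and 0.24%) are taken from the reported values.

```python
def retrohoming_efficiency(tet_amp_colonies, amp_colonies):
    """Percent of AmpR recipient-plasmid colonies that are also TetR."""
    if amp_colonies <= 0:
        raise ValueError("need at least one AmpR colony")
    return 100.0 * tet_amp_colonies / amp_colonies


def fold_decrease(wild_type_pct, mutant_pct):
    """How many times lower the mutant efficiency is than wild type."""
    return wild_type_pct / mutant_pct


# Hypothetical colony counts, chosen only to illustrate the ratio.
example_pct = retrohoming_efficiency(tet_amp_colonies=96, amp_colonies=200)

# Mean retrohoming efficiencies (%) as reported in Fig. 2B.
reported = {"WT": 48.0, "corA": 27.0, "mgtA": 0.24}
folds = {
    strain: fold_decrease(reported["WT"], pct)
    for strain, pct in reported.items()
    if strain != "WT"
}

for strain, fold in folds.items():
    print(f"{strain} disruptant: {fold:.1f}-fold lower than wild type")
```

With the reported means, the ratios come out at about 1.8 for the corA disruptant and 200 for the mgtA disruptant, which is the "approximately twofold" and "200-fold" decrease described in the text.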

(DV100, DV99, DV88, and DV54; one- to twofold increased). All 20 improved variants had mutations in the distal stem, and several had only a single mutation in the distal stem (DV49, DV32, DV59, and DV14; 4- to 16-fold increased). Two of the improved variants also had altered tetraloops (GAAA; DV62 and DV37; five- to sixfold). The two most improved variants DV14 and DV20 (16- and 22-fold, respectively) had mutations confined to the distal stem (Fig. 3 D and E). Notably, both DV14 and DV20 have additional U-G or G-U wobble pairs in the distal stem, and the single base mutation in DV14 results in two tandem U-G pairs. G-U wobble pairs within RNA stems are frequently Mg2+-binding sites owing to the high electronegativity along their major groove surface, and two tandem G-U pairs have been shown to be a strong Mg2+-binding site in RNA stems (30, 31).

Saturation Mutagenesis of Mutable Positions in DV. We performed two more saturating selections focused on nucleotide residues in DV that were found to be mutable in the initial selection. As before, we used the plasmid-based mobility system to select variants of the Ll.LtrB-ΔORF intron that enhance retrohoming within the mgtA disruptant relative to the wild-type intron. For each saturating selection, we limited the number of randomized nucleotides to <12 to ensure that all possible combinations of variants could be sampled in libraries of reasonable size.

Our first saturating selection focused on the mutable nucleotides found in the 20 improved variants from the initial selection (Fig. S2). These 11 nucleotides included the G at the R position in the GNRA tetraloop, the three terminal base pairs of the distal stem, and two base pairs in the proximal stem between the catalytic triad and the sugar-edge/Hoogsteen contact (Fig. 3C). After five rounds of selection, we identified 23 unique DV sequences, which were not found among the improved variants in the initial selection (Fig. S2). All 23 of these variants had mutations in the distal stem of DV, and 18 had an additional U-G base pair in the proximal stem between the catalytic triad and the sugar-edge/Hoogsteen contact. When tested individually by using the plasmid-based plating assay, all 23 of the new variants outperformed the wild-type DV sequence by greater than threefold, with the three best (DV134, DV193, and DV164) having 15- to 16-fold higher retrohoming efficiency than the wild-type intron (Fig. S2). However, these new variants performed no better than the two best variants, DV14 and DV20, from the initial selection.

The second saturating selection (Fig. S3) was based on the sequence of the best-performing variant from the initial selection, DV20 (22-fold increased), which contains three nucleotide changes in the distal stem of DV. We constructed a library of

Fig. 3. DV variants obtained in a selection for increased retrohoming efficiency in the mgtA disruptant from a library of Ll.LtrB introns in which DV was partially randomized. The selection was done by using the plasmid-based retrohoming system (Fig. 2) with a library of ∼10^9 Ll.LtrB-ΔORF intron variants in which DV (intron positions 844–879) was partially randomized with 70% of the wild-type nucleotide and 10% of each mutant nucleotide at each position. (A) DV mutational map for the 106 active variants obtained after two rounds of selection. Nucleotides that were invariant in the selection are indicated in bold, and those with one or two mutations are shown with no shading. Nucleotides with larger numbers of mutations are highlighted according to a color scale shown in the figure. Nucleotides are indicated as follows: B = C, G, U; D = A, G, U; H = A, C, U; K = U, G; n = all; M = A, C; R = A, G; S = C, G; Y = C, U. Base pairs for which compensatory base changes were found in the selections are shown in red. Nucleotide sequences and retrohoming efficiencies for the 106 variants are shown in Fig. S1. (B and C) Mutational maps of DV for the 30 candidate variants having high retrohoming efficiencies in the high-throughput retrohoming assay (B) and the 20 improved variants confirmed in plating assays to have at least threefold higher retrohoming efficiency than the wild-type intron in the mgtA disruptant (C). Numbers of mutations at each position are indicated by color highlighting according to the scales shown for each panel. Nucleotide sequences of the 30 candidate mutants including the 20 improved variants are shown in Fig. 4. (D and E) DVs of the top performing selected variants DV14 (16-fold) and DV20 (22-fold), respectively. (F) An additional variant (1.3-fold) containing nucleotide substitutions found in the distal stem of the P. littoralis LSU/2 intron, which self-splices at lower Mg2+ concentrations than other group II introns (32). Nucleotides that differ from those in the wild-type Ll.LtrB intron are shown in red, and red bars show changed numbers of hydrogen bonds between base pairs.

Fig. 4. DV sequences of variants with increased retrohoming efficiency in the mgtA disruptant selected from an Ll.LtrB intron library in which DV was partially randomized. The figure shows sequences and retrohoming efficiencies determined by plating assay for the 30 DV variants found by the high-throughput 96-well plate assay to have retrohoming frequencies equal to or greater than that of the wild-type (WT) intron in the mgtA disruptant (Fig. S1). Variants with a retrohoming efficiency greater than threefold higher than that of the wild-type intron in the plating assay are denoted as improved variants and are demarcated in the figure. Blue shading indicates a match to the wild-type sequence, and red boxes indicate mispairings. The number of mutations found in each variant is indicated to the left, and the retrohoming efficiency and fold increase relative to the wild-type intron determined in parallel in the plating assay (mean ± SEM for three determinations) are indicated to the right. The ranges reflect day-to-day variability in the retrohoming assays, which affects both wild-type and mutant retrohoming efficiency. The percentage of wild-type nucleotides at each position and a consensus sequence for the selected variants are shown at the bottom.
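The library designs described in the text and in the Fig. 3 legend lend themselves to a quick numerical check. The sketch below rests on simple assumptions that are not stated in the source: positions in the doped region mutate independently, each with a 30% chance of carrying a non-wild-type nucleotide, and each fully randomized position can take any of four nucleotides. The function names are illustrative only.

```python
from math import comb


def doped_mutation_distribution(length=36, wt_fraction=0.70):
    """Binomial P(k mutations) for a doped region with independent positions."""
    p = 1.0 - wt_fraction  # per-position probability of a non-wild-type base
    return [
        comb(length, k) * p**k * (1.0 - p) ** (length - k)
        for k in range(length + 1)
    ]


def saturating_library_size(n_randomized):
    """Number of distinct sequences when n positions are fully randomized."""
    return 4**n_randomized


dist = doped_mutation_distribution()
# Mean of the binomial distribution: 36 positions x 0.3 = 10.8 mutations.
expected_mutations = sum(k * pk for k, pk in enumerate(dist))
# Fraction of the unselected library carrying nine or fewer mutations.
frac_le_9 = sum(dist[:10])

print(f"expected mutations per unselected variant: {expected_mutations:.1f}")
print(f"fraction of unselected library with <= 9 mutations: {frac_le_9:.2f}")
print(f"11 randomized positions: {saturating_library_size(11):,} sequences")
print(f"12 randomized positions: {saturating_library_size(12):,} sequences")
```

Under these assumptions an unselected variant carries 10.8 mutations on average, whereas the variants recovered from the selection averaged 2.6, which illustrates how strongly the selection favored near-wild-type DV sequences. Eleven fully randomized positions correspond to 4^11 = 4,194,304 possible sequences, which is consistent with the stated aim of keeping the number of randomized nucleotides below 12 so that all combinations can be sampled.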

variants that contained the three DV20 mutations and randomized eight additional nucleotides that were mutable in the initial selection (Figs. 3 and 4). These included the dinucleotide bulge, a major Mg2+-binding site, the first base pair of the λ′ motif, and the same two base pairs between the catalytic triad and sugar-edge/Hoogsteen contact that were randomized in the first saturating selection. In the new selection, we compared the retrohoming efficiency of the library and pooled variants from three selection cycles to that of the DV20 variant by using the plasmid-based plating assay. The retrohoming efficiency of the initial pool of variants (cycle 0) was ∼750-fold lower than DV20 but rapidly rose to roughly the same level as DV20 (∼3% retrohoming efficiency) after selection cycle 1. The retrohoming efficiency of the pools from cycles 2 and 3 remained high but lower than that from cycle 1. Sequence analysis of 24 individual clones from each cycle showed that the DV20 sequence with no other changes increased in abundance to 7% in cycle 1, 25% in cycle 2, and 50% in cycle 3. The remaining 50% at cycle 3 comprised many lower-frequency variants, the most abundant of which (7%) was DV176, which was also identified in the first saturating selection (Fig. S2). The progressive enrichment of the DV20 sequence suggests that it cannot be improved further by changes in the other regions randomized in this selection. Notably, the dinucleotide bulge reverted to wild-type AC sequence in cycle 1 and the λ′ base pair reverted to wild-type A-U by cycle 2, indicating that variants with mutations at these positions function suboptimally.

We also tried combining the mutations in the distal stems of the two best-performing variants, DV14 and DV20, but found that the combined variant had lower retrohoming efficiency (2.2-fold wild type) than either single variant (16- and 22-fold wild type for DV14 and DV20, respectively). Finally, we substituted the DV distal stem from the Pylaiella littoralis LSU/2 intron, which self-splices at unusually low magnesium concentrations (32), and found the retrohoming efficiency to be only 1.3-fold that of the wild-type intron (Fig. 3F).

Northern Hybridization of Wild-Type and Variant Ll.LtrB RNAs in Vivo. To investigate how the DV mutations increase retrohoming efficiency, we focused on the two best-performing variants from the selections, DV14 and DV20, which have retrohoming efficiencies 16- and 22-fold higher, respectively, than that of the wild-type intron in the mgtA disruptant (Figs. 3 and 4). To determine whether the changes in retrohoming efficiency are due to changes in the intracellular levels of the Ll.LtrB intron RNA, we carried out Northern hybridizations of total cellular RNAs run in a 1% agarose gel and hybridized with a 32P-labeled intron probe (Fig. 5). The cells were grown under the conditions of the retrohoming assay so that the intron RNA levels could be correlated with the retrohoming efficiencies.

For both the wild-type and mutant introns, the major band detected in the Northern blots corresponds to the excised intron RNA and comigrates with lariat RNA obtained by self-splicing of the Ll.LtrB-ΔORF intron in vitro, consistent with previous findings for the wild-type intron that excised intron lariat RNA accumulates in vivo (33). The Northern blots for the wild-type intron show similar levels of the excised intron RNA in both wild-type HMS174(DE3) and mgtA mutant cells, indicating that the 200-fold decrease in retrohoming efficiency in the mutant cells is due instead to decreased activity of group II intron RNPs, as expected for lower intracellular Mg2+ concentrations. The Northern blots for the DV14 and DV20 variants show that the levels of excised intron RNA in the mgtA disruptant are similar to or slightly lower than that of the wild-type intron, indicating that their enhanced retrohoming in the mgtA disruptant is due to higher activity of the intron RNPs at lower Mg2+ concentrations. Because the LtrA protein does not bind to DV (34), there is no expectation that mutations in DV could increase RNP activity by enhancing protein binding at low Mg2+ concentrations.

Fig. 5. Northern hybridization of the wild-type Ll.LtrB-ΔORF intron and variants DV14 and DV20 in wild-type HMS174(DE3) and the mgtA disruptant. Cells containing the intron-donor plasmid pACD2 and recipient plasmid pBRR3-ltrB were grown in 50-mL cultures and induced with isopropyl β-D-1-thiogalactopyranoside (IPTG) as for plasmid-based retrohoming assays, and then total cellular RNA was extracted, denatured with glyoxal, and run in a 1% agarose gel. The gel was blotted to a nylon membrane and hybridized with a 32P-labeled intron probe. Gel loads were normalized based on A260 and levels of 16S rRNA detected by ethidium bromide staining (shown below). Ll.LtrB-ΔORF intron RNA that was self-spliced in vitro was run in a parallel lane shown to the right. An additional control lane shows that the intron RNAs were virtually undetectable without IPTG induction. The numbers to the right of the gel indicate the positions of single-stranded RNA size markers (New England Biolabs).

Splicing of Wild-Type and Variant Intron RNAs at Different Mg2+ Concentrations. To investigate whether the DV variants have enhanced group II intron ribozyme activity at low Mg2+ concentrations, we compared the in vitro splicing of the wild-type Ll.LtrB-ΔORF, DV14, and DV20 introns at three different Mg2+ concentrations. In these experiments, a 32P-labeled in vitro transcript containing the wild-type or mutant Ll.LtrB-ΔORF intron flanked by short exon sequences was spliced by adding a 10-fold molar excess of LtrA protein. Plots of precursor RNA disappearance and lariat RNA accumulation are shown in Fig. 6, and the corresponding gels are shown in Fig. S4.

At 5 mM Mg2+, the splicing reaction of the wild-type intron proceeded in multiple phases. A prominent rapid phase gave a rate constant k1 of 1.9 min^-1 for the disappearance of precursor RNA and a slower phase gave k2 of 0.074 min^-1, in agreement with previous results (21). The slow phase most likely reflects a population of intron RNAs that folds and assembles with LtrA more slowly, as suggested previously (21). An alternative model postulates that the two phases reflect a rapid internal equilibration of the chemical steps of splicing, followed by a slower, irreversible step. However, we do not favor this model because inefficient folding has been demonstrated for other group II intron RNAs under similar conditions (35) and because of additional results at lower Mg2+ concentrations (discussed below). Together these two phases accounted for splicing of 70–80% of the precursor in different assays. The splicing reactions of DV14 and DV20 were also multiphasic, but with smaller amplitudes for the fast phase, resulting in less efficient splicing of the mutant introns at 5 mM Mg2+. The decreased amplitudes for the fast phase may reflect that the variants were selected to function optimally at low Mg2+ concentrations and now function suboptimally at higher Mg2+.

The situation differs at lower Mg2+ concentrations. At 2.5 mM Mg2+, the splicing of all three introns showed two phases on the time scale of the experiment. For all three introns, the fast phases again gave rate constants k1 of ∼1 min^-1, but with amplitudes that were lower than those at 5 mM Mg2+. In addition, substantial fractions of the precursor remained unspliced even after com-

E3804 | www.pnas.org/cgi/doi/10.1073/pnas.1315742110 Truong et al. Downloaded by guest on October 1, 2021 PNAS PLUS Precursor Excised Lariat

1.0 WT k1=1.9 (0.57) k2=0.074 (0.15) 1 DV14 k1=1.0 (0.36) k2=0.034 (0.17) 0.5 0.8 DV20 k1=1.9 (0.33) k2=0.080 (0.23)

0.4 2+ 0.6 0.3 0.4 0.2 Fraction

WT k1=1.0 (0.42) k2=0.021 (0.06) 5 mM Mg 0.2 0.1 DV14 k1=0.6 (0.19) k2=0.017 (0.16) DV20 k1=0.5 (0.16) k2=0.011 (0.29) 0.0 0.0 0 30 60 90 120 150 180 0 30 60 90 120 150 180 WT k1=1.0 (0.14) k2=0.008 (0.30) 1.0 WT k1=1.2 (0.21) k2=0.018 (0.40) 1 DV14 k1=0.8 (0.07) k2=0.008 (0.11) DV14 k1=1.2 (0.12) k2=0.017 (0.23) 0.5 DV20 k1=0.8 (0.09) k2=0.010 (0.15) Fig. 6. Splicing of the wild-type and variant DV14 and DV20 DV20 k1=1.0 (0.10) k2=0.017 (0.30) 0.8 Δ 2+ 0.4 2+ Ll.LtrB- ORF introns at different Mg concentrations. Splic- ing time courses were performed by incubating 32P-labeled 0.3 0.6 precursor RNA containing the wild-type (WT) or variant 0.2 introns with a 10-fold molar excess of LtrA protein in reaction Fraction media containing different Mg2+ concentrations (5, 2.5, or 0.4 2.5 mM Mg 0.1 1.5 mM) at 37 °C. Reactions were initiated by adding LtrA 0 0.0 protein and incubated for different times up to 3 h. The 0 30 60 90 120 150 180 0 30 60 90 120 150 180 products were analyzed in a denaturing 4% polyacrylamide 1.00 1 WT k=0.015 (0.013) gel, which was dried and quantified with a PhosphorImager. DV14 k=0.014 (0.019) DV20 k=0.015 (0.031) The plots show disappearance of precursor RNA (Left)and 0.02 appearance of excised lariat RNA (Right) fit to a single or 2+ 0.98 double exponential equation. The rate constants k1 or k2 in − min 1 and amplitudes [fraction of precursor RNA remaining 0.01 (Left) and fraction of precursor RNA spliced to excised lariat Fraction (Right)] for each phase are indicated in each plot and are the WT k=0.020 (0.022) 1.5 mM Mg 0.96 DV14 k=0.021 (0.030) averages from three experiments. The SDs for rate constants DV20 k=0.020 (0.052) 0 0.00 and amplitudes were <20%, except for the small slow phase 0 30 60 90 120 150 180 0 30 60 90 120 150 180 of splicing of the wild-type intron at 5 mM Mg2+, where the Time (min) Time (min) SDs were ∼40%.

pletion of the slower observed phase, suggesting the presence of Reverse Splicing of Wild-Type and Variant Intron RNAs at Different at least one additional population that folds even more slowly. At Mg2+ Concentrations. We carried out similar experiments to assess 1.5 mM Mg2+, the splicing of all three introns was slower and the effect of the DV14 and DV20 mutations on reverse splicing

only a single phase with a small amplitude (≤5%) was observed of the intron RNA during target DNA-primed reverse tran- BIOCHEMISTRY 32 on the experimental time scale. The lower amplitudes at 2.5 and scription reactions. In these experiments, a 129-bp P-labeled 1.5 mM Mg2+ provide additional evidence against the alternative DNA substrate containing the Ll.LtrB intron-insertion site was model in which the fast phase reflects internal equilibration of incubated with a 50-fold molar excess of wild-type and variant the chemical steps, because there is no expectation that lower Ll.LtrB RNPs (wild-type LtrA protein reconstituted with wild- 2+ type or variant Ll.LtrB-ΔORF intron lariat RNA) in reaction Mg concentrations would favor the precursor or that the equi- 2+ 2+ medium containing different Mg concentrations and dNTPs to librium would be so far from unity (Keq < 0.05 at 1.5 mM Mg ). The decreased rate constant at 1.5 mM Mg2+ did not result from enable target DNA-primed reverse transcription of the reverse- incomplete binding of the LtrA protein, because increasing the spliced intron RNA, as would occur in vivo. Reverse splicing of the intron lariat into the DNA target sites occurs in two steps: protein concentration did not increase the rates or extents of ′ ′ splicing of the wild-type or mutant introns (Fig. S5)butcould ligation of the 3 end of the lariat RNA to the 5 end of the reflect greater difficulty in folding into the active RNA structure or downstream exon, referred to as partial reverse splicing, followed by ligation of the 5′ end of the lariat RNA to the 3′ end of the a decreased rate for the catalytic steps at low Mg2+ concentration. upstream exon, referred to as full reverse splicing (schematic in Importantly, the amplitudes were larger for the mutants than for fi Fig. S6). 
Plots showing time courses for net completion of the the wild-type intron, such that the mutants splice more ef ciently fi 2+ rst step (partial plus full reverse splicing) and full reverse than the wild-type intron at low Mg concentrations. The DV20 splicing (insertion of the intron RNA between the 5′ and 3′ DNA mutant gave more splicing than DV14, mirroring the relative ac- exons) are shown in Fig. 7, and gels are shown in Fig. S6. tivity of the two variants in vivo. At 5 mM Mg2+, the reactions of wild-type RNPs were − Together, these results suggest a model in which the initial fast monophasic with a k of 0.068 min 1 for net completion of the 2+ fl − phase of splicing observed at 5 and 2.5 mM Mg re ects first step and 0.030 min 1 for full reverse splicing, with the fully a population of precursor RNAs that can fold and assemble with reverse spliced product reaching 34% at the end of the time LtrA protein rapidly to form an active RNA structure. The course. The DV14 and DV20 RNPs show similar kinetics with 2+ fl slower phases at these Mg concentrations most likely re ect little or no decrease in amplitude or reaction rate. In this case, populations that fold and assemble with LtrA more slowly. At 1.5 the reaction starts with RNPs containing lariat RNAs that have 2+ mM Mg , an initial fast phase was not observed, either because already folded into the active RNA structure. In addition, be- the fraction of the RNA that folds rapidly is too small or because cause the reaction is performed with excess RNPs, a fraction of 2+ the catalytic steps of splicing are slower at this Mg concen- inactive or misfolded RNPs would not necessarily inhibit the tration. For all three introns, the fraction of reactive RNPs reverse splicing reaction, provided that these molecules do not decreases at lower Mg2+ concentrations, but to a lesser extent compete with active RNPs for binding to the target DNA. 
Either for DV14 and DV20 than for the wild-type intron, enabling the or both of these factors may underlie the finding that the mutant introns to splice more efficiently than the wild-type intron mutations do not impair reverse splicing at high magnesium at 1.5 mM Mg2+. concentration, as they do for RNA splicing.
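The multiphasic disappearance of precursor RNA described for the splicing time courses (Fig. 6) can be illustrated with a double-exponential decay. The following is a minimal Python sketch, not the authors' fitting procedure. It assumes the form F(t) = 1 − A1(1 − e^(−k1·t)) − A2(1 − e^(−k2·t)) for the fraction of precursor remaining. The rate constants are the wild-type values quoted in the text for 5 mM Mg2+; the amplitudes (0.57 and 0.15) and the function name `precursor_remaining` are illustrative assumptions, chosen so that the two phases sum to 0.72, within the 70–80% range stated in the text.

```python
import math


def precursor_remaining(t_min, k1, a1, k2, a2):
    """Fraction of precursor left at time t_min (minutes), assuming two
    independent first-order phases with rate constants k1, k2 (min^-1)
    and amplitudes a1, a2 (fractions of the starting precursor)."""
    fast = a1 * (1.0 - math.exp(-k1 * t_min))
    slow = a2 * (1.0 - math.exp(-k2 * t_min))
    return 1.0 - fast - slow


# k1, k2: wild-type rate constants at 5 mM Mg2+ quoted in the text (min^-1).
K1, K2 = 1.9, 0.074
# A1, A2: assumed amplitudes, for illustration only.
A1, A2 = 0.57, 0.15

for t in (0, 1, 5, 30, 180):
    frac = precursor_remaining(t, K1, A1, K2, A2)
    print(f"t = {t:>3} min  precursor remaining = {frac:.3f}")
```

In this description the unreactive fraction is simply 1 − A1 − A2, which is why smaller amplitudes at lower Mg2+ concentrations translate directly into less spliced product, independent of the rate constants.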

Truong et al. PNAS | Published online September 16, 2013 | E3805

[Figure 7: plots of Fraction versus Time (min) for Partial + Full reverse splicing (Left) and Full reverse splicing (Right) at 5, 2.5, and 2 mM Mg2+.]

Fig. 7. Time courses of reverse splicing of group II intron RNPs in target DNA-primed reverse transcription reactions for the wild-type and variant DV14 and DV20 Ll.LtrB-ΔORF introns at different Mg2+ concentrations. Reverse splicing time courses were performed by incubating a 50-fold molar excess of RNPs containing the wild-type (WT) Ll.LtrB-ΔORF and variant DV14 and DV20 intron RNAs with a 129-bp internally labeled DNA substrate containing the Ll.LtrB intron-target site in reaction media containing dNTPs for cDNA synthesis and different Mg2+ concentrations (5, 2.5, or 2 mM) at 37 °C. Reactions were initiated by adding RNPs and incubated for different times up to 22 h. The products were run in a denaturing 6% polyacrylamide gel, which was dried and quantified with a PhosphorImager. The plots show the accumulation of products resulting from the first and second steps of reverse splicing (lariat RNA joined to the 3′ DNA exon and linear intron RNA inserted between the DNA exons, respectively) and of full reverse splicing (linear intron RNA inserted between the two DNA exons) fit to an equation with a single exponential. The rate constants (k, min−1) and amplitudes (fraction of DNA substrate that has undergone reverse splicing) indicated in the plots are the averages from three experiments with SDs of <20%. Similar results were obtained for reactions with a 100-fold molar excess of RNPs.
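The single-exponential description used for the reverse-splicing time courses (Fig. 7) can be sketched in the same way. This minimal Python example assumes the accumulation form F(t) = A(1 − e^(−k·t)) and uses the wild-type rate constants quoted in the text for 5 mM Mg2+. Setting the amplitude for full reverse splicing to 0.34 is an assumption based on the statement that the product reached 34% at the end of the time course; the function names are hypothetical.

```python
import math


def product_fraction(t_min, k, amplitude):
    """Single-exponential accumulation: amplitude * (1 - exp(-k * t))."""
    return amplitude * (1.0 - math.exp(-k * t_min))


def half_time(k):
    """Time (min) needed to reach half of the final amplitude
    for a first-order process with rate constant k (min^-1)."""
    return math.log(2.0) / k


# Wild-type rate constants at 5 mM Mg2+ quoted in the text (min^-1).
K_FIRST_STEP = 0.068   # net completion of the first step
K_FULL = 0.030         # full reverse splicing
# Assumed amplitude: taken equal to the 34% end point stated in the text.
AMP_FULL = 0.34

print(f"half-time, first step:            {half_time(K_FIRST_STEP):.1f} min")
print(f"half-time, full reverse splicing: {half_time(K_FULL):.1f} min")
print(f"full reverse splicing at 180 min: {product_fraction(180, K_FULL, AMP_FULL):.3f}")
```

Because the half-time scales as 1/k, a 10- to 20-fold drop in k lengthens the half-time by the same factor, which is consistent with the longer incubations (up to 22 h) described for the lower Mg2+ concentrations.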

At 2.5 and 2 mM Mg2+, the reaction rate for all three introns decreased 10- to 20-fold, and the amplitudes decreased progressively, indicating a smaller proportion of reactive molecules. For all three introns at low Mg2+ concentrations, the rate constants for net completion of the first step were similar to those for full reverse splicing, indicating that processes leading up to or culminating with the first step of reverse splicing are rate-limiting. The population then presumably undergoes the first and second steps reversibly, leading to steady-state populations of the partially and fully reverse-spliced products until the reaction is rendered irreversible by cDNA synthesis. Thus, the decreased rates at low Mg2+ concentrations may reflect difficulties with a required RNA conformational change or catalytic step, whereas the decreased amplitudes may reflect a tendency of the intron lariat RNAs or RNPs to revert to inactive conformations. As for RNA splicing, the decrease in the proportion of reactive RNPs at low Mg2+ concentrations was less for the mutants than for the wild-type intron, enabling the mutants to function more efficiently at low Mg2+ concentration. DV20 again outperformed DV14 at low Mg2+ concentrations, in agreement with the relative activity of the two variants in vivo.

Terbium-Cleavage Assays. To investigate how mutations in the distal stem affect Mg2+ binding and Mg2+-dependent structure formation within DV, we performed terbium (Tb3+)-cleavage assays on isolated wild-type and variant DVs. By efficiently deprotonating water and 2′-OH groups in RNA, Tb3+ enhances cleavage of RNA's phosphodiester backbone and is therefore useful as a probe of RNA structure (36). Additionally, because Tb3+ is close in size to Mg2+ and coordinates similarly to oxygen ligands, it can replace bound Mg2+ and cleave at these positions in folded RNAs, thereby mapping potential Mg2+-binding sites (16). Tb3+ cleavage of isolated DV of the yeast aI5γ intron identified a high-affinity Mg2+-binding site at the dinucleotide bulge and lower-affinity sites within the distal helix and tetraloop (14), which were later supported by the O. iheyensis intron crystal structure (7).

Results of Tb3+-cleavage titration assays for wild-type and variant Ll.LtrB DV RNAs in reaction media containing 50 mM KCl and 1.5 mM or 5 mM MgCl2 are shown in Fig. 8 and Fig. S7, respectively. The RNA was probed at Tb3+ concentrations ranging from 10 μM to 10 mM, with cleavages displaying saturation at lower Tb3+ concentrations indicating higher affinity binding (16). Consistent with the O. iheyensis crystal structure and aI5γ intron Tb3+ cleavages (7, 14), we observed saturated cleavage at low Tb3+ concentrations at the dinucleotide bulge in the wild-type Ll.LtrB DV RNA, primarily 3′ to the A-residue and to a lesser extent 3′ of the C-residue, indicating high-affinity binding. Moderate cleavages, two- to threefold higher than at weaker sites at some Tb3+ concentrations, were observed after the C-residue preceding the tetraloop and a G-residue in the proximal stem, whereas weaker cleavages occur throughout the distal stem and tetraloop. The cleavages decrease at high concentrations of Tb3+, presumably reflecting misfolding owing to replacement of Mg2+ by Tb3+ throughout the RNA (37). At 5 mM Mg2+, several sites show twofold higher cleavage than at 1.5 mM Mg2+, including the C-residue proximal to the tetraloop, all four residues of the tetraloop, the catalytic triad, and the sugar-edge/Hoogsteen contact, whereas cleavages at other sites have intensity similar to those at 1.5 mM Mg2+ (Fig. S7).

The DV14 and DV20 variants show enhanced cleavages at 1.5 mM Mg2+ at sites in the distal stem, in the tetraloop, and at some sites in the proximal stem (Fig. 8). At low concentrations of Tb3+, DV14 shows dramatically increased cleavage at the C-residue preceding the tetraloop and a G-residue in the proximal stem. These cleavages seem to saturate at lower Tb3+ concentrations in the mutant, suggesting increased metal-ion affinity. DV20 shows moderately enhanced cleavages at the C-residue preceding the tetraloop and three other sites in the distal stem, but without apparent changes in affinity. Additional weaker cleavages throughout the distal stem and the tetraloop are enhanced in the mutants relative to the wild-type DV by up to twofold. Notably, in reaction medium containing 5 mM Mg2+, the cleavage intensities relative to input RNA for the variants remain similar to those at 1.5 mM Mg2+, suggesting that variant DV stems fold similarly at both Mg2+ concentrations (Fig. S7). In general, we find that the variants enhance Tb3+ cleavage relative to input RNA throughout the distal stem at 1.5 mM Mg2+ to levels found for the wild-type DV at 5 mM Mg2+. Considered together with the results of the RNA splicing and reverse-splicing assays and the selection for G-U base pairs, the Tb3+-cleavage data suggest that enhanced metal ion binding to the distal stem of the variants promotes a conformational transition to a more native structure, which is inefficient for the wild-type DV at low Mg2+ concentrations.

[Figure 8: panels A (gel), B (PhosphorImager scan), and C (secondary structure map of DV).]

Fig. 8. Terbium cleavage of isolated DV from the wild-type, DV14, and DV20 Ll.LtrB introns at 1.5 mM MgCl2. (A) Gel assay of terbium (Tb3+) cleavage. 5′ 32P-labeled DV RNA in 50 mM KCl, 1.5 mM MgCl2, and 25 mM 3-(N-morpholino)propanesulfonic acid (MOPS, pH 7.0) was incubated with increasing concentrations of TbCl3 (0, 0.01, 0.03, 0.1, 0.3, 1, and 10 mM) for 1 h at room temperature. For each DV RNA, additional lanes show alkaline hydrolysis (A) and RNase T1 (T) ladders and a control in which the RNA was preincubated in 50 mM EDTA (E), which chelates both Mg2+ and Tb3+, before adding 10 mM TbCl3. Samples were analyzed in a denaturing 17% polyacrylamide gel, which was dried and scanned with a PhosphorImager. The locations of the Tb3+-cleavage sites are shown on a secondary structure map of DV below. Nucleotides that gave high or medium cleavage in the wild-type (WT) intron are shown in red or orange circles, respectively. Nucleotides that displayed enhanced cleavage in the DV14 and DV20 variants, based on the scales described below, are indicated by red and blue triangles, respectively. (B) PhosphorImager scan showing quantification of cleavage at 10 μM TbCl3. Band heights are normalized to input DV RNA (top of gel). (C) Secondary structure map of DV showing the location of cleavage sites. Nucleotides in the wild-type DV that gave significant Tb3+ cleavage over background are highlighted in yellow (low, 1.5- to threefold); orange (medium, 3.1- to 10-fold); or red (high, >10-fold). Residues that showed enhanced cleavage in the DV14 or DV20 variants are indicated as colored triangles, with the size of the triangle indicating the fold-increase relative to the wild-type DV (large, greater than threefold; medium, 2.1- to threefold; small, 1.2- to twofold). Cleavages occur 3′ to the nucleotide indicated. The experiment was repeated three times with similar results.

Discussion

Here, we used an E. coli mutant lacking the Mg2+ transporter MgtA to select variants of the Ll.LtrB-ΔORF group II intron with mutations in DV that enable more efficient retrohoming at low Mg2+ concentrations. We find that such mutations are clustered in the distal stem of DV. The two most efficient variants, DV14 and DV20, have mutations restricted to the distal stem and increase retrohoming efficiencies in the mgtA mutant by 16- and 22-fold, respectively, without substantially affecting intron RNA levels in vivo. Biochemical analysis of these variants leads to a model in which the mutations enhance Mg2+ binding to the distal stem of DV, and this enhanced Mg2+ binding facilitates an RNA-folding step, resulting in increased efficiencies of both RNA splicing and reverse splicing at low Mg2+ concentrations. Our findings reveal that DV is involved in a critical Mg2+-dependent folding step that contributes to the unusually high Mg2+ concentrations required for group II intron function, and they have implications for the evolution of splicing mechanisms in higher organisms and the use of group II introns for gene targeting.

E. coli and related bacteria possess two major Mg2+ transporters: CorA, which maintains overall Mg2+ levels, and MgtA, which has been shown to function as a Mg2+ scavenging system at low Mg2+ concentrations in Salmonella typhimurium (38). In setting up our selection system, we found that the retrohoming efficiency of the wild-type Ll.LtrB-ΔORF intron in the mgtA disruptant was 100-fold lower than in the corA disruptant, even though both mutants seem to have similarly decreased intracellular Mg2+ concentrations, using a fluorescent probe assay (Fig. 2). This finding indicates that the fluorescent probe does not accurately measure the functional Mg2+ concentration for group II intron retrohoming in vivo, either because of the non-growth conditions required for uptake of the probe or because of localized differences in Mg2+ concentration in different regions of the cell. In E. coli, Ll.LtrB RNPs localize to the cellular poles, along with plasmid DNAs and chromosomal DNA replication origins and termination sites, which are favored regions for Ll.LtrB retrohoming and retrotransposition (39–42). Notably, although CorA is uniformly distributed in the membrane throughout the cell, MgtA is localized to the cell membrane near the cellular poles, where it could serve to boost local Mg2+ concentrations that support retrohoming (Genobase of the Nara Institute of Science and Technology online resource; http://ecoli.naist.jp/data/GeneProfiles/GFPimages/34-08-2.jpg) (43). To our knowledge, localized differences in intracellular Mg2+ concentration affecting biological processes have not been reported previously, but could have wide implications.

Our primary selection in the mgtA disruptant used a library of Ll.LtrB-ΔORF introns in which all residues in DV were partially randomized with 70% of the wild-type nucleotide and 10% of each mutant nucleotide at each position. In agreement with previous mutational analyses of DV of the yeast aI5γ group II intron (44), the most strongly conserved regions were the catalytic triad, the sugar-edge/Hoogsteen contact, and the κ′ and μ′ tertiary contacts (invariant or two or fewer mutations in 106 active variants). More variation (three or more mutations) was found in three other functionally important regions: the dinucleotide bulge, which is a high-affinity Mg2+-binding site; the GNRA tetraloop, which interacts with a tetraloop acceptor in DI (45); and the λ′ motif (46). However, the variants in these regions functioned suboptimally and reverted to the wild-type sequences in secondary selections (Figs. S2 and S3). The latter finding agrees with previous studies showing that mutations in the dinucleotide bulge of the yeast aI5γ group II intron decrease the efficiency of self-splicing (47). The most mutable regions of DV in our selection were a nucleotide between the catalytic triad and sugar-edge/Hoogsteen contact and the three terminal base pairs of the distal stem between the λ motif and the tetraloop. The distal stem is the most variable region of DV of naturally occurring group II introns and is interchangeable among different group II introns (44).

Strikingly, although the upper region of the distal stem of DV was thought to play no major functional role, we found that it is the major site of mutations that improved function of the Ll.LtrB intron at low Mg2+ concentrations. All 43 improved variants with retrohoming efficiencies greater than threefold higher than that of the wild-type intron in the mgtA disruptant had mutations in this part of the distal stem, including 15 variants that contained only mutations in this region. By contrast, we identified only two variants (DV1 and DV103) that enhance retrohoming at low Mg2+ concentrations without mutations in the distal stem (Fig. 4). Both of these variants have the same A → G mutation, which generates a G-U base pair in the proximal stem and increases retrohoming efficiency in the mgtA mutant two- to threefold. The distal stem mutations that improved function at low Mg2+ concentrations trend toward weaker base pairings, G-C pairs changed to G-U or A-U pairs, or mispairings, which may enhance dynamics and facilitate bending of DV (Figs. 3 and 4 and Fig. S2). G-U wobble pairs within RNA stems are frequent Mg2+-binding sites owing to the high electronegativity along their major groove surface, with the flanking base pairs influencing the affinity and position of the bound metal ion (30, 31). The two most efficient variants, DV14 and DV20, both have additional U-G or G-U wobble pairs in the distal stem, and the single base mutation in DV14 results in two tandem U-G base pairs, a motif identified as a strong Mg2+-binding site in RNA stems (30, 31). The strong clustering of mutations with improved function at low Mg2+ in the distal stem may reflect that it is one of the few regions of DV that can be mutated without impairing other functions and/or that Mg2+-binding sites in the distal stem of DV are critical for a key RNA folding step.

Biochemical assays of the two most improved variants, DV14 and DV20, show that the distal stem mutants have higher splicing and reverse-splicing activity than the wild-type intron at low Mg2+ concentrations, reflecting a greater propensity to fold into the catalytically active RNA structure. The introduction of a sharp bend in DV that is required to form the active site is likely an energetically unfavorable conformational change and is an attractive candidate for a folding step that limits the efficiency of native-state formation at low Mg2+ concentrations, leading to the relatively high Mg2+ requirement of group II introns compared with other ribozymes. For DV RNA in isolation or in early folding intermediates of the intron, the conformational change is likely to be rapid and reversible. However, upon global collapse of the intron and encapsulation by DI, the transition may become very slow and divide the RNA into active and inactive populations, which give rise to the multiple phases observed in group II intron splicing reactions at low Mg2+ concentrations (35). By stabilizing the native, bent conformation of DV, the mutations could increase the fraction of the RNA that readily folds to the native state and gives activity. The finding that the mutations also enhance reverse-splicing activity of the conformationally constrained intron lariat RNA is consistent with the possibility that only a limited conformational change involving DV or its activation by Mg2+ binding may be rate-limiting at low Mg2+ concentrations.

To understand how mutations in the distal stem of DV might contribute to native folding of DV, we modeled the distal stem of DV in the Ll.LtrB intron in the extended state based on the NMR structure of the P. littoralis LSU/2 intron (13) and in the folded state based on the crystal structure of the O. iheyensis group II intron (9) (Fig. 9). The models show that some Tb3+ cleavages in the distal stem of DV of Ll.LtrB that are enhanced in variants DV14 and DV20 correlate with nearby metal ion-binding sites seen in the O. iheyensis structures, including at the positions of M3 and M4 near the GNRA tetraloop and M5 near the site of the bend between the proximal and distal stem. Other cleavages that are enhanced in the mutants, such as those within the 5′ strand of the distal stem, are not near metal ion-binding sites seen in the O. iheyensis structures, but are near U-G or G-U base pairs introduced by the mutations, suggesting an origin for the increased cleavages (e.g., the hypothetical Mg2+ represented as a blue sphere with a “?” bound in the major groove in Fig. 9). Although the localization of Tb3+ in the major groove of a G-U pair is likely to restrict its access to adjacent 2′-OH groups and therefore limit RNA cleavage at these sites (31), it is possible that Tb3+ can be sufficiently dynamic at G-U pairs to retain significant cleavage activity. Alternatively, or in addition, the metal-bound G-U pairs in the distal stem may increase RNA flexibility in the DV mutants, promoting formation of a bent conformation and leading to enhanced cleavage at multiple sites. In the intact intron, these site-bound metal ions may stabilize a bent and functional conformation of DV directly and/or indirectly by stabilizing tertiary contacts in the folded intron structure that enforce the bend. Delocalized metal ions may also contribute to the stability of the bent conformation of DV because it is more compact than the extended form (48).

[Figure 9: panels A (extended DV model) and B (folded DV model).]

Fig. 9. Models of Mg2+ binding and terbium cleavage on tertiary structures of extended and folded Ll.LtrB intron DV. Terbium (Tb3+) cleavage patterns (Fig. 8) and Mg2+ and K+ ions found in the O. iheyensis DV crystal structure ensemble were modeled onto tertiary structure models of the Ll.LtrB intron DV. (A and B) 3D models of DV of the Ll.LtrB intron based on (A) the NMR structure of the isolated DV of the P. littoralis intron (13), and (B) the folded DV structure in the crystal structures of the O. iheyensis intron (9). In both models, Mg2+ and K+ ions were positioned based on their locations in an ensemble of 15 superposed O. iheyensis DV structures shown in Fig. 1B (7–9). Tb3+ cleavages found in the isolated DVs of wild-type and variant Ll.LtrB introns at 10 μM Tb3+ and 1.5 mM Mg2+ are shown on the model, with Tb3+ cleavages in the wild-type DV indicated by colors on the phosphodiester backbone and those in the variant introns indicated by different-sized triangles as in Fig. 8. Cleavages occur 3′ to the nucleotide indicated.

Group II introns are evolutionarily related to nuclear spliceosomal introns of eukaryotes (1, 11, 49). According to one scenario, group II introns entered eukaryotes with bacterial endosymbionts that gave rise to mitochondria and chloroplasts and invaded the nucleus, where they or their descendants proliferated before degenerating into spliceosomal introns and the spliceosome. The nuclear membrane has been hypothesized to have evolved in response to this group II intron invasion as a means of separating transcription from translation, thereby preventing translation of unspliced introns (50). A further consequence of the nuclear membrane was sequestration of the genome into a separate compartment, where Mg2+ is chelated by chromosomal DNA (51), possibly leading to lower free Mg2+ concentrations than in the cytoplasm. Indeed, such differential Mg2+ concentrations between the nucleus and cytoplasm may explain recent findings that an Ll.LtrB-ΔORF intron cannot be spliced by the LtrA protein in the yeast nucleus but can be spliced by the LtrA protein after export of precursor RNAs to the cytoplasm (52). Spliceosomal introns have evolved to function at lower Mg2+ concentrations in eukaryotes, perhaps reflecting their disintegration into snRNAs, which may facilitate conformational changes, and their increased reliance on protein cofactors, which can substitute for Mg2+ to promote RNA folding. However, the lower Mg2+ concentrations now constitute a natural barrier to further group II intron proliferation in eukaryotic nuclear genomes. The ability to select group II introns that function at lower Mg2+ concentrations may ultimately enhance their utility in gene targeting in higher organisms, although selections done directly in eukaryotes using libraries with mutations throughout the intron may be required to achieve maximal retrohoming efficiency.

Methods

E. coli strains, plasmids, and reagents used can be found in SI Methods. Methods used for retrohoming assays, determination of intracellular Mg2+ concentrations, construction of DV libraries, selection of DV variants, Northern hybridizations, biochemical assays, and Tb3+-cleavage assays are described in detail in SI Methods. Structural modeling with PyMol and ModeRNA was performed as described in SI Methods.

ACKNOWLEDGMENTS. We thank Martin Poenie (University of Texas at Austin) for advice on the determination of intracellular Mg2+ concentration using mag-fura-2 and Philip Bevilacqua (Pennsylvania State University), Victoria J. DeRose (University of Oregon), and Roland K. O. Sigel (University of Zurich) for comments on the manuscript. This research was supported by NIH Grant GM037949 and Welch Foundation Grant F-1607.

1. Lambowitz AM, Zimmerly S (2011) Group II introns: Mobile ribozymes that invade DNA. Cold Spring Harb Perspect Biol 3(8):a003616.
2. Peebles CL, et al. (1986) A self-splicing RNA excises an intron lariat. Cell 44(2):213–223.
3. Smith D, Zhong J, Matsuura M, Lambowitz AM, Belfort M (2005) Recruitment of host functions suggests a repair pathway for late steps in group II intron retrohoming. Genes Dev 19(20):2477–2487.
4. Yao J, Truong DM, Lambowitz AM (2013) Genetic and biochemical assays reveal a key role for replication restart proteins in group II intron retrohoming. PLOS Genet 9(4):e1003469.
5. Perutka J, Wang W, Goerlitz D, Lambowitz AM (2004) Use of computer-designed group II introns to disrupt Escherichia coli DExH/D-box protein and DNA helicase genes. J Mol Biol 336(2):421–439.
6. Mastroianni M, et al. (2008) Group II intron-based gene targeting reactions in eukaryotes. PLOS ONE 3(9):e3121.
7. Toor N, Keating KS, Taylor SD, Pyle AM (2008) Crystal structure of a self-spliced group II intron. Science 320(5872):77–82.
8. Toor N, Rajashankar K, Keating KS, Pyle AM (2008) Structural basis for exon recognition by a group II intron. Nat Struct Mol Biol 15(11):1221–1222.
9. Marcia M, Pyle AM (2012) Visualizing group II intron catalysis through the stages of splicing. Cell 151(3):497–507.
10. Michel F, Costa M, Westhof E (2009) The ribozyme core of group II introns: A structure in want of partners. Trends Biochem Sci 34(4):189–199.
11. Keating KS, Toor N, Perlman PS, Pyle AM (2010) A structural analysis of the group II intron active site and implications for the spliceosome. RNA 16(1):1–9.
12. Sigel RKO, et al. (2004) Solution structure of domain 5 of a group II intron ribozyme reveals a new RNA motif. Nat Struct Mol Biol 11(2):187–192.
13. Seetharaman M, Eldho NV, Padgett RA, Dayie KT (2006) Structure of a self-splicing group II intron catalytic effector domain 5: Parallels with spliceosomal U6 RNA. RNA 12(2):235–247.
14. Sigel RKO, Vaidya A, Pyle AM (2000) Metal ion binding sites in a group II intron core. Nat Struct Biol 7(12):1111–1116.
15. DeRose VJ (2003) Metal ion binding to catalytic RNA molecules. Curr Opin Struct Biol 13(3):317–324.
16. Sigel RKO (2005) Group II intron ribozymes and metal ions–a delicate relationship. Eur J Inorg Chem 2005(12):2281–2292.
17. Gregan J, Kolisek M, Schweyen RJ (2001) Mitochondrial Mg(2+) homeostasis is critical for group II intron splicing in vivo. Genes Dev 15(17):2229–2237.
18. Lusk JE, Williams RJ, Kennedy EP (1968) Magnesium and the growth of Escherichia coli. J Biol Chem 243(10):2618–2624.
19. Horowitz SB, Tluczek LJ (1989) Gonadotropin stimulates oocyte translation by increasing magnesium activity through intracellular potassium-magnesium exchange. Proc Natl Acad Sci USA 86(24):9652–9656.
20. Günther T (2006) Concentration, compartmentation and metabolic function of intracellular free Mg2+. Magnes Res 19(4):225–236.
21. Saldanha R, et al. (1999) RNA and protein catalysis in group II intron splicing and mobility reactions using purified components. Biochemistry 38(28):9069–9083.
22. Lehman N, Joyce GF (1993) Evolution in vitro of an RNA enzyme with altered metal dependence. Nature 361(6408):182–185.
23. Bagby SC, Bergman NH, Shechner DM, Yen C, Bartel DP (2009) A class I ligase ribozyme with reduced Mg2+ dependence: Selection, sequence analysis, and identification of functional tertiary interactions. RNA 15(12):2129–2146.
24. Chen X, Denison L, Levy M, Ellington AD (2009) Direct selection for ribozyme cleavage activity in cells. RNA 15(11):2035–2045.
25. Maguire ME (2006) Magnesium transporters: Properties, regulation and structure. Front Biosci 11:3149–3163.
26. Bui DM, Gregan J, Jarosch E, Ragnini A, Schweyen RJ (1999) The bacterial magnesium transporter CorA can functionally substitute for its putative homologue Mrs2p in the yeast inner mitochondrial membrane. J Biol Chem 274(29):20438–20443.
27. Yao J, Zhong J, Lambowitz AM (2005) Gene targeting using randomly inserted group II introns (targetrons) recovered from an Escherichia coli gene disruption library. Nucleic Acids Res 33(10):3351–3362.
28. Karberg M, et al. (2001) Group II introns as controllable gene targeting vectors for genetic manipulation of bacteria. Nat Biotechnol 19(12):1162–1167.
29. Froschauer EM, Kolisek M, Dieterich F, Schweigel M, Schweyen RJ (2004) Fluorescence measurements of free [Mg2+] by use of mag-fura 2 in Salmonella enterica. FEMS Microbiol Lett 237(1):49–55.
30. Varani G, McClain WH (2000) The G x U wobble base pair. A fundamental building block of RNA structure crucial to RNA function in diverse biological systems. EMBO Rep 1(1):18–23.
31. Keel AY, Rambo RP, Batey RT, Kieft JS (2007) A general strategy to solve the phase problem in RNA crystallography. Structure 15(7):761–772.
32. Costa M, Fontaine JM, Loiseaux-de Goër S, Michel F (1997) A group II self-splicing intron from the brown alga Pylaiella littoralis is active at unusually low magnesium concentrations and forms populations of molecules with a uniform conformation. J Mol Biol 274(3):353–364.
33. Matsuura M, et al. (1997) A bacterial group II intron encoding reverse transcriptase, maturase, and DNA endonuclease activities: Biochemical demonstration of maturase activity and insertion of new genetic information within the intron. Genes Dev 11(21):2910–2924.
34. Dai L, et al. (2008) A three-dimensional model of a group II intron RNA and its interaction with the intron-encoded reverse transcriptase. Mol Cell 30(4):472–485.
35. Russell R, Jarmoskaite I, Lambowitz AM (2013) Toward a molecular understanding of RNA remodeling by DEAD-box proteins. RNA Biol 10(1):44–55.
36. Forconi M, Herschlag D (2009) Metal ion-based RNA cleavage as a structural probe. Methods Enzymol 468:91–106.
37. Harris DA, Tinsley RA, Walter NG (2004) Terbium-mediated footprinting probes a catalytic conformational switch in the antigenomic hepatitis delta ribozyme. J Mol Biol 341(2):389–403.
38. Snavely MD, Gravina SA, Cheung TT, Miller CG, Maguire ME (1991) Magnesium transport in Salmonella typhimurium. Regulation of mgtA and mgtB expression. J Biol Chem 266(2):824–829.
39. Ichiyanagi K, Beauregard A, Belfort M (2003) A bacterial group II intron favors retrotransposition into plasmid targets. Proc Natl Acad Sci USA 100(26):15742–15747.
40. Zhao J, Lambowitz AM (2005) A bacterial group II intron-encoded reverse transcriptase localizes to cellular poles. Proc Natl Acad Sci USA 102(45):16133–16140.
41. Beauregard A, Chalamcharla VR, Piazza CL, Belfort M, Coros CJ (2006) Bipolar localization of the group II intron Ll.LtrB is maintained in Escherichia coli deficient in nucleoid condensation, partitioning and DNA replication. Mol Microbiol 62(3):709–722.
42. Yao S, Helinski DR, Toukdarian A (2007) Localization of the naturally occurring plasmid ColE1 at the cell pole. J Bacteriol 189(5):1946–1953.
43. Gray AN, et al. (2011) Unbalanced charge distribution as a determinant for dependence of a subset of Escherichia coli membrane proteins on the membrane insertase YidC. MBio 2(6):e00238–11.
44. Boulanger SC, et al. (1995) Studies of point mutants define three essential paired nucleotides in the domain 5 substructure of a group II intron. Mol Cell Biol 15(8):4479–4488.
45. Costa M, Michel F (1995) Frequent use of the same tertiary motif by self-folding RNAs. EMBO J 14(6):1276–1285.
46. Boudvillain M, de Lencastre A, Pyle AM (2000) A tertiary interaction that links active-site domains to the 5′ splice site of a group II intron. Nature 406(6793):315–318.
47. Schmidt U, Podar M, Stahl U, Perlman PS (1996) Mutations of the two-nucleotide bulge of D5 of a group II intron block splicing in vitro and in vivo: Phenotypes and suppressor mutations. RNA 2(11):1161–1172.
48. Draper DE (2004) A guide to ions and RNA structure. RNA 10(3):335–343.
49. Rogozin IB, Carmel L, Csuros M, Koonin EV (2012) Origin and evolution of spliceosomal introns. Biol Direct 7:11–38.
50. Martin W, Koonin EV (2006) Introns and the origin of nucleus-cytosol compartmentalization. Nature 440(7080):41–45.
51. Strick R, Strissel PL, Gavrilov K, Levi-Setti R (2001) Cation-chromatin binding as shown by ion microscopy is essential for the structural integrity of chromosomes. J Cell Biol 155(6):899–910.
52. Chalamcharla VR, Curcio MJ, Belfort M (2010) Nuclear expression of a group II intron is consistent with spliceosomal intron ancestry. Genes Dev 24(8):827–836.

Truong et al. PNAS | Published online September 16, 2013 | E3809